From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carlo Zancanaro
Subject: Re: Improving Shepherd
Date: Tue, 06 Feb 2018 02:56:12 +1100
Message-ID: <87d11jh1lv.fsf@zancanaro.id.au>
References: <871si8bc5g.fsf@zancanaro.id.au> <877errn23f.fsf@zancanaro.id.au> <871shzeg8m.fsf@gnu.org>
In-reply-to: <871shzeg8m.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"
List-Id: "Development of GNU Guix and the GNU System distribution."
To: Ludovic Courtès
Cc: guix-devel@gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Hey Ludo,

On Mon, Feb 05 2018, Ludovic Courtès wrote:
>> User services - Alex has already sent a patch to the list to allow
>> generating user services from the Guix side. The idea is to generate a
>> Shepherd config file, allowing a user to invoke shepherd manually to
>> start their services. A further extension to this would be to have
>> something like systemd's "user sessions", where the pid 1 Shepherd
>> automatically starts a user's services when they log in.
>
> After replying to Alex’ message, I realized that we could just as well
> have a separate “guix service” or similar tool to take care of this?
>
> This needs more thought (and perhaps taking a look at systemd user
> sessions, which I’m not familiar with), but I think Alex’ approach is a
> good starting point.

We were thinking it might work like this:

- services->package constructs a package which places a file in the
  profile containing the necessary references
- pid 1 shepherd listens to elogind login/logout events, and starts the
  services when necessary

Admittedly this isn't the nicest way for it to work, but it might be a
good starting point.

There were some discussions on the list a while ago about how to have
`guix environment` automatically start services, too, so I wonder what
overlap there could be there. Although maybe environment services (in
containers) have more in common with system services than user
services.

>> Child process control - this is my personal frustration, where
>> Shepherd loses track of processes that fork away (e.g. "emacs
>> --daemon"). I barely know anything about Linux process management, but
>> from my reading this can be solved through Linux namespaces (if user
>> namespaces are available). Could someone who knows more about this let
>> me know if that's a productive direction for me to investigate? Or
>> tell me a better way to go about it?
>
> Currently shepherd monitors SIGCHLD, and it’s not supposed to miss
> those; in some cases it might handle them later than you’d expect, which
> means that in the meantime you see a zombie process, but otherwise it
> seems to work.
>
> ISTR you reported an issue when using ‘shepherd --daemonize’, right?
> Perhaps the issue is limited to that mode?

I no longer use the daemonize function. My user shepherd runs "in the
foreground" (it's started when my X session starts), so it's not that.
Jelle fixed the problem I was having by delaying the SIGCHLD handler
registration until it's needed.
It is still buggy if a process is started before the daemonize command
is given to the root service, though.

If you try running "emacs --daemon" with "make-forkexec-constructor"
(and #:pid-file, and put something in your emacs config to make it
write out the pid) you should be able to reproduce what I am seeing. If
you kill emacs (or if it crashes) then shepherd continues to report
that it is started and running. When I look at htop's output I can also
see that my emacs process is not a child of my shepherd process.

I would like to add a --daemon/--daemonize command line argument to
shepherd to replace the current "send the root service a daemonize
message" approach. I think the use cases for turning it into a daemon
later are limited, and it just gives you an additional way of shooting
yourself in the foot.

>> Concurrency/parallelism - I think Jelle was planning to work on this,
>> but I might be wrong about that. Maybe I volunteered? We're keen to
>> see Shepherd starting services in parallel, where possible. This will
>> require some changes to the way we start/stop services (because at the
>> moment we just send a "start" signal to a single service to start it,
>> which makes it hard to be parallel), and will require us to actually
>> build some sort of real dependency resolution. Longer-term our goal
>> should be to bring fibers into Shepherd, but Efraim mentioned that
>> fibers doesn't compile on ARM at the moment, so we'll have to get that
>> working first at least.
>
> I’d really like to see that happen. I’ve become more familiar with
> Fibers, and I think it’ll be perfect for the Shepherd (and we’ll fix the
> ARM build issue, no doubt.)

I'm not going to put much time/effort into this until we have fibers
building on ARM. I think these changes are likely to break shepherd's
config API, too.
In particular, with higher levels of concurrency I want to move the
mutable state out of objects.

> It seems that signalfd(2) is Linux-only though, which is a bummer. The
> solution might be to get over it and have it implemented on GNU/Hurd…
> (I saw this discussion:
> ; I
> suspect it’s within reach.)

Failing that, could we have our signal handlers just convert the signal
to a message in our event loop? I have a very rudimentary understanding
of signal handling, but I assume we could have our main event loop just
reading things off of two channels: one of signal events, one of fd
events.

>> This would mean that Shepherd could decide the best way to start/stop
>> services, including doing so in parallel if possible.
>
> Sounds good. That’s annoyed most of us already, so if you get that
> fixed, you’ll make a lot of people happy. :-)

I'll have a go at this in the next few weeks. I'll be travelling until
the end of February, so I'm not expecting much, but we'll see!

Carlo

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEE1lpncq7JnOkt+LaeqdyPv9awIbwFAlp4fpwACgkQqdyPv9aw
Iby3iA//RIDJaCHC08AsKBMHFItj37lua4TVkRpNXddDZ2D9fP9M2kJ+9U2ET/tr
fyZ48AEZnjg2uIOX7vI2alaf8g6mdudHLd5I6r5nPvIZGaHh8HKMN+/szGEZa5tr
pAsJpcPxif42NI5iwsBclVNW3GFVytpdbudJMHwNz1VUL0zH2X7THcTsIgNjRo3A
ckPDejw6ctJu9XFFiu+tR1+5d0DbHVJSUU8ReJcbGoLC/Ammcf1Pg1GN4qy7HugA
9RdxUdMCP4/DRibMLO9ZSyx0hRvGqMKhG1UH7ip/zf15A2Vb5dxGe6mrm12dbvzJ
iJoU+nY6kxYEUEiCnvQW15Zx4fcRHjysmtUyJ4SI3SAUgkCOxHGWIzJR26jMw74c
9IPRPce1o598/VLvErRM8/7kHD3KpIoGwd0nBvAnW1yeCLiobfgSEE3FYGPrayxv
2jKdiUrIdgijQ9ybFQy2/3+dzQBlAtPsIFm+E9iEEeVMdwYK8L6x57AFFcoL3S1K
vqKpuXTje50peHELMqqOYxNUcUDClZTSybR8sqF9An+emx06kb/4WH3PK9K7KPoN
Myv3VT9fjhqkmmAGSxmX0E+iKFlH3WKwPJcM3BGNVeicG7tTF9VubW95OLkeN/hH
FTynqJrXgI+x1qnnnW7nMW7freJGU1uNZtKop3kcaXeyZz6QzVM=
=KT6/
-----END PGP SIGNATURE-----
--=-=-=--
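
The "two channels" idea from the message above (signal handlers that
only turn a signal into a message, plus a main loop that reads signal
events and fd events side by side) can be sketched roughly as below.
This is only an illustration, written in Python rather than Guile, and
it is not Shepherd code: the names make_signal_channel and
run_event_loop are hypothetical, the pipe-based channel is the classic
"self-pipe" technique standing in for signalfd(2), and a POSIX system is
assumed.

```python
import os
import selectors
import signal


def make_signal_channel(signums):
    """Return a readable fd that receives one byte per delivered signal.

    Hypothetical helper: this is the self-pipe technique, used here as a
    portable stand-in for signalfd(2).
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # the handler must never block

    def handler(signum, _frame):
        # Do the bare minimum in the handler: convert the signal into a
        # message for the event loop.
        os.write(write_fd, bytes([signum]))

    for signum in signums:
        signal.signal(signum, handler)
    return read_fd


def run_event_loop(signal_fd, data_fd, max_events):
    """Collect up to max_events messages from the two channels."""
    events = []
    with selectors.DefaultSelector() as sel:
        sel.register(signal_fd, selectors.EVENT_READ, "signal")
        sel.register(data_fd, selectors.EVENT_READ, "fd")
        # Stop once enough events arrived or both channels hit EOF.
        while len(events) < max_events and sel.get_map():
            for key, _mask in sel.select():
                if len(events) >= max_events:
                    break
                if key.data == "signal":
                    payload = os.read(key.fd, 1)  # one byte per signal
                else:
                    payload = os.read(key.fd, 4096)
                if not payload:  # EOF: stop watching this channel
                    sel.unregister(key.fd)
                elif key.data == "signal":
                    name = signal.Signals(payload[0]).name
                    events.append(("signal", name))
                else:
                    events.append(("fd", payload))
    return events
```

A loop built this way would treat a SIGCHLD exactly like any other
incoming message, so the code that reacts to a dead child runs in the
main loop and not inside the handler. Whether this maps cleanly onto
Guile's signal handling and Fibers channels is an open question that the
sketch does not answer.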