From mboxrd@z Thu Jan  1 00:00:00 1970
From: Danny Milosavljevic <dannym@scratchpost.org>
Subject: Re: What is the philosophy behind shepherd?
Date: Thu, 11 Apr 2019 03:21:19 +0200
Message-ID: <20190411021926.596503a8@scratchpost.org>
References: <87o95jlyo0.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
	boundary="Sig_/Hw+JZ=+eqgAOa++D=sSND6e";
	protocol="application/pgp-signature"
Return-path: <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([209.51.188.92]:35265)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dannym@scratchpost.org>) id 1hEOPa-0001Yq-HH
	for guix-devel@gnu.org; Wed, 10 Apr 2019 21:21:56 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dannym@scratchpost.org>) id 1hEOPW-0002oN-7w
	for guix-devel@gnu.org; Wed, 10 Apr 2019 21:21:54 -0400
Received: from dd26836.kasserver.com ([85.13.145.193]:55488)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <dannym@scratchpost.org>)
	id 1hEOPS-0002hz-Ls
	for guix-devel@gnu.org; Wed, 10 Apr 2019 21:21:48 -0400
In-Reply-To: <87o95jlyo0.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."
	<guix-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guix-devel>,
	<mailto:guix-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/guix-devel/>
List-Post: <mailto:guix-devel@gnu.org>
List-Help: <mailto:guix-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guix-devel>,
	<mailto:guix-devel-request@gnu.org?subject=subscribe>
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: "Guix-devel" <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
To: Katherine Cox-Buday <cox.katherine.e@gmail.com>
Cc: guix-devel@gnu.org

--Sig_/Hw+JZ=+eqgAOa++D=sSND6e
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

Hi,

On Sat, 06 Apr 2019 14:30:07 -0500
Katherine Cox-Buday <cox.katherine.e@gmail.com> wrote:

> "system layer" which is responding to the many complicated signals
> coming into a system from thing happening (e.g. networks becoming
> available/unavailable, VPNs mucking with DNS and routing tables, etc.).
> He characterizes systemd and things like it as something that lives
> between kernel-space and user-space.
>
> It really opened my eyes to why something like systemd exists rather
> than sticking with the old-style init systems.

There are some init system approaches:

* Ideally, there would be /etc/inittab, and only /etc/inittab .
Dependency resolution (if any) would be done in a shotgun approach (restart
services that break, so eventually the whole thing works--it just does
a lot of service restarts in the beginning.  Furthermore,
it backs off on restarting so it doesn't restart services too quickly.
It doesn't need pid files or anything like that--it just
remembers what was started and what terminated on its own :P).
That is clearly the "worse is better" (New Jersey) approach: Have
something shitty but simple which does the minimal thing.  That's it.
SysV init supports that but unfortunately adds to it in the same
process.  Why?
If you want to start a service supervisor, just invoke it in
/etc/inittab - it's fine!
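
To make the shotgun approach concrete, here is a minimal sketch in
Python.  It is not how any real init implements respawning; the
function names, the 5 second "stayed up long enough" threshold and the
backoff numbers are all made up for illustration.

```python
import subprocess
import time


def backoff_delay(failures, base=1.0, cap=60.0):
    """Seconds to wait before restarting after `failures` quick exits."""
    if failures <= 0:
        return 0.0
    return min(cap, base * 2 ** (failures - 1))


def supervise(argv, max_restarts, min_uptime=5.0, sleep=time.sleep):
    """Run argv and restart it whenever it exits, max_restarts times.

    No pid file is needed: the supervisor is the parent process, so it
    learns directly when the child has terminated.
    Returns the delays that were applied before each restart.
    """
    delays = []
    failures = 0
    while True:
        started = time.monotonic()
        subprocess.run(argv)  # blocks until the child exits
        uptime = time.monotonic() - started
        if len(delays) == max_restarts:
            return delays
        # A child that stayed up long enough resets the backoff.
        failures = 0 if uptime >= min_uptime else failures + 1
        delay = backoff_delay(failures)
        delays.append(delay)
        sleep(delay)
```

A service that keeps dying immediately is restarted after 1, 2, 4, ...
seconds (capped at 60), which is the "does a lot of restarts in the
beginning but backs off" behaviour described above.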

* SysV init: It has a directory (actually one per runlevel) with shell
scripts and those are started, in name order.  No dependency resolution
and no idea of what each shell script does.  No standardized library
either.  No parallelization.  No activation.  That's clearly bad.  I
don't understand how the SysV init aficionados can defend it--that's
not a sane way to manage a large system.
And if you want to manage a small system, use /etc/inittab .
For that matter, if you want to manage a large system, use /etc/inittab and
invoke a service manager in there.

* OpenRC: This init system has simple shell scripts BUT with an
included library (also in shell) for common things, including *very
good* dependency resolution, parallel startup, and activation.  That's
all.  It just works.  Still one big conflation of init and supervisor
in one process.

* Shepherd: This init system is pretty similar to OpenRC.  There are a few
bootstrapping cases.  For example, in the beginning, the system Shepherd
cannot log to syslog because it didn't start syslog (yet?).

Other init systems do insane things to make this work, including writing
their own builtin syslog daemon or starting syslog themselves
(complicating the management model to be understood by the user) etc.

Shepherd just logs to the kernel log until syslog comes up, IF it comes up
(in which case syslogd will copy the kernel log anyway--or not, shepherd
doesn't care).
That's clearly a simpler solution.

These kinds of cases keep coming up and it could get pretty bad--but I
think together with Guix, Shepherd is nice because the actual set of
shepherd services depends on your "operating-system" form, so the
complicated cases--like conflicting services--usually don't show up
at runtime because the runtime service set doesn't contain extra services
in the first place (started or not).

Also, you can connect to a REPL there at runtime and extend it/inspect it,
so debugging is nice (as opposed to all the others on the list where
debugging is non-existent).

* Systemd: They standardized the service definition which means the usual
ways of managing programs as services are now short three-liners and also
easily understood.  I like that part.  I don't like pretty much everything
else they did.  For example they had *horrible* security bugs like
"User=" setting being a "suggestion".  If it couldn't switch to that
user, it just started that service as root.  Problem?  What problem?
Similarly, a sane system would have a way to distinguish user ids from
user names since a user name "5353" is allowed in UNIX just fine.  They
don't distinguish.  What's more, *clearly* when you write "User=0foo"
you mean "root" /s.  It went on like this for quite some years and the
bugtracker showed that they don't understand security and apparently
can't use C correctly (using C correctly is very difficult) to the point
of Linus Torvalds stopping a main systemd developer from contributing to
the upstream kernel because they broke kernel interfaces.
Then the memory leaks, the complicated source code they have, the
extraneous DIY service reimplementations, the
dbus-interface-*all*-the-things (the usual IPC between kernel and user
space is text files in /sys, not object broker interfaces that
introduce another non-file namespace to manage).  I'm all for
standardizing, but I'd prefer if the people doing the standardizing
were actually knowledgeable in managing large systems securely.

(The first thing in securing something is to cut it down to the
essentials.  Every source code line you can remove from the process,
do remove.  Systemd usually does the opposite--although lately they
at least have separate processes for their NIH services.  The latter
doesn't help when it's still monolithic in practise)
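
For the user setting complained about above, the failure mode I'd
expect looks roughly like the following Python sketch.  This is not
systemd's code; resolve_user and its "lookup" parameter are
hypothetical names, and treating the value strictly as a name is just
one possible policy.

```python
import pwd


def resolve_user(value, lookup=pwd.getpwnam):
    """Map a configured user *name* to a uid, or raise.

    The value is always treated as a name, never as a number, so a
    user literally called "5353" is looked up like any other name.
    A numeric id would need separate, explicit syntax (not shown).
    """
    if not value:
        raise ValueError("empty user name")
    try:
        return lookup(value).pw_uid
    except KeyError:
        # Refuse to start the service; never fall back to root.
        raise ValueError(f"unknown user {value!r}") from None
```

The point is only that an unknown or malformed user is a hard error:
the service doesn't start at all rather than starting as root.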

> Does Shepherd take the stance that it is, or is to become a "system
> layer"?

I don't think so.  It's not really set in stone, but why would it do that?

The whole idea is misguided.  There's nothing stopping a user service
from providing a character special device (see cuse, uio) and providing the
same interface the kernel would have.  Also, if the kernel people actually
looked at Plan 9 they'd have more stuff as plain read-write text files in
/sys instead of weird ioctl or netlink sockets (or instead of sockets in
general, I guess).  Those could be easily provided even by shell scripts
(by bind mounting something to /sys/foo/bar).

That's not what they propose, though.  Now they want the kernel to provide
dbus services, coupling everything with everything else, intermingling
complicated state.

> If so, one of the criticisms he has for systemd is that instead of
> pulling in protocols for things (e.g. DNS), and allowing best-of-breed
> software to handle the implementation, it has pulled in the
> responsibility for implementation as well. Any thoughts on that?

Yes, most of the time it's the not-invented-here syndrome in systemd.

Programs that are maintained for many years usually have very few bugs,
are slowly adapted to cover all reasonable use cases and the authors
get an intimate understanding of the problem domain and use that
understanding to make the design of the program match it well.

Replacing syslogd that has like >30 years of experience in it by your
own haphazard implementation seems... ill-advised.

systemd has integrated DNS, time server, hostname, readahead, /dev file
generator, quota support, backlight, timezone support, cgroup support,
cron, inetd, log rotation etcetcetc (stop it already).

Binary log files.  Why?  Anything wrong with compression?  Or filters?
What if they get corrupted (for example on an unexpected shutdown, one of
the *major* reasons for having log files in the first place)?

That said, large systems have other challenges and sometimes a more
complicated supervisor is necessary.

For example in order to have LTSP client support, your X server has to
do XDMCP.  For that to find your other hosts, your network has to be
set up and for Kerberos (login) to work your time synchronization has
to work.
In order to do time synchronization your network should be set up (in
order to use NTP).  In order to log correctly, syslog requires
time synchronization, so the network should be set up, so network setup
cannot log to syslog.  On the other hand, maybe you can have your
dhcp server give you the initial time, then you don't need the other
one so early (see
https://tools.ietf.org/id/draft-ogud-dhc-udp-time-option-00.html ).
Secure DNS will fail if the time is wrong, so if NTP tries to
resolve the host by name, that lookup will fail as well.
Sometimes the hostname comes from DHCP.  But that means syslog can't
use the hostname in local logfiles since otherwise syslog would require
the hostname to be set up first, which would require DHCP, which would
require networking, which would require syslog.

So clearly some kind of dependency resolution has to be available.
Also, trade-offs sometimes have to be made.  If syslog is
configured to log to another machine, you'd prioritize bringing the
network up over logging the network bringup.  If syslog is configured
to log to the local machine, maybe you don't care about accurate
timestamps either, in which case you'd prioritize starting syslog so
that the network bringup gets logged.  If you like accurate timestamps
and local logfiles,
maybe have a simple initial-time getter that you can use without total
network configuration (like getting it from dhcp), then start up syslog
and then start the network and then the periodic time synchronizer.
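
As a sketch of why this needs real dependency resolution, here are the
circular case from above and the last trade-off written down as graphs
in Python, using the standard library's graphlib (Python 3.9+).  The
service names are just labels I picked for this example; no init
system is being quoted here.

```python
from graphlib import CycleError, TopologicalSorter

# service -> services that must be up first (illustrative names)
cyclic = {
    "syslog": {"hostname"},
    "hostname": {"dhcp"},
    "dhcp": {"network"},
    "network": {"syslog"},
}

# Trade-off: accurate timestamps and local logfiles.
resolved = {
    "initial-time": set(),
    "syslog": {"initial-time"},
    "network": {"syslog"},
    "time-sync": {"network"},
}


def start_order(requires):
    """Return a start order, or None if the requirements are circular."""
    try:
        return list(TopologicalSorter(requires).static_order())
    except CycleError:
        return None
```

start_order(cyclic) gives None because there is no valid order, while
start_order(resolved) gives initial-time, syslog, network, time-sync.
Which edge to cut in order to break the cycle is exactly the goal that
the admin would have to be able to specify.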

As you can see, what one needs is a way to specify actual goals.  Nobody
I know of does that or even allows specifying it.

Also, in a good design, the dhcp client service in shepherd would
provide "dhcp-client" regardless of the actual client used so you can
replace it easily by another implementation.

Calling it "dhcpcd" or something would be ill-advised (having a comment
say so is OK, but just "provides '(dhcpcd)" is ill-advised).

Shepherd allows "provides '(dhcpcd dhcp-client)" for a shepherd service
which is nice.

Both "requires" and "provides" would list actual features the service
requires or provides, usually not just one thing.

For example, dnsmasq has both dns and dhcp server support (!), so its
"provides" entry would be "provides '(dns-server dhcp-server)" and
if someone would only require dns-server, shepherd would start up
dnsmasq anyway.

Likewise, "ntp-client" would provide "time" and "ntp-client", because that's
what it does.  It's an ntp client and in the end it will provide the time.

You could also provide time another way and it would be just as well.

So this part of the design of Shepherd is very nice, although our
dnsmasq service doesn't do the above for some reason.
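
A rough model of that provides/requires lookup, in Python rather than
in Shepherd's own Scheme, and not using Shepherd's API at all: the
SERVICES table, the "ntpd" entry and its requirement on dhcp-client
are assumptions made up for this sketch.

```python
# Hypothetical service table; feature names follow the examples above.
SERVICES = {
    "dnsmasq": {"provides": ["dnsmasq", "dns-server", "dhcp-server"],
                "requires": []},
    "dhcpcd": {"provides": ["dhcpcd", "dhcp-client"],
               "requires": []},
    "ntpd": {"provides": ["ntp-client", "time"],
             "requires": ["dhcp-client"]},
}


def provider_of(feature):
    """Name of the first service whose "provides" lists the feature."""
    for name, svc in SERVICES.items():
        if feature in svc["provides"]:
            return name
    raise LookupError(f"nothing provides {feature!r}")


def start(feature, running=None):
    """Start whatever provides `feature`, requirements first.

    Returns the service names in the order they were started.
    There is no cycle detection here; this is only a sketch.
    """
    running = [] if running is None else running
    name = provider_of(feature)
    if name not in running:
        for needed in SERVICES[name]["requires"]:
            start(needed, running)
        running.append(name)
    return running
```

Asking for "dns-server" starts dnsmasq, and asking for "time" starts
dhcpcd and then ntpd.  Swapping dhcpcd for another client only means
changing the table entry that provides "dhcp-client"; nothing that
requires it has to change.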

In any case, service management is complicated nowadays and modularization
has downsides, among others dependency resolution.  Same here.

Then there is the problem of when to start up services.  If your
init system has every service known to man in its service list
(Guix doesn't), you probably don't want to start them all at once.
It's nice to have a facility to start services on demand.
With network services, this is possible using socket activation where
systemd will provide a dummy (inet or unix) socket, and, ONLY once
someone connects to it, actually start up the real handler.
Same principle for dbus service activation, but the dbus-daemon does it.
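
Stripped down to the principle, that looks like the following Python
sketch.  It runs in a single process for brevity, so start_handler
here only stands in for actually launching the real service; it is not
how systemd does it, and serve_on_demand and its parameters are
hypothetical names.

```python
import socket


def serve_on_demand(listener, start_handler, max_connections):
    """Accept on an already-listening socket; start the real handler
    only when the first client actually connects.

    start_handler() is called at most once and must return a function
    that takes the connected socket.
    """
    handler = None
    for _ in range(max_connections):
        conn, _addr = listener.accept()  # blocks; nothing started yet
        if handler is None:
            handler = start_handler()    # lazy start on first use
        with conn:
            handler(conn)
```

The listening socket exists from the beginning, so clients can connect
at any time, but the cost of starting the service is only paid once
somebody does.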

But you'd have the same effect if you had swap space and just let the
kernel swap out things which sit there doing nothing.  If you access the
socket, boom it gets swapped in again and continues.  Furthermore,
modern programs already lazy-load libraries and functions in those
libraries and lazy-allocate memory, so it's not wasting any time or
memory either.  What exactly is socket activation trying to solve?
If you want to have a load balancer, use xinetd or... a load balancer.

As for the video, "It's buggy"... "It's software".  Well, yeah, but why
not wait until the bugs are mostly gone before integrating it into all
the distributions?  What's the rush?  Also, I think it's a cultural
problem that they don't understand security.  So it's not just buggy,
but the bugs are *careless*.  If you make the failure modes nice, a buggy
program won't crash your entire system or make it unusable.

I do use systemd.  At the university I work at (among other things)
we have a lot of systemd-using operating systems installed and many of
those act up in weird ways.  For example the user services touted in
the video hang your entire login--you can't get to the desktop--if one
of those user services hangs.  (That's a quite recent version).
I mean how did they even manage to do that?  That's what you get for
tightly coupling things.  Now the whole system is brittle.  Sheesh.
That's what you get for making systemd [both modular and] monolithic.

Sometimes dbus proxies from the user bus to the session bus, sometimes
it doesn't.  Seems to depend on the phase of the moon.  It differs
on the same machine without configuration changes, from restart to
restart.  And they want to bring that into the kernel.  Yeah, right.

See also http://skarnet.org/software/s6/systemd.html .

In any case, I think init should be as minimal as possible and not
include all those funny things in the first place.  System configuration
and init are *not* the same.  Why is it in the same executable, one
which will bring the system down if it crashes?  That makes no sense.

--Sig_/Hw+JZ=+eqgAOa++D=sSND6e
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAlyulo8ACgkQ5xo1VCww
uqVgoAf+MVbx7VKsJc6/YTOmRMn89pBU1yCz9QqMUSSBsjXsLQ8+4isv/hefFe95
HvegieE5AwN+lFcveRAf7x6cto+p6Y/YluL058XAShveUS5kLrFnSePc6AagsaAo
uh9nd1S0MkiWmzC3x1aouSmK2BWOTbnTVfmndWsHhTc8XIkt5K5Me5Ems7ww9ysE
EJ3q3eH/lhEoLSR8nBul8uQ++52dNgBpRM2mPBUajlqA1BGL74xK6Dpm4BUeQ73c
7zIkAtqxvtLvSpo6+3cNt+n3FdFZkXg+/89z8a5ww+oXakbmoCFP3zgk1MgHqDao
AChmFBMIdYnntb7lY9k/J8A6R1jwjg==
=nsat
-----END PGP SIGNATURE-----

--Sig_/Hw+JZ=+eqgAOa++D=sSND6e--