\input texinfo @c -*- texinfo -*-
@c shepherd.texi -- The documentation in Texinfo format.
@documentencoding UTF-8
@setfilename shepherd.info
@settitle The GNU Shepherd Manual
@include version.texi
@set OLD-YEARS 2002, 2003
@set NEW-YEARS 2013, 2016
@copying
Copyright @copyright{} @value{OLD-YEARS} Wolfgang J@"ahrling@*
Copyright @copyright{} @value{NEW-YEARS} Ludovic Courtès
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the section entitled ``GNU Free
Documentation License''.
@end copying
@dircategory System software
@direntry
* shepherd: (shepherd). The Shepherd service manager.
* herd: (shepherd)Invoking herd
Controlling the Shepherd service manager.
* reboot: (shepherd)Invoking reboot
Rebooting a Shepherd-controlled system.
* halt: (shepherd)Invoking halt
Turning off a Shepherd-controlled system.
@end direntry
@titlepage
@title The GNU Shepherd Manual
@subtitle For use with the GNU Shepherd @value{VERSION}
@subtitle Last updated @value{UPDATED}
@author Wolfgang J@"ahrling
@author Ludovic Courtès
@insertcopying
@end titlepage
@contents
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ifnottex
@node Top
@top The GNU Shepherd Manual
This manual documents the GNU@tie{}Shepherd version @value{VERSION}, a
service manager for the GNU system.
@menu
* Introduction:: Introduction to the Shepherd service manager.
* Jump Start:: How to do simple things with the Shepherd.
* herd and shepherd:: User interface to service management.
* Services:: Details on services.
* Runlevels:: Details on runlevels.
* Misc Facilities:: Generally useful things provided by the Shepherd.
* Internals:: Hacking shepherd.
* GNU Free Documentation License:: The license of this manual.
* Concept Index::
* Procedure and Macro Index::
* Variable Index::
* Type Index::
@end menu
@end ifnottex
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Introduction
@chapter Introduction
@cindex service manager
This manual documents the GNU@tie{}Daemon Shepherd, or GNU@tie{}Shepherd
for short. The Shepherd looks after system services, typically @dfn{daemons}.
It is used to start and stop them in a reliable
fashion. For instance it will dynamically determine and start any
other services that our desired service depends upon. As another
example, the Shepherd might detect conflicts among services. In this
situation it would simply prevent the conflicting services from
running concurrently.
The Shepherd is the @dfn{init system} of the GNU operating system---it is the
first user process that gets started, typically with PID 1, and runs
as @code{root}. Normally the purpose of init systems is to manage all
system-wide services, but the Shepherd can also be a useful tool assisting
unprivileged users in the management of their own daemons.
Flexible software requires some time to master and
the Shepherd is no different. But don't worry: this manual should allow you to
get started quickly. Its first chapter is designed as a practical
introduction to the Shepherd and should be all you need for everyday use
(@pxref{Jump Start}). In chapter two we will describe the
@command{herd} and @command{shepherd} programs, and their relationship, in
more detail (@pxref{herd and shepherd}). Subsequent chapters provide a full
reference manual and plenty of examples, covering all of Shepherd's
capabilities. Finally, the last chapter provides information for
those souls brave enough to hack the Shepherd itself.
@cindex dmd
The Shepherd was formerly known as ``dmd'', which stands for @dfn{Daemon
Managing Daemons} (or @dfn{Daemons-Managing Daemon}?).
@cindex Guile
@cindex Scheme
@cindex GOOPS
This program is written in Guile, an implementation of the
Scheme programming language, using the GOOPS extension for
object-orientation. Guile is also the Shepherd's configuration language.
@xref{Introduction,,, guile, GNU Guile Reference Manual}, for an
introduction to Guile. We have tried to
make the Shepherd's basic features as accessible as possible---you should be
able to use these even if you do not know how to program in Scheme. A
basic grasp of Guile and GOOPS is required only if you wish to make
use of the Shepherd's more advanced features.
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Jump Start
@chapter Jump Start
@cindex prefix
This chapter gives a short overview of the Shepherd. It is enough if you just
need its basic features. Because we do not assume that readers are
familiar with all the issues involved, very experienced users might
be annoyed by the often very detailed descriptions in this
introduction. Those users are encouraged to skip to the
reference section.
Note that all the full file names in the following text are based on
the assumption that you have installed the Shepherd with an empty prefix. If
your Shepherd installation resides in @code{/usr/local}, for example,
add this directory name in front of the absolute file names
mentioned below.
@cindex Configuration file
When @command{shepherd} gets started, it reads and evaluates a
configuration file. When it is started with superuser privileges, it
tries to use @code{/etc/shepherd.scm}. When started as a normal user, it
looks for a file called @code{$XDG_CONFIG_HOME/shepherd/init.scm}. If
the @code{XDG_CONFIG_HOME} environment variable is not defined,
@code{$HOME/.config/shepherd/init.scm} is used instead. With the option
@code{--config} (or, for short, @code{-c}), you can specify where to
look instead. So if you want to start @command{shepherd} with an
alternative file, use one of the following commands:
@example
shepherd --config=/etc/shepherd.scm.old
shepherd -c /etc/shepherd.scm.old
@end example
@cindex Starting a service
As the final ``d'' suggests, @command{shepherd} is just
a daemon that (usually) runs in the
background, so you will not interact with it directly. After it is
started, @command{shepherd} will listen on a socket special file, usually
@code{/var/run/shepherd/socket}, for further commands. You use the tool
@dfn{herd} to send these commands to @command{shepherd}. Usage of herd is simple and
straightforward: To start a service called @code{apache}, you use:
@example
herd start apache
@end example
@cindex Status (of services)
@cindex Service status
When you do this, all its dependencies will get resolved. For
example, a webserver is quite likely to depend on working networking,
thus it will depend on a service called @code{networking}. So if you
want to start @code{apache}, and @code{networking} is not yet running, it
will automatically be started as well. The current status of all the
services defined in the configuration file can be queried like this:
@example
herd status
@end example
@noindent
Or, to get additional details about each service, run:
@example
herd detailed-status
@end example
@noindent
In this example, this would show the @code{networking} and @code{apache}
services as started. If you just want to know the status of the
@code{apache} service, run:
@example
herd status apache
@end example
@cindex Stopping a service
When you stop a service, all the services that depend on it are
stopped as well.
Using the example above, if you stop @code{networking}, the service
@code{apache} will be stopped as well---which makes perfect sense,
as it cannot work without the network being up. To actually stop a
service, you use the following, probably not very surprising, command:
@example
herd stop networking
@end example
There are two more actions you can perform on every service: the
actions @code{enable} and @code{disable} are used to allow and prevent,
respectively, the starting of a particular service. If a service is
intended to be
restarted whenever it terminates (how this can be done will not be
covered in this introduction), but it is respawning too often in a
short period of time (by default 5 times in 5 seconds), it will
automatically be disabled. After you have fixed the problem that
caused it to respawn too fast, you can start it again with
the commands:
@example
herd enable foo
herd start foo
@end example
@cindex virtual services
@cindex fallback services
But there is far more you can do than just that. Services can
depend not only on other services, but also on
@emph{virtual} services. A virtual service is a name that one or
more services provide in addition to their own. For instance, a service
called @code{exim} might provide the virtual service
@code{mailer-daemon}. That could as well be provided by a service
called @code{smail}, as both are mailer-daemons. If a service needs
any mailer-daemon, no matter which one, it can just depend on
@code{mailer-daemon}, and one of those that provide it gets started (if
none is running yet) when resolving dependencies. The nice thing is
that, if trying to start one of them fails, @command{shepherd} will go on and try to
start the next one, so you can also use virtual services for
specifying @emph{fallbacks}.
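
As a minimal sketch of how this might look in a configuration file,
the two services below both provide @code{mailer-daemon} in addition
to their canonical names, and a third one depends only on the virtual
name. The @code{fetchmail} service and the placeholder @code{start}
procedures are assumptions made for illustration; they are not part of
the Shepherd (@pxref{Slots of services}, for the meaning of the
keywords).

@example
;; Illustrative only: the `start' procedures are placeholders that
;; pretend to succeed instead of launching a real mail daemon.
(define exim
  (make <service>
    #:provides '(exim mailer-daemon)   ;canonical name comes first
    #:start (lambda args #t)))

(define smail
  (make <service>
    #:provides '(smail mailer-daemon)
    #:start (lambda args #t)))

;; Hypothetical consumer: it needs some mailer, no matter which one.
(define fetchmail
  (make <service>
    #:provides '(fetchmail)
    #:requires '(mailer-daemon)
    #:start (lambda args #t)))

(register-services exim smail fetchmail)
@end example

With definitions along these lines, starting @code{fetchmail} should
start one provider of @code{mailer-daemon} first, falling back to the
other one if that attempt fails.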
In addition to all that, you can perform service-specific actions.
Coming back to our original example, @code{apache} is able to
reload its modules, therefore the action @code{reload-modules} might
be available:
@example
herd reload-modules apache
@end example
The service-specific actions can only be used when the service is
started, i.e. the only thing you can do to a stopped service is
to start it. An exception exists, see below. (If at some
point you find this too restrictive because you want to use variants of
the same service which are started in different ways, consider using
different services for those variants instead, which all provide the
same virtual service and thus conflict with each other, if this is
desired. That's one of the reasons why virtual services exist, after
all.)
There are two actions which are special, because even if services
can implement them on their own, a default implementation is provided
by @command{shepherd} (another reason why they are special is that the default
implementations can be called even when the service is not running;
this inconsistency is just to make it more intuitive to get
information about the status of a service, see below).
These actions are @code{restart} and @code{status}. The default
implementation of @code{restart} calls @code{stop} and @code{start} on
the affected service, in that order; the @code{status} action displays some
general information about the service, like what it provides, what it
depends on and with which other services it conflicts (because they
provide a virtual service that is also provided by that particular
service).
Another special action is @code{list-actions}, which displays a list
of the additional actions a service provides; obviously, it can also
be called when the service is not running. Services cannot provide
their own implementation of @code{list-actions}.
A special service is @code{root}, which is used for controlling the
Shepherd itself. You can also refer to this service as
@code{shepherd}. It implements various actions. For example, the
@code{status} action displays which services are started and which ones
are stopped, whereas @code{detailed-status} has the effect of applying
the default implementation of @code{status} to all services one after
another. The @code{load} action is unusual insofar as it shows a
feature that is actually available to all services, but which we have
not seen yet: It takes an additional argument. You can use @code{load}
to load arbitrary code into the Shepherd at runtime, like this:
@example
herd load shepherd ~/additional-services.scm
@end example
That is enough about the @command{herd} and @command{shepherd} programs; we
will now take a look at how to configure the Shepherd. In the configuration
file, we mainly need to define services. We can also do
various other things there, like starting a few services right away.
FIXME: Finish. For now, look at the @code{examples/} subdirectory.
@example
...
@end example
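
As a minimal sketch of a configuration file for the @code{apache} and
@code{networking} services used in this chapter, one might write
something like the following. The external commands (@code{ifup},
@code{ifdown} and @code{apachectl}) and the interface name
@code{eth0} are assumptions about the host system, not something the
Shepherd provides; the keywords are described in detail later
(@pxref{Slots of services}).

@example
;; Sketch of a configuration file.  The external commands are
;; assumptions about the host system; adapt them to yours.
(define networking
  (make <service>
    #:provides '(networking)
    ;; `start' must return #f on failure; `system*' returns a
    ;; status value that is zero when the command succeeded.
    #:start (lambda args
              (zero? (system* "ifup" "eth0")))
    ;; `stop' returns #f so that the service can be started again.
    #:stop (lambda (running . args)
             (system* "ifdown" "eth0")
             #f)))

(define apache
  (make <service>
    #:provides '(apache)
    #:requires '(networking)
    #:start (lambda args
              (zero? (system* "apachectl" "start")))
    #:stop (lambda (running . args)
             (system* "apachectl" "stop")
             #f)))

;; Make the services known to shepherd, then start one right away;
;; starting `apache' also starts `networking', which it requires.
(register-services networking apache)
(start apache)
@end example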
Ok, to summarize:
@itemize @bullet
@item
@command{shepherd} is a daemon, @command{herd} the program that controls it.
@item
You can start, stop, restart, enable and disable every service, as
well as display its status.
@item
You can perform additional service-specific actions, which you can
also list.
@item
Actions can have arguments.
@item
You can display the status of a service, even if the service does not
provide a specific implementation for this action. The same is true
for restarting.
@item
The @code{root}/@code{shepherd} service is used to control
@command{shepherd} itself.
@end itemize
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node herd and shepherd
@chapter @command{herd} and @command{shepherd}
@cindex herd
@cindex shepherd
@cindex daemon
@cindex daemon controller
@cindex relative file names
@cindex herding, of daemons
The daemon that runs in the background and is responsible for
controlling the services is @command{shepherd}, while the user interface
tool is called @command{herd}: it's the command that allows you to
actually @emph{herd} your daemons@footnote{
@cindex deco, daemon controller
In the past, when the
GNU@tie{}Shepherd was known as GNU@tie{}dmd, the @command{herd} command
was called @code{deco}, for @dfn{DaEmon COntroller}.}. To perform an
action, like stopping a service or calling an action of a service, you
use the herd program. It will communicate with shepherd over a Unix
Domain Socket.
Thus, you start @command{shepherd} once, and then always use herd whenever you want
to do something service-related. Since herd passes its current
working directory to @command{shepherd}, you can pass relative file names without
trouble. Both @command{shepherd} and herd understand the standard arguments
@code{--help}, @code{--version} and @code{--usage}.
@menu
* Invoking shepherd:: How to start the service daemon.
* Invoking herd:: Controlling daemons.
* Invoking reboot:: Rebooting a shepherd-controlled system.
* Invoking halt:: Turning off a shepherd-controlled system.
@end menu
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Invoking shepherd
@section Invoking @command{shepherd}
@cindex @command{shepherd} Invocation
@cindex invoking @command{shepherd}
The @code{shepherd} program has the following synopsis:
@example
shepherd [@var{option}@dots{}]
@end example
It accepts the following options:
@table @samp
@item -c @var{file}
@itemx --config=@var{file}
Read and evaluate @var{file} as the configuration script on startup.
@var{file} is evaluated in the context of a fresh module where bindings
from the @code{(shepherd service)} module and Guile's @code{(oop goops)} are
available, in addition to the default set of Guile bindings. In
particular, this means that code in @var{file} may use
@code{register-services}, the @code{<service>} class, and related tools
(@pxref{Services}).
@item -I
@itemx --insecure
@cindex security
@cindex insecure
Do not check if the directory where the socket---our communication
rendez-vous with @command{herd}---is located has permissions @code{700}.
If this option is not specified, @command{shepherd} will abort if the
permissions are not as expected.
@item -l [@var{file}]
@itemx --logfile[=@var{file}]
@cindex logging
@cindex log file
Log output into @var{file}, or if @var{file} is not given,
@code{/var/log/shepherd.log} when running as superuser,
@code{$XDG_CONFIG_HOME/shepherd/shepherd.log} otherwise.
@item --pid[=@var{file}]
When @command{shepherd} is ready to accept connections, write its PID to @var{file} or
to the standard output if @var{file} is omitted.
@item -p [@var{file}]
@itemx --persistency[=@var{file}]
@c FIXME-CRITICAL
@item -s @var{file}
@itemx --socket=@var{file}
@cindex socket special file
Receive further commands on the socket special file @var{file}. If
this option is not specified, @file{@var{localstatedir}/run/shepherd/socket} is
taken.
If @code{-} is specified as file name, commands will be read from
standard input, one per line, as would be passed on a @command{herd}
command line (@pxref{Invoking herd}).
@item --quiet
Synonym for @code{--silent}.
@end table
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Invoking herd
@section Invoking herd
@cindex herd
The @command{herd} command is a generic client program to control a
running instance of @command{shepherd} (@pxref{Invoking shepherd}). It has the
following synopsis:
@example
herd [@var{option}@dots{}] @var{action} [@var{service} [@var{arg}@dots{}]]
@end example
It causes the @var{action} of the @var{service} to be invoked. When
@var{service} is omitted and @var{action} is @code{status} or
@code{detailed-status}, the @code{root} service is used@footnote{This
shorthand does not work for other actions such as @code{stop}, because
inadvertently typing @code{herd stop} would stop all the services, which
could be pretty annoying.} (@pxref{The root and unknown services}, for
more information on the @code{root} service.)
For each action, you should pass the appropriate @var{arg}s. Actions
that are available for every service are @code{start}, @code{stop},
@code{restart}, @code{status}, @code{enable}, @code{disable}, and
@code{doc}.
If you pass a file name as an @var{arg}, it will be passed as-is to
the Shepherd, thus if it is not an absolute name, it is local to the current
working directory of @command{shepherd}, not to herd.
The @code{herd} command understands the following option:
@table @samp
@item -s @var{file}
@itemx --socket=@var{file}
Send commands to the socket special file @var{file}. If this option is
not specified, @file{@var{localstatedir}/run/shepherd/socket} is taken.
@end table
The @code{herd} command returns zero on success, and a non-zero exit
code on failure. In particular, it returns a non-zero exit code when
@var{action} or @var{service} does not exist and when the given action
failed.
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Invoking reboot
@section Invoking reboot
@cindex reboot
The @command{reboot} command is a convenience client program to instruct
the Shepherd (when used as an init system) to stop all running services and
reboot the system. It has the following synopsis:
@example
reboot [@var{option}@dots{}]
@end example
It is equivalent to running @command{herd stop shepherd}. The
@code{reboot} command understands the following option:
@table @samp
@item -s @var{file}
@itemx --socket=@var{file}
Send commands to the socket special file @var{file}. If this option is
not specified, @file{@var{localstatedir}/run/shepherd/socket} is taken.
@end table
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Invoking halt
@section Invoking halt
@cindex halt
The @command{halt} command is a convenience client program to instruct
the Shepherd (when used as an init system) to stop all running services and turn
off the system. It has the following synopsis:
@example
halt [@var{option}@dots{}]
@end example
It is equivalent to running @command{herd power-off shepherd}. As
usual, the @code{halt} command understands the following option:
@table @samp
@item -s @var{file}
@itemx --socket=@var{file}
Send commands to the socket special file @var{file}. If this option is
not specified, @file{@var{localstatedir}/run/shepherd/socket} is taken.
@end table
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Services
@chapter Services
@cindex service
@tindex <service>
The @dfn{service} is obviously a very important concept of the Shepherd. On the
Guile level, a service is represented as an instance of
@code{<service>}, a GOOPS class (@pxref{GOOPS,,, guile, GNU Guile
Reference Manual}). When creating an instance of it, you can specify
the initial values of its slots, and you actually must do this for some
of the slots.
The @code{<service>} class and its associated procedures and methods are
defined in the @code{(shepherd service)} module.
@menu
* Slots of services:: What a <service> object consists of.
* Methods of services:: What you can do with a <service> object.
* Service Convenience:: How to conveniently work with services.
* Service De- and Constructors:: Commonly used ways of starting and
stopping services.
* Service Examples:: Examples that show how services are used.
* The root and unknown services:: Special services in the Shepherd.
@end menu
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Slots of services
@section Slots of services
@cindex <service>, slots of
@cindex slots of <service>
A service has the following slots, all of which can be initialized
with a keyword (e.g. @code{#:provides}, used when creating the object)
of the same name, except where stated otherwise. Usually you should not
access them directly with @code{slot-ref} or @code{slot-set!};
use the methods of the service class instead (@pxref{Methods of
services}).
@itemize @bullet
@item
@vindex provides (slot of <service>)
@cindex canonical names of services
@code{provides} is a list of symbols that are provided by the service.
A symbol can only be provided by one service running at a time,
i.e. if two services provide the same symbol, only one of them can
run; starting the other one will fail. Therefore, these symbols are
mainly used to denote conflicting services. The first symbol in the
list is the canonical name for the service, thus it must be unique.
This slot has no default value and must therefore be initialized.
@item
@vindex requires (slot of <service>)
@code{requires} is, like @code{provides}, a list of symbols that
specify services. In this case, they name what this service depends
on, i.e. before the service can be started, services that provide
those symbols must be started. If a required symbol is provided by
several services, one will be started. By default, this slot
contains the empty list.
@item
@vindex running (slot of <service>)
@cindex Hook for individual services
@code{running} is a hook that can be used by each service in its own
way. The default value is @code{#f}, which indicates that the service
is not running. When an attempt is made to start the service, it will
be set to the return value of the procedure in the @code{start} slot.
It will also be passed as an argument to the procedure in the
@code{stop} slot. This slot cannot be initialized with a keyword.
@item
@vindex respawn? (slot of <service>)
@cindex Respawning services
@code{respawn?} specifies whether the service should be respawned by
the Shepherd. If this slot has the value @code{#t}, then assume the
@code{running} slot specifies a child process PID and restart the
service if that process terminates. Otherwise this slot is @code{#f},
which is the default. See also the @code{last-respawns} slot.
@item
@vindex start (slot of <service>)
@cindex Starting a service
@cindex Service constructor
@code{start} contains the ``constructor'' for the service, which will
be called to start the service. (Therefore, it is not a constructor
in the sense that it initializes the slots of a @code{<service>}
object.) This must be a procedure that accepts any number of
arguments, which will be the additional arguments supplied by the
user. If the starting attempt failed, it must return @code{#f}. The
value will be stored in the @code{running} slot. The default value is
a procedure that returns @code{#t} and performs no further actions,
so you will usually want to specify a different one.
@item
@vindex stop (slot of <service>)
@cindex Stopping a service
@cindex Service destructor
@code{stop} is, similar to @code{start}, a slot containing a
procedure. But in this case, it gets the current value of the
@code{running} slot as first argument and the user-supplied arguments
as further arguments; it gets called to stop the service. Its return
value will again be stored in the @code{running} slot, so
it should return @code{#f} if the service can now be started again at
a later point. The default value is a procedure that
returns @code{#f} and performs no further actions.
@item
@vindex actions (slot of <service>)
@cindex Actions of services
@cindex Service actions
@code{actions} specifies the additional actions that can be performed
on a service when it is running. A typical example of this is the
@code{restart} action. The macro @code{make-actions} (@pxref{Service
Convenience}) is provided to abstract the actual data representation
format for this slot. (Currently it is actually a hash.)
@item
@vindex enabled? (slot of <service>)
@code{enabled?} cannot be initialized with a keyword, and contains
@code{#t} by default. When the value becomes @code{#f} at some point,
this will prevent the service from getting started. A service can be
enabled and disabled with the methods @code{enable} and
@code{disable}, respectively (@pxref{Methods of services}).
@item
@vindex last-respawns (slot of <service>)
@code{last-respawns} cannot be initialized with a keyword and is only
ever used when the @code{respawn?} slot contains @code{#t}; it is a
circular list with @code{(car respawn-limit)} elements, where each
element contains a time at which the service was restarted: initially
all 0, later a time in seconds since the Epoch. The first element
holds the oldest time, the last one the newest.
@item
@vindex stop-delay? (slot of <service>)
@code{stop-delay?} being false causes the @code{stop} slot to be
unused; instead, stopping the service will just cause the
@code{waiting-for-termination?} slot to be set to @code{#t}.
@item
@vindex waiting-for-termination? (slot of <service>)
@code{waiting-for-termination?} cannot be initialized with a keyword
and should not be used by others; it is only used internally for
respawnable services when the @code{stop-delay?} slot contains a true
value. @code{waiting-for-termination?} contains @code{#t} if the
service is still running but the user requested that it be stopped,
in which case the respawn handler will not start it again the next
time it terminates; otherwise it contains @code{#f}.
@end itemize
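The bookkeeping described for @code{last-respawns} can be sketched in
plain Guile. This is only an illustration under stated assumptions, not
the Shepherd's actual code: the names @code{make-respawn-ring},
@code{record-respawn} and @code{respawning-too-fast?} are hypothetical,
and the ring size and time window stand in for the two parts of
@code{respawn-limit}.
@lisp
(use-modules (srfi srfi-1))

;; Hypothetical helper: a circular list of N zeros (N must be >= 1).
(define (make-respawn-ring n)
  (apply circular-list (make-list n 0)))

;; Overwrite the oldest time with NOW and return the ring advanced by
;; one cell, so that its first element is again the oldest time.
(define (record-respawn ring now)
  (set-car! ring now)
  (cdr ring))

;; True when the oldest recorded respawn happened less than SECONDS
;; ago, i.e. the service was respawned N times within SECONDS.
(define (respawning-too-fast? ring now seconds)
  (< (- now (car ring)) seconds))
@end lisp
Because the list is circular, recording a respawn never allocates; the
caller only has to keep the value returned by @code{record-respawn}.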
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Methods of services
@section Methods of services
@deffn {method} start (obj <service>)
Start the service @var{obj}, including all the services it depends on.
It tries quite hard to do this: when a service that provides a
required symbol cannot be started, it will look for another service
that also provides this symbol, until starting one such service
succeeds. There is some room for theoretical improvement here, of
course, but in practice the current strategy already works very well.
This method returns the new value of the @code{running} slot
(@pxref{Slots of services}), which is @code{#f} if the service could not
be started.
@end deffn
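The fallback behaviour of @code{start} can be pictured with a small
plain-Guile sketch. It is a simplified model, not the Shepherd's code:
the name @code{start-first-provider} is hypothetical, and each provider
is assumed to be a pair of a canonical name and a thunk that returns a
true value when starting succeeds.
@lisp
(use-modules (srfi srfi-1))

;; PROVIDERS is a list of (NAME . THUNK) pairs.  Try them in order and
;; return the name of the first one whose thunk succeeds, or #f when
;; every attempt fails.
(define (start-first-provider providers)
  (any (lambda (provider)
         (and ((cdr provider)) (car provider)))
       providers))
@end lisp
Providers after the first successful one are never tried, which matches
the ``until starting one such service succeeds'' wording above.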
@deffn {method} stop (obj <service>)
This will stop the service @var{obj}, trying to stop services that
depend on it first, so they can be shut down cleanly. If this
fails, it will continue anyway. Stopping of services should usually
succeed, though. Otherwise, the behaviour is very similar to the
@code{start} method. The return value is also the new @code{running}
value, thus @code{#f} if the service was stopped.
@end deffn
@deffn {method} action (obj <service>) the-action . args
Calls the action @var{the-action} (a symbol) of the service @var{obj},
with the specified @var{args}, which have a meaning depending on the
particular action.
@end deffn
@deffn {method} conflicts-with (obj <service>)
Returns a list of the canonical names of services that conflict with
the service @var{obj}.
@end deffn
@deffn {method} canonical-name (obj <service>)
Returns the canonical name of @var{obj}, which is the first element of
the @code{provides} list.
@end deffn
@deffn {method} provided-by (obj <service>)
Returns which symbols are provided by @var{obj}.
@end deffn
@deffn {method} required-by (obj <service>)
Returns which symbols are required by @var{obj}.
@end deffn
@deffn {method} running? (obj <service>)
Returns whether the service @var{obj} is running.
@end deffn
@deffn {method} respawn? (obj <service>)
Returns whether the service @var{obj} should be respawned if it
terminates.
@end deffn
@deffn {method} default-display-status (obj <service>)
Display status information about @var{obj}. This method is called
when the user performs the action @code{status} on @var{obj}, but
there is no specific implementation given for it. It is also called
when @code{detailed-status} is applied on the @code{root} service.
@end deffn
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Service Convenience
@section Service Convenience
In addition to the facilities listed below, there are also some
procedures that provide commonly needed constructors and destructors
for services (@pxref{Service De- and Constructors}).
@deffn {procedure} register-services . services
Register all @var{services}, so that they can be taken into account
when trying to resolve dependencies.
@end deffn
@deffn {procedure} lookup-services name
Return a list of all registered services which provide the symbol
@var{name}.
@end deffn
@deffn {macro} make-actions (name proc) ...
This macro is used to create a value for the @code{actions} slot of a
service object (@pxref{Slots of services}). Each @var{name} is a symbol
and each @var{proc} the corresponding procedure that will be called to
perform the action. A @var{proc} has one argument, which will be the
current value of the @code{running} slot of the service.
@end deffn
@deffn {method} start (obj <symbol>)
Start a registered service providing @var{obj}.
@end deffn
@deffn {method} stop (obj <symbol>)
Stop a registered service providing @var{obj}.
@end deffn
@deffn {method} action (obj <symbol>) the-action . args
The same as the @code{action} method of class @code{<service>}, but
uses a service that provides @var{obj} and is running.
@end deffn
@deffn {procedure} for-each-service proc
Call @var{proc}, a procedure taking one argument, once for each
registered service.
@end deffn
@deffn {procedure} find-running services
Check if any of @var{services} is running. If this is the case,
return its canonical name. If not, return @code{#f}. Only the first
one will be returned; this is because this is mainly intended to be
applied on the return value of @code{lookup-services}.
@end deffn
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Service De- and Constructors
@section Service De- and Constructors
@cindex generating constructors
@cindex generating destructors
@cindex constructors, generation of
@cindex destructors, generation of
All of the procedures listed below return procedures generated from
the supplied arguments. These procedures take one argument in the
case of destructors and no arguments in the case of constructors.
@deffn {procedure} make-system-constructor @var{command}@dots{}
The returned procedure will execute @var{command} in a shell and
return @code{#t} if execution was successful, otherwise @code{#f}.
For convenience, it takes multiple arguments which will be
concatenated first.
@end deffn
@deffn {procedure} make-system-destructor @var{command}@dots{}
Similar to @code{make-system-constructor}, but returns @code{#f} if
execution of the @var{command} was successful, @code{#t} if not.
@end deffn
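To illustrate the return-value conventions of these two procedures, here
is a minimal plain-Guile sketch. The names
@code{sketch-system-constructor} and @code{sketch-system-destructor} are
hypothetical stand-ins, and the bodies are an assumption about how such
procedures could be written, not the Shepherd's implementation.
@lisp
;; Concatenate the COMMAND strings, run the result in a shell, and
;; report success as #t.  The returned procedure ignores its arguments.
(define (sketch-system-constructor . command)
  (lambda args
    ;; `system' returns a wait status; compare its exit code with 0.
    (eqv? 0 (status:exit-val (system (apply string-append command))))))

;; Same, but with the result negated: #f means the command succeeded,
;; which is the value a stopped service stores in its `running' slot.
(define (sketch-system-destructor . command)
  (let ((run (apply sketch-system-constructor command)))
    (lambda args
      (not (run)))))
@end lisp
For instance, @code{((sketch-system-constructor "exit " "0"))} runs the
shell command @code{exit 0} and yields @code{#t}.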
@deffn {procedure} make-forkexec-constructor @var{command} @
[#:user #f] @
[#:group #f] @
[#:pid-file #f] @
[#:directory (default-service-directory)] @
[#:environment-variables (default-environment-variables)]
Return a procedure that forks a child process, closes all file
descriptors except the standard output and standard error descriptors, sets
the current directory to @var{directory}, changes the environment to
@var{environment-variables} (using the @code{environ} procedure), sets the
current user to @var{user} and the current group to @var{group} unless they
are @code{#f}, and executes @var{command} (a list of strings). The result of
the procedure will be the PID of the child process.
When @var{pid-file} is true, it must be the name of a PID file
associated with the process being launched; the return value is the PID
read from that file, once that file has been created.
@end deffn
@deffn {procedure} make-kill-destructor [@var{signal}]
Returns a procedure that sends @var{signal} to the pid which it takes
as argument. This @emph{does} work together with respawning services,
because in that case the @code{stop} method of the @code{<service>}
class sets the @code{running} slot to @code{#f} before actually
calling the destructor; if it did not do that, killing the process
in the destructor would immediately respawn the service.
@end deffn
The @code{make-forkexec-constructor} procedure builds upon the following
procedures.
@deffn {procedure} exec-command @var{command} @
[#:user #f] @
[#:group #f] @
[#:directory (default-service-directory)] @
[#:environment-variables (default-environment-variables)]
@deffnx {procedure} fork+exec-command @var{command} @
[#:user #f] @
[#:group #f] @
[#:directory (default-service-directory)] @
[#:environment-variables (default-environment-variables)]
Run @var{command} as the current process from @var{directory}, and with
@var{environment-variables} (a list of strings like @code{"PATH=/bin"}).
File descriptors 1 and 2 are kept as is, whereas file descriptor 0
(standard input) points to @file{/dev/null}; all other file descriptors
are closed prior to yielding control to @var{command}.
By default, @var{command} is run as the current user. If the
@var{user} keyword argument is present and not false, change to
@var{user} immediately before invoking @var{command}. @var{user} may
be a string, indicating a user name, or a number, indicating a user
ID. Likewise, @var{command} will be run under the current group,
unless the @var{group} keyword argument is present and not false.
@code{fork+exec-command} does the same as @code{exec-command}, but in
a separate process whose PID it returns.
@end deffn
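The relationship between the two procedures can be sketched with Guile's
POSIX primitives. This is a deliberately reduced illustration: the name
@code{sketch-fork+exec} is hypothetical, the defaults shown are
assumptions, and the sketch does not change user or group, redirect
standard input, or close other file descriptors as the real procedures
are documented to do.
@lisp
(define* (sketch-fork+exec command
                           #:key
                           (directory (getcwd))
                           (environment-variables (environ)))
  (let ((pid (primitive-fork)))
    (if (zero? pid)
        ;; Child: set up the directory and environment, then replace
        ;; the process image.  COMMAND is a list of strings whose
        ;; first element is the program to run.
        (catch #t
          (lambda ()
            (chdir directory)
            (environ environment-variables)
            (apply execlp (car command) command))
          (lambda _
            ;; exec failed; leave without running the parent's code.
            (primitive-exit 127)))
        ;; Parent: hand back the child's PID.
        pid)))
@end lisp
The parent can later collect the child with @code{(waitpid pid)}.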
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Service Examples
@section Service Examples
FIXME: This needs a lot of work.
You can create a service and then register it this way:
@lisp
(define apache (make <service>
                 #:provides '(apache)
                 #:start (...)
                 #:stop (...)))
(register-services apache)
@end lisp
However, as you usually won't need a variable for the service, you can
pass it directly to @code{register-services}. Here is an example that
also specifies some more initial values for the slots:
@lisp
(register-services
 (make <service>
   #:provides '(apache-2.0 apache httpd)
   #:requires '()
   #:start (...)
   #:stop (...)
   #:actions (make-actions
              (reload-modules (...))
              (restart (...)))))
@end lisp
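As a more complete sketch, the placeholders can be filled in with the
generated constructors and destructors (@pxref{Service De- and
Constructors}). The service name @code{my-daemon}, its command line and
the @code{reload} action are hypothetical; the fragment assumes it is
evaluated in a configuration file and that the program stays in the
foreground and reloads on @code{SIGHUP}.
@lisp
(register-services
 (make <service>
   #:provides '(my-daemon)
   #:requires '()
   #:respawn? #t
   ;; Hypothetical command; its PID ends up in the `running' slot.
   #:start (make-forkexec-constructor
            '("/usr/local/bin/my-daemon" "--foreground"))
   ;; Send the default signal to that PID when stopping.
   #:stop (make-kill-destructor)
   #:actions (make-actions
              (reload (lambda (running)
                        ;; RUNNING is the value of the `running' slot.
                        (kill running SIGHUP)
                        #t)))))
@end lisp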
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node The root and unknown services
@section The @code{root} and @code{unknown} services
@cindex root service
@cindex special services
The service @code{root} is special, because it is used to control the
Shepherd itself. It has an alias @code{shepherd}. It provides the
following actions (in addition to @code{enable}, @code{disable} and
@code{restart} which do not make sense here).
@table @code
@item status
Displays which services are started and which ones are not.
@item detailed-status
Displays detailed information about every registered service.
@item load @var{file}
Evaluate the Scheme code in @var{file} in a fresh module that uses the
@code{(oop goops)} and @code{(shepherd services)} modules---as with the
@code{--config} option of @command{shepherd} (@pxref{Invoking shepherd}).
@item unload @var{service-name}
Attempt to remove the service identified by @var{service-name}.
@command{shepherd} will first stop the service, if necessary, and then
remove it from the list of registered services. Any services
depending upon @var{service-name} will be stopped as part of this
process.
If @var{service-name} simply does not exist, output a warning and do
nothing. If it exists, but is provided by several services, output a
warning and do nothing. This latter case might occur for instance with
the fictional service @code{web-server}, which might be provided by both
@code{apache} and @code{nginx}. If @var{service-name} is the special
string @code{all}, attempt to remove all services except for the Shepherd
itself.
@item reload @var{file-name}
Unload all known optional services using @code{unload}'s @code{all} option,
then load @var{file-name} using @code{load} functionality. If
@var{file-name} does not exist or @code{load} encounters an error, you may
end up with no defined services. As these can be reloaded at a later
stage this is not considered a problem. If the @code{unload} stage
fails, @code{reload} will not attempt to load @var{file-name}.
@item daemonize
Fork and go into the background. This should be called before
respawnable services are started, as otherwise we would not get the
@code{SIGCHLD} signals when they terminate.
@item enable-persistency
When terminating, save the list of running services in a file.
@c FIXME-CRITICAL: How can we specify which one?
@item disable-persistency
Don't save the list of running services when terminating.
@end table
@cindex unknown service
@cindex fallback service
The @code{unknown} service must be defined by the user and, if it
exists, is used as a fallback whenever we try to invoke an unknown
action of an existing service or use a service that does not exist.
This is useful only in a few cases, but enables you to do various sorts
of unusual things.
@c FIXME-CRITICAL: finish
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Runlevels
@chapter Runlevels
RUNLEVELS DO NOT WORK YET! Do not use them! Ignore this section!
@cindex runlevel
@tindex <runlevel>
A @dfn{runlevel} makes it easier to start and stop groups of services,
to bring the system into a certain state. An object of class
@code{<runlevel>} is an abstract runlevel, and has the following
methods:
@deffn {method} enter (rl <runlevel>) services
This will be called when the runlevel should be entered.
@var{services} is the list of the currently running services.
@end deffn
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Misc Facilities
@chapter Misc Facilities
This is a list of facilities which are available to code running
inside of the Shepherd and are considered generally useful, but are not directly
related to any of the other topics covered in this manual.
@menu
* Errors:: Signalling, handling and ignoring errors.
* Communication:: Input/Output in various ways.
* Others:: Stuff that is useful, but is homeless.
@end menu
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Errors
@section Errors
@cindex assertions
@deffn {macro} assert expr
If @var{expr} yields @code{#f}, display an appropriate error
message and throw an @code{assertion-failed} exception.
@end deffn
@deffn {procedure} caught-error key args
Tell the Shepherd that a @var{key} error with @var{args} has occurred. This is
the simplest way to make caught errors result in uniformly formatted
warning messages. The current implementation is not very good,
though.
@end deffn
@deffn {procedure} call/cc proc
An alias for @code{call-with-current-continuation}.
@end deffn
@deffn {procedure} call/ec proc
A simplistic implementation of the nonstandard, but popular procedure
@code{call-with-escape-continuation}, i.e. a @code{call/cc} for
outgoing continuations only. Note that the variant included in the Shepherd is
not aware of @code{dynamic-wind} at all and does not yet support
returning multiple values.
@end deffn
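For readers unfamiliar with escape continuations, the following
plain-Guile sketch shows the idea. The name @code{sketch-call/ec} is
hypothetical and the definition is built on @code{catch} and
@code{throw}, which is an assumption made for brevity and not
necessarily how the Shepherd defines @code{call/ec}; like the variant
described above, it passes back a single value only.
@lisp
;; Call PROC with an escape procedure.  Calling the escape procedure
;; with VALUE abandons the rest of PROC and makes VALUE the result.
(define (sketch-call/ec proc)
  (let ((key (make-symbol "escape")))   ; fresh, uninterned key
    (catch key
      (lambda ()
        (proc (lambda (value) (throw key value))))
      (lambda (_ value) value))))

;; Usage: leave a loop early with the first even number.
(define (first-even numbers)
  (sketch-call/ec
   (lambda (return)
     (for-each (lambda (n) (when (even? n) (return n)))
               numbers)
     #f)))
@end lisp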
@cindex system errors
@deffn {macro} without-system-error expr@dots{}
Evaluates the @var{expr}s, not going further if a system error occurs,
but also doing nothing about it.
@end deffn
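A macro with this behaviour can be approximated in a few lines of plain
Guile. The name @code{sketch-without-system-error} is hypothetical, and
returning @code{#f} after an error is an assumption of this sketch
rather than documented behaviour.
@lisp
;; Evaluate the expressions in order.  If one of them raises a
;; `system-error', stop there and return #f instead of propagating it.
;; At least one expression is required.
(define-syntax-rule (sketch-without-system-error expr ...)
  (catch 'system-error
    (lambda () expr ...)
    (lambda _ #f)))

;; Usage: removing a file that may not exist.
(sketch-without-system-error
 (delete-file "/nonexistent/directory/stale.pid"))
@end lisp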
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Communication
@section Communication
The @code{(shepherd comm)} module provides primitives that allow clients such
as @command{herd} to connect to @command{shepherd} and send it commands to
control or change its behavior (@pxref{Slots of services, actions of
services}).
@tindex <shepherd-command>
Currently, clients may only send @dfn{commands}, represented by the
@code{<shepherd-command>} type. Each command specifies a service it
applies to, an action name, a list of strings to be used as arguments,
and a working directory. Commands are instantiated with
@code{shepherd-command}:
@deffn {procedure} shepherd-command @var{action} @var{service} @
[#:@var{arguments} '()] [#:@var{directory} (getcwd)]
Return a new command (a @code{<shepherd-command>}) object for
@var{action} on @var{service}.
@end deffn
@noindent
Commands may then be written to or read from a communication channel
with the following procedures:
@deffn {procedure} write-command @var{command} @var{port}
Write @var{command} to @var{port}.
@end deffn
@deffn {procedure} read-command @var{port}
Receive a command from @var{port} and return it.
@end deffn
In practice, communication with @command{shepherd} takes place over a
Unix-domain socket, as discussed earlier (@pxref{Invoking shepherd}).
Clients may open a connection with the procedure below.
@deffn {procedure} open-connection [@var{file}]
Open a connection to the daemon, using the Unix-domain socket at
@var{file}, and return the socket.
When @var{file} is omitted, the default socket is used.
@end deffn
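Put together, a minimal client might look like the sketch below. It
assumes a running @command{shepherd} listening on the default socket,
and reading the answer with the standard @code{read} procedure is an
assumption of this sketch.
@lisp
(use-modules (shepherd comm))

;; Ask the `root' service for its status and return the raw answer.
(let ((sock (open-connection)))
  (write-command (shepherd-command 'status 'root) sock)
  (let ((reply (read sock)))   ; assumed: one sexp per answer
    (close-port sock)
    reply))
@end lisp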
@cindex output
The daemon writes output to be logged or passed to the
currently-connected client using @code{local-output}:
@deffn {procedure} local-output format-string . args
This procedure should be used for all output operations in the Shepherd. It
outputs the @var{args} according to the @var{format-string}, then
inserts a newline. It writes to whatever is the main output target of
the Shepherd, which might be multiple at the same time in future versions.
@end deffn
@cindex protocol, between @command{shepherd} and its clients
Under the hood, @code{write-command} and @code{read-command} write/read
commands as s-expressions (sexps). Each sexp is intelligible and
specifies a protocol version. The idea is that users can write their
own clients rather than having to invoke @command{herd}. For instance,
when you type @command{herd status}, what is sent over the wire is the
following sexp:
@lisp
(shepherd-command
  (version 0)
  (action status) (service root)
  (arguments ()) (directory "/data/src/dmd"))
@end lisp
The reply is also an sexp, along these lines:
@lisp
(reply (version 0)
       (result (((service @dots{}) @dots{})))
       (error #f) (messages ()))
@end lisp
This reply indicates that the @code{status} action was successful,
because @code{error} is @code{#f}, and gives a list of sexps denoting
the status of services as its @code{result}. The @code{messages} field
is a possibly-empty list of strings meant to be displayed as is to the
user.
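
A client written by hand has to pick such a reply apart. The following
plain-Guile sketch shows one way to do so with @code{(ice-9 match)}; the
helper names @code{reply-field} and @code{reply-successful?} are
hypothetical and the sample reply is made up for illustration.
@lisp
(use-modules (ice-9 match))

;; Return the value of field NAME in REPLY, or #f when REPLY is not a
;; reply sexp or has no such field.
(define (reply-field reply name)
  (match reply
    (('reply fields ...)
     (match (assq name fields)
       ((_ value) value)
       (_ #f)))
    (_ #f)))

;; A reply denotes success when its `error' field is #f.
(define (reply-successful? reply)
  (not (reply-field reply 'error)))

(define sample-reply                    ; made-up data
  '(reply (version 0)
          (result ((ok)))
          (error #f)
          (messages ("Service started."))))

;; Show the messages to the user as is.
(for-each (lambda (message) (display message) (newline))
          (reply-field sample-reply 'messages))
@end lisp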
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Others
@section Others
@cindex hashes
@deffn {procedure} copy-hashq-table table new-size
Create a hash table with size @var{new-size}, and insert all values
from @var{table} into it, using @code{eq?} when inserting. This
procedure is mainly used internally, but is a generally useful
utility, so it can be used by everyone.
@end deffn
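The behaviour described above corresponds roughly to the following
plain-Guile sketch. The name @code{sketch-copy-hashq-table} is
hypothetical, and this is a guess at an equivalent definition, not the
Shepherd's source.
@lisp
;; Return a fresh hash table of size NEW-SIZE holding every key/value
;; pair of TABLE, with keys compared by `eq?'.
(define (sketch-copy-hashq-table table new-size)
  (let ((copy (make-hash-table new-size)))
    (hash-for-each (lambda (key value)
                     (hashq-set! copy key value))
                   table)
    copy))
@end lisp
The copy is shallow: the two tables are independent, but the values they
hold are shared.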
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Internals
@chapter Internals
This chapter contains information about the design and the
implementation details of the Shepherd for people who want to hack it.
The GNU@tie{}Shepherd is developed by a group of people in connection
with @uref{https://www.gnu.org/software/guix/, GuixSD}, GNU's advanced
distribution, but it can be used on other distros as well. You're very
much welcome to join us! You can report bugs to
@email{bug-guix@@gnu.org} and send patches or suggestions to
@email{guix-devel@@gnu.org}.
@menu
* Coding standards:: How to properly hack the Shepherd.
* Design decisions:: Why the Shepherd is what it is.
* Service Internals:: How services actually work.
* Runlevel evolution:: Learning from past mistakes.
@end menu
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Coding standards
@section Coding standards
About formatting: Use common sense and GNU Emacs (which actually is
the same, of course), and you almost can't get the formatting wrong.
Formatting should be as in Guile and Guix, basically. @xref{Coding
Style,,, guix, GNU Guix Reference Manual}, for more info.
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Design decisions
@section Design decisions
@quotation Note
This section was written by Wolfgang Jährling back in 2003 and documents
the original design of what was then known as GNU@tie{}dmd. The main
ideas remain valid but some implementation details and goals have
changed.
@end quotation
The general idea of a service manager that uses dependencies, similar
to those of a Makefile, came from the developers of the GNU Hurd, but
as few people are satisfied with System V Init, many other people had
the same idea independently. Nevertheless, the Shepherd was written with the
goal of becoming a replacement for System V Init on GNU/Hurd, which
was one of the reasons for choosing the extension language of the GNU
project, Guile, for implementation (another reason being that it makes
it just so much easier).
The runlevel concept (i.e. thinking in @emph{groups} of services) is
sometimes useful, but often one also wants to operate on single
services. System V Init makes this hard: While you can start and stop
a service, @code{init} will not know about it, and use the runlevel
configuration as its source of information, opening the door for
inconsistencies (which fortunately are not a practical problem
usually). In the Shepherd, this was avoided by having a central entity that is
responsible for starting and stopping the services, which therefore
knows which services are actually started (unless it is used
completely improperly, but that is a requirement which is impossible to
avoid anyway). While runlevels are not implemented yet, it is clear
that they will sit on top of the service concept, i.e. runlevels will
merely be an optional extension that the service concept does not rely
on. This also makes changes in the runlevel design easier when it may
become necessary.
The consequence of having a daemon running that controls the services
is that we need another program as user interface which communicates
with the daemon. Fortunately, this makes the commands necessary for
controlling services pretty short and intuitive, and gives the
additional bonus of adding some more flexibility. For example, it is
easily possible to grant password-protected control over certain
services to unprivileged users, if desired.
An essential aspect of the design of the Shepherd (which was already mentioned
above) is that it should always know exactly what is happening,
i.e. which services are started and stopped. The alternative would
have been to not use a daemon, but to save the state on the file
system, again opening the door for inconsistencies of all sorts.
Also, we would have to use a separate program for respawning a service
(which just starts the service, waits until it terminates and then
starts it again). Killing the program that does the respawning (but
not the service that is supposed to be respawned) would cause horrible
confusion. My understanding of ``The Right Thing'' is that this
conceptually limited strategy is exactly what we do not want.
The way dependencies work in the Shepherd took a while to mature, as it was not
easy to figure out what is appropriate. I decided to not make it too
sophisticated by trying to guess what the user might want just to
theoretically fulfill the request we are processing. If something
goes wrong, it is usually better to tell the user about the problem
and let her fix it, taking care to make finding solutions or
workarounds for problems (like a misconfigured service) easy. This
way, the user is in control of what happens and we can keep the
implementation simple. To make a long story short, @emph{we don't try
to be too clever}, which is usually a good idea in developing
software.
If you wonder why I was giving a ``misconfigured service'' as an
example above, consider the following situation, which actually is a
wonderful example for what was said in the previous paragraph: Service
X depends on symbol S, which is provided by both A and B. A depends
on AA, B depends on BB. AA and BB conflict with each other. The
configuration of A contains an error, which will prevent it from
starting; no service is running, but we want to start X now. In
resolving its dependencies, we first try to start A, which will cause
AA to be started. After this is done, the attempt of starting A
fails, so we go on to B, but its dependency BB will fail to start
because it conflicts with the running service AA. So we fail to
provide S, thus X cannot be started. There are several possibilities
to deal with this:
@itemize @bullet
@item
When starting A fails, terminate those services which have been
started in order to fulfill its dependencies (directly and
indirectly). In case AA was running already, we would not want to
terminate it. Well, maybe we would, to avoid the conflict with BB.
But even if we would find out somehow that we need to terminate AA to
eventually start X, is the user aware of this and wants this to happen
(assuming AA was running already)? Probably not, she very likely has
assumed that starting A succeeds and thus terminating AA is not
necessary. Remember, unrelated (running) services might depend on AA.
Even if we ignore this issue, this strategy is not only complicated,
but also far from being perfect: Let's assume starting A succeeds, but
X also depends on a service Z, which requires BB. In that case, we
would need to detect in the first place that we should not even try to
start A, but directly satisfy X's dependency on S with B.
@item
We could do it like stated above, but stop AA only if we know we won't
need it anymore (for resolving further dependencies), and start it
only when it does not conflict with anything that needs to get
started. But should we stop it if it conflicts with something that
@emph{might} get started? (We do not always know for sure what we
will start, as starting a service might fail and we want to fall back
to a service that also provides the particular required symbol in that
case.) I think that either decision will be bad in one case or
another, even if this solution is already horribly complicated.
@item
When we are at it, we could just calculate a desired end-position, and
try to get there by starting (and stopping!) services, recalculating
what needs to be done whenever starting a service fails, also marking
that particular service as unstartable, except if it fails to start
because a dependency could not be resolved (or maybe even then?).
This is even more complicated. Instead of implementing this and
thereby producing code that (a) nobody understands, (b) certainly has
a lot of bugs, (c) will be unmaintainable and (d) causes users to
panic because they won't understand what will happen, I decided to do
the following instead:
@item
Just report the error, and let the user fix it (in this case, fix the
configuration of A) or work around it (in this case, disable A so that
we won't start AA but directly go on to starting B).
@end itemize
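
The scenario and the chosen strategy can be replayed with a toy model in
plain Guile. Everything in it is an assumption made for illustration:
the data layout, the names @code{provide!}, @code{start-service!} and
@code{try-start}, and the simplification that a service either always
starts or never does. It is not the Shepherd's resolver.
@lisp
(use-modules (srfi srfi-1))

;; name, provides, requires, conflicts, does it start at all?
(define services
  '((X  (X)   (S)  ()   #t)
    (A  (A S) (AA) ()   #f)     ; the misconfigured service
    (B  (B S) (BB) ()   #t)
    (AA (AA)  ()   (BB) #t)
    (BB (BB)  ()   (AA) #t)))

(define running '())            ; canonical names, newest first
(define disabled '())

(define (providers symbol)
  (filter (lambda (s) (memq symbol (second s))) services))

;; Start SERVICE after its requirements; never stop anything.
(define (start-service! service)
  (let ((name (first service)))
    (cond ((memq name running) #t)
          ((memq name disabled) #f)
          ((any (lambda (c) (memq c running)) (fourth service)) #f)
          ((not (every provide! (third service))) #f)
          ((fifth service) (set! running (cons name running)) #t)
          (else #f))))

;; Try each provider of SYMBOL in turn until one starts.
(define (provide! symbol)
  (any start-service! (providers symbol)))

;; Start from a clean state and report (SUCCESS? RUNNING).
(define (try-start symbol disabled-names)
  (set! running '())
  (set! disabled disabled-names)
  (let ((ok (provide! symbol)))
    (list ok running)))
@end lisp
With nothing disabled, @code{(try-start 'X '())} fails and leaves only
AA running, which is the error the user gets to see. After working
around the problem by disabling A, @code{(try-start 'X '(A))} succeeds
with BB, B and X running.
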
I hope you can agree that the latter solution after all is the best
one, because we can be sure to not do something that the user does not
want us to do. Software should not run amok. This explanation was
very long, but I think it was necessary to justify why the Shepherd uses a very
primitive algorithm to resolve dependencies, despite the fact that it
could theoretically be a bit more clever in certain situations.
One might argue that it is possible to ask the user if the planned
actions are ok with her, and if the plan changes ask again, but
especially given that services are supposed to usually work, I see few
reasons to make the source code of the Shepherd more complicated than
necessary. If you volunteer to write @emph{and} maintain a more
clever strategy (and volunteer to explain it to everyone who wants to
understand it), you are welcome to do so, of course@dots{}
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Service Internals
@section Service Internals
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Runlevel evolution
@section Runlevel evolution
@quotation Note
This section was written by Wolfgang Jährling back in 2003 and is kept
mostly for historians to read.
@end quotation
This section describes how the runlevel concept evolved over time.
This is basically a collection of mistakes, but is provided here for
your information, and possibly for your amusement, but I'm not sure if
all this weird dependency stuff is really that funny.
@menu
* Runlevel assumptions:: What runlevels should be like
* Runlevels - part one:: The first attempts of making it work
* Runlevels - part two:: It should work... somehow...
@end menu
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Runlevel assumptions
@subsection Runlevel assumptions
A runlevel is a system state, i.e. it consists of the information
about which services are supposed to be available and which not. This
vague definition implies that several different runlevel styles can be
implemented in a service manager.
For example, you can do it like System V Init, specifying which
services should be started when we enter a runlevel and which ones
should be stopped when leaving it. But one could also specify for
every service in which runlevels it should be running.
In the Shepherd, we do not want to limit ourselves to a single runlevel style.
We allow for all possible strategies to be implemented, providing the
most useful ones as defaults. We also want to make it possible to
combine the different styles arbitrarily.
Therefore, when entering a runlevel, we call a user-defined piece of
code, passing it the list of currently active services and expecting
as the result a list of service symbols which tell us which services
we want to have running. This interface makes it very easy to
implement runlevel styles, but makes it not-so-easy for the runlevel
implementation itself, because we have to get from the current state
into a desired state, which might be more or less vague (since it is
not required to be a list of canonical names). Obviously service
conflicts and already running services need to be taken into account
when deciding which services should be used to provide the various
symbols.
Also, the runlevel implementation should be implemented completely on
top of the service concept, i.e. the service part should not depend on
the idea of runlevels or care about them at all. Otherwise
understanding the service part (which is the most essential aspect of
the Shepherd) would become harder than necessary.
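
To make the intended interface concrete, here is a plain-Guile sketch of
runlevel styles as procedures from the list of currently active services
to the list of wanted service symbols. The names @code{sysv-style},
@code{fixed-style} and @code{combine} are hypothetical, and since
runlevels were never finished this only illustrates the idea.
@lisp
(use-modules (srfi srfi-1))

;; System V flavour: entering the runlevel starts some services and
;; stops others, leaving everything else as it is.
(define (sysv-style to-start to-stop)
  (lambda (current)
    (lset-union eq?
                (lset-difference eq? current to-stop)
                to-start)))

;; Fixed flavour: the runlevel lists exactly what should be running.
(define (fixed-style wanted)
  (lambda (current) wanted))

;; Styles can be chained because they share one interface.
(define (combine first-style second-style)
  (lambda (current)
    (second-style (first-style current))))
@end lisp
For example, @code{((sysv-style '(httpd) '(ftpd)) '(syslogd ftpd))}
yields a list containing @code{syslogd} and @code{httpd}, in unspecified
order.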
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Runlevels - part one
@subsection Runlevels, part one
I came up with the following method (here in Pseudo-Scheme), which is
possibly slightly buggy, but should give you the idea:
@lisp
;; Beginning with the canonical names in CURRENT-SERVICES, start and
;; stop services until getting into a state where everything requested
;; in TARGET-SERVICES (which does not only consist of canonical names)
;; is provided, and the things they depend on, but no more.
(define (switch-runlevel current-services target-services)
  (let ((target-services-backup target-services)
        (unstartable '()))
    (let retry ()
      (repeat-until-none-of-these-changes-anything
       ;; Replace all of them with canonical names which provide them.
       (canonicalize-names! target-services unstartable current-services)
       ;; Add what we need additionally.
       (add-dependencies! target-services unstartable current-services))
      (remove-redundancy! target-services)
      (stop-all-unneeded target-services)
      (catch 'service-could-not-be-started
        (lambda ()
          ;; Iterate over the list, starting only those which
          ;; have all dependencies already resolved, so nothing
          ;; we don't want will be started.  Repeat until done.
          (carefully-start target-services))
        (lambda (key service)
          (set! unstartable (cons service unstartable))
          (set! target-services target-services-backup)
          (set! current-services (compute-current-services))
          (retry))))))
@end lisp
This indeed looks like a nice way to get what we want. However, the
details of this are not as easy as they look. When replacing
virtual services with canonical names, we have to be very careful.
Consider the following situation:
The virtual service X is provided by both A and B, while Y is provided
only by B. We want to start C (which depends on X) and D (which
depends on Y). Obviously we should use B to fulfill the dependency
of C and D on X and Y, respectively. But when we see that we need
something that provides X, we are likely to do the wrong thing: Select
A. Thus, we need to clean this up later. I wanted to do this as
follows:
While substituting virtual services with canonical names, we also save
which one we selected to fulfill what, like this:
@lisp
((A . (X))
(B . (Y)))
@end lisp
Later we look for conflicts, and as A and B conflict, we check which
one can be removed (things they provide that are not required by anyone
should be ignored, which is why we need to create a list like the above). In
this case, we can replace A with B as B also provides X (but A does
not provide Y, thus the reverse is impossible). If both could be
used, we probably should decide which one to use by looking at further
conflicts, which gets pretty hairy. But, in this case, we are lucky
and end up with this:
@lisp
((B . (X Y)))
@end lisp
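
The clean-up step just described could look roughly like the following
sketch.  The names @code{provides-table}, @code{provides-all?} and
@code{replace-service} are made up for this illustration and do not
exist in the actual code; the table simply restates the example with A
and B.

@lisp
(use-modules (srfi srfi-1))

;; Hypothetical table: each service with the symbols it provides.
(define provides-table
  '((A . (X))
    (B . (X Y))))

;; Does SERVICE provide every symbol in SYMBOLS?
(define (provides-all? service symbols)
  (let ((provided (or (assq-ref provides-table service) '())))
    (every (lambda (sym) (memq sym provided)) symbols)))

;; SELECTION maps each chosen service to the symbols it was chosen
;; for.  If KEEP also provides everything DROP was chosen for, fold
;; the entry of DROP into the entry of KEEP; otherwise return #f.
(define (replace-service selection drop keep)
  (let ((dropped (or (assq-ref selection drop) '()))
        (kept (or (assq-ref selection keep) '())))
    (and (provides-all? keep dropped)
         (cons (cons keep (append dropped kept))
               (remove (lambda (entry)
                         (memq (car entry) (list drop keep)))
                       selection)))))

(replace-service '((A . (X)) (B . (Y))) 'A 'B)
;; => ((B X Y)), that is, ((B . (X Y)))
(replace-service '((A . (X)) (B . (Y))) 'B 'A)
;; => #f, because A does not provide Y
@end lisp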
This way of finding out which service we should use in case of
conflicts sounds pretty sane, but if you think it will work well, you
have been fooled, because actually it breaks horribly in the following
situation:
@multitable @columnfractions .10 .30
@item Service @tab Provides
@item A @tab @code{W X Y -}
@item B @tab @code{W X - Z}
@item C @tab @code{- X Y Z}
@item D @tab @code{W - - -}
@end multitable
If we need all of W, X, Y and Z, then obviously we need to take C and
D.  But if we have ended up with a list like this, no single
replacement can fix it:
@lisp
((A . (W X Y))
(B . (Z)))
@end lisp
Thus, we cannot do it this way.
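
To see why no replacement helps, one can ask which services could take
over everything a selected service was chosen for.  The following is
only a sketch; @code{provides-table} and @code{possible-substitutes}
are hypothetical names, and the table is the one shown above.

@lisp
(use-modules (srfi srfi-1))

;; The table from above: each service with the symbols it provides.
(define provides-table
  '((A . (W X Y))
    (B . (W X Z))
    (C . (X Y Z))
    (D . (W))))

;; All services that provide every symbol in SYMBOLS.
(define (possible-substitutes symbols)
  (filter-map
   (lambda (entry)
     (and (every (lambda (sym) (memq sym (cdr entry))) symbols)
          (car entry)))
   provides-table))

(possible-substitutes '(W X Y))   ; => (A)
(possible-substitutes '(Z))       ; => (B C)
@end lisp

A can only be replaced by itself, and exchanging B for C still leaves C
in conflict with A over X and Y, so replacing one service at a time
never arrives at the pair C and D.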
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Runlevels - part two
@subsection Runlevels, part two
Let's look again at the table at the end of part one:
@multitable @columnfractions .10 .30
@item Service @tab Provides
@item A @tab @code{W X Y -}
@item B @tab @code{W X - Z}
@item C @tab @code{- X Y Z}
@item D @tab @code{W - - -}
@end multitable
If it is so obvious to us from this table what we should do, then it
should also be possible for a computer to calculate it, given such a
table as input.  OK, we have to take into account conflicts that are
not visible in this table, but the general idea is usable.  But how do
we find which combination works?  So far I have found only one way, a
kind of brute-force attack: try combinations until we find one that
works.  This alone would be too slow.  With 20 services we would have
2^20 possible combinations, that is, a bit more than a million.
Fortunately, we can optimize this.  At first I thought we could remove
all services from the list that do not provide any symbol we need, but
that is obviously a stupid idea, as we might need them for
dependencies, in which case we need to take their conflicts into
account.  But the following method would work:
Very often a symbol that is required will be a canonical name already,
i.e. be provided only by a single service.  Using our example above,
let's suppose we also need the symbol V, which is provided only by D.
The first step is to find the (required) symbols that are provided by
only a single service, as we will need that service for sure.  In this
case, we would need D.  But by using it, we would also get the other
symbols it provides, W in this case.  This means that we don't need to
bother looking at other services that provide W, as we cannot use them
because they conflict with a service that we definitely need.  In this
case, we can remove A and B from the list this way.  Note that we can
remove them entirely, as all their conflicts become irrelevant to us
now.  In this simple case we would not even have to do much else, as C
is the only remaining service.
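
This first step can be sketched as follows.  The sketch assumes that
two services conflict exactly when they provide a common symbol, and
all names in it (@code{provides-table}, @code{providers},
@code{forced-services}, @code{remaining-candidates}) are hypothetical.
The table is the one from above with V added to D.

@lisp
(use-modules (srfi srfi-1))

;; The example table, with V provided only by D.
(define provides-table
  '((A . (W X Y))
    (B . (W X Z))
    (C . (X Y Z))
    (D . (V W))))

;; Services that provide SYMBOL.
(define (providers symbol)
  (filter-map (lambda (entry)
                (and (memq symbol (cdr entry)) (car entry)))
              provides-table))

;; Services we need for sure: each is the sole provider of some
;; required symbol.
(define (forced-services required)
  (delete-duplicates
   (filter-map
    (lambda (symbol)
      (let ((candidates (providers symbol)))
        (and (= 1 (length candidates)) (car candidates))))
    required)))

;; Services still worth considering: not forced themselves, and
;; sharing no symbol with any forced service.
(define (remaining-candidates forced)
  (let ((taken (append-map
                (lambda (s) (assq-ref provides-table s))
                forced)))
    (filter-map
     (lambda (entry)
       (and (not (memq (car entry) forced))
            (null? (lset-intersection eq? (cdr entry) taken))
            (car entry)))
     provides-table)))

(define forced (forced-services '(V W X Y Z)))
forced                          ; => (D)
(remaining-candidates forced)   ; => (C)
@end lisp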
After this first step, there remain the symbols that are provided by
two or more services.  In every combination we try, exactly one
provider of each such symbol must be used (and somehow we should take
into account which services are running already).  This also reduces
the number of possible combinations a lot.  So what remains after that
are the services we might need for fulfilling dependencies.  For them,
we could try all combinations (2^n), making sure that we always try
subsets before any of their supersets to avoid starting unneeded
services.  We should take into account which services are already
running as well.
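
Trying subsets before their supersets can be achieved by enumerating
combinations in order of increasing size.  The following sketch does
this for the original table; it again assumes that services conflict
exactly when they provide a common symbol, ignores which services are
already running, and uses hypothetical names throughout.

@lisp
(use-modules (srfi srfi-1))

(define provides-table
  '((A . (W X Y))
    (B . (W X Z))
    (C . (X Y Z))
    (D . (W))))

;; All K-element subsets of LST.
(define (combinations lst k)
  (cond ((= k 0) '(()))
        ((null? lst) '())
        (else
         (append (map (lambda (rest) (cons (car lst) rest))
                      (combinations (cdr lst) (- k 1)))
                 (combinations (cdr lst) k)))))

;; All symbols provided by SERVICES, duplicates included.
(define (provided-by services)
  (append-map (lambda (s) (assq-ref provides-table s)) services))

;; A set is usable if no symbol is provided twice (no conflict)
;; and every required symbol is provided.
(define (usable? services required)
  (let ((symbols (provided-by services)))
    (and (= (length symbols)
            (length (delete-duplicates symbols)))
         (every (lambda (sym) (memq sym symbols)) required))))

;; Try subsets by increasing size, so that a subset is always
;; tried before any of its supersets.  Return #f if none works.
(define (smallest-usable-set candidates required)
  (let loop ((k 0))
    (and (<= k (length candidates))
         (or (find (lambda (subset) (usable? subset required))
                   (combinations candidates k))
             (loop (+ k 1))))))

(smallest-usable-set '(A B C D) '(W X Y Z))   ; => (C D)
@end lisp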
The remaining question is what to do if starting a service fails.  A
simple solution would be to recursively remove all services that
depend on it directly or indirectly.  That might cause undesired
side effects if a service was running but had to be stopped because
one of the services that provides something it depends on gets
exchanged for another service that provides the same symbol but fails
to start.  The fact that we would have to stop the (first) service is
a problem on its own, though.
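
The simple solution, removing everything that depends on the failed
service directly or indirectly, could be sketched like this.  As a
simplification, the sketch assumes that dependencies name services
directly rather than provided symbols; the table and the names
@code{requirements}, @code{direct-dependents} and
@code{affected-services} are invented for the example.

@lisp
(use-modules (srfi srfi-1))

;; Hypothetical table: each service with the services it
;; requires directly.
(define requirements
  '((network . ())
    (ssh . (network))
    (backup . (ssh))
    (clock . ())))

;; Services that directly depend on one of FAILED.
(define (direct-dependents failed)
  (filter-map
   (lambda (entry)
     (and (any (lambda (dep) (memq dep failed)) (cdr entry))
          (car entry)))
   requirements))

;; FAILED plus everything that depends on it, directly or
;; indirectly.  Grow the set until it no longer changes.
(define (affected-services failed)
  (let ((next (lset-union eq? failed (direct-dependents failed))))
    (if (= (length next) (length failed))
        failed
        (affected-services next))))

;; Contains network, ssh and backup (in some order), but not clock.
(affected-services '(network))
@end lisp

The side effect mentioned above is not handled here: the sketch only
computes which services would have to go, not whether stopping them is
acceptable.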
@c *********************************************************************
@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include fdl-1.3.texi
@c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@node Concept Index
@unnumbered Concept Index
@printindex cp
@node Procedure and Macro Index
@unnumbered Procedure and Macro Index
@printindex fn
@node Variable Index
@unnumbered Variable Index
@printindex vr
@node Type Index
@unnumbered Type Index
@printindex tp
@bye