Hi,

following a recent discussion on guix-sysadmin I have to confirm the
ssh-daemon issue, since it is still happening on some of the machines I
administer.

Previous, possibly related bug reports are
https://issues.guix.gnu.org/issue/30993 and
https://issues.guix.gnu.org/issue/32197

Unfortunately this issue is *not* easily reproducible: it depends on
some mysterious (to me) timing factor. AFAIU it does *not* depend on the
shepherd version; it probably depends on "something" related to IPv6
(see the details below).

Andreas Enge writes:

[...]

> My impression is that the problem is still there. I am quite certain it
> happened when I rebooted dover, since I had to connect on the serial console
> to manually restart the ssh service.

I'm sure it happened when milano-guix-1 was rebooted due to data centre
maintenance, and it happened yesterday to one of my personal Guix
machines at the office.

[...]

My situation is similar to the one observed by Andreas:

> Well, it is in /var/log/messages:
> Aug 3 21:11:38 localhost sshd[360]: Server listening on 0.0.0.0 port 22.
> Aug 3 21:11:55 localhost shepherd[1]: Service ssh-daemon could not be started.

--8<---------------cut here---------------start------------->8---
[...]
Sep 4 21:46:02 localhost shepherd[1]: Service syslogd has been started.
[...]
Sep 4 21:46:03 localhost shepherd[1]: Service loopback has been started.
[...]
Sep 4 21:46:22 localhost vmunix: [ 0.226337] PCI: Using configuration type 1 for base access
Sep 4 21:46:09 localhost dhclient: DHCPREQUEST for 10.38.2.16 on eno1 to 255.255.255.255 port 67
[...]
Sep 4 21:46:24 localhost shepherd[1]: Service networking has been started.
[...]
Sep 4 21:46:12 localhost sshd[577]: Server listening on 0.0.0.0 port 22.
[...]
Sep 4 21:46:30 localhost vmunix: [ 0.250107] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
Sep 4 21:46:13 localhost dhclient: DHCPREQUEST for 10.38.2.16 on eno1 to 255.255.255.255 port 67
[...]
Sep 4 21:46:16 localhost dhclient: DHCPACK of 10.38.2.16 from 10.38.2.1
[...]
Sep 4 21:46:33 localhost shepherd[1]: Service ssh-daemon could not be started.
[...]
Sep 4 21:46:47 localhost vmunix: [ 0.731142] Segment Routing with IPv6
--8<---------------cut here---------------end--------------->8---

Please note the timing of the dhclient and sshd messages: I pasted them
in the order they appear in /var/log/messages, but they are not in
chronological order. Does that mean something, or is it irrelevant?

So the sshd process started (as far as I can see there is no trace of it
being stopped), and shortly afterwards shepherd reported that ssh-daemon
could not be started.

Logging in from the console, I see that ssh-daemon is stopped but
enabled:

--8<---------------cut here---------------start------------->8---
Status of ssh-daemon:
  It is stopped.
  It is enabled.
  Provides (ssh-daemon).
  Requires (syslogd loopback).
  Conflicts with ().
  Will be respawned.
--8<---------------cut here---------------end--------------->8---

[...]

If I start it via `sudo herd start ssh-daemon` it starts immediately,
as in Andreas' experience:

> Aug 3 21:13:10 localhost sshd[385]: Server listening on 0.0.0.0 port 22.
> Aug 3 21:13:10 localhost sshd[385]: Server listening on :: port 22.
> Aug 3 21:13:11 localhost shepherd[1]: Service ssh-daemon has been started.

--8<---------------cut here---------------start------------->8---
Sep 5 13:38:55 localhost sshd[745]: Server listening on 0.0.0.0 port 22.
Sep 5 13:38:55 localhost sshd[745]: Server listening on :: port 22.
Sep 5 13:38:55 localhost shepherd[1]: Service ssh-daemon has been started.
--8<---------------cut here---------------end--------------->8---

Please notice the difference from the failed start above: this time the
sshd server is also listening on the IPv6 address ::, while in the
earlier log it was only listening on the IPv4 address 0.0.0.0.

Does the failure have something to do with IPv6 not being available when
sshd starts for the first time after a reboot?
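
To compare boots more systematically than by eye, here is a minimal
sketch that scans syslog lines and reports which sshd instances never
logged a listener on ::. It assumes the "Server listening on ADDR port
N." message format shown in the excerpts above; the function names
(listening_addresses, missing_ipv6) are hypothetical helpers made up for
this illustration, not part of Guix, shepherd or OpenSSH.

```python
import re
import sys
from collections import defaultdict

# Matches lines such as:
#   Sep 4 21:46:12 localhost sshd[577]: Server listening on 0.0.0.0 port 22.
# The format is assumed from the log excerpts quoted in this message.
LISTEN_RE = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+"
    r"sshd\[(?P<pid>\d+)\]:\s+"
    r"Server listening on (?P<addr>\S+) port \d+\."
)


def listening_addresses(lines):
    """Map each sshd PID to the addresses it reported listening on."""
    by_pid = defaultdict(list)
    for line in lines:
        match = LISTEN_RE.match(line.strip())
        if match:
            by_pid[int(match.group("pid"))].append(match.group("addr"))
    return dict(by_pid)


def missing_ipv6(lines):
    """Return the sorted PIDs of sshd instances with no '::' listener."""
    return sorted(
        pid
        for pid, addrs in listening_addresses(lines).items()
        if "::" not in addrs
    )


if __name__ == "__main__":
    # Path defaults to /var/log/messages; pass another file as argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        for pid in missing_ipv6(log):
            print(f"sshd[{pid}] never reported listening on ::")
```

If the IPv6 hypothesis holds, the PIDs printed should correspond to the
boots where shepherd logged "Service ssh-daemon could not be started".
Note that this only correlates log messages; it does not prove that the
missing :: listener is the cause.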

Please have a look at the following /var/log/messages excerpt from my
system after a successful ssh-daemon start soon after a reboot (no
"manual" intervention):

--8<---------------cut here---------------start------------->8---
Sep 5 14:45:00 localhost vmunix: [ 0.247544] pci 0000:00:14.0: reg 0x10: [mem 0xf7c20000-0xf7c2ffff 64bit]
Sep 5 14:44:45 localhost sshd[574]: Server listening on 0.0.0.0 port 22.
[...]
Sep 5 14:44:47 localhost sshd[574]: Server listening on :: port 22.
[...]
Sep 5 14:45:05 localhost shepherd[1]: Service ssh-daemon has been started.
--8<---------------cut here---------------end--------------->8---

Bingo? This time sshd is also listening on :: and it works right after
a reboot.

It really seems to have something to do with IPv6, but I cannot
understand exactly what :-S (do I have to disable IPv6 in my configs?)

For completeness, I have to say that the issue happened yesterday after
a `guix system reconfigure`; this is my current system generation:

--8<---------------cut here---------------start------------->8---
Generation 8  Sep 04 2019 17:19:08  (current)
  file name: /var/guix/profiles/system-8-link
  canonical file name: /gnu/store/iw2ayn696f8ipmd5gzw9fxljf9h8w4pr-system
  label: GNU with Linux-Libre 5.2.11
  bootloader: grub-efi
  root device: UUID: 26bd54ec-4e74-4b3a-96ff-58f2f34e4a1a
  kernel: /gnu/store/xgl60ivx8p5p79zjbf08p4x09881wf4s-linux-libre-5.2.11/bzImage
--8<---------------cut here---------------end--------------->8---

Reconfigured with this guix version:

--8<---------------cut here---------------start------------->8---
g@batondor ~$ sudo -i guix describe
Generation 6  Sep 04 2019 17:17:02  (current)
  guix 5ee1c04
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: 5ee1c0459eebdd3b7771abaeab0f0b52ff86fdd5
--8<---------------cut here---------------end--------------->8---

This is the shepherd version:

--8<---------------cut here---------------start------------->8---
g@batondor ~$ shepherd --version
shepherd (GNU Shepherd) 0.6.1
--8<---------------cut here---------------end--------------->8---

Thanks!
Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures