From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Subject: bug#40572: installer networking: Connman detects no technologies on Acer Aspire Date: Tue, 14 Apr 2020 11:03:55 +0200 Message-ID: <87v9m2zbes.fsf@gnu.org> References: <20200412184329.itthwjjjwprhhhbg@pelzflorian.localdomain> <87blnvmzub.fsf@gnu.org> <20200413115721.mjdrelwvrdvhm7zc@pelzflorian.localdomain> <87wo6jjzh6.fsf@gnu.org> <20200413150347.kaapjhmdwbhkij7t@pelzflorian.localdomain> <20200413155202.rjmnp4v2vnrpq3vc@pelzflorian.localdomain> <871rorgvk8.fsf@gnu.org> <20200413181415.6pgj63zcew3bkkma@pelzflorian.localdomain> <20200413215646.wduvdya6rqhr4h4r@pelzflorian.localdomain> <8736972fc1.fsf@gnu.org> <20200414004344.fg7yio3frp534jih@pelzflorian.localdomain> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Return-path: Received: from eggs.gnu.org ([2001:470:142:3::10]:56478) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jOHVC-0004O7-2o for bug-guix@gnu.org; Tue, 14 Apr 2020 05:05:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jOHV9-0000se-5r for bug-guix@gnu.org; Tue, 14 Apr 2020 05:05:05 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:49914) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jOHV8-0000sR-OU for bug-guix@gnu.org; Tue, 14 Apr 2020 05:05:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1jOHV8-0004cK-JF for bug-guix@gnu.org; Tue, 14 Apr 2020 05:05:02 -0400 Sender: "Debbugs-submit" Resent-Message-ID: In-Reply-To: <20200414004344.fg7yio3frp534jih@pelzflorian.localdomain> (pelzflorian@pelzflorian.de's message of "Tue, 14 Apr 2020 02:43:44 +0200") List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+gcggb-bug-guix=m.gmane-mx.org@gnu.org Sender: "bug-Guix" To: "pelzflorian (Florian Pelz)" Cc: 40572@debbugs.gnu.org --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hi Florian, "pelzflorian (Florian Pelz)" skribis: > The installer crashed again after entering a newly invented hostname > =E2=80=9Ca=E2=80=9D (perhaps it was already in use from my previous attem= pt?). But > later my normal hostname =E2=80=9Cflorianmacbook=E2=80=9D worked and netw= ork > technologies failed. The hostname command displays =E2=80=9Cgnu=E2=80=9D= now. > > Here are the log files. From among the dbus trace files, only > dbus.trace.301 is different after the installer crashed (see =E2=80=9Cdiff > logs/dbus.trace.301 logs/after-network-failed/dbus.trace.301=E2=80=9D). > Though you may be more interested in early trace logs/dbus.trace.228. > But I=E2=80=99m going to sleep now. ;) Uh, well deserved. :-) The logs show very well what happened. From /var/log/messages (stripped): --8<---------------cut here---------------start------------->8--- Apr 14 01:52:21 localhost vmunix: [ 12.733898] random: dbus-uuidgen: unin= itialized urandom read (12 bytes read) Apr 14 01:52:21 localhost vmunix: [ 27.690871] shepherd[1]: Service root = has been started. Apr 14 01:52:26 localhost shepherd[1]: Service dbus-system could not be sta= rted.=20 Apr 14 01:52:26 localhost shepherd[1]: Service networking depends on dbus-s= ystem.=20 Apr 14 01:52:26 localhost shepherd[1]: Service networking could not be star= ted.=20 Apr 14 01:52:31 localhost shepherd[1]: Service dbus-system could not be sta= rted.=20 Apr 14 01:52:36 localhost shepherd[1]: Service dbus-system could not be sta= rted.=20 Apr 14 01:52:36 localhost shepherd[1]: Service wpa-supplicant depends on db= us-system.=20 Apr 14 01:52:36 localhost shepherd[1]: Service wpa-supplicant could not be = started.=20 Apr 14 01:52:36 localhost shepherd[1]: Service loopback has been started.=20 Apr 14 01:52:41 localhost /gnu/store/bfvr3brh7f9dqh26jf49767ypbanqycm-gpm-1= .20.7/sbin/gpm[258]: *** info [daemon/startup.c(136)]:=20 Apr 14 01:52:41 localhost /gnu/store/bfvr3brh7f9dqh26jf49767ypbanqycm-gpm-1= .20.7/sbin/gpm[258]: Started gpm successfully. Entered daemon mode. Apr 14 01:52:41 localhost shepherd[1]: Service gpm could not be started.=20 Apr 14 01:52:43 localhost dbus-daemon[244]: Failed to start message bus: Fa= iled to bind socket "/var/run/dbus/system_bus_socket": Address already in u= se Apr 14 01:52:45 localhost vmunix: [ 78.947812] mc: Linux media interface:= v0.10 Apr 14 01:52:46 localhost shepherd[1]: Service dbus-system could not be sta= rted.=20 Apr 14 01:52:46 localhost shepherd[1]: Service term-tty1 depends on dbus-sy= stem.=20 Apr 14 01:52:46 localhost shepherd[1]: Service term-tty1 could not be start= ed.=20 Apr 14 01:52:50 localhost dbus-daemon[262]: Failed to start message bus: Fa= iled to open "/var/run/dbus/pid": File exists Apr 14 01:53:14 localhost shepherd[1]: Service dbus-system has been started= .=20 Apr 14 01:53:14 localhost shepherd[1]: Service term-tty1 has been started.= =20 --8<---------------cut here---------------end--------------->8--- That alone shows the problem: dbus-system was initially wrongfully considered as =E2=80=9Cnot started=E2=80=9D, thus subsequent attempts to st= art it result in EADDRINUSE. This is confirmed by strace logs: 228 -> starts fine openat(AT_FDCWD, "/var/run/dbus/pid", O_WRONLY|O_CREAT|O_EXCL, 0644) = =3D 5 fcntl(5, F_GETFL) =3D 0x8001 (flags O_WRONLY|O_LA= RGEFILE) fstat(5, {st_mode=3DS_IFREG|0644, st_size=3D0, ...}) =3D 0 write(5, "228\n", 4) =3D 4 244 -> sendto(3, "<28>Apr 14 01:52:43 dbus-daemon[244]: Failed to start messag= e bus: Failed to bind socket \"/var/run/dbus/system_bus_socket\": Address a= lready in use", 146, MSG_NOSIGNAL, NULL, 0) =3D 146 exit_group(1) =3D ? 262 -> sendto(3, "<28>Apr 14 01:52:50 dbus-daemon[262]: Failed to start messag= e bus: Failed to open \"/var/run/dbus/pid\": File exists", 114, MSG_NOSIGNA= L, NULL, 0) =3D 114 exit_group(1) =3D ? 301 -> starts fine (did 228 die in the meantime? go figure) openat(AT_FDCWD, "/var/run/dbus/pid", O_WRONLY|O_CREAT|O_EXCL, 0644) = =3D 5 fcntl(5, F_GETFL) =3D 0x8001 (flags O_WRONLY|O_LA= RGEFILE) fstat(5, {st_mode=3DS_IFREG|0644, st_size=3D0, ...}) =3D 0 write(5, "301\n", 4) =3D 4 Everything seems to be extremely slow on this machine (booting from an actual DVD, right?). For example: --8<---------------cut here---------------start------------->8--- [ 27.690871] shepherd[1]: Service root has been started. [ 37.063759] shepherd[1]: starting services... [...] [ 39.969589] shepherd[1]: Service host-name has been started. [ 41.959013] shepherd[1]: Service user-homes has been started. --8<---------------cut here---------------end--------------->8--- That=E2=80=99s 27s before shepherd is started, and another 10s before =E2= =80=9Cstarting services=E2=80=9D (the only thing that happens in between in shepherd.conf = is loading .go files for the services.) My guess is that cold-cache I/O is very slow. A plausible scenario is that loading =E2=80=98dbus-daemon=E2=80=99 the first time takes several sec= onds; dbus-daemon has enough time to fork, but it does not produce its PID file until after the 5s =E2=80=98%pid-file-timeout=E2=80=99 has timeout has= expired. Thus, shepherd marks it as =E2=80=9Cfailed to start=E2=80=9D but it=E2=80= =99s actually running. To confirm this hypothesis, we need to run =E2=80=9Cstrace -t=E2=80=9D, see= below (sorry for not thinking about doing it!). If you can try again with the patch below, that=E2=80=99s awesome. Then we=E2=80=99ll compare the timestamps in /var/log/messages and those in the strace log. If that=E2=80=99s confirmed, we can work around it locally by passing: #:pid-file-timeout 15 to =E2=80=98make-forkexec-constructor=E2=80=99 for dbus-daemon or, alternat= ely, setting =E2=80=98%pid-file-timeout=E2=80=99 globally from shepherd.conf. You were right that it relates to . It also reminds me of a discussion with Konrad about the best way to make this configurable. Ludo=E2=80=99. --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/gnu/services/dbus.scm b/gnu/services/dbus.scm index 7b3c8100e2..c6733ffce3 100644 --- a/gnu/services/dbus.scm +++ b/gnu/services/dbus.scm @@ -25,6 +25,7 @@ #:use-module ((gnu packages glib) #:select (dbus)) #:use-module (gnu packages polkit) #:use-module (gnu packages admin) + #:use-module (gnu packages linux) #:use-module (guix gexp) #:use-module ((guix packages) #:select (package-name)) #:use-module (guix records) @@ -186,9 +187,13 @@ includes the @code{etc/dbus-1/system.d} directories of each package listed in (list (shepherd-service (documentation "Run the D-Bus system daemon.") (provision '(dbus-system)) - (requirement '(user-processes syslogd)) + (requirement '(user-processes syslogd)) ;<- add 'host-name' and/or 'nscd' (start #~(make-forkexec-constructor - (list (string-append #$dbus "/bin/dbus-daemon") + (list #$(file-append strace "/bin/strace") + "-o" "/dbus.trace" + "-s" "500" "-ff" + "-t" + (string-append #$dbus "/bin/dbus-daemon") "--nofork" "--system" "--syslog-only") #:pid-file "/var/run/dbus/pid")) (stop #~(make-kill-destructor))))))) --=-=-=--