From: "Ludovic Courtès" <ludo@gnu.org>
To: "pelzflorian (Florian Pelz)" <pelzflorian@pelzflorian.de>
Cc: 40572@debbugs.gnu.org
Subject: bug#40572: installer networking: Connman detects no technologies on Acer Aspire
Date: Tue, 14 Apr 2020 11:03:55 +0200
Message-ID: <87v9m2zbes.fsf@gnu.org>
In-Reply-To: <20200414004344.fg7yio3frp534jih@pelzflorian.localdomain> (pelzflorian@pelzflorian.de's message of "Tue, 14 Apr 2020 02:43:44 +0200")

[-- Attachment #1: Type: text/plain, Size: 5974 bytes --]

Hi Florian,

"pelzflorian (Florian Pelz)" <pelzflorian@pelzflorian.de> skribis:

> The installer crashed again after entering a newly invented hostname
> “a” (perhaps it was already in use from my previous attempt?).  But
> later my normal hostname “florianmacbook” worked and network
> technologies failed.  The hostname command displays “gnu” now.
>
> Here are the log files.  From among the dbus trace files, only
> dbus.trace.301 is different after the installer crashed (see “diff
> logs/dbus.trace.301 logs/after-network-failed/dbus.trace.301”).
> Though you may be more interested in early trace logs/dbus.trace.228.
> But I’m going to sleep now. ;)

Uh, well deserved.  :-)

The logs show very well what happened.  From /var/log/messages
(stripped):

--8<---------------cut here---------------start------------->8---
Apr 14 01:52:21 localhost vmunix: [   12.733898] random: dbus-uuidgen: uninitialized urandom read (12 bytes read)
Apr 14 01:52:21 localhost vmunix: [   27.690871] shepherd[1]: Service root has been started.
Apr 14 01:52:26 localhost shepherd[1]: Service dbus-system could not be started. 
Apr 14 01:52:26 localhost shepherd[1]: Service networking depends on dbus-system. 
Apr 14 01:52:26 localhost shepherd[1]: Service networking could not be started. 
Apr 14 01:52:31 localhost shepherd[1]: Service dbus-system could not be started. 
Apr 14 01:52:36 localhost shepherd[1]: Service dbus-system could not be started. 
Apr 14 01:52:36 localhost shepherd[1]: Service wpa-supplicant depends on dbus-system. 
Apr 14 01:52:36 localhost shepherd[1]: Service wpa-supplicant could not be started. 
Apr 14 01:52:36 localhost shepherd[1]: Service loopback has been started. 
Apr 14 01:52:41 localhost /gnu/store/bfvr3brh7f9dqh26jf49767ypbanqycm-gpm-1.20.7/sbin/gpm[258]: *** info [daemon/startup.c(136)]: 
Apr 14 01:52:41 localhost /gnu/store/bfvr3brh7f9dqh26jf49767ypbanqycm-gpm-1.20.7/sbin/gpm[258]: Started gpm successfully. Entered daemon mode.
Apr 14 01:52:41 localhost shepherd[1]: Service gpm could not be started. 
Apr 14 01:52:43 localhost dbus-daemon[244]: Failed to start message bus: Failed to bind socket "/var/run/dbus/system_bus_socket": Address already in use
Apr 14 01:52:45 localhost vmunix: [   78.947812] mc: Linux media interface: v0.10
Apr 14 01:52:46 localhost shepherd[1]: Service dbus-system could not be started. 
Apr 14 01:52:46 localhost shepherd[1]: Service term-tty1 depends on dbus-system. 
Apr 14 01:52:46 localhost shepherd[1]: Service term-tty1 could not be started. 
Apr 14 01:52:50 localhost dbus-daemon[262]: Failed to start message bus: Failed to open "/var/run/dbus/pid": File exists
Apr 14 01:53:14 localhost shepherd[1]: Service dbus-system has been started. 
Apr 14 01:53:14 localhost shepherd[1]: Service term-tty1 has been started. 
--8<---------------cut here---------------end--------------->8---

That alone shows the problem: dbus-system was initially, and wrongly,
considered “not started”, so subsequent attempts to start it fail with
EADDRINUSE.  This is confirmed by the strace logs:

  228 -> starts fine
    openat(AT_FDCWD, "/var/run/dbus/pid", O_WRONLY|O_CREAT|O_EXCL, 0644) = 5
    fcntl(5, F_GETFL)                       = 0x8001 (flags O_WRONLY|O_LARGEFILE)
    fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
    write(5, "228\n", 4)                    = 4

  244 ->
    sendto(3, "<28>Apr 14 01:52:43 dbus-daemon[244]: Failed to start message bus: Failed to bind socket \"/var/run/dbus/system_bus_socket\": Address already in use", 146, MSG_NOSIGNAL, NULL, 0) = 146
    exit_group(1)                           = ?

  262 ->
    sendto(3, "<28>Apr 14 01:52:50 dbus-daemon[262]: Failed to start message bus: Failed to open \"/var/run/dbus/pid\": File exists", 114, MSG_NOSIGNAL, NULL, 0) = 114
    exit_group(1)                           = ?

  301 -> starts fine (did 228 die in the meantime? go figure)
    openat(AT_FDCWD, "/var/run/dbus/pid", O_WRONLY|O_CREAT|O_EXCL, 0644) = 5
    fcntl(5, F_GETFL)                       = 0x8001 (flags O_WRONLY|O_LARGEFILE)
    fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
    write(5, "301\n", 4)                    = 4

Everything seems to be extremely slow on this machine (booting from an
actual DVD, right?).  For example:

--8<---------------cut here---------------start------------->8---
[   27.690871] shepherd[1]: Service root has been started.
[   37.063759] shepherd[1]: starting services...

[...]

[   39.969589] shepherd[1]: Service host-name has been started.
[   41.959013] shepherd[1]: Service user-homes has been started.
--8<---------------cut here---------------end--------------->8---

That’s 27s before shepherd is started, and another 10s before “starting
services” (the only thing that happens in between in shepherd.conf is
loading .go files for the services.)

My guess is that cold-cache I/O is very slow.  A plausible scenario is
that loading ‘dbus-daemon’ the first time takes several seconds;
dbus-daemon has enough time to fork, but it does not produce its PID
file until after the 5s ‘%pid-file-timeout’ has expired.  Thus,
shepherd marks it as “failed to start” even though it’s actually running.
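
To make the suspected race concrete, here is a minimal Guile sketch of a
PID-file wait loop.  It is not Shepherd’s actual implementation:
‘read-pid’ and ‘wait-for-pid-file’ are names made up for illustration,
and the one-second polling interval is an assumption.

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical helper, not Shepherd's code: return the PID stored in
;; FILE, or #f if FILE is missing, empty, or does not hold a number.
(define (read-pid file)
  (and (file-exists? file)
       (let ((line (call-with-input-file file read-line)))
         (and (string? line)
              (string->number (string-trim-both line))))))

;; Poll FILE once per second until it holds a PID or TIMEOUT seconds
;; have elapsed.  Return the PID, or #f on timeout.  A daemon that is
;; running but writes FILE too late is indistinguishable here from a
;; daemon that never started.
(define* (wait-for-pid-file file #:key (timeout 5))
  (let loop ((elapsed 0))
    (or (read-pid file)
        (and (< elapsed timeout)
             (begin
               (sleep 1)
               (loop (+ elapsed 1)))))))
```

With such a loop, a dbus-daemon that needs more than 5 seconds to write
/var/run/dbus/pid gets reported as failed even though the process is
alive, which would explain why the later start attempts hit the
existing socket and PID file.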

To confirm this hypothesis, we need to run “strace -t”, see below (sorry
for not thinking about doing it!).  If you can try again with the patch
below, that’s awesome.  Then we’ll compare the timestamps in
/var/log/messages and those in the strace log.

If that’s confirmed, we can work around it locally by passing:

  #:pid-file-timeout 15

to ‘make-forkexec-constructor’ for dbus-daemon or, alternatively, by
setting ‘%pid-file-timeout’ globally from shepherd.conf.
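
For illustration, the local workaround could look like this in the
‘start’ field of the dbus-system service in gnu/services/dbus.scm.
This is an untested sketch of a fragment, not a complete file, and 15
is simply the value suggested above, not a measured one.

```scheme
;; Sketch only: the 'start' field of the dbus-system shepherd service,
;; unchanged except for the added #:pid-file-timeout argument.
(start #~(make-forkexec-constructor
          (list (string-append #$dbus "/bin/dbus-daemon")
                "--nofork" "--system" "--syslog-only")
          #:pid-file "/var/run/dbus/pid"
          ;; Wait up to 15 seconds for the PID file to appear.
          #:pid-file-timeout 15))
```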

You were right that it relates to
<https://issues.guix.gnu.org/issue/35550>.  It also reminds me of a
discussion with Konrad about the best way to make this configurable.

Ludo’.


[-- Attachment #2: Type: text/x-patch, Size: 1380 bytes --]

diff --git a/gnu/services/dbus.scm b/gnu/services/dbus.scm
index 7b3c8100e2..c6733ffce3 100644
--- a/gnu/services/dbus.scm
+++ b/gnu/services/dbus.scm
@@ -25,6 +25,7 @@
   #:use-module ((gnu packages glib) #:select (dbus))
   #:use-module (gnu packages polkit)
   #:use-module (gnu packages admin)
+  #:use-module (gnu packages linux)
   #:use-module (guix gexp)
   #:use-module ((guix packages) #:select (package-name))
   #:use-module (guix records)
@@ -186,9 +187,13 @@ includes the @code{etc/dbus-1/system.d} directories of each package listed in
      (list (shepherd-service
             (documentation "Run the D-Bus system daemon.")
             (provision '(dbus-system))
-            (requirement '(user-processes syslogd))
+            (requirement '(user-processes syslogd)) ;<- add 'host-name' and/or 'nscd'
             (start #~(make-forkexec-constructor
-                      (list (string-append #$dbus "/bin/dbus-daemon")
+                      (list #$(file-append strace "/bin/strace")
+                            "-o" "/dbus.trace"
+                            "-s" "500" "-ff"
+                            "-t"
+                            (string-append #$dbus "/bin/dbus-daemon")
                             "--nofork" "--system" "--syslog-only")
                       #:pid-file "/var/run/dbus/pid"))
             (stop #~(make-kill-destructor)))))))
