From: Ludovic Courtès
Subject: bug#35550: Installer: wpa_supplicant fails to start
Date: Mon, 06 May 2019 00:21:26 +0200
Message-ID: <87k1f4y23d.fsf@gnu.org>
References: <87sgtv8hcz.fsf@gnu.org> <875zqr8dnw.fsf@gnu.org>
In-Reply-To: <875zqr8dnw.fsf@gnu.org> ("Ludovic Courtès"'s message of "Fri, 03 May 2019 22:51:31 +0200")
List-Id: Bug reports for GNU Guix
To: 35550@debbugs.gnu.org
Cc: sirgazil

Ludovic Courtès skribis:

> So why is ‘wpa-supplicant’ marked as failing to start on the first
> attempt?
>
> The only reason I can think of is if ‘read-pid-file’ from (shepherd
> service) returns immediately and returns #f instead of a number. That
> can actually happen if the PID file exists but is empty (or contains
> garbage).

I produced an ISO with the patch below and ran it on bare metal to get
confirmation (too bad the bug doesn’t show in QEMU :-/). Indeed,
‘read-pid-file’ for /var/run/wpa_supplicant.pid systematically reads the
empty string the first time the ‘wpa-supplicant’ service is started.
(After that, if we kill the process and try to restart the service, the
problem doesn’t show up.)

diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 53437b6..e21492e 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -717,9 +717,12 @@ otherwise return the number that was read (a PID)."
   (let loop ()
     (catch 'system-error
       (lambda ()
-        (string->number
-         (string-trim-both
-          (call-with-input-file file get-string-all))))
+        (define str
+          (call-with-input-file file get-string-all))
+
+        (local-output (l10n "read-pid-file ~s -> ~s")
+                      file str)
+        (string->number (string-trim-both str)))
       (lambda args
         (let ((errno (system-error-errno args)))
           (if (= ENOENT errno)

With the second patch below, I confirm that the ‘wpa-supplicant’ service
starts correctly. We can see in /var/log/messages that ‘read-pid-file’
first reads the empty string from /var/run/wpa_supplicant.pid, then tries
again, and gets a valid PID on the second attempt. It’s surprising that
the timing is always like that, and only on bare metal, but that’s the
way it is.

It’d be great if you could do some testing with the patch below. Then I
guess we’ll push a Shepherd release with this fix.

I wonder if this could also explain .

Thanks,
Ludo’.

diff --git a/gnu/packages/admin.scm b/gnu/packages/admin.scm
index dfc3467bf8..e1dd248679 100644
--- a/gnu/packages/admin.scm
+++ b/gnu/packages/admin.scm
@@ -188,7 +188,8 @@ and provides a \"top-like\" mode (monitoring).")
                                   version ".tar.gz"))
               (sha256
                (base32
-                "1ys2w83vm62spr8bx38sccfdpy9fqmj7wfywm5k8ihsy2k61da2i"))))
+                "1ys2w83vm62spr8bx38sccfdpy9fqmj7wfywm5k8ihsy2k61da2i"))
+              (patches (search-patches "shepherd-debug.patch"))))
     (build-system gnu-build-system)
     (arguments
      '(#:configure-flags '("--localstatedir=/var")))
diff --git a/gnu/packages/patches/shepherd-debug.patch b/gnu/packages/patches/shepherd-debug.patch
new file mode 100644
index 0000000000..2fd97cc578
--- /dev/null
+++ b/gnu/packages/patches/shepherd-debug.patch
@@ -0,0 +1,43 @@
+diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
+index 53437b6..bef8f42 100644
+--- a/modules/shepherd/service.scm
++++ b/modules/shepherd/service.scm
+@@ -715,21 +715,28 @@ number. Return #f if FILE was not created or does not contain a number;
+ otherwise return the number that was read (a PID)."
+   (define start (current-time))
+   (let loop ()
++    (define (retry)
++      (and (< (current-time) (+ start max-delay))
++           (begin
++             ;; FILE does not exist yet, so wait and try again.
++             ;; XXX: Ideally we would yield to the main event loop
++             ;; and/or use inotify.
++             (sleep 1)
++             (loop))))
++
+     (catch 'system-error
+       (lambda ()
+-        (string->number
+-         (string-trim-both
+-          (call-with-input-file file get-string-all))))
++        (define str
++          (call-with-input-file file get-string-all))
++
++        (local-output (l10n "read-pid-file ~s -> ~s")
++                      file str)
++        (or (string->number (string-trim-both str))
++            (retry)))
+       (lambda args
+         (let ((errno (system-error-errno args)))
+           (if (= ENOENT errno)
+-              (and (< (current-time) (+ start max-delay))
+-                   (begin
+-                     ;; FILE does not exist yet, so wait and try again.
+-                     ;; XXX: Ideally we would yield to the main event loop
+-                     ;; and/or use inotify.
+-                     (sleep 1)
+-                     (loop)))
++              (retry)
+               (apply throw args)))))))
+ 
+ (define* (exec-command command
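
The failure mode described in this message (a PID file that already
exists but is still empty when it is first read) can be seen in isolation
with a small Guile sketch. This is not Shepherd code: the helper name
‘parse-pid-file’, the scratch file under /tmp, and the PID value 1234 are
all hypothetical, and the sketch assumes Guile 2.2 or later for
(ice-9 textual-ports). It only mirrors the parsing expression from the
patch to show why an empty file yields #f rather than a number.

```scheme
(use-modules (ice-9 textual-ports))     ;for 'get-string-all'

;; Hypothetical helper for illustration only; it mirrors the parsing
;; expression in the patch but is not part of the Shepherd.
(define (parse-pid-file file)
  (let ((str (call-with-input-file file get-string-all)))
    ;; Guard against a non-string result so that an empty file
    ;; always maps to #f.
    (and (string? str)
         (string->number (string-trim-both str)))))

;; Hypothetical scratch file standing in for /var/run/wpa_supplicant.pid.
(define pid-file
  (string-append "/tmp/pid-race-demo-" (number->string (getpid)) ".pid"))

;; A daemon that creates its PID file before writing to it leaves a
;; window in which the file exists but is empty.
(call-with-output-file pid-file (lambda (port) #t))
(define first-read (parse-pid-file pid-file))   ;no digits, so no number

;; Later the daemon writes its PID; the same parse now succeeds.
(call-with-output-file pid-file
  (lambda (port) (format port "~a\n" 1234)))
(define second-read (parse-pid-file pid-file))

(delete-file pid-file)
(format #t "first read: ~s, second read: ~s~%" first-read second-read)
;; prints: first read: #f, second read: 1234
```

Because the file exists in both cases, no ENOENT error is raised on the
first read, so the original retry branch is never taken. That is the gap
the second patch closes by also retrying when the parse returns #f.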