From: Ludovic Courtès
To: 31785@debbugs.gnu.org
Cc: Mathieu Othacehe
Subject: bug#31785: Multiple client 'build-paths' RPCs can lead to daemon deadlock
Date: Wed, 04 Nov 2020 16:21:39 +0100
Message-ID: <87361p9mgs.fsf@gnu.org>
In-Reply-To: <87602ph0yv.fsf@gnu.org> (Ludovic Courtès's message of "Mon, 11 Jun 2018 16:06:16 +0200")
References: <87602ph0yv.fsf@gnu.org>
List-Id: Bug reports for GNU Guix

Hi,

ludo@gnu.org (Ludovic Courtès) wrote:

> This comes from the fact that ‘LocalStore::buildPaths’ takes the
> user-supplied derivation list as is, without sorting it, and then
> acquires locks in that order in ‘Worker::run’.

This diagnosis is incorrect: ‘Goals’ is a set sorted according to
‘CompareGoalPtrs’, which is a lexical sort that arranges for
substitution goals to come before derivation goals. Thus, ‘_topGoals’
and ‘awake2’ in ‘Worker::run’ are sorted in a deterministic fashion.

The problem is that ‘Worker::waitForAWhile’ reshuffles the order of
goals by temporarily moving goals out of the way. This can happen when
offloading replies “postpone”, which is inherently non-deterministic
(which goals are put to sleep varies from one session to another).

When those goals are eventually woken up from ‘Worker::waitForInput’,
they’re reprocessed in sorted order, but potentially with “holes”
compared to other ‘guix-daemon’ processes.

That’s only a partial explanation; we need to go further to come up
with an actual deadlock scenario.

Ludo’.
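
To make the ordering argument above concrete, here is a minimal C++
sketch. It is not the daemon's code: ‘Goal’, ‘CompareGoals’, the "a$"
and "b$" key prefixes, and the helper functions are hypothetical
stand-ins chosen for illustration. It only shows that a set ordered by
a lexical comparison yields the same processing order in every process,
and that setting some goals aside (as a “postpone” reply would) leaves
the remaining goals sorted but with “holes” that can differ between
processes.

```cpp
// Illustrative sketch only: simplified stand-ins, not the daemon's classes.
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical goal, reduced to the key it is sorted by.
struct Goal {
    std::string key;
};

// Assumed comparator in the spirit of 'CompareGoalPtrs': plain lexical order.
struct CompareGoals {
    bool operator()(const Goal& a, const Goal& b) const { return a.key < b.key; }
};

using Goals = std::set<Goal, CompareGoals>;

// Assumption: a key prefix makes substitution goals sort before derivation goals.
Goal substitutionGoal(const std::string& name) {
    assert(!name.empty());
    return Goal{"a$" + name};
}

Goal derivationGoal(const std::string& name) {
    assert(!name.empty());
    return Goal{"b$" + name};
}

// Keys of the awake goals, in the order a worker loop would visit them.
std::vector<std::string> processingOrder(const Goals& awake) {
    std::vector<std::string> order;
    for (const Goal& g : awake) order.push_back(g.key);
    return order;
}

// Model of postponement: goals whose key is listed are set aside, so the
// result is still sorted but has "holes" relative to the full set.
Goals withoutPostponed(const Goals& awake, const std::set<std::string>& postponedKeys) {
    Goals remaining;
    for (const Goal& g : awake)
        if (postponedKeys.count(g.key) == 0) remaining.insert(g);
    return remaining;
}
```

Calling ‘processingOrder’ on the same set of goals gives the same
sequence regardless of insertion order, whereas two calls to
‘withoutPostponed’ with different postponed keys model two daemon
processes that each see a sorted sequence with different goals missing.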