From: Ludovic Courtès
To: Mathieu Othacehe
Cc: 43565@debbugs.gnu.org
Subject: bug#43565: cuirass: Fibers scheduling blocked.
Date: Mon, 26 Oct 2020 17:20:29 +0100
Message-ID: <87tuuh9cxe.fsf@gnu.org>
In-Reply-To: <87v9excbj8.fsf@gnu.org> (Mathieu Othacehe's message of "Mon, 26 Oct 2020 15:22:19 +0100")
References: <87eemtzr1q.fsf@gnu.org> <87r1qc27mo.fsf@gnu.org> <874kmmzd92.fsf@gnu.org> <871rhpqgjy.fsf@gnu.org> <87v9excbj8.fsf@gnu.org>
List-Id: Bug reports for GNU Guix

Hello!

Mathieu Othacehe wrote:

>> But does Cuirass create file descriptors as O_NONBLOCK? This has to be
>> done explicitly, Fibers won’t do it for us. As it turns out, the answer
>> is no, in at least one important case: the connection to the daemon
>> (untested patch below).
>>
>> While GC is running, Cuirass typically sends ‘build-derivations’ RPCs
>> and they block until the GC lock is released. That can lead to the
>> situation above: a bunch of threads blocked in ‘read’ from their daemon
>> socket, waiting for the RPC reply. OTOH, ‘build-derivations’ RPCs are
>> made from a fresh thread created by ‘build-derivations&’.
>
> While I agree not opening file descriptors with O_NONBLOCK is an issue,
> build-derivations is called in a separate thread. Blocking this separate
> thread should not block the fibers.

Agreed.

> Now the question is why there's no fetching while the GC is running? The
> answer is that "latest-repository-commit" called by "fetch-input" will
> block the only fiber dedicated to fetching.
> Having multiple fibers
> trying to fetch wouldn't solve anything because fetching requires some
> building from the daemon.

Exactly: when the GC lock is taken, ‘latest-repository-commit’ makes an
‘add-to-store’ RPC, and that RPC blocks. Thus the whole fetch fiber is
blocked.

The patch should address this case. That said, nothing useful happens
anyway while the GC lock is held, so it wouldn’t have any practical
effect.

I believe there are other cases where RPCs can be slow, for example when
there’s contention on the sqlite database. Perhaps the patch could help a
bit there, although again, it’s a situation where nothing useful can
happen.

> Long story short, I think we can apply your patch that can be useful to
> prevent fibers talking directly to the daemon to block, even though it
> won't help for this particular hang, which will only be fixed once the
> GC time is reduced to something more acceptable.

Yeah, please go ahead if you want, or let me know if you’d rather let me
apply it.

Thanks!

Ludo’.
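
For illustration only, here is a minimal sketch of what "creating a file
descriptor as O_NONBLOCK" changes for a reader waiting on an RPC reply.
It is written in Python rather than Guile, it is not the untested patch
discussed in this thread, and the names `set_nonblocking` and `try_read`
as well as the socket pair standing in for the daemon connection are
assumptions made for the example.

```python
import fcntl
import os
import socket


def set_nonblocking(fd: int) -> None:
    """Add O_NONBLOCK to the status flags of `fd`, keeping the other flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def try_read(fd: int, size: int = 4096):
    """Return the bytes read, or None if the read would have blocked."""
    try:
        return os.read(fd, size)
    except BlockingIOError:
        # EAGAIN/EWOULDBLOCK: nothing to read yet, so control returns to
        # the caller instead of the thread sleeping inside read().
        return None


# Hypothetical stand-in for the client/daemon connection:
# a connected pair of Unix-domain sockets.
client, daemon = socket.socketpair()
set_nonblocking(client.fileno())
is_nonblocking = bool(fcntl.fcntl(client.fileno(), fcntl.F_GETFL) & os.O_NONBLOCK)

# The "daemon" has not replied yet (think: it is waiting for the GC lock).
no_reply_yet = try_read(client.fileno())

# Once the reply is sent, the same call returns the data.
daemon.sendall(b"reply")
reply = try_read(client.fileno())

client.close()
daemon.close()
```

Without the `set_nonblocking` call, the first `try_read` would sit in
`read` until the peer writes something, which is the situation described
above for a fiber whose RPC is waiting on the daemon. With the flag set,
the would-block condition is reported to the caller, which is what gives
a cooperative scheduler the chance to run something else in the meantime.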