From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ludovic Courtès
To: Maxim Cournoyer
Cc: 55441@debbugs.gnu.org, Mathieu Othacehe
Subject: bug#55441: [cuirass] hang in "In progress..."; runs out of pgsql connections
Date: Tue, 24 May 2022 23:02:13 +0200
Message-ID: <87k0aaiqzu.fsf@gnu.org>
In-Reply-To: <87r14ovyud.fsf@gnu.org> (Ludovic Courtès's message of "Fri, 20 May 2022 20:32:58 +0200")
References: <87fslcaznn.fsf@gmail.com> <87mtfj174l.fsf@gnu.org>
 <878rr2kug6.fsf_-_@gmail.com> <87pmke6ig4.fsf@gnu.org>
 <87o7zxalhu.fsf@gnu.org> <87k0alahug.fsf@gnu.org>
 <87fsl9acuw.fsf@gnu.org> <878rr1jsd1.fsf@gmail.com>
 <87fsl87gb5.fsf@gnu.org> <874k1n5loz.fsf@gnu.org>
 <87r14ovyud.fsf@gnu.org>
List-Id: Bug reports for GNU Guix

Hi!

Ludovic Courtès skribis:

> Fixed in Guix commit a4994d739306abcf3f36706012fb88b35a970e6b with a
> test that reproduces the issue.
>
> Commit d02b7abe24fac84ef1fb1880f51d56fc9fb6cfef updates the ‘guix’
> package so we should be able to reconfigure berlin now and hopefully
> (crossing fingers!) be done with it.

An update: Cuirass is now up-to-date on berlin.guix, built from Guix
commit adf5ae5a412ed13302186dd4ce8e2df783d4515d.

Unfortunately, while evaluations now run to completion, child processes
of ‘cuirass evaluate’ stick around at the end:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0x7f5b1d054f08) at ../sysdeps/nptl/futex-internal.h:146
#1  __lll_lock_wait (futex=futex@entry=0x7f5b1d054f08, private=0) at lowlevellock.c:52
#2  0x00007f5b1d873ef3 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f5b1d054f08) at ../nptl/pthread_mutex_lock.c:80
#3  0x00007f5b1d995303 in scm_c_weak_set_remove_x (pred=, closure=0x7f5b13dd8d00, raw_hash=1824276156261873434, set=#) at weak-set.c:794
#4  scm_weak_set_remove_x (obj=# 7f5b13dd8d00>, set=#) at weak-set.c:817
#5  close_port (explicit=, port=# 7f5b13dd8d00>) at ports.c:891
#6  close_port (port=# 7f5b13dd8d00>, explicit=) at ports.c:874
#7  0x00007f5af3a7df82 in ?? ()
#8  0x0000000000dbd860 in ?? ()
#9  0x00007f5af3a7df60 in ?? ()
#10 0x0000000000db82b8 in ?? ()
#11 0x00007f5b1d972ccc in scm_jit_enter_mcode (thread=0x7f5b157bf240, mcode=0xdbd86c "\034\217\003") at jit.c:6038
#12 0x00007f5b1d9c7f3c in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:360
#13 0x00007f5b1d9d55e9 in scm_call_n (proc=, argv=, nargs=0) at vm.c:1608
#14 0x00007f5b1d939a0e in scm_call_with_unblocked_asyncs (proc=#) at async.c:406
#15 0x00007f5b1d9c8336 in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:972
#16 0x00007f5b1d9d55e9 in scm_call_n (proc=, argv=, nargs=0) at vm.c:1608
#17 0x00007f5b1d9c4be6 in really_launch (d=0x7f5aebccac80) at threads.c:778
#18 0x00007f5b1d93b85a in c_body (d=0x7f5aea691d80) at continuations.c:430
#19 0x00007f5aeeb118c2 in ?? ()
#20 0x00007f5b1553d7e0 in ?? ()
#21 0x00007f5b138a7370 in ?? ()
#22 0x0000000000000048 in ?? ()
#23 0x00007f5b1d972ccc in scm_jit_enter_mcode (thread=0x7f5b157bf240, mcode=0xdbc874 "\034<\003") at jit.c:6038
#24 0x00007f5b1d9c7f3c in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:360
#25 0x00007f5b1d9d55e9 in scm_call_n (proc=, argv=, nargs=2) at vm.c:1608
#26 0x00007f5b1d93d09a in scm_call_2 (proc=, arg1=, arg2=) at eval.c:503
#27 0x00007f5b1d9f3752 in scm_c_with_exception_handler.constprop.0 (type=#t, handler_data=handler_data@entry=0x7f5aea691d10, thunk_data=thunk_data@entry=0x7f5aea691d10, thunk=, handler=) at exceptions.c:170
#28 0x00007f5b1d9c588f in scm_c_catch (tag=, body=, body_data=, handler=, handler_data=, pre_unwind_handler=, pre_unwind_handler_data=0x7f5b156b2040) at throw.c:168
#29 0x00007f5b1d93de66 in scm_i_with_continuation_barrier (pre_unwind_handler=0x7f5b1d93db80 , pre_unwind_handler_data=0x7f5b156b2040, handler_data=0x7f5aea691d80, handler=0x7f5b1d9448b0 , body_data=0x7f5aea691d80, body=0x7f5b1d93b850 ) at continuations.c:368
#30 scm_c_with_continuation_barrier (func=, data=) at continuations.c:464
#31 0x00007f5b1d9c4b39 in with_guile (base=0x7f5aea691e08, data=0x7f5aea691e30) at threads.c:645
#32 0x00007f5b1d89b0ba in GC_call_with_stack_base () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#33 0x00007f5b1d9bd16d in scm_i_with_guile (dynamic_state=, data=0x7f5aebccac80, func=0x7f5b1d9c4b70 ) at threads.c:688
#34 launch_thread (d=0x7f5aebccac80) at threads.c:787
#35 0x00007f5b1d871d7e in start_thread (arg=0x7f5aea692640) at pthread_create.c:473
#36 0x00007f5b1d46feff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info threads
  Id   Target Id                Frame
* 1    process 53801 "guile"    futex_wait (private=0, expected=2, futex_word=0x7f5b1d054f08) at ../sysdeps/nptl/futex-internal.h:146
--8<---------------cut here---------------end--------------->8---

Notice there’s a single thread: it very much looks like the random
results one gets when forking a multithreaded process (in this case,
this one thread is a finalization thread, except it’s running in a
process that doesn’t actually have the other Guile threads). The
fork+threads problem is already manifesting, after all.

I’ll try and come up with a solution to that, if nobody beats me to it.

What’s annoying is that it’s not easy to test: the problem doesn’t
manifest on my 4-core laptop, but it does on the 96-core berlin.

To be continued…

Ludo’.
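
For readers less familiar with the fork+threads hazard mentioned above, here is a minimal sketch of the general mechanism. It is written in Python rather than Guile and is not Cuirass or Guile code; the function name `child_sees_lock_held` and its `timeout` parameter are made up for this illustration. The assumption is only the general POSIX behaviour: the child of fork() gets a copy of memory, including any lock that is held at that moment, but only the thread that called fork() exists in the child, so nothing there will ever release that lock.

```python
import os
import threading


def child_sees_lock_held(timeout=1.0):
    """Fork while another thread holds a lock.

    Returns True if the forked child could NOT take the lock within
    `timeout` seconds. (Hypothetical helper, for illustration only.)
    """
    lock = threading.Lock()
    locked = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            locked.set()     # tell the main thread the lock is now held
            release.wait()   # keep holding it until the parent says so

    t = threading.Thread(target=holder)
    t.start()
    locked.wait()            # make sure we fork while the lock is held

    pid = os.fork()
    if pid == 0:
        # Child: the `holder` thread does not exist here, but the lock's
        # "held" state was copied, so this wait can only time out.
        got_it = lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)

    # Parent: collect the child's verdict, then let the holder finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.WEXITSTATUS(status) == 1


if __name__ == "__main__":
    print("child blocked on inherited lock:", child_sees_lock_held())
```

The sketch uses a timeout so that the child exits instead of hanging. A plain blocking lock call in the child would wait forever, which is roughly the situation in the backtrace above, where the only thread in the process sits in a pthread mutex lock. Recent Python versions may emit a DeprecationWarning when forking with several threads alive, for this very reason.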