From: Andreas Enge <andreas@enge.fr>
To: 63412@debbugs.gnu.org
Subject: bug#63412: Topological sorting in cuirass
Date: Wed, 10 May 2023 12:06:28 +0200
Message-ID: <ZFtspPexmg3YM/ug@jurong>

This is a wishlist bug, but it is important for architectures where we are currently short on build power, and where this issue can stall builds and waste an arbitrary amount of build power.

Cuirass should sort builds and only offload derivations for which all inputs are available. As I currently understand it, Cuirass offloads arbitrary derivations, and the machine to which they are offloaded then starts recursively building all inputs. If this is true, then it is possible that at some point in time all build slots are taken by the same package, built as many times as there are machines; I have seen something like this when working on core-updates, where several machines were building the main gcc compiler at the same time.
At worst, if Cuirass asks every machine to build a leaf package, this may result in a simultaneous full bootstrap on all of them.

The situation becomes worse when the package in question fails. Then, as I understand it, each machine may receive a request to build something that depends on the failing package, attempt the failing build itself, and thus waste build power that will not be available to build other packages successfully.

Solving this problem may also make reports of build failures more accurate and legible. For instance, doxygen currently fails to build on aarch64:
   https://ci.guix.gnu.org/build/969427/details
and is reported as "Failed", and not as "Failed (dependency)". However, looking at the build log
   https://ci.guix.gnu.org/build/969427/log/raw
shows this:

   ...
   building path(s) `/gnu/store/p5vqrwywz053r1vkiyw54dp9gj7vw9xd-ninja-1.11.1'
   ...
   builder for `/gnu/store/0zf7fqndzf2k595r4s6wblmpccdwr3nx-ninja-1.11.1.drv' failed with exit code 1
   @ build-failed /gnu/store/0zf7fqndzf2k595r4s6wblmpccdwr3nx-ninja-1.11.1.drv - 1 builder for `/gnu/store/0zf7fqndzf2k595r4s6wblmpccdwr3nx-ninja-1.11.1.drv' failed with exit code 1
   cannot build derivation `/gnu/store/hlscqram59id51hxg0fj15041v52h1kw-meson-1.1.0.drv': 1 dependencies couldn't be built
   cannot build derivation `/gnu/store/w8qxkrwpffd9qs5w1jggy1yi27ycm0xr-jsoncpp-1.9.5.drv': 2 dependencies couldn't be built
   cannot build derivation `/gnu/store/mss4yv015cil1vnjnglq506m83b7n3dy-cmake-bootstrap-3.24.2.drv': 1 dependencies couldn't be built
   cannot build derivation `/gnu/store/w0irp6xn30nlmpizhcbjnvhqmsba41jn-cmake-minimal-3.24.2.drv': 2 dependencies couldn't be built
   cannot build derivation `/gnu/store/rqk2rbnpjpcnqswz8hqari1rnw6r8v1m-doxygen-1.9.5.drv': 1 dependencies couldn't be built

So it is indeed a different package that fails (and the last few lines give a list of dependencies between ninja and doxygen, each of which may or may not fail once ninja is fixed).
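
To illustrate how a log like the one above already separates the real failure from its victims, here is a minimal Python sketch. It is not part of Cuirass or Guix; the function name `classify_log` and the two regular expressions are assumptions, written only against the message shapes visible in the excerpt.

```python
import re

# Patterns assumed from the log excerpt quoted in this report; other
# daemon versions may word these messages differently.
FAILED = re.compile(r"builder for `([^']+)' failed with exit code (\d+)")
BLOCKED = re.compile(
    r"cannot build derivation `([^']+)': (\d+) dependencies couldn't be built"
)


def classify_log(log):
    """Split derivations named in a build log into two groups.

    "failed" holds derivations whose own builder failed;
    "failed_dependency" holds derivations that were never attempted
    because one of their inputs could not be built.
    """
    failed, blocked = [], []
    for line in log.splitlines():
        m = FAILED.search(line)
        if m and m.group(1) not in failed:  # the message may be repeated
            failed.append(m.group(1))
        m = BLOCKED.search(line)
        if m and m.group(1) not in blocked:
            blocked.append(m.group(1))
    return {"failed": failed, "failed_dependency": blocked}
```

Applied to the excerpt, only the ninja derivation would land in "failed", while meson, jsoncpp, cmake-bootstrap, cmake-minimal and doxygen would land in "failed_dependency".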
Notice that this could be solved without a topological sorting of the dependency graph: it would be enough to keep an array deriv in which deriv[i] contains a list of derivations requiring i more inputs to be built, together with the list of inputs. Elements in deriv[0] are ready to be sent to a build machine; upon completion of a build, all derivations depending on it should be moved from deriv[i] to deriv[i-1] if the input has been built successfully, or marked as "Failed (dependency)" if the input has failed. (But this could be expensive, and may require appropriate data structures.)

Alternatively, build jobs could be sorted topologically and then kept in a list; by the time a job is about to be sent out, a build of each of its inputs will already have been attempted. The job should then be sent if all inputs are available, or be marked as "Failed (dependency)" if any of them has failed.

Andreas
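
The counting scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cuirass code: the class `BuildQueue`, its method names and the toy graph are hypothetical, and instead of physically moving entries between deriv[i] lists it keeps one counter of missing inputs per derivation, which is equivalent and avoids the expensive list moves. Inputs that do not appear as keys of the graph are assumed to be available already.

```python
from collections import defaultdict, deque


class BuildQueue:
    """Hand out only derivations whose inputs have all been built."""

    def __init__(self, inputs):
        # inputs: dict mapping a derivation to the set of its inputs.
        # Inputs that are not keys of the dict are assumed to be available.
        self.missing = {
            d: sum(1 for dep in deps if dep in inputs)
            for d, deps in inputs.items()
        }
        self.dependents = defaultdict(set)  # reverse edges of the graph
        for d, deps in inputs.items():
            for dep in deps:
                if dep in inputs:
                    self.dependents[dep].add(d)
        self.status = {d: "Pending" for d in inputs}
        # Plays the role of deriv[0]: nothing left to wait for.
        self.ready = deque(d for d in sorted(inputs) if self.missing[d] == 0)

    def next_job(self):
        """Return a derivation that may be offloaded now, or None."""
        if not self.ready:
            return None
        d = self.ready.popleft()
        self.status[d] = "Running"
        return d

    def finished(self, d, ok):
        """Record the outcome of a build and update its dependents."""
        if ok:
            self.status[d] = "Succeeded"
            for dep in sorted(self.dependents[d]):
                if self.status[dep] != "Pending":
                    continue  # already marked as failed elsewhere
                self.missing[dep] -= 1  # deriv[i] -> deriv[i-1]
                if self.missing[dep] == 0:
                    self.ready.append(dep)
        else:
            self.status[d] = "Failed"
            stack = [d]
            while stack:  # mark everything downstream, transitively
                for dep in self.dependents[stack.pop()]:
                    if self.status[dep] == "Pending":
                        self.status[dep] = "Failed (dependency)"
                        stack.append(dep)


# Toy graph with made-up edges; not the real Guix dependency graph.
queue = BuildQueue({"ninja": set(), "meson": {"ninja"},
                    "doxygen": {"meson"}, "hello": set()})
```

With this toy graph, only hello and ninja are offered at first; reporting ninja as failed marks meson and doxygen as "Failed (dependency)" without either of them ever being sent to a build machine.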