From: Ludovic Courtès
To: Mathieu Othacehe
Subject: bug#46402: Cuirass rebuilds the same package multiple times
Date: Wed, 17 Feb 2021 15:22:21 +0100
Message-ID: <87k0r6wz8i.fsf@gnu.org>
In-Reply-To: <87o8gs2mjq.fsf@gnu.org> (Mathieu Othacehe's message of "Wed, 10 Feb 2021 12:24:09 +0100")
References: <20210209141915.40114e57@tachikoma.lepiller.eu> <87lfbxs0w9.fsf@gnu.org> <87zh0c8ajo.fsf@gnu.org> <87o8gs2mjq.fsf@gnu.org>
Cc: 46402@debbugs.gnu.org

Howdy!

Mathieu Othacehe skribis:

>> Seems to me that ‘BuildSteps’ is an orthogonal concern that has little
>> to do with Cuirass’ job and with its data model.  In Hydra I saw that as
>> a (necessary) kludge.
>
> I'm not sure to follow you here. Cuirass and Hydra have an almost
> identical database schema and are now working very similarly from what I
> understand.
>
> In Hydra, a JobSet (Specification in Cuirass) has several Builds. Each
> Build can be broken in several BuildSteps, corresponding to transitive
> derivation inputs that must be built.
>
> Hydra manages to get those BuildSteps to be built in a topological
> order, in the same way as the Guix Build Coordinator.
>
> This makes me think that we could implement this exact same mechanism in
> Cuirass but I'm maybe missing something.

When Cuirass was started, I wanted to avoid what I perceived as a
shortcoming of Hydra’s design: one daemon connection per job and build
steps, which kinda replicate what the daemon is doing.  So I suggested
going for one connection for all the jobs and passing all the
derivations to the daemon so that the daemon can see the big picture,
make better scheduling decisions, and so we don’t have to re-implement
“build steps”.

But as you know, this strategy didn’t work out as expected because of
scalability issues in the daemon.
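
For readers unfamiliar with the “build steps” idea in the quoted text above, here is a rough sketch of what building transitive derivation inputs in topological order amounts to.  It is only an illustration in Python using the standard graphlib module (Python 3.9 or later); the derivation names and the build_steps helper are made up, and this is not a description of how Hydra, Cuirass, or the Coordinator implement it.

```python
from graphlib import TopologicalSorter

# Hypothetical derivation graph: each key maps to the set of derivations
# it needs as inputs.  The names are invented for illustration only.
DERIVATION_INPUTS = {
    "hello.drv": {"gcc.drv", "glibc.drv"},
    "gcc.drv": {"glibc.drv", "binutils.drv"},
    "binutils.drv": {"glibc.drv"},
    "glibc.drv": set(),
}


def build_steps(inputs):
    """Return the derivations in an order where every input comes
    before the derivations that depend on it."""
    return list(TopologicalSorter(inputs).static_order())


steps = build_steps(DERIVATION_INPUTS)
for step in steps:
    print("would build", step)
```

The point of the sketch is that a scheduler holding this graph is reifying part of the derivation graph itself, which is the same information the daemon already works from.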

Regardless, it seems to me that ‘BuildSteps’ is a low-level thing
compared to the rest of the Cuirass database: it reifies part of the
derivation graph whereas the rest of the database is all about “jobs”
and “builds” thereof.  It’s not the same abstraction level.

I realize it’s somewhat subjective though and I don’t want to impede
progress!

>> If Cuirass would instead delegate derivation build requests to a
>> Coordinator/daemon-like thing, it wouldn’t have to worry about those
>> details.  That would better separate concerns.
>
> I think that having Cuirass delegating its builds to the Coordinator is
> not the right move. That would mean doubling the size of the CI code
> base, doubling the number of databases, for a feature that we could
> implement in Cuirass, just by making it catch-up on Hydra.

I see.  Generally speaking, I think better separation of concerns may
sometimes be worth extra code, insomuch as it makes it easier to reason
about things, to debug, and to add new features.  Of course it’s a
tradeoff; adding too much code just for the beauty of abstractions isn’t
reasonable either.

I wonder if having two databases instead of a single one (which would
essentially be the union of those two databases) is a problem.  I guess
one problem is if that makes it hard to make commonly-needed “joins”
across the two databases.

Regarding features, one thing I like about the Coordinator is its
support for retrying builds, which could serve to detect flaky builds or
build processes that are kernel- or hardware-dependent.  I think it’s a
feature we’d want eventually, but I wonder if it should be Cuirass’s
job.

It’d be nice to focus on a single code base for “distributed builds” in
general, and I was hoping for a Coordinator/Cuirass convergence on this
aspect.
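
As an illustration of how retrying builds could serve to detect flaky ones, here is a minimal sketch in Python.  It assumes a hypothetical build callable that returns True on success and False on failure; the classify_build name and the three-attempt default are inventions for the example, not the Coordinator’s actual interface or policy.

```python
from collections import Counter
from typing import Callable


def classify_build(build: Callable[[], bool], attempts: int = 3) -> str:
    """Run `build` exactly `attempts` times and classify the outcome.

    `build` is a hypothetical stand-in for "ask a worker to build this
    derivation"; it returns True on success and False on failure.
    """
    results = Counter(build() for _ in range(attempts))
    if results[True] == attempts:
        return "succeeds"
    if results[False] == attempts:
        return "fails"
    # A mix of successes and failures for the same input.
    return "flaky"


# Usage with a simulated build that fails once and then succeeds.
outcomes = iter([False, True, True])
print(classify_build(lambda: next(outcomes)))
```

Running the attempts on different machines rather than in a loop on one machine would be the way to surface the kernel- or hardware-dependent cases mentioned above; the sketch leaves that scheduling question aside.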

But at the end of the day, what matters most is what we achieve.
Cuirass has been doing so much better on many fronts over the last few
weeks, including reliability, build throughput, and monitoring.  At the
same time, the Coordinator proves useful and easy to deploy in more
experimental setups; I think Chris’s instance now aggregates results
from a variety of machines, including POWER and GNU/Hurd, and that
seemed quite easy to do.  I’m not going to complain about over-success
in this area!  :-)

Ludo’.