Date: Wed, 10 May 2023 13:39:05 +0200
From: Andreas Enge
To: guix-devel@gnu.org
Subject: Tooling for branch workflows

Hello all,

the title says it all: I wish to share some conclusions from working on the core-updates merge. Clearly our tooling could be improved for the task; there was some flying by night without instruments, and in the end I merged the branch without really being able to tell how it compared to master... (You may also blame it partially on my lack of patience.) Having feature branches may or may not make things a bit easier, but it will definitely not solve the problems.

This mail is of course also a bit politically sensitive: it may look like I am complaining about the work of other people, who are volunteers and do what they can, without offering to work on the code myself. So as a preamble, let me express my gratitude to the few people who have been working tirelessly on our tooling and contributing to our infrastructure, without whom big code changes like the ones we made on core-updates (and now on feature branches) would simply be impossible; their work is vital to the project and often not very visible. If I am critical, it is not to diminish their work, but to discuss a positive path forward; and I hope more people will find the motivation to do infrastructure work, which I think will be decisive for the success of Guix (together with policy and organisational questions).

We have two build farms, berlin and bordeaux (which is a good thing for checking reproducibility and for redundancy, but maybe a bit of a problem concerning hardware requirements for "exotic" architectures), running two different CI projects, cuirass and the Guix build coordinator (gbc in the following), respectively; both have a very low bus factor (1 to 2?), and it would be nice to get more people on board. For this, more documentation would be helpful. Both have pros and cons and are architected quite differently, so I do not know whether convergence is achievable.

I ended up relying mostly on cuirass, for reasons I do not completely remember any more. The dashboard with its green and red dots is a very useful tool compared to lists of builds, which become unusable with over 20000 packages. The bigger build power on berlin is helpful, and I found the web interface of gbc a bit slow and down a bit too often.

With this experience, I just filed three wishlist bugs for cuirass:

- Topological sorting in cuirass
  https://issues.guix.gnu.org/63412
  The lack of ordering of the builds is a big problem that wastes a lot of build power; it is solved in gbc and, I think, is the reason why the bordeaux build farm fares better for aarch64 with fewer machines. I would tag this as "important".

- Evaluation comparison on cuirass
  https://issues.guix.gnu.org/63414
  Without being able to compare a branch to master, it is difficult to decide whether one should merge. This is sort of solved in gbc, but so far the bordeaux build farm has been used more for QA of single patches (or a short list of patches featuring in a single issue) than for building complete branches.

- Stop and restart builds in cuirass
  https://issues.guix.gnu.org/63413
  Manual intervention is not easy in cuirass (I spent hours clicking on "restart" or using the REST API with a shell script through wget, which resulted in my IP being banned as a DoS suspect...); and to my knowledge, there is no web interface for doing so in gbc.
  In both systems one can probably tinker with the underlying databases, but this also does not qualify as "easy".

gbc just got a very nice feature on "blocking builds":
https://data.guix.gnu.org/revision/8f92dfd9ae7ac491ab7fb4b425799a8c909708a8/blocking-builds?system=aarch64-linux&target=none&limit_results=50
As I understand them, these are the "first failures": derivations all of whose inputs are available, but which fail themselves; so they show where work is needed (and where repairs will immediately make a difference). Once the topological sorting in cuirass is sorted out, these should be the builds marked as "Failed" (as opposed to "Failed (dependency)"), so with the first issue above handled, they could easily be shown by cuirass as well.

This was a long message to say "I filed three bugs", but maybe it can be the starting point to discuss more items on how to go forward with our build and CI infrastructure.

Andreas
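
To make the link between topological ordering and "blocking builds" concrete, here is a minimal Python sketch. It is not how cuirass or gbc are implemented; the function names, the status strings and the tiny package graph are all made up for illustration. It walks derivations with inputs first, skips anything whose inputs did not build (so no build power is spent on it), and reports the direct failures as the blocking builds.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def classify_builds(inputs, build_succeeds):
    """Walk derivations in topological order (inputs before dependents).

    inputs: dict mapping a derivation name to the set of its input names.
    build_succeeds: callable returning True if a derivation builds.
    Both are hypothetical stand-ins for what a CI system would provide.
    """
    status = {}
    attempted = []  # derivations we actually spent build power on
    for drv in TopologicalSorter(inputs).static_order():
        if any(status[dep] != "succeeded" for dep in inputs.get(drv, ())):
            # An input is missing: do not even try to build this one.
            status[drv] = "failed-dependency"
            continue
        attempted.append(drv)
        status[drv] = "succeeded" if build_succeeds(drv) else "failed"
    return status, attempted


def blocking_builds(status):
    """The "first failures": all inputs were available, the build itself failed."""
    return sorted(drv for drv, state in status.items() if state == "failed")


# Hypothetical dependency graph; only "python" is assumed to be broken.
INPUTS = {
    "glibc": set(),
    "gcc": {"glibc"},
    "hello": {"gcc"},
    "python": {"gcc"},
    "numpy": {"python"},
    "scipy": {"numpy"},
}
BROKEN = {"python"}

status, attempted = classify_builds(INPUTS, lambda drv: drv not in BROKEN)
print("blocking builds:", blocking_builds(status))
print("never attempted:",
      sorted(d for d, s in status.items() if s == "failed-dependency"))
```

In this toy graph, repairing the single blocking build would unlock everything marked as a dependency failure, which is the property that makes such a list useful for deciding where to work first.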