From: Gábor Boskovits
Date: Tue, 31 May 2022 06:54:05 +0200
Subject: Re: Faster "guix pull" by incremental compilation and non-circular modules?
To: Maxime Devos
Cc: Ludovic Courtès, Ricardo Wurmus, Guix Devel

Hello Maxime,

Maxime Devos <maximedevos@telenet.be> wrote (on Mon, 28 Feb 2022, 19:51):

> Ludovic Courtès wrote on Mon 28-02-2022 at 14:17 [+0100]:
> > Hi,
> >
> > Maxime Devos <maximedevos@telenet.be> wrote:
> >
> > >   2. Instead of building all of Guix as a single derivation,
> > >      create a DAG of derivations.  More concretely:
> > >
> > >      First read the *.scm files to determine which module imports
> > >      which modules. Then to compile, say, (gnu packages acl),
> > >      a derivation taking gnu/packages/acl.scm and its dependencies
> > >      gnu/packages/attr.go, gnu/packages/base.go, ... is made
> > >      compiling gnu/packages/acl.scm to a gnu/packages/acl.go.
> > >
> > >      Then to build all of Guix, 'union-build' or 'file-union' is used.
> >
> > This is what (guix self), used by ‘guix pull’, is already doing.
> >
> > However, currently, package modules are split in just two groups: the
> > “base” group is the closure of (guix packages base), and the second
> > group has all the rest:
> >
> > [...]
>
> Looking at (guix self), it also has a few groups for non-package
> modules (system tests, scripts, ...).
>
> > At its core though, the situation pretty much reflects the free software
> > situation: there are low-level packages (glibc, GCC, GTK, etc.) that
> > might depend on high-level packages (Python, Pandoc, Rust, etc.).
> >
> > It’s not easy to split this spaghetti ball in smaller groups.
>
> It's not easy to manually split the spaghetti, but we don't have
> to, we could let the computer split the spaghetti for us (at least
> partially, because of the circular imports), by computing the graph of
> strongly-connected components and considering each SCC to be a ‘group’,
> some of which depend on other groups, forming a DAG.
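
For reference, a minimal sketch of that SCC grouping, written in Python
rather than Guile only to keep it short. The module names and import edges
are invented for illustration, and the helper names
(`strongly_connected_components`, `group_dag`) are hypothetical; this is not
what (guix self) does today.

```python
# Hypothetical import graph: module -> modules it imports.
# The edges are invented for illustration, not taken from Guix.
IMPORTS = {
    "gnu packages acl": ["gnu packages attr", "gnu packages base"],
    "gnu packages attr": ["gnu packages base"],
    "gnu packages base": ["gnu packages gcc"],
    "gnu packages gcc": ["gnu packages base"],  # circular import
}


def strongly_connected_components(graph):
    """Return the SCCs of `graph` (Tarjan's algorithm), dependencies first."""
    index = {}      # discovery order of each node
    lowlink = {}    # smallest index reachable from the node
    stack, on_stack = [], set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, ()):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            # `node` is the root of an SCC: pop its members off the stack.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(frozenset(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


def group_dag(graph):
    """Map each group (SCC) to the set of other groups it depends on."""
    components = strongly_connected_components(graph)
    group_of = {module: c for c in components for module in c}
    dag = {c: set() for c in components}
    for module, deps in graph.items():
        for dep in deps:
            if group_of[dep] != group_of[module]:
                dag[group_of[module]].add(group_of[dep])
    return dag
```

Each key of `group_dag(IMPORTS)` would correspond to one derivation, built
from the outputs of the groups it depends on. The recursion is fine for a
toy graph; a real module graph would probably want an iterative version.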

I was thinking about a slightly different structure that can also be
automated. My original idea was to use the existing tree structure of
the derivations and split it based on depth. I think that gives a bit
more structure, but it might require splitting things that are currently
together (for example, IIRC we sometimes define bootstrap packages by
inheriting from the fully fledged ones, which introduces a syntactic
dependency on something that is higher up the tree). WDYT?
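
To make the depth idea a bit more concrete, a rough sketch in Python. It
assumes the dependency graph is available as a plain mapping and is acyclic;
the names and edges are invented, and `depths` / `split_by_depth` are
hypothetical helpers, not existing Guix procedures.

```python
from collections import defaultdict

# Hypothetical dependency graph: derivation -> derivations it depends on.
# Names and edges are invented for illustration.
DEPENDS_ON = {
    "glibc": [],
    "gcc": ["glibc"],
    "attr": ["glibc"],
    "python": ["gcc", "glibc"],
    "pandoc": ["python"],
}


def depths(graph):
    """Depth of a node = length of the longest dependency chain below it."""
    memo = {}
    in_progress = set()

    def depth(node):
        if node in memo:
            return memo[node]
        if node in in_progress:
            raise ValueError(f"cycle through {node!r}; graph must be acyclic")
        in_progress.add(node)
        deps = graph.get(node, ())
        # A node without dependencies gets depth 0.
        memo[node] = 1 + max((depth(d) for d in deps), default=-1)
        in_progress.discard(node)
        return memo[node]

    for node in graph:
        depth(node)
    return memo


def split_by_depth(graph):
    """Group nodes into layers; layer n only depends on layers below n."""
    layers = defaultdict(set)
    for node, d in depths(graph).items():
        layers[d].add(node)
    return [layers[d] for d in sorted(layers)]
```

With the toy graph, `split_by_depth(DEPENDS_ON)` puts glibc in the bottom
layer, gcc and attr in the next one, then python, then pandoc. The caveat
above still applies: a syntactic dependency between modules can point upward
even when the derivation graph itself does not.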

Regards,
g_bor


> I believe Ricardo Wurmus has some script for computing the SCC?
>
> Splitting large SCC in smaller parts can be left as an exercise
> for later, it's a somewhat orthogonal concern.
>
> > Thoughts?
>
> I think it would be nice to let (guix self) automatically determine the
> DAG of groups.  It would reduce the ad-hocness of the *...-modules*
> variables (some care required for patches, guix/man-db.scm, .js ...).
>
> It would also make the node tree wide, which could reduce memory usage
> (which might help with the ‘guix pull segfaults on i686-linux’
> reports).  In case of a crash (*), "guix pull" does not have to start
> over from scratch, which would also help with those reports.
>
> (*) This does not help with "failed to compute the derivation of Guix".
>
> Some work would be required, but I think it will be worth it, and it
> only has to be done once.
>
> TBC, this was just an idea I wanted to share, I won't be working on it
> in the foreseeable future.
>
> Greetings,
> Maxime.