From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 23 Nov 2021 14:39:49 +0000
To: zimoun
From: Jacob Hrbek
Reply-To: Jacob Hrbek
Cc: "guix-devel@gnu.org"
Subject: Re: Proposal: Build timers
In-Reply-To: <865ysjw0ek.fsf@gmail.com>
References: <868rxfwuib.fsf@gmail.com> <865ysjw0ek.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha512; boundary="------eafafa0950e0480065f6bdaa978feb32d2947329349455adcd7343ac2612b60e"; charset=utf-8

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--------eafafa0950e0480065f6bdaa978feb32d2947329349455adcd7343ac2612b60e
Content-Type: multipart/mixed;boundary=---------------------f5c1f30e0dbf49b267b8cb2d4ce12875

-----------------------f5c1f30e0dbf49b267b8cb2d4ce12875
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;charset=utf-8

> Are you assuming here that the two machines are the same? Or are they different?

I am assuming that the two machines are identical except for CPU frequency and thread count, which serve as the variables for estimating the build time from a known value.

> This approximation would not even be accurate enough for the same machine. For instance, the test suite of the julia package runs mainly sequential using one thread...

I am aware of this scenario and I adapted the equation for it, but I recognize that the inaccuracy increases exponentially with more threads. I don't believe there is a mathematical way to handle that scenario with the provided values, so we would have to adjust the calculation for those packages.

An alternative and more robust solution would be to build the package with `--jobs` set to "cpu threads - 1" and then compare the build time to a build without that setting. This would give us two coordinates to be interpreted in an exponential, and eventually quadratic or even logarithmic, function.

Since we already have to build the package multiple times to establish reproducibility, this shouldn't add any overhead, just a slightly slower reproduction build.
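
To make the two approaches above concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not code from Guix. `linear_estimate` is the `frequency * threads` scaling from the original proposal (quoted further down). `fit_amdahl` shows one possible way to use two timed builds: it fits Amdahl's law, which is an assumption swapped in here for illustration and not the exponential/quadratic/logarithmic fit mentioned above. All function and parameter names are hypothetical.

```python
def linear_estimate(known_seconds, known_threads, known_ghz,
                    target_threads, target_ghz):
    """Scale a known build time by (frequency * threads)."""
    # Normalised cost: build time per one thread at 1 GHz.
    cost = known_seconds * (known_ghz * known_threads)
    return cost / (target_ghz * target_threads)


def fit_amdahl(jobs_a, seconds_a, jobs_b, seconds_b):
    """Fit T(n) = t1 * ((1 - p) + p / n) through two (jobs, seconds) points.

    Returns (t1, p): the estimated single-job time and the parallel fraction.
    Assumption: the build follows Amdahl's law on the measuring machine.
    """
    if jobs_a == jobs_b:
        raise ValueError("the two measurements need different job counts")
    # T(n) = serial + parallel / n is linear in 1/n, so two points suffice.
    parallel = (seconds_a - seconds_b) / (1 / jobs_a - 1 / jobs_b)
    serial = seconds_a - parallel / jobs_a
    t1 = serial + parallel
    return t1, parallel / t1


def amdahl_estimate(t1, p, jobs):
    """Predicted build time with the given number of jobs."""
    return t1 * ((1 - p) + p / jobs)


# Numbers from the original proposal: 100 s on 8 threads at 2.4 GHz,
# estimated for 2 threads at 2.4 GHz.
print(linear_estimate(100, 8, 2.4, 2, 2.4))
```

With the proposal's numbers the linear estimate comes out at about 400 seconds. The Amdahl fit only describes how one machine scales with the job count; it says nothing about frequency or other hardware differences, so it would complement the linear scaling rather than replace it.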

The exponential function would be less vulnerable to tasks that cannot use all available processing resources, but it is less accurate, so I assume that we would eventually have to use a quadratic/logarithmic definition that applies the previously provided equation to both coordinates to determine the build time on a different system. This seems robust in my current testing, and I am willing to give it more time if we (me and other GNU Guix devs/contributors) agree that this would be a mergeable effort, as it would basically require a single value, preferably stored in the guix repo, that is used to calculate the build time.

-- Jacob "Kreyren" Hrbek

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Tuesday, November 23rd, 2021 at 11:56 AM, zimoun wrote:

> Hi,
>
> On Tue, 23 Nov 2021 at 06:21, Jacob Hrbek kreyren@rixotstudio.cz wrote:
>
> > 1. locally: Storing the value somewhere on the system and adding up to
> > it each build to provide more accurate average.
>
> Timing is already stored, see “guix build --log-file”. Give a look at
> ’/var/log/guix/drvs’.
> For instance,
>
> --8<---------------cut here---------------start------------->8---
> $ bzcat /var/log/guix/drvs/aq/abymi9yk7pv89614dcdfll3hh4i5mc-julia-1.5.3.drv.bz2 | grep phase | grep seconds
> phase `set-SOURCE-DATE-EPOCH' succeeded after 0.0 seconds
> phase `set-paths' succeeded after 0.0 seconds
> phase `install-locale' succeeded after 0.0 seconds
> phase `unpack' succeeded after 1.1 seconds
> phase `use-system-libwhich' succeeded after 0.0 seconds
> phase `disable-documentation' succeeded after 0.0 seconds
> phase `prepare-deps' succeeded after 0.0 seconds
> phase `bootstrap' succeeded after 0.0 seconds
> phase `patch-usr-bin-file' succeeded after 0.0 seconds
> phase `patch-source-shebangs' succeeded after 0.2 seconds
> phase `patch-generated-file-shebangs' succeeded after 0.0 seconds
> phase `fix-include-and-link-paths' succeeded after 0.0 seconds
> phase `replace-default-shell' succeeded after 0.0 seconds
> phase `fix-precompile' succeeded after 0.0 seconds
> phase `build' succeeded after 354.3 seconds
> phase `set-home' succeeded after 0.0 seconds
> phase `disable-broken-tests' succeeded after 0.0 seconds
> phase `check' succeeded after 7428.8 seconds
> phase `install' succeeded after 16.0 seconds
> phase `make-wrapper' succeeded after 0.0 seconds
> phase `patch-shebangs' succeeded after 0.0 seconds
> phase `strip' succeeded after 0.0 seconds
> phase `validate-runpath' succeeded after 0.0 seconds
> phase `validate-documentation-location' succeeded after 0.0 seconds
> phase `delete-info-dir-file' succeeded after 0.0 seconds
> phase `patch-dot-desktop-files' succeeded after 0.0 seconds
> phase `install-license-files' succeeded after 0.0 seconds
> phase `reset-gzip-timestamps' succeeded after 0.0 seconds
> phase `compress-documentation' succeeded after 0.0 seconds
> --8<---------------cut here---------------end--------------->8---
>
> Therefore, you need to extract somehow that
> information.
>
> > optionally This local database can be shared across multiple
> > systems that add values to it like simple listener waiting for POST
> > requests.
>
> It should be better to use a content-addressed distribution such as IPFS
> or GNUnet, IMHO.
>
> > - within the guix repo: Since we are already building the package we
> > can take the time and then do the provided math in reverse to
> > calculate the time:
> >
> > Build took 100 sec on system with 8 threads at 2.4 GHz max cpu frequency:
> >
> > 100 * (2.4 * 8) = 1920 (build time value per one thread at 1 GHz)
> >
> > Building the package on system with 2 threads at 2.4 GHz max cpu frequency:
> >
> > 1920 / (2 * 2.4) = 400
> >
> > We can then assume that the build will take 1920/400=4.8 -> 4.8
> > times longer on this system.
>
> Are you assuming here that the two machines are the same? Or are they
> different?
>
> This approximation would not even be accurate enough for the same
> machine. For instance, the test suite of the julia package runs mainly
> sequential using one thread. If you go back to numbers above,
> build=354.3 seconds and check=7428.8 seconds, so the number of threads
> only tweaks timing of build phase, which will not impact much the
> overall timing.
>
> Somehow, IIUC your proposal, you would like, based on timings from
> machine A about a set of packages, and timings from machine B about the
> same set of packages, knowing the timing of machine B for package foo,
> then extrapolate timing for machine A of package foo. The maths for
> that are not linear, IMHO, and require “complicated” heuristics. It is
> not that complicated, it “just” requires some statistical regression –
> though it is not straightforward either. :-)
>
> BTW, why not directly substitute package foo from machine B?
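
The phase timings in the log quoted above could be extracted with something like the following. This is a minimal Python sketch, assuming the log lines keep the form shown in that excerpt; `phase_timings` is a hypothetical helper and is not part of Guix.

```python
import re

# Matches lines such as: phase `build' succeeded after 354.3 seconds
PHASE_RE = re.compile(r"phase `([^']+)' succeeded after ([0-9.]+) seconds")


def phase_timings(log_text):
    """Return {phase name: seconds} for every completed phase in a build log."""
    return {name: float(secs) for name, secs in PHASE_RE.findall(log_text)}


# Three lines taken from the julia log quoted above.
sample = """\
phase `build' succeeded after 354.3 seconds
phase `check' succeeded after 7428.8 seconds
phase `install' succeeded after 16.0 seconds
"""

timings = phase_timings(sample)
total = sum(timings.values())
for name, secs in timings.items():
    # Share of the sampled phases spent in each phase.
    print(f"{name}: {secs:.1f} s ({secs / total:.0%})")
```

A real log under /var/log/guix/drvs is bzip2-compressed, so its text would first be read with `bz2.open(path, "rt")`. For the julia log above, the check phase accounts for roughly 95% of the total time, which is why scaling by thread count alone changes the overall estimate so little.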
>
> > The math might need to be adjusted, but it seems to be sufficiently
> > accurate through my testing, fwiw I see that
> > `(max cpu frequency * cpu threads)` is an important component of the
> > equation using the analogy of a (possibly silly) "pokemon battle"
> > assuming interpreting a mathematical equation to define the Health
> > Points of the package and damage per second of the CPU then simply
> > subtracting these two values to determine how long it will take to
> > build alike package has 500 HP
> > -> Needs a CPU that deals 100 HP to complete in 5 sec or CPU that
> > deals 50 HP to finish in 10 sec.
>
> I will be happy if I am wrong. I guess this back-of-the-envelope would be
> not accurate enough; for two reasons. As I said elsewhere, to your
> example value of 100 seconds is attached a strong variability, depending
> one on how the package itself scales at build time and more than often
> this scaling is not linear versus the number of threads – from my
> experience; and two on the stressed context where the build happens.
>
> > About accuracy: I highly doubt that we need to worry about the system
> > noise as that will be mitigated after enough systems submit average
> > build time with their max CPU frequency and threads used.. we
> > shouldn't really bother past that at the current stage and we can
> > always add additional metadata if needed.
>
> An average is not meaningful by itself. It provides a first-order
> approximation and generally it is not sufficient; the second-order is
> also required. Especially when drawing a model for prediction. From
> what I remember about stats, and assuming the distribution is Gaussian,
> the mean and standard error are required to capture that information.
> My guess is because standard error, the mean would not provide useful
> prediction shareable across heterogeneous machines.
>
> I will be happy to be wrong and only numbers can answer to this
> question. If you are interested in building a model or verifying your
> assumptions, I am sure it is possible to dump the current Cuirass
> postgres database and then do some analytics. It would be a starting
> point to evaluate if the effort implied by your proposal is worth it.
>
> I am not convinced such model would be doable for practical use across
> heterogeneous machines, but it would help for monitoring CI.
>
> Cheers,
> simon

-----------------------f5c1f30e0dbf49b267b8cb2d4ce12875--
--------eafafa0950e0480065f6bdaa978feb32d2947329349455adcd7343ac2612b60e
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: ProtonMail

wnUEARYKAAYFAmGc/SQAIQkQrdN9FKtC/KkWIQQWd9uCk4yoZXOBIwet030U
q0L8qYuOAQCCR4hpPuNPARBdCYe9JOBMfWSk/DO+Us8lpfIvyCOE/QD/cx3v
Y/HBrX9IWf6J4TE3yjr9yvTdXSSA4/It1dgVywQ=
=E6lj
-----END PGP SIGNATURE-----

--------eafafa0950e0480065f6bdaa978feb32d2947329349455adcd7343ac2612b60e--