From: zimoun
To: Jacob Hrbek
Cc: "guix-devel@gnu.org"
Subject: Re: Proposal: Build timers
References: <868rxfwuib.fsf@gmail.com>
Date: Tue, 23 Nov 2021 12:56:35 +0100
Message-ID: <865ysjw0ek.fsf@gmail.com>

Hi,

On Tue, 23 Nov 2021 at 06:21, Jacob Hrbek wrote:

> 1. locally: Storing the value somewhere on the system and adding up to
> it each build to provide more accurate average.

Timing is already stored, see “guix build --log-file”. Have a look at
‘/var/log/guix/drvs’. For instance,

--8<---------------cut here---------------start------------->8---
$ bzcat /var/log/guix/drvs/aq/abymi9yk7pv89614dcdfll3hh4i5mc-julia-1.5.3.drv.bz2 | grep phase | grep seconds
phase `set-SOURCE-DATE-EPOCH' succeeded after 0.0 seconds
phase `set-paths' succeeded after 0.0 seconds
phase `install-locale' succeeded after 0.0 seconds
phase `unpack' succeeded after 1.1 seconds
phase `use-system-libwhich' succeeded after 0.0 seconds
phase `disable-documentation' succeeded after 0.0 seconds
phase `prepare-deps' succeeded after 0.0 seconds
phase `bootstrap' succeeded after 0.0 seconds
phase `patch-usr-bin-file' succeeded after 0.0 seconds
phase `patch-source-shebangs' succeeded after 0.2 seconds
phase `patch-generated-file-shebangs' succeeded after 0.0 seconds
phase `fix-include-and-link-paths' succeeded after 0.0 seconds
phase `replace-default-shell' succeeded after 0.0 seconds
phase `fix-precompile' succeeded after 0.0 seconds
phase `build' succeeded after 354.3 seconds
phase `set-home' succeeded after 0.0 seconds
phase `disable-broken-tests' succeeded after 0.0 seconds
phase `check' succeeded after 7428.8 seconds
phase `install' succeeded after 16.0 seconds
phase `make-wrapper' succeeded after 0.0 seconds
phase `patch-shebangs' succeeded after 0.0 seconds
phase `strip' succeeded after 0.0 seconds
phase `validate-runpath' succeeded after 0.0 seconds
phase `validate-documentation-location' succeeded after 0.0 seconds
phase `delete-info-dir-file' succeeded after 0.0 seconds
phase `patch-dot-desktop-files' succeeded after 0.0 seconds
phase `install-license-files' succeeded after 0.0 seconds
phase `reset-gzip-timestamps' succeeded after 0.0 seconds
phase `compress-documentation' succeeded after 0.0 seconds
--8<---------------cut here---------------end--------------->8---

Therefore, you need to extract that information somehow.

> **optionally** This local database can be shared across multiple
> systems that add values to it like simple listener waiting for POST
> requests.

It would be better to use a content-addressed distribution such as IPFS
or GNUnet, IMHO.

> - within the guix repo: Since we are already building the package we
> can take the time and then do the provided math in reverse to
> calculate the time:
>
> Build took 100 sec on system with 8 threads at 2.4 Ghz max cpu frequency:
>
> 100 * (2.4 * 8) = 1920 (build time value per one thread at 1 Ghz)
>
> Building the package on system with 2 threads at 2.4 Ghz max cpu frequency:
>
> 1920 / (2 * 2.4) = 400
>
> We can then assume that the build will take 1920/400=4.8 -> 4.8
> times longer on this system.

Are you assuming here that the two machines are the same? Or are they
different? This approximation would not even be accurate enough for the
same machine. For instance, the test suite of the julia package runs
mainly sequentially, using one thread.
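
As a rough sketch of how the per-phase timings could be extracted from
such a log: the Python below is only an illustration, the helper names
`phase_timings` and `read_log` are made up, and it assumes the log keeps
the exact "phase `NAME' succeeded after N seconds" wording shown above
(phases that fail would not be matched).

```python
import bz2
import re

# Matches lines such as: phase `build' succeeded after 354.3 seconds
# (assumption: the wording of the log lines quoted above does not change).
PHASE_RE = re.compile(r"phase `([^']+)' succeeded after ([0-9.]+) seconds")


def phase_timings(log_text):
    """Return {phase name: seconds} for every successful phase in a log."""
    return {name: float(secs) for name, secs in PHASE_RE.findall(log_text)}


def read_log(path):
    """Read a bzip2-compressed build log, e.g. one under /var/log/guix/drvs."""
    with bz2.open(path, "rt", errors="replace") as handle:
        return handle.read()


# A few lines copied from the julia-1.5.3 log quoted above.
sample = """\
phase `unpack' succeeded after 1.1 seconds
phase `build' succeeded after 354.3 seconds
phase `check' succeeded after 7428.8 seconds
phase `install' succeeded after 16.0 seconds
"""

timings = phase_timings(sample)
total = sum(timings.values())

# Print the phases from slowest to fastest with their share of the total.
for name, secs in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {secs:8.1f} s  {100 * secs / total:5.1f} %")
```

On this sample the `check` phase accounts for roughly 95% of the total,
so a per-package number hides where the time is actually spent.
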
If you go back to the numbers above, build=354.3 seconds and
check=7428.8 seconds, so the number of threads only affects the timing
of the build phase, which will not change the overall timing much.

Somehow, IIUC your proposal, you would like, based on timings from
machine A about a set of packages, and timings from machine B about the
same set of packages, knowing the timing of machine B for package foo,
to extrapolate the timing for machine A of package foo. The maths for
that are not linear, IMHO, and require “complicated” heuristics. It is
not that complicated, it “just” requires some statistical regression,
though it is not straightforward either. :-)

BTW, why not directly substitute package foo from machine B?

> The math might need to be adjusted, but it seems to be sufficiently
> accurate through my testing, fwiw I see that `(max cpu frequency * cpu
> threads)` is an important component of the equasion using the analogy
> of a (possibly silly) "pokemon battle" assuming interpreting a
> mathematical equasion to define the Health Points of the package and
> damage per second of the CPU then simply substracting these two values
> to determine how long it will take to build alike package has 500 HP
> -> Needs a CPU that deals 100 HP to complete in 5 sec or CPU that
> deals 50 HP to finish in 10 sec.

I will be happy if I am wrong. I guess this back-of-the-envelope
estimate would not be accurate enough, for two reasons. As I said
elsewhere, your example value of 100 seconds comes with a strong
variability: one, it depends on how the package itself scales at build
time, and more often than not this scaling is not linear in the number
of threads, in my experience; and two, it depends on how stressed the
machine is when the build happens.

> About accuracy: I highly doubt that we need to worry about the system
> noise as that will be mitigated after enough systems submit average
> build time with their max CPU frequency and threads used.. we
> shouldn't really bother past that at the current stage and we can
> always add additional metadata if needed.

An average is not meaningful by itself. It provides a first-order
approximation and generally that is not sufficient; the second order is
also required, especially when drawing a model for prediction. From
what I remember about stats, and assuming the distribution is Gaussian,
the mean and the standard deviation are required to capture that
information. My guess is that, because of the standard deviation, the
mean alone would not provide a useful prediction shareable across
heterogeneous machines.

I will be happy to be wrong, and only numbers can answer this question.
If you are interested in building a model or verifying your
assumptions, I am sure it is possible to dump the current Cuirass
PostgreSQL database and then do some analytics. It would be a starting
point to evaluate whether the effort implied by your proposal is worth
it. I am not convinced such a model would be practical across
heterogeneous machines, but it would help for monitoring CI.

Cheers,
simon