Re: Proposal: Build timers


From: zimoun <zimon.toutoune@gmail.com>
To: Jacob Hrbek <kreyren@rixotstudio.cz>,
	Julien Lepiller <julien@lepiller.eu>
Cc: "guix-devel@gnu.org" <guix-devel@gnu.org>
Subject: Re: Proposal: Build timers
Date: Wed, 24 Nov 2021 12:35:33 +0100
Message-ID: <86r1b5u6pm.fsf@gmail.com>
In-Reply-To: <Sh5w-vUJPr0GSLPB4XuG2Emkj-1_CJnrqCuvSX-KibQJSLn5KGYV1BuHXojUsIcosHpsFtLpuxppZcBt7_zwL_Mar-MkaSvlNe75CZSKSXA=@rixotstudio.cz>

Hi,

On Tue, 23 Nov 2021 at 14:39, Jacob Hrbek <kreyren@rixotstudio.cz> wrote:

>> This approximation would not even be accurate enough for the same
>> machine.  For instance, the test suite of the julia package runs
>> mainly sequential using one thread... 
>
> I am aware of this scenario and I adapted the equation for it, but I
> recognize that this exponentially increases the inaccuracy with more
> threads and I don't believe that there is a mathematical way with the
> provided values to handle that scenario so we would have to adjust the
> calculation for those packages.

What I am trying to explain is that the model cannot predict build
times with what I consider a meaningful accuracy.  Obviously, if we
relax the precision, it is easy to infer a rule of thumb; a simple
cross-multiplication does the job. ;-)

The “pokémon-battle” model is a simple linear model
(cross-multiplication); using Jacob’s “notation”:

 - HP: time to build on machine A
 - DPS = nthread * cpufreq : “power” of machine

Then one is expected to estimate ’a’ and ’b’, on average, such that:

  HP = a * DPS + b

based on some experiments.  Last, knowing both nthread' and cpufreq'
for machine B, you expect to estimate HP' for that machine by applying
the formula:

  HP' = a * nthread' * cpufreq' + b

Jacob, do I correctly understand the model?
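
To make the question concrete, here is a minimal sketch of that linear
model as I understand it.  The measurements are invented for
illustration (they are not real build times), and the helper names
fit_linear and predict_hp are hypothetical:

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of ys = a * xs + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict_hp(a, b, nthread, cpufreq):
    """HP' = a * nthread' * cpufreq' + b"""
    return a * nthread * cpufreq + b


# Invented experiments for one package:
# (nthread, cpufreq in GHz, HP = build time in seconds).
experiments = [
    (2, 2.0, 960.0),
    (4, 2.5, 900.0),
    (8, 3.0, 760.0),
    (16, 3.5, 440.0),
]

# DPS = nthread * cpufreq for each experiment.
dps = [nthread * cpufreq for nthread, cpufreq, _ in experiments]
hp = [seconds for _, _, seconds in experiments]

a, b = fit_linear(dps, hp)

# Machine B, assumed to have nthread' = 6 and cpufreq' = 3.0 GHz.
hp_b = predict_hp(a, b, 6, 3.0)
print(f"a = {a:.2f}, b = {b:.2f}, predicted HP' = {hp_b:.1f} s")
```

Note that ’a’ comes out negative here (more “power”, less time), so
the model can only be trusted inside the range of DPS values that were
actually measured.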

In any case, that’s what LFS is doing, except that HP is named SBU
and, instead of DPS, they use a reference package.  This normalization
is better, IMHO.  In other words, for one specific package taken as
reference, they measure HP1 (resp. HP2) on machine A (resp. B); then,
knowing HP for another package on machine A, they deduce for machine B,

  HP' = HP2/HP1 * HP
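
As a minimal sketch of that cross-multiplication (the timings are
invented, and predict_from_reference is a hypothetical helper name,
not something from LFS or Guix):

```python
def predict_from_reference(hp1, hp2, hp):
    """HP' = HP2/HP1 * HP

    hp1: time of the reference package on machine A
    hp2: time of the reference package on machine B
    hp:  time of another package on machine A
    """
    return hp2 / hp1 * hp


# Invented timings, in seconds.
hp1 = 120.0   # reference package on machine A
hp2 = 180.0   # reference package on machine B
hp = 600.0    # another package on machine A

sbu = hp / hp1                                  # cost in "reference units"
hp_prime = predict_from_reference(hp1, hp2, hp)
print(f"{sbu:.1f} SBU, predicted {hp_prime:.1f} s on machine B")
```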

All this is trivial. :-) The key is the accuracy, i.e., the error
between the prediction HP' and the real time.  Here, the issue is that
HP1 and HP2 capture the overall time for one specific package, and that
time depends on hidden parameters such as nthread, cpufreq, IO, and
other hardware parameters.  It is a strong assumption to consider that
these hidden parameters (evaluated for one specific package) are the
same for any other package.

It is a strong assumption because the hidden parameters depend on
hardware specifications (nthread, cpufreq, etc.) *and* on how the
package itself exploits them.

Therefore, the difference between the prediction and the real time is
highly variable, and thus personally I am not convinced the effort is
worth it, for local builds.  That’s another story. ;-)

LFS is well aware of the issue and it is documented [1,2].

The root of the issue is that the model is based on a strong
assumption; both (model and assumption) do not fit how things
concretely work in reality, IMHO.

One straightforward way (requiring some work though) to improve the
accuracy is to use statistical regressions.  We cannot really do better
at capturing the hardware specification, noting that the machine load
(what the machine is currently doing when the build happens) introduces
a variability that is hard to estimate beforehand.  However, it is
possible to do better when dealing with packages.  In other words,
exploit the data from the build farms.
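
One possible shape for such a regression, only as a sketch: for a
given package, take the build-farm timings of that package together
with the timing of the reference package on the same machines, and fit
a per-package line instead of a single ratio.  The choice of an
ordinary least-squares line is my assumption here, the records are
invented, and no real build-farm API is used:

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented build-farm records for ONE package, in seconds:
# (reference package time, this package's time) on the same machine.
records = [
    (100.0, 250.0),
    (150.0, 350.0),
    (200.0, 450.0),
    (300.0, 650.0),
]

ref_times = [ref for ref, _ in records]
pkg_times = [pkg for _, pkg in records]
slope, intercept = least_squares(ref_times, pkg_times)

# Local machine: only the reference package has been timed (assumed 120 s).
local_ref_time = 120.0
predicted = slope * local_ref_time + intercept

# The residuals on the farm data give a first idea of the error to expect.
residuals = [pkg - (slope * ref + intercept) for ref, pkg in records]
print(f"predicted {predicted:.1f} s, "
      f"max residual {max(abs(r) for r in residuals):.1f} s")
```

The single-ratio SBU prediction is the special case where the line is
forced through the origin and fitted on one machine only.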

Well, I stop here because it rings a bell: a model can be discussed at
length if it is never applied to concrete numbers. :-)

Let’s keep it pragmatic! :-)

Using the simple LFS model and SBU, what would be the typical error?

For instance, I propose that we collectively send the timings of
packages: bash, gmsh, julia, emacs, vim; or any other 5 packages for
x86_64 architecture.  Then we can compare typical errors between
predicted and real timings, i.e., evaluate the “accuracy” of SBU, and
then decide if it is acceptable or not. :-)
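
Once the timings are collected, the comparison could be as small as
the sketch below.  All the numbers are placeholders that I made up
(nobody has sent timings yet), bash is arbitrarily taken as the
reference package, and relative_errors is a hypothetical helper:

```python
def relative_errors(timings_a, timings_b, reference):
    """Relative error |HP' - real| / real of the SBU prediction on B.

    timings_a, timings_b: dict mapping package name -> seconds.
    reference: name of the package used as reference.
    """
    ratio = timings_b[reference] / timings_a[reference]   # HP2/HP1
    errors = {}
    for package, hp in timings_a.items():
        if package == reference:
            continue
        predicted = ratio * hp                            # HP'
        real = timings_b[package]
        errors[package] = abs(predicted - real) / real
    return errors


# Placeholder timings in seconds; NOT real measurements.
machine_a = {"bash": 100.0, "gmsh": 400.0, "julia": 2000.0,
             "emacs": 600.0, "vim": 200.0}
machine_b = {"bash": 50.0, "gmsh": 220.0, "julia": 1600.0,
             "emacs": 330.0, "vim": 100.0}

errors = relative_errors(machine_a, machine_b, reference="bash")
for package, error in sorted(errors.items()):
    print(f"{package:6s} {100 * error:5.1f} %")
```

With real numbers in place of the placeholders, the spread of these
percentages is what would tell us whether SBU is good enough.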

Cheers,
simon

1: <https://www.linuxfromscratch.org/lfs/view/stable/chapter04/aboutsbus.html>
2: <https://www.linuxfromscratch.org/~bdubbs/about.html>
