As discussed earlier today on IRC with Clément, we could add performance
monitoring capabilities to Cuirass. Interesting metrics would be:

  • time of push to time of evaluation completion;

  • time of evaluation completion to time of build completion.

We could visualize that per job over time. Perhaps these are also stats
that ‘guix weather’ could display.

Ludo’.
Hello,
> As discussed earlier today on IRC with Clément, we could add performance
> monitoring capabilities to Cuirass. Interesting metrics would be:
>
> • time of push to time of evaluation completion;
>
> • time of evaluation completion to time of build completion.
Small update on that one. With Cuirass commit
154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
timestamps:
* Checkout commit time.
* Evaluation creation.
* Evaluation checkouts completion.
* Evaluation completion.
For the first timestamp, I'm using Guile-Git to extract the commit time,
which is not the commit push time. In fact, I think there is no such
thing as "commit push time" in git.
We can still compute the metric 'time of commit to time of evaluation
completion', but it's less relevant than the proposed 'time of push to
time of evaluation completion'.
The other proposed metric, 'time of evaluation completion to time of
build completion' can now be computed.
Regarding the actual computation and reporting of those metrics, I'm
still considering different options. I'd like to have a look at
Guile-prometheus, which is written by Christopher.
Thanks,
Mathieu
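
Editorial illustration, not part of the thread: a minimal Python sketch of how the two metrics discussed above could be derived from the four recorded timestamps. The class and field names are hypothetical and do not reflect the actual Cuirass database schema; the build completion time is assumed to be available from elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class EvaluationTimes:
    """The four timestamps listed in the message (field names are hypothetical)."""
    checkout_commit_time: datetime
    evaluation_created: datetime
    checkouts_completed: datetime
    evaluation_completed: datetime


def commit_to_evaluation_seconds(times: EvaluationTimes) -> float:
    """'Time of commit to time of evaluation completion', in seconds."""
    return (times.evaluation_completed - times.checkout_commit_time).total_seconds()


def evaluation_to_build_seconds(times: EvaluationTimes, build_completed: datetime) -> float:
    """'Time of evaluation completion to time of build completion', in seconds."""
    return (build_completed - times.evaluation_completed).total_seconds()


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    times = EvaluationTimes(
        checkout_commit_time=datetime(2020, 9, 1, 12, 0, tzinfo=timezone.utc),
        evaluation_created=datetime(2020, 9, 1, 12, 2, tzinfo=timezone.utc),
        checkouts_completed=datetime(2020, 9, 1, 12, 3, tzinfo=timezone.utc),
        evaluation_completed=datetime(2020, 9, 1, 12, 10, tzinfo=timezone.utc),
    )
    build_done = datetime(2020, 9, 1, 13, 10, tzinfo=timezone.utc)
    print(commit_to_evaluation_seconds(times))
    print(evaluation_to_build_seconds(times, build_done))
```

Since the first timestamp is the commit time rather than the push time, the first duration is only an upper bound on the delay the original proposal aimed to measure.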
Mathieu Othacehe <othacehe@gnu.org> writes:

> Hello,
>
>> As discussed earlier today on IRC with Clément, we could add performance
>> monitoring capabilities to Cuirass. Interesting metrics would be:
>>
>> • time of push to time of evaluation completion;
>>
>> • time of evaluation completion to time of build completion.
>
> Small update on that one. With Cuirass commit
> 154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
> timestamps:
>
> * Checkout commit time.
> * Evaluation creation.
> * Evaluation checkouts completion.
> * Evaluation completion.
>
> For the first timestamp, I'm using Guile-Git to extract the commit time,
> which is not the commit push time. In fact, I think there is no such
> thing as "commit push time" in git.

I had this issue with the Guix Data Service as well; it uses the
timestamp in the email sent by the Savannah git hook, which is the
closest I've got to "commit push time".

> We can still compute the metric 'time of commit to time of evaluation
> completion', but it's less relevant than the proposed 'time of push to
> time of evaluation completion'.

As someone can commit, then potentially push those commits hours later,
assuming no one else has pushed, this data might be a bit noisy. The time
between Cuirass noticing the new commit and the evaluation completion
might be cleaner.
Hi,

Christopher Baines <mail@cbaines.net> skribis:

> Mathieu Othacehe <othacehe@gnu.org> writes:
>
>> Hello,
>>
>>> As discussed earlier today on IRC with Clément, we could add performance
>>> monitoring capabilities to Cuirass. Interesting metrics would be:
>>>
>>> • time of push to time of evaluation completion;
>>>
>>> • time of evaluation completion to time of build completion.
>>
>> Small update on that one. With Cuirass commit
>> 154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
>> timestamps:
>>
>> * Checkout commit time.
>> * Evaluation creation.
>> * Evaluation checkouts completion.
>> * Evaluation completion.
>>
>> For the first timestamp, I'm using Guile-Git to extract the commit time,
>> which is not the commit push time. In fact, I think there is no such
>> thing as "commit push time" in git.
>
> I had this issue with the Guix Data Service as well, it uses the
> timestamp in the email sent by the Savannah git hook, which is the
> closest I've got to "commit push time".

Neat.

>> We can still compute the metric 'time of commit to time of evaluation
>> completion', but it's less relevant than the proposed 'time of push to
>> time of evaluation completion'.
>
> As someone can commit, then potentially push those commits hours later,
> assuming no one else has pushed, this data might be a bit noisy. Time
> between Cuirass noticing the new commit to the evaluation completion
> might be cleaner.

Agreed. We regularly push commits that are weeks or months old
(sometimes years), so there might be too many outliers when looking at
the commit time.

Thanks for pushing this, Mathieu!

Ludo’.
Hello,
> Agreed. We regularly push commits that are weeks or months old
> (sometimes years), so there might be too many outliers when looking at
> the commit time.
Yes, so I used checkout time instead of commit time with
af12a80599346968fb9f52edb33b48dd26852788.
I also turned the Evaluation 'in_progress' field into a 'status' field.
This way it's much easier to create metrics on evaluations. It also
makes it possible to distinguish between 'aborted' and 'failed' evaluations.
Thanks,
Mathieu
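
Editorial illustration, not part of the thread: a short Python sketch of why a 'status' field makes evaluation metrics easier to compute than a boolean 'in_progress' flag. Only 'aborted' and 'failed' are named in the message; the 'succeeded' value and the function name are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable


def status_percentages(statuses: Iterable[str]) -> Dict[str, float]:
    """Return the share of evaluations in each status, as a percentage."""
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        # No evaluations recorded yet: nothing to report.
        return {}
    return {status: 100.0 * n / total for status, n in counts.items()}


if __name__ == "__main__":
    # 'succeeded' is a hypothetical status name; 'failed' and 'aborted'
    # are the two values mentioned in the message.
    sample = ["succeeded", "failed", "aborted", "succeeded"]
    print(status_percentages(sample))
```

With a single boolean flag, 'aborted' and 'failed' evaluations would fall into the same "not in progress" bucket and could not be counted separately.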
Hello,

I just pushed support for computing and displaying metrics in Cuirass. I
started with two metrics:

* Builds per day
* Average evaluation speed per specification.

Those metrics can now be seen at:

https://ci.guix.gnu.org/metrics

and are updated every hour.

I plan to add more metrics such as:

- Evaluation completion percentage.
- Evaluation completion speed.
- Failed evaluations percentage.
- Pending builds per day.

Don't hesitate to comment or propose other metrics.

Thanks,

Mathieu
Hi Mathieu,

Really cool!

On Mon, 14 Sep 2020 at 15:35, Mathieu Othacehe <othacehe@gnu.org> wrote:

> * Builds per day
> * Average evaluation speed per specification.

Something interesting could be: min and max (of 100 evaluations).
The standard deviation too, but I am not sure it is easy to
interpret at a quick look. Instead, the median could be
interesting.

For example, consider these 2 evaluations:

https://ci.guix.gnu.org/build/2094496/details
https://ci.guix.gnu.org/build/3035986/details

Well, if there are, say, 99 evaluations of the first "kind" and 1 of the
second kind, the average is:

  (99*849 + 1_595_796_252) / 100 = 15_958_803.03

which does not really represent the effective workload.

Well, I will try to have a look if I can schedule a moment. :-)

> Those metrics can now be seen at:
>
> https://ci.guix.gnu.org/metrics

Nice plot!

> I plan to add more metrics such as:
>
> - Evaluation completion percentage.
> - Evaluation completion speed.
> - Failed evaluations percentage.
> - Pending builds per day.

Cool!
Maybe time between commit time (not author time) and start of the build.

Cheers,
simon
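
Editorial illustration, not part of the thread: the arithmetic above can be checked with Python's standard `statistics` module. The two values (849 and 1_595_796_252) are the ones quoted in the message; the variable names are chosen for the example.

```python
from statistics import mean, median

# 99 evaluations of the first "kind" and 1 of the second, as in the message.
durations = [849] * 99 + [1_595_796_252]

average = mean(durations)   # dominated by the single outlier
middle = median(durations)  # unaffected by the single outlier

print(f"mean:   {average:,.2f}")
print(f"median: {middle:,.2f}")
```

The mean reproduces the 15_958_803.03 figure from the message, while the median stays at 849, which is why the median is the more representative summary here.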
Hi!

Mathieu Othacehe <othacehe@gnu.org> skribis:

> I just pushed support for computing and displaying metrics in Cuirass. I
> started with two metrics:
>
> * Builds per day
> * Average evaluation speed per specification.
>
> Those metrics can now be seen at:
>
> https://ci.guix.gnu.org/metrics
>
> and are updated every hour.

This is very cool, thumbs up!

> I plan to add more metrics such as:
>
> - Evaluation completion percentage.
> - Evaluation completion speed.
> - Failed evaluations percentage.
> - Pending builds per day.

That’d be awesome.

As discussed on IRC, builds per day should be compared to new
derivations per day. For example, if on a day there’s 100 new
derivations and we only manage to build 10 of them, we have a problem.

BTW, in cuirass.log I noticed this:

--8<---------------cut here---------------start------------->8---
2020-09-14T21:16:21 Updating metric average-eval-duration-per-spec (guix-modular-master) to 414.8085106382979.
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
--8<---------------cut here---------------end--------------->8---

Perhaps it can’t compute an average yet for these jobsets?

Thanks!

Ludo’.
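
Editorial illustration, not part of the thread: a small Python sketch that groups the "Failed to compute metric" lines of the log excerpt by specification, which makes it easier to see which jobsets are affected. The regular expression is inferred from the excerpt alone and is an assumption, not a documented Cuirass log format; the function name is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Pattern inferred from the excerpt: "<timestamp> Failed to compute metric <name> (<spec>)."
FAILED_LINE = re.compile(r"^(\S+) Failed to compute metric (\S+) \((.+)\)\.$")


def failed_metrics_by_spec(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each specification name to the metrics that failed for it."""
    failures: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        match = FAILED_LINE.match(line.strip())
        if match:
            _timestamp, metric, spec = match.groups()
            failures[spec].append(metric)
    return dict(failures)


if __name__ == "__main__":
    excerpt = [
        "2020-09-14T21:16:21 Updating metric average-eval-duration-per-spec (guix-modular-master) to 414.8085106382979.",
        "2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (kernel-updates).",
        "2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (kernel-updates).",
        "2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).",
    ]
    for spec, metrics in failed_metrics_by_spec(excerpt).items():
        print(spec, metrics)
```

Lines that report a successful update, such as the first one in the excerpt, do not match the pattern and are skipped.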
Hi all.

zimoun <zimon.toutoune@gmail.com> writes:

> Hi Mathieu,
>
> Really cool!
>
> On Mon, 14 Sep 2020 at 15:35, Mathieu Othacehe <othacehe@gnu.org> wrote:
>
>> * Builds per day
>> * Average evaluation speed per specification.
>
> Something interesting could be: min and max (of 100 evaluations).
> The standard error deviation too but I am not sure it is easy to
> interpret with a quick look. Instead, the median could be
> interesting.
>
> For example, consider these 2 evaluations:
>
> https://ci.guix.gnu.org/build/2094496/details
> https://ci.guix.gnu.org/build/3035986/details
>

I'm getting a 504 Gateway Time-out error when visiting the above
links (at the time of sending this email).

> Well, if there is say 99 evaluations of first "kind" and 1 of second
> kind, the average is:
> (99*849 + 1_595_796_252) / 100 = 15_958_803.03
> which does not really represent the effective workload.
>
> Well, I will try to give a look if I can schedule a moment. :-)
>
>
>> Those metrics can now be seen at:
>>
>> https://ci.guix.gnu.org/metrics
>
> Nice plot!
>
>
>> I plan to add more metrics such as:
>>
>> - Evaluation completion percentage.
>> - Evaluation completion speed.
>> - Failed evaluations percentage.
>> - Pending builds per day.
>
> Cool!
> Maybe time between commit time (not author time) and start of the build.
>
>
> Cheers,
> simon

-- 
Bonface M. K. (https://www.bonfacemunyoki.com)
Chief Emacs Mchochezi
GPG key = D4F09EB110177E03C28E2FE1F5BBAE1E0392253F
On Mon, Sep 14, 2020 at 03:34:17PM +0200, Mathieu Othacehe wrote:
> I just pushed support for computing and displaying metrics in Cuirass. I
> started with two metrics:
> * Builds per day
> * Average evaluation speed per specification.
> Those metrics can now be seen at:
> https://ci.guix.gnu.org/metrics
Congratulations, that looks like a very useful start already!
(And the number of builds has doubled since yesterday, so someone already
put it to good use!)
How about also adding metrics per build machine? I have the impression,
for instance, that the aarch64 machine in my living room is not used.
If this is confirmed, we could take appropriate action (uncomment it in
/etc/machines.scm :-), compare to other used machines, change the scheduling
in the daemon, or even turn it off to conserve energy should it turn out
that we have too much build power...).
Andreas
Hello Andreas,

> Congratulations, that looks like a very useful start already!
> (And the number of builds has doubled since yesterday, so someone already
> put it to good use!)

Thanks for your feedback :)

> How about also adding metrics per build machine? I have the impression,
> for instance, that the aarch64 machine in my living room is not used.
> If this is confirmed, we could take appropriate action (uncomment it in
> /etc/machines.scm :-), compare to other used machines, change the scheduling
> in the daemon, or even turn it off to conserve energy should it turn out
> that we have too much build power...).

Yes, I would really like to have something like
https://hydra.nixos.org/machines, with a build rate for every machine.

However, it cannot be done without structural changes to how offloading
is handled. For now it's working this way:

  Cuirass -> guix-daemon -> guix offload -> build machines

Which means that Cuirass has almost no information about offloaded
builds.

We are currently starting discussions about inviting the Guix Build
Coordinator to the party. That could maybe help us implement what you
are proposing, among other things.

Thanks,

Mathieu
Hey Ludo,

> As discussed on IRC, builds per day should be compared to new
> derivations per day. For example, if on a day there’s 100 new
> derivations and we only manage to build 10 of them, we have a problem.

I added this line, and they sadly do not overlap :(

> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
> 2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
> 2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
>
> Perhaps it can’t compute an average yet for these jobsets?

Yes, as soon as those evaluations are repaired, we should be able to
compute those metrics. I chose to keep the error messages as a
reminder.

I added various other metrics and updated the "/metrics" page. Once we
have a better view, we should think of adding thresholds on those
metrics.

Closing this one!

Thanks,

Mathieu
-- 
https://othacehe.org
Hi,

Mathieu Othacehe <othacehe@gnu.org> skribis:

>> As discussed on IRC, builds per day should be compared to new
>> derivations per day. For example, if on a day there’s 100 new
>> derivations and we only manage to build 10 of them, we have a problem.
>
> I added this line, and they sadly do not overlap :(

It seems less bad than I thought though, and the rendering is pretty.
:-)

>> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
>> 2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
>> 2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
>> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
>>
>> Perhaps it can’t compute an average yet for these jobsets?
>
> Yes as soon as those evaluations will be repaired, we should be able to
> compute those metrics. I chose to keep the error messages as a
> reminder.

Makes sense.

> I added various other metrics and updated the "/metrics" page. Once we
> have a better view, we should think of adding thresholds on those
> metrics.

Excellent. Thanks a lot for closing this gap!

Ludo’.
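
Editorial illustration, not part of the thread: a Python sketch of the comparison between builds per day and new derivations per day, together with a threshold check of the kind mentioned above. The function names and the 0.8 threshold are hypothetical; the thread does not specify any threshold value.

```python
from typing import Optional


def build_coverage(new_derivations: int, builds: int) -> Optional[float]:
    """Ratio of builds to new derivations for one day, or None if there were none."""
    if new_derivations == 0:
        return None
    return builds / new_derivations


def is_falling_behind(new_derivations: int, builds: int, threshold: float = 0.8) -> bool:
    """True when builds cover less than `threshold` of the new derivations.

    The default threshold is an arbitrary value chosen for this example.
    """
    coverage = build_coverage(new_derivations, builds)
    return coverage is not None and coverage < threshold


if __name__ == "__main__":
    # The example from the thread: 100 new derivations, only 10 built.
    print(build_coverage(100, 10))
    print(is_falling_behind(100, 10))
```

In the thread's example the ratio is 0.1, so any reasonable threshold would flag that day as a problem.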
Hi Mathieu!

Mathieu Othacehe <othacehe@gnu.org> skribis:

>> How about also adding metrics per build machine? I have the impression,
>> for instance, that the aarch64 machine in my living room is not used.
>> If this is confirmed, we could take appropriate action (uncomment it in
>> /etc/machines.scm :-), compare to other used machines, change the scheduling
>> in the daemon, or even turn it off to conserve energy should it turn out
>> that we have too much build power...).
>
> Yes I would really like to have something like:
> https://hydra.nixos.org/machines, with a build rate for every machine.

+1!

> However, it cannot be done without structural changes to how offloading
> is handled. For now it's working this way:
>
> Cuirass -> guix-daemon -> guix offload -> build machines
>
> Which means that Cuirass has almost no information about offloaded
> builds.

In practice, it could parse the offload events that it gets; a bit of a
hack, but good enough. However…

> We are currently starting discussions about inviting the Guix Build
> Coordinator to the party.

… this sounds like the better option longer-term.

Ludo’.