* bug#32548: Cuirass: Performance monitoring
From: Ludovic Courtès @ 2018-08-27 22:33 UTC
To: 32548

As discussed earlier today on IRC with Clément, we could add performance
monitoring capabilities to Cuirass.  Interesting metrics would be:

  • time of push to time of evaluation completion;

  • time of evaluation completion to time of build completion.

We could visualize that per job over time.

Perhaps these are also stats that ‘guix weather’ could display.

Ludo’.
* bug#32548: Cuirass: Performance monitoring
From: Mathieu Othacehe @ 2020-09-06 14:42 UTC
To: Ludovic Courtès; +Cc: 32548

Hello,

> As discussed earlier today on IRC with Clément, we could add performance
> monitoring capabilities to Cuirass.  Interesting metrics would be:
>
> • time of push to time of evaluation completion;
>
> • time of evaluation completion to time of build completion.

Small update on that one.  With Cuirass commit
154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
timestamps:

* Checkout commit time.
* Evaluation creation.
* Evaluation checkouts completion.
* Evaluation completion.

For the first timestamp, I'm using Guile-Git to extract the commit time,
which is not the commit push time.  In fact, I think there is no such
thing as a "commit push time" in Git.

We can still compute the metric 'time of commit to time of evaluation
completion', but it's less relevant than the proposed 'time of push to
time of evaluation completion'.

The other proposed metric, 'time of evaluation completion to time of
build completion', can now be computed.

Regarding the actual computation and reporting of those metrics, I'm
still considering different options.  I'd like to have a look at
Guile-Prometheus, written by Christopher.

Thanks,

Mathieu
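For concreteness, extracting that first timestamp with Guile-Git might
look roughly like the sketch below.  This is illustrative only, not the
code Cuirass actually uses: the procedure name is made up, and it assumes
that Guile-Git's (git) module exports repository-open, string->oid,
commit-lookup, and commit-time, with commit-time returning seconds since
the Unix epoch.

--8<---------------cut here---------------start------------->8---
;; Hypothetical sketch, not Cuirass code.
(use-modules (git))

(define (checkout-commit-time directory commit-hash)
  "Return the commit time of COMMIT-HASH in the Git checkout at
DIRECTORY, in seconds since the epoch.  Note that this is the time
recorded in the commit object, not the time the commit was pushed."
  (let* ((repository (repository-open directory))
         (commit     (commit-lookup repository
                                    (string->oid commit-hash))))
    (commit-time commit)))
--8<---------------cut here---------------end--------------->8---

As the message above notes, Git itself does not record a push time, so
whatever the extraction code looks like, this timestamp can only ever be
the committer time stored in the commit object.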
* bug#32548: Cuirass: Performance monitoring
From: Christopher Baines @ 2020-09-06 18:51 UTC
To: Mathieu Othacehe; +Cc: 32548

Mathieu Othacehe <othacehe@gnu.org> writes:

> Hello,
>
>> As discussed earlier today on IRC with Clément, we could add performance
>> monitoring capabilities to Cuirass.  Interesting metrics would be:
>>
>> • time of push to time of evaluation completion;
>>
>> • time of evaluation completion to time of build completion.
>
> Small update on that one.  With Cuirass commit
> 154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
> timestamps:
>
> * Checkout commit time.
> * Evaluation creation.
> * Evaluation checkouts completion.
> * Evaluation completion.
>
> For the first timestamp, I'm using Guile-Git to extract the commit time,
> which is not the commit push time.  In fact, I think there is no such
> thing as a "commit push time" in Git.

I had this issue with the Guix Data Service as well; it uses the
timestamp in the email sent by the Savannah git hook, which is the
closest I've got to a "commit push time".

> We can still compute the metric 'time of commit to time of evaluation
> completion', but it's less relevant than the proposed 'time of push to
> time of evaluation completion'.

As someone can commit, then potentially push those commits hours later,
assuming no one else has pushed, this data might be a bit noisy.  Time
between Cuirass noticing the new commit and the evaluation completion
might be cleaner.
* bug#32548: Cuirass: Performance monitoring
From: Ludovic Courtès @ 2020-09-07 8:11 UTC
To: Christopher Baines; +Cc: Mathieu Othacehe, 32548

Hi,

Christopher Baines <mail@cbaines.net> skribis:

> Mathieu Othacehe <othacehe@gnu.org> writes:
>
>> Hello,
>>
>>> As discussed earlier today on IRC with Clément, we could add performance
>>> monitoring capabilities to Cuirass.  Interesting metrics would be:
>>>
>>> • time of push to time of evaluation completion;
>>>
>>> • time of evaluation completion to time of build completion.
>>
>> Small update on that one.  With Cuirass commit
>> 154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
>> timestamps:
>>
>> * Checkout commit time.
>> * Evaluation creation.
>> * Evaluation checkouts completion.
>> * Evaluation completion.
>>
>> For the first timestamp, I'm using Guile-Git to extract the commit time,
>> which is not the commit push time.  In fact, I think there is no such
>> thing as a "commit push time" in Git.
>
> I had this issue with the Guix Data Service as well; it uses the
> timestamp in the email sent by the Savannah git hook, which is the
> closest I've got to a "commit push time".

Neat.

>> We can still compute the metric 'time of commit to time of evaluation
>> completion', but it's less relevant than the proposed 'time of push to
>> time of evaluation completion'.
>
> As someone can commit, then potentially push those commits hours later,
> assuming no one else has pushed, this data might be a bit noisy.  Time
> between Cuirass noticing the new commit and the evaluation completion
> might be cleaner.

Agreed.  We regularly push commits that are weeks or months old
(sometimes years), so there might be too many outliers when looking at
the commit time.

Thanks for pushing this, Mathieu!

Ludo’.
* bug#32548: Cuirass: Performance monitoring
From: Mathieu Othacehe @ 2020-09-10 13:26 UTC
To: Ludovic Courtès; +Cc: 32548

Hello,

> Agreed.  We regularly push commits that are weeks or months old
> (sometimes years), so there might be too many outliers when looking at
> the commit time.

Yes, so I used the checkout time instead of the commit time with
af12a80599346968fb9f52edb33b48dd26852788.

I also turned the Evaluation 'in_progress' field into a 'status' field.
This way it is much easier to compute metrics on evaluations.  It also
makes it possible to distinguish between 'aborted' and 'failed'
evaluations.

Thanks,

Mathieu
* bug#32548: Cuirass: Performance monitoring
From: Mathieu Othacehe @ 2020-09-14 13:34 UTC
To: Ludovic Courtès, mail; +Cc: 32548

Hello,

I just pushed support for computing and displaying metrics in Cuirass.  I
started with two metrics:

* Builds per day
* Average evaluation speed per specification.

Those metrics can now be seen at:

  https://ci.guix.gnu.org/metrics

and are updated every hour.

I plan to add more metrics, such as:

- Evaluation completion percentage.
- Evaluation completion speed.
- Failed evaluations percentage.
- Pending builds per day.

Don't hesitate to comment or to propose other metrics.

Thanks,

Mathieu
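To make the "builds per day" metric concrete, the aggregation amounts to
grouping build completion timestamps by calendar day.  The following
stand-alone sketch in plain Guile illustrates that grouping; it is not
the Cuirass implementation, which presumably computes its metrics
directly against its database.

--8<---------------cut here---------------start------------->8---
(use-modules (srfi srfi-1))            ;for 'fold'

(define (day seconds)
  "Return the UTC day of SECONDS (seconds since the epoch) as a
\"YYYY-MM-DD\" string."
  (strftime "%Y-%m-%d" (gmtime seconds)))

(define (builds-per-day completion-times)
  "Return an alist mapping \"YYYY-MM-DD\" strings to the number of
builds that completed on that day."
  (fold (lambda (time result)
          (let ((key (day time)))
            (assoc-set! result key
                        (+ 1 (or (assoc-ref result key) 0)))))
        '()
        completion-times))

;; Example: two builds on 2020-09-13, one on 2020-09-14.
(builds-per-day '(1600000000 1600003600 1600090000))
;; => (("2020-09-14" . 1) ("2020-09-13" . 2))   ;entry order may differ
--8<---------------cut here---------------end--------------->8---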
* bug#32548: Cuirass: Performance monitoring
From: zimoun @ 2020-09-14 14:10 UTC
To: Mathieu Othacehe; +Cc: 32548

Hi Mathieu,

Really cool!

On Mon, 14 Sep 2020 at 15:35, Mathieu Othacehe <othacehe@gnu.org> wrote:

> * Builds per day
> * Average evaluation speed per specification.

Something interesting could be the min and max (over 100 evaluations).
The standard deviation too, but I am not sure it is easy to interpret
at a quick glance.  Instead, the median could be interesting.

For example, consider these 2 evaluations:

  https://ci.guix.gnu.org/build/2094496/details
  https://ci.guix.gnu.org/build/3035986/details

Well, if there are, say, 99 evaluations of the first "kind" and 1 of the
second kind, the average is:

  (99*849 + 1_595_796_252) / 100 = 15_958_803.03

which does not really represent the effective workload.

Well, I will try to give it a look if I can schedule a moment. :-)

> Those metrics can now be seen at:
>
> https://ci.guix.gnu.org/metrics

Nice plot!

> I plan to add more metrics, such as:
>
> - Evaluation completion percentage.
> - Evaluation completion speed.
> - Failed evaluations percentage.
> - Pending builds per day.

Cool!
Maybe the time between the commit time (not the author time) and the
start of the build.

Cheers,
simon
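Simon's point about the outlier can be checked numerically.  The sketch
below is plain Guile, independent of Cuirass, and simply recomputes the
mean and the median for 99 evaluations of 849 seconds plus the one
outlier whose reported "duration" is a raw timestamp-like value:

--8<---------------cut here---------------start------------->8---
(define (mean durations)
  (/ (apply + durations) (length durations)))

(define (median durations)
  (let* ((sorted (sort durations <))
         (n      (length sorted))
         (middle (quotient n 2)))
    (if (odd? n)
        (list-ref sorted middle)
        (/ (+ (list-ref sorted (- middle 1))
              (list-ref sorted middle))
           2))))

(define durations
  (append (make-list 99 849) (list 1595796252)))

(exact->inexact (mean durations))   ;=> 15958803.03
(median durations)                  ;=> 849
--8<---------------cut here---------------end--------------->8---

The median stays at 849 seconds while the mean is pulled more than four
orders of magnitude away by a single data point, which is exactly the
argument for reporting the median (or the min and max) alongside the
average.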
* bug#32548: Cuirass: Performance monitoring
From: Bonface M. K. @ 2020-09-16 2:21 UTC
To: zimoun; +Cc: Mathieu Othacehe, 32548

Hi all.

zimoun <zimon.toutoune@gmail.com> writes:

> Hi Mathieu,
>
> Really cool!
>
> On Mon, 14 Sep 2020 at 15:35, Mathieu Othacehe <othacehe@gnu.org> wrote:
>
>> * Builds per day
>> * Average evaluation speed per specification.
>
> Something interesting could be the min and max (over 100 evaluations).
> The standard deviation too, but I am not sure it is easy to interpret
> at a quick glance.  Instead, the median could be interesting.
>
> For example, consider these 2 evaluations:
>
>   https://ci.guix.gnu.org/build/2094496/details
>   https://ci.guix.gnu.org/build/3035986/details

I'm getting a 504 Gateway Time-out error when visiting the above links
(at the time of sending this email).

> Well, if there are, say, 99 evaluations of the first "kind" and 1 of the
> second kind, the average is:
>
>   (99*849 + 1_595_796_252) / 100 = 15_958_803.03
>
> which does not really represent the effective workload.
>
> Well, I will try to give it a look if I can schedule a moment. :-)
>
>> Those metrics can now be seen at:
>>
>> https://ci.guix.gnu.org/metrics
>
> Nice plot!
>
>> I plan to add more metrics, such as:
>>
>> - Evaluation completion percentage.
>> - Evaluation completion speed.
>> - Failed evaluations percentage.
>> - Pending builds per day.
>
> Cool!
> Maybe the time between the commit time (not the author time) and the
> start of the build.
>
> Cheers,
> simon

--
Bonface M. K. (https://www.bonfacemunyoki.com)
Chief Emacs Mchochezi
GPG key = D4F09EB110177E03C28E2FE1F5BBAE1E0392253F
* bug#32548: Cuirass: Performance monitoring
From: Ludovic Courtès @ 2020-09-14 19:27 UTC
To: Mathieu Othacehe; +Cc: 32548

Hi!

Mathieu Othacehe <othacehe@gnu.org> skribis:

> I just pushed support for computing and displaying metrics in Cuirass.  I
> started with two metrics:
>
> * Builds per day
> * Average evaluation speed per specification.
>
> Those metrics can now be seen at:
>
>   https://ci.guix.gnu.org/metrics
>
> and are updated every hour.

This is very cool, thumbs up!

> I plan to add more metrics, such as:
>
> - Evaluation completion percentage.
> - Evaluation completion speed.
> - Failed evaluations percentage.
> - Pending builds per day.

That’d be awesome.

As discussed on IRC, builds per day should be compared to new
derivations per day.  For example, if on a given day there are 100 new
derivations and we only manage to build 10 of them, we have a problem.

BTW, in cuirass.log I noticed this:

--8<---------------cut here---------------start------------->8---
2020-09-14T21:16:21 Updating metric average-eval-duration-per-spec (guix-modular-master) to 414.8085106382979.
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
--8<---------------cut here---------------end--------------->8---

Perhaps it can’t compute an average yet for these jobsets?

Thanks!

Ludo’.
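The builds-versus-new-derivations comparison boils down to a per-day
ratio.  Purely as an illustration (this is not Cuirass code), and
assuming the same alist-of-daily-counts shape as the earlier
builds-per-day sketch:

--8<---------------cut here---------------start------------->8---
(define (daily-coverage builds derivations)
  "Return an alist mapping each day in DERIVATIONS, an alist of
day/count pairs, to the fraction of that day's new derivations that
appear as completed builds in BUILDS.  The (max ... 1) guards against
division by zero."
  (map (lambda (entry)
         (let* ((day   (car entry))
                (new   (cdr entry))
                (built (or (assoc-ref builds day) 0)))
           (cons day (exact->inexact (/ built (max new 1))))))
       derivations))

;; Example: 100 new derivations but only 10 completed builds that day.
(daily-coverage '(("2020-09-14" . 10))
                '(("2020-09-14" . 100)))
;; => (("2020-09-14" . 0.1))
--8<---------------cut here---------------end--------------->8---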
* bug#32548: Cuirass: Performance monitoring
From: Mathieu Othacehe @ 2020-09-17 10:07 UTC
To: Ludovic Courtès; +Cc: 32548-done

Hey Ludo,

> As discussed on IRC, builds per day should be compared to new
> derivations per day.  For example, if on a given day there are 100 new
> derivations and we only manage to build 10 of them, we have a problem.

I added this line, and they sadly do not overlap :(

> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
> 2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
> 2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
>
> Perhaps it can’t compute an average yet for these jobsets?

Yes, as soon as those evaluations are repaired, we should be able to
compute those metrics.  I chose to keep the error messages as a
reminder.

I added various other metrics and updated the "/metrics" page.  Once we
have a better view, we should think of adding thresholds on those
metrics.

Closing this one!

Thanks,

Mathieu

--
https://othacehe.org
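The thresholds mentioned above could be as simple as comparing each
metric's latest value against a configured limit.  Here is a minimal
sketch of that idea; the metric names and limits are invented for the
example and are not Cuirass identifiers:

--8<---------------cut here---------------start------------->8---
(define thresholds
  ;; Invented metric names and limits, for illustration only.
  '(("average-eval-duration" . 600)      ;seconds
    ("pending-builds"        . 20000)))

(define (breached-thresholds metrics)
  "Return the entries of METRICS, an alist of name/value pairs, whose
value exceeds the corresponding limit in THRESHOLDS."
  (filter (lambda (metric)
            (let ((limit (assoc-ref thresholds (car metric))))
              (and limit (> (cdr metric) limit))))
          metrics))

(breached-thresholds '(("average-eval-duration" . 414.8)
                       ("pending-builds"        . 35000)))
;; => (("pending-builds" . 35000))
--8<---------------cut here---------------end--------------->8---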
* bug#32548: Cuirass: Performance monitoring
From: Ludovic Courtès @ 2020-09-17 20:22 UTC
To: Mathieu Othacehe; +Cc: 32548-done

Hi,

Mathieu Othacehe <othacehe@gnu.org> skribis:

>> As discussed on IRC, builds per day should be compared to new
>> derivations per day.  For example, if on a given day there are 100 new
>> derivations and we only manage to build 10 of them, we have a problem.
>
> I added this line, and they sadly do not overlap :(

It seems less bad than I thought, though, and the rendering is pretty. :-)

>> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
>> 2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
>> 2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
>> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
>>
>> Perhaps it can’t compute an average yet for these jobsets?
>
> Yes, as soon as those evaluations are repaired, we should be able to
> compute those metrics.  I chose to keep the error messages as a
> reminder.

Makes sense.

> I added various other metrics and updated the "/metrics" page.  Once we
> have a better view, we should think of adding thresholds on those
> metrics.

Excellent.

Thanks a lot for closing this gap!

Ludo’.
* bug#32548: Cuirass: Performance monitoring
From: Andreas Enge @ 2020-09-16 15:56 UTC
To: Mathieu Othacehe; +Cc: 32548

On Mon, Sep 14, 2020 at 03:34:17PM +0200, Mathieu Othacehe wrote:
> I just pushed support for computing and displaying metrics in Cuirass.  I
> started with two metrics:
> * Builds per day
> * Average evaluation speed per specification.
> Those metrics can now be seen at:
> https://ci.guix.gnu.org/metrics

Congratulations, that looks like a very useful start already!
(And the number of builds has doubled since yesterday, so someone has
already put it to good use!)

How about also adding metrics per build machine?  I have the impression,
for instance, that the aarch64 machine in my living room is not used.
If this is confirmed, we could take appropriate action (uncomment it in
/etc/machines.scm :-), compare it to the other machines in use, change
the scheduling in the daemon, or even turn it off to conserve energy
should it turn out that we have too much build power...).

Andreas
* bug#32548: Cuirass: Performance monitoring
From: Mathieu Othacehe @ 2020-09-17 7:10 UTC
To: Andreas Enge; +Cc: 32548

Hello Andreas,

> Congratulations, that looks like a very useful start already!
> (And the number of builds has doubled since yesterday, so someone has
> already put it to good use!)

Thanks for your feedback :)

> How about also adding metrics per build machine?  I have the impression,
> for instance, that the aarch64 machine in my living room is not used.
> If this is confirmed, we could take appropriate action (uncomment it in
> /etc/machines.scm :-), compare it to the other machines in use, change
> the scheduling in the daemon, or even turn it off to conserve energy
> should it turn out that we have too much build power...).

Yes, I would really like to have something like
https://hydra.nixos.org/machines, with a build rate for every machine.

However, it cannot be done without structural changes to how offloading
is handled.  For now it's working this way:

  Cuirass -> guix-daemon -> guix offload -> build machines

which means that Cuirass has almost no information about offloaded
builds.

We are currently starting discussions about inviting the Guix Build
Coordinator to the party.  That could maybe help us implement what you
are proposing, among other things.

Thanks,

Mathieu
* bug#32548: Cuirass: Performance monitoring
From: Ludovic Courtès @ 2020-09-18 12:21 UTC
To: Mathieu Othacehe; +Cc: 32548

Hi Mathieu!

Mathieu Othacehe <othacehe@gnu.org> skribis:

>> How about also adding metrics per build machine?  I have the impression,
>> for instance, that the aarch64 machine in my living room is not used.
>> If this is confirmed, we could take appropriate action (uncomment it in
>> /etc/machines.scm :-), compare it to the other machines in use, change
>> the scheduling in the daemon, or even turn it off to conserve energy
>> should it turn out that we have too much build power...).
>
> Yes, I would really like to have something like
> https://hydra.nixos.org/machines, with a build rate for every machine.

+1!

> However, it cannot be done without structural changes to how offloading
> is handled.  For now it's working this way:
>
>   Cuirass -> guix-daemon -> guix offload -> build machines
>
> which means that Cuirass has almost no information about offloaded
> builds.

In practice, it could parse the offload events that it gets; a bit of a
hack, but good enough.  However…

> We are currently starting discussions about inviting the Guix Build
> Coordinator to the party.

… this sounds like the better option longer term.

Ludo’.
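The "parse the offload events" workaround could look roughly like the
sketch below.  The log-line pattern is a made-up placeholder, not the
actual output of 'guix offload' or guix-daemon; only the general shape
of the parsing is the point:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 regex)
             (ice-9 rdelim))

(define offload-rx
  ;; Placeholder pattern; the real daemon output may differ.
  (make-regexp "offloading.*to '([^']+)'"))

(define (offload-machine line)
  "Return the machine name mentioned in LINE, or #f if LINE does not
look like an offload event."
  (let ((m (regexp-exec offload-rx line)))
    (and m (match:substring m 1))))

(define (count-offloads port)
  "Read log lines from PORT and return an alist mapping machine names
to the number of builds offloaded to them."
  (let loop ((line   (read-line port))
             (counts '()))
    (if (eof-object? line)
        counts
        (loop (read-line port)
              (let ((machine (offload-machine line)))
                (if machine
                    (assoc-set! counts machine
                                (+ 1 (or (assoc-ref counts machine) 0)))
                    counts))))))
--8<---------------cut here---------------end--------------->8---

Per-machine build counts gathered this way would remain a stopgap; as
the thread concludes, the Guix Build Coordinator route is the more
structural answer.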
Thread overview: 14+ messages

  2018-08-27 22:33 bug#32548: Cuirass: Performance monitoring  Ludovic Courtès
  2020-09-06 14:42 ` Mathieu Othacehe
  2020-09-06 18:51   ` Christopher Baines
  2020-09-07  8:11     ` Ludovic Courtès
  2020-09-10 13:26       ` Mathieu Othacehe
  2020-09-14 13:34         ` Mathieu Othacehe
  2020-09-14 14:10           ` zimoun
  2020-09-16  2:21             ` Bonface M. K.
  2020-09-14 19:27           ` Ludovic Courtès
  2020-09-17 10:07             ` Mathieu Othacehe
  2020-09-17 20:22               ` Ludovic Courtès
  2020-09-16 15:56           ` Andreas Enge
  2020-09-17  7:10             ` Mathieu Othacehe
  2020-09-18 12:21               ` Ludovic Courtès