From: "Ludovic Courtès" <ludo@gnu.org>
To: Mathieu Othacehe <othacehe@gnu.org>
Cc: 34033@debbugs.gnu.org
Subject: bug#34033: Offloading sometimes hangs
Date: Fri, 03 Jul 2020 15:58:55 +0200
Message-ID: <871rlsog2o.fsf@gnu.org>
In-Reply-To: <877dvlkriv.fsf@gnu.org> (Mathieu Othacehe's message of "Fri, 03 Jul 2020 09:05:12 +0200")

Hi!

Mathieu Othacehe <othacehe@gnu.org> writes:

>> Something is going wrong here! I'll keep investigating.
>
> To help us investigate those issues I added a "/status" page, which is
> also accessible from a new drop-down menu in the Cuirass navigation bar.
>
> See, https://ci.guix.gnu.org/status.

Nice!  So it’s roughly like the info at /api/queue, but filtered to
running builds, right?

> Hydra has the same interface, but also a "Machine status" page, that
> breaks down the running builds machine per machine. I plan to implement
> that one next. Reading Hydra code, I also discovered that some part of
> the offloading is directly done from Hydra, which talks with the
> nix-daemon of the connected build machines, interesting!

Yes, Hydra does most of the scheduling by itself.  Since this is
redundant with what the daemon + offload do, I thought Cuirass shouldn’t
do any scheduling at all and instead let the daemon take care of it
all.

This has advantages (the daemon has a global view and can achieve better
scheduling), and drawbacks (the protocol requires us to wait for
‘build-things’ completion before we can queue more builds, and
scheduling decisions are almost invisible to Cuirass).

> While I'm writing, we have 5 running builds for ~1 hour, and 76040 queued
> builds. Given the computing power of Berlin, there must be a bottleneck
> somewhere.

Yes!  I’ve often run “guix processes” on berlin and then straced the
‘SessionPID’ process.  It’s insightful: sometimes you see that the
daemon is stuck waiting for a machine to offload to, and sometimes it’s
stuck waiting for a build that will perhaps just eventually time out…
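
For illustration, here is a minimal sketch of how one might pull the
‘SessionPID’ values out of “guix processes” output before attaching
strace.  It assumes the output is a series of “Field: value” records
separated by blank lines; the helper names (parse_records,
session_pids, strace_commands) and the sample text are hypothetical,
not part of Guix.

```python
def parse_records(text):
    """Parse 'Field: value' lines into records separated by blank lines.

    Assumption: this mirrors the record layout printed by
    'guix processes'; continuation lines are not handled.
    """
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record, if any.
            if current:
                records.append(current)
                current = {}
            continue
        name, sep, value = line.partition(":")
        if sep:
            # A field may appear several times in one record.
            current.setdefault(name.strip(), []).append(value.strip())
    if current:
        records.append(current)
    return records


def session_pids(text):
    """Return the SessionPID of every record that has one."""
    return [int(r["SessionPID"][0])
            for r in parse_records(text)
            if "SessionPID" in r]


def strace_commands(text):
    """Build one 'strace -p PID' command line per session."""
    return ["strace -p {}".format(pid) for pid in session_pids(text)]


# Hypothetical sample in the assumed format, not real output from berlin.
SAMPLE = """\
SessionPID: 19002
ClientPID: 19090
ClientCommand: guix environment --ad-hoc python

SessionPID: 19402
ClientPID: 19367
ClientCommand: guix publish -u guix-publish -p 3000
"""

for command in strace_commands(SAMPLE):
    print(command)
```

In practice the text would come from running “guix processes” on the
build machine, and the printed commands would be run as root there.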

Ludo’.




