unofficial mirror of bug-guix@gnu.org 
* bug#59514: Stuck builds in Cuirass
@ 2022-11-23 12:50 Marius Bakke
  2022-11-23 13:26 ` Mathieu Othacehe
From: Marius Bakke @ 2022-11-23 12:50 UTC
  To: 59514; +Cc: othacehe


[-- Attachment #1.1: Type: text/plain, Size: 556 bytes --]

Hi,

Cuirass has a tendency to not notice when a build is finished, leaving
it in a "running" state.

The phenomenon can be observed by going to
<https://ci.guix.gnu.org/status> and looking at builds that have been
running for a suspiciously long time.

Typically the build log will indicate that it has finished, yet Cuirass
is patiently waiting...and not scheduling further builds.

Restarting the builds typically gets things going again.

I wrote a nasty script to automatically restart builds that have been
running for more than an hour, but it's not a sustainable solution:


[-- Attachment #1.2: restart-old-builds.py --]
[-- Type: text/plain, Size: 1593 bytes --]

#!/usr/bin/env python3

# Restart stuck builds....   TODO fix cuirass properly.

import re
import time

import requests
from bs4 import BeautifulSoup

builds_page = "https://ci.guix.gnu.org/status"
builds_html = requests.get(builds_page).text

soup = BeautifulSoup(builds_html, "html5lib")
main = soup.find('main', {'id': 'content'})
table = main.find('table')

# Map each build ID to the details shown in its row of the status table.
result = {}

for row in table.find_all('tr'):
    data = row.find_all('td')
    if len(data) > 0:  # skip rows without data cells (e.g. the header)
        build_id = row.find('a').contents[0]
        name = data[0].contents[0]
        age = data[1].contents[0]
        system = data[2].contents[0]

        result[build_id] = {'name': name, 'age': age, 'system': system}

# Raw string, so that \d and \w are not treated as string escapes.
age_re = re.compile(r"(\d+) (\w+) ago")
restart = []

for build_id, build in result.items():
    match = age_re.match(build['age'])
    if match is not None:  # None for ages without a number, e.g. "seconds ago"
        digits = match.group(1)
        time_unit = match.group(2)
        if time_unit == "hours":
            restart.append(build_id)
        elif time_unit == "minutes" and int(digits) > 60:
            restart.append(build_id)

certificate_file = "/home/marius/tmp/mbakke.cert.pem"
certificate_key = "/home/marius/tmp/mbakke.key.pem"

print(f"Found {len(restart)} stuck builds..!")

for build_id in restart:
    build = result[build_id]
    print(f"Going to restart {build['name']} ({build_id}, running since {build['age']})...")
    # The restart endpoint requires a client certificate.
    requests.get(f"https://ci.guix.gnu.org/admin/build/{build_id}/restart",
                 cert=(certificate_file, certificate_key))
    time.sleep(3)  # pause between restart requests
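
The age check in the script above only matches the literal unit "hours", so
an age reported in any other unit would be skipped. A possible refinement, as
a minimal sketch: convert the age text to a timedelta and compare it against a
threshold. The helper names (parse_age, is_stuck) and the unit table are
hypothetical, and the exact strings the status page renders are an assumption
based on the "(\d+) (\w+) ago" pattern used above.

```python
import re
from datetime import timedelta

# Assumed unit spellings; which units the status page actually shows
# is not confirmed by the original script.
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}

_AGE_RE = re.compile(r"(\d+) (\w+) ago")


def parse_age(text):
    """Return the age as a timedelta, or None if the text is not recognised."""
    match = _AGE_RE.match(text.strip())
    if match is None:
        return None
    count = int(match.group(1))
    unit = match.group(2).rstrip("s")  # "hours" -> "hour"
    seconds = _UNIT_SECONDS.get(unit)
    if seconds is None:
        return None
    return timedelta(seconds=count * seconds)


def is_stuck(text, threshold=timedelta(hours=1)):
    """True if the parsed age is strictly greater than the threshold."""
    age = parse_age(text)
    return age is not None and age > threshold
```

With something like this, the restart list could be built by calling
is_stuck() on each build's age text, and the one-hour threshold becomes a
single parameter rather than a pair of unit comparisons.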

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 247 bytes --]
