all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / Atom feed
* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
@ 2018-11-13  1:09 Mark H Weaver
  2018-11-13 20:51 ` Mark H Weaver
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Mark H Weaver @ 2018-11-13  1:09 UTC (permalink / raw)
  To: 33362

As I write this, there are two system test builds that have been stuck
for many hours, endlessly printing "shepherd[1]: waiting for udevd...":

  https://hydra.gnu.org/build/3153725
  https://hydra.gnu.org/build/3154365

They are both on i686-linux, and on the 'core-updates' branch, but with
two different build slaves (hydra.gnunet.org and guix.sjd.se).

I will now abort these builds, to free up build slots for other jobs.

      Mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-13  1:09 bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..." Mark H Weaver
@ 2018-11-13 20:51 ` Mark H Weaver
  2018-11-29 22:58   ` Mark H Weaver
  2018-12-14 23:52   ` Mark H Weaver
  2018-11-20 16:02 ` Ludovic Courtès
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread
From: Mark H Weaver @ 2018-11-13 20:51 UTC (permalink / raw)
  To: 33362

Mark H Weaver <mhw@netris.org> writes:

> As I write this, there are two system test builds that have been stuck
> for many hours, endlessly printing "shepherd[1]: waiting for udevd...":
>
>   https://hydra.gnu.org/build/3153725
>   https://hydra.gnu.org/build/3154365
>
> They are both on i686-linux, and on the 'core-updates' branch, but with
> two different build slaves (hydra.gnunet.org and guix.sjd.se).
>
> I will now abort these builds, to free up build slots for other jobs.

Today I found another Hydra build stuck, and aborted it.  Moreover, I
cancelled all 'core-updates' jobs starting with "test.", to prevent more
build slots being held for long periods of time.

I expect that Berlin's build farm will also have several build slots
held in this way, but I'm not sure how to find out if it's happening,
or how to abort those jobs.

      Mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-13  1:09 bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..." Mark H Weaver
  2018-11-13 20:51 ` Mark H Weaver
@ 2018-11-20 16:02 ` Ludovic Courtès
  2018-12-09 21:46 ` Mark H Weaver
  2021-11-26  1:34 ` zimoun
  3 siblings, 0 replies; 10+ messages in thread
From: Ludovic Courtès @ 2018-11-20 16:02 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 33362

Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> As I write this, there are two system test builds that have been stuck
> for many hours, endlessly printing "shepherd[1]: waiting for udevd...":

Basic system tests pass for me on current-ish ‘core-updates’, on x86_64:

--8<---------------cut here---------------start------------->8---
$ git describe
v0.15.0-4036-g2d546858b1
$ make check-system TESTS='basic openssh mcron'
Compiling Scheme modules...
Running 3 system tests...
TOTAL: 3
PASS: /gnu/store/3zywc7y192p0n8l2cbm58vdkbpi3dchw-basic
PASS: /gnu/store/sirffl42jh6as45fy98frdfaglkgqif0-mcron
PASS: /gnu/store/pdw9jxq745w4833jx4hzfma6c48kwda8-openssh
--8<---------------cut here---------------end--------------->8---

Does the problem still show up for you?

Ludo’.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-13 20:51 ` Mark H Weaver
@ 2018-11-29 22:58   ` Mark H Weaver
  2018-11-29 23:19     ` Danny Milosavljevic
  2018-12-09  8:33     ` Mark H Weaver
  2018-12-14 23:52   ` Mark H Weaver
  1 sibling, 2 replies; 10+ messages in thread
From: Mark H Weaver @ 2018-11-29 22:58 UTC (permalink / raw)
  To: 33362

Mark H Weaver <mhw@netris.org> writes:

> Mark H Weaver <mhw@netris.org> writes:
>
>> As I write this, there are two system test builds that have been stuck
>> for many hours, endlessly printing "shepherd[1]: waiting for udevd...":
>>
>>   https://hydra.gnu.org/build/3153725
>>   https://hydra.gnu.org/build/3154365
>>
>> They are both on i686-linux, and on the 'core-updates' branch, but with
>> two different build slaves (hydra.gnunet.org and guix.sjd.se).
>>
>> I will now abort these builds, to free up build slots for other jobs.
>
> Today I found another Hydra build stuck, and aborted it.  Moreover, I
> cancelled all 'core-updates' jobs starting with "test.", to prevent more
> build slots being held for long periods of time.

This has now happened on the 'master' branch, on x86_64:

  https://hydra.gnu.org/build/3194655

As I write this, this job has held 1 out of 3 build slots on x86_64 for
over 12 hours, waiting for udevd.

I forcefully killed the build.

      Mark


--8<---------------cut here---------------start------------->8---
@ build-started /gnu/store/30n23jvd99acs55k31frvl0lx1x3g5zf-bitlbee-test.drv - x86_64-linux /var/log/guix/drvs/30//n23jvd99acs55k31frvl0lx1x3g5zf-bitlbee-test.drv.bz2 513
^[c^[[?7l^[[2J^[[0mSeaBIOS (version rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org)


iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+0FF91290+0FEF1290 C980
Press Ctrl-B to configure iPXE (PCI 00:03.0)...                                                                               


Booting from ROM..^[c^[[?7l^[[2J[    0.000000] Linux version 4.19.5-gnu (nixbld@) (gcc version 7.3.0 (GCC)) #1 SMP 1
[    0.000000] Command line: console=ttyS0 --root=/dev/vda1 --system=/gnu/store/9avywswbm01qr2xin1s6s1fryp17rqpz-system --load=/gnu/store/9avywswbm01qr2xin1s6s1fryp17rqpz-system/boot panic=1
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Centaur CentaurHauls
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000000ffddfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000ffde000-0x000000000fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.8 present.
[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
[    0.000000] last_pfn = 0xffde max_arch_pfn = 0x400000000
[    0.000000] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
[    0.000000] found SMP MP-table at [mem 0x000f5920-0x000f592f] mapped at [(____ptrval____)]
[    0.000000] Scanning 1 areas for low memory corruption
[    0.000000] RAMDISK: [mem 0x0f4cf000-0x0ffcffff]
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000000F5730 000014 (v00 BOCHS )
[    0.000000] ACPI: RSDT 0x000000000FFE157C 000030 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACP 0x000000000FFE1458 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 0x000000000FFE0040 001418 (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACS 0x000000000FFE0000 000040
[    0.000000] ACPI: APIC 0x000000000FFE14CC 000078 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: HPET 0x000000000FFE1544 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at [mem 0x0000000000000000-0x000000000ffddfff]
[    0.000000] NODE_DATA(0) allocated [mem 0x0ffda000-0x0ffddfff]
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000000001000-0x000000000ffddfff]
[    0.000000]   Normal   empty
[    0.000000]   Device   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x000000000ffddfff]
[    0.000000] Reserved but unavailable: 132 pages
[    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000000ffddfff]
[    0.000000] ACPI: PM-Timer IO Port: 0x608
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[    0.000000] IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[    0.000000] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[    0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
[    0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[    0.000000] [mem 0x10000000-0xfffbffff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on bare hardware
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.000000] random: get_random_bytes called from start_kernel+0x99/0x537 with crng_init=0
[    0.000000] setup_percpu: NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:1 nr_node_ids:1
[    0.000000] percpu: Embedded 45 pages/cpu @(____ptrval____) s147288 r8192 d28840 u2097152
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 64359
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: console=ttyS0 --root=/dev/vda1 --system=/gnu/store/9avywswbm01qr2xin1s6s1fryp17rqpz-system --load=/gnu/store/9avywswbm01qr2xin1s6s1fryp17rqpz-system/boot panic=1
[    0.000000] Memory: 220076K/261616K available (12300K kernel code, 1575K rwdata, 3884K rodata, 1824K init, 2632K bss, 41540K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] ftrace: allocating 39285 entries in 154 pages
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=1.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 16640, nr_irqs: 256, preallocated irqs: 16
[    0.000000] Console: colour VGA+ 80x25
[    0.000000] console [ttyS0] enabled
[    0.000000] ACPI: Core revision 20180810
[    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[    0.008000] APIC: Switch to symmetric I/O mode setup
[    0.012000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.048000] tsc: Unable to calibrate against PIT
[    0.048000] tsc: using HPET reference calibration
[    0.048000] tsc: Detected 2792.855 MHz processor
[    0.001990] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2841e2f9134, max_idle_ns: 440795219952 ns
[    0.003887] Calibrating delay loop (skipped), value calculated using timer frequency.. 5585.71 BogoMIPS (lpj=11171420)
[    0.007669] pid_max: default: 32768 minimum: 301
[    0.009292] Security Framework initialized
[    0.009651] Yama: becoming mindful.
[    0.013219] AppArmor: AppArmor initialized
[    0.016697] Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.017834] Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
[    0.018652] Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
[    0.019069] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes)
[    0.039980] mce: CPU supports 10 MCE banks
[    0.041659] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.041976] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[    0.042516] Spectre V2 : Spectre mitigation: LFENCE not serializing, switching to generic retpoline
[    0.043035] Spectre V2 : Mitigation: Full generic retpoline
[    0.043339] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.043798] Speculative Store Bypass: Vulnerable
[    0.548955] Freeing SMP alternatives memory: 28K
[    0.590970] smpboot: CPU0: AMD QEMU Virtual CPU version 2.5+ (family: 0x6, model: 0x6, stepping: 0x3)
[    0.596752] Performance Events: PMU not available due to virtualization, using software events only.
[    0.599826] rcu: Hierarchical SRCU implementation.
[    0.608287] NMI watchdog: Perf NMI watchdog permanently disabled
[    0.613279] smp: Bringing up secondary CPUs ...
[    0.613728] smp: Brought up 1 node, 1 CPU
[    0.614040] smpboot: Max logical packages: 1
[    0.614395] smpboot: Total of 1 processors activated (5585.71 BogoMIPS)
[    0.630298] devtmpfs: initialized
[    0.635460] x86/mm: Memory block size: 128MB
[    0.644658] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.645509] futex hash table entries: 256 (order: 2, 16384 bytes)
[    0.649879] pinctrl core: initialized pinctrl subsystem
[    0.659269] RTC time: 10:09:11, date: 11/29/18
[    0.664985] NET: Registered protocol family 16
[    0.669097] audit: initializing netlink subsys (disabled)
[    0.675447] audit: type=2000 audit(1543486150.716:1): state=initialized audit_enabled=0 res=1
[    0.676794] cpuidle: using governor ladder
[    0.677189] cpuidle: using governor menu
[    0.679203] ACPI: bus type PCI registered
[    0.679581] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.682951] PCI: Using configuration type 1 for base access
[    0.721243] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.730282] ACPI: Added _OSI(Module Device)
[    0.730574] ACPI: Added _OSI(Processor Device)
[    0.730920] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.731274] ACPI: Added _OSI(Processor Aggregator Device)
[    0.731702] ACPI: Added _OSI(Linux-Dell-Video)
[    0.731965] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    0.756160] ACPI: 1 ACPI AML tables successfully acquired and loaded
[    0.783269] ACPI: Interpreter enabled
[    0.784934] ACPI: (supports S0 S3 S4 S5)
[    0.785294] ACPI: Using IOAPIC for interrupt routing
[    0.786266] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.789697] ACPI: Enabled 2 GPEs in block 00 to 0F
[    0.852693] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.853619] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
[    0.854339] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[    0.855466] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[    0.865564] acpiphp: Slot [3] registered
[    0.866061] acpiphp: Slot [4] registered
[    0.866487] acpiphp: Slot [5] registered
[    0.867004] acpiphp: Slot [6] registered
[    0.867446] acpiphp: Slot [7] registered
[    0.867950] acpiphp: Slot [8] registered
[    0.868414] acpiphp: Slot [9] registered
[    0.868842] acpiphp: Slot [10] registered
[    0.869297] acpiphp: Slot [11] registered
[    0.869749] acpiphp: Slot [12] registered
[    0.870172] acpiphp: Slot [13] registered
[    0.870631] acpiphp: Slot [14] registered
[    0.871244] acpiphp: Slot [15] registered
[    0.871684] acpiphp: Slot [16] registered
[    0.872149] acpiphp: Slot [17] registered
[    0.872583] acpiphp: Slot [18] registered
[    0.873008] acpiphp: Slot [19] registered
[    0.873491] acpiphp: Slot [20] registered
[    0.873972] acpiphp: Slot [21] registered
[    0.874408] acpiphp: Slot [22] registered
[    0.874880] acpiphp: Slot [23] registered
[    0.875435] acpiphp: Slot [24] registered
[    0.875867] acpiphp: Slot [25] registered
[    0.876296] acpiphp: Slot [26] registered
[    0.876749] acpiphp: Slot [27] registered
[    0.877186] acpiphp: Slot [28] registered
[    0.877719] acpiphp: Slot [29] registered
[    0.878149] acpiphp: Slot [30] registered
[    0.878580] acpiphp: Slot [31] registered
[    0.879290] PCI host bridge to bus 0000:00
[    0.879731] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.880135] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.880540] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.880960] pci_bus 0000:00: root bus resource [mem 0x10000000-0xfebfffff window]
[    0.881400] pci_bus 0000:00: root bus resource [mem 0x100000000-0x17fffffff window]
[    0.881998] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.898402] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
[    0.899062] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
[    0.899479] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
[    0.899911] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
[    0.902479] pci 0000:00:01.3: quirk: [io  0x0600-0x063f] claimed by PIIX4 ACPI
[    0.903096] pci 0000:00:01.3: quirk: [io  0x0700-0x070f] claimed by PIIX4 SMB
[    0.999262] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
[    1.000854] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
[    1.002149] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
[    1.003542] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
[    1.004334] ACPI: PCI Interrupt Link [LNKS] (IRQs *9)
[    1.016401] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[    1.016902] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    1.017508] pci 0000:00:02.0: vgaarb: bridge control possible
[    1.017951] vgaarb: loaded
[    1.023855] SCSI subsystem initialized
[    1.025999] ACPI: bus type USB registered
[    1.026856] usbcore: registered new interface driver usbfs
[    1.027391] usbcore: registered new interface driver hub
[    1.028088] usbcore: registered new device driver usb
[    1.030143] EDAC MC: Ver: 3.0.0
[    1.034569] PCI: Using ACPI for IRQ routing
[    1.047072] NetLabel: Initializing
[    1.047338] NetLabel:  domain hash size = 128
[    1.047597] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[    1.050042] NetLabel:  unlabeled traffic allowed by default
[    1.055387] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[    1.056087] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    1.056488] hpet0: 3 comparators, 64-bit 100.000000 MHz counter
[    1.062065] clocksource: Switched to clocksource tsc-early
[    1.270005] VFS: Disk quotas dquot_6.6.0
[    1.270604] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    1.277314] AppArmor: AppArmor Filesystem Enabled
[    1.278662] pnp: PnP ACPI init
[    1.287379] pnp: PnP ACPI: found 6 devices
[    1.323353] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    1.326579] NET: Registered protocol family 2
[    1.334899] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes)
[    1.336156] TCP established hash table entries: 2048 (order: 2, 16384 bytes)
[    1.336797] TCP bind hash table entries: 2048 (order: 3, 32768 bytes)
[    1.337313] TCP: Hash tables configured (established 2048 bind 2048)
[    1.339653] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    1.340237] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    1.342698] NET: Registered protocol family 1
[    1.343742] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[    1.344229] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[    1.344657] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[    1.345351] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    1.354709] Trying to unpack rootfs image as initramfs...
[    2.708916] Freeing initrd memory: 11268K
[    2.711141] Scanning for low memory corruption every 60 seconds
[    2.722225] Initialise system trusted keyrings
[    2.724539] workingset: timestamp_bits=40 max_order=16 bucket_order=0
[    2.742365] zbud: loaded
[    2.753683] pstore: using deflate compression
[    2.755963] Allocating IMA blacklist keyring.
[    2.787798] Key type asymmetric registered
[    2.788317] Asymmetric key parser 'x509' registered
[    2.789209] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
[    2.791500] io scheduler noop registered
[    2.791912] io scheduler deadline registered (default)
[    2.793536] io scheduler cfq registered
[    2.798310] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[    2.804456] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[    2.808795] ACPI: Power Button [PWRF]
[    2.820362] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[    2.845154] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    2.882637] Linux agpgart interface v0.103
[    2.942331] brd: module loaded
[    2.966458] loop: module loaded
[    2.985023] scsi host0: ata_piix
[    2.988709] scsi host1: ata_piix
[    2.989888] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc100 irq 14
[    2.990448] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc108 irq 15
[    2.998922] libphy: Fixed MDIO Bus: probed
[    2.999505] tun: Universal TUN/TAP device driver, 1.6
[    3.001512] PPP generic driver version 2.4.2
[    3.003308] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    3.003843] ehci-pci: EHCI PCI platform driver
[    3.004440] ehci-platform: EHCI generic platform driver
[    3.004953] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    3.005453] ohci-pci: OHCI PCI platform driver
[    3.006120] ohci-platform: OHCI generic platform driver
[    3.006690] uhci_hcd: USB Universal Host Controller Interface driver
[    3.009539] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[    3.014828] serio: i8042 KBD port at 0x60,0x64 irq 1
[    3.015641] serio: i8042 AUX port at 0x60,0x64 irq 12
[    3.018958] mousedev: PS/2 mouse device common for all mice
[    3.024949] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
[    3.026863] rtc_cmos 00:00: RTC can wake from S4
[    3.032658] rtc_cmos 00:00: registered as rtc0
[    3.033973] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes nvram, hpet irqs
[    3.034644] i2c /dev entries driver
[    3.036056] device-mapper: uevent: version 1.0.3
[    3.042686] device-mapper: ioctl: 4.39.0-ioctl (2018-04-03) initialised: dm-devel@redhat.com
[    3.044218] ledtrig-cpu: registered to indicate activity on CPUs
[    3.054662] NET: Registered protocol family 10
[    3.078608] Segment Routing with IPv6
[    3.081437] NET: Registered protocol family 17
[    3.082138] Key type dns_resolver registered
[    3.089595] sched_clock: Marking stable (3137613189, -49029043)->(3206256657, -117672511)
[    3.100722] registered taskstats version 1
[    3.101059] Loading compiled-in X.509 certificates
[    3.102658] zswap: loaded using pool lzo/zbud
[    3.136596] Key type big_key registered
[    3.148083] Key type trusted registered
[    3.150698] Key type encrypted registered
[    3.158853] AppArmor: AppArmor sha1 policy hashing enabled
[    3.172714] ima: No TPM chip found, activating TPM-bypass!
[    3.173294] ima: Allocated hash algorithm: sha1
[    3.181142] ata2.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100
[    3.192938] evm: Initialising EVM extended attributes:
[    3.193283] evm: security.selinux
[    3.193522] evm: security.SMACK64
[    3.193727] evm: security.SMACK64EXEC
[    3.193931] evm: security.SMACK64TRANSMUTE
[    3.194153] evm: security.SMACK64MMAP
[    3.194354] evm: security.apparmor
[    3.194541] evm: security.ima
[    3.194708] evm: security.capability
[    3.194926] evm: HMAC attrs: 0x1
[    3.233677] scsi 1:0:0:0: CD-ROM            QEMU     QEMU DVD-ROM     2.5+ PQ: 0 ANSI: 5
[    3.234657]   Magic number: 10:885:174
[    3.237214] acpi device:0a: hash matches
[    3.238633] rtc_cmos 00:00: setting system clock to 2018-11-29 10:09:13 UTC (1543486153)
[    3.294888] sr 1:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
[    3.295686] cdrom: Uniform CD-ROM driver Revision: 3.20
[    3.302249] sr 1:0:0:0: Attached scsi generic sg0 type 5
[    3.474633] Freeing unused kernel image memory: 1824K
[    3.487601] Write protecting the kernel read-only data: 18432k
[    3.496646] Freeing unused kernel image memory: 2028K
[    3.497529] Freeing unused kernel image memory: 212K
[    3.498808] Run /init as init process
GC Warning: pthread_getattr_np or pthread_attr_getstack failed for main thread
GC Warning: Couldn't read /proc/stat
[    3.727996] tsc: Refined TSC clocksource calibration: 2792.877 MHz
[    3.730506] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2841f7f18a5, max_idle_ns: 440795343253 ns
[    3.756117] clocksource: Switched to clocksource tsc
Welcome, this is GNU's early boot Guile.
Use '--repl' for an initrd REPL.

loading kernel modules...
[    4.720811] usbcore: registered new interface driver usb-storage
[    4.797877] usbcore: registered new interface driver uas
[    4.891491] hidraw: raw HID events driver (C) Jiri Kosina
[    4.924691] usbcore: registered new interface driver usbhid
[    4.925902] usbhid: USB HID core driver
[    5.064893] isci: Intel(R) C600 SAS Controller Driver - version 1.2.0
[    5.359460] PCI Interrupt Link [LNKC] enabled at IRQ 11
[    5.650659] PCI Interrupt Link [LNKD] enabled at IRQ 10
[    5.933730] PCI Interrupt Link [LNKA] enabled at IRQ 10
[    6.211059] PCI Interrupt Link [LNKB] enabled at IRQ 11
[    6.546500] virtio_blk virtio4: [vda] 143360 512-byte logical blocks (73.4 MB/70.0 MiB)
[    6.573822]  vda: vda1 vda2
[    6.740363] random: fast init done
[    6.746988] random: crng init done
[    6.792220] FS-Cache: Loaded
[    6.819059] 9pnet: Installing 9P2000 support
[    6.828759] 9p: Installing v9fs 9p2000 file system support
[    6.830918] FS-Cache: Netfs '9p' registered for caching
[    6.919369] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: (null)
loading '/gnu/store/9avywswbm01qr2xin1s6s1fryp17rqpz-system/boot'...
making '/gnu/store/9avywswbm01qr2xin1s6s1fryp17rqpz-system' the current system...
setting up setuid programs in '/run/setuid-programs'...
populating /etc from /gnu/store/np5p64haqakrrpb6y0wqqm3r2ry2j0a3-etc...
adding user 'root'...
adding group 'root'...
adding group 'wheel'...
adding group 'users'...
adding group 'nogroup'...
adding group 'tty'...
adding group 'dialout'...
adding group 'kmem'...
adding group 'input'...
adding group 'video'...
adding group 'audio'...
adding group 'netdev'...
adding group 'lp'...
adding group 'disk'...
adding group 'floppy'...
adding group 'cdrom'...
adding group 'tape'...
adding group 'kvm'...
adding group 'guixbuild'...
adding group 'bitlbee'...
usermod: no changes
adding user 'alice'...
adding user 'nobody'...
adding user 'guixbuilder01'...
adding user 'guixbuilder02'...
adding user 'guixbuilder03'...
adding user 'guixbuilder04'...
adding user 'guixbuilder05'...
adding user 'guixbuilder06'...
adding user 'guixbuilder07'...
adding user 'guixbuilder08'...
adding user 'guixbuilder09'...
adding user 'guixbuilder10'...
adding user 'bitlbee'...
registering public key '/gnu/store/zjxk76lb94k34i0ab9vra4sp4qh4vnq3-guix-0.15.0-8.71a78ba/share/guix/hydra.gnu.org.pub'...
[   31.638611] shepherd[1]: Service root has been started.
[   35.970631] shepherd[1]: starting services...
[   36.019828] shepherd[1]: Service root-file-system has been started.
[   36.077056] shepherd[1]: Service user-file-systems has been started.
[   36.237915] shepherd[1]: waiting for udevd...
[   36.774385] shepherd[1]: waiting for udevd...
[   37.308971] shepherd[1]: waiting for udevd...
[   37.856855] shepherd[1]: waiting for udevd...
[   38.388725] shepherd[1]: waiting for udevd...
[   38.921770] shepherd[1]: waiting for udevd...
[   39.464416] shepherd[1]: waiting for udevd...
[   40.010167] shepherd[1]: waiting for udevd...
[   40.562618] shepherd[1]: waiting for udevd...

[... 84322 similar lines removed ...]

[45824.277069] shepherd[1]: waiting for udevd...
[45824.809158] shepherd[1]: waiting for udevd...
[45825.418089] shepherd[1]: waiting for udevd...
[45825.958464] shepherd[1]: waiting for udevd...
[45826.534511] shepherd[1]: waiting for udevd...
[45827.090077] shepherd[1]: waiting for udevd...

[... I forcefully killed the build here ...]

--8<---------------cut here---------------end--------------->8---

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-29 22:58   ` Mark H Weaver
@ 2018-11-29 23:19     ` Danny Milosavljevic
  2018-12-09  8:33     ` Mark H Weaver
  1 sibling, 0 replies; 10+ messages in thread
From: Danny Milosavljevic @ 2018-11-29 23:19 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 33362

[-- Attachment #1: Type: text/plain, Size: 1494 bytes --]

Hmm, I wonder whether it might make sense to add "--debug" to udevd in gnu/services/base.scm in order to track down this problem.

The loop looks rather harmless and only checks whether the actual udev socket is available - which it must be for udevd to work.

It does the following right now:

         (start #~(lambda ()
                    (define udevd
                      ;; 'udevd' from eudev.
                      #$(file-append udev "/sbin/udevd")) ; TODO: Add ' "--debug"' here

                    (define (wait-for-udevd)
                      ;; Wait until someone's listening on udevd's control
                      ;; socket.
                      (let ((sock (socket AF_UNIX SOCK_SEQPACKET 0)))
                        (let try ()
                          (catch 'system-error
                            (lambda ()
                              (connect sock PF_UNIX "/run/udev/control")
                              (close-port sock))
                            (lambda args
                              (format #t "waiting for udevd...~%")
                              (usleep 500000)
                              (try))))))
[...]
                    (let ((pid (fork+exec-command (list udevd))))
                      ;; Wait until udevd is up and running.  This appears to
                      ;; be needed so that the events triggered below are
                      ;; actually handled.
                      (wait-for-udevd)



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-29 22:58   ` Mark H Weaver
  2018-11-29 23:19     ` Danny Milosavljevic
@ 2018-12-09  8:33     ` Mark H Weaver
  1 sibling, 0 replies; 10+ messages in thread
From: Mark H Weaver @ 2018-12-09  8:33 UTC (permalink / raw)
  To: 33362

In recent evaluations of 'master', there have been two more instances of
this problem occurring, on x86_64 and i686:

  https://hydra.gnu.org/build/3252398  (from eval 110355, commit 08861d25)
  https://hydra.gnu.org/build/3225361  (from eval 110351, commit 69f867b1)

I aborted these jobs.

      Mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-13  1:09 bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..." Mark H Weaver
  2018-11-13 20:51 ` Mark H Weaver
  2018-11-20 16:02 ` Ludovic Courtès
@ 2018-12-09 21:46 ` Mark H Weaver
  2021-11-26  1:34 ` zimoun
  3 siblings, 0 replies; 10+ messages in thread
From: Mark H Weaver @ 2018-12-09 21:46 UTC (permalink / raw)
  To: 33362

I found two more instances of this problem, in evaluation 110355:

  https://hydra.gnu.org/build/3253071
  https://hydra.gnu.org/build/3252182

I killed these jobs, and cancelled all pending 'test.*' jobs.

       Mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-13 20:51 ` Mark H Weaver
  2018-11-29 22:58   ` Mark H Weaver
@ 2018-12-14 23:52   ` Mark H Weaver
  1 sibling, 0 replies; 10+ messages in thread
From: Mark H Weaver @ 2018-12-14 23:52 UTC (permalink / raw)
  To: 33362

I just found that 4 out of 6 Intel build slots on Hydra have been
occupied for the last 10+ hours on more instances of this bug:

  https://hydra.gnu.org/build/3281985
  https://hydra.gnu.org/build/3281874
  https://hydra.gnu.org/build/3281957
  https://hydra.gnu.org/build/3281830

I killed these jobs.

      Mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2018-11-13  1:09 bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..." Mark H Weaver
                   ` (2 preceding siblings ...)
  2018-12-09 21:46 ` Mark H Weaver
@ 2021-11-26  1:34 ` zimoun
  2021-11-26 23:21   ` Mark H Weaver
  3 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2021-11-26  1:34 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 33362

Hi Mark,

On Mon, 12 Nov 2018 at 20:09, Mark H Weaver <mhw@netris.org> wrote:

> As I write this, there are two system test builds that have been stuck
> for many hours, endlessly printing "shepherd[1]: waiting for udevd...":
>
>   https://hydra.gnu.org/build/3153725
>   https://hydra.gnu.org/build/3154365
>
> They are both on i686-linux, and on the 'core-updates' branch, but with
> two different build slaves (hydra.gnunet.org and guix.sjd.se).
>
> I will now abort these builds, to free up build slots for other jobs.

I am doing bug triage and I hit this old one #33362 [1] from 2018.  It
is about hydra which is down now, IIRC.  Does it still make sense to
keep it open?  Or can we close it?

1: <http://issues.guix.gnu.org/issue/33362>


Cheers,
simon




^ permalink raw reply	[flat|nested] 10+ messages in thread

* bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..."
  2021-11-26  1:34 ` zimoun
@ 2021-11-26 23:21   ` Mark H Weaver
  0 siblings, 0 replies; 10+ messages in thread
From: Mark H Weaver @ 2021-11-26 23:21 UTC (permalink / raw)
  To: zimoun; +Cc: 33362

Hi Simon,

zimoun <zimon.toutoune@gmail.com> writes:

> On Mon, 12 Nov 2018 at 20:09, Mark H Weaver <mhw@netris.org> wrote:
>
>> As I write this, there are two system test builds that have been stuck
>> for many hours, endlessly printing "shepherd[1]: waiting for udevd...":
>>
>>   https://hydra.gnu.org/build/3153725
>>   https://hydra.gnu.org/build/3154365
>>
>> They are both on i686-linux, and on the 'core-updates' branch, but with
>> two different build slaves (hydra.gnunet.org and guix.sjd.se).
>>
>> I will now abort these builds, to free up build slots for other jobs.
>
> I am doing bug triage and I hit this old one #33362 [1] from 2018.  It
> is about hydra which is down now, IIRC.

I doubt that this bug was about Hydra.  It guess that the bug was in our
system test derivations.  As far as I know, the only relevance of Hydra
to this bug is that Hydra was our CI system at that time, and therefore
it was Hydra that brought this bug to my attention.

> Does it still make sense to keep it open?  Or can we close it?

If the bug hasn't occurred recently, then I agree it's okay to close it.

It would be good to check our modern Cuirass-based ci.guix.gnu.org to
find out whether this failure mode is still occurring in our system
tests.

I see that Cuirass's web interface has improved quite significantly in
the last couple of years, and I'm very grateful to those who've worked
on it.  However, Cuirass still seems to be missing some important
functionality that Hydra had.  Most notably, unless I missed something,
it seems to lack the ability to compare the results of two evaluations
and show the *differences* between those results, i.e. to enumerate the
newly failing jobs, the newly succeeding jobs, and the newly aborted
jobs.

Without that functionality, it's not easy for us to notice when a job
starts to fail, unless a user files a bug report.  The total list of job
failures has always been too large to easily notice changes in that list
without assistance.

What do you think?

Thanks again for your work on this.

      Regards,
        Mark

-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-11-26 23:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-13  1:09 bug#33362: System tests stuck in "shepherd[1]: waiting for udevd..." Mark H Weaver
2018-11-13 20:51 ` Mark H Weaver
2018-11-29 22:58   ` Mark H Weaver
2018-11-29 23:19     ` Danny Milosavljevic
2018-12-09  8:33     ` Mark H Weaver
2018-12-14 23:52   ` Mark H Weaver
2018-11-20 16:02 ` Ludovic Courtès
2018-12-09 21:46 ` Mark H Weaver
2021-11-26  1:34 ` zimoun
2021-11-26 23:21   ` Mark H Weaver

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.