* bug#72166: Shepherd periodically goes unresponsive on one of my machines
@ 2024-07-18  0:43 Jonathan Frederickson
From: Jonathan Frederickson @ 2024-07-18  0:43 UTC (permalink / raw)
  To: 72166

I've been running into an issue with Shepherd on one of my machines. Every so often, under conditions I haven't yet identified, both of my Shepherd instances (the home instance and PID 1) become unresponsive. I thought I had tracked it down to a misbehaving home service that I had configured, but it has just happened again without that service running.

'herd status' hangs indefinitely:

jfred@terracard ~$ sudo herd status
Password: 
<never returns>

...on both instances:

jfred@terracard ~$ herd status
<never returns>
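
A minimal sketch for checking whether an instance still answers without tying up the terminal. The function name `probe` is hypothetical, and it assumes the `timeout` command from GNU coreutils, which exits with status 124 when the time limit expires:

```shell
#!/bin/sh
# probe LIMIT COMMAND [ARGS...]
# Hypothetical helper: run COMMAND under coreutils `timeout` and print
# "ok", "hung", or "failed (STATUS)" depending on how it ended.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with 124 when the limit was reached.
        echo "hung"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($status)"
    fi
}

# Example: probe 10 herd status
```

Running `probe 10 herd status` as the user checks the home instance; running the same thing as root checks the PID 1 instance.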

The PID 1 shepherd instance isn't reaping defunct processes:

jfred@terracard ~$ ps aux | grep -i lock
jfred      541  0.0  0.0   3700  2304 ?        S    18:30   0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
jfred     3111  0.0  0.0      0     0 ?        Z    18:53   0:00 [swaylock] <defunct>
jfred     3112  0.0  0.0      0     0 ?        Zs   18:53   0:00 [swaylock] <defunct>
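
To confirm which process is responsible for reaping a given zombie, it helps to see each zombie's parent PID. The following is a sketch that reads the Linux `/proc/PID/stat` files directly; the function name `list_zombies` and its optional root-directory argument are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# list_zombies [PROC_ROOT]
# Hypothetical helper: print "PID PPID COMM" for every process in
# state Z. PROC_ROOT defaults to /proc.
list_zombies() {
    root=${1:-/proc}
    for stat_file in "$root"/[0-9]*/stat; do
        [ -r "$stat_file" ] || continue
        # A process may exit between the glob and the read.
        read -r line < "$stat_file" 2>/dev/null || [ -n "$line" ] || continue
        # Layout: pid (comm) state ppid ...
        # comm may itself contain spaces or parentheses, so split on
        # the first "(" and the last ") ".
        pid=${line%% *}
        comm=${line#*'('}
        comm=${comm%') '*}
        rest=${line##*') '}
        set -- $rest
        state=$1
        ppid=$2
        if [ "$state" = "Z" ]; then
            echo "$pid $ppid $comm"
        fi
    done
}

# Example: list_zombies
```

A zombie whose PPID column is 1 is waiting for PID 1 to collect its exit status.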

Some further troubleshooting: strace indicates that PID 1 is blocked in a read() on its fd 9:

jfred@terracard ~ [env]$ sudo strace -fp 1
Password: 
strace: Process 1 attached with 5 threads
[pid   144] read(9,  <unfinished ...>
[pid   142] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid   141] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid   140] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY^

...which seems to be a pipe whose read end (fd 9) and write end (fd 11) are both open in shepherd itself:

jfred@terracard ~ [env]$ sudo ls -l /proc/1/fd/9
lr-x------ 1 root root 64 Jul 17 20:39 /proc/1/fd/9 -> 'pipe:[4015]'
jfred@terracard ~ [env]$ sudo lsof -n | grep 4015
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
      Output information may be incomplete.
shepherd     1                      root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1                      root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  140 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  140 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  141 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  141 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  142 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  142 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  144 shepherd        root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  144 shepherd        root   11w     FIFO               0,15       0t0       4015 pipe
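
The same lookup can be sketched without lsof by walking the `/proc/PID/fd` symlinks, which avoids the incomplete-output warning above. The function name `pipe_holders` and its optional root-directory argument are hypothetical. Threads of one process share a descriptor table, so this prints one line per process and descriptor rather than one per thread:

```shell
#!/bin/sh
# pipe_holders INODE [PROC_ROOT]
# Hypothetical helper: print "PID FD" for every open descriptor that
# refers to the pipe with the given inode number. PROC_ROOT defaults
# to /proc.
pipe_holders() {
    inode=$1
    root=${2:-/proc}
    for fd in "$root"/[0-9]*/fd/*; do
        # Unreadable fd directories leave the glob unexpanded; readlink
        # then fails and the entry is skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        [ "$target" = "pipe:[$inode]" ] || continue
        pid=${fd#"$root"/}
        pid=${pid%%/*}
        echo "$pid ${fd##*/}"
    done
}

# Example (as root, to see the descriptors of PID 1): pipe_holders 4015
```

If the only lines printed belong to the process that is blocked in read(), then nothing outside that process can write to the pipe and wake it up.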

My system configuration for this machine can be found at https://github.com/jfrederickson/dotfiles/blob/master/guix/guix/system/machines/terracard/config.scm. I last ran 'guix pull' on June 21.

Has anyone else run into this?



