From: "Jonathan Frederickson" <jonathan@terracrypt.net>
To: 72166@debbugs.gnu.org
Subject: bug#72166: Shepherd periodically goes unresponsive on one of my machines
Date: Wed, 17 Jul 2024 20:43:15 -0400
Message-ID: <df6e8894-fd84-446f-a67f-50cdcc9de5b3@app.fastmail.com>

I've been running into an issue with Shepherd on one of my machines. Every so often, under conditions I haven't yet identified, both of my Shepherd instances (the home one and PID 1) go unresponsive. I thought I had tracked it down to a misbehaving home service that I had configured, but it has just happened again without that service running.

'herd status' hangs indefinitely:

jfred@terracard ~$ sudo herd status
Password: 
<never returns>

...on both instances:

jfred@terracard ~$ herd status
<never returns>

The PID 1 shepherd instance isn't reaping defunct processes:

jfred@terracard ~$ ps aux | grep -i lock
jfred      541  0.0  0.0   3700  2304 ?        S    18:30   0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
jfred     3111  0.0  0.0      0     0 ?        Z    18:53   0:00 [swaylock] <defunct>
jfred     3112  0.0  0.0      0     0 ?        Zs   18:53   0:00 [swaylock] <defunct>
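
For anyone who wants to check for the same symptom without grepping 'ps' output, here is a minimal Python sketch that lists zombie processes and their parents by reading /proc. It assumes a Linux /proc layout; the function names parse_stat and find_zombies are made up for illustration and are not part of Shepherd or Guix.

```python
import os


def parse_stat(line):
    """Parse a /proc/<pid>/stat line into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    rest = line[rpar + 1:].split()
    return pid, comm, rest[0], int(rest[1])


def find_zombies(proc="/proc"):
    """Return (pid, comm, state, ppid) for every process in state 'Z'."""
    zombies = []
    if not os.path.isdir(proc):
        return zombies  # not a Linux-style /proc; nothing to scan
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat"), errors="replace") as f:
                info = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable mid-scan
        if info[2] == "Z":
            zombies.append(info)
    return zombies


if __name__ == "__main__":
    for pid, comm, state, ppid in find_zombies():
        print(f"{pid}\t{comm}\tparent={ppid}")
```

Zombies that stay listed with the same parent across runs suggest that parent is not calling wait() on them.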

Some further troubleshooting: strace shows one thread (pid 144) blocked in a read() on fd 9, with the other threads waiting on a futex:

jfred@terracard ~ [env]$ sudo strace -fp 1
Password: 
strace: Process 1 attached with 5 threads
[pid   144] read(9,  <unfinished ...>
[pid   142] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid   141] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid   140] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY^

...which seems to be a pipe that shepherd itself holds both ends of (fd 9 for reading, fd 11 for writing):

jfred@terracard ~ [env]$ sudo ls -l /proc/1/fd/9
lr-x------ 1 root root 64 Jul 17 20:39 /proc/1/fd/9 -> 'pipe:[4015]'
jfred@terracard ~ [env]$ sudo lsof -n | grep 4015
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
      Output information may be incomplete.
shepherd     1                      root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1                      root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  140 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  140 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  141 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  141 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  142 GC-marker       root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  142 GC-marker       root   11w     FIFO               0,15       0t0       4015 pipe
shepherd     1  144 shepherd        root    9r     FIFO               0,15       0t0       4015 pipe
shepherd     1  144 shepherd        root   11w     FIFO               0,15       0t0       4015 pipe
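
The same lookup can be done without lsof. Below is a minimal Python sketch, assuming a Linux /proc, that finds every descriptor open on a given pipe inode and reports whether it is the read or the write end. The helper names (pipe_inode, access_mode, pipe_holders) are made up for illustration; 4015 is the inode from the output above. It needs root to see other users' processes.

```python
import os
import re

PIPE_RE = re.compile(r"^pipe:\[(\d+)\]$")


def pipe_inode(target):
    """Return the inode from a link target like 'pipe:[4015]', else None."""
    m = PIPE_RE.match(target)
    return int(m.group(1)) if m else None


def access_mode(fdinfo_text):
    """Map the octal 'flags:' field of /proc/<pid>/fdinfo/<fd> to r, w or rw."""
    for line in fdinfo_text.splitlines():
        if line.startswith("flags:"):
            flags = int(line.split()[1], 8)
            modes = {os.O_RDONLY: "r", os.O_WRONLY: "w", os.O_RDWR: "rw"}
            return modes.get(flags & os.O_ACCMODE, "?")
    return "?"


def pipe_holders(inode, proc="/proc"):
    """Return (pid, fd, mode) for every descriptor open on the given pipe."""
    holders = []
    if not os.path.isdir(proc):
        return holders
    for pid in filter(str.isdigit, os.listdir(proc)):
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # no permission, or the process is gone
        for fd in fds:
            try:
                if pipe_inode(os.readlink(os.path.join(fd_dir, fd))) != inode:
                    continue
                with open(os.path.join(proc, pid, "fdinfo", fd)) as f:
                    mode = access_mode(f.read())
            except OSError:
                continue  # descriptor closed while scanning
            holders.append((int(pid), int(fd), mode))
    return holders


if __name__ == "__main__":
    for pid, fd, mode in pipe_holders(4015):
        print(f"pid={pid} fd={fd} mode={mode}")
```

If the only holders of both ends belong to the same process, nothing outside that process can write to the pipe to wake the blocked read().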

My system configuration for this machine can be found here: https://github.com/jfrederickson/dotfiles/blob/master/guix/guix/system/machines/terracard/config.scm

I last ran 'guix pull' on June 21.

Has anyone else run into this?




