unofficial mirror of guix-devel@gnu.org 
* shepherd, fibers, and signals (asyncs)
From: Attila Lendvai @ 2023-12-15 22:03 UTC
  To: guix-devel

dear Guix,

context:

Shepherd stops responding during "guix system reconfigure"
https://issues.guix.gnu.org/67538
https://issues.guix.gnu.org/65178
https://issues.guix.gnu.org/67230

i've added a ton of logging and asserts in my fork:

https://codeberg.org/attila-lendvai-patches/shepherd

which resulted in this report:

https://github.com/wingo/fibers/issues/29#issuecomment-1858319291

to which @emixa-d kindly responded:

https://github.com/wingo/fibers/issues/29#issuecomment-1858497720

which essentially identifies the following:

--------------

posix signal handlers are async, and shepherd uses the fibers API from inside signal handlers, at least in handle-SIGCHLD.

this violates the fibers API, and is most probably the root cause of the reconfigure hang: a match-error escapes from service-controller because the value of the (current-process-monitor) parameter is lost, which then makes that fiber exit.

i have little experience with posix signal handlers, so i probably won't come up with a fix for this, or at least not without someone's bird's eye view guidance.

maybe the solution could be something like packaging up posix signals and delivering them to the fibers universe by some form of polling of an atomic variable? or is there some signal-safe semaphore facility in guile that could be used in accordance with the fibers API?
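
to make the idea concrete, here is a minimal sketch of the classic "self-pipe" pattern. it is written in Python only as an illustration and is not shepherd or fibers code: the names make_signal_pipe and drain_signals are hypothetical, and a Guile version would need its own design. the handler only records the signal number in a pipe; the ordinary event loop then picks it up outside of handler context.

```python
import os
import select
import signal


def make_signal_pipe(signum):
    """Route `signum` into a pipe; return (read_fd, restore). Hypothetical helper."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)

    def handler(received, _frame):
        # Only record the signal here; no scheduler or channel API is touched.
        try:
            os.write(write_fd, bytes([received]))
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    previous = signal.signal(signum, handler)

    def restore():
        # Reinstall the previous handler and release the pipe.
        signal.signal(signum, previous)
        os.close(read_fd)
        os.close(write_fd)

    return read_fd, restore


def drain_signals(read_fd, timeout=1.0):
    """Wait up to `timeout` seconds and return the pending signal numbers."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return []
    return list(os.read(read_fd, 64))
```

the event loop would call drain_signals (or register read_fd with its poller) and dispatch from normal code, so anything like a SIGCHLD handler body runs where the concurrency API may be used safely.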

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Virtue is never left to stand alone. He who has it will have neighbors.”
	— Confucius (551–479 BC)




* Re: shepherd, fibers, and signals (asyncs)
From: Attila Lendvai @ 2023-12-17  0:59 UTC
  To: Attila Lendvai; +Cc: guix-devel

@emixa-d kindly proposed something that turned out to be a fix:

https://github.com/wingo/fibers/issues/29#issuecomment-1858922276

i've sent it to:

shepherd: sometimes hangs on `guix system reconfigure`
https://issues.guix.gnu.org/67839#6

in essence:

shepherd violates the fibers API by calling it from an async signal handler, and this is indeed an issue, but the bugs it causes should manifest rarely and randomly; i.e. the frozen reconfigure behavior must be caused by something else.

the actual root cause here was that the with-process-monitor parameterize was not covering some code that it should have.
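
as a rough analogy for this class of bug (not the actual shepherd code), here is a Python sketch using contextvars in place of a Guile parameter. all names in it (current_monitor, with_monitor, handle_request) are made up for illustration. code that runs inside the binding sees the monitor; code that ends up outside it sees the default and fails.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical stand-in for a parameter such as (current-process-monitor).
current_monitor = contextvars.ContextVar("current_monitor", default=None)


@contextmanager
def with_monitor(monitor):
    """Bind current_monitor for the duration of the `with` body only."""
    token = current_monitor.set(monitor)
    try:
        yield
    finally:
        current_monitor.reset(token)


def handle_request():
    """Code that silently depends on the dynamic binding being in effect."""
    monitor = current_monitor.get()
    if monitor is None:
        raise LookupError("no process monitor bound in this dynamic scope")
    return f"handled by {monitor}"


def run_covered():
    with with_monitor("monitor-1"):
        return handle_request()  # inside the binding: works


def run_uncovered():
    with with_monitor("monitor-1"):
        pass  # the binding ends with this block
    return handle_request()  # outside the binding: raises LookupError
```

the general remedy for this shape of problem is to widen the binding so that it covers every piece of code that reads the parameter.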

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“I am not what happened to me, I am what I choose to become.”
	— Carl Jung (1875–1961)




* Re: shepherd, fibers, and signals (asyncs)
From: Ludovic Courtès @ 2024-01-09 19:29 UTC
  To: Attila Lendvai; +Cc: guix-devel

Hi Attila,

Apologies for the delay, I’m seeing this just now.

Attila Lendvai <attila@lendvai.name> skribis:

> posix signal handlers are async, and shepherd uses the fibers API from
> inside signal handlers, specifically in at least handle-SIGCHLD.

Shepherd has two modes: ‘signalfd’ on Linux, and Guile signal handlers
elsewhere (= GNU/Hurd).

The latter sucks for many reasons and was plain untested until recently
because GNU/Hurd support had been lost (Fibers used to be Linux-only).

The behavior on signalfd-less systems is now fixed in 0.10.3, though in
suboptimal ways.

At a fundamental level, we should fix signal handling in Guile and
Fibers to avoid race conditions.  (And get ‘signalfd’ on the Hurd. :-))

HTH!

Ludo’.


