unofficial mirror of bug-guix@gnu.org 
* bug#63982: Shepherd can crash when a user service fails to start
@ 2023-06-09 17:13 Maxim Cournoyer
  2023-06-12 13:44 ` Ludovic Courtès
  2023-06-18 15:14 ` bug#63982: Shepherd wrong-type-arg nils
  0 siblings, 2 replies; 13+ messages in thread
From: Maxim Cournoyer @ 2023-06-09 17:13 UTC (permalink / raw)
  To: 63982; +Cc: Ludovic Courtès

[-- Attachment #1: Type: text/plain, Size: 10125 bytes --]

Hi!

I've noticed that while all my user services (managed via GNU Stow --
not via Guix Home) were working, 'herd status' would report that
/run/user/1000/shepherd/socket was missing and bail out.

Starting from a nonexistent /run/user/1000/shepherd/socket, using old
Shepherd 0.9.1:

--8<---------------cut here---------------start------------->8---
$ /gnu/store/dblbnj1yra4yrrfjbnzsa0ldcl3170ap-shepherd-0.9.1/bin/shepherd
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g115}#).  Add #:declarative? #f to your define-module invocation.

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.
$
Warning: due to a long standing Gtk+ bug
https://gitlab.gnome.org/GNOME/gtk/issues/221
Emacs might crash when run in daemon mode and the X11 connection is unexpectedly lost.
Using an Emacs configured with --with-x-toolkit=lucid does not have this problem.
Loading time (native compiled elisp)...
Loading time (native compiled elisp)...done
Loading /home/maxim/.emacs.d/recentf...
Loading /home/maxim/.emacs.d/recentf...done
Cleaning up the recentf list...
Cleaning up the recentf list...done (0 removed)
../../.emacs: Warning: Use keywords rather than deprecated positional arguments to `define-minor-mode'
Preparing diary...
No diary entries for Friday, June 9, 2023
Preparing diary...done
Appointment reminders enabled
Loading /home/maxim/.emacs.d/emms/cache...
Loading /home/maxim/.emacs.d/emms/cache...done
[yas] Prepared just-in-time loading of snippets successfully.
[yas] Prepared just-in-time loading of snippets successfully.
Starting new Ispell process aspell with english dictionary... \
Starting new Ispell process aspell with english dictionary...done
Starting Emacs daemon.
Unable to start the daemon.
Another instance of Emacs is running the server, either as daemon or interactively.
You can use emacsclient to connect to that Emacs process.
Saving file /home/maxim/.emacs.d/emms/history...
Wrote /home/maxim/.emacs.d/emms/history
Wrote /home/maxim/.emacs.d/recentf
Error: server did not start correctly
Service emacs could not be started.
gpg-agent: a gpg-agent is already running - not starting a new one
Service gpg-agent could not be started.
Service ibus-daemon has been started.

$ herd status
Started:
 + ibus-daemon
 + root
Stopped:
 - emacs
 - gpg-agent
 - jackd
 - workrave
--8<---------------cut here---------------end--------------->8---

If I then run it anew, it fails with "shepherd: while opening socket
'/run/user/1000/shepherd/socket': bind: Address already in use", because
apparently 'herd stop root' didn't remove it.

--8<---------------cut here---------------start------------->8---
$ herd stop root
Exiting.
[...]

$ /gnu/store/dblbnj1yra4yrrfjbnzsa0ldcl3170ap-shepherd-0.9.1/bin/shepherd
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g115}#).  Add #:declarative? #f to your define-module invocation.

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.
maxim@hurd ~/src/guix [env]$
Warning: due to a long standing Gtk+ bug
https://gitlab.gnome.org/GNOME/gtk/issues/221
Emacs might crash when run in daemon mode and the X11 connection is unexpectedly lost.
Using an Emacs configured with --with-x-toolkit=lucid does not have this problem.
Loading time (native compiled elisp)...
Loading time (native compiled elisp)...done
Loading /home/maxim/.emacs.d/recentf...
Loading /home/maxim/.emacs.d/recentf...done
Cleaning up the recentf list...
Cleaning up the recentf list...done (0 removed)
../../.emacs: Warning: Use keywords rather than deprecated positional arguments to `define-minor-mode'
Preparing diary...
No diary entries for Friday, June 9, 2023
Preparing diary...done
Appointment reminders enabled
Loading /home/maxim/.emacs.d/emms/cache...
Loading /home/maxim/.emacs.d/emms/cache...done
[yas] Prepared just-in-time loading of snippets successfully.
[yas] Prepared just-in-time loading of snippets successfully.
Starting new Ispell process aspell with english dictionary... \
Starting new Ispell process aspell with english dictionary...done
Starting Emacs daemon.
Unable to start the daemon.
Another instance of Emacs is running the server, either as daemon or interactively.
You can use emacsclient to connect to that Emacs process.
Saving file /home/maxim/.emacs.d/emms/history...
Wrote /home/maxim/.emacs.d/emms/history
Wrote /home/maxim/.emacs.d/recentf
Error: server did not start correctly
Service emacs could not be started.
gpg-agent: a gpg-agent is already running - not starting a new one
Service gpg-agent could not be started.
Service ibus-daemon has been started.
shepherd: while opening socket '/run/user/1000/shepherd/socket': bind: Address already in use

Exiting shepherd...
Service ibus-daemon has been stopped.

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.

$
--8<---------------cut here---------------end--------------->8---

Even after removing it manually with 'rm
/run/user/1000/shepherd/socket', it still fails:

--8<---------------cut here---------------start------------->8---
$ /gnu/store/dblbnj1yra4yrrfjbnzsa0ldcl3170ap-shepherd-0.9.1/bin/shepherd
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g115}#).  Add #:declarative? #f to your define-module invocation.

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.
maxim@hurd ~/src/guix [env]$
Warning: due to a long standing Gtk+ bug
https://gitlab.gnome.org/GNOME/gtk/issues/221
Emacs might crash when run in daemon mode and the X11 connection is unexpectedly lost.
Using an Emacs configured with --with-x-toolkit=lucid does not have this problem.
Loading time (native compiled elisp)...
Loading time (native compiled elisp)...done
Loading /home/maxim/.emacs.d/recentf...
Loading /home/maxim/.emacs.d/recentf...done
Cleaning up the recentf list...
Cleaning up the recentf list...done (0 removed)
../../.emacs: Warning: Use keywords rather than deprecated positional arguments to `define-minor-mode'
Preparing diary...
No diary entries for Friday, June 9, 2023
Preparing diary...done
Appointment reminders enabled
Loading /home/maxim/.emacs.d/emms/cache...
Loading /home/maxim/.emacs.d/emms/cache...done
[yas] Prepared just-in-time loading of snippets successfully.
[yas] Prepared just-in-time loading of snippets successfully.
Starting new Ispell process aspell with english dictionary... \
Starting new Ispell process aspell with english dictionary...done
Starting Emacs daemon.
Unable to start the daemon.
Another instance of Emacs is running the server, either as daemon or interactively.
You can use emacsclient to connect to that Emacs process.
Saving file /home/maxim/.emacs.d/emms/history...
Wrote /home/maxim/.emacs.d/emms/history
Wrote /home/maxim/.emacs.d/recentf
Error: server did not start correctly
Service emacs could not be started.
gpg-agent: a gpg-agent is already running - not starting a new one
Service gpg-agent could not be started.
Service ibus-daemon has been started.
shepherd: while opening socket '/run/user/1000/shepherd/socket': bind: Address already in use

Exiting shepherd...
Service ibus-daemon has been stopped.

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.
--8<---------------cut here---------------end--------------->8---

It apparently is caused by Emacs failing to start, because if I comment
it out from the init.scm file, then the same Shepherd invocation is
happy:

--8<---------------cut here---------------start------------->8---
;; Services to start when shepherd starts:
(for-each start '(;emacs
		  gpg-agent
		  ibus-daemon))
--8<---------------cut here---------------end--------------->8---

--8<---------------cut here---------------start------------->8---
$ herd status
Started:
 + ibus-daemon
 + root
Stopped:
 - emacs
 - gpg-agent
 - jackd
 - workrave
--8<---------------cut here---------------end--------------->8---

But that's with Shepherd 0.9.1.  If I run the exact same config, which
now works with 0.9.1, under Shepherd 0.10.1 instead, I see:

--8<---------------cut here---------------start------------->8---
rm /run/user/1000/shepherd/socket

$ /gnu/store/y826g8wrpzskcs82ffxppj7mmz257ksi-shepherd-0.10.1/bin/shepherd
Starting service root...
Service root started.
Service root running with value #t.
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g119}#).  Add #:declarative? #f to your define-module invocation.

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.
Starting service gpg-agent...

$ herd status
herd: error: /run/user/1000/shepherd/socket: No such file or directory

$ file /run/user/1000/shepherd/socket
/run/user/1000/shepherd/socket: cannot open `/run/user/1000/shepherd/socket' (No such file or directory)

$ pgrep -a shepherd
1 /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/bin/guile --no-auto-compile /gnu/store/nl0948z46yndpx3kihhi540l5c422wv4-shepherd-0.10.0/bin/shepherd --config /gnu/store/7dxbjccbqamk4wa0nyf7zsc4ywimb1fh-shepherd.conf
24700 /gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9/bin/guile --no-auto-compile /gnu/store/y826g8wrpzskcs82ffxppj7mmz257ksi-shepherd-0.10.1/bin/shepherd
--8<---------------cut here---------------end--------------->8---

It seems a bug exists in both 0.9.1 and 0.10.1, but that something also
regressed going from 0.9.1 to 0.10.1.

Attached are the two relevant Shepherd config files to test:


[-- Attachment #2: init.scm --]
[-- Type: application/octet-stream, Size: 287 bytes --]

;;; Shepherd User Services
(load "services.scm")

(register-services
 emacs
 gpg-agent
 jackd
 ibus-daemon
 workrave)

;; Send shepherd into the background.
(action 'shepherd 'daemonize)

;; Services to start when shepherd starts:
(for-each start '(emacs
		  gpg-agent
		  ibus-daemon))

[-- Attachment #3: services.scm --]
[-- Type: application/octet-stream, Size: 1178 bytes --]

(define emacs
  (make <service>
    #:provides '(emacs)
    #:requires '()
    #:start (make-system-constructor "emacs --daemon")
    #:stop (make-system-destructor "emacsclient --eval \"(kill-emacs)\"")))

(define ibus-daemon
  (make <service>
    #:provides '(ibus-daemon)
    #:requires '()
    #:start (make-system-constructor "ibus-daemon --xim --daemonize --replace")
    #:stop (make-system-destructor "pkill ibus-daemon")))

(define jackd
  (make <service>
    #:provides '(jackd)
    #:requires '()
    #:start (make-system-constructor "jackd -d alsa &")
    #:stop (make-system-destructor "pkill jackd")))

(define gpg-agent
  (let ((pinentry (string-append (getenv "HOME")
				 "/.guix-profile/bin/pinentry")))
    (make <service>
      #:provides '(gpg-agent)
      #:requires '()
      #:start (make-system-constructor
	       (string-append "gpg-agent --daemon "
			      "--pinentry-program " pinentry))
      #:stop (make-system-destructor "gpgconf --kill gpg-agent"))))

(define workrave
  (make <service>
    #:provides '(workrave)
    #:requires '()
    #:start (make-system-constructor "workrave &")
    #:stop (make-system-destructor "pkill -9 workrave")))

[-- Attachment #4: Type: text/plain, Size: 19 bytes --]


-- 
Thanks,
Maxim


* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-09 17:13 bug#63982: Shepherd can crash when a user service fails to start Maxim Cournoyer
@ 2023-06-12 13:44 ` Ludovic Courtès
  2023-06-12 17:32   ` Maxim Cournoyer
  2023-06-18 15:14 ` bug#63982: Shepherd wrong-type-arg nils
  1 sibling, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2023-06-12 13:44 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 63982

Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> rm /run/user/1000/shepherd/socket
>
> $ /gnu/store/y826g8wrpzskcs82ffxppj7mmz257ksi-shepherd-0.10.1/bin/shepherd
> Starting service root...
> Service root started.
> Service root running with value #t.
> Service root has been started.
> WARNING: Use of `load' in declarative module (#{ g119}#).  Add #:declarative? #f to your define-module invocation.
>
> Some deprecated features have been used.  Set the environment
> variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
> program to get more information.  Set it to "no" to suppress
> this message.
> Starting service gpg-agent...
>
> $ herd status
> herd: error: /run/user/1000/shepherd/socket: No such file or directory

Thanks for the detailed bug report!

I believe this is fixed by Shepherd commit
24c964021ebd3d63ce6e22808dd09dbe16116a6c, which introduces an additional
change: loading the config file asynchronously.

If you wish to test it, you can use the ‘shepherd’ channel.

Let me know how it goes!

Thanks,
Ludo’.





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-12 13:44 ` Ludovic Courtès
@ 2023-06-12 17:32   ` Maxim Cournoyer
  2023-06-14 15:57     ` Ludovic Courtès
  0 siblings, 1 reply; 13+ messages in thread
From: Maxim Cournoyer @ 2023-06-12 17:32 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 63982

Hi Ludovic!

Ludovic Courtès <ludo@gnu.org> writes:

> Hi Maxim,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> rm /run/user/1000/shepherd/socket
>>
>> $ /gnu/store/y826g8wrpzskcs82ffxppj7mmz257ksi-shepherd-0.10.1/bin/shepherd
>> Starting service root...
>> Service root started.
>> Service root running with value #t.
>> Service root has been started.
>> WARNING: Use of `load' in declarative module (#{ g119}#).  Add #:declarative? #f to your define-module invocation.
>>
>> Some deprecated features have been used.  Set the environment
>> variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
>> program to get more information.  Set it to "no" to suppress
>> this message.
>> Starting service gpg-agent...
>>
>> $ herd status
>> herd: error: /run/user/1000/shepherd/socket: No such file or directory
>
> Thanks for the detailed bug report!
>
> I believe this is fixed by Shepherd commit
> 24c964021ebd3d63ce6e22808dd09dbe16116a6c, which introduces an additional
> change: loading the config file asynchronously.

Nitpick: I'd use a git message tag for 'Reported-by', as can be inserted
in the commit buffer in Magit with C-c C-p.  They should be placed at
the bottom of the git message to be considered by tools parsing them.

> If you wish to test it, you can use the ‘shepherd’ channel.

I've done so by placing in my ~/.config/guix/channels.scm file:

       (channel
        (name 'shepherd)
        (url "https://git.savannah.gnu.org/git/shepherd.git")
        (introduction
         (make-channel-introduction
          "788a6d6f1d5c170db68aa4bbfb77024fdc468ed3"  ;2022-05-21
          (openpgp-fingerprint
           "3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5"))))
           

It'd be nice to have this in the Shepherd doc for easy copy & paste.
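
For reference, a complete channels.scm built around that entry might
look as follows.  This is only a sketch: it assumes the file declares no
other extra channels and simply conses the entry onto
%default-channels, which is the usual Guix convention.

```scheme
;; ~/.config/guix/channels.scm
;; Sketch: the 'shepherd' channel entry quoted above, placed in front of
;; Guix's default channels.  Assumes no other extra channels are wanted.
(cons (channel
       (name 'shepherd)
       (url "https://git.savannah.gnu.org/git/shepherd.git")
       (introduction
        (make-channel-introduction
         "788a6d6f1d5c170db68aa4bbfb77024fdc468ed3"  ;2022-05-21
         (openpgp-fingerprint
          "3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5"))))
      %default-channels)
```

Running 'guix pull' afterwards fetches the channel.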

> Let me know how it goes!

I've edited my ~/.xsession file to use
/gnu/store/ahzl8vxxcd5bqlljwgn8wkp4884sr72l-shepherd-0.10.99-tarball,
and I'm now seeing this:

--8<---------------cut here---------------start------------->8---
$ herd status
Started:
 + root
Starting:
 ^ emacs
Stopped:
 - gpg-agent
 - ibus-daemon
 - jackd
 - workrave
--8<---------------cut here---------------end--------------->8---
 
Interestingly, the Emacs client is usable.  It doesn't change from
there, and requesting it to be stopped hangs Shepherd:

--8<---------------cut here---------------start------------->8---
$ herd stop emacs

--8<---------------cut here---------------end--------------->8---

If I comment out the Emacs service from the ~/.config/shepherd/init.scm
file, the same seems to happen on my next service, gpg-agent:

--8<---------------cut here---------------start------------->8---
$ herd status
Started:
 + root
Starting:
 ^ gpg-agent
Stopped:
 - emacs
 - ibus-daemon
 - jackd
 - workrave
--8<---------------cut here---------------end--------------->8---

Etc. if I comment that one (now hanging on starting ibus-daemon).  It
seems something is still off?

Thanks for working toward a fix!

-- 
Thanks,
Maxim





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-12 17:32   ` Maxim Cournoyer
@ 2023-06-14 15:57     ` Ludovic Courtès
  2023-06-19  1:42       ` bug#63982: Service hangs in 'starting' with Shepherd 0.10 (was: Shepherd can crash when a user service fails to start) Maxim Cournoyer
  0 siblings, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2023-06-14 15:57 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 63982

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>> I believe this is fixed by Shepherd commit
>> 24c964021ebd3d63ce6e22808dd09dbe16116a6c, which introduces an additional
>> change: loading the config file asynchronously.
>
> Nitpick: I'd use a git message tag for 'Reported-by', as can be inserted
> in the commit buffer in Magit with C-c C-p.  They should be placed at
> the bottom of the git message to be considered by tools parsing them.

Neat, I didn’t know about it, I’ll do that now (I think I started using
the “Reported by” convention before Git came into existence…).

>> If you wish to test it, you can use the ‘shepherd’ channel.
>
> I've done so by placing in my ~/.config/guix/channels.scm file:
>
>        (channel
>         (name 'shepherd)
>         (url "https://git.savannah.gnu.org/git/shepherd.git")
>         (introduction
>          (make-channel-introduction
>           "788a6d6f1d5c170db68aa4bbfb77024fdc468ed3"  ;2022-05-21
>           (openpgp-fingerprint
>            "3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5"))))
>            
>
> It'd be nice to have this in the Shepherd doc for easy copy & paste.

I’ll add that to ‘README’.

>> Let me know how it goes!
>
> I've edited my ~/.xsession file to use
> /gnu/store/ahzl8vxxcd5bqlljwgn8wkp4884sr72l-shepherd-0.10.99-tarball,
> and I'm now seeing this:
>
> $ herd status
> Started:
>  + root
> Starting:
>  ^ emacs
> Stopped:
>  - gpg-agent
>  - ibus-daemon
>  - jackd
>  - workrave

Uh, so it remains in “starting” state?

> Interestingly, the Emacs client is usable.  It doesn't change from
> there, and requesting it to be stopped hangs Shepherd:

Technically it’s waiting for ‘emacs’ to be in “running” state before
attempting to stop it.

> If I comment out the Emacs service from the ~/.config/shepherd/init.scm
> file, the same seems to happen on my next service, gpg-agent:
>
> $ herd status
> Started:
>  + root
> Starting:
>  ^ gpg-agent
> Stopped:
>  - emacs
>  - ibus-daemon
>  - jackd
>  - workrave
>
> Etc. if I comment that one (now hanging on starting ibus-daemon).  It
> seems something is still off?

Looks like it.  Could you share ~/.local/var/log/shepherd.log?

Thanks,
Ludo’.





* bug#63982: Shepherd wrong-type-arg
  2023-06-09 17:13 bug#63982: Shepherd can crash when a user service fails to start Maxim Cournoyer
  2023-06-12 13:44 ` Ludovic Courtès
@ 2023-06-18 15:14 ` nils
  2023-06-22 20:08   ` bug#63982: Shepherd can crash when a user service fails to start Ludovic Courtès
  1 sibling, 1 reply; 13+ messages in thread
From: nils @ 2023-06-18 15:14 UTC (permalink / raw)
  To: 63982@debbugs.gnu.org

Hello,

I am affected by this as well, but with slightly different symptoms.
Using guix home on a foreign system (Debian 12), I tried different shepherd versions with

(service home-shepherd-service-type
 (home-shepherd-configuration
   (shepherd (specification->package "shepherd@0.9"))))

, and guix home describe --list-installed shows me that this works (in the sense that a different shepherd version is installed).
None of the versions I tried got me a functional shepherd service.

These are the error messages by shepherd version:

0.8.1:
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g91}#).  Add #:declarative? #f to your define-module invocation.
Loading /gnu/store/w6rlja8v65dwv16ivcqx513q7827n6aq-shepherd.conf.
herd: exception caught while executing 'load' on service 'root':
In procedure string-append: Wrong type (expecting string): #f

No /run/user/1000/shepherd/socket is created. 

0.9.3:
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g117}#).  Add #:declarative? #f to your define-module invocation.
wrong-type-arg("string-append" "Wrong type (expecting ~A): ~S" ("string" #f) (#f))

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.

No /run/user/1000/shepherd/socket is created.

0.10.1:
Starting service root...
Service root started.
Service root running with value #t.
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g107}#).  Add #:declarative? #f to your define-module invocation.
wrong-type-arg("string-append" "Wrong type (expecting ~A): ~S" ("string" #f) (#f))

No /run/user/1000/shepherd/socket is created.

0.10.99:
Starting service root...
Service root started.
Service root running with value #t.
Service root has been started.
WARNING: Use of `load' in declarative module (#{ g119}#).  Add #:declarative? #f to your define-module invocation.
Uncaught exception while loading configuration file '/gnu/store/w6rlja8v65dwv16ivcqx513q7827n6aq-shepherd.conf': (wrong-type-arg "string-append" "Wrong type (expecting ~A): ~S" ("string" #f) (#f))

, and then the reconfiguration hangs. /run/user/1000/shepherd/socket is created, and herd status shows that root is started, other services are not shown, and are not started.


Content of config (/gnu/store/w6rlja8v65dwv16ivcqx513q7827n6aq-shepherd.conf):
(begin
  (use-modules (srfi srfi-34) (system repl error-handling))
  (apply register-services
         (map (lambda (file) (load file))
              (quote ("/gnu/store/71n4r0hccps574aqcks7zyk5rz5zardq-shepherd-eww.scm"
                      "/gnu/store/0r14z4psnf9h2nfqiflm0nv6m2bv04si-shepherd-eww-open-lockscreen-like-background.scm"
                      "/gnu/store/ylidynn5akvk3lmqrxbgqkz0c8hn3y8c-shepherd-syncthing.scm"
                      "/gnu/store/9igwbpbwavl6r94ph7qss7i5cqq9d8nj-shepherd-mcron.scm"))))
  (action (quote root) (quote daemonize))
  (format #t "Starting services...~%")
  (let ((services-to-start
         (quote (eww eww-open-lockscreen-like-background syncthing mcron))))
    (if (defined? (quote start-in-the-background))
        (start-in-the-background services-to-start)
        (for-each start services-to-start))
    (redirect-port (open-input-file "/dev/null") (current-input-port))))

~/.local/state/log/shepherd.log does not contain anything that's not already in the messages above.

Is there anything else I can provide? Without a running shepherd, my system doesn't work super well.
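
For context, the wrong-type-arg shown above is what Guile's
'string-append' raises when one of its arguments is #f, for instance
the result of 'getenv' on an unset variable.  Which expression in the
loaded service files triggers it is not shown in this report, so the
following only illustrates the failure mode and one way to guard
against it; 'env-path' is a hypothetical helper, not part of Shepherd
or Guix.

```scheme
;; Hypothetical illustration: 'getenv' returns #f for an unset variable,
;; and 'string-append' rejects #f with a wrong-type-arg exception.
(define (env-path variable suffix)
  ;; Return the value of VARIABLE with SUFFIX appended, or #f when
  ;; VARIABLE is unset, instead of raising wrong-type-arg.
  (let ((value (getenv variable)))
    (and value (string-append value suffix))))

;; Example call; the variable and suffix are placeholders.
(env-path "HOME" "/.guix-profile/bin/pinentry")
```

A caller can then test the result for #f and report the missing
variable rather than crash while the configuration file is loading.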





* bug#63982: Service hangs in 'starting' with Shepherd 0.10 (was: Shepherd can crash when a user service fails to start)
  2023-06-14 15:57     ` Ludovic Courtès
@ 2023-06-19  1:42       ` Maxim Cournoyer
  2023-06-21 14:20         ` bug#63982: Shepherd can crash when a user service fails to start Ludovic Courtès
  2023-06-22 21:35         ` Ludovic Courtès
  0 siblings, 2 replies; 13+ messages in thread
From: Maxim Cournoyer @ 2023-06-19  1:42 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 63982

[-- Attachment #1: Type: text/plain, Size: 5738 bytes --]

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>>> I believe this is fixed by Shepherd commit
>>> 24c964021ebd3d63ce6e22808dd09dbe16116a6c, which introduces an additional
>>> change: loading the config file asynchronously.
>>
>> Nitpick: I'd use a git message tag for 'Reported-by', as can be inserted
>> in the commit buffer in Magit with C-c C-p.  They should be placed at
>> the bottom of the git message to be considered by tools parsing them.
>
> Neat, I didn’t know about it, I’ll do that now (I think I started using
> the “Reported by” convention before Git came into existence…).
>
>>> If you wish to test it, you can use the ‘shepherd’ channel.
>>
>> I've done so by placing in my ~/.config/guix/channels.scm file:
>>
>>        (channel
>>         (name 'shepherd)
>>         (url "https://git.savannah.gnu.org/git/shepherd.git")
>>         (introduction
>>          (make-channel-introduction
>>           "788a6d6f1d5c170db68aa4bbfb77024fdc468ed3"  ;2022-05-21
>>           (openpgp-fingerprint
>>            "3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5"))))
>>
>>
>> It'd be nice to have this in the Shepherd doc for easy copy & paste.
>
> I’ll add that to ‘README’.

Neat, thank you.

>>> Let me know how it goes!
>>
>> I've edited my ~/.xsession file to use
>> /gnu/store/ahzl8vxxcd5bqlljwgn8wkp4884sr72l-shepherd-0.10.99-tarball,
>> and I'm now seeing this:
>>
>> $ herd status
>> Started:
>>  + root
>> Starting:
>>  ^ emacs
>> Stopped:
>>  - gpg-agent
>>  - ibus-daemon
>>  - jackd
>>  - workrave
>
> Uh, so it remains in “starting” state?

Yes!  Which is surprising, because it's actually running fine, and
Shepherd 0.9.3 didn't have this issue (perhaps because it only knew of
two service states, started and stopped).

The other surprising thing is that because it thinks that Emacs hasn't
finished starting, it doesn't even attempt to try starting the other
services; they remain stopped although they should work.


[...]

> Looks like it.  Could you share ~/.local/var/log/shepherd.log?

I have something a bit more detailed, with various versions (the logs
are under ~/.local/state/shepherd/shepherd.log by default).  If you need
to, you should be able to reproduce on your end using the attached
~/.config/shepherd/{init.scm,services.scm} files (and ensuring the
service commands are on your PATH):

--8<---------------cut here---------------start------------->8---
Using /gnu/store/dblbnj1yra4yrrfjbnzsa0ldcl3170ap-shepherd-0.9.1/bin/shepherd

$ herd status
Started:
 + emacs
 + gpg-agent
 + ibus-daemon
 + jackd
 + root
 + workrave

Using /gnu/store/cdc1gzbp3q15kdiwn2i5j3437jwx61ac-shepherd-0.9.2/bin/shepherd

$ herd status
Started:
 + emacs
 + gpg-agent
 + ibus-daemon
 + jackd
 + root
 + workrave

Using /gnu/store/a9jdd8kgckwlq97yw3pjqs6sy4lqgrfq-shepherd-0.9.3/bin/shepherd

$ herd status
Started:
 + emacs
 + gpg-agent
 + ibus-daemon
 + jackd
 + root
 + workrave

~/.local/state/shepherd/shepherd.log:

2023-06-18 21:04:47 Service root has been started.
2023-06-18 21:04:57 Service emacs has been started.
2023-06-18 21:04:57 Service jackd has been started.
2023-06-18 21:04:57 Service gpg-agent has been started.
2023-06-18 21:04:57 Service ibus-daemon has been started.
2023-06-18 21:04:57 Service workrave has been started.

Using /gnu/store/ahzl8vxxcd5bqlljwgn8wkp4884sr72l-shepherd-0.10.99-tarball/bin/shepherd

$ herd status
Started:
 + root
Starting:
 ^ emacs
Stopped:
 - gpg-agent
 - ibus-daemon
 - jackd
 - workrave

~/.local/state/shepherd/shepherd.log:

2023-06-18 21:06:12 Starting service root...
2023-06-18 21:06:12 Service root started.
2023-06-18 21:06:12 Service root running with value #t.
2023-06-18 21:06:12 Service root has been started.
2023-06-18 21:06:12 Starting service emacs...
2023-06-18 21:06:12 [bash] 
2023-06-18 21:06:12 [bash] Warning: due to a long standing Gtk+ bug
2023-06-18 21:06:12 [bash] https://gitlab.gnome.org/GNOME/gtk/issues/221
2023-06-18 21:06:12 [bash] Emacs might crash when run in daemon mode and the X11 connection is unexpectedly lost.
2023-06-18 21:06:12 [bash] Using an Emacs configured with --with-x-toolkit=lucid does not have this problem.
2023-06-18 21:06:13 [bash] Loading time (native compiled elisp)...
2023-06-18 21:06:13 [bash] Loading time (native compiled elisp)...done
2023-06-18 21:06:13 [bash] Loading /home/maxim/.emacs.d/recentf...
2023-06-18 21:06:13 [bash] Loading /home/maxim/.emacs.d/recentf...done
2023-06-18 21:06:13 [bash] Cleaning up the recentf list...
2023-06-18 21:06:13 [bash] Cleaning up the recentf list...done (0 removed)
2023-06-18 21:06:13 [bash] .emacs: Warning: Use keywords rather than deprecated positional arguments to `define-minor-mode'
2023-06-18 21:06:15 [bash] Preparing diary...
2023-06-18 21:06:15 [bash] No diary entries for Sunday, June 18, 2023: Father's Day
2023-06-18 21:06:15 [bash] Preparing diary...done
2023-06-18 21:06:15 [bash] Appointment reminders enabled
2023-06-18 21:06:16 [bash] Loading /home/maxim/.emacs.d/emms/cache...
2023-06-18 21:06:16 [bash] Loading /home/maxim/.emacs.d/emms/cache...done
2023-06-18 21:06:18 [bash] [yas] Prepared just-in-time loading of snippets successfully.
2023-06-18 21:06:20 [bash] [yas] Prepared just-in-time loading of snippets successfully.
2023-06-18 21:06:22 [bash] Starting new Ispell process aspell with english dictionary... \ 
2023-06-18 21:06:22 [bash] Starting new Ispell process aspell with english dictionary...done
2023-06-18 21:06:22 [bash] Starting Emacs daemon.
--8<---------------cut here---------------end--------------->8---


[-- Attachment #2: init.scm --]
[-- Type: application/octet-stream, Size: 417 bytes --]

;;; Shepherd User Services
(load "services.scm")

(register-services
 emacs
 gpg-agent
 jackd
 ibus-daemon
 workrave)

;; Send shepherd into the background.
(action 'shepherd 'daemonize)

;;; FIXME: All disabled because of this bug: https://issues.guix.gnu.org/63982
;; Services to start when shepherd starts:
(for-each start '(emacs
                  jackd
		  gpg-agent
		  ibus-daemon
                  workrave))

[-- Attachment #3: services.scm --]
[-- Type: application/octet-stream, Size: 1178 bytes --]

(define emacs
  (make <service>
    #:provides '(emacs)
    #:requires '()
    #:start (make-system-constructor "emacs --daemon")
    #:stop (make-system-destructor "emacsclient --eval \"(kill-emacs)\"")))

(define ibus-daemon
  (make <service>
    #:provides '(ibus-daemon)
    #:requires '()
    #:start (make-system-constructor "ibus-daemon --xim --daemonize --replace")
    #:stop (make-system-destructor "pkill ibus-daemon")))

(define jackd
  (make <service>
    #:provides '(jackd)
    #:requires '()
    #:start (make-system-constructor "jackd -d alsa &")
    #:stop (make-system-destructor "pkill jackd")))

(define gpg-agent
  (let ((pinentry (string-append (getenv "HOME")
				 "/.guix-profile/bin/pinentry")))
    (make <service>
      #:provides '(gpg-agent)
      #:requires '()
      #:start (make-system-constructor
	       (string-append "gpg-agent --daemon "
			      "--pinentry-program " pinentry))
      #:stop (make-system-destructor "gpgconf --kill gpg-agent"))))

(define workrave
  (make <service>
    #:provides '(workrave)
    #:requires '()
    #:start (make-system-constructor "workrave &")
    #:stop (make-system-destructor "pkill -9 workrave")))

[-- Attachment #4: Type: text/plain, Size: 19 bytes --]


-- 
Thanks,
Maxim


* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-19  1:42       ` bug#63982: Service hangs in 'starting' with Shepherd 0.10 (was: Shepherd can crash when a user service fails to start) Maxim Cournoyer
@ 2023-06-21 14:20         ` Ludovic Courtès
  2023-06-22 21:35         ` Ludovic Courtès
  1 sibling, 0 replies; 13+ messages in thread
From: Ludovic Courtès @ 2023-06-21 14:20 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 63982

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> The other surprising thing is that because it thinks that Emacs hasn't
> finished starting, it doesn't even attempt to try starting the other
> services; they remain stopped although they should work.

This is because you’re starting them sequentially with:

  (for-each start …)

If you instead use ‘start-in-the-background’, it’ll start them in
parallel.

(BTW, you might want to use the new interface eventually:
<https://gnu.org/s/shepherd/manual/html_node/Legacy-GOOPS-Interface.html>.)

> Using /gnu/store/ahzl8vxxcd5bqlljwgn8wkp4884sr72l-shepherd-0.10.99-tarball/bin/shepherd
>
> $ herd status
> Started:
>  + root
> Starting:
>  ^ emacs
> Stopped:
>  - gpg-agent
>  - ibus-daemon
>  - jackd
>  - workrave
>
> ~/.local/state/shepherd/shepherd.log:
>
> 2023-06-18 21:06:12 Starting service root...
> 2023-06-18 21:06:12 Service root started.
> 2023-06-18 21:06:12 Service root running with value #t.
> 2023-06-18 21:06:12 Service root démarré.
> 2023-06-18 21:06:12 Starting service emacs...
> 2023-06-18 21:06:12 [bash] 
> 2023-06-18 21:06:12 [bash] Warning: due to a long standing Gtk+ bug
> 2023-06-18 21:06:12 [bash] https://gitlab.gnome.org/GNOME/gtk/issues/221
> 2023-06-18 21:06:12 [bash] Emacs might crash when run in daemon mode and the X11 connection is unexpectedly lost.
> 2023-06-18 21:06:12 [bash] Using an Emacs configured with --with-x-toolkit=lucid does not have this problem.
> 2023-06-18 21:06:13 [bash] Loading time (native compiled elisp)...
> 2023-06-18 21:06:13 [bash] Loading time (native compiled elisp)...done
> 2023-06-18 21:06:13 [bash] Loading /home/maxim/.emacs.d/recentf...
> 2023-06-18 21:06:13 [bash] Loading /home/maxim/.emacs.d/recentf...done
> 2023-06-18 21:06:13 [bash] Cleaning up the recentf list...
> 2023-06-18 21:06:13 [bash] Cleaning up the recentf list...done (0 removed)
> 2023-06-18 21:06:13 [bash] .emacs: Warning: Use keywords rather than deprecated positional arguments to `define-minor-mode'
> 2023-06-18 21:06:15 [bash] Preparing diary...
> 2023-06-18 21:06:15 [bash] No diary entries for Sunday, June 18, 2023: Father's Day
> 2023-06-18 21:06:15 [bash] Preparing diary...done
> 2023-06-18 21:06:15 [bash] Appointment reminders enabled
> 2023-06-18 21:06:16 [bash] Loading /home/maxim/.emacs.d/emms/cache...
> 2023-06-18 21:06:16 [bash] Loading /home/maxim/.emacs.d/emms/cache...done
> 2023-06-18 21:06:18 [bash] [yas] Prepared just-in-time loading of snippets successfully.
> 2023-06-18 21:06:20 [bash] [yas] Prepared just-in-time loading of snippets successfully.
> 2023-06-18 21:06:22 [bash] Starting new Ispell process aspell with english dictionary... \ 
> 2023-06-18 21:06:22 [bash] Starting new Ispell process aspell with english dictionary...done
> 2023-06-18 21:06:22 [bash] Starting Emacs daemon.

And what’s the process tree like, if you run “pstree -p N” where N is
the PID of shepherd?

It looks as though ‘bash -c "emacs --daemon"’ didn’t terminate, which is
what’s needed to transition from “starting” to “running”.
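As an analogy only (this is not Shepherd's code), a constructor built
from a shell command behaves much like system(3): the caller cannot
learn whether the start succeeded until the shell command itself exits.
The helper name below is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run COMMAND through the shell and wait for it to
   exit.  Returns 1 when the command exited with status 0, else 0.  The
   call blocks for as long as the shell command keeps running, which is
   why a command that never terminates never reports success.  */
static int
start_command_succeeded (const char *command)
{
  assert (command != NULL);

  int status = system (command);
  return status != -1 && WIFEXITED (status) && WEXITSTATUS (status) == 0;
}
```

With a command such as "emacs --daemon", the wait ends only once the
foreground process has exited after detaching the daemon.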

Could you ‘strace -f -s 100 -o /tmp/log.strace shepherd’, keeping only
the ‘emacs’ service?

Thanks,
Ludo’.





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-18 15:14 ` bug#63982: Shepherd wrong-type-arg nils
@ 2023-06-22 20:08   ` Ludovic Courtès
  2023-06-25 13:03     ` nils
  0 siblings, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2023-06-22 20:08 UTC (permalink / raw)
  To: nils; +Cc: 63982@debbugs.gnu.org

Hi,

nils@landt.email writes:

> 0.10.99:
> Starting service root...
> Service root started.
> Service root running with value #t.
> Service root has been started.
> WARNING: Use of `load' in declarative module (#{ g119}#).  Add #:declarative? #f to your define-module invocation.
> Uncaught exception while loading configuration file '/gnu/store/w6rlja8v65dwv16ivcqx513q7827n6aq-shepherd.conf': (wrong-type-arg "string-append" "Wrong type (expecting ~A): ~S"
>  ("string" #f) (#f))
>
> , and then the reconfiguration hangs. /run/user/1000/shepherd/socket is created, and herd status shows that root is started, other services are not shown, and are not started.
>
>
> Content of config (/gnu/store/w6rlja8v65dwv16ivcqx513q7827n6aq-shepherd.conf):
> (begin (use-modules (srfi srfi-34) (system repl error-handling)) (apply register-services (map (lambda (file) (load file)) (quote ("/gnu/store/71n4r0hccps574aqcks7zyk5rz5zardq-
> shepherd-eww.scm" "/gnu/store/0r14z4psnf9h2nfqiflm0nv6m2bv04si-shepherd-eww-open-lockscreen-like-background.scm" "/gnu/store/ylidynn5akvk3lmqrxbgqkz0c8hn3y8c-shepherd-syncthing
> .scm" "/gnu/store/9igwbpbwavl6r94ph7qss7i5cqq9d8nj-shepherd-mcron.scm")))) (action (quote root) (quote daemonize)) (format #t "Starting services...~%") (let ((services-to-start
> (quote (eww eww-open-lockscreen-like-background syncthing mcron)))) (if (defined? (quote start-in-the-background)) (start-in-the-background services-to-start) (for-each start
> services-to-start)) (redirect-port (open-input-file "/dev/null") (current-input-port))))

This suggests a problem in the config file: one of the shepherd-*.scm
files listed above ends up calling (string-append #f …).

We’d need to see those files to understand what’s happening but it looks
different from what Maxim reported.

Thanks,
Ludo’.





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-19  1:42       ` bug#63982: Service hangs in 'starting' with Shepherd 0.10 (was: Shepherd can crash when a user service fails to start) Maxim Cournoyer
  2023-06-21 14:20         ` bug#63982: Shepherd can crash when a user service fails to start Ludovic Courtès
@ 2023-06-22 21:35         ` Ludovic Courtès
  2023-06-26 15:53           ` Maxim Cournoyer
  2023-07-12 17:46           ` Ludovic Courtès
  1 sibling, 2 replies; 13+ messages in thread
From: Ludovic Courtès @ 2023-06-22 21:35 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 63982

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> Uh, so it remains in “starting” state?
>
> Yes!

Turns out that this happens when calling the ‘daemonize’ action on
‘root’.  I have a reproducer now and am investigating…

Ludo’.





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-22 20:08   ` bug#63982: Shepherd can crash when a user service fails to start Ludovic Courtès
@ 2023-06-25 13:03     ` nils
  0 siblings, 0 replies; 13+ messages in thread
From: nils @ 2023-06-25 13:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 63982@debbugs.gnu.org

> Ludovic Courtès <ludo@gnu.org> wrote on 22.06.2023 22:08 CEST:
> This suggests a problem in the config file: one of the shepherd-*.scm
> files listed above ends up calling (string-append #f …).
> 
> We’d need to see those files to understand what’s happening but it looks
> different from what Maxim reported.

Indeed, I misdiagnosed the issue because it appeared right after a Guix upgrade.
I used $XDG_LOG_HOME in my Shepherd services, and as of commit
f74df2ab879fc5457982bbc85b7455a90e82317d it is no longer set by default,
which explains the #f passed to string-append.
Thanks for your help!
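
A guard against an unset variable avoids this class of failure.  The
sketch below uses C to match the reproducer later in this thread; in a
Shepherd service file the same idea would be (or (getenv "XDG_LOG_HOME") …).
The helper name and the fallback path under $HOME are assumptions, not
something stated in this report.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: store the log directory in BUF.  Uses
   $XDG_LOG_HOME when it is set and non-empty; otherwise falls back to
   "$HOME/.local/state/log" (an assumed default).  Returns 0 on success,
   -1 if no variable is usable or BUF is too small.  */
static int
log_directory (char *buf, size_t size)
{
  assert (buf != NULL && size > 0);

  const char *log_home = getenv ("XDG_LOG_HOME");
  int written;

  if (log_home != NULL && log_home[0] != '\0')
    written = snprintf (buf, size, "%s", log_home);
  else
    {
      const char *home = getenv ("HOME");
      if (home == NULL)
        return -1;
      written = snprintf (buf, size, "%s/.local/state/log", home);
    }

  /* snprintf reports the length it wanted; reject truncated results.  */
  return (written >= 0 && (size_t) written < size) ? 0 : -1;
}
```

The point is that the unset case is handled explicitly instead of being
passed on to string concatenation.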





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-22 21:35         ` Ludovic Courtès
@ 2023-06-26 15:53           ` Maxim Cournoyer
  2023-07-12 17:46           ` Ludovic Courtès
  1 sibling, 0 replies; 13+ messages in thread
From: Maxim Cournoyer @ 2023-06-26 15:53 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 63982

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> Uh, so it remains in “starting” state?
>>
>> Yes!
>
> Turns out that this happens when calling the ‘daemonize’ action on
> ‘root’.  I have a reproducer now and am investigating…

Great, thanks for investigating and let me know if I can provide
something useful.  It seems introducing cooperative scheduling is a path
layered with traps, eh :-).

-- 
Thanks,
Maxim





* bug#63982: Shepherd can crash when a user service fails to start
  2023-06-22 21:35         ` Ludovic Courtès
  2023-06-26 15:53           ` Maxim Cournoyer
@ 2023-07-12 17:46           ` Ludovic Courtès
  2023-07-19  1:11             ` Maxim Cournoyer
  1 sibling, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2023-07-12 17:46 UTC (permalink / raw)
  To: Maxim Cournoyer; +Cc: 63982

[-- Attachment #1: Type: text/plain, Size: 1267 bytes --]

Hi!

Ludovic Courtès <ludo@gnu.org> writes:

> Turns out that this happens when calling the ‘daemonize’ action on
> ‘root’.  I have a reproducer now and am investigating…

Good news: this is fixed in Shepherd commit
f4272d2f0f393d2aa3e9d76b36ab6aa5f2fc72c2!

The root cause is inconsistent semantics when mixing epoll, signalfd,
and fork, specifically this part from signalfd(2):

   epoll(7) semantics
       If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
       epoll(7) instance, then epoll_wait(2) returns events only for signals
       sent to that process.  In particular, if the process then uses fork(2)
       to create a child process, then the child will be able to read(2)
       signals that are sent to it using the signalfd file descriptor, but
       epoll_wait(2) will not indicate that the signalfd file descriptor is
       ready.  In this scenario, a possible workaround is that after the
       fork(2), the child process can close the signalfd file descriptor that
       it inherited from the parent process and then create another signalfd
       file descriptor and add it to the epoll instance. […]

The C program below illustrates this behavior:


[-- Attachment #2: The C program. --]
[-- Type: text/plain, Size: 1472 bytes --]

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/epoll.h>

int
main ()
{
  int ep, sfd;

  sigset_t signals;
  sigemptyset (&signals);
  sigaddset (&signals, SIGINT);
  sigaddset (&signals, SIGHUP);

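  /* Block the signals so that they are reported only through the signalfd.  */
  sigprocmask (SIG_BLOCK, &signals, NULL);
  sfd = signalfd (-1, &signals, SFD_CLOEXEC);

  ep = epoll_create1 (EPOLL_CLOEXEC);

  /* Register the signalfd with epoll in the parent, before forking.  */
  struct epoll_event events = { .events = EPOLLIN | EPOLLONESHOT, .data = { .ptr = NULL } };
  epoll_ctl (ep, EPOLL_CTL_ADD, sfd, &events);

  epoll_wait (ep, &events, 1, 123);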

  if (fork () == 0)
    {
      /* Quoth signalfd(2):

         If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an
         epoll(7) instance, then epoll_wait(2) returns events only for signals
         sent to that process.  In particular, if the process then uses fork(2)
         to create a child process, then the child will be able to read(2)
         signals that are sent to it using the signalfd file descriptor, but
         epoll_wait(2) will not indicate that the signalfd file descriptor is
         ready.  */

      printf ("try this: kill -INT %i\n", getpid ());
      while (1)
	{
	  struct signalfd_siginfo info;
	  if (epoll_wait (ep, &events, 1, 777) > 0)
	    {
	      read (sfd, &info, sizeof info);
	      printf ("got signal %i!\n", info.ssi_signo);
	      epoll_ctl (ep, EPOLL_CTL_MOD, sfd, &events);
	    }
	}
    }

  return 0;
}

[-- Attachment #3: Type: text/plain, Size: 218 bytes --]


Of course it took me a while to find out about this; I first looked at
things individually and didn’t expect the mixture to behave
inconsistently.

Maxim, let me know if it works for you!

Thanks,
Ludo’.


* bug#63982: Shepherd can crash when a user service fails to start
  2023-07-12 17:46           ` Ludovic Courtès
@ 2023-07-19  1:11             ` Maxim Cournoyer
  0 siblings, 0 replies; 13+ messages in thread
From: Maxim Cournoyer @ 2023-07-19  1:11 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 63982-done

Hey Ludo!

Ludovic Courtès <ludo@gnu.org> writes:

> Hi!
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Turns out that this happens when calling the ‘daemonize’ action on
>> ‘root’.  I have a reproducer now and am investigating…
>
> Good news: this is fixed in Shepherd commit
> f4272d2f0f393d2aa3e9d76b36ab6aa5f2fc72c2!
>
> The root cause is inconsistent semantics when mixing epoll, signalfd,
> and fork, specifically this part from signalfd(2):
>
> [...]
>
> Of course it took me a while to find out about this; I first looked at
> things individually and didn’t expect the mixture to behave
> inconsistently.

Tricky!  Thanks for sharing the result of your investigation, it's
always enlightening!

> Maxim, let me know if it works for you!

Better than ever!  Thanks a lot for fixing the various issues reported
here.

I'm closing this one!

-- 
Thanks,
Maxim





Thread overview: 13+ messages
2023-06-09 17:13 bug#63982: Shepherd can crash when a user service fails to start Maxim Cournoyer
2023-06-12 13:44 ` Ludovic Courtès
2023-06-12 17:32   ` Maxim Cournoyer
2023-06-14 15:57     ` Ludovic Courtès
2023-06-19  1:42       ` bug#63982: Service hangs in 'starting' with Shepherd 0.10 (was: Shepherd can crash when a user service fails to start) Maxim Cournoyer
2023-06-21 14:20         ` bug#63982: Shepherd can crash when a user service fails to start Ludovic Courtès
2023-06-22 21:35         ` Ludovic Courtès
2023-06-26 15:53           ` Maxim Cournoyer
2023-07-12 17:46           ` Ludovic Courtès
2023-07-19  1:11             ` Maxim Cournoyer
2023-06-18 15:14 ` bug#63982: Shepherd wrong-type-arg nils
2023-06-22 20:08   ` bug#63982: Shepherd can crash when a user service fails to start Ludovic Courtès
2023-06-25 13:03     ` nils
