unofficial mirror of guix-devel@gnu.org 
* Improving Shepherd
@ 2018-01-29 21:14 Carlo Zancanaro
  2018-01-29 22:27 ` Jelle Licht
  2018-02-05 10:49 ` Carlo Zancanaro
  0 siblings, 2 replies; 18+ messages in thread
From: Carlo Zancanaro @ 2018-01-29 21:14 UTC
  To: guix-devel

I'm keen to do some work on shepherd. Partially this is driven by 
me using it to manage my user session and having it not always 
work right, and partially this is driven by me grepping the code 
for "FIXME" (which was slightly overwhelming). If anyone is keen 
to chat about it on Friday, please find me! I have some ideas 
about things I'd like to do, but I don't really have any idea what 
I'm doing. Any help/advice/encouragement you can give me will be 
appreciated!

Carlo


* Re: Improving Shepherd
  2018-01-29 21:14 Improving Shepherd Carlo Zancanaro
@ 2018-01-29 22:27 ` Jelle Licht
  2018-02-05 10:49 ` Carlo Zancanaro
  1 sibling, 0 replies; 18+ messages in thread
From: Jelle Licht @ 2018-01-29 22:27 UTC
  To: Carlo Zancanaro; +Cc: guix-devel

2018-01-29 22:14 GMT+01:00 Carlo Zancanaro <carlo@zancanaro.id.au>:

> I'm keen to do some work on shepherd. Partially this is driven by me using
> it to manage my user session and having it not always work right, and
> partially this is driven by me grepping the code for "FIXME" (which was
> slightly overwhelming). If anyone is keen to chat about it on Friday,
> please find me! I have some ideas about things I'd like to do, but I don't
> really have any idea what I'm doing. Any help/advice/encouragement you can
> give me will be appreciated!
>

Count me in! I am not using GNU Shepherd for my user session yet, but I
would like to collaborate on a future direction for making it easier to
use.
I'll only be there after/around lunch though ;-).

>
> Carlo
>
- Jelle


* Re: Improving Shepherd
  2018-01-29 21:14 Improving Shepherd Carlo Zancanaro
  2018-01-29 22:27 ` Jelle Licht
@ 2018-02-05 10:49 ` Carlo Zancanaro
  2018-02-05 13:08   ` Ludovic Courtès
  2018-02-05 16:00   ` Danny Milosavljevic
  1 sibling, 2 replies; 18+ messages in thread
From: Carlo Zancanaro @ 2018-02-05 10:49 UTC
  To: guix-devel

A few people came to join me on Friday to think about Shepherd. 
Thanks Alex, Efraim, and Jelle.

We talked about a few different things that we'd like to achieve 
with Shepherd. The most significant and achievable things were, I 
think: user services, child process control, and 
concurrency/parallelism.

User services - Alex has already sent a patch to the list to allow 
generating user services from the Guix side. The idea is to 
generate a Shepherd config file, allowing a user to invoke 
shepherd manually to start their services. A further extension to 
this would be to have something like systemd's "user sessions", 
where the pid 1 Shepherd automatically starts a user's services 
when they log in.

Child process control - this is my personal frustration, where 
Shepherd loses track of processes that fork away (e.g. "emacs 
--daemon"). I barely know anything about Linux process management, 
but from my reading this can be solved through Linux namespaces 
(if user namespaces are available). Could someone who knows more 
about this let me know if that's a productive direction for me to 
investigate? Or tell me a better way to go about it?

Concurrency/parallelism - I think Jelle was planning to work on 
this, but I might be wrong about that. Maybe I volunteered? We're 
keen to see Shepherd starting services in parallel, where 
possible. This will require some changes to the way we start/stop 
services (because at the moment we just send a "start" signal to a 
single service to start it, which makes it hard to be parallel), 
and will require us to actually build some sort of real dependency 
resolution. Longer-term our goal should be to bring fibers into 
Shepherd, but Efraim mentioned that fibers doesn't compile on ARM 
at the moment, so we'll have to get that working first at least.
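
A rough sketch of what starting services in parallel from a dependency
graph could look like. It is in Python rather than Guile purely for
illustration; the `requirements` mapping and the `start` callback are
assumptions for the example, not Shepherd's API. Services whose
requirements are all satisfied are started together as one wave:

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def start_in_waves(requirements, start):
    """Start services in dependency order, one parallel wave at a time.

    requirements maps a service name to the names it requires
    (hypothetical input format). start is called once per service.
    Returns the list of waves that were started.
    """
    sorter = TopologicalSorter(requirements)
    sorter.prepare()  # raises graphlib.CycleError on circular requirements
    waves = []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            list(pool.map(start, ready))  # wait for the whole wave to finish
            sorter.done(*ready)
            waves.append(ready)
    return waves
```

Waiting for a whole wave is the simplest scheme; a finer-grained version
would call `done` as each service comes up so that its dependents can
start without waiting for unrelated services.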

I mentioned an idea to the guys on Friday about how Shepherd 
should treat enabled/disabled services. I've thought about it some 
more, and I think it might work. The general idea is that Shepherd 
would always try to run an enabled service, and it would leave a 
disabled service as-is (unless it's needed to start another 
service). So it would kind of work like this:
 - if stopped and enabled: try to start service
 - if started and enabled: monitor, and restart service if it 
 fails
 - if retrying too often: disable this service, and all which 
 depend on it
 - else: only start if another enabled service depends on this one

This would mean that Shepherd could decide the best way to 
start/stop services, including doing so in parallel if possible.
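
The rules above can be written down as a small decision function. This
is only a sketch in Python for illustration: the `Status` fields, the
action names and the `MAX_RESPAWNS` threshold are all made up for the
example and do not correspond to anything in Shepherd today.

```python
from dataclasses import dataclass

MAX_RESPAWNS = 5  # hypothetical threshold for "retrying too often"


@dataclass
class Status:
    running: bool
    enabled: bool
    recent_respawns: int = 0
    needed_by_enabled: bool = False  # does an enabled service depend on it?


def decide(status):
    """Return the action the proposed policy would take for one service."""
    if status.enabled:
        if status.recent_respawns > MAX_RESPAWNS:
            return "disable-with-dependents"
        return "monitor" if status.running else "start"
    # Disabled services are left as-is unless an enabled service needs them.
    if status.needed_by_enabled and not status.running:
        return "start"
    return "leave"
```

For example, `decide(Status(running=False, enabled=True))` yields
"start", while a stopped, disabled service that nothing depends on
yields "leave".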

So, there are our ideas! Any thoughts, or words of wisdom? 
Feedback is welcome.

Carlo

On Mon, Jan 29 2018, Carlo Zancanaro wrote:
> I'm keen to do some work on shepherd. Partially this is driven 
> by
> me using it to manage my user session and having it not always
> work right, and partially this is driven by me grepping the code
> for "FIXME" (which was slightly overwhelming). If anyone is keen
> to chat about it on Friday, please find me! I have some ideas
> about things I'd like to do, but I don't really have any idea 
> what
> I'm doing. Any help/advice/encouragement you can give me will be
> appreciated!
>
> Carlo


* Re: Improving Shepherd
  2018-02-05 10:49 ` Carlo Zancanaro
@ 2018-02-05 13:08   ` Ludovic Courtès
  2018-02-05 15:56     ` Carlo Zancanaro
  2018-02-10 13:34     ` Jelle Licht
  2018-02-05 16:00   ` Danny Milosavljevic
  1 sibling, 2 replies; 18+ messages in thread
From: Ludovic Courtès @ 2018-02-05 13:08 UTC
  To: Carlo Zancanaro; +Cc: guix-devel

Hello!

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> A few people came to join me on Friday to think about Shepherd. Thanks
> Alex, Efraim, and Jelle.

Thanks for summarizing!  I was hoping to chime in as well but that did
not happen.

> User services - Alex has already sent a patch to the list to allow
> generating user services from the Guix side. The idea is to generate a
> Shepherd config file, allowing a user to invoke shepherd manually to
> start their services. A further extension to this would be to have
> something like systemd's "user sessions", where the pid 1 Shepherd
> automatically starts a user's services when they log in.

After replying to Alex’ message, I realized that we could just as well
have a separate “guix service” or similar tool to take care of this?

This needs more thought (and perhaps taking a look at systemd user
sessions, which I’m not familiar with), but I think Alex’ approach is a
good starting point.

> Child process control - this is my personal frustration, where
> Shepherd loses track of processes that fork away (e.g. "emacs
> --daemon"). I barely know anything about Linux process management, but
> from my reading this can be solved through Linux namespaces (if user
> namespaces are available). Could someone who knows more about this let
> me know if that's a productive direction for me to investigate? Or
> tell me a better way to go about it?

Currently shepherd monitors SIGCHLD, and it’s not supposed to miss
those; in some cases it might handle them later than you’d expect, which
means that in the meantime you see a zombie process, but otherwise it
seems to work.

ISTR you reported an issue when using ‘shepherd --daemonize’, right?
Perhaps the issue is limited to that mode?

> Concurrency/parallelism - I think Jelle was planning to work on this,
> but I might be wrong about that. Maybe I volunteered? We're keen to
> see Shepherd starting services in parallel, where possible. This will
> require some changes to the way we start/stop services (because at the
> moment we just send a "start" signal to a single service to start it,
> which makes it hard to be parallel), and will require us to actually
> build some sort of real dependency resolution. Longer-term our goal
> should be to bring fibers into Shepherd, but Efraim mentioned that
> fibers doesn't compile on ARM at the moment, so we'll have to get that
> working first at least.

I’d really like to see that happen.  I’ve become more familiar with
Fibers, and I think it’ll be perfect for the Shepherd (and we’ll fix the
ARM build issue, no doubt.)

One thing I’d like to do is to handle SIGCHLD via signalfd(2) instead of
an actual signal handler like we do now.  That would make it easy to
have signal handling part of the main event loop and thus, it would
integrate well with Fibers.

It seems that signalfd(2) is Linux-only though, which is a bummer.  The
solution might be to get over it and have it implemented on GNU/Hurd…
(I saw this discussion:
<https://www.gnu.org/software/hurd/glibc/signal/signal_thread.html>; I
suspect it’s within reach.)

> I mentioned an idea to the guys on Friday about how Shepherd should
> treat enabled/disabled services. I've thought about it some more, and
> I think it might work. The general idea is that Shepherd would always
> try to run an enabled service, and it would leave a disabled service
> as-is (unless it's needed to start another service). So it would kind
> of work like this:
> - if stopped and enabled: try to start service
> - if started and enabled: monitor, and restart service if it fails
> - if retrying too often: disable this service, and all which depend on
> it
> - else: only start if another enabled service depends on this one
>
> This would mean that Shepherd could decide the best way to start/stop
> services, including doing so in parallel if possible.

Sounds good.  That’s annoyed most of us already, so if you get that
fixed, you’ll make a lot of people happy.  :-)

Ludo’.


* Re: Improving Shepherd
  2018-02-05 13:08   ` Ludovic Courtès
@ 2018-02-05 15:56     ` Carlo Zancanaro
  2018-02-09 13:26       ` Ludovic Courtès
  2018-02-10 13:34     ` Jelle Licht
  1 sibling, 1 reply; 18+ messages in thread
From: Carlo Zancanaro @ 2018-02-05 15:56 UTC
  To: Ludovic Courtès; +Cc: guix-devel

Hey Ludo,

On Mon, Feb 05 2018, Ludovic Courtès wrote:
>> User services - Alex has already sent a patch to the list to 
>> allow
>> generating user services from the Guix side. The idea is to 
>> generate a
>> Shepherd config file, allowing a user to invoke shepherd 
>> manually to
>> start their services. A further extension to this would be to 
>> have
>> something like systemd's "user sessions", where the pid 1 
>> Shepherd
>> automatically starts a user's services when they log in.
>
> After replying to Alex’ message, I realized that we could just 
> as well
> have a separate “guix service” or similar tool to take care of 
> this?
>
> This needs more thought (and perhaps taking a look at systemd 
> user
> sessions, which I’m not familiar with), but I think Alex’ 
> approach is a
> good starting point.

We were thinking it might work like this:
 - services->package constructs a package which places a file in 
 the profile containing the necessary references
 - pid 1 shepherd listens to elogind login/logout events, and 
 starts the services when necessary

Admittedly this isn't the nicest way for it to work, but it might 
be a good starting point.

There were some discussions on the list a while ago about how to 
have `guix environment` automatically start services, too, so I 
wonder what overlap there could be there. Although maybe 
environment services (in containers) have more in common with 
system services than user services.

>> Child process control - this is my personal frustration, where
>> Shepherd loses track of processes that fork away (e.g. "emacs
>> --daemon"). I barely know anything about Linux process 
>> management, but
>> from my reading this can be solved through Linux namespaces (if 
>> user
>> namespaces are available). Could someone who knows more about 
>> this let
>> me know if that's a productive direction for me to investigate? 
>> Or
>> tell me a better way to go about it?
>
> Currently shepherd monitors SIGCHLD, and it’s not supposed to 
> miss
> those; in some cases it might handle them later than you’d 
> expect, which
> means that in the meantime you see a zombie process, but 
> otherwise it
> seems to work.
>
> ISTR you reported an issue when using ‘shepherd --daemonize’, 
> right?
> Perhaps the issue is limited to that mode?

I no longer use the daemonize function. My user shepherd runs "in 
the foreground" (it's started when my X session starts), so it's 
not that. Jelle fixed the problem I was having by delaying the 
SIGCHLD handler registration until it's needed. It is still buggy 
if a process is started before the daemonize command is given to 
the root service, though.

If you try running "emacs --daemon" with 
"make-forkexec-constructor" (and #:pid-file, and put something in 
your emacs config to make it write out the pid) you should be able 
to reproduce what I am seeing. If you kill emacs (or if it 
crashes) then shepherd continues to report that it is started and 
running. When I look at htop's output I can also see that my emacs 
process is not a child of my shepherd process.

I would like to add a --daemon/--daemonize command line argument 
to shepherd instead of the current "send the root service a 
daemonize message". I think the use cases of turning it into a 
daemon later are limited, and it just gives you an additional way 
of shooting yourself in the foot.

>> Concurrency/parallelism - I think Jelle was planning to work on 
>> this,
>> but I might be wrong about that. Maybe I volunteered? We're 
>> keen to
>> see Shepherd starting services in parallel, where possible. 
>> This will
>> require some changes to the way we start/stop services (because 
>> at the
>> moment we just send a "start" signal to a single service to 
>> start it,
>> which makes it hard to be parallel), and will require us to 
>> actually
>> build some sort of real dependency resolution. Longer-term our 
>> goal
>> should be to bring fibers into Shepherd, but Efraim mentioned 
>> that
>> fibers doesn't compile on ARM at the moment, so we'll have to 
>> get that
>> working first at least.
>
> I’d really like to see that happen.  I’ve become more familiar 
> with
> Fibers, and I think it’ll be perfect for the Shepherd (and we’ll 
> fix the
> ARM build issue, no doubt.)

I'm not going to put much time/effort into this until we have 
fibers building on ARM. I think these changes are likely to break 
shepherd's config API, too. In particular, with higher levels of 
concurrency I want to move the mutable state out of <service> 
objects.

> It seems that signalfd(2) is Linux-only though, which is a 
> bummer.  The
> solution might be to get over it and have it implemented on 
> GNU/Hurd…
> (I saw this discussion:
> <https://www.gnu.org/software/hurd/glibc/signal/signal_thread.html>; 
> I
> suspect it’s within reach.)

Failing that, could we have our signal handlers just convert the 
signal to a message in our event loop? I have a very rudimentary 
understanding of signal handling, but I assume we could have our 
main event loop just reading things off of two channels: one of 
signal events, one of fd events.
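
To make that idea concrete, here is a sketch of the "turn signals into
event-loop messages" approach, written in Python for illustration only.
It uses `signal.set_wakeup_fd`, which makes the runtime write the number
of each delivered signal to a file descriptor, so the loop sees signals
as ordinary readable data. The helper names and the "signal" tag are
assumptions for the example; this is not signalfd(2) and not how
Shepherd is implemented.

```python
import selectors
import signal
import socket


def make_signal_channel(signums):
    """Route the given signals into a socket that an event loop can poll."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    writer.setblocking(False)  # set_wakeup_fd requires a non-blocking fd
    for signum in signums:
        # A Python-level handler must be installed for the wakeup fd to fire.
        signal.signal(signum, lambda signum, frame: None)
    signal.set_wakeup_fd(writer.fileno())
    return reader, writer  # keep writer alive for as long as the loop runs


def next_events(selector, timeout=None):
    """Wait once; return (kind, payload) pairs for whatever became readable."""
    events = []
    for key, _ in selector.select(timeout):
        if key.data == "signal":
            # Each byte read is the number of one delivered signal.
            events += [("signal", n) for n in key.fileobj.recv(64)]
        else:
            events.append(("fd", key.fileobj))
    return events
```

The reader end is registered with the selector under the data value
"signal", next to whatever sockets the loop already watches, so signals
and fd activity come out of the same `select` call.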

>> This would mean that Shepherd could decide the best way to 
>> start/stop
>> services, including doing so in parallel if possible.
>
> Sounds good.  That’s annoyed most of us already, so if you get 
> that
> fixed, you’ll make a lot of people happy.  :-)

I'll have a go at this in the next few weeks. I'll be travelling 
until the end of February, so I'm not expecting much, but we'll 
see!

Carlo


* Re: Improving Shepherd
  2018-02-05 10:49 ` Carlo Zancanaro
  2018-02-05 13:08   ` Ludovic Courtès
@ 2018-02-05 16:00   ` Danny Milosavljevic
  2018-02-05 16:41     ` Carlo Zancanaro
  2018-02-09 13:22     ` Ludovic Courtès
  1 sibling, 2 replies; 18+ messages in thread
From: Danny Milosavljevic @ 2018-02-05 16:00 UTC
  To: Carlo Zancanaro; +Cc: guix-devel

Hi Carlo,

On Mon, 05 Feb 2018 21:49:08 +1100
Carlo Zancanaro <carlo@zancanaro.id.au> wrote:

> User services - Alex has already sent a patch to the list to allow 
> generating user services from the Guix side. The idea is to 
> generate a Shepherd config file, allowing a user to invoke 
> shepherd manually to start their services.

> A further extension to 
> this would be to have something like systemd's "user sessions", 
> where the pid 1 Shepherd automatically starts a user's services 
> when they log in.

I assume that means "starts a user's shepherd when they log in".

elogind already emits a signal on D-Bus which tells you when a user has logged in:

        return sd_bus_emit_signal(
                        u->manager->bus,
                        "/org/freedesktop/login1",
                        "org.freedesktop.login1.Manager",
                        new_user ? "UserNew" : "UserRemoved",
                        "uo", (uint32_t) u->uid, p);

Also, a directory /run/user/<id> appears - which alternatively can be
monitored by inotify or something.

So the system shepherd could have a shepherd service which does

  while (1) {
     wait until /run/user/<id> appears
     vfork
       if child: setuid, exec user shepherd, _exit
       if parent: wait until child dies
  }

We had better be sure that no one else can create directories in /run/user.

In non-pseudocode, both "wait until /run/user/<id> appears" and
"wait until child dies" would have to be in the same call,
maybe epoll or something.
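
A simplified sketch of one pass of such a loop, in Python for
illustration. To stay short it polls instead of blocking in epoll or
inotify: each call checks both conditions (new directories, dead
children) without waiting on either. The `spawn` callback and the
`children` table are assumptions for the example.

```python
import os


def nursery_step(run_dir, children, spawn):
    """One pass of the loop: start missing user shepherds, reap dead ones.

    children maps uid -> subprocess.Popen and is updated in place.
    spawn(uid) is a hypothetical callback that starts that user's
    shepherd and returns its Popen object.
    """
    present = {int(name) for name in os.listdir(run_dir) if name.isdigit()}
    for uid in present - set(children):
        children[uid] = spawn(uid)
    for uid, proc in list(children.items()):
        if proc.poll() is not None:  # child has exited; forget it
            del children[uid]
```

A real `spawn` would have to drop privileges before exec, for instance
with `subprocess.Popen(["shepherd"], user=uid)` on Python 3.9 or later;
that part is left out here because it needs root to try.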

Maybe call the service shepherd-nursery-service or something, like a star
nursery :)

> Child process control - this is my personal frustration, where 
> Shepherd loses track of processes that fork away (e.g. "emacs 
> --daemon"). I barely know anything about Linux process management, 
> but from my reading this can be solved through Linux namespaces 
> (if user namespaces are available). Could someone who knows more 
> about this let me know if that's a productive direction for me to 
> investigate? Or tell me a better way to go about it?

User namespaces just present a different set of names to your process
(via VFS) so it looks like a chroot basically.
It does nothing for processes except fake their ids and limit your
overview of them.

You probably want process groups (see setsid(2)) or maybe containers.
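
A minimal sketch of the process-group idea, in Python for illustration
(the function names are made up for the example). The service is started
as the leader of a new session, so its process group ID equals its PID
and the whole group can be signalled at once:

```python
import os
import signal
import subprocess


def start_in_own_session(argv):
    """Start argv as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def stop_group(proc, sig=signal.SIGTERM):
    """Signal every process still in the service's group, then reap it."""
    os.killpg(proc.pid, sig)
    return proc.wait()
```

This only reaches processes that stay in the group. A program that
calls setsid(2) itself while daemonizing leaves the group, so this is
not a complete answer to the forking-daemon problem.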


* Re: Improving Shepherd
  2018-02-05 16:00   ` Danny Milosavljevic
@ 2018-02-05 16:41     ` Carlo Zancanaro
  2018-02-09 13:22     ` Ludovic Courtès
  1 sibling, 0 replies; 18+ messages in thread
From: Carlo Zancanaro @ 2018-02-05 16:41 UTC
  To: Danny Milosavljevic; +Cc: guix-devel

Hey Danny,

On Mon, Feb 05 2018, Danny Milosavljevic wrote:
> I assume that means "starts a user's shepherd when they log in".

Either that, or run the services itself. In either case, what you 
have sent is very helpful!

> User namespaces just present a different set of names to your 
> process
> (via VFS) so it looks like a chroot basically.
> It does nothing for processes except fake their ids and limit 
> your
> overview of them.
>
> You probably want process groups (see setsid(2)) or maybe 
> containers.

Okay. I've been trying to read about containers/cgroups/namespaces 
and I think my mind has just blurred them all into the same thing. 
I'll read up about process groups. Thanks!

Carlo


* Re: Improving Shepherd
  2018-02-05 16:00   ` Danny Milosavljevic
  2018-02-05 16:41     ` Carlo Zancanaro
@ 2018-02-09 13:22     ` Ludovic Courtès
  2018-02-09 20:51       ` David Pirotte
  1 sibling, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2018-02-09 13:22 UTC
  To: Danny Milosavljevic; +Cc: guix-devel, Carlo Zancanaro

Hey!

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> On Mon, 05 Feb 2018 21:49:08 +1100
> Carlo Zancanaro <carlo@zancanaro.id.au> wrote:
>
>> User services - Alex has already sent a patch to the list to allow 
>> generating user services from the Guix side. The idea is to 
>> generate a Shepherd config file, allowing a user to invoke 
>> shepherd manually to start their services.
>
>> A further extension to 
>> this would be to have something like systemd's "user sessions", 
>> where the pid 1 Shepherd automatically starts a user's services 
>> when they log in.
>
> I assume that means "starts a user's shepherd when they log in".
>
> elogind already emits a signal on dbus which tells you when a user logged in
>
>         return sd_bus_emit_signal(
>                         u->manager->bus,
>                         "/org/freedesktop/login1",
>                         "org.freedesktop.login1.Manager",
>                         new_user ? "UserNew" : "UserRemoved",
>                         "uo", (uint32_t) u->uid, p);

I think there’s no Guile D-Bus client though.  Another yak to shave…

> Also, a directory /run/user/<id> appears - which alternatively can be
> monitored by inotify or something.
>
> So the system shepherd could have a shepherd service which does
>
>   while (1) {
>      wait until /run/user/<id> appears
>      vfork
>        if child: setuid, exec user shepherd, _exit
>        if parent: wait until child dies
>   }
>
> We better be sure that no one else can create directories in /run/user .
>
> In non-pseudocode, both "wait until /run/user/<id> appears" and
> "wait until child dies" would have to be in the same call,
> maybe epoll or something.

Yes, inotify (ISTR there *are* inotify bindings for Guile somewhere.)

> Maybe call the service shepherd-nursery-service or something, like a star
> nursery :)

:-)

Ludo’.


* Re: Improving Shepherd
  2018-02-05 15:56     ` Carlo Zancanaro
@ 2018-02-09 13:26       ` Ludovic Courtès
  2018-02-09 19:50         ` Carlo Zancanaro
  2018-02-09 21:32         ` Christopher Lemmer Webber
  0 siblings, 2 replies; 18+ messages in thread
From: Ludovic Courtès @ 2018-02-09 13:26 UTC
  To: Carlo Zancanaro; +Cc: guix-devel

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> Hey Ludo,
>
> On Mon, Feb 05 2018, Ludovic Courtès wrote:
>>> User services - Alex has already sent a patch to the list to allow
>>> generating user services from the Guix side. The idea is to
>>> generate a
>>> Shepherd config file, allowing a user to invoke shepherd manually
>>> to
>>> start their services. A further extension to this would be to have
>>> something like systemd's "user sessions", where the pid 1 Shepherd
>>> automatically starts a user's services when they log in.
>>
>> After replying to Alex’ message, I realized that we could just as
>> well
>> have a separate “guix service” or similar tool to take care of this?
>>
>> This needs more thought (and perhaps taking a look at systemd user
>> sessions, which I’m not familiar with), but I think Alex’ approach
>> is a
>> good starting point.
>
> We were thinking it might work like this:
> - services->package constructs a package which places a file in the
> profile containing the necessary references
> - pid 1 shepherd listens to elogind login/logout events, and starts
> the services when necessary
>
> Admittedly this isn't the nicest way for it to work, but it might be a
> good starting point.

Yes, sounds reasonable.

> There were some discussions on the list a while ago about how to have
> `guix environment` automatically start services, too, so I wonder what
> overlap there could be there. Although maybe environment services (in
> containers) have more in common with system services than user
> services.

That’s a separate topic I think, but I agree it’d be useful.

>> Currently shepherd monitors SIGCHLD, and it’s not supposed to miss
>> those; in some cases it might handle them later than you’d expect,
>> which
>> means that in the meantime you see a zombie process, but otherwise
>> it
>> seems to work.
>>
>> ISTR you reported an issue when using ‘shepherd --daemonize’, right?
>> Perhaps the issue is limited to that mode?
>
> I no longer use the daemonize function. My user shepherd runs "in the
> foreground" (it's started when my X session starts), so it's not
> that. Jelle fixed the problem I was having by delaying the SIGCHLD
> handler registration until it's needed. It is still buggy if a process
> is started before the daemonize command is given to the root service,
> though.
>
> If you try running "emacs --daemon" with "make-forkexec-constructor"
> (and #:pid-file, and put something in your emacs config to make it
> write out the pid) you should be able to reproduce what I am
> seeing. If you kill emacs (or if it crashes) then shepherd continues
> to report that it is started and running. When I look at htop's output
> I can also see that my emacs process is not a child of my shepherd
> process.
>
> I would like to add a --daemon/--daemonize command line argument to
> shepherd instead of the current "send the root service a daemonize
> message". I think the use cases of turning it into a daemon later are
> limited, and it just gives you an additional way of shooting yourself
> in the foot.

Also a separate topic ;-), but if you still experience a bug, please
report it and see whether you can provide a reduced test case to
reproduce it.

>> I’d really like to see that happen.  I’ve become more familiar with
>> Fibers, and I think it’ll be perfect for the Shepherd (and we’ll fix
>> the
>> ARM build issue, no doubt.)
>
> I'm not going to put much time/effort into this until we have fibers
> building on ARM.

Hopefully it’s nothing serious: Fibers doesn’t rely on anything
architecture-specific.

> I think these changes are likely to break shepherd's config API,
> too.

I’m not sure.  We may be able to keep the exact same API.  At least
that’s what I had in mind for the first Fibers-enabled Shepherd.

> In particular, with higher levels of concurrency I want to move the
> mutable state out of <service> objects.

The only piece of mutable state is the ‘running’ value.  We can make
that an “atomic box”, and users won’t even notice.

>> It seems that signalfd(2) is Linux-only though, which is a bummer.
>> The
>> solution might be to get over it and have it implemented on
>> GNU/Hurd…
>> (I saw this discussion:
>> <https://www.gnu.org/software/hurd/glibc/signal/signal_thread.html>;
>> I
>> suspect it’s within reach.)
>
> Failing that, could we have our signal handlers just convert the
> signal to a message in our event loop?

Yes, they could send a message on a Fibers channel.

Thanks,
Ludo’.


* Re: Improving Shepherd
  2018-02-09 13:26       ` Ludovic Courtès
@ 2018-02-09 19:50         ` Carlo Zancanaro
  2018-02-09 21:32         ` Christopher Lemmer Webber
  1 sibling, 0 replies; 18+ messages in thread
From: Carlo Zancanaro @ 2018-02-09 19:50 UTC
  To: Ludovic Courtès; +Cc: guix-devel

Hey Ludo,

On Fri, Feb 09 2018, Ludovic Courtès wrote:
>> In particular, with higher levels of concurrency I want to move 
>> the
>> mutable state out of <service> objects.
>
> The only piece of mutable state is the ‘running’ value.  We can 
> make
> that an “atomic box”, and users won’t even notice.

That's not quite true, unfortunately. I count four pieces of 
mutable state in the <service> object: `running`, `enabled?`, 
`waiting-for-termination?` and `last-respawns`. They should be 
stored elsewhere so that Shepherd can manage that state however it 
wants. We don't want to expose that to a user, where they could 
break Shepherd's assumptions about when/how it's modified (because 
user configuration can do anything it wants - including starting a 
long-running thread to mutate it later).

We shouldn't have to break much. My thought is just to remove 
those mutable fields from the <service> object (maybe leaving 
`enabled?`, but changing its meaning slightly to just be whether 
the service is enabled at the start). In practice it shouldn't 
break any real-world configuration, I hope.
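
As an illustration of that split (in Python, with made-up class and
field names rather than Shepherd's actual records), the service
definition becomes immutable and the four mutable fields move into a
table that only the supervisor touches:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """What user configuration supplies; immutable once defined."""
    name: str
    requires: tuple = ()
    enabled_at_start: bool = True


@dataclass
class RuntimeState:
    """What the supervisor tracks; never handed to user configuration."""
    running: object = None  # e.g. a PID, or None when stopped
    enabled: bool = True
    waiting_for_termination: bool = False
    last_respawns: list = field(default_factory=list)


class Registry:
    """Owns all runtime state, keyed by service name."""

    def __init__(self, services):
        self._services = {s.name: s for s in services}
        self._state = {s.name: RuntimeState(enabled=s.enabled_at_start)
                       for s in services}

    def mark_started(self, name, pid):
        self._state[name].running = pid

    def is_running(self, name):
        return self._state[name].running is not None
```

User code that holds a `Service` can read it but cannot change whether
it counts as running; only `Registry` methods can.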

Carlo


* Re: Improving Shepherd
  2018-02-09 13:22     ` Ludovic Courtès
@ 2018-02-09 20:51       ` David Pirotte
  0 siblings, 0 replies; 18+ messages in thread
From: David Pirotte @ 2018-02-09 20:51 UTC
  To: Ludovic Courtès; +Cc: guix-devel, Carlo Zancanaro

Hello,

> Yes, inotify (ISTR there *are* inotify bindings for Guile somewhere.)

	 https://github.com/ChaosEternal/guile-inotify2.git

David



* Re: Improving Shepherd
  2018-02-09 13:26       ` Ludovic Courtès
  2018-02-09 19:50         ` Carlo Zancanaro
@ 2018-02-09 21:32         ` Christopher Lemmer Webber
  2018-02-14 13:10           ` Ludovic Courtès
  1 sibling, 1 reply; 18+ messages in thread
From: Christopher Lemmer Webber @ 2018-02-09 21:32 UTC
  To: Ludovic Courtès; +Cc: guix-devel, Carlo Zancanaro

Ludovic Courtès writes:

> Hopefully it’s nothing serious: Fibers doesn’t rely on anything
> architecture-specific.

I think it relies on epoll currently?  But I think there should be no
reason other architectures couldn't also be supported.


* Re: Improving Shepherd
  2018-02-05 13:08   ` Ludovic Courtès
  2018-02-05 15:56     ` Carlo Zancanaro
@ 2018-02-10 13:34     ` Jelle Licht
  2018-02-14 13:25       ` Ludovic Courtès
  1 sibling, 1 reply; 18+ messages in thread
From: Jelle Licht @ 2018-02-10 13:34 UTC
  To: guix-devel

Hey all,

2018-02-05 14:08 GMT+01:00 Ludovic Courtès <ludo@gnu.org>:

> Hello!
>
> [...]
>
> Currently shepherd monitors SIGCHLD, and it’s not supposed to miss
> those; in some cases it might handle them later than you’d expect, which
> means that in the meantime you see a zombie process, but otherwise it
> seems to work.
>
> ISTR you reported an issue when using ‘shepherd --daemonize’, right?
> Perhaps the issue is limited to that mode?
>

Playing around with signalfd(2) for a bit, it seems that implementations
are allowed to coalesce several 'pending' signals at the same time. In the
case of SIGCHLD, this means the parent process might never be properly
informed of *multiple* signals being received around the same time. Could
it have something to do with this problem as well?
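
The usual way to cope with coalesced SIGCHLD notifications is to treat
one notification as "at least one child exited" and keep reaping until
nothing is left. A sketch in Python for illustration (the function name
is made up for the example):

```python
import os


def reap_all():
    """Collect every child that has exited so far; return {pid: status}."""
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped[pid] = status
    return reaped
```

Calling this once per notification means no exit is lost even when
several children die before the first notification is handled.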

>
> > Concurrency/parallelism - I think Jelle was planning to work on this,
> > but I might be wrong about that. Maybe I volunteered? We're keen to
> > see Shepherd starting services in parallel, where possible. This will
> > require some changes to the way we start/stop services (because at the
> > moment we just send a "start" signal to a single service to start it,
> > which makes it hard to be parallel), and will require us to actually
> > build some sort of real dependency resolution. Longer-term our goal
> > should be to bring fibers into Shepherd, but Efraim mentioned that
> > fibers doesn't compile on ARM at the moment, so we'll have to get that
> > working first at least.
>
> I’d really like to see that happen.  I’ve become more familiar with
> Fibers, and I think it’ll be perfect for the Shepherd (and we’ll fix the
> ARM build issue, no doubt.)
>
> One thing I’d like to do is to handle SIGCHLD via signalfd(2) instead of
> an actual signal handler like we do now.  That would make it easy to
> have signal handling part of the main event loop and thus, it would
> integrate well with Fibers.
>
> It seems that signalfd(2) is Linux-only though, which is a bummer.  The
> solution might be to get over it and have it implemented on GNU/Hurd…
> (I saw this discussion:
> <https://www.gnu.org/software/hurd/glibc/signal/signal_thread.html>; I
> suspect it’s within reach.)
>

Good news: signalfd seems to work as far as I can see. I am not quite sure
how to make it work consistently with guile ports yet though.

To make use of signalfd, one normally masks signals so that these can be
handled via signalfd instead of the default signal handlers; any forked
process starts out with the same signal mask, so we would need to make
sure to reset the signal mask for spawned processes.
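
A small sketch of the mask inheritance described here, in Python rather than
Guile and assuming a POSIX system. The parent blocks SIGCHLD, forks, and the
child reports whether the signal is still blocked, with and without resetting
the mask first. The function name is hypothetical; in a real supervisor the
reset would sit between fork and exec, since exec preserves the mask as well.

```python
import os
import signal


def child_sees_sigchld_blocked(reset_in_child):
    """Fork with SIGCHLD blocked in the parent; report the child's view.

    Returns True if SIGCHLD is still blocked in the child.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    read_end, write_end = os.pipe()
    try:
        pid = os.fork()
        if pid == 0:
            # Child process: this is where a supervisor would call exec.
            try:
                if reset_in_child:
                    # Clear the inherited mask so the program starts clean.
                    signal.pthread_sigmask(signal.SIG_SETMASK, set())
                # SIG_BLOCK with an empty set only reads the current mask.
                current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
                blocked = signal.SIGCHLD in current
                os.write(write_end, b"1" if blocked else b"0")
            finally:
                os._exit(0)
        # Parent process: read the child's one-byte answer.
        os.close(write_end)
        write_end = None
        answer = os.read(read_end, 1)
        os.waitpid(pid, 0)
        return answer == b"1"
    finally:
        os.close(read_end)
        if write_end is not None:
            os.close(write_end)
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    print("without reset:", child_sees_sigchld_blocked(False))
    print("with reset:   ", child_sees_sigchld_blocked(True))
```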

>
> [...]
>
> Ludo’.
>
>
Jelle

[-- Attachment #2: Type: text/html, Size: 3687 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Improving Shepherd
  2018-02-09 21:32         ` Christopher Lemmer Webber
@ 2018-02-14 13:10           ` Ludovic Courtès
  2018-02-15 13:55             ` Andy Wingo
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2018-02-14 13:10 UTC (permalink / raw)
  To: Christopher Lemmer Webber; +Cc: guix-devel, Carlo Zancanaro

Christopher Lemmer Webber <cwebber@dustycloud.org> skribis:

> Ludovic Courtès writes:
>
>> Hopefully it’s nothing serious: Fibers doesn’t rely on anything
>> architecture-specific.
>
> I think it relies on epoll currently?  But I think there should be no
> reason other architectures couldn't also be supported.

Ooh good point, that may rule out GNU/Hurd.  :-/

Ludo’.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Improving Shepherd
  2018-02-10 13:34     ` Jelle Licht
@ 2018-02-14 13:25       ` Ludovic Courtès
  2018-02-15 17:05         ` Jelle Licht
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2018-02-14 13:25 UTC (permalink / raw)
  To: Jelle Licht; +Cc: guix-devel

Heya,

Jelle Licht <jlicht@fsfe.org> skribis:

> Good news: signalfd seems to work as far as I can see. I am not quite sure
> how to make it work consistently with guile ports yet though.

Good!  What do you mean by “work with guile ports” though?

> To make use of signalfd, one normally masks signals so that these can be
> handled via signalfd instead of the default signal handlers; any forked
> process starts out with the same signal mask, so we would need to make
> sure to reset the signal mask for spawned processes.

Right, we could do that in ‘exec-command’, which is the central place
for fork+exec.

Well, let us know what to do next, then!  :-)

Ludo’.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Improving Shepherd
  2018-02-14 13:10           ` Ludovic Courtès
@ 2018-02-15 13:55             ` Andy Wingo
  0 siblings, 0 replies; 18+ messages in thread
From: Andy Wingo @ 2018-02-15 13:55 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel, Carlo Zancanaro

On Wed 14 Feb 2018 14:10, ludo@gnu.org (Ludovic Courtès) writes:

> Christopher Lemmer Webber <cwebber@dustycloud.org> skribis:
>
>> Ludovic Courtès writes:
>>
>>> Hopefully it’s nothing serious: Fibers doesn’t rely on anything
>>> architecture-specific.
>>
>> I think it relies on epoll currently?  But I think there should be no
>> reason other architectures couldn't also be supported.
>
> Ooh good point, that may rule out GNU/Hurd.  :-/

You can always replace the epoll module :)

Andy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Improving Shepherd
  2018-02-14 13:25       ` Ludovic Courtès
@ 2018-02-15 17:05         ` Jelle Licht
  2018-02-15 19:04           ` Mark H Weaver
  0 siblings, 1 reply; 18+ messages in thread
From: Jelle Licht @ 2018-02-15 17:05 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel


Ludovic Courtès <ludo@gnu.org> writes:

> Heya,
>
> Jelle Licht <jlicht@fsfe.org> skribis:
>
>> Good news: signalfd seems to work as far as I can see. I am not quite sure
>> how to make it work consistently with guile ports yet though.
>
> Good!  What do you mean by “work with guile ports” though?
>

It seems that I am running into problems with the way guile handles
signals at the moment. As far as I understood the good people of #guile on
freenode, guile handles signals with a separate thread that actually
makes sure signal handling is done at the 'right' time. As such, it
seems that there is no easy way to set the mask of blocked signals for
all guile threads.

My approach was to wrap `pthread_sigmask' (initially `sigprocmask') in
combination with a call to `signalfd', but it seems that "my" guile thread
only receives the signal about two-thirds of the time. This only happens
when triggering the signal via 'external' means, such as the kill command.
Using the `raise' function from within my guile repl/program always
reliably triggered events coming in on my signalfd-based port.
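
For readers unfamiliar with fd-based signal handling, here is a rough sketch
of the general idea in Python. Python's standard library does not expose
signalfd, so this uses signal.set_wakeup_fd (a self-pipe style mechanism) as a
stand-in; it is not the Guile code discussed here, and the function name is
made up. The signal is raised from within the program, which is the case
reported above as reliable, and it must run in the main thread.

```python
import selectors
import signal
import socket


def wait_for_signal_on_fd(signum, timeout=2.0):
    """Raise signum and return the signal number read back from a wakeup fd."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    writer.setblocking(False)  # the wakeup fd must be non-blocking

    # CPython only writes to the wakeup fd for signals that have a
    # Python-level handler installed, so install a do-nothing one.
    old_handler = signal.signal(signum, lambda number, frame: None)
    old_fd = signal.set_wakeup_fd(writer.fileno())

    selector = selectors.DefaultSelector()
    selector.register(reader, selectors.EVENT_READ)
    try:
        signal.raise_signal(signum)
        # The signal now shows up as ordinary readable data in the event loop.
        if selector.select(timeout):
            return reader.recv(1)[0]
        return None
    finally:
        selector.close()
        signal.set_wakeup_fd(old_fd)
        if old_handler is not None:
            signal.signal(signum, old_handler)
        reader.close()
        writer.close()


if __name__ == "__main__":
    print("signal seen on fd:", wait_for_signal_on_fd(signal.SIGUSR1))
```

The point of the design is that the signal becomes one more readable file
descriptor, so it can be multiplexed with sockets and pipes in a single loop.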

Without being able to block all relevant signals via `pthread_sigmask'
in the other guile threads, it seems very difficult to reliably use
signalfd-based ports to handle signals. Some (ugly) code at [1]
demonstrates this: run the guile script, find the pid of the guile
process via `pgrep', and then send a SIGCHLD signal via `kill -17
<pid>'. You should still see the signal handler for the supposedly
blocked signal be triggered.

tl;dr: I cannot seem to block signals from being handled by guile in
some way, which to me seems a prerequisite for using signalfd-based
signal handling. My uneducated guess is that guile needs to support a
way to set signal masks for all threads in order to deal with this.


>> To make use of signalfd, one normally masks signals so that these can be
>> handled via signalfd instead of the default signal handlers; any forked
>> process starts out with the same signal mask, so we would need to make
>> sure to reset the signal mask for spawned processes.
>
> Right, we could do that in ‘exec-command’, which is the central place
> for fork+exec.

Right, this does not seem as difficult as I initially thought. If the
earlier things I mentioned are resolved/worked around, this should be
easy to implement.
>
> Well, let us know what to do next, then!  :-)
>
> Ludo’.

-Jelle

[1]: https://paste.debian.net/1010454/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Improving Shepherd
  2018-02-15 17:05         ` Jelle Licht
@ 2018-02-15 19:04           ` Mark H Weaver
  0 siblings, 0 replies; 18+ messages in thread
From: Mark H Weaver @ 2018-02-15 19:04 UTC (permalink / raw)
  To: Jelle Licht; +Cc: guix-devel

Jelle Licht <jlicht@fsfe.org> writes:

> tl;dr: I cannot seem to block signals from being handled by guile in
> some way, which to me seems a prerequisite for using signalfd-based
> signal handling. My uneducated guess is that guile needs to support a
> way to set signal masks for all threads in order to deal with this.

Does POSIX provide a way to set the signal mask for another thread?
Last time I looked, I couldn't find one.

I've long desired to get rid of the signal thread in Guile, and instead
arrange for signals to be delivered directly to the thread that's
supposed to receive them.  Guile's 'sigaction' has long allowed the user
to specify which thread should receive each kind of signal, although
POSIX doesn't support this.

I want to do this for a couple of reasons.  One is to avoid spawning
threads unless the user asks for it, to avoid possible safety issues
with fork.  Another reason is that I'd like to arrange for long-running
system calls to be reliably interrupted when a signal is received.

The main thing I've been stuck on is that I haven't found a way to set
the signal mask on other threads.
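
A sketch of the relevant POSIX behaviour, in Python rather than Guile.
pthread_sigmask only changes the mask of the calling thread, but a newly
created thread starts with a copy of its creator's mask. Blocking a signal
before any thread is spawned therefore covers every later thread, while a
thread that already exists is unaffected. The function names are hypothetical,
and whether Guile could block signals before starting its signal thread is an
assumption, not something established in this thread.

```python
import signal
import threading


def _read_own_mask(out):
    # SIG_BLOCK with an empty set changes nothing and returns the current mask.
    out.append(signal.pthread_sigmask(signal.SIG_BLOCK, set()))


def thread_started_after_block(signum=signal.SIGCHLD):
    """True if a thread created after blocking signum has it blocked too."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        seen = []
        worker = threading.Thread(target=_read_own_mask, args=(seen,))
        worker.start()
        worker.join()
        return signum in seen[0]
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def thread_started_before_block(signum=signal.SIGCHLD):
    """True if an already running thread picks up a later block (it does not)."""
    # Make sure signum is unblocked while the worker is being created.
    old_mask = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signum})
    go = threading.Event()
    seen = []

    def worker():
        go.wait()  # read the mask only after the creator has blocked signum
        _read_own_mask(seen)

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
        go.set()
        thread.join()
        return signum in seen[0]
    finally:
        go.set()
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    print("started after block: ", thread_started_after_block())
    print("started before block:", thread_started_before_block())
```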

      Mark

^ permalink raw reply	[flat|nested] 18+ messages in thread

Thread overview: 18+ messages
2018-01-29 21:14 Improving Shepherd Carlo Zancanaro
2018-01-29 22:27 ` Jelle Licht
2018-02-05 10:49 ` Carlo Zancanaro
2018-02-05 13:08   ` Ludovic Courtès
2018-02-05 15:56     ` Carlo Zancanaro
2018-02-09 13:26       ` Ludovic Courtès
2018-02-09 19:50         ` Carlo Zancanaro
2018-02-09 21:32         ` Christopher Lemmer Webber
2018-02-14 13:10           ` Ludovic Courtès
2018-02-15 13:55             ` Andy Wingo
2018-02-10 13:34     ` Jelle Licht
2018-02-14 13:25       ` Ludovic Courtès
2018-02-15 17:05         ` Jelle Licht
2018-02-15 19:04           ` Mark H Weaver
2018-02-05 16:00   ` Danny Milosavljevic
2018-02-05 16:41     ` Carlo Zancanaro
2018-02-09 13:22     ` Ludovic Courtès
2018-02-09 20:51       ` David Pirotte
