unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw@netris.org>
Cc: guile-devel@gnu.org
Subject: Re: SCM_SYSCALL
Date: Sat, 06 Jul 2013 23:05:43 +0200	[thread overview]
Message-ID: <87li5jo0uw.fsf@gnu.org> (raw)
In-Reply-To: <87fvvrabes.fsf@tines.lan> (Mark H. Weaver's message of "Sat, 06 Jul 2013 12:41:31 -0400")

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> Hmm.  Shouldn't our signal handlers be run in a different thread?  Maybe
>>> we can't make this change until 2.2, but it seems to me that there are
>>> very serious problems trying to run signal handlers from within asyncs,
>>> analogous to the problems running finalizers within asyncs.  Commonly,
>>> signal handlers need to mutate some global state, but that cannot in
>>> general be done safely from within asyncs, because asyncs might be
>>> called while the global state is in an inconsistent state, at least for
>>> data structures implemented in Scheme.
>>>
>>> What do you think?

[...]

>> However, with a fixed SCM_SYSCALL, the result is pretty much the same as
>> with SA_RESTART (see <http://bugs.gnu.org/14640>): when SCM_ASYNC_TICK
>> is called right after we get EINTR, chances are that the async hasn’t
>> been queued yet, so we get back to our read(2) call, and thus the
>> Scheme-level signal handler is never called.  (Typically, when running
>> the test through strace, it passes, because the timing is “better”, but
>> it fails without strace.)
>
> Right, so the problem is that, when Guile is built with thread support,
> our signal delivery mechanism depends on the signal handling thread
> executing, which adds an unpredictable amount of latency.
>
> Initially I looked at how to fix the test case to work around this
> problem, but really I think we need to fix the way that signals are
> delivered.  If one chooses to deliver signals to a thread that's doing a
> 'read' (or other interruptible system call), then we ought to arrange
> things so that the async is queued in time to be run before restarting
> the call.
>
> I think the best solution is to get rid of our internal signal handler
> thread altogether, and instead arrange for signals to be delivered
> directly to the thread that the user specified, by setting the thread
> signal masks appropriately.  The C-level signal handler would then set
> some global state that would be noticed by the SCM_SYSCALL loop.

The C-level handler must restrict itself to async-signal-safe functions,
which excludes GC_malloc for instance (leading to hard-to-fix issues
like those you identified in the 2nd ‘take_signal’.)

Another issue is that, at the Scheme level, the signal is specified to
be delivered to the thread that called ‘sigaction’ or to whatever thread
was specified in the ‘sigaction’ call.  It’s unclear to me that this
could be achieved with just pthread_sigmask (which is missing on some
platforms, such as MinGW.)

> In some ways, this would bring us closer to the non-thread signal
> handling mechanism in scmsigs.c, which queued the asyncs directly from
> the signal handler.  Unfortunately, that code is not safe.  For example,
> if the non-thread 'take_signal' (the second one in scmsigs.c) is run
> while 'scm_async_click' (async.c) is in between the following two lines:
>
>       asyncs = t->active_asyncs;
>       t->active_asyncs = SCM_EOL;
>
> Then the signal will be lost.  Other problems can happen if the
> non-threaded 'take_signal' interrupts itself (e.g. if two different
> signals are delivered at nearly the same time).

Indeed.

> So we'd need to devise a new mechanism that _is_ safe.
> It is certainly doable.
>
> If you're okay with this general approach, I'll look into it.

Well, this is a can of worms, so we’d rather open it in 2.1, IMO.  :-)
It’s not clear to me that this can be improved without breaking
something else, but I’m all ears.

Back to the problem at hand: do you have any idea on how to write a test
case?  If not, I would just commit the SCM_SYSCALL fix.

Thanks,
Ludo’.



  reply	other threads:[~2013-07-06 21:05 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-23 21:25 SCM_SYSCALL Ludovic Courtès
2013-07-03 18:19 ` SCM_SYSCALL Mark H Weaver
2013-07-04 22:28   ` SCM_SYSCALL Ludovic Courtès
2013-07-05 18:56     ` SCM_SYSCALL Mark H Weaver
2013-07-05 20:01       ` SCM_SYSCALL Ludovic Courtès
2013-07-06 16:41         ` SCM_SYSCALL Mark H Weaver
2013-07-06 21:05           ` Ludovic Courtès [this message]
2013-07-17 16:04 ` SCM_SYSCALL Ludovic Courtès
2014-03-23 19:56 ` SCM_SYSCALL Andy Wingo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87li5jo0uw.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=guile-devel@gnu.org \
    --cc=mhw@netris.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).