unofficial mirror of bug-guile@gnu.org 
* bug#10641: [2.0.3+] start_signal_delivery_thread failure on x86_64-freebsd8.2
From: Ludovic Courtès @ 2012-01-29 18:00 UTC (permalink / raw)
  To: 10641

Hello!

For the record, scm_spawn_thread sometimes returns #f instead of a valid
thread when called from start_signal_delivery_thread, which is itself
called from the pthread key destructor.

For this reason, I added an assertion check in commit 0f4f2d9a, which
gets hit systematically on that platform, for instance when running
standalone/test-scm-spawn-thread.

The backtrace looks like this:

--8<---------------cut here---------------start------------->8---
(gdb) thread apply all bt

Thread 3 (Thread 801a041c0 (LWP 100363)):
#0  0x00000008012903cc in __error () from /lib/libthr.so.3
#1  0x000000080128e501 in pthread_cond_signal () from /lib/libthr.so.3
#2  0x00000008007196e1 in scm_pthread_cond_timedwait (cond=0x640e28, mutex=0x640ca8, wt=0x7fffffffd980) at ../../libguile/threads.c:2024
#3  0x00000008007198cd in block_self (queue=0x855e20, sleep_object=Variable "sleep_object" is not available.
) at ../../libguile/threads.c:454
#4  0x0000000800719d65 in scm_join_thread_timed (thread=0x855e50, timeout=0x13c963492, timeoutval=Variable "timeoutval" is not available.
) at ../../libguile/threads.c:1295
#5  0x0000000000400c33 in inner_main (data=Variable "data" is not available.
) at ../../../test-suite/standalone/test-scm-spawn-thread.c:55
#6  0x00000008006a067a in c_body (d=0x7fffffffdbe0) at ../../libguile/continuations.c:518
#7  0x0000000800727803 in vm_regular_engine (vm=0x6add50, program=0x6b00c0, argv=0x1, nargs=8680160) at vm-i-system.c:1007
#8  0x000000080071fbc6 in scm_c_vm_run (vm=0x6add50, program=0x785930, argv=0x7fffffffdb40, nargs=4) at ../../libguile/vm.c:567
#9  0x00000008006a7ce3 in scm_call_4 (proc=0x785930, arg1=Variable "arg1" is not available.
) at ../../libguile/eval.c:506
#10 0x00000008006a09f9 in scm_i_with_continuation_barrier (body=0x8006a0670 <c_body>, body_data=0x7fffffffdbe0, handler=0x8006a08b0 <c_handler>, handler_data=0x7fffffffdbe0,
    pre_unwind_handler=0x8006a0710 <pre_unwind_handler>, pre_unwind_handler_data=0x6adc80) at ../../libguile/continuations.c:455
#11 0x00000008006a0ad5 in scm_c_with_continuation_barrier (func=Variable "func" is not available.
) at ../../libguile/continuations.c:552
#12 0x000000080071b470 in with_guile_and_parent (base=0x7fffffffdc40, data=Variable "data" is not available.
) at ../../libguile/threads.c:913
#13 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#14 0x000000080071abf1 in scm_i_with_guile_and_parent (func=Variable "func" is not available.
) at ../../libguile/threads.c:959
#15 0x0000000000400b80 in main (argc=Variable "argc" is not available.
) at ../../../test-suite/standalone/test-scm-spawn-thread.c:68

Thread 2 (Thread 801a0ae40 (LWP 100280)):
#0  0x00000008016250dc in thr_kill () from /lib/libc.so.7
#1  0x00000008016c1dcb in abort () from /lib/libc.so.7
#2  0x00000008016ab1a5 in __assert () from /lib/libc.so.7
#3  0x000000080071ab2d in scm_spawn_thread (body=Variable "body" is not available.
) at ../../libguile/threads.c:1175
#4  0x00000008006f7ab1 in start_signal_delivery_thread () at ../../libguile/scmsigs.c:211
#5  0x000000080128d9c8 in pthread_once () from /lib/libthr.so.3
#6  0x000000080071b1b0 in do_thread_exit (v=Variable "v" is not available.
) at ../../libguile/threads.c:666
#7  0x00000008006a067a in c_body (d=0x7fffffbfedc0) at ../../libguile/continuations.c:518
#8  0x0000000800727803 in vm_regular_engine (vm=0x855e00, program=0x8570c0, argv=0x1, nargs=9272864) at vm-i-system.c:1007
#9  0x000000080071fbc6 in scm_c_vm_run (vm=0x855e00, program=0x785930, argv=0x7fffffbfed20, nargs=4) at ../../libguile/vm.c:567
#10 0x00000008006a7ce3 in scm_call_4 (proc=0x785930, arg1=Variable "arg1" is not available.
) at ../../libguile/eval.c:506
#11 0x00000008006a09f9 in scm_i_with_continuation_barrier (body=0x8006a0670 <c_body>, body_data=0x7fffffbfedc0, handler=0x8006a08b0 <c_handler>, handler_data=0x7fffffbfedc0,
    pre_unwind_handler=0x8006a0710 <pre_unwind_handler>, pre_unwind_handler_data=0x6adc80) at ../../libguile/continuations.c:455
#12 0x00000008006a0ad5 in scm_c_with_continuation_barrier (func=Variable "func" is not available.
) at ../../libguile/continuations.c:552
#13 0x0000000801153bc8 in GC_call_with_gc_active () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#14 0x000000080071b411 in with_guile_and_parent (base=0x7fffffbfee60, data=Variable "data" is not available.
) at ../../libguile/threads.c:236
#15 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#16 0x000000080071abf1 in scm_i_with_guile_and_parent (func=Variable "func" is not available.
) at ../../libguile/threads.c:959
#17 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#18 0x000000080071af2e in on_thread_exit (v=Variable "v" is not available.
) at ../../libguile/threads.c:748
#19 0x000000080128985b in pthread_key_delete () from /lib/libthr.so.3
#20 0x000000080128f5f3 in pthread_exit () from /lib/libthr.so.3
#21 0x00000008012864f9 in pthread_getprio () from /lib/libthr.so.3
#22 0x0000000000000000 in ?? ()
Cannot access memory at address 0x7fffffbff000

Thread 1 (Thread 801a35c80 (LWP 100403)):
#0  0x00000008016c4fcc in write () from /lib/libc.so.7
#1  0x00000008016c4a60 in memcpy () from /lib/libc.so.7
#2  0x00000008016c49ab in memcpy () from /lib/libc.so.7
#3  0x00000008016c359d in f_prealloc () from /lib/libc.so.7
#4  0x00000008016c2d5c in fwrite () from /lib/libc.so.7
#5  0x00000008016b5f3b in open () from /lib/libc.so.7
#6  0x00000008016b74e9 in open () from /lib/libc.so.7
#7  0x00000008016b98da in vfprintf () from /lib/libc.so.7
#8  0x00000008016a794a in printf () from /lib/libc.so.7
#9  0x000000080071b040 in really_spawn (d=0x7fffffbfeaa0) at ../../libguile/threads.c:1100
#10 0x00000008006a067a in c_body (d=0x7fffff9fde70) at ../../libguile/continuations.c:518
#11 0x0000000800727803 in vm_regular_engine (vm=0x801fc0, program=0x8d80c0, argv=0x1, nargs=9272448) at vm-i-system.c:1007
#12 0x000000080071fbc6 in scm_c_vm_run (vm=0x801fc0, program=0x785930, argv=0x7fffff9fddd0, nargs=4) at ../../libguile/vm.c:567
#13 0x00000008006a7ce3 in scm_call_4 (proc=0x785930, arg1=Variable "arg1" is not available.
) at ../../libguile/eval.c:506
#14 0x00000008006a09f9 in scm_i_with_continuation_barrier (body=0x8006a0670 <c_body>, body_data=0x7fffff9fde70, handler=0x8006a08b0 <c_handler>, handler_data=0x7fffff9fde70, 
    pre_unwind_handler=0x8006a0710 <pre_unwind_handler>, pre_unwind_handler_data=0x6adc80) at ../../libguile/continuations.c:455
#15 0x00000008006a0ad5 in scm_c_with_continuation_barrier (func=Variable "func" is not available.
) at ../../libguile/continuations.c:552
#16 0x000000080071b470 in with_guile_and_parent (base=0x7fffff9fded0, data=Variable "data" is not available.
) at ../../libguile/threads.c:913
#17 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#18 0x000000080071abf1 in scm_i_with_guile_and_parent (func=Variable "func" is not available.
) at ../../libguile/threads.c:959
#19 0x000000080071ac74 in spawn_thread (d=Variable "d" is not available.
) at ../../libguile/threads.c:1133
#20 0x000000080115337c in GC_inner_start_routine () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#21 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#22 0x00000008012864f1 in pthread_getprio () from /lib/libthr.so.3
#23 0x0000000000000000 in ?? ()
Cannot access memory at address 0x7fffff9fe000
#0  0x00000008016250dc in thr_kill () from /lib/libc.so.7
--8<---------------cut here---------------end--------------->8---

Adding printfs shows that the thread calling scm_spawn_thread leaves
cond_wait before the signal thread has signaled the condition (in
really_spawn).

A similar program succeeds, suggesting that it’s not a bug/limitation of
libpthread:

--8<---------------cut here---------------start------------->8---
#include <pthread.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>

#define GC_THREADS 1
#include <gc/gc.h>

struct sync
{
  pthread_cond_t cond;
  pthread_mutex_t mutex;
};

static void *
hello (void *arg)
{
  int err;
  struct sync *s = (struct sync *) arg;

  /* pthread_t is opaque; cast it so that %p receives a void pointer.  */
  printf ("hello from %p\n", (void *) pthread_self ());
  GC_MALLOC (123);

  err = pthread_mutex_lock (&s->mutex);
  assert (err == 0);

  err = pthread_cond_signal (&s->cond);
  assert (err == 0);

  err = pthread_mutex_unlock (&s->mutex);
  assert (err == 0);

  return NULL;
}

static void
on_thread_exit (void *unused)
{
  int err;
  pthread_t child;
  struct sync s;

  pthread_mutex_init (&s.mutex, NULL);
  pthread_cond_init (&s.cond, NULL);

  pthread_mutex_lock (&s.mutex);
  err = pthread_create (&child, NULL, hello, &s);
  assert (err == 0);
  err = pthread_cond_wait (&s.cond, &s.mutex);
  assert (err == 0);
  err = pthread_mutex_unlock (&s.mutex);
  assert (err == 0);

  printf ("child %p seen from %p\n", (void *) child, (void *) pthread_self ());
}

static void *
entry (void *unused)
{
  pthread_key_t k;
  pthread_key_create (&k, on_thread_exit);
  pthread_setspecific (k, (void *) 123);
  return NULL;
}

int
main ()
{
  int err;
  pthread_t child;
  void *ret;

  GC_INIT ();

  err = pthread_create (&child, NULL, entry, NULL);
  assert (err == 0);

  err = pthread_join (child, &ret);
  assert (err == 0);

  return 0;
}
--8<---------------cut here---------------end--------------->8---

To be continued...

Ludo’.






* bug#10641: [2.0.3+] start_signal_delivery_thread failure on x86_64-freebsd8.2
From: Andy Wingo @ 2013-03-13 10:02 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 10641-done

Hi,

On Sun 29 Jan 2012 19:00, ludo@gnu.org (Ludovic Courtès) writes:

> Adding printfs shows that the thread calling scm_spawn_thread leaves
> cond_wait before the signal thread has signaled the condition (in
> really_spawn).

From http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_cond_wait.html

  When using condition variables there is always a Boolean predicate
  involving shared variables associated with each condition wait that is
  true if the thread should proceed. Spurious wakeups from the
  pthread_cond_timedwait() or pthread_cond_wait() functions may
  occur. Since the return from pthread_cond_timedwait() or
  pthread_cond_wait() does not imply anything about the value of this
  predicate, the predicate should be re-evaluated upon such return.

It seems this code is not robust in the face of spurious wakeups.  I
pushed a patch that waits for data.thread to become non-false.  That
should fix this issue.
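
For illustration, here is a minimal sketch of the predicate-loop pattern
the specification asks for.  It is not the actual Guile patch: struct
launch, its ready flag, child and spawn_and_wait are hypothetical names
made up for this example, with ready playing the role that data.thread
plays in threads.c.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the data shared between the spawning
   thread and the new thread; these names are not Guile's.  */
struct launch
{
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  bool ready;                   /* the predicate, protected by MUTEX */
};

static void *
child (void *arg)
{
  struct launch *l = arg;

  pthread_mutex_lock (&l->mutex);
  l->ready = true;                  /* update the predicate first...  */
  pthread_cond_signal (&l->cond);   /* ...then wake the waiter */
  pthread_mutex_unlock (&l->mutex);

  return NULL;
}

/* Start CHILD and block until it has set the predicate.  Return 0 on
   success, or the error code from pthread_create.  */
static int
spawn_and_wait (void)
{
  struct launch l;
  pthread_t t;
  int err;

  pthread_mutex_init (&l.mutex, NULL);
  pthread_cond_init (&l.cond, NULL);
  l.ready = false;

  pthread_mutex_lock (&l.mutex);
  err = pthread_create (&t, NULL, child, &l);
  if (err == 0)
    {
      /* Re-check the predicate after every return from
         pthread_cond_wait: a return by itself proves nothing.  */
      while (!l.ready)
        pthread_cond_wait (&l.cond, &l.mutex);
    }
  pthread_mutex_unlock (&l.mutex);

  /* Join before destroying L, since the child may still hold the mutex.  */
  if (err == 0)
    pthread_join (t, NULL);

  pthread_cond_destroy (&l.cond);
  pthread_mutex_destroy (&l.mutex);
  return err;
}
```

With the loop in place, a spurious wakeup simply sends the waiter back
to sleep, so the caller never observes a half-initialized result.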

Cheers,

Andy
-- 
http://wingolog.org/






* bug#10641: [2.0.3+] start_signal_delivery_thread failure on x86_64-freebsd8.2
From: Ludovic Courtès @ 2013-03-29  9:49 UTC (permalink / raw)
  To: Andy Wingo; +Cc: 10641-done

Andy Wingo <wingo@pobox.com> skribis:

> On Sun 29 Jan 2012 19:00, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Adding printfs shows that the thread calling scm_spawn_thread leaves
>> cond_wait before the signal thread has signaled the condition (in
>> really_spawn).
>
> From http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_cond_wait.html
>
>   When using condition variables there is always a Boolean predicate
>   involving shared variables associated with each condition wait that is
>   true if the thread should proceed. Spurious wakeups from the
>   pthread_cond_timedwait() or pthread_cond_wait() functions may
>   occur. Since the return from pthread_cond_timedwait() or
>   pthread_cond_wait() does not imply anything about the value of this
>   predicate, the predicate should be re-evaluated upon such return.
>
> It seems this code is not robust in the face of spurious wakeups.  I
> pushed a patch that waits for data.thread to become non-false.  That
> should fix this issue.

Good catch, and congratulations!  I can confirm that this fixes
--with-threads builds on FreeBSD 8.2:

  http://hydra.nixos.org/build/4519811

Thanks!

Ludo’.





