unofficial mirror of guile-devel@gnu.org 
* Hang in threads.test
@ 2010-09-18 21:03 Neil Jerram
  2010-09-22 20:23 ` Neil Jerram
  0 siblings, 1 reply; 14+ messages in thread
From: Neil Jerram @ 2010-09-18 21:03 UTC
  To: guile-devel

My nightly build of master, on a relatively slow old machine, is
hanging, on most nights, in `make check'.  The tail of check-guile.log
says

PASS: threads.test: mutex-ownership: mutex ownership for unlocked mutex
PASS: threads.test: mutex-ownership: locking mutex on behalf of other thread
PASS: threads.test: mutex-ownership: locking mutex with no owner

so it looks like the problematic test is "mutex with owner not retained
(bug #27450)" in threads.test.  Here are the backtraces from GDB, from
today's build.

(gdb) info threads
  3 Thread 0x40bd8b70 (LWP 22491)  0x4001d424 in __kernel_vsyscall ()
  2 Thread 0x409d7b70 (LWP 22605)  0x4001d424 in __kernel_vsyscall ()
* 1 Thread 0x404ff8f0 (LWP 22194)  0x4001d424 in __kernel_vsyscall ()
(gdb) bt
#0  0x4001d424 in __kernel_vsyscall ()
#1  0x40469385 in sem_wait@@GLIBC_2.1 () from /lib/i686/cmov/libpthread.so.0
#2  0x40182018 in GC_stop_world () from /usr/lib/libgc.so.1
#3  0x40171b1e in GC_stopped_mark () from /usr/lib/libgc.so.1
#4  0x40171df9 in GC_try_to_collect_inner () from /usr/lib/libgc.so.1
#5  0x40172194 in GC_try_to_collect () from /usr/lib/libgc.so.1
#6  0x40172270 in GC_gcollect () from /usr/lib/libgc.so.1
#7  0x4007abca in scm_i_gc () at ../../libguile/gc.c:398
#8  scm_gc () at ../../libguile/gc.c:381
#9  0x400fb4fb in vm_debug_engine (vm=0x92031b8, program=0x9250b30, argv=0xbf98d880, nargs=1) at ../../libguile/vm-i-system.c:847
#10 0x400ea8aa in scm_c_vm_run (vm=0x92031b8, program=0x9250b30, argv=0xbf98d880, nargs=1) at ../../libguile/vm.c:560
#11 0x4006f8b7 in scm_primitive_eval (exp=0x993b6b8) at ../../libguile/eval.c:844
#12 0x40092473 in scm_primitive_load (filename=0xcb4a020) at ../../libguile/load.c:126
#13 0x400fb4f2 in vm_debug_engine (vm=0x92031b8, program=0x936bf18, argv=0xbf98d9d4, nargs=1) at ../../libguile/vm-i-system.c:850
#14 0x400ea8aa in scm_c_vm_run (vm=0x92031b8, program=0x936bf18, argv=0xbf98d9d4, nargs=1) at ../../libguile/vm.c:560
#15 0x4006e175 in scm_call_1 (proc=0x936bf18, arg1=0x94846b0) at ../../libguile/eval.c:561
#16 0x4006fcfe in scm_for_each (proc=0x936bf18, arg1=0x93f2930, args=0x304) at ../../libguile/eval.c:789
#17 0x400fb4c9 in vm_debug_engine (vm=0x92031b8, program=0x9250b30, argv=0xbf98db40, nargs=1) at ../../libguile/vm-i-system.c:856
#18 0x400ea8aa in scm_c_vm_run (vm=0x92031b8, program=0x9250b30, argv=0xbf98db40, nargs=1) at ../../libguile/vm.c:560
#19 0x4006f8b7 in scm_primitive_eval (exp=0x935c6a8) at ../../libguile/eval.c:844
#20 0x4006f931 in scm_eval (exp=0x935c6a8, module_or_state=0x9205828) at ../../libguile/eval.c:878
#21 0x400bc5d5 in scm_shell (argc=10, argv=0xbf98dfe4) at ../../libguile/script.c:760
#22 0x4008b316 in invoke_main_func (body_data=0xbf98df00) at ../../libguile/init.c:383
#23 0x400658d2 in c_body (d=0xbf98de54) at ../../libguile/continuations.c:473
#24 0x400e7242 in apply_catch_closure (clo=0x0, args=0x304) at ../../libguile/throw.c:146
#25 0x400fa942 in vm_debug_engine (vm=0x92031b8, program=0x9198840, argv=0xbf98dd30, nargs=4) at ../../libguile/vm-i-system.c:918
#26 0x400ea8aa in scm_c_vm_run (vm=0x92031b8, program=0x9198840, argv=0xbf98dd30, nargs=4) at ../../libguile/vm.c:560
#27 0x4006e08d in scm_call_4 (proc=0x9198840, arg1=0x404, arg2=0x9315f70, arg3=0x9315f60, arg4=0x9315f50) at ../../libguile/eval.c:582
#28 0x400e7ba2 in scm_catch_with_pre_unwind_handler (key=0x404, thunk=0x9315f70, handler=0x9315f60, pre_unwind_handler=0x9315f50) at ../../libguile/throw.c:86
#29 0x400e7c72 in scm_c_catch (tag=0x404, body=0x400658c0 <c_body>, body_data=0xbf98de54, handler=0x400658e0 <c_handler>, handler_data=0xbf98de54, pre_unwind_handler=0x400e75e0 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at ../../libguile/throw.c:213
#30 0x40065b9b in scm_i_with_continuation_barrier (body=0x400658c0 <c_body>, body_data=0xbf98de54, handler=0x400658e0 <c_handler>, handler_data=0xbf98de54, pre_unwind_handler=0x400e75e0 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at ../../libguile/continuations.c:450
#31 0x40065c73 in scm_c_with_continuation_barrier (func=0x4008b2d0 <invoke_main_func>, data=0xbf98df00) at ../../libguile/continuations.c:491
#32 0x400e6e1c in scm_i_with_guile_and_parent (func=0x4008b2d0 <invoke_main_func>, data=0xbf98df00, parent=0x0) at ../../libguile/threads.c:741
#33 0x400e6f3e in scm_with_guile (func=0x4008b2d0 <invoke_main_func>, data=0xbf98df00) at ../../libguile/threads.c:720
#34 0x4008b2af in scm_boot_guile (argc=10, argv=0xbf98dfe4, main_func=0x80488a0 <inner_main>, closure=0x0) at ../../libguile/init.c:366
#35 0x0804889b in main (argc=10, argv=0xbf98dfe4) at ../../libguile/guile.c:70
(gdb) thread 2
[Switching to thread 2 (Thread 0x409d7b70 (LWP 22605))]#0  0x4001d424 in __kernel_vsyscall ()
(gdb) bt
#0  0x4001d424 in __kernel_vsyscall ()
#1  0x4033ab5e in sigsuspend () from /lib/i686/cmov/libc.so.6
#2  0x4018222b in GC_suspend_handler_inner () from /usr/lib/libgc.so.1
#3  0x401822b5 in GC_suspend_handler () from /usr/lib/libgc.so.1
#4  <signal handler called>
#5  0x4001d422 in __kernel_vsyscall ()
#6  0x40469c39 in __lll_lock_wait () from /lib/i686/cmov/libpthread.so.0
#7  0x4046503b in _L_lock_748 () from /lib/i686/cmov/libpthread.so.0
#8  0x40464e61 in pthread_mutex_lock () from /lib/i686/cmov/libpthread.so.0
#9  0x403e9286 in pthread_mutex_lock () from /lib/i686/cmov/libc.so.6
#10 0x40180ae7 in GC_lock () from /usr/lib/libgc.so.1
#11 0x4017760f in GC_core_malloc_atomic () from /usr/lib/libgc.so.1
#12 0x401807e8 in GC_malloc_atomic () from /usr/lib/libgc.so.1
#13 0x40079c7d in scm_gc_malloc_pointerless (size=15, what=0x4011f551 "string") at ../../libguile/gc-malloc.c:177
#14 0x400dac4e in make_stringbuf (len=6) at ../../libguile/strings.c:128
#15 0x400dca54 in scm_i_make_string (len=6, charsp=0x409d6d78) at ../../libguile/strings.c:268
#16 0x400dd2f7 in scm_from_stringn (str=0x409d6dfb "125576n\235@", len=6, encoding=0x9dfc7c0 "ANSI_X3.4-1968", handler=SCM_FAILED_CONVERSION_QUESTION_MARK) at ../../libguile/strings.c:1487
#17 0x400dd48a in scm_from_locale_stringn (str=0x409d6dfb "125576n\235@", len=6) at ../../libguile/strings.c:1536
#18 0x400e3853 in scm_gensym (prefix=0x9254e10) at ../../libguile/symbols.c:357
#19 0x400fb4f2 in vm_debug_engine (vm=0xc827bd0, program=0x9198840, argv=0x409d6f74, nargs=3) at ../../libguile/vm-i-system.c:850
#20 0x400ea8aa in scm_c_vm_run (vm=0xc827bd0, program=0x9198840, argv=0x409d6f74, nargs=3) at ../../libguile/vm.c:560
#21 0x4006e0e7 in scm_call_3 (proc=0x9198840, arg1=0x404, arg2=0xc98b1e0, arg3=0xcb62f30) at ../../libguile/eval.c:575
#22 0x400e7af8 in scm_catch (key=0x404, thunk=0xc98b1e0, handler=0xcb62f30) at ../../libguile/throw.c:73
#23 0x400e6713 in really_launch (d=0xbf98d6e4) at ../../libguile/threads.c:839
#24 0x400658d2 in c_body (d=0x409d7284) at ../../libguile/continuations.c:473
#25 0x400e7242 in apply_catch_closure (clo=0x4018a0a0, args=0x304) at ../../libguile/throw.c:146
#26 0x400fa942 in vm_debug_engine (vm=0xc827bd0, program=0x9198840, argv=0x409d7160, nargs=4) at ../../libguile/vm-i-system.c:918
#27 0x400ea8aa in scm_c_vm_run (vm=0xc827bd0, program=0x9198840, argv=0x409d7160, nargs=4) at ../../libguile/vm.c:560
#28 0x4006e08d in scm_call_4 (proc=0x9198840, arg1=0x404, arg2=0xc981ba0, arg3=0xc981b90, arg4=0xc981b80) at ../../libguile/eval.c:582
#29 0x400e7ba2 in scm_catch_with_pre_unwind_handler (key=0x404, thunk=0xc981ba0, handler=0xc981b90, pre_unwind_handler=0xc981b80) at ../../libguile/throw.c:86
#30 0x400e7c72 in scm_c_catch (tag=0x404, body=0x400658c0 <c_body>, body_data=0x409d7284, handler=0x400658e0 <c_handler>, handler_data=0x409d7284, pre_unwind_handler=0x400e75e0 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at ../../libguile/throw.c:213
#31 0x40065b9b in scm_i_with_continuation_barrier (body=0x400658c0 <c_body>, body_data=0x409d7284, handler=0x400658e0 <c_handler>, handler_data=0x409d7284, pre_unwind_handler=0x400e75e0 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at ../../libguile/continuations.c:450
#32 0x40065c73 in scm_c_with_continuation_barrier (func=0x400e6690 <really_launch>, data=0xbf98d6e4) at ../../libguile/continuations.c:491
#33 0x400e6e1c in scm_i_with_guile_and_parent (func=0x400e6690 <really_launch>, data=0xbf98d6e4, parent=0x91bf538) at ../../libguile/threads.c:741
#34 0x400e6eff in launch_thread (d=0xbf98d6e4) at ../../libguile/threads.c:852
#35 0x40180e48 in GC_inner_start_routine () from /usr/lib/libgc.so.1
#36 0x4017b21c in GC_call_with_stack_base () from /usr/lib/libgc.so.1
#37 0x40180cd7 in GC_start_routine () from /usr/lib/libgc.so.1
#38 0x40462955 in start_thread () from /lib/i686/cmov/libpthread.so.0
#39 0x403dc10e in clone () from /lib/i686/cmov/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 0x40bd8b70 (LWP 22491))]#0  0x4001d424 in __kernel_vsyscall ()
(gdb) bt
#0  0x4001d424 in __kernel_vsyscall ()
#1  0x4033ab5e in sigsuspend () from /lib/i686/cmov/libc.so.6
#2  0x4018222b in GC_suspend_handler_inner () from /usr/lib/libgc.so.1
#3  0x401822b5 in GC_suspend_handler () from /usr/lib/libgc.so.1
#4  <signal handler called>
#5  0x4001d422 in __kernel_vsyscall ()
#6  0x403cd1db in read () from /lib/i686/cmov/libc.so.6
#7  0x400bb0f2 in signal_delivery_thread (data=0x0) at ../../libguile/scmsigs.c:164
#8  0x400e7242 in apply_catch_closure (clo=0x1, args=0x304) at ../../libguile/throw.c:146
#9  0x400fa942 in vm_debug_engine (vm=0x97f3f20, program=0x9198840, argv=0x40bd7ea4, nargs=3) at ../../libguile/vm-i-system.c:918
#10 0x400ea8aa in scm_c_vm_run (vm=0x97f3f20, program=0x9198840, argv=0x40bd7ea4, nargs=3) at ../../libguile/vm.c:560
#11 0x4006e0e7 in scm_call_3 (proc=0x9198840, arg1=0x404, arg2=0x993a500, arg3=0x993a4f0) at ../../libguile/eval.c:575
#12 0x400e7af8 in scm_catch (key=0x404, thunk=0x993a500, handler=0x993a4f0) at ../../libguile/throw.c:73
#13 0x400e7c06 in scm_catch_with_pre_unwind_handler (key=0x404, thunk=0x993a500, handler=0x993a4f0, pre_unwind_handler=0x904) at ../../libguile/throw.c:81
#14 0x400e7c72 in scm_c_catch (tag=0x404, body=0x400bb090 <signal_delivery_thread>, body_data=0x0, handler=0x400e7660 <scm_handle_by_message>, handler_data=0x40125564, pre_unwind_handler=0, pre_unwind_handler_data=0x0) at ../../libguile/throw.c:213
#15 0x400e7cc9 in scm_internal_catch (tag=0x404, body=0x400bb090 <signal_delivery_thread>, body_data=0x0, handler=0x400e7660 <scm_handle_by_message>, handler_data=0x40125564) at ../../libguile/throw.c:222
#16 0x400e651b in really_spawn (d=0x409d714c) at ../../libguile/threads.c:929
#17 0x400658d2 in c_body (d=0x40bd8284) at ../../libguile/continuations.c:473
#18 0x400e7242 in apply_catch_closure (clo=0x1, args=0x304) at ../../libguile/throw.c:146
#19 0x400fa942 in vm_debug_engine (vm=0x97f3f20, program=0x9198840, argv=0x40bd8160, nargs=4) at ../../libguile/vm-i-system.c:918
#20 0x400ea8aa in scm_c_vm_run (vm=0x97f3f20, program=0x9198840, argv=0x40bd8160, nargs=4) at ../../libguile/vm.c:560
#21 0x4006e08d in scm_call_4 (proc=0x9198840, arg1=0x404, arg2=0x993a680, arg3=0x993a670, arg4=0x993a660) at ../../libguile/eval.c:582
#22 0x400e7ba2 in scm_catch_with_pre_unwind_handler (key=0x404, thunk=0x993a680, handler=0x993a670, pre_unwind_handler=0x993a660) at ../../libguile/throw.c:86
#23 0x400e7c72 in scm_c_catch (tag=0x404, body=0x400658c0 <c_body>, body_data=0x40bd8284, handler=0x400658e0 <c_handler>, handler_data=0x40bd8284, pre_unwind_handler=0x400e75e0 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at ../../libguile/throw.c:213
#24 0x40065b9b in scm_i_with_continuation_barrier (body=0x400658c0 <c_body>, body_data=0x40bd8284, handler=0x400658e0 <c_handler>, handler_data=0x40bd8284, pre_unwind_handler=0x400e75e0 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at ../../libguile/continuations.c:450
#25 0x40065c73 in scm_c_with_continuation_barrier (func=0x400e6480 <really_spawn>, data=0x409d714c) at ../../libguile/continuations.c:491
#26 0x400e6e1c in scm_i_with_guile_and_parent (func=0x400e6480 <really_spawn>, data=0x409d714c, parent=0x97f3fe0) at ../../libguile/threads.c:741
#27 0x400e6eaf in spawn_thread (d=0x409d714c) at ../../libguile/threads.c:941
#28 0x40180e48 in GC_inner_start_routine () from /usr/lib/libgc.so.1
#29 0x4017b21c in GC_call_with_stack_base () from /usr/lib/libgc.so.1
#30 0x40180cd7 in GC_start_routine () from /usr/lib/libgc.so.1
#31 0x40462955 in start_thread () from /lib/i686/cmov/libpthread.so.0
#32 0x403dc10e in clone () from /lib/i686/cmov/libc.so.6
(gdb) 

Going by the tail of check-guile.log, which I have saved off for every
day in September, this happened every day except the 1st, 3rd, 6th, 9th
and 11th.  So it's not completely deterministic, but happens more often
than not.

When the build hangs in `make check', it doesn't get as far as running
benchmarks, and so there is no data for the relevant date at
http://ossau.homelinux.net:8000/~neil/benchmark-results/master/.  So the
pattern of missing dates on that page suggests that this problem might
have been introduced by a commit on 16th August.

Any ideas?

    Neil




* Re: Hang in threads.test
  2010-09-18 21:03 Hang in threads.test Neil Jerram
@ 2010-09-22 20:23 ` Neil Jerram
  2010-09-22 20:44   ` Neil Jerram
  2010-09-23 22:40   ` Neil Jerram
  0 siblings, 2 replies; 14+ messages in thread
From: Neil Jerram @ 2010-09-22 20:23 UTC
  To: guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> My nightly build of master, on a relatively slow old machine, is
> hanging, on most nights, in `make check'.

FWIW, I realized that my backtraces are very similar to those reported
by Dale here:
http://www.mail-archive.com/bug-guile@gnu.org/msg05066.html.

So I'll try to answer Ludo's question:

> The other thread should have called sem_post() to release this one.  Can
> you print the value of ‘GC_suspend_ack_sem’?

Till then...

        Neil




* Re: Hang in threads.test
  2010-09-22 20:23 ` Neil Jerram
@ 2010-09-22 20:44   ` Neil Jerram
  2010-09-23 22:40   ` Neil Jerram
  1 sibling, 0 replies; 14+ messages in thread
From: Neil Jerram @ 2010-09-22 20:44 UTC
  To: guile-devel, Ludovic Courtès

Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> My nightly build of master, on a relatively slow old machine, is
>> hanging, on most nights, in `make check'.
>
> FWIW, I realized that my backtraces are very similar to those reported
> by Dale here:
> http://www.mail-archive.com/bug-guile@gnu.org/msg05066.html.
>
> So I'll try to answer Ludo's question:
>
>> The other thread should have called sem_post() to release this one.  Can
>> you print the value of ‘GC_suspend_ack_sem’?

0x4001d424 in __kernel_vsyscall ()
(gdb) p GC_suspend_ack_sem
$1 = 0
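
For context on why that value is interesting: judging by the backtraces
earlier in this thread, the collecting thread waits in sem_wait() inside
GC_stop_world, while each stopped thread sits in sigsuspend() inside
GC_suspend_handler_inner, and Ludo's question implies that a stopped
thread is expected to sem_post() an acknowledgement first.  The sketch
below imitates only that handshake with plain POSIX semaphores.  It is
not libgc code: `ack_sem', `resume_sem', `parked_worker' and
`stop_and_resume_world' are names made up for illustration, and ordinary
threads stand in for the signal handlers.

```c
#include <pthread.h>
#include <semaphore.h>

#define N_WORKERS 2

static sem_t ack_sem;     /* stand-in for the role of GC_suspend_ack_sem */
static sem_t resume_sem;  /* hypothetical: releases the parked workers */

/* Each worker acknowledges the stop request, then parks until resumed. */
static void *
parked_worker (void *arg)
{
  (void) arg;
  sem_post (&ack_sem);
  sem_wait (&resume_sem);
  return NULL;
}

/* Returns the value of ack_sem once every acknowledgement has been
   consumed, or -1 if setup failed.  */
int
stop_and_resume_world (void)
{
  pthread_t workers[N_WORKERS];
  int started = 0, value = -1, i;

  if (sem_init (&ack_sem, 0, 0) != 0)
    return -1;
  if (sem_init (&resume_sem, 0, 0) != 0)
    {
      sem_destroy (&ack_sem);
      return -1;
    }

  for (i = 0; i < N_WORKERS; i++)
    if (pthread_create (&workers[started], NULL, parked_worker, NULL) == 0)
      started++;

  /* The "collector" blocks here until every worker has acknowledged.  */
  for (i = 0; i < started; i++)
    sem_wait (&ack_sem);

  sem_getvalue (&ack_sem, &value);

  /* Let the parked workers go, then reap them.  */
  for (i = 0; i < started; i++)
    sem_post (&resume_sem);
  for (i = 0; i < started; i++)
    pthread_join (workers[i], NULL);

  sem_destroy (&resume_sem);
  sem_destroy (&ack_sem);
  return started == N_WORKERS ? value : -1;
}
```

In this sketch the counter is back at 0 once all acknowledgements have
been consumed, so a value of 0 by itself probably cannot distinguish
"every acknowledgement was consumed" from "an acknowledgement was never
posted".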




* Re: Hang in threads.test
  2010-09-22 20:23 ` Neil Jerram
  2010-09-22 20:44   ` Neil Jerram
@ 2010-09-23 22:40   ` Neil Jerram
  2010-09-24  9:33     ` Ludovic Courtès
  2011-02-11  9:20     ` Andy Wingo
  1 sibling, 2 replies; 14+ messages in thread
From: Neil Jerram @ 2010-09-23 22:40 UTC
  To: guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> My nightly build of master, on a relatively slow old machine, is
>> hanging, on most nights, in `make check'.
>
> FWIW, I realized that my backtraces are very similar to those reported
> by Dale here:
> http://www.mail-archive.com/bug-guile@gnu.org/msg05066.html.

I think I have a reliable workaround for this, attached below.  My
nightly build will now use this patch, and hopefully that'll mean that
it reliably produces benchmark data again.

The hang seems to be caused by one thread (A) running (gc) at the same
time as another thread (B) is doing GC_malloc_atomic.  The third thread
in the backtrace is the signal delivery thread, and not involved.

But in the "mutex with owner not retained (bug #27450)" test there is no
thread B, so where does it come from?  It's left over from the "locking
mutex on behalf of other thread" test, two tests previously.  Adding
(join-thread t) to that earlier test means that the thread has to run
and complete before we get to the (gc) test.
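
For readers more at home in C than in Scheme, the effect of the
workaround can be sketched with plain pthreads.  This is only an analogy
under stated assumptions: `leftover_worker' and `run_worker_then_join'
are made-up names, and nothing here is Guile or libgc code.  The point
is just that joining forces the earlier thread to finish before the
next step starts.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the thread left over from the earlier test. */
static void *
leftover_worker (void *arg)
{
  int *done = arg;
  *done = 1;  /* stands in for whatever work the thread still has to do */
  return NULL;
}

/* Start the worker and wait for it before moving on.  Returns 1 if the
   worker had finished by the time we continue, or -1 on error.  */
int
run_worker_then_join (void)
{
  pthread_t t;
  int done = 0;

  if (pthread_create (&t, NULL, leftover_worker, &done) != 0)
    return -1;

  /* Analogue of (join-thread t): without this, the worker could still be
     running when the next step (the one that calls (gc)) begins.  */
  if (pthread_join (t, NULL) != 0)
    return -1;

  return done;
}
```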

My libgc is Debian version 1:7.1-3, so possibly this is a known libgc
issue that's already fixed upstream; I haven't tried to check that yet.

Regards,
        Neil


[-- Attachment #2: 0001-Workaround-for-hang-in-threads.test.patch --]
[-- Type: text/x-diff, Size: 814 bytes --]

From 6013922c08c35a4e1051d4481d5bb4580bda1430 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Thu, 23 Sep 2010 23:13:39 +0100
Subject: [PATCH] Workaround for hang in threads.test

---
 test-suite/tests/threads.test |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/test-suite/tests/threads.test b/test-suite/tests/threads.test
index 58a2eba..234fb73 100644
--- a/test-suite/tests/threads.test
+++ b/test-suite/tests/threads.test
@@ -358,7 +358,9 @@
 	  (let* ((m (make-mutex))
 		 (t (begin-thread 'foo)))
 	    (lock-mutex m #f t)
-	    (eq? (mutex-owner m) t)))
+            (let ((result (eq? (mutex-owner m) t)))
+              (join-thread t)
+              result)))
 
         (pass-if "locking mutex with no owner"
 	  (let ((m (make-mutex)))
-- 
1.7.1



* Re: Hang in threads.test
  2010-09-23 22:40   ` Neil Jerram
@ 2010-09-24  9:33     ` Ludovic Courtès
  2010-09-24 12:19       ` Mike Gran
  2011-02-11  9:20     ` Andy Wingo
  1 sibling, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2010-09-24  9:33 UTC
  To: guile-devel

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>>> My nightly build of master, on a relatively slow old machine, is
>>> hanging, on most nights, in `make check'.
>>
>> FWIW, I realized that my backtraces are very similar to those reported
>> by Dale here:
>> http://www.mail-archive.com/bug-guile@gnu.org/msg05066.html.
>
> I think I have a reliable workaround for this, attached below.  My
> nightly build will now use this patch, and hopefully that'll mean that
> it reliably produces benchmark data again.

Great, thanks for tracking this down!

> The hang seems to be caused by one thread (A) running (gc) at the same
> time as another thread (B) is doing GC_malloc_atomic.  The third thread
> in the backtrace is the signal delivery thread, and not involved.

[...]

> My libgc is Debian version 1:7.1-3, so possibly this is a known libgc
> issue that's already fixed upstream; I haven't tried to check that yet.

The Debian patches at
http://ftp.de.debian.org/debian/pool/main/libg/libgc/libgc_7.1-3.diff.gz
look harmless.

However, I cannot reproduce the problem with a stock 7.1 and a recent
CVS libgc on x86_64-linux-gnu.

Can you try running the attached program?  It stays at 200% CPU on my
dual-core machine (i.e., it does not hang).

Thanks,
Ludo’.


[-- Attachment #2: the file --]
[-- Type: text/x-csrc, Size: 535 bytes --]

#define GC_THREADS 1
#define GC_REDIRECT_TO_LOCAL 1

#include <gc/gc.h>

#include <pthread.h>
#include <stdlib.h>

void *a;

static void *
alloc_thread (void *data)
{
  while (1)
    a = GC_MALLOC_ATOMIC (123);
  return NULL;
}

static void *
gc_thread (void *data)
{
  while (1)
    GC_gcollect ();
  return NULL;
}

int
main (int argc, char *argv[])
{
  pthread_t alloc, gc;

  GC_INIT ();

  pthread_create (&alloc, NULL, alloc_thread, NULL);
  pthread_create (&gc, NULL, gc_thread, NULL);

  while (1);

  return EXIT_SUCCESS;
}


* Re: Hang in threads.test
  2010-09-24  9:33     ` Ludovic Courtès
@ 2010-09-24 12:19       ` Mike Gran
  2010-09-27 13:30         ` Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Gran @ 2010-09-24 12:19 UTC
  To: Ludovic Courtès, guile-devel

> However, I cannot reproduce the problem with a stock 7.1 and a recent
> CVS libgc on x86_64-linux-gnu.

FWIW, threads.test hangs on my box as well.  Before it was only
occasionally, but, lately it seems to happen all the time.

-Mike





* Re: Hang in threads.test
  2010-09-24 12:19       ` Mike Gran
@ 2010-09-27 13:30         ` Ludovic Courtès
  2010-09-27 20:58           ` Neil Jerram
  0 siblings, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2010-09-27 13:30 UTC
  To: guile-devel

Hi!

Mike Gran <spk121@yahoo.com> writes:

>> However, I cannot reproduce the problem with a stock 7.1 and a recent
>> CVS libgc on x86_64-linux-gnu.
>
> FWIW, threads.test hangs on my box as well.  Before it was only
> occasionally, but, lately it seems to happen all the time.

Could you run the program that I posted and report back?

Thanks,
Ludo’.





* Re: Hang in threads.test
  2010-09-27 13:30         ` Ludovic Courtès
@ 2010-09-27 20:58           ` Neil Jerram
  0 siblings, 0 replies; 14+ messages in thread
From: Neil Jerram @ 2010-09-27 20:58 UTC
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Could you run the program that I posted and report back?

That program runs fine for me, although at only 100% CPU because I have a
single core.

I also modified it to add continual thread creation and destruction
(attached), and it was still fine.

Any further ideas?  I'll also continue playing with this...

    Neil


[-- Attachment #2: gctest.c --]
[-- Type: text/x-csrc, Size: 732 bytes --]

#define GC_THREADS 1
#define GC_REDIRECT_TO_LOCAL 1

#include <gc/gc.h>

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>  /* usleep */

void *a;

static void *
alloc_thread (void *data)
{
  while (1)
    a = GC_MALLOC_ATOMIC (123);
  return NULL;
}

static void *
gc_thread (void *data)
{
  while (1)
    GC_gcollect ();
  return NULL;
}

static void *
creator_thread (void *data)
{
  pthread_t child;

  pthread_create (&child, NULL, creator_thread, NULL);
  usleep (random() % 10000);

  return NULL;
}

int
main (int argc, char *argv[])
{
  pthread_t alloc, gc;

  GC_INIT ();

  pthread_create (&alloc, NULL, alloc_thread, NULL);
  pthread_create (&gc, NULL, gc_thread, NULL);
  (void) creator_thread (NULL);

  while (1);

  return EXIT_SUCCESS;
}


* Re: Hang in threads.test
  2010-09-23 22:40   ` Neil Jerram
  2010-09-24  9:33     ` Ludovic Courtès
@ 2011-02-11  9:20     ` Andy Wingo
  2011-02-13  9:58       ` Neil Jerram
  1 sibling, 1 reply; 14+ messages in thread
From: Andy Wingo @ 2011-02-11  9:20 UTC
  To: Neil Jerram; +Cc: guile-devel

Hi Neil,

I realize this was an old message, but it didn't seem that the thread
came to a conclusion:

On Fri 24 Sep 2010 00:40, Neil Jerram <neil@ossau.uklinux.net> writes:

>> Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>>> My nightly build of master, on a relatively slow old machine, is
>>> hanging, on most nights, in `make check'.
>
> The hang seems to be caused by one thread (A) running (gc) at the same
> time as another thread (B) is doing GC_malloc_atomic.  The third thread
> in the backtrace is the signal delivery thread, and not involved.
>
> But in the "mutex with owner not retained (bug #27450)" test there is no
> thread B, so where does it come from?  It's left over from the "locking
> mutex on behalf of other thread" test, two tests previously.  Adding
> (join-thread t) to that earlier test means that the thread has to run
> and complete before we get to the (gc) test.

Is this fixed now for you?

Thanks,

Andy
-- 
http://wingolog.org/




* Re: Hang in threads.test
  2011-02-11  9:20     ` Andy Wingo
@ 2011-02-13  9:58       ` Neil Jerram
  2011-02-13 20:17         ` Andy Wingo
  0 siblings, 1 reply; 14+ messages in thread
From: Neil Jerram @ 2011-02-13  9:58 UTC
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

>>> Neil Jerram <neil@ossau.uklinux.net> writes:
>>>
>>>> My nightly build of master, on a relatively slow old machine, is
>>>> hanging, on most nights, in `make check'.
>>
>> The hang seems to be caused by one thread (A) running (gc) at the same
>> time as another thread (B) is doing GC_malloc_atomic.  The third thread
>> in the backtrace is the signal delivery thread, and not involved.
>>
>> But in the "mutex with owner not retained (bug #27450)" test there is no
>> thread B, so where does it come from?  It's left over from the "locking
>> mutex on behalf of other thread" test, two tests previously.  Adding
>> (join-thread t) to that earlier test means that the thread has to run
>> and complete before we get to the (gc) test.
>
> Is this fixed now for you?

No.  But that might be because the libgc on that machine - Debian
1:7.1-3 - is too old.  What is the latest recommendation for libgc
version?  README says "at least version 7.0", but I suspect that's out
of date.

       Neil




* Re: Hang in threads.test
  2011-02-13  9:58       ` Neil Jerram
@ 2011-02-13 20:17         ` Andy Wingo
  2011-02-13 21:12           ` Neil Jerram
  2011-02-14 10:06           ` Ludovic Courtès
  0 siblings, 2 replies; 14+ messages in thread
From: Andy Wingo @ 2011-02-13 20:17 UTC
  To: Neil Jerram; +Cc: guile-devel

On Sun 13 Feb 2011 10:58, Neil Jerram <neil@ossau.uklinux.net> writes:

> No.  But that might be because the libgc on that machine - Debian
> 1:7.1-3 - is too old.  What is the latest recommendation for libgc
> version?  README says "at least version 7.0", but I suspect that's out
> of date.

I think we are a little incoherent on that point, given that 6.8 works,
and there is no 7.2 release.  Also, while there are bugs like
https://savannah.gnu.org/bugs/?32436 that persist with CVS libgc, it's
tough to say.  I wonder actually if this bug is related to that one.

Andy
-- 
http://wingolog.org/




* Re: Hang in threads.test
  2011-02-13 20:17         ` Andy Wingo
@ 2011-02-13 21:12           ` Neil Jerram
  2011-02-14  8:19             ` Andy Wingo
  2011-02-14 10:06           ` Ludovic Courtès
  1 sibling, 1 reply; 14+ messages in thread
From: Neil Jerram @ 2011-02-13 21:12 UTC
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> On Sun 13 Feb 2011 10:58, Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> No.  But that might be because the libgc on that machine - Debian
>> 1:7.1-3 - is too old.  What is the latest recommendation for libgc
>> version?  README says "at least version 7.0", but I suspect that's out
>> of date.
>
> I think we are a little incoherent on that point, given that 6.8 works,
> and there is no 7.2 release.  Also, while there are bugs like
> https://savannah.gnu.org/bugs/?32436 that persist with CVS libgc, it's
> tough to say.  I wonder actually if this bug is related to that one.
>
> Andy

So I'd guess this problem, and more generally the question of which
libgc version is best, is something we can live with for 2.0 - until
more data points emerge.  Right?

       Neil




* Re: Hang in threads.test
  2011-02-13 21:12           ` Neil Jerram
@ 2011-02-14  8:19             ` Andy Wingo
  0 siblings, 0 replies; 14+ messages in thread
From: Andy Wingo @ 2011-02-14  8:19 UTC
  To: Neil Jerram; +Cc: guile-devel

On Sun 13 Feb 2011 22:12, Neil Jerram <neil@ossau.uklinux.net> writes:

> So I'd guess the question of which libgc version is best, is something
> we can live with for 2.0 - until more data points emerge.  Right?

Yes, I think that's (unfortunately) right.

Andy
-- 
http://wingolog.org/




* Re: Hang in threads.test
  2011-02-13 20:17         ` Andy Wingo
  2011-02-13 21:12           ` Neil Jerram
@ 2011-02-14 10:06           ` Ludovic Courtès
  1 sibling, 0 replies; 14+ messages in thread
From: Ludovic Courtès @ 2011-02-14 10:06 UTC
  To: guile-devel

Hi,

Andy Wingo <wingo@pobox.com> writes:

> On Sun 13 Feb 2011 10:58, Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> No.  But that might be because the libgc on that machine - Debian
>> 1:7.1-3 - is too old.  What is the latest recommendation for libgc
>> version?  README says "at least version 7.0", but I suspect that's out
>> of date.
>
> I think we are a little incoherent on that point, given that 6.8 works,

GC 6.8 cannot be detected by ‘configure’, so one has to play tricks to
actually use it.

Currently 6.8 can be used because there’s a small compatibility layer in
libguile/bdw-gc.h to make it possible.  We should remove it at some point.

Thanks,
Ludo’.



