unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* pthread crash on master
@ 2009-01-15 17:20 Greg Troxel
  2009-01-15 21:54 ` Neil Jerram
  0 siblings, 1 reply; 5+ messages in thread
From: Greg Troxel @ 2009-01-15 17:20 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 3884 bytes --]


(Thansk - now I can build master and branch-release_18.)

1.8 passes make check.  master fails with a core dump due to mutex abuse
in the socket test:

  http://autobuild.josefsson.org/guile/log-200901151131125464000.txt

here is a backtrace of the thread that crashed

(gdb) i thr
  4 process 69671  0x00007f7ffd7337ba in _lwp_park () from /usr/lib/libc.so.12
  3 process 135207  0x00007f7ffd732cfa in read () from /usr/lib/libc.so.12
  2 process 200743  0x00007f7ffd732e9a in kill () from /usr/lib/libc.so.12
* 1 process 266279  0x00007f7ffd7337ba in _lwp_park () from /usr/lib/libc.so.12
(gdb) thr 2
[Switching to thread 2 (process 200743)]#0  0x00007f7ffd732e9a in kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007f7ffd732e9a in kill () from /usr/lib/libc.so.12
#1  0x00007f7ffcd0a754 in pthread__errorfunc () from /usr/lib/libpthread.so.0
#2  0x00007f7ffcd08787 in pthread_mutex_init () from /usr/lib/libpthread.so.0
#3  0x00007f7ffdc928b5 in scm_leave_guile () at threads.c:433
#4  0x00007f7ffcd0ad76 in pthread_exit () from /usr/lib/libpthread.so.0
#5  0x00007f7ffcd0aebb in pthread__cancelled () from /usr/lib/libpthread.so.0
#6  0x00007f7ffcd096f0 in pthread_cond_wait () from /usr/lib/libpthread.so.0
#7  0x00007f7ffdc92962 in scm_pthread_cond_wait (cond=0x7f7ffcc01568, mutex=0x7f7ffbce8430) at threads.c:1832
#8  0x00007f7ffdc92e82 in block_self (queue=<value optimized out>, sleep_object=<value optimized out>, mutex=0x7f7ffbce8430, 
    waittime=0x0) at threads.c:264
#9  0x00007f7ffdc93c12 in fat_mutex_unlock (mutex=0x7f7ffc823ee0, cond=0x7f7ffc823bc0, waittime=0x0, relock=1)
    at threads.c:1462
#10 0x00007f7ffdc93d9e in scm_timed_wait_condition_variable (cv=0x7f7ffc823bc0, mx=0x7f7ffc823ee0, t=0x204) at threads.c:1656
#11 0x00007f7ffdc41d7e in ceval (x=<value optimized out>, env=0x7f7ffc82cf30) at eval.i.c:1360
#12 0x00007f7ffdc3f7f3 in scm_apply (proc=0x7f7ffc77e980, arg1=<value optimized out>, args=<value optimized out>)
    at eval.i.c:1854
#13 0x00007f7ffdc95339 in scm_c_catch (tag=0x104, body=0x7f7ffdc94e10 <scm_body_thunk>, body_data=0x7f7ff7fffa60, 
    handler=0x7f7ffdc94e00 <scm_handle_by_proc>, handler_data=0x7f7ff7fffa58, pre_unwind_handler=0, 
    pre_unwind_handler_data=0x7f7ff7fffa50) at throw.c:207
#14 0x00007f7ffdc95447 in scm_catch_with_pre_unwind_handler (key=0x1027, thunk=<value optimized out>, handler=0x7f7ffc77c8a0, 
    pre_unwind_handler=0x204) at throw.c:591
#15 0x00007f7ffdc9433b in really_launch (d=<value optimized out>) at threads.c:891
#16 0x00007f7ffdc3507a in c_body (d=0x7f7ff7fffc30) at continuations.c:349
#17 0x00007f7ffdc95339 in scm_c_catch (tag=0x104, body=0x7f7ffdc35070 <c_body>, body_data=0x7f7ff7fffc30, 
    handler=0x7f7ffdc35090 <c_handler>, handler_data=0x7f7ff7fffc30, 
    pre_unwind_handler=0x7f7ffdc94d20 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at throw.c:207
#18 0x00007f7ffdc35431 in scm_i_with_continuation_barrier (body=0x7f7ffdc35070 <c_body>, body_data=0x7f7ff7fffc30, 
    handler=0x7f7ffdc35090 <c_handler>, handler_data=0x7f7ff7fffc30, 
    pre_unwind_handler=0x7f7ffdc94d20 <scm_handle_by_message_noexit>, pre_unwind_handler_data=0x0) at continuations.c:325
#19 0x00007f7ffdc354d0 in scm_c_with_continuation_barrier (func=<value optimized out>, data=<value optimized out>)
    at continuations.c:367
#20 0x00007f7ffdc943f2 in scm_i_with_guile_and_parent (func=0x7f7ffdc942c0 <really_launch>, data=0x7f7fffffc6a0, 
    parent=0x7f7ffcc55340) at threads.c:842
#21 0x00007f7ffdc944e3 in launch_thread (d=0x7f7fffffc6a0) at threads.c:901
#22 0x00007f7ffcd0b662 in pthread_create () from /usr/lib/libpthread.so.0
#23 0x00007f7ffd753bd0 in swapcontext () from /usr/lib/libc.so.12
#24 0x00007f7ff8000000 in ?? ()
#25 0x0000000111110001 in ?? ()
#26 0x0000000033330003 in ?? ()
#27 0x0000000000000000 in ?? ()


[-- Attachment #2: Type: application/pgp-signature, Size: 193 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: pthread crash on master
  2009-01-15 17:20 pthread crash on master Greg Troxel
@ 2009-01-15 21:54 ` Neil Jerram
  2009-01-15 23:03   ` Neil Jerram
  0 siblings, 1 reply; 5+ messages in thread
From: Neil Jerram @ 2009-01-15 21:54 UTC (permalink / raw)
  To: Greg Troxel; +Cc: guile-devel

2009/1/15 Greg Troxel <gdt@ir.bbn.com>:
>
> (Thansk - now I can build master and branch-release_18.)
>
> 1.8 passes make check.  master fails with a core dump due to mutex abuse
> in the socket test:
>
>  http://autobuild.josefsson.org/guile/log-200901151131125464000.txt

I'm looking into this, as it is in approximately the same area as
http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=d2a510879be656b278a58e8110e4f222d0845847

      Neil




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: pthread crash on master
  2009-01-15 21:54 ` Neil Jerram
@ 2009-01-15 23:03   ` Neil Jerram
  2009-01-16 16:41     ` Greg Troxel
  0 siblings, 1 reply; 5+ messages in thread
From: Neil Jerram @ 2009-01-15 23:03 UTC (permalink / raw)
  To: Greg Troxel; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 605 bytes --]

2009/1/15 Neil Jerram <neiljerram@googlemail.com>:
> 2009/1/15 Greg Troxel <gdt@ir.bbn.com>:
>>
>> (Thansk - now I can build master and branch-release_18.)
>>
>> 1.8 passes make check.  master fails with a core dump due to mutex abuse
>> in the socket test:
>>
>>  http://autobuild.josefsson.org/guile/log-200901151131125464000.txt
>
> I'm looking into this, as it is in approximately the same area as
> http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=d2a510879be656b278a58e8110e4f222d0845847
>
>      Neil
>

Greg,

I believe I've fixed this, can you try out the attached patch?

        Neil

[-- Attachment #2: 0001-Don-t-try-to-unlock-already-unlocked-heap-mutex.patch --]
[-- Type: application/octet-stream, Size: 3639 bytes --]

From 636333d3b26cba21a7c8ed6afdbc5de5712dd73e Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Thu, 15 Jan 2009 22:36:28 +0000
Subject: [PATCH] Don't try to unlock already unlocked heap mutex

For each thread that goes into Guile mode, Guile pushes a cleanup
function, scm_leave_guile_cleanup, whose purpose is to execute
`scm_leave_guile ()' if the thread is terminated while in Guile mode.
The problem is that there are various places - like
scm_pthread_cond_wait, scm_without_guile and scm_std_select - where
the thread temporarily leaves Guile mode (which means unlocking the
heap mutex), and the cleanup function is still in place.  Therefore if
the thread is terminated at these places, the cleanup function ends up
trying to unlock a mutex (the heap mutex) which isn't actually locked.

* libguile/threads.h (scm_i_thread): New heap_mutex_locked_by_self field.

* libguile/threads.c (scm_enter_guile): Set heap_mutex_locked_by_self.
  (scm_leave_guile): Only unlock if heap_mutex_locked_by_self is 1.
  (guilify_self_1): Initialize heap_mutex_locked_by_self.
---
 libguile/threads.c |   18 +++++++++++++++---
 libguile/threads.h |    7 +++++++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/libguile/threads.c b/libguile/threads.c
index 1d497e7..27aad3d 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -409,6 +409,7 @@ scm_enter_guile (scm_t_guile_ticket ticket)
   if (t)
     {
       scm_i_pthread_mutex_lock (&t->heap_mutex);
+      t->heap_mutex_locked_by_self = 1;
       resume (t);
     }
 }
@@ -430,7 +431,11 @@ static scm_t_guile_ticket
 scm_leave_guile ()
 {
   scm_i_thread *t = suspend ();
-  scm_i_pthread_mutex_unlock (&t->heap_mutex);
+  if (t->heap_mutex_locked_by_self)
+    {
+      t->heap_mutex_locked_by_self = 0;
+      scm_i_pthread_mutex_unlock (&t->heap_mutex);
+    }
   return (scm_t_guile_ticket) t;
 }
 
@@ -491,6 +496,7 @@ guilify_self_1 (SCM_STACKITEM *base)
     abort ();
 
   scm_i_pthread_mutex_init (&t->heap_mutex, NULL);
+  t->heap_mutex_locked_by_self = 0;
   scm_i_pthread_mutex_init (&t->admin_mutex, NULL);
   t->clear_freelists_p = 0;
   t->gc_running_p = 0;
@@ -505,6 +511,7 @@ guilify_self_1 (SCM_STACKITEM *base)
   scm_i_pthread_setspecific (scm_i_thread_key, t);
 
   scm_i_pthread_mutex_lock (&t->heap_mutex);
+  t->heap_mutex_locked_by_self = 1;
 
   scm_i_pthread_mutex_lock (&thread_admin_mutex);
   t->next_thread = all_threads;
@@ -1992,9 +1999,14 @@ void
 scm_i_thread_sleep_for_gc ()
 {
   scm_i_thread *t = suspend ();
-  t->held_mutex = &t->heap_mutex;
+
+  /* Don't put t->heap_mutex in t->held_mutex here, because if the
+     thread is cancelled during the cond wait, the thread's cleanup
+     function (scm_leave_guile_cleanup) will handle unlocking the
+     heap_mutex, so we don't need to do that again in on_thread_exit.
+  */
   scm_i_pthread_cond_wait (&wake_up_cond, &t->heap_mutex);
-  t->held_mutex = NULL;
+
   resume (t);
 }
 
diff --git a/libguile/threads.h b/libguile/threads.h
index cbff648..66ddb6a 100644
--- a/libguile/threads.h
+++ b/libguile/threads.h
@@ -72,6 +72,13 @@ typedef struct scm_i_thread {
   */
   scm_i_pthread_mutex_t heap_mutex;
 
+  /* Boolean tracking whether the above mutex is currently locked by
+     this thread.  This is equivalent to whether or not the thread is
+     in "Guile mode".  This field doesn't need any protection because
+     it is only ever set or tested by the owning thread.
+  */
+  int heap_mutex_locked_by_self;
+
   /* The freelists of this thread.  Each thread has its own lists so
      that they can all allocate concurrently.
   */
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: pthread crash on master
  2009-01-15 23:03   ` Neil Jerram
@ 2009-01-16 16:41     ` Greg Troxel
  2009-01-17 22:28       ` Neil Jerram
  0 siblings, 1 reply; 5+ messages in thread
From: Greg Troxel @ 2009-01-16 16:41 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 240 bytes --]


  [2. application/octet-stream; 0001-Don-t-try-to-unlock-already-unlocked-heap-mutex.patch]...

With that patch, make check runs successfully.
log at

http://autobuild.josefsson.org/guile/log-200901161104820417000.txt

thanks for the fix.

[-- Attachment #2: Type: application/pgp-signature, Size: 193 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: pthread crash on master
  2009-01-16 16:41     ` Greg Troxel
@ 2009-01-17 22:28       ` Neil Jerram
  0 siblings, 0 replies; 5+ messages in thread
From: Neil Jerram @ 2009-01-17 22:28 UTC (permalink / raw)
  To: Greg Troxel; +Cc: guile-devel

2009/1/16 Greg Troxel <gdt@ir.bbn.com>:
> [2. application/octet-stream; 0001-Don-t-try-to-unlock-already-unlocked-heap-mutex.patch]...
>
> With that patch, make check runs successfully.
> log at
>
> http://autobuild.josefsson.org/guile/log-200901161104820417000.txt
>
> thanks for the fix.

That's good; I've pushed it to master now.  Thanks for testing.

          Neil




^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-01-17 22:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-15 17:20 pthread crash on master Greg Troxel
2009-01-15 21:54 ` Neil Jerram
2009-01-15 23:03   ` Neil Jerram
2009-01-16 16:41     ` Greg Troxel
2009-01-17 22:28       ` Neil Jerram

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).