unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
* bug#10225: Lock ordering mismatch
@ 2011-12-05 20:42 Andy Wingo
  2017-02-23 18:56 ` Andy Wingo
  0 siblings, 1 reply; 2+ messages in thread
From: Andy Wingo @ 2011-12-05 20:42 UTC (permalink / raw)
  To: 10225; +Cc: ludo

[-- Attachment #1: Type: text/plain, Size: 106 bytes --]

This message from Ludovic on 1 July indicates a problem in our threading
code that we need to fix.

Andy


[-- Attachment #2: Type: message/rfc822, Size: 8643 bytes --]

[-- Attachment #2.1.1: Type: text/plain, Size: 1009 bytes --]

Hello,

As seen in ccb80964cd7cd112e300c34d32f67125a6d6da9a, there’s a lock
ordering mismatch between ‘do_thread exit’ and ‘fat_mutex_lock’
wrt. ‘t->admin_mutex’ and ‘m->lock’.

I thought this commit solved the problem, but now I think it doesn’t
because it leaves a small window during which a mutex could be held by a
thread without being part of its `mutexes' list, thereby violating the
invariant tested at line 667:

        /* Since MUTEX is in `t->mutexes', T must be its owner.  */
        assert (scm_is_eq (m->owner, t->handle));

So I reverted the patch.

(The situation isn’t better without the patch since
“while ./check-guile srfi-18.test threads.test ; do : ; done”
eventually hits the assertion failure.)

I tried the attached patch, which is inelegant and leads to deadlocks
with canceled threads (namely the “cancel succeeds” test in
threads.test.)

IOW I would welcome fresh ideas to approach the problem.  :-)

Thanks,
Ludo’.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2.1.2: Type: text/x-patch, Size: 2536 bytes --]

	Modified libguile/threads.c
diff --git a/libguile/threads.c b/libguile/threads.c
index cbacfca..d537e0e 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -1353,12 +1353,24 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
   fat_mutex *m = SCM_MUTEX_DATA (mutex);
 
   SCM new_owner = SCM_UNBNDP (owner) ? scm_current_thread() : owner;
+  scm_i_thread *t =
+    scm_is_true (new_owner) ? SCM_I_THREAD_DATA (new_owner) : NULL;
   SCM err = SCM_BOOL_F;
 
   struct timeval current_time;
 
-  scm_i_scm_pthread_mutex_lock (&m->lock);
+#define LOCK					\
+  if (t != NULL)				\
+    scm_i_pthread_mutex_lock (&t->admin_mutex);	\
+  scm_i_pthread_mutex_lock (&m->lock)
+
+#define UNLOCK					\
+  scm_i_pthread_mutex_unlock (&m->lock);	\
+  if (t != NULL)				\
+    scm_i_pthread_mutex_unlock (&t->admin_mutex)
 
+
+  LOCK;
   while (1)
     {
       if (m->level == 0)
@@ -1367,22 +1379,12 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
 	  m->level++;
 
 	  if (SCM_I_IS_THREAD (new_owner))
-	    {
-	      scm_i_thread *t = SCM_I_THREAD_DATA (new_owner);
-
-	      scm_i_pthread_mutex_unlock (&m->lock);
-	      scm_i_pthread_mutex_lock (&t->admin_mutex);
-
-	      /* Only keep a weak reference to MUTEX so that it's not
-		 retained when not referenced elsewhere (bug #27450).
-		 The weak pair itself is eventually removed when MUTEX
-		 is unlocked.  Note that `t->mutexes' lists mutexes
-		 currently held by T, so it should be small.  */
-	      t->mutexes = scm_weak_car_pair (mutex, t->mutexes);
-
-	      scm_i_pthread_mutex_unlock (&t->admin_mutex);
-	      scm_i_pthread_mutex_lock (&m->lock);
-	    }
+	    /* Only keep a weak reference to MUTEX so that it's not
+	       retained when not referenced elsewhere (bug #27450).
+	       The weak pair itself is eventually removed when MUTEX
+	       is unlocked.  Note that `t->mutexes' lists mutexes
+	       currently held by T, so it should be small.  */
+	    t->mutexes = scm_weak_car_pair (mutex, t->mutexes);
 	  *ret = 1;
 	  break;
 	}
@@ -1425,13 +1427,18 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
 		}
 	    }
 	  block_self (m->waiting, mutex, &m->lock, timeout);
-	  scm_i_pthread_mutex_unlock (&m->lock);
-	  SCM_TICK;
-	  scm_i_scm_pthread_mutex_lock (&m->lock);
+
+	  /* UNLOCK; */
+	  /* SCM_TICK; */
+	  /* LOCK; */
 	}
     }
-  scm_i_pthread_mutex_unlock (&m->lock);
+
+  UNLOCK;
+
   return err;
+#undef LOCK
+#undef UNLOCK
 }
 
 SCM scm_lock_mutex (SCM mx)


[-- Attachment #3: Type: text/plain, Size: 27 bytes --]



-- 
http://wingolog.org/

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* bug#10225: Lock ordering mismatch
  2011-12-05 20:42 bug#10225: Lock ordering mismatch Andy Wingo
@ 2017-02-23 18:56 ` Andy Wingo
  0 siblings, 0 replies; 2+ messages in thread
From: Andy Wingo @ 2017-02-23 18:56 UTC (permalink / raw)
  To: 10225-done; +Cc: ludo

> As seen in ccb80964cd7cd112e300c34d32f67125a6d6da9a, there’s a lock
> ordering mismatch between ‘do_thread exit’ and ‘fat_mutex_lock’
> wrt. ‘t->admin_mutex’ and ‘m->lock’.

Happily with the thread-safety rewrite in 2.1.5 this is fixed!

Andy





^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2017-02-23 18:56 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-05 20:42 bug#10225: Lock ordering mismatch Andy Wingo
2017-02-23 18:56 ` Andy Wingo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).