unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: "Julian Graham" <joolean@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: Race condition in threading code?
Date: Sat, 30 Aug 2008 19:05:17 -0400	[thread overview]
Message-ID: <2bc5f8210808301605v5a6376ffs98b58c848c2f64fa@mail.gmail.com> (raw)
In-Reply-To: <2bc5f8210808270614s3ddc6e9fued2ed9f95da15303@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1591 bytes --]

Okay, I think I know what the problem is: Part of the SRFI-18 thread
start / creation process involves contention for a mutex, and there's
a bug in fat_mutex_lock code that causes the locking thread to
sometimes miss an unlocking thread's notification that a mutex is
available.  So it's actually a mutex bug -- specifically, in the loop
code in fat_mutex_lock that ends with the following snippet:

      ...
          scm_i_pthread_mutex_unlock (&m->lock);
          SCM_TICK;
          scm_i_scm_pthread_mutex_lock (&m->lock);
        }
      block_self (m->waiting, mutex, &m->lock, timeout);

...which means that if the loop is entered while the mutex is still
locked but the owner unlocks it after the locking thread releases the
administrative lock to run the tick, the locking thread will sleep
forever because it doesn't re-check the state of the mutex.  I've made
a small change (blocking before doing the tick instead of after) that
seems to resolve the issue (so far no lock-ups using Han-Wen's x.test
for a couple of hours).  There's a patch attached.

(Sorry, should have noticed this earlier; the problem existed before
the changes I introduced to support SRFI-18...)


Regards,
Julian


On Wed, Aug 27, 2008 at 9:14 AM, Julian Graham <joolean@gmail.com> wrote:
>> I've seen `srfi-18.test' hang from time to time, but not often enough to
>> give me an incentive to nail it down.  :-(  I don't think it relates to
>> Han-Wen's GC changes.
>
>
> Crap, I'm seeing some lockups now, too.  Sorry, guys.  I'm debugging,
> but don't let that stop you from investigating as well.  ;)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Resolve-a-deadlock-caused-by-not-checking-mutex-stat.patch --]
[-- Type: text/x-diff; name=0001-Resolve-a-deadlock-caused-by-not-checking-mutex-stat.patch, Size: 1352 bytes --]

From 12a5c7ca5e0bec9386ee24b2e2320d1aa03e55d5 Mon Sep 17 00:00:00 2001
From: Julian Graham <julian@countyhell.(none)>
Date: Sat, 30 Aug 2008 19:03:21 -0400
Subject: [PATCH] Resolve a deadlock caused by not checking mutex state after calling
 `SCM_TICK'.

---
 libguile/ChangeLog |    5 +++++
 libguile/threads.c |    2 +-
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/libguile/ChangeLog b/libguile/ChangeLog
index e8d9362..40e2bb4 100644
--- a/libguile/ChangeLog
+++ b/libguile/ChangeLog
@@ -1,3 +1,8 @@
+2008-08-29  Julian Graham  <joolean@gmail.com>
+
+	* threads.c (fat_mutex_lock): Resolve a deadlock caused by not
+	checking mutex state after calling `SCM_TICK'.	
+
 2008-08-27  Ludovic Courtès  <ludo@gnu.org>
 
 	Fix builds `--without-threads'.  Reported by Han-Wen Nienhuys
diff --git a/libguile/threads.c b/libguile/threads.c
index 7e55f3b..8699fd0 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -1292,11 +1292,11 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
 		  break;
 		}
 	    }
+	  block_self (m->waiting, mutex, &m->lock, timeout);
 	  scm_i_pthread_mutex_unlock (&m->lock);
 	  SCM_TICK;
 	  scm_i_scm_pthread_mutex_lock (&m->lock);
 	}
-      block_self (m->waiting, mutex, &m->lock, timeout);
     }
   scm_i_pthread_mutex_unlock (&m->lock);
   return err;
-- 
1.5.4.3


  reply	other threads:[~2008-08-30 23:05 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-16 18:21 Race condition in threading code? Han-Wen Nienhuys
2008-08-16 18:42 ` Julian Graham
2008-08-16 18:45   ` Han-Wen Nienhuys
2008-08-26 20:23     ` Andy Wingo
2008-08-27  0:41       ` Han-Wen Nienhuys
2008-08-27  2:36       ` Han-Wen Nienhuys
2008-08-27  7:46       ` Ludovic Courtès
2008-08-27 13:14         ` Julian Graham
2008-08-30 23:05           ` Julian Graham [this message]
2008-08-31  1:49             ` Han-Wen Nienhuys
2008-08-31  2:54               ` Julian Graham
2008-08-31 13:52                 ` Han-Wen Nienhuys
2008-08-31 10:34               ` Ludovic Courtès
2008-08-31 12:58             ` Ludovic Courtès
2008-08-31 15:05               ` Julian Graham
2008-08-31 19:39                 ` Ludovic Courtès
2008-09-07  0:12                   ` Julian Graham
2008-09-08 12:37                     ` Ludovic Courtès
2008-08-31 20:08       ` Ludovic Courtès
2008-08-31 23:59         ` Han-Wen Nienhuys
2008-09-01  8:11           ` Ludovic Courtès
2008-09-01  0:18         ` Han-Wen Nienhuys
2008-09-03  4:56           ` Han-Wen Nienhuys
2008-09-04 18:12             ` Andy Wingo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2bc5f8210808301605v5a6376ffs98b58c848c2f64fa@mail.gmail.com \
    --to=joolean@gmail.com \
    --cc=guile-devel@gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).