From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Andy Wingo
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#10225: Lock ordering mismatch
Date: Mon, 05 Dec 2011 21:42:59 +0100
Message-ID: <87liqqrbqk.fsf@pobox.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: ludo@gnu.org
To: 10225@debbugs.gnu.org
X-GNU-PR-Message: report 10225
X-GNU-PR-Package: guile
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)
List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language"
Xref: news.gmane.org gmane.lisp.guile.bugs:5956

--=-=-=

This message from Ludovic on 1 July indicates a problem in our threading
code that we need to fix.

Andy

--=-=-=
Content-Type: message/rfc822
Content-Disposition: inline

To: guile-devel@gnu.org
From: ludo@gnu.org (Ludovic Courtès)
Subject: Lock ordering mismatch
Date: Fri, 01 Jul 2011 23:11:00 +0200
Message-ID: <877h81emfv.fsf@gnu.org>
User-Agent: Gnus/5.110017 (No Gnus v0.17) Emacs/24.0.50 (gnu/linux)
List-Id: "Developers list for Guile, the GNU extensibility library"
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===-=-="

--===-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Hello,

As seen in ccb80964cd7cd112e300c34d32f67125a6d6da9a, there’s a lock
ordering mismatch between ‘do_thread_exit’ and ‘fat_mutex_lock’
wrt. ‘t->admin_mutex’ and ‘m->lock’.

I thought this commit solved the problem, but now I think it doesn’t
because it leaves a small window during which a mutex could be held by a
thread without being part of its `mutexes' list, thereby violating the
invariant tested at line 667:

  /* Since MUTEX is in `t->mutexes', T must be its owner.  */
  assert (scm_is_eq (m->owner, t->handle));

So I reverted the patch.  (The situation isn’t better without the patch
since “while ./check-guile srfi-18.test threads.test ; do : ; done”
eventually hits the assertion failure.)

I tried the attached patch, which is inelegant and leads to deadlocks
with canceled threads (namely the “cancel succeeds” test in
threads.test.)

IOW I would welcome fresh ideas to approach the problem.  :-)

Thanks,
Ludo’.

--===-=-=
Content-Type: text/x-patch
Content-Disposition: inline

	Modified libguile/threads.c
diff --git a/libguile/threads.c b/libguile/threads.c
index cbacfca..d537e0e 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -1353,12 +1353,24 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
   fat_mutex *m = SCM_MUTEX_DATA (mutex);
 
   SCM new_owner = SCM_UNBNDP (owner) ? scm_current_thread() : owner;
+  scm_i_thread *t =
+    scm_is_true (new_owner) ? SCM_I_THREAD_DATA (new_owner) : NULL;
   SCM err = SCM_BOOL_F;
 
   struct timeval current_time;
 
-  scm_i_scm_pthread_mutex_lock (&m->lock);
+#define LOCK                                            \
+  if (t != NULL)                                        \
+    scm_i_pthread_mutex_lock (&t->admin_mutex);         \
+  scm_i_pthread_mutex_lock (&m->lock)
+
+#define UNLOCK                                          \
+  scm_i_pthread_mutex_unlock (&m->lock);                \
+  if (t != NULL)                                        \
+    scm_i_pthread_mutex_unlock (&t->admin_mutex)
+
+  LOCK;
 
   while (1)
     {
       if (m->level == 0)
@@ -1367,22 +1379,12 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
           m->level++;
 
           if (SCM_I_IS_THREAD (new_owner))
-            {
-              scm_i_thread *t = SCM_I_THREAD_DATA (new_owner);
-
-              scm_i_pthread_mutex_unlock (&m->lock);
-              scm_i_pthread_mutex_lock (&t->admin_mutex);
-
-              /* Only keep a weak reference to MUTEX so that it's not
-                 retained when not referenced elsewhere (bug #27450).
-                 The weak pair itself is eventually removed when MUTEX
-                 is unlocked.  Note that `t->mutexes' lists mutexes
-                 currently held by T, so it should be small.  */
-              t->mutexes = scm_weak_car_pair (mutex, t->mutexes);
-
-              scm_i_pthread_mutex_unlock (&t->admin_mutex);
-              scm_i_pthread_mutex_lock (&m->lock);
-            }
+            /* Only keep a weak reference to MUTEX so that it's not
+               retained when not referenced elsewhere (bug #27450).
+               The weak pair itself is eventually removed when MUTEX
+               is unlocked.  Note that `t->mutexes' lists mutexes
+               currently held by T, so it should be small.  */
+            t->mutexes = scm_weak_car_pair (mutex, t->mutexes);
           *ret = 1;
           break;
         }
@@ -1425,13 +1427,18 @@ fat_mutex_lock (SCM mutex, scm_t_timespec *timeout, SCM owner, int *ret)
             }
         }
       block_self (m->waiting, mutex, &m->lock, timeout);
-      scm_i_pthread_mutex_unlock (&m->lock);
-      SCM_TICK;
-      scm_i_scm_pthread_mutex_lock (&m->lock);
+
+      /* UNLOCK; */
+      /* SCM_TICK; */
+      /* LOCK; */
     }
 }
-  scm_i_pthread_mutex_unlock (&m->lock);
+
+  UNLOCK;
+
   return err;
+#undef LOCK
+#undef UNLOCK
 }
 
 SCM scm_lock_mutex (SCM mx)

--===-=-=--

--=-=-=

-- 
http://wingolog.org/

--=-=-=--