From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Linas Vepstas"
Newsgroups: gmane.lisp.guile.bugs
Subject: Many, many lock races
Date: Fri, 14 Nov 2008 18:46:32 -0600
Message-ID: <3ae3aa420811141646j25b46bf3wdc9be263d6e6612b@mail.gmail.com>
Reply-To: linasvepstas@gmail.com
To: bug-guile@gnu.org

I'm now going through the guile-1.8.5 code, and I see the potential for races leading to deadlocks in many dozens of places.

What I'm seeing is lots of this:

    scm_i_scm_pthread_mutex_lock(some_lock)
    do_something()
    scm_i_pthread_mutex_unlock(some_lock)

With the current set of #defines, this turns into the following:

    pthread_mutex_unlock(thread->heap_mutex);   // leave guile
    pthread_mutex_lock(some_lock)
    pthread_mutex_lock(thread->heap_mutex)      // enter guile
    do_something()
    pthread_mutex_unlock(some_lock)

The above is very clearly badly nested, and leads to a race with garbage collection, resulting in a deadlock. I hope this is "obvious" to the reader: ... right? ... but, to be clear, consider the following:

thread A:
    pthread_mutex_unlock(thread->heap_mutex);   // leave guile
    pthread_mutex_lock(some_lock)
    pthread_mutex_lock(thread->heap_mutex) {    // enter guile
        sleep, since thread C just grabbed this heap_mutex

thread B:
    in guile mode (i.e. its own heap_mutex is held)
    sleeping on some_lock, which A is holding.

thread C:
    scm_i_gc() {
        scm_i_thread_put_to_sleep() {
            scm_i_pthread_mutex_lock (thread A)
            scm_i_pthread_mutex_lock (thread B) {
                sleep, since thread B is already holding it.

and so A is waiting on C, which is waiting on B, which is waiting on A ...

I'm planning on going through all of these instances on a case-by-case basis, but there seem to be many dozens of them, and this will result in many dozens of patches.

Suggestions?

--linas
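
One possible way to break the A -> C -> B -> A cycle described above is to make sure a thread never blocks on its heap_mutex while it already holds some_lock. The following is only a sketch under stated assumptions, not a patch against Guile: heap_mutex[] stands in for each thread's thread->heap_mutex, and lock_from_guile_mode(), worker(), collector() and run_demo() are hypothetical names invented for this illustration. It uses pthread_mutex_trylock() to back off instead of sleeping, which trades the deadlock for some spinning.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define N_WORKERS 2
#define N_ITERS   1000

/* Stand-ins for each thread's thread->heap_mutex (hypothetical). */
static pthread_mutex_t heap_mutex[N_WORKERS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static pthread_mutex_t some_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;                 /* protected by some_lock */

/* Acquire `lock` starting from "guile mode" (heap held) and return with
 * both held, without ever sleeping on `heap` while `lock` is held. */
static void lock_from_guile_mode(pthread_mutex_t *heap, pthread_mutex_t *lock)
{
    pthread_mutex_unlock(heap);                 /* leave guile */
    for (;;) {
        pthread_mutex_lock(lock);
        if (pthread_mutex_trylock(heap) == 0)   /* enter guile, non-blocking */
            return;
        pthread_mutex_unlock(lock);             /* back off: the collector has our heap */
        sched_yield();
    }
}

static void *worker(void *arg)
{
    pthread_mutex_t *heap = arg;
    pthread_mutex_lock(heap);                   /* start in guile mode */
    for (int i = 0; i < N_ITERS; i++) {
        lock_from_guile_mode(heap, &some_lock);
        counter++;                              /* do_something() */
        pthread_mutex_unlock(&some_lock);
    }
    pthread_mutex_unlock(heap);
    return NULL;
}

/* Plays the role of thread C: take every heap mutex, as a collector would. */
static void *collector(void *arg)
{
    (void)arg;
    for (int round = 0; round < 100; round++) {
        for (int i = 0; i < N_WORKERS; i++)
            pthread_mutex_lock(&heap_mutex[i]);
        /* garbage collection would run here */
        for (int i = N_WORKERS - 1; i >= 0; i--)
            pthread_mutex_unlock(&heap_mutex[i]);
        sched_yield();
    }
    return NULL;
}

/* Runs the workers and the collector together; returns the final counter. */
int run_demo(void)
{
    pthread_t w[N_WORKERS], gc;
    counter = 0;
    for (int i = 0; i < N_WORKERS; i++)
        assert(pthread_create(&w[i], NULL, worker, &heap_mutex[i]) == 0);
    assert(pthread_create(&gc, NULL, collector, NULL) == 0);
    for (int i = 0; i < N_WORKERS; i++)
        pthread_join(w[i], NULL);
    pthread_join(gc, NULL);
    return counter;
}
```

Calling run_demo() from a main() and linking with -pthread should finish with the counter equal to N_WORKERS * N_ITERS. The back-off loop can spin while a collection is in progress, so whether it is acceptable would have to be judged case by case, in the same way as the individual patches mentioned above.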