From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Julian Graham"
Newsgroups: gmane.lisp.guile.devel
Subject: deadlock in scm_join_thread(_timed)
Date: Sun, 25 May 2008 01:33:39 -0400
Message-ID: <2bc5f8210805242233x3ac66a60r6d135abd1d8a80a5@mail.gmail.com>
To: guile-devel
List-Id: "Developers list for Guile, the GNU extensibility library"
Xref: news.gmane.org gmane.lisp.guile.devel:7267

Hi everyone,

While I was testing and debugging some of the SRFI-18 code that Neil
and I were working on, I noticed a deadlock that happens in
scm_join_thread_timed. I'm pretty sure it affects the 1.8 codebase as
well, although it's probably more common when doing timed joins.
Thread joining in Guile (1.9 or 1.8) works as follows:

1. If the target thread has exited, return.
2. Block on the target thread's join queue.
3. When woken (because of a pthread_cond_signal, a spurious pthreads
   wakeup, or, in 1.9, a timeout expiration), check the target
   thread's exit status -- if it has exited, return.
4. Otherwise, SCM_TICK.
5. Go to step 2.

The deadlock can happen if the thread exits during the tick, because
there's no check of the exit status before block_self is called again.
I'm pretty sure that moving step 1 into the beginning of the loop
would fix this -- I can submit a patch against 1.8, 1.9, or both. Let
me know what you guys would like.

Regards,
Julian
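
For illustration, the join loop described above, with the exit check moved to the top of every pass, could be sketched roughly as follows. This is a toy model on plain pthreads, not the Guile source: toy_thread, toy_join, toy_tick_fn and the other toy_* names are hypothetical, the tick hook merely stands in for SCM_TICK, and the sketch assumes the tick runs with the mutex released, which is what lets the target exit in the meantime.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a thread's join state; NOT Guile's real scm_i_thread. */
typedef struct {
    pthread_mutex_t lock;       /* protects the fields below */
    pthread_cond_t join_queue;  /* joiners block here (step 2) */
    bool exited;                /* set once the target has finished */
    int ticks;                  /* how often the joiner reached step 4 */
} toy_thread;

/* Hypothetical hook standing in for SCM_TICK; called with the lock released. */
typedef void (*toy_tick_fn)(toy_thread *t);

void toy_thread_init(toy_thread *t)
{
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->join_queue, NULL);
    t->exited = false;
    t->ticks = 0;
}

/* What the target does when it finishes: record the exit, wake any joiners. */
void toy_thread_mark_exited(toy_thread *t)
{
    pthread_mutex_lock(&t->lock);
    t->exited = true;
    pthread_cond_broadcast(&t->join_queue);
    pthread_mutex_unlock(&t->lock);
}

/* Join loop with the exit check (step 1) run at the top of every pass. */
void toy_join(toy_thread *t, toy_tick_fn tick)
{
    assert(t != NULL);
    pthread_mutex_lock(&t->lock);
    while (!t->exited) {                              /* step 1, every pass */
        pthread_cond_wait(&t->join_queue, &t->lock);  /* step 2 */
        if (t->exited)                                /* step 3 */
            break;
        t->ticks++;
        pthread_mutex_unlock(&t->lock);               /* step 4: tick unlocked */
        if (tick != NULL)
            tick(t);
        pthread_mutex_lock(&t->lock);                 /* step 5: loop again */
    }
    pthread_mutex_unlock(&t->lock);
}

/* Tick hook simulating the target exiting while the joiner is in its tick. */
static void exit_during_tick(toy_thread *t)
{
    toy_thread_mark_exited(t);
}

/* Helper thread: produces wakeups that are not exits until the target exits. */
static void *waker(void *arg)
{
    toy_thread *t = arg;
    bool done = false;
    while (!done) {
        pthread_mutex_lock(&t->lock);
        done = t->exited;
        pthread_cond_broadcast(&t->join_queue);
        pthread_mutex_unlock(&t->lock);
        sched_yield();
    }
    return NULL;
}

/* Runs the scenario and returns the tick count, or -1 if no thread started. */
int toy_demo_exit_during_tick(void)
{
    toy_thread t;
    pthread_t w;
    int ticks;

    toy_thread_init(&t);
    if (pthread_create(&w, NULL, waker, &t) != 0)
        return -1;
    toy_join(&t, exit_during_tick);
    pthread_join(w, NULL);
    ticks = t.ticks;
    pthread_cond_destroy(&t.join_queue);
    pthread_mutex_destroy(&t.lock);
    return ticks;
}
```

In this model, toy_demo_exit_during_tick() wakes the joiner without an exit, lets the target "exit" during the tick, and should return after exactly one tick because the loop condition re-reads the exit flag before waiting again. If the check ran only once ahead of the loop, the joiner would go straight back to pthread_cond_wait after the tick, and the broadcast sent during the tick would already have been missed.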