From: "Linas Vepstas" <linasvepstas@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-user@gnu.org
Subject: Re: guile threading deadlock
Date: Sat, 8 Nov 2008 12:29:25 -0600
Message-ID: <3ae3aa420811081029y2d6334a1y84cef351e69b7de1@mail.gmail.com>
In-Reply-To: <87fxm25rrg.fsf@gnu.org>

Hi,

2008/11/8 Ludovic Courtès <ludo@gnu.org>:
> Hello!
>
> "Linas Vepstas" <linasvepstas@gmail.com> writes:
>
>> I've got a little deadlock problem w/ guile. Here's the pseudocode:

I've got a much, much simpler case, see below.

> Can you try to provide actual code to reproduce the problem? :-)
> That would be great.

Sure, but you won't enjoy debugging it. Or even building it.
Go to https://code.launchpad.net/~opencog-dev,
check out the branch called "staging", and build it.
Then cd to the directory opencog/scm and run load.sh.

> Did you compile Guile with thread support

Yes. You will observe that the stack traces I sent were
deadlocked in garbage collection? Without thread
support, how could things possibly deadlock?

Anyway, I have an even simpler variant, with only *one*
thread deadlocked in gc. Here's the scenario:

thread A:
    scm_init_guile();
    does some other stuff, then
    goes to sleep in select, waiting on socket input
    (as expected).

thread B:
    scm_init_guile() -- hangs here.

B deadlocks with the stack trace below:

#0  0xffffe425 in __kernel_vsyscall ()
#1  0xf7e60589 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2  0xf7e5bba6 in _L_lock_95 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0xf7e5b58a in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
#4  0xf7844464 in scm_i_thread_put_to_sleep () at threads.c:1615
#5  0xf77eeca9 in scm_i_gc (what=0xf786422e "cells") at gc.c:552
#6  0xf77eefed in scm_gc_for_newcell (freelist=0xf787984c, free_cells=0x99fa25c) at gc.c:509
#7  0xf7843bff in guilify_self_2 (parent=0xf76b0e70) at ../libguile/inline.h:115
#8  0xf7845a9b in scm_i_init_thread_for_guile (base=0xf3b8a000, parent=0xf76b0e70) at threads.c:578
#9  0xf7845d82 in scm_init_guile () at threads.c:682
#10 0xf796f928 in opencog::SchemeEval::thread_init (this=0x995bc38) at ...

I built guile, and added debug prints: one to
scm_enter_guile, which takes a lock, and one
to scm_leave_guile, which drops a lock. I also
put prints into scm_i_thread_put_to_sleep ().

The behaviour is very clear, and very simple:
when scm_init_guile is called in thread A, the result
is that it is in "guile mode", i.e. holding the lock --
the thread is created holding the lock. There's a series of
calls to leave..enter, which are always
paired up. Anyway, when thread A finally goes to
sleep waiting on input, it does so with its lock held.
Read libguile/threads.c:scm_enter_guile() to see
what I mean:

static void
scm_enter_guile (scm_t_guile_ticket ticket)
{
  scm_i_thread *t = (scm_i_thread *)ticket;
  if (t)
    {
      scm_i_pthread_mutex_lock (&t->heap_mutex);
      resume (t);
    }
}

while scm_leave_guile() does the symmetric opposite.
Anyway, at this point, thread A is sleeping, holding the
above lock, because the last guile-thing it did was to
call scm_enter_guile().

Then *later on*, thread B calls scm_init_guile(), and
hangs, very clearly in scm_i_thread_put_to_sleep().
The printfs reveal that the hang happens here:

  /* Signal all threads to go to sleep
   */
  scm_i_thread_go_to_sleep = 1;
  for (t = all_threads; t; t = t->next_thread)
    {
      scm_i_pthread_mutex_lock (&t->heap_mutex);
    }

Well, the for loop then gets stuck, waiting for the lock.
But the lock will never be granted, because thread A
is holding it, and is in permanent sleep. As a result,
thread B is blocked, forever, thus a deadlock.
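
The locking pattern described above can be modelled with plain
pthreads. This is only a toy sketch, not libguile code: a single
mutex stands in for thread A's heap_mutex, and every name starting
with model_ is made up for illustration. It uses
pthread_mutex_trylock() where the GC loop uses a blocking lock, so
that the program reports the problem instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for thread A's heap_mutex (not Guile's data). */
static pthread_mutex_t model_heap_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Handshake state, so the outcome does not depend on timing. */
static pthread_mutex_t model_sync = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t model_cond = PTHREAD_COND_INITIALIZER;
static int model_a_is_blocked;
static int model_a_may_exit;

/* Thread A: takes the mutex ("guile mode"), then blocks while holding it.
   The condition wait stands in for the select() on socket input. */
static void *
model_thread_a (void *unused)
{
  (void) unused;
  pthread_mutex_lock (&model_heap_mutex);

  pthread_mutex_lock (&model_sync);
  model_a_is_blocked = 1;
  pthread_cond_broadcast (&model_cond);
  while (!model_a_may_exit)
    pthread_cond_wait (&model_cond, &model_sync);
  pthread_mutex_unlock (&model_sync);

  pthread_mutex_unlock (&model_heap_mutex);
  return NULL;
}

/* Thread B's view: once A is blocked, try to take A's mutex the way the
   "put to sleep" loop does. Returns the pthread_mutex_trylock() result. */
static int
model_gc_probe (void)
{
  pthread_t a;
  int rc;

  model_a_is_blocked = 0;
  model_a_may_exit = 0;
  rc = pthread_create (&a, NULL, model_thread_a, NULL);
  assert (rc == 0);

  /* Wait until A holds the mutex and has gone to sleep. */
  pthread_mutex_lock (&model_sync);
  while (!model_a_is_blocked)
    pthread_cond_wait (&model_cond, &model_sync);
  pthread_mutex_unlock (&model_sync);

  /* A blocking pthread_mutex_lock() here would never return. */
  rc = pthread_mutex_trylock (&model_heap_mutex);
  if (rc == 0)
    pthread_mutex_unlock (&model_heap_mutex);

  /* Let A finish so the program can exit cleanly. */
  pthread_mutex_lock (&model_sync);
  model_a_may_exit = 1;
  pthread_cond_broadcast (&model_cond);
  pthread_mutex_unlock (&model_sync);
  pthread_join (a, NULL);
  return rc;
}
```

In this model, model_gc_probe() returns EBUSY: the mutex is still
owned by the sleeping thread, which is the situation the for loop
runs into.
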

I'm somewhat stumped, because I can't imagine how
this code *ever* could have worked in the first place.
The deadlock seems really blatant to me. It seems
criminal for guile to *ever* return to the caller while
still holding a lock of any sort. But very clearly,
scm_init_guile() (and I guess most other calls) returns
to C code with a lock held. This is just begging for
deadlocks!
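
For contrast, here is a toy pthreads sketch of the behaviour I would
have expected: the thread drops its mutex before it blocks and takes
it again afterwards, like a leave..enter pair around the sleep. It is
a model only, not libguile code. The mutex stands in for a thread's
heap_mutex, the condition wait stands in for select(), and all of the
model_ names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for thread A's heap_mutex (not Guile's data). */
static pthread_mutex_t model_heap_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Handshake state, so the outcome does not depend on timing. */
static pthread_mutex_t model_sync = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t model_cond = PTHREAD_COND_INITIALIZER;
static int model_a_is_blocked;
static int model_a_may_exit;

/* Thread A: holds the mutex while "in guile mode", but releases it
   before blocking and re-acquires it after waking up. */
static void *
model_polite_thread_a (void *unused)
{
  (void) unused;
  pthread_mutex_lock (&model_heap_mutex);    /* "enter" */
  pthread_mutex_unlock (&model_heap_mutex);  /* "leave" before blocking */

  pthread_mutex_lock (&model_sync);
  model_a_is_blocked = 1;
  pthread_cond_broadcast (&model_cond);
  while (!model_a_may_exit)                  /* stands in for select() */
    pthread_cond_wait (&model_cond, &model_sync);
  pthread_mutex_unlock (&model_sync);

  pthread_mutex_lock (&model_heap_mutex);    /* "enter" again */
  pthread_mutex_unlock (&model_heap_mutex);
  return NULL;
}

/* Thread B's view: while A is blocked, try to take A's mutex.
   Returns the pthread_mutex_trylock() result. */
static int
model_gc_probe_after_leave (void)
{
  pthread_t a;
  int rc;

  model_a_is_blocked = 0;
  model_a_may_exit = 0;
  rc = pthread_create (&a, NULL, model_polite_thread_a, NULL);
  assert (rc == 0);

  /* Wait until A has released the mutex and gone to sleep. */
  pthread_mutex_lock (&model_sync);
  while (!model_a_is_blocked)
    pthread_cond_wait (&model_cond, &model_sync);
  pthread_mutex_unlock (&model_sync);

  rc = pthread_mutex_trylock (&model_heap_mutex);
  if (rc == 0)
    pthread_mutex_unlock (&model_heap_mutex);

  /* Wake A up and wait for it to finish. */
  pthread_mutex_lock (&model_sync);
  model_a_may_exit = 1;
  pthread_cond_broadcast (&model_cond);
  pthread_mutex_unlock (&model_sync);
  pthread_join (a, NULL);
  return rc;
}
```

Here model_gc_probe_after_leave() returns 0: the second thread gets
the mutex even though the first one is still asleep.
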
--linas