From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ken Raeburn
Newsgroups: gmane.lisp.guile.devel
Subject: deadlock in current git version on error during initialization
Date: Wed, 26 Aug 2009 18:51:17 -0400
Message-ID: <40E40C4C-2F06-4E8B-8A15-4BEF4C93BB91@raeburn.org>
To: guile-devel
List-Id: "Developers list for Guile, the GNU extensibility library"

After the build-order problem I just reported causes a module to fail to load, the build process hangs here:

./guile_filter_doc_snarfage --filter-snarfage) > regex-posix.doc || { rm regex-posix.doc; false; }
cat alist.doc [...] regex-posix.doc | GUILE_AUTO_COMPILE=0 ../meta/uninstalled-env guile-tools snarf-check-and-output-texi > guile-procedures.texi || { rm guile-procedures.texi; false; }
ERROR: In procedure dynamic-link:
ERROR: file: "libguile-srfi-srfi-1-v-4", message: "file not found"

I poked at it a bit and it looks like two threads are waiting on mutexes:

(gdb) thr apply all bt

Thread 2 (process 49791 thread 0xf03):
#0 0x9486e2ce in semaphore_wait_signal_trap ()
#1 0x94875da5 in pthread_mutex_lock ()
#2 0x0032ae66 in scm_i_init_thread_for_guile (base=0xb0080f3c, parent=0x2d2c0) at ../../libguile/threads.c:694
#3 0x0032af33 in scm_i_with_guile_and_parent (func=0x32b870, data=0xbfff83ec, parent=0x9486e2ce) at ../../libguile/threads.c:848
#4 0x0032b1ca in spawn_thread (d=0xb0080f3c) at ../../libguile/threads.c:1000
#5 0x9489f155 in _pthread_start ()
#6 0x9489f012 in thread_start ()

Thread 1 (process 49791 thread 0x72f):
#0 0x9487546e in __semwait_signal ()
#1 0x948a03e6 in _pthread_cond_wait ()
#2 0x9489fdcd in pthread_cond_wait$UNIX2003 ()
#3 0x0032bc98 in scm_pthread_cond_wait (cond=0x14e,
mutex=0xbfff8430) at ../../libguile/threads.c:1858
#4 0x0032bd69 in scm_spawn_thread (body=0x14e, body_data=0x14e, handler=0x14e, handler_data=0x14e) at ../../libguile/threads.c:1029
#5 0x002fec2f in start_signal_delivery_thread () at ../../libguile/scmsigs.c:219
#6 0x9489089e in pthread_once ()
#7 0x002fec95 in scm_i_ensure_signal_delivery_thread () at ../../libguile/scmsigs.c:231
#8 0x0032b025 in on_thread_exit (v=0x3718b0) at ../../libguile/threads.c:632
#9 0x948a2013 in _pthread_tsd_cleanup ()
#10 0x948a1bb5 in _pthread_exit ()
#11 0x00331250 in scm_handle_by_message (handler_data=0x0, tag=, args=0x11310d0) at ../../libguile/throw.c:540
#12 0x0033172c in scm_ithrow (key=0x2fb20, args=0x11310d0, noreturn=1) at ../../libguile/throw.c:802
#13 0x0021ca5d in scm_error_scm (key=0x2fb20, subr=0x16aa80, message=0x16aac0, args=0x11310f0, data=0x4) at ../../libguile/error.c:93
#14 0x0021caf5 in scm_error (key=0x14e, subr=, message=, args=0x14e, rest=0x14e) at ../../libguile/error.c:59
#15 0x0021d028 in scm_misc_error (subr=0x14e, message=0x14e, args=0x14e) at ../../libguile/error.c:282
#16 0x00350ce7 in scm_dynamic_link (filename=0x16ac90) at ../../libguile/dynl.c:86
#17 0x0025f689 in load_extension (lib=0x16ac90, init=0x16ac30) at ../../libguile/extensions.c:105
#18 0x0025f701 in scm_load_extension (lib=0x14e, init=0x9487546e) at ../../libguile/extensions.c:152
#19 0x0024b40c in ceval (x=0x404, env=0x1131138) at eval.i.c:1346
#20 0x0025ba6a in scm_primitive_eval_x (exp=0x3df20) at ../../libguile/eval.c:4071
#21 0x002d08bb in scm_primitive_load (filename=0x15c080) at ../../libguile/load.c:125
#22 0x0024b40c in ceval (x=0x404, env=0x6cd630) at eval.i.c:1346
[...]
#90 0x0025a46d in ceval (x=0xd6c10, env=0x6e4458) at eval.i.c:1532
#91 0x0025b79b in scm_call_1 (proc=0xd6cd0, arg1=0x6e0018) at ../../libguile/eval.c:3124
#92 0x0025ba4f in scm_primitive_eval_x (exp=0xd6cd0) at ../../libguile/eval.c:4069
#93 0x002d08bb in scm_primitive_load (filename=0x3dca0) at ../../libguile/load.c:125
#94 0x002d2241 in scm_c_primitive_load_path (filename=0x14e) at ../../libguile/load.c:773
#95 0x002ca64e in scm_load_startup_files () at ../../libguile/init.c:291
#96 0x0032ae9d in scm_i_init_thread_for_guile (base=0x4, parent=0x0) at ../../libguile/threads.c:700
#97 0x0032af33 in scm_i_with_guile_and_parent (func=0x2ca6d0, data=0xbfffe7b0, parent=0x9487546e) at ../../libguile/threads.c:848
#98 0x0032afd9 in scm_with_guile (func=0x14e, data=0x14e) at ../../libguile/threads.c:831
#99 0x002ca6aa in scm_boot_guile (argc=334, argv=0x14e, main_func=0x14e, closure=0x14e) at ../../libguile/init.c:360
#100 0x00001ff1 in main (argc=334, argv=0x14e) at ../../libguile/guile.c:70

It appears that scm_i_init_thread_for_guile in thread 1 has the mutex scm_i_init_mutex locked until scm_i_init_guile returns. But many layers down, in some kind of cleanup handler, scm_spawn_thread (creating a thread in a thread-exit cleanup handler??) is waiting for some signal from the new thread that it's started up and initialized, and that thread can't initialize until it locks scm_i_init_mutex.

I get the impression that any unhandled error while loading the startup files might result in a deadlock. It should probably either cause process termination, or report some kind of error to the caller. And since we're talking about a library that can be linked into random applications, I'd prefer returning an error; after all, the application might be able to continue, or at least have its own way of reporting errors.

Ken
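
P.S. For illustration only, here is a minimal pthreads sketch of the cycle described above. It is not libguile code: init_mutex is a hypothetical stand-in for scm_i_init_mutex, and child_main and spawn_times_out are names made up for this example. The spawner holds the init mutex while waiting for the new thread to report that it is initialized, and the new thread needs that same mutex first. A timed wait is used in place of a plain pthread_cond_wait so that the sketch returns instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for scm_i_init_mutex. */
static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Handshake: the spawner waits here until the child reports "initialized". */
static pthread_mutex_t ready_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int child_ready;

/* The new thread must take init_mutex before it can finish initializing. */
static void *child_main(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&init_mutex);   /* blocks while the spawner holds it */
    pthread_mutex_unlock(&init_mutex);

    pthread_mutex_lock(&ready_mutex);
    child_ready = 1;
    pthread_cond_signal(&ready_cond);
    pthread_mutex_unlock(&ready_mutex);
    return NULL;
}

/* Spawn a thread and wait up to timeout_ms for it to report ready.
   Returns 1 if the wait timed out (this would be a hang with an untimed
   wait), 0 if the child reported ready, -1 if the thread was not created. */
int spawn_times_out(int hold_init_mutex, long timeout_ms)
{
    pthread_t tid;
    struct timespec deadline;
    int rc = 0;
    int timed_out;

    pthread_mutex_lock(&ready_mutex);
    child_ready = 0;
    pthread_mutex_unlock(&ready_mutex);

    if (hold_init_mutex)
        pthread_mutex_lock(&init_mutex);

    if (pthread_create(&tid, NULL, child_main, NULL) != 0) {
        if (hold_init_mutex)
            pthread_mutex_unlock(&init_mutex);
        return -1;
    }

    /* Absolute deadline on CLOCK_REALTIME, the default condvar clock. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ready_mutex);
    while (!child_ready && rc == 0)
        rc = pthread_cond_timedwait(&ready_cond, &ready_mutex, &deadline);
    timed_out = !child_ready;
    pthread_mutex_unlock(&ready_mutex);

    /* Break the cycle so the child can finish and be joined. */
    if (hold_init_mutex)
        pthread_mutex_unlock(&init_mutex);
    pthread_join(tid, NULL);
    assert(child_ready);
    return timed_out;
}
```

With hold_init_mutex set, the wait can only end by timing out, which corresponds to the hang in the backtrace. With it clear, the child should normally report ready well before the deadline. Returning a status from spawn_times_out rather than blocking forever is the same kind of choice as the "report an error to the caller" option above.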