From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel
Subject: Re: full moon, vm status update
Date: Thu, 16 Oct 2008 14:34:08 +0200
To: "Julian Graham"
Cc: guile-devel
In-Reply-To: <2bc5f8210810152235n2cfd65cau531d3a39da212a40@mail.gmail.com>
	(Julian Graham's message of "Thu, 16 Oct 2008 01:35:02 -0400")
List-Id: "Developers list for Guile, the GNU extensibility library"

Hi,

On Thu 16 Oct 2008 07:35, "Julian Graham" writes:

>> My current speculation is that when you compile --with-threads, as I
>> do, that the socketpair between the signal receiving thread and the
>> main thread is not closed after the fork, therefore signals in the
>> child might reach the parent or vice versa, causing random code to
>> run which itself might cause VM problems.

This loopy speculation was only part-right: it *was* async handling that
was causing tests with compiled popen.scm to fail, because we didn't
save the registers before running the asyncs. I have now fixed that. But
it does not appear to have been related to the socketpair.

> Well, yes, that's possible; the signal handling system certainly isn't
> aware of forking.
> But since only the thread calling fork() gets
> created in the child process (and the signal delivery thread should
> never fork), errant signal propagation would only be one-way -- in the
> child-to-parent direction.

Ah yes, good point. Still sounds like a bug though.

  scheme@(guile-user)> (sigaction SIGINT (lambda (x) (pk x (getpid))))
  $1 = (# . 268435456)
  scheme@(guile-user)> (define (fork-sleep-int)
                         (primitive-fork)
                         (pk 'after-fork (getpid))
                         (sleep 2)
                         (raise SIGINT))
  scheme@(guile-user)> (fork-sleep-int)

  ;;; (after-fork 878)

  ;;; (after-fork 932)
  scheme@(guile-user)>
  ;;; (2 878)
  scheme@(guile-user)>
  scheme@(guile-user)>
  ;;; (2 878)
  scheme@(guile-user)>

You might have to press enter a couple of times to get the second pk to
come through, I think.

It seems fork + signal handling is borked with a guile compiled
--with-threads -- and that's not specific to the vm branch.

> I think I'm missing a little context, though. Is forking (without
> doing an execve) something that happens during compilation of VM code?

No, I'm just running Guile's test suite with a compiled ice-9.
Socket.test does fork without exec, but even with an exec the fd table
is kept the same iirc.

> What should the behavior be in this situation? If the child process
> doesn't need async support, I guess it could close the write end of
> the signal pipe, but that seems... wrong, somehow.

Well, in popen.scm, we already close all of the fd's that guile knows
about in the child process. So it looks like in the popen case we're
just missing one, if my suspicions are right.

Then in a Guile child, we should probably re-spawn the thread to handle
signals? I have no idea here.

An interesting bug, eh. Who wants to fix it? (Ludo? :-)

Andy
-- 
http://wingolog.org/
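
A minimal sketch of the mechanism discussed above, namely that file
descriptors opened before a fork stay shared with the child, so anything
the child writes can surface in the parent. It uses an ordinary pipe
created with (pipe), not Guile's internal signal pipe, and the helper
name child-writes-parent-reads is hypothetical. It is only meant to
illustrate the child-to-parent direction, not to reproduce the bug.

```scheme
;; Hypothetical helper, not from the thread: shows that a pipe opened
;; before primitive-fork is still shared with the child afterwards.
(define (child-writes-parent-reads)
  (let* ((p (pipe))                 ; pair of ports: (read-end . write-end)
         (pid (primitive-fork)))
    (if (zero? pid)
        ;; Child: it inherited both ends, so this write lands in the
        ;; very pipe the parent is reading from.
        (begin
          (close-port (car p))
          (write-char #\x (cdr p))
          (force-output (cdr p))
          (primitive-_exit 0))
        ;; Parent: close our write end, then read what the child sent.
        (begin
          (close-port (cdr p))
          (let ((c (read-char (car p))))
            (close-port (car p))
            (waitpid pid)
            c)))))
```

Calling (child-writes-parent-reads) in the parent should hand back the
character the child wrote. If the child closed its copy of the write end
before doing anything else, as popen.scm does for the fds it knows
about, nothing from the child could arrive this way.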