From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Rob Browning
Newsgroups: gmane.lisp.guile.devel
Subject: Re: 1.8 make check failing in popen.test
Date: Wed, 30 Aug 2006 00:50:08 -0700
Message-ID: <87irka4q8f.fsf@raven.defaultvalue.org>
References: <87vepora1r.fsf@raven.defaultvalue.org>
	<87y7tqo0po.fsf@raven.defaultvalue.org>
	<87r6zhlhhk.fsf@zip.com.au>
	<87ejvho4ve.fsf@raven.defaultvalue.org>
	<87mza4uurn.fsf@zip.com.au>
	<87bqqieohh.fsf@raven.defaultvalue.org>
	<87pset1wxr.fsf@zip.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Original-To: guile-devel@gnu.org
In-Reply-To: <87pset1wxr.fsf@zip.com.au> (Kevin Ryde's message of "Tue, 22 Aug 2006 09:38:40 +1000")
User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)
List-Id: "Developers list for Guile, the GNU extensibility library"
Xref: news.gmane.org gmane.lisp.guile.devel:6044

Kevin Ryde writes:

> Looks about right.  What's the child process doing?  It's supposed to
> be writing to the parent to say continue.  (Unless it failed to fork
> there should be some child, either running or a zombie.)

(Consider the following info preliminary.  I haven't had any time to
try and figure out the actual cause, but since I just discovered this,
and I have to stop for the moment, I wanted to let everyone else have
a look.)

After further investigation, it appears that particular child might
not be running, at least not on some of the runs.  I switched back to
the original code (the code that would hang), added some debug
statements, ran strace -p -s 100, etc.
on "make check", and found that the child appears to be segfaulting at
least some of the time here (in popen.scm):

  (port-for-each (lambda (pt-entry)
                   ;;(dbg-out (list 'pt-entry pt-entry))
                   (false-if-exception
                    (let ((pt-fileno (fileno pt-entry)))
                      (if (not (or (= pt-fileno input-fdes)
                                   (= pt-fileno output-fdes)
                                   (= pt-fileno error-fdes)))
                          (close-fdes pt-fileno))))))

When I uncomment the dbg-out statement above (which just writes the
arg and a newline to an output-port and then forces the output), I see
this on the console:

  ERROR: popen.test: open-output-pipe: no duplicate - arguments: ((wrong-type-arg "list-copy" "Wrong type argument in position ~A: ~S" (1 (pt-entry . #)) ((pt-entry . #))))

this in the dbg-out output file:

  ...
  (pt-entry #)
  (pt-entry #)
  (pt-entry #)

and this in the strace (1402 is the forked child process):

  1402 write(7, "ERROR: popen.test: open-output-pipe: no duplicate - arguments: ((wrong-type-arg \"list-copy\" \"Wrong t"..., 263) = -1 EBADF (Bad file descriptor)
  1402 write(2, "ERROR", 5) = -1 EBADF (Bad file descriptor)
  1402 write(2, "\nException during displaying of ", 32) = -1 EBADF (Bad file descriptor)
  1402 write(7, "ERROR: popen.test: open-output-pipe: no duplicate - arguments: ((wrong-type-arg \"list-copy\" \"Wrong t"..., 263) = -1 EBADF (Bad file descriptor)
  1402 write(2, "ERROR", 5) = -1 EBADF (Bad file descriptor)
  1402 write(2, "\nException during displaying of ", 32) = -1 EBADF (Bad file descriptor)
  1402 exit_group(1) = ?

If I omit the dbg-out statement in the above code, then I can just see
the child die due to a SEGV in the strace log (2126 is the child):

  2126 close(12
  2123 <... close resumed> ) = 0
  2126 <... close resumed> ) = 0
  2123 access("/etc/ld.so.nohwcap", F_OK
  2126 close(10
  2123 <... access resumed> ) = -1 ENOENT (No such file or directory)
  2126 <... close resumed> ) = 0
  2123 open("/lib/tls/i686/cmov/libdl.so.2", O_RDONLY
  2126 close(29
  2123 <... open resumed> ) = 5
  2126 <... close resumed> ) = 0
  2123 read(5,
  2126 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

So I started from a clean tree, enabled core dumps, and here's what
gdb had to say about the resulting core:

  Program terminated with signal 11, Segmentation fault.
  #0  0x400729ca in scm_fileno (port=0x0) at ioext.c:180
  180       port = SCM_COERCE_OUTPORT (port);
  (gdb) where
  #0  0x400729ca in scm_fileno (port=0x0) at ioext.c:180
  #1  0x4005ad41 in ceval (x=0x404, env=0x40372710) at eval.c:4218
  #2  0x4005b26e in ceval (x=, env=0x40372710) at eval.c:3634

In any case, as I said, consider all this preliminary.  For everything
but the core dump, I wasn't working from a clean tree.

-- 
Rob Browning
rlb @defaultvalue.org and @debian.org; previously @cs.utexas.edu
GPG starting 2002-11-03 = 14DD 432F AE39 534D  B592 F9A0 25C8 D377 8C7E 73A4

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel