From: rixed@happyleptic.org
Newsgroups: gmane.lisp.guile.user
Subject: Re: debugging guile runtime
Date: Thu, 1 Sep 2011 13:32:07 +0200
Message-ID: <20110901113207.GA17194@ccellier.rd.securactive.lan>
References: <20110829125030.GB30079@ccellier.rd.securactive.lan> <877h5t70iw.fsf@pobox.com>
In-Reply-To: <877h5t70iw.fsf@pobox.com>
To: guile-user@gnu.org

> > #1 ports are not thread safe (and any other thread safety issues) ;
>
> In general I think this issue needs to be split up between issues with
> port buffers and other issues; while it might be helpful to you to have
> a tracker bug, it's not helpful to me to conflate things that require
> different fixes.
>
> So! As you say, not thread-safe. But can we fix it in 2.0? I am not
> sure. We can add a mutex onto the end of scm_t_port. But it seems that
> ignoring ABI compatibility might allow us to focus on the solution more
> easily.

I can't see what solution you envision that requires breaking the API.
Are you thinking about a more functional style API for ports?

I confess that the only solution I have envisaged so far is merely to
add a global lock around all port operations (I'm a little afraid of a
per-port mutex, but that's certainly because I spent the last 2 days
hunting a race condition in my C program :-))

Anyway, as I'm not very familiar with the runtime, I thought you (or
someone who is) would suggest the best solution to me; so I have not
tried anything yet :-)

> What is your target? How much are you willing to do yourself?

Ideally, I'd like the list to point me toward a fix that I could
implement quickly, so that the problem is at least solved and the
program I work on can go to production, thus effectively testing the
fix before committing it into guile. The alternative is for me to
rewrite some 15 lines of scheme into 150 lines of C (and suffer sarcasm
from my colleagues :-))

I might have a week or so to devote to this matter before my team
favors the other solution (do it in C). A full week of paid work on the
runtime! Too bad I'm a guile and scheme newbie, so unfortunately it's
probably roughly equivalent to 2 hours for any of you.

> > #2 fork may freeze in some occurrence ;
>
> I assume this is because of the port-table mutex bug that you posted
> earlier? We should be able to fix this with an atfork.

I was referring to the problem I posted a while ago about this small
scheme program that was deadlocking in open-pipe:

----
(use-modules (ice-9 popen)
             (ice-9 threads))

(define (repeat n f)
  (if (> n 0)
      (begin
        (f)
        (repeat (- n 1) f))))

(define (forever f)
  (f)
  (forever f))

(display "Spawn a thread that performs some writes\n")
(make-thread forever (lambda ()
                       (display "write...\n")))

(display "Now exec some processes...\n")
(forever (lambda ()
           (let ((pipe (open-input-pipe "sleep 0")))
             (close-pipe pipe))))
---

I can't reproduce the bug with guile 2, and honestly I can't say for
sure that I have seen it in the wild with guile 2 (although it's
frequent when the app runs with guile 1.8), but I have not performed
many tests with guile 2 yet. I'm about to upgrade one of our most used
test servers to guile 2 to see how it behaves, so we will quickly know
whether it's still relevant or not.

> > #3 the use of select prevent the extended app to open more than 1024
> > files ;
>
> I recall something about this; can you give a link to a bug? If there
> isn't one, can you file one?

I did not file a bug at savannah, but posted a patch here.
For the record, here is the only part of the patch that's still
relevant for v2.0.2:

diff --git a/libguile/fports.c b/libguile/fports.c
index 0b84d44..f19d291 100644
--- a/libguile/fports.c
+++ b/libguile/fports.c
@@ -49,7 +49,9 @@
 #ifdef HAVE_STRUCT_STAT_ST_BLKSIZE
 #include
 #endif
-
+#ifdef HAVE_POLL_H
+#include
+#endif
 #include
 #include
 
@@ -585,7 +587,14 @@ scm_fdes_to_port (int fdes, char *mode, SCM name)
 static int
 fport_input_waiting (SCM port)
 {
-#ifdef HAVE_SELECT
+#ifdef HAVE_POLL
+  int fdes = SCM_FSTREAM (port)->fdes;
+  struct pollfd pollfd = { fdes, POLLIN, 0 };
+  if (poll(&pollfd, 1, 0) < 0)
+    scm_syserror ("fport_input_waiting");
+  return pollfd.revents & POLLIN ? 1 : 0;
+
+#elif defined(HAVE_SELECT)
   int fdes = SCM_FSTREAM (port)->fdes;
   struct timeval timeout;
   SELECT_TYPE read_set;

The patch for 1.8 was much heavier though, since select was used here
and there within the runtime (by fport_wait_for_input, which has
vanished, and by scm_accept, which does not use select any more).

With the above patch a guile 2 user can open more than 1024 files (as
long as he does not explicitly call select, of course). I'm using this
and it seems to work (again, not heavily tested with guile 2).

> > #4 fork does not close all open files.
>
> This won't change in 2.0. You can do something in an atfork, but... I'm
> not sure this is the right thing. The POSIX behavior was
> well-considered, and we should be hesitant to change it without a good
> reason.

Yes, this was discussed already; I was wrong and the current behavior
is correct (yet I still think POSIX is weird here, but that's another
matter).

> > #5 new syntax definitions are not loaded by compiler
>
> Hmm?

Also already discussed. Mark convinced me that this is not a bug and
that I should stop using load for loading code and use the module
system instead.
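
Coming back to #3 for a moment: in case it helps whoever reviews the
patch, here is the same zero-timeout poll check as a standalone sketch
outside of guile. The names input_waiting and demo are made up for this
example and are not part of libguile; the only point is that poll takes
the descriptor by value, whereas select stores it in an fd_set that is
limited to FD_SETSIZE entries (typically 1024).

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not a libguile function.
   Returns 1 if fd has input ready to read, 0 if not,
   and -1 if poll itself failed (errno is set by poll). */
int input_waiting(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Timeout of 0: return immediately instead of blocking. */
    if (poll(&pfd, 1, 0) < 0)
        return -1;

    return (pfd.revents & POLLIN) ? 1 : 0;
}

/* Hypothetical self-check using a pipe: the read end has nothing
   waiting until a byte is written to the write end.
   Returns 0 when both observations match, -1 otherwise. */
int demo(void)
{
    int fds[2];
    int before, after;

    if (pipe(fds) != 0)
        return -1;

    before = input_waiting(fds[0]);   /* nothing written yet */

    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    after = input_waiting(fds[0]);    /* one byte is pending */

    close(fds[0]);
    close(fds[1]);

    return (before == 0 && after == 1) ? 0 : -1;
}
```

Unlike the patch, this sketch reports a poll failure through its return
value rather than calling scm_syserror, simply so that it builds
without libguile.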