From: ludo@gnu.org (Ludovic Courtès)
Newsgroups: gmane.lisp.guile.devel
Subject: Re: bug#22608: Module system thread unsafety and .go compilation
Date: Wed, 10 Feb 2016 14:50:42 +0100
Message-ID: <87egckn07h.fsf@gnu.org>
References: <8737t1k5yk.fsf@T420.taylan>
In-Reply-To: <8737t1k5yk.fsf@T420.taylan> (Taylan Ulrich "Bayırlı/Kammer"'s message of "Tue, 09 Feb 2016 21:02:27 +0100")
To: taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer")
Cc: 22608@debbugs.gnu.org, guile-devel@gnu.org

taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") skribis:

> Sadly that assumption isn't met when autoloads are involved.
> Minimal-ish test-case:
>
> - Check out 0889321.
>
> - Build it.
>
> - Edit gnu/build/activation.scm and gnu/build/linux-boot.scm to contain
>   merely the following expressions, respectively:
>
>     (define-module (gnu build activation)
>       #:use-module (gnu build linux-boot))
>
>     (define-module (gnu build linux-boot)
>       #:autoload (system base compile) (compile-file))
>
> - Run make again.
>
> If you're on a multi-core system, you will probably get an error saying
> something weird like "no such language scheme".

Do you have a clear explanation of why this happens?

I would expect (system base compile) to already be loaded for instance,
so it’s not clear to me what’s going on.  Or is it just the mutation of
(gnu build linux-boot) that’s causing problems?

> Solution proposals:
>
> 1. s/par-for-each/for-each/.  Will make compilation slower on multi-core
>    machines.
>    We would do the same for guix pull, which is a bit sad
>    because it's so fast right now.  Very simple solution though.
>
> 2. We find out some partitioning of the Scheme modules such that there
>    is minimal overlap in total loaded modules when the modules in one
>    subset are each loaded by one Guile process.  Then each Guile process
>    loads & compiles the modules in its given subset serially, but these
>    Guile processes run in parallel.  This could speed things up even
>    more than now because the module-loading phases of the processes
>    would be parallel too.  It also has the side-effect that less memory
>    is consumed the fewer cores you have (because less Scheme modules
>    loaded into memory at once).  If someone (Ludo?) has a good general
>    overview of Guix's module graph then maybe they can come up with a
>    sensible partitioning of the modules, say into 4 subsets (maxing out
>    benefits at quad-core), such that loading all modules in one subset
>    loads a minimal amount of modules that are outside that subset.  That
>    should be the only challenging part of this solution.
>
> 3. We do nothing for now since this bug triggers rarely, and can be
>    worked around by simply re-running make.  (We just have to hope that
>    it doesn't trigger on guix pull or on clean builds after some commit;
>    there's no "just rerun make" in guix pull or an automated build of
>    Guix.)  AFAIU Wingo expressed motivation to make Guile's module
>    system thread safe, so this problem would then truly disappear.

Short-term, I’d do #1 or #3; probably #1 though, because random failures
are no fun, and we know they can happen.

Longer-term, I’m not convinced by #2.  I think I would instead build
packages in reverse topological order, probably serially at first, which
would address (with the caveat that the (gnu packages …) modules cannot
be topologically-sorted, but OTOH they typically don’t use macros, so
we’re fine.)
That would require a tool to extract the ‘define-module’ forms and build
a graph from there.

But really, we must fix , and in particular, ‘compile-file’ should not
mutate the global module name space.  I think we could do something
like:

  (define (compile-file* …)
    (let ((root the-root-module)
          (compile-root (copy-module the-root-module)))
      (dynamic-wind
        (lambda ()
          (set! the-root-module compile-root)
          ;; ditto with the-scm-module
          )
        (lambda ()
          (compile-file …))
        (lambda ()
          (set! the-root-module root)
          ;; …
          ))))

It’s unclear how costly ‘copy-module’ would be, and the whole strategy
depends on it.

Eventually it seems clear that Guile proper needs to address this use
case, and needs to provide thread-safe modules.

Ludo’.
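
For illustration only, the tool mentioned above (extract the
‘define-module’ forms, build a graph, sort it) could be sketched roughly
as follows.  The procedure names ‘spec-name’, ‘module-dependencies’,
‘dependency-order’ and ‘file-module-form’ are hypothetical and not part
of Guile or Guix; the sketch only looks at #:use-module and #:autoload,
ignores modules outside the given set, and breaks cycles arbitrarily.

```scheme
;; Hypothetical helper: a #:use-module spec is either a module name such
;; as (gnu build utils) or an interface spec such as
;; ((srfi srfi-1) #:select (fold)).
(define (spec-name spec)
  (if (pair? (car spec)) (car spec) spec))

;; Return the module names that FORM, a 'define-module' form, refers to
;; through #:use-module or #:autoload.
(define (module-dependencies form)
  (let loop ((args (cddr form)) (deps '()))
    (cond ((null? args) (reverse deps))
          ((and (eq? (car args) #:use-module) (pair? (cdr args)))
           (loop (cddr args) (cons (spec-name (cadr args)) deps)))
          ((and (eq? (car args) #:autoload) (pair? (cdr args)))
           ;; The list of autoloaded bindings that follows the module
           ;; name is skipped by the 'else' clause below.
           (loop (cddr args) (cons (cadr args) deps)))
          (else (loop (cdr args) deps)))))

;; Return the names of the modules defined by FORMS, dependencies first.
(define (dependency-order forms)
  (define graph (make-hash-table))      ;module name -> dependencies
  (define seen (make-hash-table))
  (define order '())
  (define (visit name)
    (unless (hash-ref seen name)
      (hash-set! seen name #t)
      (for-each visit (hash-ref graph name '()))
      ;; Only report modules that are part of FORMS.
      (when (hash-get-handle graph name)
        (set! order (cons name order)))))
  (for-each (lambda (form)
              (hash-set! graph (cadr form) (module-dependencies form)))
            forms)
  (for-each (lambda (form) (visit (cadr form))) forms)
  (reverse order))

;; Hypothetical helper: return the first form of FILE if it is a
;; 'define-module' form, else #f.
(define (file-module-form file)
  (let ((form (call-with-input-file file read)))
    (and (pair? form) (eq? (car form) 'define-module) form)))

;; The two modules from the test case quoted above.
(define forms
  '((define-module (gnu build activation)
      #:use-module (gnu build linux-boot))
    (define-module (gnu build linux-boot)
      #:autoload (system base compile) (compile-file))))

(dependency-order forms)
;; => ((gnu build linux-boot) (gnu build activation))
```

Compiling the files in the order returned would mean each module’s
dependencies are already compiled when it is reached, at least for the
modules that do not form cycles.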