From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#22608: Module system thread unsafety and .go compilation
Date: Wed, 10 Feb 2016 14:50:42 +0100
To: Taylan Ulrich "Bayırlı/Kammer"
Cc: 22608@debbugs.gnu.org, guile-devel@gnu.org

taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") wrote:

> Sadly that assumption isn't met when autoloads are involved.
> Minimal-ish test-case:
>
> - Check out 0889321.
>
> - Build it.
>
> - Edit gnu/build/activation.scm and gnu/build/linux-boot.scm to contain
>   merely the following expressions, respectively:
>
>     (define-module (gnu build activation)
>       #:use-module (gnu build linux-boot))
>
>     (define-module (gnu build linux-boot)
>       #:autoload (system base compile) (compile-file))
>
> - Run make again.
>
> If you're on a multi-core system, you will probably get an error saying
> something weird like "no such language scheme".

Do you have a clear explanation of why this happens?  I would expect
(system base compile) to already be loaded for instance, so it’s not
clear to me what’s going on.  Or is it just the mutation of (gnu build
linux-boot) that’s causing problems?

> Solution proposals:
>
>   1. s/par-for-each/for-each/.  Will make compilation slower on multi-core
>      machines.  We would do the same for guix pull, which is a bit sad
>      because it's so fast right now.  Very simple solution though.
>
>   2. We find out some partitioning of the Scheme modules such that there
>      is minimal overlap in total loaded modules when the modules in one
>      subset are each loaded by one Guile process.  Then each Guile process
>      loads & compiles the modules in its given subset serially, but these
>      Guile processes run in parallel.  This could speed things up even
>      more than now because the module-loading phases of the processes
>      would be parallel too.  It also has the side-effect that less memory
>      is consumed the fewer cores you have (because less Scheme modules
>      loaded into memory at once).  If someone (Ludo?) has a good general
>      overview of Guix's module graph then maybe they can come up with a
>      sensible partitioning of the modules, say into 4 subsets (maxing out
>      benefits at quad-core), such that loading all modules in one subset
>      loads a minimal amount of modules that are outside that subset.  That
>      should be the only challenging part of this solution.
>
>   3. We do nothing for now since this bug triggers rarely, and can be
>      worked around by simply re-running make.  (We just have to hope that
>      it doesn't trigger on guix pull or on clean builds after some commit;
>      there's no "just rerun make" in guix pull or an automated build of
>      Guix.)  AFAIU Wingo expressed motivation to make Guile's module
>      system thread safe, so this problem would then truly disappear.

Short-term, I’d do #1 or #3; probably #1 though, because random failures
are no fun, and we know they can happen.

Longer-term, I’m not convinced by #2.  I think I would instead build
packages in reverse topological order, probably serially at first, which
would address (with the caveat that the (gnu packages …) modules cannot
be topologically-sorted, but OTOH they typically don’t use macros, so
we’re fine.)

That would require a tool to extract the ‘define-module’ forms and build
a graph from there.

But really, we must fix , and in particular, ‘compile-file’ should not
mutate the global module name space.  I think we could do something
like:

  (define (compile-file* …)
    (let ((root the-root-module)
          (compile-root (copy-module the-root-module)))
      (dynamic-wind
        (lambda ()
          (set! the-root-module compile-root)
          ;; ditto with the-scm-module
          )
        (lambda ()
          (compile-file …))
        (lambda ()
          (set! the-root-module root)
          ;; …
          ))))

It’s unclear how costly ‘copy-module’ would be, and the whole strategy
depends on it.

Eventually it seems clear that Guile proper needs to address this use
case, and needs to provide thread-safe modules.

Ludo’.
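
The tool mentioned above, one that extracts the ‘define-module’ forms and
builds a graph from them, could perhaps start from something like the
sketch below.  It is only an illustration under assumptions: the names
‘module-imports’ and ‘dependencies-first’ are hypothetical, the forms are
given as literal data rather than read from files, and only #:use-module
and #:autoload clauses are considered.  Modules caught in a cycle come out
in an arbitrary relative order, which matches the caveat about the
(gnu packages …) modules.

```scheme
(use-modules (srfi srfi-1))

;; Return the module names imported by FORM, a 'define-module'
;; S-expression.  Only #:use-module and #:autoload are looked at.
(define (module-imports form)
  (let loop ((clauses (cddr form))
             (imports '()))
    (cond ((null? clauses)
           (reverse imports))
          ((and (eq? (car clauses) #:use-module) (pair? (cdr clauses)))
           ;; The spec is either (a b c) or ((a b c) #:select ...).
           (let ((spec (cadr clauses)))
             (loop (cddr clauses)
                   (cons (if (pair? (car spec)) (car spec) spec)
                         imports))))
          ((and (eq? (car clauses) #:autoload) (pair? (cdr clauses)))
           ;; The binding list that follows is skipped by the 'else' branch.
           (loop (cddr clauses) (cons (cadr clauses) imports)))
          (else
           (loop (cdr clauses) imports)))))

;; GRAPH is an alist mapping a module name to the list of its imports.
;; Return the module names of GRAPH ordered so that each module comes
;; after the modules it imports (depth-first post-order).  Imports that
;; are not themselves keys of GRAPH are ignored.
(define (dependencies-first graph)
  (define visited '())
  (define order '())
  (define (visit name)
    (unless (member name visited)
      (set! visited (cons name visited))
      (for-each visit
                (filter (lambda (dep) (assoc dep graph))
                        (or (assoc-ref graph name) '())))
      (set! order (cons name order))))
  (for-each (lambda (entry) (visit (car entry))) graph)
  (reverse order))

;; The two forms from the test case quoted above.
(define forms
  '((define-module (gnu build activation)
      #:use-module (gnu build linux-boot))
    (define-module (gnu build linux-boot)
      #:autoload (system base compile) (compile-file))))

(define graph
  (map (lambda (form)
         (cons (cadr form) (module-imports form)))
       forms))

(dependencies-first graph)
;; => ((gnu build linux-boot) (gnu build activation))
```

Compiling the modules serially in that order would mean each module's
imports are compiled before the module itself; a real tool would read the
first form of each .scm file instead of using a literal list.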