unofficial mirror of guile-devel@gnu.org 
* Module system thread unsafety and .go compilation
From: Taylan Ulrich Bayırlı/Kammer @ 2016-02-09 20:02 UTC
  To: bug-guix; +Cc: guile-devel

To speed up the compilation of the many Scheme files in Guix, we use a
script that first loads all modules to be compiled into the Guile
process (by calling 'resolve-interface' on the module names), and then
the corresponding Scheme files are compiled in a par-for-each.
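
A minimal sketch of such a two-phase script, for illustration only.  It
is not the actual Guix build script; the helper 'file->module-name' and
the procedure 'compile-all' are hypothetical names, and the sketch
assumes file names are given relative to the load path.

```scheme
(use-modules (ice-9 threads)         ; par-for-each
             (system base compile))  ; compile-file

;; Hypothetical helper: "gnu/build/activation.scm" -> (gnu build activation)
(define (file->module-name file)
  (map string->symbol
       (string-split (string-drop-right file (string-length ".scm"))
                     #\/)))

;; Hypothetical driver illustrating the two phases described above.
(define (compile-all files)
  ;; Phase 1 (serial): load every module, so that module-system
  ;; mutation is meant to happen here, in a single thread.
  (for-each (lambda (file)
              (resolve-interface (file->module-name file)))
            files)
  ;; Phase 2 (parallel): compile each file to a .go next to its source.
  (par-for-each (lambda (file)
                  (compile-file file
                                #:output-file
                                (string-append
                                 (string-drop-right file (string-length ".scm"))
                                 ".go")))
                files))
```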

While Guile's module system is known to be thread unsafe, the idea was
that all mutation should happen in the serial loading phase, and the
parallel compile-file calls should then be thread safe.

Sadly that assumption isn't met when autoloads are involved.
Minimal-ish test-case:

- Check out 0889321.

- Build it.

- Edit gnu/build/activation.scm and gnu/build/linux-boot.scm to contain
  merely the following expressions, respectively:

(define-module (gnu build activation)
  #:use-module (gnu build linux-boot))

(define-module (gnu build linux-boot)
  #:autoload   (system base compile) (compile-file))

- Run make again.

If you're on a multi-core system, you will probably get an error saying
something weird like "no such language scheme".

Note: when you then run make *again* it succeeds.


Solution proposals:

1. s/par-for-each/for-each/.  Will make compilation slower on multi-core
   machines.  We would do the same for guix pull, which is a bit sad
   because it's so fast right now.  Very simple solution though.

2. We find out some partitioning of the Scheme modules such that there
   is minimal overlap in total loaded modules when the modules in one
   subset are each loaded by one Guile process.  Then each Guile process
   loads & compiles the modules in its given subset serially, but these
   Guile processes run in parallel.  This could speed things up even
   more than now because the module-loading phases of the processes
   would be parallel too.  It also has the side effect that the fewer
   cores you have, the less memory is consumed (because fewer Scheme
   modules are loaded into memory at once).  If someone (Ludo?) has a
   good general overview of Guix's module graph then maybe they can come
   up with a sensible partitioning of the modules, say into 4 subsets
   (maxing out benefits at quad-core), such that loading all modules in
   one subset loads a minimal number of modules that are outside that
   subset.  That should be the only challenging part of this solution.

3. We do nothing for now since this bug triggers rarely, and can be
   worked around by simply re-running make.  (We just have to hope that
   it doesn't trigger on guix pull or on clean builds after some commit;
   there's no "just rerun make" in guix pull or an automated build of
   Guix.)  AFAIU Wingo expressed motivation to make Guile's module
   system thread safe, so this problem would then truly disappear.

I think #2 is a pretty good solution.  The only thing worrying me is
that we might not be able to sensibly partition the Scheme modules
according to any simple logic that can be automated (like guix/ is one
subset, gnu/packages/ is another, etc.).  Maintaining the subsets
manually in the Makefile would be pretty ugly.  But maybe some simple
logic, possibly combined with a few special cases in the code, would be
good enough.
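
As a rough sketch of the "simple logic" idea, the files could be grouped
by directory prefix.  The rule in 'subset-key' below is only an assumed
example (gnu/packages/ as one subset, otherwise the top-level
directory); it does not measure how much the loaded-module sets of the
subsets overlap, which is the hard part.

```scheme
(use-modules (srfi srfi-1))   ; fold-right

;; Assumed partitioning rule, for illustration only.
(define (subset-key file)
  (if (string-prefix? "gnu/packages/" file)
      "gnu/packages"
      (car (string-split file #\/))))

;; Group FILES into an alist of (KEY FILE ...) entries, keeping the
;; original order of the files within each subset.
(define (partition-files files)
  (fold-right
   (lambda (file result)
     (let ((key (subset-key file)))
       (if (assoc key result)
           ;; Prepend FILE to the existing entry for KEY.
           (map (lambda (entry)
                  (if (string=? (car entry) key)
                      (cons key (cons file (cdr entry)))
                      entry))
                result)
           ;; First file seen for KEY: start a new entry.
           (cons (list key file) result))))
   '()
   files))
```

Each resulting subset would then be loaded and compiled serially by its
own Guile process, with the processes running in parallel.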

Thoughts?

Taylan




* Re: bug#22608: Module system thread unsafety and .go compilation
From: Ludovic Courtès @ 2016-02-10 13:50 UTC
  To: Taylan Ulrich "Bayırlı/Kammer"; +Cc: 22608, guile-devel

taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:

> Sadly that assumption isn't met when autoloads are involved.
> Minimal-ish test-case:
>
> - Check out 0889321.
>
> - Build it.
>
> - Edit gnu/build/activation.scm and gnu/build/linux-boot.scm to contain
>   merely the following expressions, respectively:
>
> (define-module (gnu build activation)
>   #:use-module (gnu build linux-boot))
>
> (define-module (gnu build linux-boot)
>   #:autoload   (system base compile) (compile-file))
>
> - Run make again.
>
> If you're on a multi-core system, you will probably get an error saying
> something weird like "no such language scheme".

Do you have a clear explanation of why this happens?  I would expect
(system base compile) to already be loaded for instance, so it’s not
clear to me what’s going on.  Or is it just the mutation of (gnu build
linux-boot) that’s causing problems?

> Solution proposals:
>
> 1. s/par-for-each/for-each/.  Will make compilation slower on multi-core
>    machines.  We would do the same for guix pull, which is a bit sad
>    because it's so fast right now.  Very simple solution though.
>
> 2. We find out some partitioning of the Scheme modules such that there
>    is minimal overlap in total loaded modules when the modules in one
>    subset are each loaded by one Guile process.  Then each Guile process
>    loads & compiles the modules in its given subset serially, but these
>    Guile processes run in parallel.  This could speed things up even
>    more than now because the module-loading phases of the processes
>    would be parallel too.  It also has the side-effect that less memory
>    is consumed the fewer cores you have (because less Scheme modules
>    loaded into memory at once).  If someone (Ludo?)  has a good general
>    overview of Guix's module graph then maybe they can come up with a
>    sensible partitioning of the modules, say into 4 subsets (maxing out
>    benefits at quad-core), such that loading all modules in one subset
>    loads a minimal amount of modules that are outside that subset.  That
>    should be the only challenging part of this solution.
>
> 3. We do nothing for now since this bug triggers rarely, and can be
>    worked around by simply re-running make.  (We just have to hope that
>    it doesn't trigger on guix pull or on clean builds after some commit;
>    there's no "just rerun make" in guix pull or an automated build of
>    Guix.)  AFAIU Wingo expressed motivation to make Guile's module
>    system thread safe, so this problem would then truly disappear.

Short-term, I’d do #1 or #3; probably #1 though, because random failures
are no fun, and we know they can happen.

Longer-term, I’m not convinced by #2.  I think I would instead build
packages in reverse topological order, probably serially at first, which
would address <http://bugs.gnu.org/15602> (with the caveat that the (gnu
packages …) modules cannot be topologically-sorted, but OTOH they
typically don’t use macros, so we’re fine.)

That would require a tool to extract the ‘define-module’ forms and
build a graph from there.
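
A minimal sketch of what the extraction part of such a tool might look
like.  The procedure names are hypothetical, and it assumes that every
form up to and including ‘define-module’ can be read with the plain
reader and that imports appear only as #:use-module or #:autoload
clauses.

```scheme
;; Return the first 'define-module' form found in FILE, or #f.
(define (read-define-module file)
  (call-with-input-file file
    (lambda (port)
      (let loop ((form (read port)))
        (cond ((eof-object? form) #f)
              ((and (pair? form) (eq? (car form) 'define-module)) form)
              (else (loop (read port))))))))

;; Given FORM, a (define-module NAME OPTION ...) form, return the list
;; of module names it imports.
(define (module-imports form)
  (let loop ((options (cddr form))
             (imports '()))
    (cond ((or (null? options) (null? (cdr options)))
           (reverse imports))
          ((memq (car options) '(#:use-module #:autoload))
           (let ((spec (cadr options)))
             ;; SPEC is (a b c) or ((a b c) #:select ...).
             (loop (cddr options)
                   (cons (if (pair? (car spec)) (car spec) spec)
                         imports))))
          (else (loop (cdr options) imports)))))

;; One node of the graph: (MODULE-NAME IMPORT ...), or #f if FILE has
;; no 'define-module' form.
(define (file-dependencies file)
  (let ((form (read-define-module file)))
    (and form (cons (cadr form) (module-imports form)))))
```

Sorting the resulting graph topologically would then give the
compilation order.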

But really, we must fix <http://bugs.gnu.org/15602>, and in particular,
‘compile-file’ should not mutate the global module name space.  I think
we could do something like:

  (define (compile-file* …)
    (let ((root the-root-module)
          (compile-root (copy-module the-root-module)))
      (dynamic-wind
        (lambda ()
          (set! the-root-module compile-root)
          ;; ditto with the-scm-module
          )
        (lambda ()
          (compile-file …))
        (lambda ()
          (set! the-root-module root)
          ;; …
          ))))

It’s unclear how costly ‘copy-module’ would be, and the whole strategy
depends on it.

Eventually it seems clear that Guile proper needs to address this use
case, and needs to provide thread-safe modules.

Ludo’.





* Re: bug#22608: Module system thread unsafety and .go compilation
From: Ludovic Courtès @ 2018-07-03 22:10 UTC
  To: Taylan Ulrich "Bayırlı/Kammer"; +Cc: 22608, guile-devel

Hello,

taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:

> To speed up the compilation of the many Scheme files in Guix, we use a
> script that first loads all modules to be compiled into the Guile
> process (by calling 'resolve-interface' on the module names), and then
> the corresponding Scheme files are compiled in a par-for-each.
>
> While Guile's module system is known to be thread unsafe, the idea was
> that all mutation should happen in the serial loading phase, and the
> parallel compile-file calls should then be thread safe.
>
> Sadly that assumption isn't met when autoloads are involved.

For the record, these issues should be fixed in Guile 2.2.4:

533e3ff17 * Serialize accesses to submodule hash tables.
46bcbfa56 * Module import obarrays are accessed in a critical section.
761cf0fb8 * Make module autoloading thread-safe.

‘guix pull’ now defaults to 2.2.4, so we’ll see if indeed those crashes
disappear.
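
For code that wants to know whether it runs on a Guile that includes
these fixes, a version check along the following lines could be used.
This is only a sketch: the procedure names are hypothetical, and it
assumes that the 2.2.4 release number is a sufficient indicator.

```scheme
;; Running Guile's version as a list of integers, e.g. (2 2 4).
;; A component that is not a plain number is treated as 0.
(define (running-version)
  (map (lambda (s) (or (string->number s) 0))
       (list (major-version) (minor-version) (micro-version))))

;; Lexicographic comparison of two version lists.  A shorter list is
;; treated as older than a longer one it is a prefix of.
(define (version-list>=? a b)
  (cond ((null? b) #t)
        ((null? a) #f)
        ((> (car a) (car b)) #t)
        ((< (car a) (car b)) #f)
        (else (version-list>=? (cdr a) (cdr b)))))

;; Hypothetical predicate: does the running Guile have the fixes above?
(define (thread-safe-autoload?)
  (version-list>=? (running-version) '(2 2 4)))
```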

Ludo’.




* Re: bug#22608: Module system thread unsafety and .go compilation
From: Maxim Cournoyer @ 2022-10-08  0:21 UTC
  To: Ludovic Courtès
  Cc: Taylan Ulrich "Bayırlı/Kammer", 22608, guile-devel

Hi,

ludo@gnu.org (Ludovic Courtès) writes:

> Hello,
>
> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
>
>> To speed up the compilation of the many Scheme files in Guix, we use a
>> script that first loads all modules to be compiled into the Guile
>> process (by calling 'resolve-interface' on the module names), and then
>> the corresponding Scheme files are compiled in a par-for-each.
>>
>> While Guile's module system is known to be thread unsafe, the idea was
>> that all mutation should happen in the serial loading phase, and the
>> parallel compile-file calls should then be thread safe.
>>
>> Sadly that assumption isn't met when autoloads are involved.
>
> For the record, these issues should be fixed in Guile 2.2.4:
>
> 533e3ff17 * Serialize accesses to submodule hash tables.
> 46bcbfa56 * Module import obarrays are accessed in a critical section.
> 761cf0fb8 * Make module autoloading thread-safe.
>
> ‘guix pull’ now defaults to 2.2.4, so we’ll see if indeed those crashes
> disappear.

I think we haven't seen these in the last 4 years!  We still have
references to https://bugs.gnu.org/15602 in our code base, although the
upstream issue appears to have been fixed.  Could we remove the
workarounds now?

-- 
Thanks,
Maxim




* Re: bug#22608: Module system thread unsafety and .go compilation
From: Ludovic Courtès @ 2022-10-10  8:07 UTC
  To: Maxim Cournoyer; +Cc: 22608, guile-devel

Hi!

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> For the record, these issues should be fixed in Guile 2.2.4:
>>
>> 533e3ff17 * Serialize accesses to submodule hash tables.
>> 46bcbfa56 * Module import obarrays are accessed in a critical section.
>> 761cf0fb8 * Make module autoloading thread-safe.
>>
>> ‘guix pull’ now defaults to 2.2.4, so we’ll see if indeed those crashes
>> disappear.
>
> I think we haven't seen these in the last 4 years!  We still have
> references to https://bugs.gnu.org/15602 in our code base though;
> although the upstream issue appears to have been fixed.  Could we remove
> the workarounds now?

The module thread-safety issue discussed here appears to be fixed.

However the workarounds for <https://bugs.gnu.org/15602> must remain:
that specific issue is still there.

Thanks,
Ludo’.



