all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* [bug#36630] [PATCH] guix: parallelize building the manual-database
@ 2019-07-12 21:42 Arne Babenhauserheide
  2019-07-15 16:12 ` Ludovic Courtès
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-07-12 21:42 UTC (permalink / raw)
  To: 36630; +Cc: Arne Babenhauserheide

* guix/profiles.scm (manual-database): par-map over the entries.  This
distributes the load roughly equally over all cores and avoids blocking on
I/O.  The order of the entries stays the same since write-mandb-database sorts
them.
---
 guix/profiles.scm | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/guix/profiles.scm b/guix/profiles.scm
index f5c863945c..374f0f8a90 100644
--- a/guix/profiles.scm
+++ b/guix/profiles.scm
@@ -1312,15 +1312,11 @@ the entries in MANIFEST."
         #~(begin
             (use-modules (guix man-db)
                          (guix build utils)
+                         (ice-9 threads)
                          (srfi srfi-1)
                          (srfi srfi-19))

-            (define (compute-entries)
-              ;; This is the most expensive part (I/O and CPU, due to
-              ;; decompression), so report progress as we traverse INPUTS.
-              (let* ((inputs '#$(manifest-inputs manifest))
-                     (total  (length inputs)))
-                (append-map (lambda (directory count)
+            (define (compute-entry directory count total)
                               (format #t "\r[~3d/~3d] building list of \
 man-db entries..."
                                       count total)
@@ -1330,8 +1326,16 @@ man-db entries..."
                                 (if (directory-exists? man)
                                     (mandb-entries man)
                                     '())))
-                            inputs
-                            (iota total 1))))
+
+            (define (compute-entries)
+              ;; This is the most expensive part (I/O and CPU, due to
+              ;; decompression), so report progress as we traverse INPUTS.
+              (let* ((inputs '#$(manifest-inputs manifest))
+                     (total  (length inputs)))
+                (apply append (par-map compute-entry
+                                       inputs
+                                       (iota total 1)
+                                       (make-list total total)))))

             (define man-directory
               (string-append #$output "/share/man"))
--
2.22.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-12 21:42 [bug#36630] [PATCH] guix: parallelize building the manual-database Arne Babenhauserheide
@ 2019-07-15 16:12 ` Ludovic Courtès
  2019-07-15 23:32   ` Arne Babenhauserheide
  2019-10-23 19:59 ` [bug#36630] [PATCH] use two threads to build man-pages and secure output msgs with mutex Arne Babenhauserheide
  2019-10-27 16:27 ` Arne Babenhauserheide
  2 siblings, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2019-07-15 16:12 UTC (permalink / raw)
  To: Arne Babenhauserheide; +Cc: 36630

Hi Arne,

Arne Babenhauserheide <arne_bab@web.de> skribis:

> * guix/profiles.scm (manual-database): par-map over the entries.  This
> distributes the load roughly equally over all cores and avoids blocking on
> I/O.  The order of the entries stays the same since write-mandb-database sorts
> them.

I would think the whole process is largely I/O-bound.  Did you try
measuring differences?

I picked the manual-database derivation returned for:

  guix environment --ad-hoc jupyter python-ipython python-ipykernel -n

(It has 3,046 entries.)

On a SSD and with a hot cache, on my 4-core laptop, I get 74s with
‘master’, and 53s with this patch.

However, it will definitely not scale linearly, so we should probably
cap at 2 or 4 threads.  WDYT?

Another issue with the patch is that the [n/total] counter does not grow
monotically now: it might temporally go backwards.  Consequently, at
-v1, users will see a progress bar that hesitates and occasionally goes
backward, which isn’t great.

This would need to fix it with a mutex-protected global counter.

All in all, I’m not sure this is worth the complexity.

WDYT?

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-15 16:12 ` Ludovic Courtès
@ 2019-07-15 23:32   ` Arne Babenhauserheide
  2019-07-16 21:14     ` Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-07-15 23:32 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 36630

[-- Attachment #1: Type: text/plain, Size: 3022 bytes --]

Hi Ludo’,

Ludovic Courtès <ludo@gnu.org> writes:
>> * guix/profiles.scm (manual-database): par-map over the entries.  This
>> distributes the load roughly equally over all cores and avoids blocking on
>> I/O.  The order of the entries stays the same since write-mandb-database sorts
>> them.
>
> I would think the whole process is largely I/O-bound.  Did you try
> measuring differences?

I did not measure the difference in build-time, but I did check the
system load. Without this patch, one of my cores is under full
load. With this patch all 12 hyperthreads have a mean load of 50%.

> I picked the manual-database derivation returned for:
>   guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
> (It has 3,046 entries.)

How exactly did you run the derivation? I’d like to check it if you can
give me the exact commandline to run (a command I can run repeatedly).


> On a SSD and with a hot cache, on my 4-core laptop, I get 74s with
> ‘master’, and 53s with this patch.

I’m using a machine with 6 physical cores, hyperthreading, and an NVMe
M.2 disk, so it is likely that it would not be disk-bound for me at 4
threads.

> However, it will definitely not scale linearly, so we should probably
> cap at 2 or 4 threads.  WDYT?

Looking at the underlying action, this seems to be a task that scales
pretty well. It just unpacks files into the disk-cache.

It should also not consume much memory, so I don’t see a reason to
artificially limit the number of threads.

> Another issue with the patch is that the [n/total] counter does not grow
> monotically now: it might temporally go backwards.  Consequently, at
> -v1, users will see a progress bar that hesitates and occasionally goes
> backward, which isn’t great.

It typically jumps forward in the beginning and then stalls until the
first manual page is finished.

Since par-map uses a global queue of futures to process, and since the
output is the first part of (compute-entry …), I don’t expect the
progress to move backwards in ways a user sees: It could only move
backwards during the initial step where all threads start at the same
time, and there the initial output should be overwritten fast enough to
not be noticeable.

> This would need to fix it with a mutex-protected global counter.

A global counter would be pretty bad for scaling. As it is, this code
needs no communication between processes besides returning the final
result, so it behaves exactly like a normal map, aside from being
faster. So I’d prefer to accept the forward-jumping.

> All in all, I’m not sure this is worth the complexity.
>
> WDYT?

Given that building manual pages is the most timeconsuming part when
installing a small tool into my profile, I think it is worth the
complexity. Especially because most of the complexity is being taken
care of by (ice-9 threads par-map).

Best wishes,
Arne
--
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1076 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-15 23:32   ` Arne Babenhauserheide
@ 2019-07-16 21:14     ` Ludovic Courtès
  2019-07-17 22:06       ` Arne Babenhauserheide
  0 siblings, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2019-07-16 21:14 UTC (permalink / raw)
  To: Arne Babenhauserheide; +Cc: 36630

Hello,

Arne Babenhauserheide <arne_bab@web.de> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> I picked the manual-database derivation returned for:
>>   guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
>> (It has 3,046 entries.)
>
> How exactly did you run the derivation? I’d like to check it if you can
> give me the exact commandline to run (a command I can run repeatedly).

If you run the command above, it’ll list
/gnu/store/…-manual-database.drv.  So you can just run:

  guix build /gnu/store/…-manual-database.drv

or:

  guix build /gnu/store/…-manual-database.drv --check

if it had already been built before.

>> On a SSD and with a hot cache, on my 4-core laptop, I get 74s with
>> ‘master’, and 53s with this patch.
>
> I’m using a machine with 6 physical cores, hyperthreading, and an NVMe
> M.2 disk, so it is likely that it would not be disk-bound for me at 4
> threads.

The result may be entirely different with a spinning disk.  :-)

I’m not saying we should optimize for spinning disks, just that what you
see is at one end of the spectrum.

>> However, it will definitely not scale linearly, so we should probably
>> cap at 2 or 4 threads.  WDYT?
>
> Looking at the underlying action, this seems to be a task that scales
> pretty well. It just unpacks files into the disk-cache.
>
> It should also not consume much memory, so I don’t see a reason to
> artificially limit the number of threads.

On a many-core machine like we have in our build farm, with spinning
disks, I believe that using one thread per core would be
counterproductive.

>> Another issue with the patch is that the [n/total] counter does not grow
>> monotically now: it might temporally go backwards.  Consequently, at
>> -v1, users will see a progress bar that hesitates and occasionally goes
>> backward, which isn’t great.
>
> It typically jumps forward in the beginning and then stalls until the
> first manual page is finished.
>
> Since par-map uses a global queue of futures to process, and since the
> output is the first part of (compute-entry …), I don’t expect the
> progress to move backwards in ways a user sees: It could only move
> backwards during the initial step where all threads start at the same
> time, and there the initial output should be overwritten fast enough to
> not be noticeable.

Hmm, maybe.  I’m sure we’ll get reports saying this looks weird and
Something Must Absolutely Be Done About It.  :-)

But anyway, another issue is that we would need to honor
‘parallel-job-count’, which means using ‘n-par-map’, which doesn’t use
futures.

> Given that building manual pages is the most timeconsuming part when
> installing a small tool into my profile, I think it is worth the
> complexity. Especially because most of the complexity is being taken
> care of by (ice-9 threads par-map).

Just today I realized that the example above (with Jupyter) has so many
entries because of propagated inputs; in particular libxext along brings
1,000+ man pages.  We should definitely do something about these
packages.

Needs more thought…

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-16 21:14     ` Ludovic Courtès
@ 2019-07-17 22:06       ` Arne Babenhauserheide
  2019-07-18  8:55         ` Ludovic Courtès
  2019-07-18  8:57         ` Ludovic Courtès
  0 siblings, 2 replies; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-07-17 22:06 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 36630

[-- Attachment #1: Type: text/plain, Size: 1592 bytes --]

Hi,

Ludovic Courtès <ludo@gnu.org> skribis:
> Arne Babenhauserheide <arne_bab@web.de> skribis:
Offtopic: I love reading Esperanto here!
>> Ludovic Courtès <ludo@gnu.org> writes:

>>>   guix environment --ad-hoc jupyter python-ipython python-ipykernel -n

>> How exactly did you run the derivation?

> If you run the command above, it’ll list
> /gnu/store/…-manual-database.drv.  So you can just run:
>
>   guix build /gnu/store/…-manual-database.drv
>
> or:
>
>   guix build /gnu/store/…-manual-database.drv --check
>
> if it had already been built before.

Somehow I can’t get guix to actually run my changed code with this
command, so I’m not sure I tested the right thing.

What is the clean approach to run the profile.scm from git?

>>> On a SSD and with a hot cache, on my 4-core laptop, I get 74s with
>>> ‘master’, and 53s with this patch.
>>
>> I’m using a machine with 6 physical cores, hyperthreading, and an NVMe
>> M.2 disk, so it is likely that it would not be disk-bound for me at 4
>> threads.
>
> The result may be entirely different with a spinning disk.  :-)
>
> I’m not saying we should optimize for spinning disks, just that what you
> see is at one end of the spectrum.

That’s right, yes.

> But anyway, another issue is that we would need to honor
> ‘parallel-job-count’, which means using ‘n-par-map’, which doesn’t use
> futures.

Ouch, yes. That’s an issue … 

Thank you for bringing it up!

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1076 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-17 22:06       ` Arne Babenhauserheide
@ 2019-07-18  8:55         ` Ludovic Courtès
  2019-07-18  8:57         ` Ludovic Courtès
  1 sibling, 0 replies; 14+ messages in thread
From: Ludovic Courtès @ 2019-07-18  8:55 UTC (permalink / raw)
  To: Arne Babenhauserheide; +Cc: 36630

Hello,

Arne Babenhauserheide <arne_bab@web.de> skribis:

> Ludovic Courtès <ludo@gnu.org> skribis:
>> Arne Babenhauserheide <arne_bab@web.de> skribis:
> Offtopic: I love reading Esperanto here!
>>> Ludovic Courtès <ludo@gnu.org> writes:
>
>>>>   guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
>
>>> How exactly did you run the derivation?
>
>> If you run the command above, it’ll list
>> /gnu/store/…-manual-database.drv.  So you can just run:
>>
>>   guix build /gnu/store/…-manual-database.drv
>>
>> or:
>>
>>   guix build /gnu/store/…-manual-database.drv --check
>>
>> if it had already been built before.
>
> Somehow I can’t get guix to actually run my changed code with this
> command, so I’m not sure I tested the right thing.

Did you try the ‘guix environment -n’ command above?  Doesn’t it show
the manual-database.drv?

Alternately, you can also do something like:

  guix install -p /tmp/foo jupyter python-ipython python-ipykernel -n

HTH,
Ludo’.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-17 22:06       ` Arne Babenhauserheide
  2019-07-18  8:55         ` Ludovic Courtès
@ 2019-07-18  8:57         ` Ludovic Courtès
  2019-07-18 10:59           ` Arne Babenhauserheide
  1 sibling, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2019-07-18  8:57 UTC (permalink / raw)
  To: Arne Babenhauserheide; +Cc: 36630

Hello,

Arne Babenhauserheide <arne_bab@web.de> skribis:

> Ludovic Courtès <ludo@gnu.org> skribis:
>> Arne Babenhauserheide <arne_bab@web.de> skribis:
> Offtopic: I love reading Esperanto here!
>>> Ludovic Courtès <ludo@gnu.org> writes:
>
>>>>   guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
>
>>> How exactly did you run the derivation?
>
>> If you run the command above, it’ll list
>> /gnu/store/…-manual-database.drv.  So you can just run:
>>
>>   guix build /gnu/store/…-manual-database.drv
>>
>> or:
>>
>>   guix build /gnu/store/…-manual-database.drv --check
>>
>> if it had already been built before.
>
> Somehow I can’t get guix to actually run my changed code with this
> command, so I’m not sure I tested the right thing.

Did you try the ‘guix environment -n’ command above?  Doesn’t it show
the manual-database.drv?

Alternately, you can also do something like:

  guix install -p /tmp/foo jupyter python-ipython python-ipykernel -n

HTH,
Ludo’.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-18  8:57         ` Ludovic Courtès
@ 2019-07-18 10:59           ` Arne Babenhauserheide
  2019-07-18 13:46             ` Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-07-18 10:59 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 36630

[-- Attachment #1: Type: text/plain, Size: 550 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Did you try the ‘guix environment -n’ command above?  Doesn’t it show
> the manual-database.drv?

It does show the manual database, but then running guix build
/gnu/....drv --check does not run my changed code.

I’m doing

time ./pre-inst-env guix build /gnu/store/jnkxwwxk71n07fs6naa11fxmg3vpnnb3-manual-database.drv --check

But it runs the installed guix, not the local changes to profile.scm.

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1076 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-18 10:59           ` Arne Babenhauserheide
@ 2019-07-18 13:46             ` Ludovic Courtès
  2019-07-18 20:03               ` Arne Babenhauserheide
  0 siblings, 1 reply; 14+ messages in thread
From: Ludovic Courtès @ 2019-07-18 13:46 UTC (permalink / raw)
  To: Arne Babenhauserheide; +Cc: 36630

Arne Babenhauserheide <arne_bab@web.de> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Did you try the ‘guix environment -n’ command above?  Doesn’t it show
>> the manual-database.drv?
>
> It does show the manual database, but then running guix build
> /gnu/....drv --check does not run my changed code.
>
> I’m doing
>
> time ./pre-inst-env guix build /gnu/store/jnkxwwxk71n07fs6naa11fxmg3vpnnb3-manual-database.drv --check
>
> But it runs the installed guix, not the local changes to profile.scm.

Right, sorry for being unclear: you need to run

  ./pre-inst-env guix environment -n …

That will run you modified code and thus create a manual-database.drv
that uses your code; it’s this manual-database.drv that you should pass
to ‘guix build’.

HTH,
Ludo’.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-18 13:46             ` Ludovic Courtès
@ 2019-07-18 20:03               ` Arne Babenhauserheide
  2019-10-23 20:01                 ` Arne Babenhauserheide
  0 siblings, 1 reply; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-07-18 20:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 36630

[-- Attachment #1: Type: text/plain, Size: 1815 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Arne Babenhauserheide <arne_bab@web.de> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Did you try the ‘guix environment -n’ command above?  Doesn’t it show
>>> the manual-database.drv?
>>
>> It does show the manual database, but then running guix build
>> /gnu/....drv --check does not run my changed code.
>>
>> I’m doing
>>
>> time ./pre-inst-env guix build /gnu/store/jnkxwwxk71n07fs6naa11fxmg3vpnnb3-manual-database.drv --check
>>
>> But it runs the installed guix, not the local changes to profile.scm.
>
> Right, sorry for being unclear: you need to run
>
>   ./pre-inst-env guix environment -n …
>
> That will run you modified code and thus create a manual-database.drv
> that uses your code; it’s this manual-database.drv that you should pass
> to ‘guix build’.

That works now — thank you!

With the change:
185552 entries processed in 108.2 s
Before the change:
185552 entries processed in 220.1 s

The exact commands I’m running:
cd Dokumente/Guix/guix
git checkout master # with the change
./pre-inst-env guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
time guix build /gnu/...-manual-database.drv
git checkout 97bf46e64c11c64a968fdb833983ede6bdafbc00
./pre-inst-env guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
time guix build /gnu/...-manual-database.drv

So I also see roughly factor 2 speedup, which means a limit to 4 threads
should work.

(what I saw is that I only see the …manual-database.drv before I build
it, after it’s built, I no longer see it in the environment output)

How do I get the defined limit of cores and threads?

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1076 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] use two threads to build man-pages and secure output msgs with mutex
  2019-07-12 21:42 [bug#36630] [PATCH] guix: parallelize building the manual-database Arne Babenhauserheide
  2019-07-15 16:12 ` Ludovic Courtès
@ 2019-10-23 19:59 ` Arne Babenhauserheide
  2019-10-27 16:27 ` Arne Babenhauserheide
  2 siblings, 0 replies; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-10-23 19:59 UTC (permalink / raw)
  To: 36630; +Cc: Arne Babenhauserheide

---
 guix/profiles.scm | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/guix/profiles.scm b/guix/profiles.scm
index a0ae9a4c7e..9d2133df2d 100644
--- a/guix/profiles.scm
+++ b/guix/profiles.scm
@@ -1315,11 +1315,18 @@ the entries in MANIFEST."
                          (ice-9 threads)
                          (srfi srfi-1)
                          (srfi srfi-19))
+            ;; ensure conflict free user-output
+            (define status-output-mutex (make-mutex))
+
+            (define (status-message msg)
+              (lock-mutex status-output-mutex)
+              (display msg)
+              (unlock-mutex status-output-mutex))

             (define (compute-entry directory count total)
-                              (format #t "\r[~3d/~3d] building list of \
+                              (status-message (format #f "\r[~3d/~3d] building list of \
 man-db entries..."
-                                      count total)
+                                      count total))
                               (force-output)
                               (let ((man (string-append directory
                                                         "/share/man")))
@@ -1332,7 +1339,9 @@ man-db entries..."
               ;; decompression), so report progress as we traverse INPUTS.
               (let* ((inputs '#$(manifest-inputs manifest))
                      (total  (length inputs)))
-                (apply append (par-map compute-entry
+                ;; experimenting shows that two threads suffice to remove most
+                ;; waiting
+                (apply append (n-par-map 2 compute-entry
                                        inputs
                                        (iota total 1)
                                        (make-list total total)))))
--
2.23.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] guix: parallelize building the manual-database
  2019-07-18 20:03               ` Arne Babenhauserheide
@ 2019-10-23 20:01                 ` Arne Babenhauserheide
  2020-03-31 13:02                   ` bug#36630: " Ludovic Courtès
  0 siblings, 1 reply; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-10-23 20:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 36630

[-- Attachment #1: Type: text/plain, Size: 1250 bytes --]


Arne Babenhauserheide <arne_bab@web.de> writes:
>> Right, sorry for being unclear: you need to run
>>
>>   ./pre-inst-env guix environment -n …
>>
>> That will run you modified code and thus create a manual-database.drv
>> that uses your code; it’s this manual-database.drv that you should pass
>> to ‘guix build’.
>
> That works now — thank you!
> With the change:
> 185552 entries processed in 108.2 s
> Before the change:
> 185552 entries processed in 220.1 s
>
> The exact commands I’m running:
> cd Dokumente/Guix/guix
> git checkout master # with the change
> ./pre-inst-env guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
> time guix build /gnu/...-manual-database.drv
> git checkout 97bf46e64c11c64a968fdb833983ede6bdafbc00
> ./pre-inst-env guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
> time guix build /gnu/...-manual-database.drv

I now reduced the thread count to exactly 2 (to avoid running into
resource troubles; I hope that two should be safe) and added a mutex for
status messages to ensure that writes don’t overlap.

The patch should arrive shortly.

Best wishes,
Arne
--
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1076 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [bug#36630] [PATCH] use two threads to build man-pages and secure output msgs with mutex
  2019-07-12 21:42 [bug#36630] [PATCH] guix: parallelize building the manual-database Arne Babenhauserheide
  2019-07-15 16:12 ` Ludovic Courtès
  2019-10-23 19:59 ` [bug#36630] [PATCH] use two threads to build man-pages and secure output msgs with mutex Arne Babenhauserheide
@ 2019-10-27 16:27 ` Arne Babenhauserheide
  2 siblings, 0 replies; 14+ messages in thread
From: Arne Babenhauserheide @ 2019-10-27 16:27 UTC (permalink / raw)
  To: 36630; +Cc: Arne Babenhauserheide

---
 guix/profiles.scm | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/guix/profiles.scm b/guix/profiles.scm
index a0ae9a4c7e..9d2133df2d 100644
--- a/guix/profiles.scm
+++ b/guix/profiles.scm
@@ -1315,11 +1315,18 @@ the entries in MANIFEST."
                          (ice-9 threads)
                          (srfi srfi-1)
                          (srfi srfi-19))
+            ;; ensure conflict free user-output
+            (define status-output-mutex (make-mutex))
+
+            (define (status-message msg)
+              (lock-mutex status-output-mutex)
+              (display msg)
+              (unlock-mutex status-output-mutex))

             (define (compute-entry directory count total)
-                              (format #t "\r[~3d/~3d] building list of \
+                              (status-message (format #f "\r[~3d/~3d] building list of \
 man-db entries..."
-                                      count total)
+                                      count total))
                               (force-output)
                               (let ((man (string-append directory
                                                         "/share/man")))
@@ -1332,7 +1339,9 @@ man-db entries..."
               ;; decompression), so report progress as we traverse INPUTS.
               (let* ((inputs '#$(manifest-inputs manifest))
                      (total  (length inputs)))
-                (apply append (par-map compute-entry
+                ;; experimenting shows that two threads suffice to remove most
+                ;; waiting
+                (apply append (n-par-map 2 compute-entry
                                        inputs
                                        (iota total 1)
                                        (make-list total total)))))
--
2.23.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* bug#36630: [PATCH] guix: parallelize building the manual-database
  2019-10-23 20:01                 ` Arne Babenhauserheide
@ 2020-03-31 13:02                   ` Ludovic Courtès
  0 siblings, 0 replies; 14+ messages in thread
From: Ludovic Courtès @ 2020-03-31 13:02 UTC (permalink / raw)
  To: Arne Babenhauserheide; +Cc: 36630-done

Hi Arne,

Arne Babenhauserheide <arne_bab@web.de> skribis:

> I now reduced the thread count to exactly 2 (to avoid running into
> resource troubles; I hope that two should be safe) and added a mutex for
> status messages to ensure that writes don’t overlap.

It’s been 9 months but I finally committed a slightly modified variant
as ef4b5f2fed3ca13a0e15a821ba7e561cd4395aa6.  It turns out that the
mutex was unnecessary as ports are thread-safe.

As noted in the log, I see a 36% speedup on my SSD laptop with 4 cores
(slightly less with 2 cores).  It’s not great, but still an improvement!

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-03-31 13:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-12 21:42 [bug#36630] [PATCH] guix: parallelize building the manual-database Arne Babenhauserheide
2019-07-15 16:12 ` Ludovic Courtès
2019-07-15 23:32   ` Arne Babenhauserheide
2019-07-16 21:14     ` Ludovic Courtès
2019-07-17 22:06       ` Arne Babenhauserheide
2019-07-18  8:55         ` Ludovic Courtès
2019-07-18  8:57         ` Ludovic Courtès
2019-07-18 10:59           ` Arne Babenhauserheide
2019-07-18 13:46             ` Ludovic Courtès
2019-07-18 20:03               ` Arne Babenhauserheide
2019-10-23 20:01                 ` Arne Babenhauserheide
2020-03-31 13:02                   ` bug#36630: " Ludovic Courtès
2019-10-23 19:59 ` [bug#36630] [PATCH] use two threads to build man-pages and secure output msgs with mutex Arne Babenhauserheide
2019-10-27 16:27 ` Arne Babenhauserheide

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.