unofficial mirror of gwl-devel@gnu.org
 help / color / mirror / Atom feed
* auto-connect fails to parallelize independent processes
@ 2022-06-10  0:56 Olivier Dion via
  2022-06-10 13:13 ` Ricardo Wurmus
  2022-06-14 19:53 ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Olivier Dion
  0 siblings, 2 replies; 9+ messages in thread
From: Olivier Dion via @ 2022-06-10  0:56 UTC (permalink / raw)
  To: gwl-devel

Hi,

Given the following example:
--8<---------------cut here---------------start------------->8---
process proc (with output)
  packages "coreutils"
  outputs : file output
  # {
    for i in $(seq 10); do
      date >> {{outputs}}
      sleep 1
    done
  }

workflow parallel
  processes
    auto-connect
      proc "A.out"
      proc "B.out"
--8<---------------cut here---------------end--------------->8---

We can see that both processes are completely independent, and yet they
are batched linearly.

If we re-write the workflow with `graph' instead:
--8<---------------cut here---------------start------------->8---

define A : proc "A.out"
define B : proc "B.out"

workflow parallel
  processes
    graph
      A ->
      B ->
--8<---------------cut here---------------end--------------->8---

we do get parallel execution of the processes.

-- 
Olivier Dion
oldiob.dev



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto-connect fails to parallelize independent processes
  2022-06-10  0:56 auto-connect fails to parallelize independent processes Olivier Dion via
@ 2022-06-10 13:13 ` Ricardo Wurmus
  2022-06-13 20:02   ` Olivier Dion via
  2022-06-14 19:04   ` Olivier Dion via
  2022-06-14 19:53 ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Olivier Dion
  1 sibling, 2 replies; 9+ messages in thread
From: Ricardo Wurmus @ 2022-06-10 13:13 UTC (permalink / raw)
  To: Olivier Dion; +Cc: gwl-devel


Olivier Dion via <gwl-devel@gnu.org> writes:

> We can see that both processes are completely independent, and yet they
> are batched linearly. […]

I haven’t been able to reproduce this yet.

I wrote this test and it passes:

(let* ((proc (lambda (output)
               (make-process
                (name "does-it-matter?")
                (outputs (file output))
                (procedure '()))))
       (p1 (proc "A.out"))
       (p2 (proc "B.out"))
       (wf (make-workflow
            (name "test-workflow")
            (processes
             (auto-connect p1 p2)))))
  (test-equal "auto-connect parallelizes independent processes"
    (list (list p2 p1))
    (parallel-step-execution-order
     (workflow-processes wf)
     (workflow-restrictions wf))))

The result of parallel-step-execution-order means that p2 and p1 both
belong to the first set of processes that can be executed
simultaneously.  (There is only one set of processes, so this list of p2
and p1 is the only entry in the list of steps.)

If the above is in fact an accurate transcription of your report, we
need to look at the users of workflow-run-order (which uses
parallel-step-execution-order) to locate the bug.

-- 
Ricardo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto-connect fails to parallelize independent processes
  2022-06-10 13:13 ` Ricardo Wurmus
@ 2022-06-13 20:02   ` Olivier Dion via
  2022-06-14 19:04   ` Olivier Dion via
  1 sibling, 0 replies; 9+ messages in thread
From: Olivier Dion via @ 2022-06-13 20:02 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: gwl-devel

On Fri, 10 Jun 2022, Ricardo Wurmus <rekado@elephly.net> wrote:
> Olivier Dion via <gwl-devel@gnu.org> writes:
>
>> We can see that both processes are completely independent, and yet they
>> are batched linearly. […]
>
> I haven’t been able to reproduce this yet.

If you run the previous example, can you see that the first timestamp of
process B is 10 seconds after first timestamp of process A?

-- 
Olivier Dion
oldiob.dev


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto-connect fails to parallelize independent processes
  2022-06-10 13:13 ` Ricardo Wurmus
  2022-06-13 20:02   ` Olivier Dion via
@ 2022-06-14 19:04   ` Olivier Dion via
  1 sibling, 0 replies; 9+ messages in thread
From: Olivier Dion via @ 2022-06-14 19:04 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: gwl-devel

On Fri, 10 Jun 2022, Ricardo Wurmus <rekado@elephly.net> wrote:
>
> I haven’t been able to reproduce this yet.

Afte re-testing with `graph', it seems the problem is not due to
`auto-connect'.  I've probably think it was because of caching, oops.

But the following workflow still has the same problem:
--8<---------------cut here---------------start------------->8---
process proc (with output)
  packages "coreutils"
  outputs : file output
  # {
    rm -f {{outputs}}
    for i in $(seq 3); do
      date >> {{outputs}}
      sleep 1
    done
  }

define A : proc "A.out"
define B : proc "B.out"
define C : proc "C.out"

workflow parallel
  processes
    graph
      C -> A B
--8<---------------cut here---------------end--------------->8---

A and B should run in parallel, but they do not.

-- 
Olivier Dion
oldiob.dev


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] gwl/workflows: Execute independent processes in parallel
  2022-06-10  0:56 auto-connect fails to parallelize independent processes Olivier Dion via
  2022-06-10 13:13 ` Ricardo Wurmus
@ 2022-06-14 19:53 ` Olivier Dion
  2022-06-14 19:53   ` [PATCH 2/2] gwl/ui: Protect format to currnt-error-port with mutex Olivier Dion
  2022-06-15 23:27   ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Ricardo Wurmus
  1 sibling, 2 replies; 9+ messages in thread
From: Olivier Dion @ 2022-06-14 19:53 UTC (permalink / raw)
  To: Olivier Dion via; +Cc: Olivier Dion

Processes in the same list can be executed in parallel with `par-map'.  The
results are then appended to the accumulator.
---
 gwl/workflows.scm | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gwl/workflows.scm b/gwl/workflows.scm
index 03b29e0..74a1c74 100644
--- a/gwl/workflows.scm
+++ b/gwl/workflows.scm
@@ -43,6 +43,7 @@
 
   #:use-module (ice-9 match)
   #:use-module (ice-9 format)
+  #:use-module (ice-9 threads)
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-9 gnu)
   #:use-module (srfi srfi-11)
@@ -447,13 +448,12 @@ can be used in a fold over a WORKFLOW's processes."
   (lambda (item acc)
     (match item
       ((? list?)
-       (fold (lambda (process res)
-               (cons (proc process) res))
-             acc
-             ;; By reversing the order of the processes
-             ;; in STEP we keep the output order the same
-             ;; as the order of the sequential function.
-             (reverse item)))
+       (append
+        ;; By reversing the order of the processes
+        ;; in STEP we keep the output order the same
+        ;; as the order of the sequential function.
+        (par-map (cut proc <>) (reverse item))
+        acc))
       (_ (cons (proc item) acc)))))
 
 (define* (workflow-prepare workflow engine
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] gwl/ui: Protect format to currnt-error-port  with mutex
  2022-06-14 19:53 ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Olivier Dion
@ 2022-06-14 19:53   ` Olivier Dion
  2022-06-15 23:28     ` Ricardo Wurmus
  2022-06-15 23:27   ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Ricardo Wurmus
  1 sibling, 1 reply; 9+ messages in thread
From: Olivier Dion @ 2022-06-14 19:53 UTC (permalink / raw)
  To: Olivier Dion via; +Cc: Olivier Dion

When executing processes in parallel, outputs from threads can be mangled
together if the access is not exclusive.
---
 gwl/ui.scm | 59 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 27 deletions(-)

diff --git a/gwl/ui.scm b/gwl/ui.scm
index 35bd127..339649c 100644
--- a/gwl/ui.scm
+++ b/gwl/ui.scm
@@ -19,6 +19,7 @@
   #:use-module (guix colors)
   #:use-module (srfi srfi-26)
   #:use-module (ice-9 format)
+  #:use-module (ice-9 threads)
   #:export (G_
             log-event
 
@@ -39,6 +40,8 @@
 (define %debug-color (color BOLD MAGENTA))
 (define %execute-color (color BOLD YELLOW))
 
+(define log-mutex (make-recursive-mutex))
+
 (define* (print-diagnostic-prefix prefix #:optional location
                                   #:key (colors (color)))
   "Print PREFIX as a diagnostic line prefix."
@@ -56,36 +59,38 @@
           (colorize-string prefix colors))
         identity))
 
-  (if (location? location)
-      (format (current-error-port) "~a: ~a"
-              (location-color (location->string location))
-              (prefix-color prefix))
-      (format (current-error-port) "~a"
-              (prefix-color prefix))))
+  (with-mutex log-mutex
+    (if (location? location)
+        (format (current-error-port) "~a: ~a"
+                (location-color (location->string location))
+                (prefix-color prefix))
+        (format (current-error-port) "~a"
+                (prefix-color prefix)))))
 
 (define (log-event type . message)
   (define print?
     (or (member 'all (%config 'log-events))
         (member type (%config 'log-events))))
   (when print?
-    (case type
-      ((error)
-       (print-diagnostic-prefix (G_ "error: ") #:colors %error-color))
-      ((info)
-       (print-diagnostic-prefix (G_ "info: ") #:colors %info-color))
-      ((execute)
-       (print-diagnostic-prefix (G_ "run: ") #:colors %execute-color))
-      ((cache)
-       (print-diagnostic-prefix (G_ "cache: ") #:colors %debug-color))
-      ((debug)
-       (print-diagnostic-prefix (G_ "debug: ") #:colors %debug-color))
-      ((process)
-       (print-diagnostic-prefix (G_ "process: ") #:colors %execute-color))
-      ((guix)
-       (print-diagnostic-prefix (G_ "guix: ") #:colors %execute-color))
-      (else #true))
-    (force-output (current-error-port))
-    (format (current-error-port) "~2,2f "
-            (/ (get-internal-real-time)
-               internal-time-units-per-second))
-    (apply format (current-error-port) message)))
+    (with-mutex log-mutex
+      (case type
+        ((error)
+         (print-diagnostic-prefix (G_ "error: ") #:colors %error-color))
+        ((info)
+         (print-diagnostic-prefix (G_ "info: ") #:colors %info-color))
+        ((execute)
+         (print-diagnostic-prefix (G_ "run: ") #:colors %execute-color))
+        ((cache)
+         (print-diagnostic-prefix (G_ "cache: ") #:colors %debug-color))
+        ((debug)
+         (print-diagnostic-prefix (G_ "debug: ") #:colors %debug-color))
+        ((process)
+         (print-diagnostic-prefix (G_ "process: ") #:colors %execute-color))
+        ((guix)
+         (print-diagnostic-prefix (G_ "guix: ") #:colors %execute-color))
+        (else #true))
+      (force-output (current-error-port))
+      (format (current-error-port) "~2,2f "
+              (/ (get-internal-real-time)
+                 internal-time-units-per-second))
+      (apply format (current-error-port) message))))
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] gwl/workflows: Execute independent processes in parallel
  2022-06-14 19:53 ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Olivier Dion
  2022-06-14 19:53   ` [PATCH 2/2] gwl/ui: Protect format to currnt-error-port with mutex Olivier Dion
@ 2022-06-15 23:27   ` Ricardo Wurmus
  2022-06-16  1:23     ` Olivier Dion via
  1 sibling, 1 reply; 9+ messages in thread
From: Ricardo Wurmus @ 2022-06-15 23:27 UTC (permalink / raw)
  To: Olivier Dion; +Cc: gwl-devel


Olivier Dion <olivier.dion@polymtl.ca> writes:

> Processes in the same list can be executed in parallel with `par-map'.  The
> results are then appended to the accumulator.

Thanks for the patch.  I applied it.

-- 
Ricardo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] gwl/ui: Protect format to currnt-error-port  with mutex
  2022-06-14 19:53   ` [PATCH 2/2] gwl/ui: Protect format to currnt-error-port with mutex Olivier Dion
@ 2022-06-15 23:28     ` Ricardo Wurmus
  0 siblings, 0 replies; 9+ messages in thread
From: Ricardo Wurmus @ 2022-06-15 23:28 UTC (permalink / raw)
  To: Olivier Dion; +Cc: gwl-devel


Olivier Dion <olivier.dion@polymtl.ca> writes:

> When executing processes in parallel, outputs from threads can be mangled
> together if the access is not exclusive.

Thanks, applied!

-- 
Ricardo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] gwl/workflows: Execute independent processes in parallel
  2022-06-15 23:27   ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Ricardo Wurmus
@ 2022-06-16  1:23     ` Olivier Dion via
  0 siblings, 0 replies; 9+ messages in thread
From: Olivier Dion via @ 2022-06-16  1:23 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: gwl-devel

On Thu, 16 Jun 2022, Ricardo Wurmus <rekado@elephly.net> wrote:
> Olivier Dion <olivier.dion@polymtl.ca> writes:
>
>> Processes in the same list can be executed in parallel with `par-map'.  The
>> results are then appended to the accumulator.
>
> Thanks for the patch.  I applied it.

I just realize that we can remove the `(reverse item)' in the patch.
Although we can't predict the order of `par-map', the reversal of the
list is unnecessary with the removal of `fold'.

-- 
Olivier Dion
oldiob.dev


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-06-16  1:23 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10  0:56 auto-connect fails to parallelize independent processes Olivier Dion via
2022-06-10 13:13 ` Ricardo Wurmus
2022-06-13 20:02   ` Olivier Dion via
2022-06-14 19:04   ` Olivier Dion via
2022-06-14 19:53 ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Olivier Dion
2022-06-14 19:53   ` [PATCH 2/2] gwl/ui: Protect format to currnt-error-port with mutex Olivier Dion
2022-06-15 23:28     ` Ricardo Wurmus
2022-06-15 23:27   ` [PATCH 1/2] gwl/workflows: Execute independent processes in parallel Ricardo Wurmus
2022-06-16  1:23     ` Olivier Dion via

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).