From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from list by lists.gnu.org with archive (Exim 4.71)
	id 1gliPx-00037R-V3
	for mharc-gwl-devel@gnu.org; Mon, 21 Jan 2019 17:51:46 -0500
Received: from eggs.gnu.org ([209.51.188.92]:40753)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <rekado@elephly.net>) id 1gliPu-00034I-4u
	for gwl-devel@gnu.org; Mon, 21 Jan 2019 17:51:43 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <rekado@elephly.net>) id 1gliPs-00066k-QP
	for gwl-devel@gnu.org; Mon, 21 Jan 2019 17:51:42 -0500
Received: from sender-of-o53.zoho.com ([135.84.80.218]:21763)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <rekado@elephly.net>) id 1gliPp-00064i-9f
	for gwl-devel@gnu.org; Mon, 21 Jan 2019 17:51:38 -0500
References: <87bm4df2ld.fsf@elephly.net>
	<CAJ3okZ1enxGxXxjAVdgNjzU8mwEs8aDquZfLKAaLaT1WLYXUTg@mail.gmail.com>
	<878szgg9bi.fsf@elephly.net>
	<CAJ3okZ1Yt_F=CqX5SmZ7e0O1F_TXvm8X1ahVHnvqOTf0uc+DbA@mail.gmail.com>
	<871s58fk0z.fsf@elephly.net>
	<CAJ3okZ2F2_C_GtXLnsFbHpOs23uiA7QFomSxzm4LaKC_=QmERQ@mail.gmail.com>
From: Ricardo Wurmus <rekado@elephly.net>
In-reply-to: <CAJ3okZ2F2_C_GtXLnsFbHpOs23uiA7QFomSxzm4LaKC_=QmERQ@mail.gmail.com>
Date: Mon, 21 Jan 2019 23:51:21 +0100
Message-ID: <871s55eiae.fsf@elephly.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: merging “processes” and “restrictions”
List-Id: <gwl-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/gwl-devel>,
	<mailto:gwl-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/gwl-devel/>
List-Post: <mailto:gwl-devel@gnu.org>
List-Help: <mailto:gwl-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/gwl-devel>,
	<mailto:gwl-devel-request@gnu.org?subject=subscribe>
To: zimoun <zimon.toutoune@gmail.com>
Cc: gwl-devel@gnu.org


Hi simon,

> For example, I run:
>
>  guix gc
>  GUILE_AUTO_COMPILE=0 GUIX_WORKFLOW_PATH=./doc/examples/ \
>   ./pre-inst-env guix workflow -r simple
>
> and all the dance with the store shows up. Beautiful! :-)
>
> Is it possible to turn off the test (make check) when building hello ?

This is not supported in Guix, so there’s nothing I can do in the GWL.

> Cosmetic comment. :-)
> About the `A -> B' which means A depends on B.
> To me, the arrow is counterintuitive, notationally speaking. :-)
> Because the data flow is going from B to A.
> Even if this notation is usual when speaking of dependencies and graph.

The arrow is read as “depends on”.  If you want, we could just as well
support an arrow in the opposite direction, as the direction is purely
notational.  But I think that would be more confusing.

>> >> Or like this assuming that all of the processes declare inputs and
>> >> outputs *somehow*:
>> >>
>> >>   (workflow
>> >>    (name "simple")
>> >>    (processes
>> >>      (eat "fruit") (eat "veges") greet sleep bye))
>> >
>> > With this, I do not see how the graph could be deduced; without
>> > specifying the inputs-outputs relationship and without specifying the
>> > processes relationship.
>>
>> This will only work if these processes declare inputs and outputs and
>> they can be matched up.  Otherwise all of these processes would be
>> deemed independent.
>>
>> I still wonder how processes should declare inputs.  The easiest and
>> possibly least useful way I can think of is to have them declare
>> abstract symbols.
>>
>> --8<---------------cut here---------------start------------->8---
>> (process: 'bake
>>   (data-inputs '(flour eggs))
>>   (procedure '(display "baking"))
>>   (outputs '(cake)))
>>
>> (process: fry
>>   (data-inputs '(flour eggs))
>>   (procedure '(display "frying"))
>>   (outputs '(pancake)))
>>
>> (process: (take thing)
>>   (procedure '(format #t "taking ~a." thing))
>>   (outputs (list thing)))
>>
>> (workflow: dinner
>>   (processes
>>     (list (take 'flour) (take 'eggs) fry bake)))
>> --8<---------------cut here---------------end--------------->8---
>>
> [...]
>> Given this information we can deduce the adjacency list:
>>
>>   (graph
>>    (fry  -> (take 'flour) (take 'eggs))
>>    (bake -> (take 'flour) (take 'eggs)))
>>
> [...]
>> I’m not sure how useful this is as a *generic* mechanism, though.  One
>> could also use this as a very specific mechanism, for example to have a
>> process declare that it outputs a certain file, and another that it
>> takes this very same file as an input.
>
> From a simple user perspective, I find more readable the current
> version with `graph'. Because I am able to see the flow even if I do
> not know about the processes fry, bake and take.

Right.  I also prefer the explicit “graph” syntax.  With “link”
(formerly “connect”) it’s *possible* but not required to automatically
link up all of the processes.  I suspect that this is more in line with
what Snakemake users might expect.

Luckily, we can offer both ways without problems.
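
A minimal sketch of how automatic linking could deduce the adjacency
list from declared inputs and outputs.  This is not the GWL
implementation: processes are modelled as plain association lists, and
every name here (make-proc, depends-on?, deduce-graph) is hypothetical.

```scheme
;; Sketch only: processes are plain alists, not GWL <process> records.
;; All procedure names below are hypothetical.
(use-modules (srfi srfi-1))

(define (make-proc name inputs outputs)
  (list (cons 'name name)
        (cons 'inputs inputs)
        (cons 'outputs outputs)))

(define (proc-name p)    (assq-ref p 'name))
(define (proc-inputs p)  (assq-ref p 'inputs))
(define (proc-outputs p) (assq-ref p 'outputs))

(define (depends-on? consumer producer)
  "Return true if CONSUMER takes at least one output of PRODUCER."
  (any (lambda (input) (memq input (proc-outputs producer)))
       (proc-inputs consumer)))

(define (deduce-graph procs)
  "Return an adjacency list: each entry is a process name followed by
the names of the processes it depends on."
  (map (lambda (p)
         (cons (proc-name p)
               (map proc-name
                    (filter (lambda (q) (depends-on? p q)) procs))))
       procs))

(define dinner
  (list (make-proc 'take-flour '() '(flour))
        (make-proc 'take-eggs  '() '(eggs))
        (make-proc 'fry  '(flour eggs) '(pancake))
        (make-proc 'bake '(flour eggs) '(cake))))

(deduce-graph dinner)
;; => ((take-flour) (take-eggs)
;;     (fry take-flour take-eggs) (bake take-flour take-eggs))
```

Processes whose inputs match nobody’s outputs simply end up with an
empty dependency list, i.e. they are treated as independent.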

> From my point of view, the `let' part fixes the entry point or some
> specific location of outputs (for debugging purpose?).
>
> (define (eat input output)
>  (process
>   (name "Eat")
>   (data-inputs input)
>   (outputs output)))
>
> (define (cook input output)
>  (process
>   (name "Cook")
>   (data-inputs input)
>   (outputs output)))
>
> (define (take input output)
>  (process
>   (name "Take")
>   (data-inputs input)
>   (outputs output)))
>
> (workflow
>   (processes
>     (let ((take-choc (inputs take "/path/to/chocolate"))
>           (take-cake (outputs take "/path/to/store/cake"))
>           (miam (outputs eat "/path/to/my/mouth")))
>     (graph
>        (cook -> take-choc)
>        (take-cake -> cook)
>        (miam -> take-cake)))
>
> If the inputs/outputs are not specified in the `let' part, then they
> are automatically stored somewhere in /tmp/ or elsewhere and then
> (optionally) removed when all the workflow is done.
>
> I imagine `inputs'/`outputs' returning a curryfied process, somehow.
>
> And similarly about options, e.g,
>  (define* (cook input output #:optional temp-woven)
>      blah)
>
>
> Does it make sense ?

This seems to be from the perspective of data flow as you indicated
earlier.  I’m not sure I fully understand it, but I’ll give it a try.
(To me it seems similar to continuations.)

Expressed as a data flow the workflow looks like this:

  (take "chocolate") => cook => (take "cake") => miam

At each step we generate a value that can be processed by the next
step.  This looks suspiciously like an Arrow[1].

[1]: https://www.haskell.org/arrows/syntax.html

  (push "chocolate"
    (>>> take cook take miam))

i.e. we push the value “chocolate” into a chain where a procedure’s
outputs are connected to the next procedure’s inputs.

The example makes it a bit hard to think about this clearly: what about
the second invocation of “take”?  What about multiple inputs?  Isn’t
this just function composition and application?

  ((>>> take cook take miam) "chocolate")

  ((compose miam take cook take) "chocolate")

I don’t really know what to do with the output field of a process in
this case.  Is it really needed at all?  I guess it is needed when the
data flow is more complex and named outputs can be used.
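
To make the equivalence between “>>>” and “compose” concrete, here is a
small sketch in plain Guile.  “>>>” is a hypothetical helper, not
something the GWL provides, and take, cook and miam are stand-in
procedures that merely tag their argument.

```scheme
;; Sketch: `>>>' is a hypothetical helper, not part of the GWL.
(define (>>> . procs)
  "Compose PROCS left to right: the first procedure is applied first."
  (apply compose (reverse procs)))

;; Stand-in steps that just wrap their argument with a tag.
(define (take thing) (list 'taken thing))
(define (cook thing) (list 'cooked thing))
(define (miam thing) (list 'eaten thing))

(equal? ((>>> take cook take miam) "chocolate")
        ((compose miam take cook take) "chocolate"))
;; => #t
```

This only covers a linear chain in which each step takes exactly one
value; it says nothing about steps with several inputs.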

x >– A –> B –> C –> E –> F
     |    `––> D ––––––/
     `–––––––/

x is the input to the data flow.

    (flow (x)
      (a <- (A x))          ; apply A and bind output to “a”
      (b <- (B a))          ; apply B and bind output to “b”
      (e <- ((>>> C E) b))  ; apply C and then E to “b”, bind output to “e”
      (d <- (D a b))        ; apply D and bind the output to “d”
      (-> (F e d)))         ; return F applied to “e” and “d”

“flow” would somehow figure out in what order to run things.  I feel
that there should be a better way to express this, but I haven’t found
one.
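
One way the ordering could be worked out is a plain topological sort
over the bindings.  The sketch below is an assumption about how “flow”
might do it, not an existing implementation: each step is written as a
list of its name followed by the names it needs, “e” is assumed to take
“b” as input (as in the diagram), and run-order is a hypothetical name.

```scheme
;; Sketch only: `run-order' is hypothetical, not part of the GWL.
(use-modules (srfi srfi-1) (srfi srfi-11))

;; Each entry is (NAME DEPENDENCY ...), deliberately listed out of order.
(define steps
  '((f e d)     ; result of (F e d)
    (d a b)     ; d <- (D a b)
    (e b)       ; e <- C then E applied to b (assumed from the diagram)
    (b a)       ; b <- (B a)
    (a x)))     ; a <- (A x)

(define (run-order steps available)
  "Return the names in STEPS ordered so that each step comes after all
of its dependencies.  AVAILABLE lists the values that exist up front."
  (let loop ((pending steps) (done available) (order '()))
    (if (null? pending)
        (reverse order)
        (let-values (((ready blocked)
                      (partition (lambda (step)
                                   (every (lambda (dep) (memq dep done))
                                          (cdr step)))
                                 pending)))
          (if (null? ready)
              (error "cyclic or unsatisfiable dependencies" blocked)
              (loop blocked
                    (append (map car ready) done)
                    (append (reverse (map car ready)) order)))))))

(run-order steps '(x))
;; => (a b d e f)
```

Steps that become ready in the same round (here “d” and “e”) do not
depend on each other, so they could just as well run in parallel.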

--
Ricardo