From: Ricardo Wurmus
To: zimoun
Cc: gwl-devel@gnu.org
Subject: Re: merging “processes” and “restrictions”
Date: Mon, 21 Jan 2019 23:51:21 +0100
Message-ID: <871s55eiae.fsf@elephly.net>
References: <87bm4df2ld.fsf@elephly.net> <878szgg9bi.fsf@elephly.net> <871s58fk0z.fsf@elephly.net>

Hi simon,

> For example, I run:
>
>   guix gc
>   GUILE_AUTO_COMPILE=0 GUIX_WORKFLOW_PATH=./doc/examples/ \
>       ./pre-inst-env guix workflow -r simple
>
> and all the dance with the store shows up. Beautiful! :-)
>
> Is it possible to turn off the test (make check) when building hello ?

This is not supported in Guix, so there’s nothing I can do in the GWL.

> Cosmetic comment. :-)
> About the `A -> B' which means A depends on B.
> To me, the arrow is counterintuitive, notationally speaking. :-)
> Because the data flow is going from B to A.
> Even if this notation is usual when speaking of dependencies and graph.

The arrow is read as “depends on”.
If you want to, we could just as well support an arrow in the opposite
direction, since the direction of the arrow carries no meaning of its
own. But I think that would be more confusing.

>> >> Or like this assuming that all of the processes declare inputs and
>> >> outputs *somehow*:
>> >>
>> >>   (workflow
>> >>     (name "simple")
>> >>     (processes
>> >>       (eat "fruit") (eat "veges") greet sleep bye))
>> >
>> > With this, I do not see how the graph could be deduced; without
>> > specifying the inputs-outputs relationship and without specifying the
>> > processes relationship.
>>
>> This will only work if these processes declare inputs and outputs and
>> they can be matched up. Otherwise all of these processes would be
>> deemed independent.
>>
>> I still wonder how processes should declare inputs. The easiest and
>> possibly least useful way I can think of is to have them declare
>> abstract symbols.
>>
>> --8<---------------cut here---------------start------------->8---
>> (process: 'bake
>>   (data-inputs '(flour eggs))
>>   (procedure '(display "baking"))
>>   (outputs '(cake)))
>>
>> (process: fry
>>   (data-inputs '(flour eggs))
>>   (procedure '(display "frying"))
>>   (outputs '(pancake)))
>>
>> (process: (take thing)
>>   (procedure '(format #t "taking ~a." thing))
>>   (outputs (list thing)))
>>
>> (workflow: dinner
>>   (processes
>>     (list (take 'flour) (take 'eggs) fry bake)))
>> --8<---------------cut here---------------end--------------->8---
>>
> [...]
>> Given this information we can deduce the adjacency list:
>>
>>   (graph
>>     (fry -> (take 'flour) (take 'eggs))
>>     (bake -> (take 'flour) (take 'eggs)))
>>
> [...]
>> I’m not sure how useful this is as a *generic* mechanism, though. One
>> could also use this as a very specific mechanism, for example to have a
>> process declare that it outputs a certain file, and another that it
>> takes this very same file as an input.
>
> From a simple user perspective, I find more readable the current
> version with `graph'.
> Because I am able to see the flow even if I do
> not know about the processes fry, bake and take.

Right. I also prefer the explicit “graph” syntax. With “link”
(formerly “connect”) it’s *possible* but not required to automatically
link up all of the processes. I suspect that this is more in line with
what Snakemake users might expect.

Luckily, we can offer both ways without problems.

> From my point of view, the `let' part fixes the entry point or some
> specific location of outputs (for debugging purpose?).
>
> (define (eat input output)
>   (process
>     (name "Eat")
>     (data-inputs input)
>     (outputs output)))
>
> (define (cook input output)
>   (process
>     (name "Cook")
>     (data-inputs input)
>     (outputs output)))
>
> (define (take input output)
>   (process
>     (name "Take")
>     (data-inputs input)
>     (outputs output)))
>
> (workflow
>   (processes
>     (let ((take-choc (inputs take "/path/to/chocolate"))
>           (take-cake (outputs take "/path/to/store/cake"))
>           (miam (outputs eat "/path/to/my/mouth")))
>       (graph
>         (cook -> take-choc)
>         (take-cake -> cook)
>         (miam -> take-cake)))
>
> If the inputs/outputs are not specified in the `let' part, then they
> are automatically stored somewhere in /tmp/ or elsewhere and then
> (optionally) removed when all the workflow is done.
>
> I imagine `inputs'/`outputs' returning a curryfied process, somehow.
>
> And similarly about options, e.g,
> (define* (cook input output #:optional temp-woven)
>   blah)
>
>
> Does it make sense ?

This seems to be from the perspective of data flow as you indicated
earlier. I’m not sure I fully understand it, but I’ll give it a try.
(To me it seems similar to continuations.)

Expressed as a data flow the workflow looks like this:

  (take "chocolate") => cook => (take "cake") => miam

At each step we generate a value that can be processed by the next step.
This looks suspiciously like an Arrow[1].
[1]: https://www.haskell.org/arrows/syntax.html

  (push "chocolate" (>>> take cook take miam))

i.e. we push the value “chocolate” into a chain where a procedure’s
outputs are connected to the next procedure’s inputs.

The example makes it a bit hard to think about this clearly. What about
the second invocation of “take”? What about multiple inputs?

Isn’t this just function composition and application?

  ((>>> take cook take miam) "chocolate")
  ((compose miam take cook take) "chocolate")

I don’t really know what to do with the output field of a process in
this case. Is it really needed at all?

I guess it is needed when the data flow is more complex and named
outputs can be used.

  x >– A –> B –> C –> E –> F
       |    `––> D ––––––/
       `–––––––/

x is the input to the data flow.

  (flow (x)
    (a <- (A x))            ; apply A and bind output to “a”
    (b <- (B a))            ; apply B and bind output to “b”
    (e <- ((>>> C E) b))    ; apply C and then E to “b”, bind the output to “e”
    (d <- (D a b))          ; apply D and bind the output to “d”
    (-> (F e d)))           ; return F applied to “e” and “d”

“flow” would somehow figure out in what order to run things. I feel
that there should be a better way to express this, but I haven’t found
one.

--
Ricardo
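
P.S. Here is a rough sketch of how “flow” might figure out the order in
which to run things, under the assumption that every step is written
down as a list of its output name followed by its input names. It also
shows “>>>” as plain left-to-right composition built on Guile’s
“compose”. The names “steps” and “run-order” are made up for
illustration; none of this exists in the GWL.

```scheme
(use-modules (srfi srfi-1))   ; every, remove

;; Left-to-right composition: ((>>> f g) x) equals (g (f x)).
(define (>>> . procs)
  (apply compose (reverse procs)))

;; The example data flow, one entry per step: (output input ...).
;; The step producing "e" stands for the chain (>>> C E) applied to "b".
;; This representation is an assumption made for this sketch.
(define steps
  '((f e d)
    (d a b)
    (e b)
    (b a)
    (a x)))

(define (run-order steps available)
  "Return the outputs of STEPS in an order in which every step runs only
after all of its inputs are in AVAILABLE or were produced earlier."
  (let loop ((pending steps) (available available) (order '()))
    (if (null? pending)
        (reverse order)
        (let* ((ready? (lambda (step)
                         (every (lambda (input) (memq input available))
                                (cdr step))))
               (ready   (filter ready? pending))
               (blocked (remove ready? pending)))
          ;; Nothing can run although steps remain: a cycle or a
          ;; missing input.
          (when (null? ready)
            (error "cyclic or unsatisfiable steps:" (map car blocked)))
          (loop blocked
                (append (map car ready) available)
                (append (reverse (map car ready)) order))))))

(display (run-order steps '(x)))   ; prints (a b d e f)
(newline)
```

Steps that become ready in the same round (here “d” and “e”) do not
depend on each other, so they could just as well run in parallel.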