From: zimoun
Date: Tue, 22 Jan 2019 09:49:37 +0100
Subject: Re: merging “processes” and “restrictions”
To: Ricardo Wurmus
Cc: gwl-devel@gnu.org
In-Reply-To: <871s55eiae.fsf@elephly.net>
References: <87bm4df2ld.fsf@elephly.net> <878szgg9bi.fsf@elephly.net> <871s58fk0z.fsf@elephly.net> <871s55eiae.fsf@elephly.net>

Hi Ricardo,

On Mon, 21 Jan 2019 at 23:51, Ricardo Wurmus wrote:

> > Is it possible to turn off the test (make check) when building hello ?
>
> This is not supported in Guix, so there’s nothing I can do in the GWL.

Ok.

> > Cosmetic comment. :-)
> > About the `A -> B' which means A depends on B.
> > To me, the arrow is counterintuitive, notationally speaking. :-)
> > Because the data flow is going from B to A.
> > Even if this notation is usual when speaking of dependencies and graph.
>
> The arrow is read as “depends on”.
> If you want to we could just as well
> support an arrow in the opposite direction, as it really has no
> meaning.  But I think that would be more confusing.

The Snakemake documentation about graphs and DAGs [1] makes the
opposite choice: "A -> B" means that B depends on A, because the arrow
expresses how the data flows, i.e. the output of A is the input of B.
It is the same for CWL [2].

I agree that this is not the usual way to express dependencies
(e.g. UML).

If we choose the Snakemake/CWL meaning for `->', then it will not be
consistent with the meaning of the arrow in `guix graph'.

From my perspective, the Snakemake/CWL way is more intuitive.
But what is intuitive for one person is not for another. :-)

If we are talking cosmetics, take the example from the graph [3].
I find more readable:
 1. (graph (samsort samindex -> bcftools_call))
than
 2. (graph (bcftools_call <- samsort samindex))
than
 3. (graph (bcftools_call -> samsort samindex))

I do not know, I feel like I am cutting a hair in four pieces. :-)
(a French expression for splitting hairs :-)

[1] https://snakemake.readthedocs.io/en/stable/tutorial/basics.html#step-4-indexing-read-alignments-and-visualizing-the-dag-of-jobs
[2] https://view.commonwl.org/workflows/github.com/common-workflow-language/cwltool/blob/master/cwltool/schemas/v1.0/v1.0/step-valuefrom3-wf.cwl
[3] https://snakemake.readthedocs.io/en/stable/tutorial/basics.html#id1

> > From a simple user perspective, I find more readable the current
> > version with `graph'. Because I am able to see the flow even if I do
> > not know about the processes fry, bake and take.
>
> Right.  I also prefer the explicit “graph” syntax.  With “link”
> (formerly “connect”) it’s *possible* but not required to automatically
> link up all of the processes.  I suspect that this is more in line with
> what Snakemake users might expect.

Instead of `link', why not `auto-link'?
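
To make the two readings of the arrow concrete, here is a minimal
Guile sketch. It is not GWL code: the edge lists and the helpers
`reverse-edges' and `prerequisites' are hypothetical names that I
assume only for illustration. It suggests that both notations describe
the same graph with every arrow flipped, so the choice should not
change what gets run first.

```scheme
;; Illustration only; none of these names exist in the GWL.
(use-modules (srfi srfi-1))

;; An edge is a pair (FROM . TO), written exactly as the arrow FROM -> TO.

;; `guix graph' reading: A -> B means "A depends on B".
(define edges-depends-on
  '((bcftools_call . samsort)
    (bcftools_call . samindex)))

;; Snakemake/CWL reading: A -> B means "the output of A feeds B".
(define edges-data-flow
  '((samsort . bcftools_call)
    (samindex . bcftools_call)))

;; Flip every arrow to go from one reading to the other.
(define (reverse-edges edges)
  (map (lambda (edge) (cons (cdr edge) (car edge))) edges))

;; Processes that must run before PROCESS, given data-flow edges.
(define (prerequisites process edges)
  (filter-map (lambda (edge)
                (and (eq? (cdr edge) process) (car edge)))
              edges))
```

With these definitions, (reverse-edges edges-depends-on) is equal to
edges-data-flow, and (prerequisites 'bcftools_call edges-data-flow)
lists samsort and samindex. So the question really is only about what
reads best.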
> > From my point of view, the `let' part fixes the entry point or some
> > specific location of outputs (for debugging purpose?).
> >
> > (define (eat input output)
> >   (process
> >     (name "Eat")
> >     (data-inputs input)
> >     (outputs output)))
> >
> > (define (cook input output)
> >   (process
> >     (name "Cook")
> >     (data-inputs input)
> >     (outputs output)))
> >
> > (define (take input output)
> >   (process
> >     (name "Take")
> >     (data-inputs input)
> >     (outputs output)))
> >
> > (workflow
> >   (processes
> >     (let ((take-choc (inputs take "/path/to/chocolate"))
> >           (take-cake (outputs take "/path/to/store/cake"))
> >           (miam (outputs eat "/path/to/my/mouth")))
> >       (graph
> >         (cook -> take-choc)
> >         (take-cake -> cook)
> >         (miam -> take-cake)))))
> >
> > If the inputs/outputs are not specified in the `let' part, then they
> > are automatically stored somewhere in /tmp/ or elsewhere and then
> > (optionally) removed when all the workflow is done.
> >
> > I imagine `inputs'/`outputs' returning a curried process, somehow.
> >
> > And similarly about options, e.g,
> > (define* (cook input output #:optional temp-woven)
> >   blah)
> >
> >
> > Does it make sense ?
>
> This seems to be from the perspective of data flow as you indicated
> earlier.  I’m not sure I fully understand it, but I give it a try.  (To
> me it seems similar to continuations.)

I am not familiar with continuations, but yes, it seems similar now
that you say it. :-)
Thank you for taking the time to give it a try.

> Expressed as a data flow the workflow looks like this:
>
>   (take "chocolate") => cook => (take "cake") => miam
>
> At each step we generate a value that can be processed by the next
> step.  This looks suspiciously like an Arrow[1].

You expressed my thoughts better than I did. :-)

>
> [1]: https://www.haskell.org/arrows/syntax.html
>
>   (push "chocolate"
>     (>>> take cook take miam))
>
> i.e.
> we push the value “chocolate” into a chain where a procedure’s
> outputs are connected to the next procedure’s inputs.
>
> The example makes it a bit hard to think about this clearly: what about
> the second invocation of “take”?  What about multiple inputs?  Isn’t
> this just function composition and application?

I agree: multiple inputs or outputs would be an issue when composing.

Say that `take' takes 2 inputs, `a' and `b'. We could require them to
be packed as a list (a b), and the writer of the process would have to
unpack them.
Now say that `cook' returns 3 outputs, `x', `y' and `z'. They are
also packed as a list.
But then how do we encode the fact that `a' corresponds to `z', and
`b' to `y'?

You somehow need a dummy process that unpacks and repacks, so that the
"types" of the two processes agree.

  (push (>>> take cook dumb take miam))

  (define (dumb input output)
    (data-inputs ((u (cadr input))      ; y
                  (v (caddr input))))   ; z
    (outputs (v u)))                    ; (z y), i.e. (a b)

I do not know if it makes sense, or if it is usable and better.
I just find it more "functional".

>
>   x >- A -> B -> C -> E -> F
>        |    `--> D ------/
>        `-------/
>
> x is the input to the data flow.
>
> (flow (x)
>   (a <- (A x))       ; apply A and bind output to “a”
>   (b <- (B a))       ; apply B and bind output to “b”
>   (e <- (>>> C E))   ; apply C and then E, bind the output to “e”
>   (d <- (D a b))     ; apply D and bind the output to “d”
>   (-> (F e d)))      ; return F applied to “e” and “d”
>
> “flow” would somehow figure out in what order to run things.  I feel
> that there should be a better way to express this, but I haven’t found
> one.

Yes. This is already nice!
:-)

And the user does not have to manage by hand the names of all the
outputs.

In other words, say the user has already computed your workflow with
`x' set to /path/to/my-file. Then this user writes another flow:

  (flow (x)
    (z <- ((>>> A B) x))
    (-> (G z)))

When this second flow is applied to /path/to/my-file, the result `z'
is already in the CAS (see `b') and only (G z) is computed.

The dream would be:

  (flow (x)
    (-> ((>>> A B G) x)))

and to automatically detect that the composition `B . A' has already
been computed for the value /path/to/my-file.

Well, I am dreaming... :-)


All the best,
simon