Date: Thu, 6 Jun 2019 08:23:32 -0500
From: Pjotr Prins
To: Ricardo Wurmus
Cc: gwl-devel@gnu.org
Subject: Re: Next steps for the GWL
Message-ID: <20190606132332.w3prfyygruf63ovs@thebird.nl>
References: <87a7f5l6e1.fsf@mdc-berlin.de> <87o93eiqvz.fsf@mdc-berlin.de> <87muiukitj.fsf@mdc-berlin.de>
In-Reply-To: <87muiukitj.fsf@mdc-berlin.de>

On Thu, Jun 06, 2019 at 02:19:04PM +0200, Ricardo Wurmus wrote:
>
> Hi simon,
>
> > (+ Pjotr because I am sure he has an interesting opinion but not sure
> > he closely reads this list ;-)

I read it :)

> > On Mon, 3 Jun 2019 at 18:18, Ricardo Wurmus wrote:
> >
> >> > - what about a bridge with CWL?
> >>
> >> I’m open to this idea, but it would need to be well-defined. What does
> >> it really mean? Generating CWL files from GWL workflows? That really
> >> shouldn’t be too hard. Anything else, however, is hard for me to
> >> imagine.
> >
> > Well, I point out previous threads about this topic:
> >
> > https://lists.gnu.org/archive/html/guix-devel/2018-01/msg00428.html
> > https://lists.gnu.org/archive/html/gwl-devel/2019-02/msg00019.html
> >
> > 1-
> > Generating CWL from GWL should be nice. It should ease the use of
> > already in-place platforms and tools (AWS, etc.)
>
> Generating CWL from GWL should be easy, but it’s also not all that
> useful. The GWL takes care of software deployment, so not only should
> we generate CWL files but also generate (and upload?) Docker images and
> make the CWL file reference them.
>
> The tooling for CWL… seems a little less substantial and focused than it
> first appears. The cwltool can only run CWL workflows locally - no
> DRMAA, no AWS. All the other runners that are listed on the CWL website
> are either very limited or very large environments where CWL execution
> is not necessarily the primary purpose (cf Galaxy or Arvados).
>
> Still, I think it’s the most meaningful connection the GWL can have with
> the CWL: using the GWL as a high-level representation which “compiles”
> down to a lower-level representation of CWL + Docker images when needed.
>
> > 2-
> > Use CWL as a process. A lot of work has been done by Pjotr and
> > reported here [1]
> >
> >
> > [1] https://guix-hpc.bordeaux.inria.fr/blog/2019/01/creating-a-reproducible-workflow-with-cwl/
>
> Yes, this works, of course, but that’s a level of integration that’s
> extremely limited, in my opinion. Using Guix with the CWL is fine as
> the blog post demonstrates, but there is very little to be gained and
> much to be lost when embedding CWL in a GWL workflow. The only thing
> this enables is reusing existing CWL workflows as a GWL “process”.
> There is no meaningful integration – the embedded CWL workflow is a
> second-class citizen that cannot benefit from any of the GWL features.
>
> If the CWL workflow is connected to the GWL via cwltool then the only
> way to run the workflow on a DRMAA-supported cluster or a bunch of
> SSH-connected servers, or AWS EC2 instances is to wrap it up in a GWL
> context. The GWL treats the process as its smallest unit of
> organisation, so a CWL workflow that’s run as a GWL process cannot
> really be scaled. If the user has a different CWL execution environment
> (such as an Arvados installation), the CWL workflow embedded in the GWL
> will not be able to make use of it. It would forever be tied to the
> particular version of cwltool in Guix.
>
> I’d rather not advocate this use of the CWL in the GWL. It might sound
> good (“The GWL is compatible with the CWL!”), but ultimately it’s a
> really awkward connection that is bound to lead to a great deal of
> frustration.
>
> Does this make sense?

Yes. Personally I also think the CWL is flawed. It overcomplicates
things and the reference implementation is pretty crappy. If we get
GWL to work in my environment I would think it a breath of fresh air.
Not to say that the CWL does not have some bad ideas (triple
negative). You can read my blog for that.

> I don’t want to be dismissive. It would be great if we could come up
> with something that’s mutually beneficial for CWL users and GWL users
> alike, but I feel that our options are very limited. I’m still open to
> ideas and use-case scenarios.

We can probably just mix the two. I mean the main benefit of the CWL
is *sharing* workflows that have been described by others. That is the
point of the CWL and even at that it does not prove really great
(after all this time how much is shared?).
Since CWL and GWL can use the same file system and job submission
system, I think it is pretty OK for GWL to ignore the CWL and either
send data from one to the other or execute CWL pipelines from GWL.
Both are possible without much work.

Pj.