From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp10.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms9.migadu.com with LMTPS id IKGXFVP5JmR7GgAASxT56A (envelope-from ) for ; Fri, 31 Mar 2023 17:16:35 +0200 Received: from aspmx1.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp10.migadu.com with LMTPS id GAawE1P5JmRnGQAAG6o9tA (envelope-from ) for ; Fri, 31 Mar 2023 17:16:35 +0200 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id C27B424341 for ; Fri, 31 Mar 2023 17:16:34 +0200 (CEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1piGU6-00074D-Cw; Fri, 31 Mar 2023 11:16:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1piGU4-00073w-Nz for guix-devel@gnu.org; Fri, 31 Mar 2023 11:16:08 -0400 Received: from mout01.posteo.de ([185.67.36.65]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1piGU0-0006BU-3s for guix-devel@gnu.org; Fri, 31 Mar 2023 11:16:08 -0400 Received: from submission (posteo.de [185.67.36.169]) by mout01.posteo.de (Postfix) with ESMTPS id CAF7E24028A for ; Fri, 31 Mar 2023 17:16:01 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.net; s=2017; t=1680275761; bh=rBGHbjvcOgawmTrYE7wqeSTvVtb3JsIEZJrJ2J16Fg0=; h=Date:From:To:CC:Subject:From; b=e0BaOEvQu5cydDKk4HRLRXqMHyD9QqA7fxbrJaAGcq46lVAOSejMJcqlwbSW6T8q8 TuXzwp5px0vDN3wveNfjAwJma7y3miyFvqivfPdHzdRJZXz7N3RKmJhc49e8XVWTMi Ft1ihX/4h0T+kBfcfUdGhlr5Num8JN+Hon2DS/itiol9Ko8TQzlUbm69yX9lUYE8Xp bQVOO3Pynvac5MHQCjMa1Tc0dW6sz9EKZ+5CVi2LXF0XUQJ0kg96GkrDs5x4d3sFSu 
L7hgOAbjbkNiY5VD7LCjfXadv7w/Nhh/rsEO5UgmeUGknC8JkKn4xejk4GxN8O6Ht/ +vVfSwxMxdCDw== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4Pp3lN0t1hz6twP; Fri, 31 Mar 2023 17:15:59 +0200 (CEST) Date: Fri, 31 Mar 2023 15:15:55 +0000 From: Kyle To: Spencer Skylar Chan , Ricardo Wurmus , Simon Tournier CC: guix-devel@gnu.org Subject: Re: Google Summer of Code 2023 Inquiry In-Reply-To: <42aa5844-0769-e122-efd7-8a152070c71c@terpmail.umd.edu> References: <6d30ee7b-f1f0-9199-fea8-efd434c8611c@terpmail.umd.edu> <86sfeb9zx8.fsf@gmail.com> <87ttycir7r.fsf@elephly.net> <42aa5844-0769-e122-efd7-8a152070c71c@terpmail.umd.edu> Message-ID: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary=----33BCSSDPPQ4HUXAE1B4EMR5AP5MMUV Content-Transfer-Encoding: 7bit Received-SPF: pass client-ip=185.67.36.65; envelope-from=kyle@posteo.net; helo=mout01.posteo.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: guix-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Development of GNU Guix and the GNU System distribution." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+larch=yhetil.org@gnu.org Sender: guix-devel-bounces+larch=yhetil.org@gnu.org X-Migadu-Country: US X-Migadu-Flow: FLOW_IN ARC-Seal: i=1; s=key1; d=yhetil.org; t=1680275795; a=rsa-sha256; cv=none; b=WmBTlh/jmBJc/9wJdb1XILLzDgEEmCSqh63ofwmtyTek94/K24d0Bie7fMwlA1f3CMeANP WEJKVcArGGOGZGU83l+u5Sl3G6/9KQ+W0SQFryNrdtMH9wrfl5TJmHhNbkJEzbf4ArYnC7 ca0OlkYhyhl6iO7KKTD3H3Uo6WEWs09A8VHjwd3Mn3s9qEYFF1j9IbRtwVsBb0FS23++TG xFkXD6+RHeLNBavv10ayFMy7eYB0vnpB0Jt8ZeW6XVNrP5jS8vliFn5ny124Nom3vp34z/ Ia5GKgdv+x9AjYZXxkjTGzv8cALWK9udvFOT//lmhaUtH94zKZsvMysR185Qwg== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=pass header.d=posteo.net header.s=2017 header.b=e0BaOEvQ; dmarc=pass (policy=none) header.from=posteo.net; spf=pass (aspmx1.migadu.com: domain of "guix-devel-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="guix-devel-bounces+larch=yhetil.org@gnu.org" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1680275795; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post:dkim-signature; bh=j00meb//0zDkHlEBHMA9AFhp1H5Jl+NhhMBCzC1srns=; b=fJfgD3GT3JM3kDUlzqQJevH6jmxPhFShkrO5oa2efDQp46wUMElX82YsPQhsnkMSq+Zf0d 8lBpoSMRCVBdJ7n31cHd1Zfvdf51T0LMqXPMET/HwGAC/tPWZarQyI/wrCHENeap3zaZZb jdRX8uGtfBeM4cm5jUforHDaAzB/ETtuVK3nqXt89ZhWyddZMDVqK0GrkVK4M77/mgTsKL 0gMTkVAnYwdvTGh8XzejX4L+2cUYPZUkJvRSdxBrC0V7Z0P/PtGwdJmzkA1N9uPp4n4RY2 EjSkmMDvpd1BbY+dACnGt/W5Tl/2dxuP+aOQRL1WNDQB8eUeo2UB9vH3VKQdNA== Authentication-Results: aspmx1.migadu.com; dkim=pass header.d=posteo.net header.s=2017 header.b=e0BaOEvQ; dmarc=pass (policy=none) header.from=posteo.net; spf=pass 
(aspmx1.migadu.com: domain of "guix-devel-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="guix-devel-bounces+larch=yhetil.org@gnu.org" X-Migadu-Scanner: scn0.migadu.com X-Migadu-Spam-Score: -10.03 X-Spam-Score: -10.03 X-Migadu-Queue-Id: C27B424341 X-TUID: /uPQyDEHo3N4

------33BCSSDPPQ4HUXAE1B4EMR5AP5MMUV
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

I would expect most software versions not to be in Guix. Simon had mentioned that this is mostly what the guix-past repository is for. However, some packages might be buried on some branch or commit in some Guix-related Git repository. It may be helpful to facilitate their discovery and extraction for the Conda import.

Git has a newish binary file format for caching searches across commits. Maybe it would be helpful to figure out how to parse this format (it's documented) and index the data further using Xapian or a graph data structure (or tree-sitter?), with the relevant metadata needed to find and efficiently extract Scheme code and its dependencies?

You make an interesting point about compilation errors. It may be more productive to help researchers test for working, satisfiable configurations, as a more relaxed approach than having to specify the exact software version. Maybe some "nearby" or newer version is packaged, and that is enough to successfully run a test suite? I'm imagining something between git bisect and Guix's own package solver.

It might also be productive to add infrastructure to help scientists more conveniently track and study their recent packaging experiments. Guix will only become more useful as more packages become available. Work which makes packaging more approachable to more people benefits everyone. Perhaps you can think of other ideas in this direction?
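To make the "nearby or newer version" idea a little more concrete, here is a minimal Python sketch. It assumes purely numeric dotted versions; the function names and the version list are hypothetical and not part of Guix or Conda. It prefers an exact match, then the smallest newer version, then the newest older one. The chosen candidate would still have to pass the test suite.

```python
from typing import Optional, Sequence, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.24.3' into (1, 24, 3). Assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def nearest_packaged(requested: str, available: Sequence[str]) -> Optional[str]:
    """Pick a candidate version to try in place of an exact pin.

    Preference order: exact match, smallest newer version, newest older version.
    Returns None when nothing is packaged at all.
    """
    if not available:
        return None
    want = parse_version(requested)
    ranked = sorted(available, key=parse_version)
    at_or_above = [v for v in ranked if parse_version(v) >= want]
    return at_or_above[0] if at_or_above else ranked[-1]


# Hypothetical list of packaged versions, for illustration only.
packaged = ["1.21.6", "1.23.2", "1.24.2"]
print(nearest_packaged("1.22.0", packaged))  # 1.23.2
print(nearest_packaged("1.26.0", packaged))  # 1.24.2
```

A real tool would presumably walk several candidates in this order and stop at the first one whose test suite passes, which is where the resemblance to git bisect comes in.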
On March 30, 2023 7:22:14 PM EDT, Spencer Skylar Chan wrote:
>Hi Kyle,
>
>On 3/24/23 14:59, Kyle wrote:
>> I am a bit worried that your proposed project is too focused on replacing Python with Guile. I think the project would benefit more from making Python users more comfortable productively using Guix tools in concert with the tools they are already comfortable with.
>
>Yes, I agree with you. Replacing Python with Guile is a much more ambitious task and is not the highest priority here.
>
>> I'm wondering if you might consider modifying your project goals toward exploring how GWL might be enhanced so that it could better complement more expressive language-specific workflow tools like Snakemake. I am also personally interested in exploring such facilities from the targets workflow system in R as well. Alternatively, perhaps you could focus on extending the GWL with more features?
>
>I would also be interested in extending GWL with more features; I will follow up on this on the GWL mailing list.
>
>> I agree that establishing an achievable scope within a short timeline is crucial. The conda env importer idea would be quite an ambitious undertaking by itself and would lead you towards thinking about some pretty interesting and impactful problems.
>
>While it's a challenging project, it could be broken into smaller steps:
>
>1. import packages by exact matching names only, without versioning.
>2. extend `guix import` to have `guix import conda` to help with package names that do not match exactly, and to accelerate adoption of Conda packages not in Guix
>3. match software version numbers when translating Conda packages to Guix
>
>What's currently undefined is the error handling:
>- if a Conda package does not exist in Guix
>- if the dependency graph is not solvable
>- if compiling the environment fails (due to mismatching dependency versions)
>
>I believe there are many satisfactory stopping points for
successful completion within the timeline of the summer, which I hope to present with my proposal soon.
>
>Thanks,
>Skylar
>
>>
>> On March 22, 2023 5:44:52 PM EDT, Spencer Skylar Chan wrote:
>>
>> Hi Ricardo,
>>
>> On 3/22/23 14:19, Ricardo Wurmus wrote:
>>
>> - Translating Snakemake to Guix Workflow Language (GWL)
>>
>> Ricardo, maybe you would have some suggestions. :-)
>>
>> Oh, this looks interesting. Could you please elaborate on the idea?
>>
>> My idea is to take as input a Snakemake workflow file and eventually output an equivalent GWL workflow file.
>>
>> Currently, Snakemake workflows can be exported to CWL (Common Workflow Language):
>>
>> https://snakemake.readthedocs.io/en/stable/executing/interoperability.html
>>
>> One approach could be to add CWL import/export capabilities to GWL. Then Snakemake/GWL conversion would be a 2-step process, using CWL as an intermediate step:
>>
>> 1. Snakemake -> CWL
>> 2. CWL -> GWL
>>
>> However, CWL is not as expressive as Snakemake. There may be some details that are lost from Snakemake workflows.
>>
>> So a 1-step Snakemake/GWL transpiler could be interesting, as both Snakemake/GWL use a domain-specific language inside a general purpose language (Python/Guile respectively). There may be a possibility to achieve more "accurate" translations between workflows.
>>
>> Is this topic something that could fit into a summer project?
>>
>
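As an illustration of the first step of the staged importer plan quoted above (exact name matching, no versioning), here is a minimal Python sketch. It is not an existing `guix import` backend: `CondaSpec`, `parse_spec`, `match_by_name` and the set of Guix package names are hypothetical, and the spec parsing covers only simple forms such as `name`, `name=version` and `name>=version`. Unmatched names are collected rather than raised, since the quoted message leaves open how the "package does not exist in Guix" case should be handled.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple


class CondaSpec(NamedTuple):
    name: str
    version: Optional[str]  # only set for pins such as "numpy=1.24.2"


def parse_spec(spec: str) -> CondaSpec:
    """Split a simple Conda dependency string into a name and an optional pin."""
    match = re.match(r"\s*([A-Za-z0-9_.-]+)\s*(.*)", spec)
    if match is None:
        raise ValueError(f"unrecognised Conda spec: {spec!r}")
    name, rest = match.group(1).lower(), match.group(2).strip()
    # "=1.24.2" or "==1.24.2" is a pin; ranges such as ">=4.2" are left unpinned.
    pinned = re.match(r"==?([^=,\s]+)", rest)
    return CondaSpec(name, pinned.group(1) if pinned else None)


def match_by_name(
    specs: Iterable[str], guix_names: Set[str]
) -> Tuple[List[CondaSpec], List[CondaSpec]]:
    """Step 1 only: keep specs whose name exists verbatim among the Guix names."""
    found: List[CondaSpec] = []
    missing: List[CondaSpec] = []
    for spec in map(parse_spec, specs):
        (found if spec.name in guix_names else missing).append(spec)
    return found, missing


# Hypothetical environment and Guix package names, for illustration only.
env = ["samtools=1.16.1", "r-base>=4.2", "numpy"]
guix_names = {"samtools", "python-numpy", "r-minimal"}
found, missing = match_by_name(env, guix_names)
print([s.name for s in found])    # ['samtools']
print([s.name for s in missing])  # ['r-base', 'numpy']
```

The `missing` list is where step 2 would take over, for example to map `numpy` to `python-numpy`. The recorded `version` field is unused here and would only matter for step 3.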
------33BCSSDPPQ4HUXAE1B4EMR5AP5MMUV--