From: Kyle <kyle@posteo.net>
To: Spencer Skylar Chan <schan12@terpmail.umd.edu>,
Ricardo Wurmus <rekado@elephly.net>
Cc: Simon Tournier <zimon.toutoune@gmail.com>, guix-devel@gnu.org
Subject: Re: Google Summer of Code 2023 Inquiry
Date: Fri, 31 Mar 2023 00:52:48 +0000
Message-ID: <E5565D20-B8F8-4933-BA0B-69E72077A058@posteo.net>
In-Reply-To: <c4218892-75f5-1a0d-9674-287bacde3730@terpmail.umd.edu>
As a statistician who always wants to get the most information for the least effort, I am particularly interested in being able to reprioritize workflow jobs interactively within the equivalent portions of the topological sort, that is, among jobs whose relative order the dependency graph does not constrain. I thought this might be possible with GWL if it could talk to SLURM through DRMAA version 2 (https://en.wikipedia.org/wiki/DRMAA). It would also be more readily useful to researchers if Guix had a conveniently available SLURM service that worked out of the box, even on a single machine.
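To make the reprioritization idea concrete, here is a minimal sketch in Python rather than in GWL. It assumes the workflow is a plain dependency graph and uses the standard library's graphlib module; the job names and the priority table are hypothetical, and nothing here is GWL or DRMAA API. Jobs that become ready together are not ordered relative to each other by the graph, so that is where a user-supplied priority can decide.

```python
from graphlib import TopologicalSorter


def schedule(dependencies, priority):
    """Return a valid execution order for the jobs in `dependencies`.

    `dependencies` maps each job to the set of jobs it depends on.
    `priority` maps job names to numbers (hypothetical, user-supplied);
    jobs missing from it default to 0. Within each batch of jobs that
    are ready at the same time, higher priority runs first and ties
    are broken by name so the result is deterministic.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    order = []
    while sorter.is_active():
        # Every job returned here has all of its dependencies satisfied,
        # so any ordering of this batch respects the graph.
        ready = sorted(
            sorter.get_ready(),
            key=lambda job: (-priority.get(job, 0), job),
        )
        order.extend(ready)
        sorter.done(*ready)
    return order


# Hypothetical workflow: two model fits depend on one preparation step.
dependencies = {
    "fit_a": {"prep"},
    "fit_b": {"prep"},
    "report": {"fit_a", "fit_b"},
}
```

Because the priority table is consulted once per batch, a scheduler built this way could re-read it between batches, which is roughly what interactive reprioritization would need.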
Stepping back, there might be a more ambitious question hidden in there: how to handle nondeterminism in a deterministic workflow manager. Without that external information, the problem just involves choosing your random seeds up front. However, I would prefer to write a procedure that constantly reprioritizes labeled subjobs within their associated containers until I either hit a resource limit or achieve certain target statistical diagnostics. Perhaps I would want GWL to tell me how to replay my build after the fact, so that the run is reproducible even though I did not know up front where to focus my computations and let the computer decide. Making that sort of thing possible might be a longer-term effort, but working out what is needed for the initial steps might be a fun project.
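One way to picture the replay idea is a decision log: the adaptive run records which labeled subjob it chose at each step, and a later run follows the log instead of deciding again. The sketch below is in Python and is only an illustration under stated assumptions. The function names, the chain labels, and the stopping rule are all hypothetical, and each subjob is assumed to be deterministic given its label (here by seeding a random generator from the label).

```python
import random


def run_adaptive(jobs, run_job, choose, budget, target_reached):
    """Run subjobs in an order decided on the fly.

    Stops when no jobs remain, when `budget` jobs have run, or when
    `target_reached(results)` is true. Returns the results together
    with the log of decisions, in the order they were made.
    """
    results, log = {}, []
    pending = list(jobs)
    while pending and len(log) < budget and not target_reached(results):
        job = choose(pending, results)  # the non-predetermined step
        pending.remove(job)
        results[job] = run_job(job)
        log.append(job)
    return results, log


def replay(log, run_job):
    """Re-run exactly the recorded subjobs, in the recorded order."""
    return {job: run_job(job) for job in log}


def run_job(label):
    # Assumption: a subjob's output depends only on its label.
    return random.Random(label).gauss(0.0, 1.0)


def choose(pending, results):
    # Hypothetical policy; a real one would inspect diagnostics in `results`.
    return max(pending)


results, log = run_adaptive(
    ["chain-1", "chain-2", "chain-3", "chain-4"],
    run_job,
    choose,
    budget=10,
    target_reached=lambda r: len(r) >= 2,  # stand-in for a real diagnostic
)
```

The log is the only extra artifact that has to be kept: given the same subjob definitions, replay(log, run_job) reproduces the results without needing the policy or the diagnostics that drove the original run.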
On March 30, 2023 7:27:37 PM EDT, Spencer Skylar Chan <schan12@terpmail.umd.edu> wrote:
>Hi Ricardo,
>
>On 3/23/23 03:58, Ricardo Wurmus wrote:
>> Hi,
>>
>> Spencer Skylar Chan <schan12@terpmail.umd.edu> writes:
>>
>>> One approach could be to add CWL import/export capabilities to
>>> GWL. Then Snakemake/GWL conversion would be a 2 step process, using
>>> CWL as an intermediate step:
>>>
>>> 1. Snakemake -> CWL
>>> 2. CWL -> GWL
>>
>> This seems doable.
>
>Great! I've been reading the chapter in Evolutionary Genomics on different scalable workflows to understand this process better.
>
>>> However, CWL is not as expressive as Snakemake. There may be some
>>> details that are lost from Snakemake workflows.
>>>
>>> So a 1-step Snakemake/GWL transpiler could be interesting, as both
>>> Snakemake/GWL use a domain-specific language inside a general purpose
>>> language (Python/Guile respectively). There may be a possibility to
>>> achieve more "accurate" translations between workflows.
>>
>> Compared to the previous approach this seems vastly more complex. It’s
>> one thing to *execute* Snakemake code without running it through Python,
>> but quite a bit more challenging to transpile Python to Scheme.
>>
>> Personally, I wouldn’t know where to start. Do you have an idea
>> already?
>>
>
>Actually I was hoping you might have some ideas :)
>I do think that if the execution of the pipeline is more important than its representation (Snakemake or otherwise), then it would make more sense to focus efforts on increasing GWL's capabilities.
>
>Thanks,
>Skylar