As a statistician who always wants to get the most information for the least effort, I am particularly interested in being able to reprioritize workflow jobs interactively within the equivalent portions of the topological sort, that is, among jobs whose relative order the dependency graph does not constrain. I thought perhaps this would be possible with GWL if it could talk to SLURM with DRMAA version 2 (https://en.wikipedia.org/wiki/DRMAA). This would also be more readily useful to researchers if Guix had a conveniently available slurm service which worked out of the box even on a single machine.

Stepping back, there might be a more ambitious question hidden in there: how to handle nondeterminism in a deterministic workflow manager. Without that kind of external input, the problem just involves choosing your random seeds up front. However, I would prefer to write a procedure that keeps reprioritizing labeled sub-jobs within their associated containers until either I hit a resource limit or I have achieved certain target statistical diagnostics. Perhaps I would want GWL to tell me how to replay my build after the fact, so that the run is reproducible even though I didn't know up front what I needed to focus my computations on and let the computer decide that.

Making that sort of thing possible might be a longer-term effort, but working out what's needed for initial steps might be a fun project.

On March 30, 2023 7:27:37 PM EDT, Spencer Skylar Chan wrote:
>Hi Ricardo,
>
>On 3/23/23 03:58, Ricardo Wurmus wrote:
>> Hi,
>>
>> Spencer Skylar Chan writes:
>>
>>> One approach could be to add CWL import/export capabilities to
>>> GWL. Then Snakemake/GWL conversion would be a 2 step process, using
>>> CWL as an intermediate step:
>>>
>>> 1. Snakemake -> CWL
>>> 2. CWL -> GWL
>>
>> This seems doable.
>
>Great! I've been reading the chapter in Evolutionary Genomics on different scalable workflows to understand this process better.
>
>>> However, CWL is not as expressive as Snakemake. There may be some
>>> details that are lost from Snakemake workflows.
>>>
>>> So a 1-step Snakemake/GWL transpiler could be interesting, as both
>>> Snakemake/GWL use a domain-specific language inside a general purpose
>>> language (Python/Guile respectively). There may be a possibility to
>>> achieve more "accurate" translations between workflows.
>>
>> Compared to the previous approach this seems vastly more complex. It’s
>> one thing to *execute* Snakemake code without running it through Python,
>> but quite a bit more challenging to transpile Python to Scheme.
>>
>> Personally, I wouldn’t know where to start. Do you have an idea
>> already?
>>
>
>Actually I was hoping you might have some ideas :)
>I do think that if the execution of the pipeline is more important than its representation (Snakemake or otherwise), then it would make more sense to focus efforts on increasing GWL's capabilities.
>
>Thanks,
>Skylar
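
P.S. To make the reprioritization idea at the top of this message a bit more concrete, here is a rough sketch in plain Python (not Guile, and not using any GWL, SLURM, or DRMAA API). It only assumes the standard-library graphlib module (Python 3.9+). The function name run_with_priorities, the job names, and the priority table are all hypothetical. The dependency graph decides which jobs are ready; among the ready jobs, a priority function that is re-evaluated at every step picks the next one. The returned order doubles as a log that could be replayed later.

```python
from graphlib import TopologicalSorter

def run_with_priorities(deps, priority):
    """Return an execution order that respects `deps`.

    deps:     dict mapping each job to the set of jobs it depends on.
    priority: callable(job) -> number; it is called again on every pick,
              so priorities may change while the workflow is running.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    ready = []   # jobs whose dependencies are all done
    order = []   # what actually ran, in order (the replay log)
    while sorter.is_active():
        ready.extend(sorter.get_ready())
        job = max(ready, key=priority)   # highest priority among ready jobs
        ready.remove(job)
        order.append(job)                # a real runner would submit the job here
        sorter.done(job)                 # unlocks jobs that depended on it
    return order

# Hypothetical workflow: three sampling chains between a prep and a summary step.
deps = {
    "chain_a": {"prep"},
    "chain_b": {"prep"},
    "chain_c": {"prep"},
    "summarize": {"chain_a", "chain_b", "chain_c"},
}
priorities = {"chain_a": 1, "chain_b": 3, "chain_c": 2}
order = run_with_priorities(deps, lambda job: priorities.get(job, 0))
print(order)
```

Because the priority function is looked up fresh on each pick, an interactive user or a diagnostic check could update the priorities table between steps without touching the dependency graph. Stopping on a resource limit or a target diagnostic would be an extra condition in the loop, which I have left out here.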