* Guix for Corporate "Batch Jobs"?
From: Yasuaki Kudo @ 2022-03-08 21:16 UTC
To: help-guix

Hi,

In many so-called Application Support jobs in enterprises, one of the core responsibilities is to see through the daily completion of "batch jobs" - those I/O-heavy processes that take a long time to run, even with parallel processing.

And at the core of it is to "re-run" the jobs, after due troubleshooting.

In many workplaces I have seen, teams ended up writing their own job schedulers based on cron or used proprietary software such as Autosys (and in Japan, there are local brews such as A-Auto, if I remember the name correctly).

But none of the solutions above takes good care of the mechanical incremental-computation aspect, and a lot of optimization (say, skip this and that because they don't matter during re-runs) depends on the operators' sweat and judgement 😅

Can Guix be put to good use in this area, do you think? Or maybe another way of asking this question is: can Guix be used as a general compiler such as 'make'? Given that 'make' still exists, is there any reason why Guix can't just take over?

Maybe similar questions have already been asked in the Nix world as well? I would love to know! 😄

-Yasu

* Re: Guix for Corporate "Batch Jobs"?
From: Phil @ 2022-03-08 23:18 UTC
To: Yasuaki Kudo; +Cc: help-guix

Hi Yasu,

Yasuaki Kudo writes:

> Hi,
>
> In many so-called Application Support jobs in the enterprises, one of the core responsibilities is to see through the daily completion of "batch jobs" - those I/O heavy processes that take a long time to run, even with parallel processing.
>
> And at the core of it is to "re-run" the jobs, after due troubleshooting.
>
> In many workplaces I have seen, teams ended up writing their own job schedulers based on cron or used proprietary software such as Autosys (and in Japan, there are local brews such as A-Auto, if I remember the name correctly).

Not sure if this is exactly what you're looking for - but Guix in my experience can sit at the centre of a tech stack for providing software on machines, and then batch-running that software in a very predictable way.

However, Guix is currently first and foremost a command-line tool, so I find myself augmenting it with other standard offerings to produce familiar front-ends for triggers, job processing, management, etc.

A few examples below.

I oversee the use of Guix in an enterprise environment. Initially it was used to build/test our software and also to provide deployments with dependencies etc. We wrapped Guix builds in Jenkins, which in turn integrates with our source control to trigger Guix using a standard branch workflow developers are used to. Guix fetches and caches any build dependencies, which makes subsequent builds faster, and makes artifacts available via a Guix substitute server to servers across the enterprise.

More recently, and probably more useful to you - I've been looking at taking the build outputs and making them available as batch jobs using Guix Workflow Language (https://guixwl.org) - which is a good fit if your batches are compute jobs with well-defined inputs, numerous dependent stages, and the requirement to reproduce identical numerical output. GWL provides lots of cool features - it's somewhat like Autosys in that it is declarative - defining dependencies (and thus an order) between different workflow processes etc. I don't think GWL can memoize different processes in a workflow though - so running a workflow several times results in all workflow processes being run, as far as I know. The point is you should be guaranteed the same result with the same inputs, every time.

I tend to wrap the GWL scripts in Rundeck (job scheduler) to allow less-technical staff to re-run batches through a web app or to construct a daily schedule for overnight/regression tests etc., rather than use the guix command line.

Note GWL isn't designed to be used if the aim of your batch jobs is to have a side effect on the server you're running on. We only use it to produce results from calculations. This is different to Autosys, where each job could be entirely made up of side effects which change the state of the server itself.

HTH,
Phil.
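
As a rough illustration of "declaring dependencies (and thus an order)", here is a minimal Python sketch. It is not GWL code and does not use any GWL API; the stage names are hypothetical, and the ordering is derived with the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical batch stages, each mapped to the set of stages it depends on.
# Only the dependencies are declared; no execution order is written down.
dependencies = {
    "load_trades": set(),
    "load_rates": set(),
    "price_trades": {"load_trades", "load_rates"},
    "aggregate_pnl": {"price_trades"},
}


def run_order(deps):
    """Return one valid execution order implied by the declared dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

The two `load_*` stages have no dependency on each other, so either may come first; a scheduler is free to run such stages in parallel.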

* Re: Guix for Corporate "Batch Jobs"?
From: Ricardo Wurmus @ 2022-03-09 8:20 UTC
To: Phil; +Cc: help-guix

Phil <phil@beadling.co.uk> writes:

> I don't think GWL can memoize
> different processes in a workflow tho - so running a workflow several
> times results in all workflow processes being run, as far as I know.

By default GWL caches outputs that have already been computed. Currently there's only one way to skip computation, and that is through files. When a computation results in a file, the output is cached; if the output already exists, the computation is not re-run unless explicitly requested.

(The GWL needs even more caching to avoid recomputing build scripts, but that's a separate issue.)

--
Ricardo
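
The file-based skipping described above can be sketched in a few lines of Python. This is only an illustration of the general idea (skip a step when its declared output files already exist, unless a re-run is forced), not GWL's actual implementation; `run_process`, `simulate`, and the file name are hypothetical.

```python
import tempfile
from pathlib import Path


def run_process(outputs, compute, force=False):
    """Run compute() unless every declared output file already exists.

    Returns True if the computation ran, False if it was skipped.
    A process with no declared outputs is always run, since there is
    nothing on disk to prove that it was completed before.
    """
    outputs = [Path(p) for p in outputs]
    if not force and outputs and all(p.exists() for p in outputs):
        return False
    compute()
    return True


calls = []  # records each real execution, for demonstration only
workdir = Path(tempfile.mkdtemp())
result_file = workdir / "pnl.csv"  # hypothetical output of a batch step


def simulate():
    calls.append("simulate")
    result_file.write_text("trade_id,pnl\n1,42.0\n")


first = run_process([result_file], simulate)               # output missing: runs
second = run_process([result_file], simulate)              # output present: skipped
third = run_process([result_file], simulate, force=True)   # explicit re-run
```

Note that this check only looks at whether the output exists, not at whether the inputs changed, so an operator re-running after fixing bad input data would need the forced path or would have to remove the stale output first.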

* Re: Guix for Corporate "Batch Jobs"?
From: Yasuaki Kudo @ 2022-03-09 8:49 UTC
To: Phil; +Cc: help-guix

Hi Phil,

Thank you so much, yes, this does help!

I was thinking of profit/loss simulations for millions of transactions at large financial companies. They typically have purpose-built libraries written in C and rely on server farms and beefy databases. The acceptable range of input for such systems is quite limited, and they do fail due to bad data, wrong assumptions about dates, business events, and so forth.

And I am always looking for a good place to start for international worker cooperatives spread around the globe. Providing a 24-hour "dev/op" network with Guix as one of the core competencies might do 😄

-Yasu