* Slurm with containers (i.e., orchestration)
From: Pjotr Prins @ 2020-05-18 12:49 UTC
  To: guix-devel

I am looking into some light-weight style orchestration. One
possibility is to use Slurm with Guix containers - on a cluster with
Guix that is almost trivial (we use Guix containers a lot! They are
great) and would also allow non-container jobs.

Once we have containers and Slurm it should also be possible to deploy
in some cloud infrastructure, provided there are no dependencies on
the cluster itself. I think it would make a terrific BLOG story if we
put something like that together.

Bcbio describes an architecture that uses the common workflow language
(CWL) to run pipelines with containers

https://bcbio-nextgen.readthedocs.io/en/latest/contents/cwl.html#running-with-cromwell-local-hpc

I am not promoting the use of this, but it shows that infrastructure
exists that can deploy workflows on containers in different setups
(Bcbio supports Slurm). I know the Guix infrastructure uses Guix
deploy to achieve similar roll-outs. What that lacks is the
orchestration mechanism itself which should handle dependencies
between jobs (i.e. a workflow). The GNU Workflow Language goes some
way, but it does not handle orchestration itself.

In other words, we almost have the pieces, but one thing is missing
:). Thoughts? I know I have brought this up before in different
guises, but we start to really need something here.

What makes orchestration? I guess it concerns a dynamic database of
machines that can execute jobs and some type of software registry
(Guix). Next it should be able to schedule and execute jobs using
some constraint specifiers (like network/CPU/RAM). It could be a
'dynamic' Slurm that makes use of real machines and VMs. Or hook into
an existing cloud service. A slurm job could monitor sending a
container into a cloud service.

I think we can build this up a step at a time.

Thoughts?

Pj.
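A minimal sketch of the Slurm-with-Guix-containers idea from the message above; it was not posted in the thread. It only prints the `sbatch` command lines for a two-step workflow, so it can be read and run without a cluster. `--parsable`, `--cpus-per-task`, `--mem`, `--dependency=afterok:` and `--wrap` are standard `sbatch` options, and `guix shell --container` is the current spelling of what was `guix environment --container --ad-hoc` when this thread was written. The function names, the `hello` package, and the resource values are assumptions chosen purely for illustration.

```bash
#!/bin/sh
# Dry-run sketch: print (do not submit) sbatch commands for a two-step workflow.
# Function names, package, and resource values are illustrative assumptions.

# Wrap a command so it runs inside an isolated Guix container.
# $1 = Guix package to make available, remaining args = command to run.
in_container() {
  package="$1"; shift
  printf '%s\n' "guix shell --container $package -- $*"
}

# Build one sbatch command line.
# $1 = job ID to wait for (empty for none), $2 = CPUs, $3 = memory, $4 = command.
sbatch_cmd() {
  after="$1"; cpus="$2"; mem="$3"; cmd="$4"
  opts="--parsable --cpus-per-task=$cpus --mem=$mem"
  # afterok: start only once the named job has finished successfully.
  if [ -n "$after" ]; then
    opts="$opts --dependency=afterok:$after"
  fi
  printf '%s\n' "sbatch $opts --wrap \"$cmd\""
}

# Step 1 has no dependency; step 2 waits for step 1.
# JOB1 is a literal placeholder here, since nothing is actually submitted.
JOB1='$JOB1'
sbatch_cmd ""      4 8G "$(in_container hello hello)"
sbatch_cmd "$JOB1" 1 2G "$(in_container hello hello --greeting=done)"
```

On a real cluster the job ID printed by the first `sbatch --parsable` call would be captured in `JOB1` before the second submission. Slurm then enforces the ordering between the two jobs, but retries, machine discovery, and hand-off to a cloud service remain the missing orchestration piece the message describes.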
* Re: Slurm with containers (i.e., orchestration)
From: Pjotr Prins @ 2020-05-18 13:11 UTC
  To: guix-devel

Ricardo added slurm-drmaa in the past (I can't believe it was almost 4
years ago that we packaged Slurm!), which may also help in addressing
some points:

http://www.drmaa.org/

Pj.
* Re: Slurm with containers (i.e., orchestration)
From: Begley Brothers Inc @ 2020-05-19 22:33 UTC
  To: Pjotr Prins; +Cc: guix-devel

On Mon, May 18, 2020 at 7:50 AM Pjotr Prins <pjotr.public12@thebird.nl> wrote:
>
> I am looking into some light-weight style orchestration. One

We think there is such a niche, the 80/20 rule.
We think containers are too limiting and a bad idea to target - but we
use them and they have mind share.
We also have some other ideas in mind, related to this context, but
we'll keep this on-topic.

Compromise:
Cast the issue in terms of a VM and let the 'hello world' MVP/example
be a VM that grabs a container from a registry, runs it to
completion, and shuts down.
Then demonstrate the use of Guix building the VM and show that you can
not only discard the container overhead, cruft and headaches, but
you also get a more powerful Dockerfile, and all the other Guix
features.
Then show that you can easily repurpose that VM workflow to a Metal
Machine (MM).

Of course in real world cases there are many scenarios where people
are looking for the reverse incrementalist pathway:
1.) Legacy-App + MM
2.) Legacy MM + (Guix + Legacy-App)
3.) Legacy MM + Guix + workflow
4.) Guix + workflow + (VM or MM or both)

> possibility is to use Slurm with Guix containers - on a cluster with
> Guix that is almost trivial (we use Guix containers a lot! They are
> great) and would also allow non-container jobs.

Hmm, doesn't Slurm break the opening objective 'light-weight'?
Maybe better to write a VM abstraction/adapter for something like
Tinkerbell/tink[1]; it's Apache-2.0, and some project context is
here[5].

Define the use case as: VMs that run a task launched by init and shut
themselves down when done - many of course have open-ended run times.
For multiple VM use cases:
There are a multitude of distributed computing tools that Guix leaves
the user free to choose among to build into their VM - Guix could take
no position on whether Condor, Nomad, etc. are better
suited to someone's problem.

With those constraints in mind, and lightweight being primary, then it
is simple to imagine Guix generating a VM version of such a
workflow[2] and delegating the workflow heavy lifting to
Tinkerbell/tink:

```bash
guix light ~/src/project/hello-world.tmpl
```

> Once we have containers and Slurm it should also be possible to deploy

slurm: -1
containers: -1

> in some cloud infrastructure, provided there are no dependencies on

I think you could get there and beyond with some relatively minor
(compared to Slurm) contributions to Tinkerbell (Apache-2.0).
This setup[3] targets an AWS instance, so you could likely leverage
`guix deploy` too.

> the cluster itself. I think it would make a terrific BLOG story if we
> put something like that together.
>
> Bcbio describes an architecture that uses the common workflow language
> (CWL) to run pipelines with containers
>
> https://bcbio-nextgen.readthedocs.io/en/latest/contents/cwl.html#running-with-cromwell-local-hpc
>
> I am not promoting the use of this, but it shows that infrastructure
> exists that can deploy workflows on containers in different setups

Again, we believe that if you think in terms of VMs (rather than
containers) there is a wider set of possible use cases.
If you build on Tinkerbell/tink or re-implement its logic - not clear
what you have in mind - you could also expand the Guix use cases to
workflows that include metal machine (MM) users/managers.

> (Bcbio supports Slurm). I know the Guix infrastructure uses Guix
> deploy to achieve similar roll-outs. What that lacks is the
> orchestration mechanism itself which should handle dependencies
> between jobs (i.e. a workflow).

We're not familiar with GNU workflow language - with that caveat:
We think that for a lightweight implementation you might be better off
defining the scope more narrowly but still expanding the universe of
Guix use cases as we outlined.
Once that is in place, experience might lead the project to be
opinionated on how 'jobs' are handled between machines (VM and MM).

Maybe getting to the point of a Guix entry in such Awesome-Metal lists[4].

> The GNU Workflow Language goes some
> way, but it does not handle orchestration itself.
>
> In other words, we almost have the pieces, but one thing is missing
> :).

Agreed. This does seem to be a gap.
The challenge is keeping the solution elegant, focussed, yet general.
Tinkerbell targets the “17 Unix Rules” so they may be interested in
accommodating a VM use case?
If not, it would still be possible to 'define' a VM that looks to
Tinkerbell like an MM.

> Thoughts? I know I have brought this up before in different
> guises, but we start to really need something here.
>
> What makes orchestration? I guess it concerns a dynamic database of
> machines that can execute jobs and some type of software registry
> (Guix).

That seems a reasonable initial scope definition, especially if you
recognize VM and MM as two distinct categories of machines to apply
Guix to.

> Next it should be able to schedule and execute jobs using
> some constraint specifiers (like network/CPU/RAM). It could be a
> 'dynamic' Slurm that makes use of real machines and VMs. Or hook into
> an existing cloud service. A slurm job could monitor sending a
> container into a cloud service.

Agreed. Those also strike us as 2nd order/stage scope elements - once
orchestrated VMs and MMs are running a Guix-deployed OS and apps.

> I think we can build this up a step at a time.

It is not clear, but it does sound like you could intend to implement
everything in Guix? Or take a more "build small, build modular, and
build simple" approach where Guix connects some pre-existing elements?
We've identified one in Tinkerbell/tink, where Guix would get the
benefit of "four microservices that take you from a powered off server
to a high-level execution environment running your very special custom
thingamabobber" but there may be others better suited?

>
> Thoughts?

Very interesting. Thanks for sharing.

[1]: https://github.com/tinkerbell/tink
[2]: https://github.com/tinkerbell/tink/blob/master/docs/hello-world.md
[3]: https://github.com/tinkerbell/tink/blob/master/docs/setup.md
[4]: https://github.com/alexellis/awesome-baremetal
[5]: https://www.packet.com/blog/open-sourcing-tinkerbell

--
Kind Regards

Begley Brothers Inc.

The content of this email is confidential and intended for the
recipient specified in message only. It is strictly forbidden to share
any part of this message with any third party, without a written
consent of the sender. If you received this message by mistake, please
reply to this message and follow with its deletion, so that we can
ensure such a mistake does not occur in the future.
This message has been sent as a part of discussion between Begley
Brothers Inc. and the addressee whose name is specified above. Should
you receive this message by mistake, we would be most grateful if you
informed us that the message has been sent to you. In this case, we
also ask that you delete this message from your mailbox, and do not
forward it or any part of it to anyone else. Thank you for your
cooperation and understanding.
Begley Brothers Inc. puts the security of the client at a high
priority. Therefore, we have put efforts into ensuring that the
message is error and virus-free. Unfortunately, full security of the
email cannot be ensured as, despite our efforts, the data included in
emails could be infected, intercepted, or corrupted. Therefore, the
recipient should check the email for threats with proper software, as
the sender does not accept liability for any damage inflicted by
viewing the content of this email.
The views and opinions included in this email belong to their author
and do not necessarily mirror the views and opinions of the company.
Our employees are obliged not to make any defamatory clauses,
infringe, or authorize infringement of any legal right. Therefore, the
company will not take any liability for such statements included in
emails. In case of any damages or other liabilities arising, employees
are fully responsible for the content of their emails.
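The "dynamic database of machines" with constraint specifiers, combined with the VM/MM distinction from the reply above, could start as small as a text inventory and a filter. The following is only a sketch under stated assumptions: the inventory format, the machine names and numbers, and the `inventory` and `eligible` function names are all made up for illustration, and a real registry would be populated dynamically rather than hard-coded.

```bash
#!/bin/sh
# Hypothetical machine inventory, one machine per line:
#   name  kind(vm|mm)  cpus  ram_gb
# All entries are invented; a real registry would be filled in dynamically.
inventory() {
  cat <<'EOF'
node01 mm 32 128
node02 mm 8 16
vm-a vm 4 8
vm-b vm 16 64
EOF
}

# Print the names of machines that satisfy a job's constraints.
# $1 = kind (vm, mm, or any), $2 = minimum CPUs, $3 = minimum RAM in GB.
eligible() {
  inventory | awk -v kind="$1" -v cpus="$2" -v ram="$3" '
    (kind == "any" || $2 == kind) && $3 + 0 >= cpus + 0 && $4 + 0 >= ram + 0 { print $1 }'
}

# Any machine type with at least 8 CPUs and 32 GB of RAM.
eligible any 8 32
# Only VMs, with at least 2 CPUs and 4 GB of RAM.
eligible vm 2 4
```

A scheduler would pick one machine from the returned list and hand the job to it; that selection policy, and keeping the inventory current, are exactly the "2nd order" scope elements the reply sets aside.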
* Re: Slurm with containers (i.e., orchestration)
From: Begley Brothers Inc @ 2020-05-20 2:13 UTC
  To: Pjotr Prins; +Cc: guix-devel

P.S Just to keep things interesting - GWL: A workflow management
language extension for GNU Guix

The Guix Workflow Language (GWL) provides a scientific computing
extension to GNU Guix's declarative language for package management
for the declaration of scientific workflows.

https://www.guixwl.org/tutorial

--
Kind Regards

Begley Brothers Inc.