From: Begley Brothers Inc
Date: Tue, 19 May 2020 17:33:09 -0500
Subject: Re: Slurm with containers (i.e., orchestration)
To: Pjotr Prins
References: <20200518124900.jkr5rts5bnslrkqg@thebird.nl>
In-Reply-To: <20200518124900.jkr5rts5bnslrkqg@thebird.nl>
Cc: guix-devel
List-Id: "Development of GNU Guix and the GNU System distribution."

On Mon, May 18, 2020 at 7:50 AM Pjotr Prins wrote:
>
> I am looking into some light-weight style orchestration. One

We think there is such a niche, the 80/20 rule. We think containers are
too limiting and a bad idea to target - but we use them and they have
mind share. We also have some other ideas in mind, related to this
context, but we'll keep this on-topic.

Compromise: Cast the issue in terms of a VM and let the 'hello world'
MVP/example be a VM that grabs a container from a registry, runs it to
completion, and shuts down. Then demonstrate the use of Guix building
the VM and show that you can not only discard the container overhead,
cruft and headaches, but you also get a more powerful Dockerfile, and
all the other Guix features. Then show that you can easily repurpose
that VM workflow to a Metal Machine.

Of course in real world cases there are many scenarios where people are
looking for the reverse incrementalist pathway:

1.) Legacy-App + MM
2.) Legacy MM + (Guix + Legacy-App)
3.) Legacy MM + Guix + workflow
4.) Guix + workflow + (VM or MM or both)

> possibility is to use Slurm with Guix containers - on a cluster with
> Guix that is almost trivial (we use Guix containers a lot! They are
> great) and would also allow non-container jobs.

Hmm, doesn't Slurm break the opening objective 'light-weight'? Maybe
better to write a VM abstraction/adapter for something like
Tinkerbell/tink[1]; it's Apache-2.0, and some project context is
here[5].

Define the use case as: VMs that run a task launched by init and shut
themselves down when done - many of course have open-ended run times.

For multiple VM use cases: There are a multitude of distributed
computing tools that Guix leaves the user free to choose among to build
into their VM - Guix could take no position on whether Condor, Nomad,
etc. are better suited to someone's problem.

With those constraints in mind, and lightweight being primary, it is
simple to imagine Guix generating a VM version of such a workflow[2]
and delegating the workflow heavy lifting to Tinkerbell/tink:

```bash
# Hypothetical invocation: `guix light` is imagined here, not an existing Guix subcommand.
guix light ~/src/project/hello-world.tmpl
```

> Once we have containers and Slurm it should also be possible to deploy

slurm: -1
containers: -1

> in some cloud infrastructure, provided there are no dependencies on

I think you could get there and beyond with some relatively minor
(compared to Slurm) contributions to Tinkerbell (Apache-2.0). This
setup[3] targets an AWS instance, so you could likely leverage
`guix deploy` too.

> the cluster itself. I think it would make a terrific BLOG story if we
> put something like that together.
>
> Bcbio describes an architecture that uses the common workflow language
> (CWL) to run pipelines with containers
>
> https://bcbio-nextgen.readthedocs.io/en/latest/contents/cwl.html#running-with-cromwell-local-hpc
>
> I am not promoting the use of this, but it shows that infrastructure
> exists that can deploy workflows on containers in different setups

Again we believe if you think in terms of VMs (rather than containers)
there is a wider set of possible use cases. If you build on
Tinkerbell/tink or re-implement its logic - it is not clear which you
have in mind - you could also expand the Guix use cases to workflows
that include metal machine (MM) users/managers.

> (Bcbio supports Slurm). I know the Guix infrastructure uses Guix
> deploy to achieve similar roll-outs. What that lacks is the
> orchestration mechanism itself which should handle dependencies
> between jobs (i.e. a workflow).

We're not familiar with the GNU Workflow Language - with that caveat:
We think that for a lightweight implementation you might be better off
defining the scope more narrowly but still expanding the universe of
Guix use cases as we outlined. Once that is in place, experience might
lead the project to be opinionated on how 'jobs' are handled between
machines (VM and MM). Maybe getting to the point of a Guix entry in
such Awesome-Metal lists[4].

> The GNU Workflow Language goes some
> way, but it does not handle orchestration itself.
>
> In other words, we almost have the pieces, but one thing is missing
> :).

Agreed. This does seem to be a gap. The challenge is keeping the
solution elegant, focussed, yet general. Tinkerbell targets the
“17 Unix Rules” so they may be interested in accommodating a VM use
case? If not it would still be possible to 'define' a VM that looks to
Tinkerbell like a MM.

> Thoughts? I know I have brought this up before in different
> guises, but we start to really need something here.
>
> What makes orchestration?
> I guess it concerns a dynamic database of
> machines that can execute jobs and some type of software registry
> (Guix).

That seems a reasonable initial scope definition, especially if you
recognize VM and MM as two distinct categories of machines to apply
Guix to.

> Next it should be able to schedule and execute jobs using
> some constraint specifiers (like network/CPU/RAM). It could be a
> 'dynamic' Slurm that makes use of real machines and VMs. Or hook into
> an existing cloud service. A slurm job could monitor sending a
> container into a cloud service.

Agreed. Those also strike us as 2nd order/stage scope elements - once
orchestrated VMs and MMs are running Guix deployed OS and apps.

> I think we can build this up a step at a time.

It is not clear, but it does sound like you could intend to implement
everything in Guix? Or take a more "build small, build modular, and
build simple" approach where Guix connects some pre-existing elements?
We've identified one in Tinkerbell/tink, where Guix would get the
benefit of "four microservices that take you from a powered off server
to a high-level execution environment running your very special custom
thingamabobber" but there may be others better suited?

>
> Thoughts?

Very interesting. Thanks for sharing.

[1]: https://github.com/tinkerbell/tink
[2]: https://github.com/tinkerbell/tink/blob/master/docs/hello-world.md
[3]: https://github.com/tinkerbell/tink/blob/master/docs/setup.md
[4]: https://github.com/alexellis/awesome-baremetal
[5]: https://www.packet.com/blog/open-sourcing-tinkerbell

-- 
Kind Regards

Begley Brothers Inc.

The content of this email is confidential and intended for the
recipient specified in message only. It is strictly forbidden to share
any part of this message with any third party, without a written
consent of the sender. If you received this message by mistake, please
reply to this message and follow with its deletion, so that we can
ensure such a mistake does not occur in the future.