From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludo@gnu.org (Ludovic Courtès)
Subject: Guix on clusters and in HPC
Date: Tue, 18 Oct 2016 16:20:43 +0200
Message-ID: <87r37divr8.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
List-Id: "Development of GNU Guix and the GNU System distribution."
To: Guix-devel

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Hello,

I’m trying to gather a “wish list” of things to be done to facilitate
the use of Guix on clusters and for high-performance computing (HPC).

Ricardo and I wrote about the advantages, shortcomings, and perspectives
before:

  http://elephly.net/posts/2015-04-17-gnu-guix.html
  https://hal.inria.fr/hal-01161771/en

I know that Pjotr, Roel, Ben, Eric and maybe others also have experience
and ideas on what should be done (and maybe even code? :-)).

So I’ve come up with an initial list of work items going from the
immediate needs to crazy ideas (batch scheduler integration!) that
hopefully make sense to cluster/HPC people.  I’d be happy to get
feedback, suggestions, etc. from whoever is interested!

(The reason I’m asking is that I’m considering submitting a proposal at
Inria to work on some of these things.)

TIA!  :-)

Ludo’.

--=-=-=
Content-Type: text/x-org; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

- non-root usage
  + file system virtualization needed
    * map ~/.local/gnu/store to /gnu/store
    * user namespaces?
    * [[https://github.com/proot-me/PRoot/][PRoot]]? but performance problems?
    * common interface, like “guix enter” spawns a shell where
      /gnu/store is available
  + daemon functionality as a library
    * client no longer connects to the daemon, does everything locally,
      including direct store accesses
    * can use substitutes
  + or plain ‘guix-daemon --disable-chroot’?
  + see [[http://lists.gnu.org/archive/html/help-guix/2016-06/msg00079.html][discussion with Ben Woodcroft and Roel]]
- central daemon usage (like at MDC, but improved)
  + describe/define appropriate setup, like:
    * daemon runs on front-end node
    * clients can connect to daemon from compute nodes, and perform any
      operation
    * use of distributed file systems: anything to pay attention to?
    * how should the front-end offload to compute nodes?
  + technical issues
    * daemon needs to be able to listen for connections elsewhere
    * client needs to be able to [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381][connect remotely]] instead of using
      [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20381#5][‘socat’ hack]]
    * how do we share localstatedir?  how do we share /gnu/store?
    * how do we share the profile directory?
  + admin/social issues
    * daemon runs as root
    * daemon needs Internet access
    * Ricardo mentions lack of nscd and problems caused by the use of
      NSS plugins like [[https://fedoraproject.org/wiki/Features/SSSD][SSSD]] in this context
  + batch scheduler integration?
    * allow users to offload right from their machine to the cluster?
- package variants, experimentation
  + for experiments, as in Section 4.2 of [[https://hal.inria.fr/hal-01161771/en][the RepPar paper]]
    * in the meantime we added [[https://www.gnu.org/software/guix/manual/html_node/Package-Transformation-Options.html][--with-input et al.]]; need more?
  + for [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg00005.html][CPU-specific optimizations]]
  + somehow support -mtune=native (and even profile-guided
    optimizations?)
  + simplify the API to switch compilers, libcs, etc.
- workflow, reproducible science
  + implement [[http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22629][channels]]
  + provide a way to see which Guix commit is used, like “guix channel
    describe”
  + simple ways to [[https://lists.gnu.org/archive/html/guix-devel/2016-10/msg00701.html][test the dependents of a package]] (see also
    discussion between E. Agullo & A. Enge)
    * new transformation options: --with-graft, --with-source recursive
  + support [[https://lists.gnu.org/archive/html/guix-devel/2016-05/msg00380.html][workflows and pipelines]]?
  + add [[https://github.com/galaxyproject/galaxy/issues/2778][Guix support in Galaxy]]?

--=-=-=--