From: Ricardo Wurmus
Subject: Re: Guix on clusters and in HPC
Date: Wed, 26 Oct 2016 14:08:23 +0200
To: myglc2
Cc: Guix-devel

myglc2 writes:

> While SGE is dated and can be a bear to use, it provides a useful
> yardstick for HPC/Cluster functionality. So it is useful to consider how
> Guix(SD) might impact this model. Presumably a defining characteristic
> of GuixSD clusters is that the software configuration of compute hosts
> no longer needs to be fixed and the user can "dial in" a specific SW
> configuration for each job step. This is in many ways a good thing. But
> it also generates new requirements. How does one specify the SW config
> for a given job or recipe step:
>
> 1) VM image?
>
> 2) VM?
>
> 3) Installed System Packages?
>
> 4) Installed (user) packages?

At the MDC we’re using SGE and users specify their software environment
in the job script. The software environment is a Guix profile, so the
job script usually contains a line to source the profile’s
“etc/profile”, which has the effect of setting up the required
environment variables.
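A job script along these lines might look like the following sketch
(the job name, profile path, and final command are illustrative, not
taken from any actual MDC setup):

```shell
#!/bin/sh
#$ -N guix-example    # SGE job name
#$ -cwd               # run from the submission directory

# Source the Guix profile to set up PATH and related environment
# variables.  ~/.guix-profile is the default user profile; a named
# profile created with `guix package -p` works the same way.
GUIX_PROFILE="$HOME/.guix-profile"
. "$GUIX_PROFILE/etc/profile"

# Tools installed in the profile are now on PATH.
my-analysis-tool --input data.txt
```

Since the profile is just a directory in the store, different jobs can
source different profiles without any change to the compute nodes.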
I don’t know of anyone who uses VMs or VM images to specify software
environments.

> Based on my experiments with Guix/Debian, GuixSD, VMs, and VM images it
> is not obvious to me which of these levels of abstraction is
> appropriate.

FWIW we’re using Guix on top of CentOS 6.8. The store is mounted
read-only on all cluster nodes.

> The most forward-thinking group that I know discarded their cluster
> hardware a year ago to replace it with starcluster
> (http://star.mit.edu/cluster/). Starcluster automates the creation,
> care, and feeding of HPC clusters on AWS using the Grid Engine
> scheduler and AMIs. The group has a full-time "starcluster jockey" who
> manages their cluster and they seem quite happy with the approach. So
> you may want to consider starcluster as a model when you think of
> cluster management requirements.

When using starcluster, are software environments transferred to AWS on
demand? Does this happen on a per-job basis? Are any of the
instantiated machines persistent, or are they discarded after use?

~~
Ricardo