From mboxrd@z Thu Jan 1 00:00:00 1970
References: <87a7f5l6e1.fsf@mdc-berlin.de> <8736knieo3.fsf@kyleam.com>
From: Ricardo Wurmus
Subject: Re: Next steps for the GWL
In-Reply-To: <8736knieo3.fsf@kyleam.com>
Date: Thu, 6 Jun 2019 12:11:08 +0200
Message-ID: <87v9xjja6b.fsf@mdc-berlin.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
To: Kyle Meyer
Cc: gwl-devel@gnu.org
List-ID:

Hi Kyle,

thanks for your comments!

> One of the things I'd love to do
> with GWL is to make it play well with git-annex, something that would
> almost certainly be too specific for GWL itself. For example
>
>   * Make data caching git-annex aware. When deciding to recompute data
>     files, GWL avoids computing the hash of data files, using scripts as
>     the cheaper proxy, as you described in 87womnnjg0.fsf@elephly.net.
>     But if the user is tracking data files with git-annex, getting the
>     hash of data files becomes less expensive because we can ask
>     git-annex for the hash it has already computed.
>
>   * Support getting annex data files on demand (i.e. 'git annex get') if
>     they are needed as inputs.

I wonder what the protocol should look like. Should a workflow
explicitly request a “git annex” file, or should it be up to the person
running the workflow, i.e. when “git annex” has been configured to be
the cache backend it would simply look up the declared input/output
files there?

I suppose the answers would equally apply to using IPFS as a cache.

>> * add support for executing processes in isolated environments
>>   (containers) -- this requires a better understanding of process inputs.
>
> This is another one I'm especially excited about. Functionality-wise,
> are you imagining essentially matching the options available for 'guix
> environment --container ...'?

So far this is all I’ve got:

--8<---------------cut here---------------start------------->8---
diff --git a/gwl/processes.scm b/gwl/processes.scm
index beb61cc..264807f 100644
--- a/gwl/processes.scm
+++ b/gwl/processes.scm
@@ -19,13 +19,19 @@
   #:use-module ((guix derivations)
                 #:select (derivation->output-path
                           build-derivations))
+  #:use-module ((guix packages)
+                #:select (package-file))
   #:use-module (guix gexp)
-  #:use-module ((guix monads) #:select (mlet return))
+  #:use-module ((guix monads) #:select (mlet mapm return))
   #:use-module (guix records)
   #:use-module ((guix store)
                 #:select (run-with-store
                           with-store
                           %store-monad))
+  #:use-module ((guix modules)
+                #:select (source-module-closure))
+  #:use-module (gnu system file-systems)
+  #:use-module (gnu build linux-container)
   #:use-module (ice-9 format)
   #:use-module (ice-9 match)
   #:use-module (srfi srfi-1)
@@ -276,6 +282,54 @@ plain S-expression."
        (call process code)))
     (whatever (error (format #f "unsupported procedure: ~a\n" whatever)))))
 
+;; WIP
+(define (containerize exp process)
+  "Wrap EXP, an S-expression or G-expression, in a G-expression that
+causes EXP to be run in a container according to the requirements
+specified in PROCESS."
+  (let* ((package-dirs
+          (with-store store
+            (run-with-store store
+              (mapm %store-monad package-file
+                    (process-package-inputs process)))))
+         (data-inputs
+          (process-data-inputs process))
+         (output-dirs
+          (delete-duplicates
+           (map dirname (process-outputs process))))
+         (input-mappings
+          (map (lambda (location)
+                 (file-system-mapping
+                  (source location)
+                  (target location)
+                  (writable? #f)))
+               (lset-difference string=?
+                                (append package-dirs
+                                        data-inputs)
+                                output-dirs)))
+         (output-mappings
+          (map (lambda (dir)
+                 (file-system-mapping
+                  (source dir)
+                  (target dir)
+                  (writable? #t)))
+               output-dirs))
+         (specs
+          (map (compose file-system->spec
+                        file-system-mapping->bind-mount)
+               (append input-mappings
+                       output-mappings))))
+    (with-imported-modules (source-module-closure
+                            '((gnu build linux-container)
+                              (gnu system file-systems)))
+      #~(begin
+          (use-modules (gnu build linux-container)
+                       (gnu system file-systems))
+          (call-with-container (append %container-file-systems
+                                       (map spec->file-system
+                                            '#$specs))
+            (lambda () #$exp))))))
+
 ;;; ---------------------------------------------------------------------------
 ;;; ADDITIONAL FUNCTIONS
 ;;; ---------------------------------------------------------------------------
--8<---------------cut here---------------end--------------->8---

This means that it can map file systems into the container and then run
the process expression in that environment.

One thing I’m not happy about is that I can only mount directories, and
not individual files that have been declared as inputs. I’d like to
have more fine-grained access. I suppose it might be possible to mount
just the relevant parts of the GWL cache, but I need to play with this
to better understand what the desired behaviour would be.

-- 
Ricardo
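
For illustration, here is a minimal sketch in plain Guile (SRFI-1 only,
no Guix or GWL modules) of how the patch above splits paths into
read-only and writable directory mounts. The two top-level lists and
all paths are made-up stand-ins for the values of process-data-inputs
and process-outputs. This shows only the set arithmetic, not how GWL
represents inputs.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-ins for (process-data-inputs process) and
;; (process-outputs process); these paths are invented for the example.
(define data-inputs '("/data/reads" "/data/reference" "/results"))
(define outputs '("/results/aligned.bam" "/results/stats.txt"))

;; Directories that contain declared outputs.  The patch mounts these
;; writable, once each, hence delete-duplicates.
(define output-dirs
  (delete-duplicates (map dirname outputs)))

;; Everything else is mounted read-only.  A directory that also holds
;; outputs is removed here so that it is not mapped twice.
(define read-only-dirs
  (lset-difference string=? data-inputs output-dirs))

(format #t "writable:  ~s~%" output-dirs)
(format #t "read-only: ~s~%" read-only-dirs)
```

Because outputs are reduced to their parent directories with dirname,
the whole of /results becomes writable, not only the two declared
files. This is the coarse granularity described above.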