unofficial mirror of gwl-devel@gnu.org
From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
To: Kyle Meyer <kyle@kyleam.com>
Cc: gwl-devel@gnu.org
Subject: Re: Next steps for the GWL
Date: Thu, 6 Jun 2019 12:11:08 +0200
Message-ID: <87v9xjja6b.fsf@mdc-berlin.de>
In-Reply-To: <8736knieo3.fsf@kyleam.com>


Hi Kyle,

thanks for your comments!

> One of the things I'd love to do
> with GWL is to make it play well with git-annex, something that would
> almost certainly be too specific for GWL itself.  For example
>
>   * Make data caching git-annex aware.  When deciding to recompute data
>     files, GWL avoids computing the hash of data files, using scripts as
>     the cheaper proxy, as you described in 87womnnjg0.fsf@elephly.net.
>     But if the user is tracking data files with git-annex, getting the
>     hash of data files becomes less expensive because we can ask
>     git-annex for the hash it has already computed.
>
>   * Support getting annex data files on demand (i.e. 'git annex get') if
>     they are needed as inputs.

I wonder what the protocol should look like.  Should a workflow
explicitly request a “git annex” file, or should it be up to the person
running the workflow?  In the latter case, when “git annex” has been
configured as the cache backend, the GWL would simply look up the
declared input/output files there.

I suppose the answers would equally apply to using IPFS as a cache.
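
To make the second option concrete, here is a minimal sketch of what
“up to the person running the workflow” could mean: the workflow only
declares file names, and a list of configured backends is consulted in
order.  Every name in it (%cache-backends, lookup-file,
make-table-backend, and the “annex:” location string) is made up for
illustration.  None of this exists in the GWL, and the table merely
stands in for a real git-annex or IPFS query.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical convention: a backend is a procedure that takes a declared
;; file name and returns a location string, or #f if it has no copy.
(define (local-backend file)
  (and (file-exists? file) file))

;; Stand-in for a real git-annex or IPFS lookup: an association list from
;; file names to locations.
(define (make-table-backend table)
  (lambda (file)
    (assoc-ref table file)))

;; The person running the workflow picks the backends; the workflow
;; itself never mentions them.
(define %cache-backends
  (make-parameter (list local-backend)))

(define (lookup-file file)
  "Return the first location any configured backend reports for FILE,
or #f if none of them knows it."
  (any (lambda (backend) (backend file))
       (%cache-backends)))

;; Invented example data.
(define example-annex
  (make-table-backend '(("reads.fastq" . "annex:reads.fastq"))))

(define found
  (parameterize ((%cache-backends (list example-annex)))
    (lookup-file "reads.fastq")))

(define missing
  (parameterize ((%cache-backends (list example-annex)))
    (lookup-file "no-such-file.fastq")))
```

Under this design the same workflow file would run unchanged whether or
not an annex is configured.  The explicit alternative would instead put
the backend into the input declaration itself.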

>> * add support for executing processes in isolated environments
>>   (containers) — this requires a better understanding of process inputs.
>
> This is another one I'm especially excited about.  Functionality-wise,
> are you imagining essentially matching the options available for 'guix
> environment --container ...'?

So far this is all I’ve got:

--8<---------------cut here---------------start------------->8---
diff --git a/gwl/processes.scm b/gwl/processes.scm
index beb61cc..264807f 100644
--- a/gwl/processes.scm
+++ b/gwl/processes.scm
@@ -19,13 +19,19 @@
   #:use-module ((guix derivations)
                 #:select (derivation->output-path
                           build-derivations))
+  #:use-module ((guix packages)
+                #:select (package-file))
   #:use-module (guix gexp)
-  #:use-module ((guix monads) #:select (mlet return))
+  #:use-module ((guix monads) #:select (mlet mapm return))
   #:use-module (guix records)
   #:use-module ((guix store)
                 #:select (run-with-store
                           with-store
                           %store-monad))
+  #:use-module ((guix modules)
+                #:select (source-module-closure))
+  #:use-module (gnu system file-systems)
+  #:use-module (gnu build linux-container)
   #:use-module (ice-9 format)
   #:use-module (ice-9 match)
   #:use-module (srfi srfi-1)
@@ -276,6 +282,54 @@ plain S-expression."
        (call process code)))
     (whatever (error (format #f "unsupported procedure: ~a\n" whatever)))))

+;; WIP
+(define (containerize exp process)
+  "Wrap EXP, an S-expression or G-expression, in a G-expression that
+causes EXP to be run in a container according to the requirements
+specified in PROCESS."
+  (let* ((package-dirs
+          (with-store store
+            (run-with-store store
+              (mapm %store-monad package-file
+                    (process-package-inputs process)))))
+         (data-inputs
+          (process-data-inputs process))
+         (output-dirs
+          (delete-duplicates
+           (map dirname (process-outputs process))))
+         (input-mappings
+          (map (lambda (location)
+                 (file-system-mapping
+                  (source location)
+                  (target location)
+                  (writable? #f)))
+               (lset-difference string=?
+                                (append package-dirs
+                                        data-inputs)
+                                output-dirs)))
+         (output-mappings
+          (map (lambda (dir)
+                 (file-system-mapping
+                  (source dir)
+                  (target dir)
+                  (writable? #t)))
+               output-dirs))
+         (specs
+          (map (compose file-system->spec
+                        file-system-mapping->bind-mount)
+               (append input-mappings
+                       output-mappings))))
+    (with-imported-modules (source-module-closure
+                            '((gnu build linux-container)
+                              (gnu system file-systems)))
+      #~(begin
+          (use-modules (gnu build linux-container)
+                       (gnu system file-systems))
+          (call-with-container (append %container-file-systems
+                                       (map spec->file-system
+                                            '#$specs))
+            (lambda () #$exp))))))
+
 ;;; ---------------------------------------------------------------------------
 ;;; ADDITIONAL FUNCTIONS
 ;;; ---------------------------------------------------------------------------
--8<---------------cut here---------------end--------------->8---

With this patch the GWL can map file systems into the container and then
run the process expression in that environment.
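
The list handling in containerize can be tried in isolation, without
Guix.  The sketch below repeats only that part with plain strings.  The
paths are invented and stand in for what process-data-inputs and
process-outputs would return.  The package directories are left out
because computing them needs a store connection, and plain pairs replace
the file-system-mapping records.

```scheme
(use-modules (srfi srfi-1))

;; Invented example values standing in for (process-data-inputs process)
;; and (process-outputs process).
(define data-inputs '("/data/reads" "/data/results"))
(define outputs '("/data/results/a.bam" "/data/results/b.bam"))

;; Outputs are mapped by directory, once per directory.
(define output-dirs
  (delete-duplicates (map dirname outputs)))

;; A location that is also an output directory must not be mapped
;; read-only as well, so it is removed from the inputs.
(define read-only-locations
  (lset-difference string=? data-inputs output-dirs))

;; Plain (location . writable?) pairs instead of file-system-mapping
;; records.
(define mappings
  (append (map (lambda (location) (cons location #f))
               read-only-locations)
          (map (lambda (dir) (cons dir #t))
               output-dirs)))

(for-each (lambda (mapping)
            (format #t "~a ~a~%"
                    (car mapping)
                    (if (cdr mapping) "read-write" "read-only")))
          mappings)
```

Here /data/results is both a declared input and the directory of the
outputs, so it is mapped once, writable, while /data/reads stays
read-only.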

One thing I’m not happy about is that I can only mount directories, and
not individual files that have been declared as inputs.  I’d like to
have more fine-grained access.  I suppose it might be possible to mount
just the relevant parts of the GWL cache, but I need to play with this
to better understand what the desired behaviour would be.

--
Ricardo
