unofficial mirror of gwl-devel@gnu.org
From: Liliana Marie Prikler <liliana.prikler@ist.tugraz.at>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: gwl-devel@gnu.org
Subject: Re: Processing large amounts of files
Date: Mon, 25 Mar 2024 08:42:23 +0100
Message-ID: <54f697191220794a99dde447f2f2ce56439d8408.camel@ist.tugraz.at>
In-Reply-To: <87plvjd4el.fsf@elephly.net>

On Thursday, 21.03.2024 at 15:34 +0100, Ricardo Wurmus wrote:
> [-guix-devel@gnu.org, +gwl-devel@gnu.org]
oops D:
> 
> [...]
> When running with "-l all" I see this:
> 
>   info: .75 Computing workflow `cat'...
>   debug: 3.13 Computing script for process `meow'
>   guix: 3.13 Looking up package `bash-minimal'
>   guix: 3.13 Opening inferior Guix at
> `/gnu/store/pb1nkrn3sg6a1j6c4r5j2ahygkf4vkv9-profile'
>   guix: 4.27 Looking up package `guix'
>   debug: 4.45 Generating all scripts and their dependencies.
>   debug: 4.89 Generating all scripts and their dependencies.
>   run: 6.73 Executing: /bin/sh -c
> /gnu/store/5idhbvhrwj3p53kkz2vikdn1ypncwj84-gwl-meow.scm '((inputs
> "/tmp/meow/0" ...
>   process: 8.80 In execvp of /bin/sh: Argument list too long
>   error: 8.80 Wrong type argument in position 1: #f
> 
> This at least tells us that the last error here is due to sh refusing
> to run.
Good to know, and I thought it'd be just that, but… shouldn't this
failure to invoke sh be caught by something?
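
To illustrate what "caught" could look like, here is a minimal sketch in
Python (not GWL's Guile code).  The names `run_checked` and `SpawnResult`
are hypothetical; the only assumption is that the spawning layer can see
the errno of the failed exec and report it, instead of letting a later
step trip over a missing result.

```python
import errno
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SpawnResult:
    """Outcome of one spawn attempt (hypothetical helper type)."""
    ok: bool
    returncode: Optional[int] = None
    error: Optional[str] = None
    errno_code: Optional[int] = None


def run_checked(argv: Sequence[str]) -> SpawnResult:
    """Run argv and turn a failed exec into a structured error."""
    try:
        completed = subprocess.run(list(argv), check=False)
    except OSError as exc:
        # The exec itself failed (e.g. E2BIG or ENOENT): the child never ran.
        if exc.errno == errno.E2BIG:
            hint = "argument list too long; pass inputs via a file instead"
        else:
            hint = exc.strerror or str(exc)
        return SpawnResult(ok=False, error=hint, errno_code=exc.errno)
    return SpawnResult(ok=completed.returncode == 0,
                       returncode=completed.returncode)
```

With this shape, an oversized argument list surfaces as a single clear
message at the point of failure rather than as a type error further down.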

> > For comparison:
> >   time cat /tmp/meow/{0..7769}
> >   […]
> >   
> >   real  0m0,144s
> >   user  0m0,049s
> >   sys   0m0,094s
> > 
> > It takes GWL 6 times longer to compute the workflow than to create
> > the inputs in Guile, and 600 times longer than to actually execute
> > the shell command.  I think there is room for improvement :)
> 
> Yeah, not good.  Do you have any recommendations?
We already talked about this in response to your second mail, but (LRU)
caching of whatever can be cached would be one approach to take.
Perhaps there are also inefficiencies in auto-connecting inputs – not
exhibited by this example, but conceivable.
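
As a rough sketch of the caching idea, again in Python rather than Guile:
`lookup_package` below is a hypothetical stand-in for an expensive,
repeatable lookup, and the `calls` counter exists only to make the effect
visible.  Which lookups in GWL are actually safe to cache is an open
question this does not answer.

```python
from functools import lru_cache

# Counts how often the expensive body really runs (illustration only).
calls = {"count": 0}


@lru_cache(maxsize=128)
def lookup_package(name: str) -> str:
    """Hypothetical expensive lookup; results are memoized per name."""
    calls["count"] += 1
    return f"package:{name}"


def packages_for(inputs, package_name="bash-minimal"):
    """Resolve the same package once per input; only the first call misses."""
    return [lookup_package(package_name) for _ in inputs]
```

Resolving the same package for thousands of inputs then costs one real
lookup plus cache hits; least recently used entries are evicted once
`maxsize` distinct names have been seen.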

Design-wise, we might need a way of splitting large workflows anyhow.
Files and environment variables work, but feel clunky at the moment,
and files in particular remind me of recursive make… maybe when I get
the time, I can code something up and then look at ways to simplify
it.
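
For reference, the file-based workaround can be sketched as follows, in
Python and under stated assumptions: `run_with_manifest` is a
hypothetical name, the child process here merely counts the listed
inputs, and input paths are assumed not to contain newlines.

```python
import os
import subprocess
import sys
import tempfile


def run_with_manifest(inputs):
    """Pass inputs to a child via a manifest file instead of argv."""
    # One path per line; assumes paths contain no newline characters.
    with tempfile.NamedTemporaryFile("w", suffix=".txt",
                                     delete=False) as manifest:
        manifest.write("\n".join(inputs))
        path = manifest.name
    try:
        # The child only receives the manifest path, so argv stays tiny.
        child = "import sys; print(sum(1 for _ in open(sys.argv[1])))"
        out = subprocess.run([sys.executable, "-c", child, path],
                             capture_output=True, text=True, check=True)
        return int(out.stdout)
    finally:
        os.unlink(path)
```

The command line stays the same size no matter how many inputs there
are, at the cost of an extra file to create and clean up, which is the
clunkiness mentioned above.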

Cheers

