Re: Limiting parallelism using futures, parallel forms and fibers

unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed

From: Zelphir Kaltstahl <zelphirkaltstahl@posteo.de>
To: Chris Vine <vine35792468@gmail.com>
Cc: Guile User <guile-user@gnu.org>
Subject: Re: Limiting parallelism using futures, parallel forms and fibers
Date: Fri, 10 Jan 2020 01:43:18 +0100	[thread overview]
Message-ID: <27b27190-cc9c-f0b8-574f-7ef21bf3f313@posteo.de> (raw)
In-Reply-To: <20200108114402.2cabbd011c52456a73a2fca8@gmail.com>

Hi Chris!

On 1/8/20 12:44 PM, Chris Vine wrote:
> On Wed, 8 Jan 2020 08:56:11 +0100
> Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> wrote:
> [snip]
>> So my questions are:
>>
>> - Is there a default / recommended way to limit parallelism for
>> recursive calls to parallel forms?
>>
>> - Is there a better way than a global counter with locking, to limit the
>> number of futures created during recursive calls? I would dislike very
>> much to have to do something like global state + mutex.
>>
>> - What do you recommend in general to solve this?
> I think you have it wrong, and that futures use a global queue and a
> global set of worker threads.  I don't see how futures could work
> without at least a global set of worker threads.  Have a look at the
> futures source code.

So far I did not write any parallel code in my project. I am only
exploring possibilities, finding out what I should be using. I also
could not think of a good way to limit the number of threads to anything
less than the (current-processor-count) value, but thought maybe someone
knows something I did not see or read about Guile's futures yet : )

> If you want more control over the thread pool than guile's futures
> provide, you could consider something like this:
> https://github.com/ChrisVine/guile-a-sync2/blob/master/a-sync/thread-pool.scm
> But then you would have to make your own futures if you want a graph
> of futures rather than a graph of coroutines (which is what you would
> get if you use the associated event loop).

This seems like a lot of code for the purpose of being able to limit the
maximum number of threads running at the same time. (Perhaps it is
necessary?)

I would not like to create some kind of futures alternative and whatever
else I would have to do. I am not even sure what "making my own futures"
would entail.

Perhaps I can create something like a thread pool using a reused fibers
scheduler, which I give the chosen limit in the #:parallelism keyword
argument. It spawns only as many fibers at the same time, as are
allowed, if that is possible (some kind of loop inside (run-fibers …)?)
Whenever one spawned fiber finishes, it checks, if there is more work
available (on some channel?) and put that work in a new spawned fiber.
If I squint at the fibers library, it almost looks like a thread pool
should be not too difficult to implement using the library, but maybe I
am wrong about this and there is some fundamental difficulty.

> The main thing is to get the parallel algorithm right, which can be
> tricky.

In my decision tree the fitting of the tree is purely functional code so
far (as far as I know). If I can only find a way to limit the
parallelism as I want to, I should be able to simply spawn some kind of
unit of evaluation for each subtree recursively. Usually decision trees
should not be too deep, as they tend to easily overfit, so I think the
overhead would not be too much for spawning a unit of parallel evaluation.

In the end, if it turns out to be too difficult to limit the maximum
number of threads running in parallel, I might choose to discard the
idea and instead use all available cores, depending on the tree depth
and number of cores.

Initially I had 2 reasons for porting the whole program to GNU Guile:

1. I had difficulties using places in Racket, as I could not send normal
lambdas to other places. This means I would have to predefine the
procedures, instead of sending them on channels or I would have to use
serializable-lambdas, which someone told me they did not manage to use
for channels and then later someone on the Racket user list told me they
can be used for sending over channels. I had already started creating a
thread pool, but this whole serializable-lambda thing stopped me from
finishing that project and it is a half-finished thing, waiting to be
completed some day by anyone. Guile in comparison seems to have easier
to use ways of achieving multicore usage.

2. I am using Guile more than Racket now and wanted my own code to be
available there. Possibly develop more machine learning algorithms in
Guile, getting the basics available.

Then, during porting the code, I found more reasons for going on with
porting:

* I used many Racketisms, which are not available in Scheme dialects. My
code would only ever work on Racket.
* I separated out many modules, which in the Racket code were all in one
big file.
* I added tests for some new procedures and some separated modules.
* I refactored some parts, made some display procedures testable with
optional output port arguments.

Regards,
Zelphir

next prev parent reply	other threads:[~2020-01-10  0:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-08  7:56 Limiting parallelism using futures, parallel forms and fibers Zelphir Kaltstahl
2020-01-08  8:11 ` Linus Björnstam
2020-01-09 23:27   ` Zelphir Kaltstahl
2020-01-08 11:44 ` Chris Vine
2020-01-10  0:43   ` Zelphir Kaltstahl [this message]
2020-01-10 16:08     ` Chris Vine

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=27b27190-cc9c-f0b8-574f-7ef21bf3f313@posteo.de \
    --to=zelphirkaltstahl@posteo.de \
    --cc=guile-user@gnu.org \
    --cc=vine35792468@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).