Limiting parallelism using futures, parallel forms and fibers

unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed

* Limiting parallelism using futures, parallel forms and fibers
@ 2020-01-08  7:56 Zelphir Kaltstahl
  2020-01-08  8:11 ` Linus Björnstam
  2020-01-08 11:44 ` Chris Vine
  0 siblings, 2 replies; 6+ messages in thread
From: Zelphir Kaltstahl @ 2020-01-08  7:56 UTC (permalink / raw)
  To: Guile User

Hello Guile users!

I thought about what I need for parallelizing an algorithm I am working
on. Background: I am working on my decision tree implementation
(https://notabug.org/ZelphirKaltstahl/guile-ml/src/wip-port-to-guile),
which is currently only using a single core. Training the model splits
the tree into subtrees, which is an opportunity for running things in
parallel. Subtrees can subsequently split again and that could again
result in a parallel evaluation of the subtrees.

Since it might not always be desired to use potentially all available
cores (blocking the whole CPU), I would like to add the possibility for
the user of the algorithm to limit the number of cores used by the
algorithm, most likely using a keyword argument, which defaults to the
number of cores. Looking at futures, parallel forms and the fibers
library, I've had the following thoughts:

1. fibers support a ~#:parallelism~ keyword and thus the number of
fibers running in parallel can be set for ~run-fibers~, which creates a
scheduler, which might be used later, possibly avoiding to keep track of
how many threads are being used. It would probably be important when
using fibers, to always make use of the same scheduler, as a new
scheduler might not know anything about other schedulers and the limit
for parallelism might be overstepped. Schedulers, which are controlled
by the initial scheduler are probably OK. I will need to re-read the
fibers manual on that.

2. With parallel forms, there are ~n-par-map~ and ~n-par-for-each~,
where one can specify the maximum number of threads running in parallel.
However, when recursively calling these procedures (in my case
processing subtrees of a decision tree, which might split into subtrees
again), one does not have the knowledge of whether the processing of the
other recursive calls has finished and cannot know, how many other
threads are being used currently. Calling one of these procedures again
might run more threads than specified on a upper level call.

3. With futures, one cannot specify the number of futures to run at
maximum directly. In order to control how many threads are evaluating
code at the same time, one would have to build some kind of construct
around them, which keeps track of how many futures are running or could
be running and make that work for recursive creation of further futures
as well.

4. One could also do something ugly and create a globally defined active
thread counter, which requires locking, to keep track of the number of
in parallel running threads or futures.

5. I could just assume the maximum number of currently used cores, by
giving the tree depth as argument for recursive calls and calculating
from that, how many splits and thus evaluations might be running in
parallel at that point. However, this might be inaccurate, as some
subtree might already be finished and then I would not use the maximum
user specified number of parallel evaluations.

So my questions are:

- Is there a default / recommended way to limit parallelism for
recursive calls to parallel forms?

- Is there a better way than a global counter with locking, to limit the
number of futures created during recursive calls? I would dislike very
much to have to do something like global state + mutex.

- What do you recommend in general to solve this?

Regards,
Zelphir

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Limiting parallelism using futures, parallel forms and fibers
  2020-01-08  7:56 Limiting parallelism using futures, parallel forms and fibers Zelphir Kaltstahl
@ 2020-01-08  8:11 ` Linus Björnstam
  2020-01-09 23:27   ` Zelphir Kaltstahl
  2020-01-08 11:44 ` Chris Vine
  1 sibling, 1 reply; 6+ messages in thread
From: Linus Björnstam @ 2020-01-08  8:11 UTC (permalink / raw)
  To: guile-user

Hi! 

I don't have much more input than to say that futures use a built in thread pool that is limited to (current-processor-count) threads. That could maybe be modified using setaffinity ?

Hope this helps.

-- 
  Linus Björnstam

On Wed, 8 Jan 2020, at 08:56, Zelphir Kaltstahl wrote:
> Hello Guile users!
> 
> I thought about what I need for parallelizing an algorithm I am working
> on. Background: I am working on my decision tree implementation
> (https://notabug.org/ZelphirKaltstahl/guile-ml/src/wip-port-to-guile),
> which is currently only using a single core. Training the model splits
> the tree into subtrees, which is an opportunity for running things in
> parallel. Subtrees can subsequently split again and that could again
> result in a parallel evaluation of the subtrees.
> 
> Since it might not always be desired to use potentially all available
> cores (blocking the whole CPU), I would like to add the possibility for
> the user of the algorithm to limit the number of cores used by the
> algorithm, most likely using a keyword argument, which defaults to the
> number of cores. Looking at futures, parallel forms and the fibers
> library, I've had the following thoughts:
> 
> 1. fibers support a ~#:parallelism~ keyword and thus the number of
> fibers running in parallel can be set for ~run-fibers~, which creates a
> scheduler, which might be used later, possibly avoiding to keep track of
> how many threads are being used. It would probably be important when
> using fibers, to always make use of the same scheduler, as a new
> scheduler might not know anything about other schedulers and the limit
> for parallelism might be overstepped. Schedulers, which are controlled
> by the initial scheduler are probably OK. I will need to re-read the
> fibers manual on that.
> 
> 2. With parallel forms, there are ~n-par-map~ and ~n-par-for-each~,
> where one can specify the maximum number of threads running in parallel.
> However, when recursively calling these procedures (in my case
> processing subtrees of a decision tree, which might split into subtrees
> again), one does not have the knowledge of whether the processing of the
> other recursive calls has finished and cannot know, how many other
> threads are being used currently. Calling one of these procedures again
> might run more threads than specified on a upper level call.
> 
> 3. With futures, one cannot specify the number of futures to run at
> maximum directly. In order to control how many threads are evaluating
> code at the same time, one would have to build some kind of construct
> around them, which keeps track of how many futures are running or could
> be running and make that work for recursive creation of further futures
> as well.
> 
> 4. One could also do something ugly and create a globally defined active
> thread counter, which requires locking, to keep track of the number of
> in parallel running threads or futures.
> 
> 5. I could just assume the maximum number of currently used cores, by
> giving the tree depth as argument for recursive calls and calculating
> from that, how many splits and thus evaluations might be running in
> parallel at that point. However, this might be inaccurate, as some
> subtree might already be finished and then I would not use the maximum
> user specified number of parallel evaluations.
> 
> So my questions are:
> 
> - Is there a default / recommended way to limit parallelism for
> recursive calls to parallel forms?
> 
> - Is there a better way than a global counter with locking, to limit the
> number of futures created during recursive calls? I would dislike very
> much to have to do something like global state + mutex.
> 
> - What do you recommend in general to solve this?
> 
> Regards,
> Zelphir
> 
> 
>



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Limiting parallelism using futures, parallel forms and fibers
  2020-01-08  7:56 Limiting parallelism using futures, parallel forms and fibers Zelphir Kaltstahl
  2020-01-08  8:11 ` Linus Björnstam
@ 2020-01-08 11:44 ` Chris Vine
  2020-01-10  0:43   ` Zelphir Kaltstahl
  1 sibling, 1 reply; 6+ messages in thread
From: Chris Vine @ 2020-01-08 11:44 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: Guile User

On Wed, 8 Jan 2020 08:56:11 +0100
Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> wrote:
[snip]
> So my questions are:
> 
> - Is there a default / recommended way to limit parallelism for
> recursive calls to parallel forms?
> 
> - Is there a better way than a global counter with locking, to limit the
> number of futures created during recursive calls? I would dislike very
> much to have to do something like global state + mutex.
> 
> - What do you recommend in general to solve this?

I think you have it wrong, and that futures use a global queue and a
global set of worker threads.  I don't see how futures could work
without at least a global set of worker threads.  Have a look at the
futures source code.

If you want more control over the thread pool than guile's futures
provide, you could consider something like this:
https://github.com/ChrisVine/guile-a-sync2/blob/master/a-sync/thread-pool.scm
But then you would have to make your own futures if you want a graph
of futures rather than a graph of coroutines (which is what you would
get if you use the associated event loop).

The main thing is to get the parallel algorithm right, which can be
tricky.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Limiting parallelism using futures, parallel forms and fibers
  2020-01-08  8:11 ` Linus Björnstam
@ 2020-01-09 23:27   ` Zelphir Kaltstahl
  0 siblings, 0 replies; 6+ messages in thread
From: Zelphir Kaltstahl @ 2020-01-09 23:27 UTC (permalink / raw)
  To: linus.internet; +Cc: Guile User

Thanks for your suggestion, I will take it into account.

Regards,

Zelphir

On 1/8/20 9:11 AM, Linus Björnstam wrote:
> Hi! 
>
> I don't have much more input than to say that futures use a built in thread pool that is limited to (current-processor-count) threads. That could maybe be modified using setaffinity ?
>
> Hope this helps.
>



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Limiting parallelism using futures, parallel forms and fibers
  2020-01-08 11:44 ` Chris Vine
@ 2020-01-10  0:43   ` Zelphir Kaltstahl
  2020-01-10 16:08     ` Chris Vine
  0 siblings, 1 reply; 6+ messages in thread
From: Zelphir Kaltstahl @ 2020-01-10  0:43 UTC (permalink / raw)
  To: Chris Vine; +Cc: Guile User

Hi Chris!

On 1/8/20 12:44 PM, Chris Vine wrote:
> On Wed, 8 Jan 2020 08:56:11 +0100
> Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> wrote:
> [snip]
>> So my questions are:
>>
>> - Is there a default / recommended way to limit parallelism for
>> recursive calls to parallel forms?
>>
>> - Is there a better way than a global counter with locking, to limit the
>> number of futures created during recursive calls? I would dislike very
>> much to have to do something like global state + mutex.
>>
>> - What do you recommend in general to solve this?
> I think you have it wrong, and that futures use a global queue and a
> global set of worker threads.  I don't see how futures could work
> without at least a global set of worker threads.  Have a look at the
> futures source code.

So far I did not write any parallel code in my project. I am only
exploring possibilities, finding out what I should be using. I also
could not think of a good way to limit the number of threads to anything
less than the (current-processor-count) value, but thought maybe someone
knows something I did not see or read about Guile's futures yet : )

> If you want more control over the thread pool than guile's futures
> provide, you could consider something like this:
> https://github.com/ChrisVine/guile-a-sync2/blob/master/a-sync/thread-pool.scm
> But then you would have to make your own futures if you want a graph
> of futures rather than a graph of coroutines (which is what you would
> get if you use the associated event loop).

This seems like a lot of code for the purpose of being able to limit the
maximum number of threads running at the same time. (Perhaps it is
necessary?)

I would not like to create some kind of futures alternative and whatever
else I would have to do. I am not even sure what "making my own futures"
would entail.

Perhaps I can create something like a thread pool using a reused fibers
scheduler, which I give the chosen limit in the #:parallelism keyword
argument. It spawns only as many fibers at the same time, as are
allowed, if that is possible (some kind of loop inside (run-fibers …)?)
Whenever one spawned fiber finishes, it checks, if there is more work
available (on some channel?) and put that work in a new spawned fiber.
If I squint at the fibers library, it almost looks like a thread pool
should be not too difficult to implement using the library, but maybe I
am wrong about this and there is some fundamental difficulty.

> The main thing is to get the parallel algorithm right, which can be
> tricky.

In my decision tree the fitting of the tree is purely functional code so
far (as far as I know). If I can only find a way to limit the
parallelism as I want to, I should be able to simply spawn some kind of
unit of evaluation for each subtree recursively. Usually decision trees
should not be too deep, as they tend to easily overfit, so I think the
overhead would not be too much for spawning a unit of parallel evaluation.

In the end, if it turns out to be too difficult to limit the maximum
number of threads running in parallel, I might choose to discard the
idea and instead use all available cores, depending on the tree depth
and number of cores.

Initially I had 2 reasons for porting the whole program to GNU Guile:

1. I had difficulties using places in Racket, as I could not send normal
lambdas to other places. This means I would have to predefine the
procedures, instead of sending them on channels or I would have to use
serializable-lambdas, which someone told me they did not manage to use
for channels and then later someone on the Racket user list told me they
can be used for sending over channels. I had already started creating a
thread pool, but this whole serializable-lambda thing stopped me from
finishing that project and it is a half-finished thing, waiting to be
completed some day by anyone. Guile in comparison seems to have easier
to use ways of achieving multicore usage.

2. I am using Guile more than Racket now and wanted my own code to be
available there. Possibly develop more machine learning algorithms in
Guile, getting the basics available.

Then, during porting the code, I found more reasons for going on with
porting:

* I used many Racketisms, which are not available in Scheme dialects. My
code would only ever work on Racket.
* I separated out many modules, which in the Racket code were all in one
big file.
* I added tests for some new procedures and some separated modules.
* I refactored some parts, made some display procedures testable with
optional output port arguments.

Regards,
Zelphir

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Limiting parallelism using futures, parallel forms and fibers
  2020-01-10  0:43   ` Zelphir Kaltstahl
@ 2020-01-10 16:08     ` Chris Vine
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Vine @ 2020-01-10 16:08 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: Guile User

On Fri, 10 Jan 2020 01:43:18 +0100
Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> wrote:
> Hi Chris!
> 
> On 1/8/20 12:44 PM, Chris Vine wrote:
> > On Wed, 8 Jan 2020 08:56:11 +0100
> > Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> wrote:
> > [snip]
> >> So my questions are:
> >>
> >> - Is there a default / recommended way to limit parallelism for
> >> recursive calls to parallel forms?
> >>
> >> - Is there a better way than a global counter with locking, to limit the
> >> number of futures created during recursive calls? I would dislike very
> >> much to have to do something like global state + mutex.
> >>
> >> - What do you recommend in general to solve this?
> > I think you have it wrong, and that futures use a global queue and a
> > global set of worker threads.  I don't see how futures could work
> > without at least a global set of worker threads.  Have a look at the
> > futures source code.
> 
> So far I did not write any parallel code in my project. I am only
> exploring possibilities, finding out what I should be using. I also
> could not think of a good way to limit the number of threads to anything
> less than the (current-processor-count) value, but thought maybe someone
> knows something I did not see or read about Guile's futures yet : )

I still don't understand why you think you want to do that, given the
aims you set out in your earlier posts.  Why do you want to spawn less
than (processor-count - 1) threads for your algorithm, which is what
futures would use?

But if you really want to do that, and you don't want to use a
stand-alone thread pool, the parallelism keyword of fibers seems to do
what you want.



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-01-10 16:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-08  7:56 Limiting parallelism using futures, parallel forms and fibers Zelphir Kaltstahl
2020-01-08  8:11 ` Linus Björnstam
2020-01-09 23:27   ` Zelphir Kaltstahl
2020-01-08 11:44 ` Chris Vine
2020-01-10  0:43   ` Zelphir Kaltstahl
2020-01-10 16:08     ` Chris Vine

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).