From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Zelphir Kaltstahl Newsgroups: gmane.lisp.guile.user Subject: Re: Limiting parallelism using futures, parallel forms and fibers Date: Fri, 10 Jan 2020 01:43:18 +0100 Message-ID: <27b27190-cc9c-f0b8-574f-7ef21bf3f313@posteo.de> References: <04cb0461-18a1-ef17-4db7-2475c7c806e6@posteo.de> <20200108114402.2cabbd011c52456a73a2fca8@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="136448"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 Cc: Guile User To: Chris Vine Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Fri Jan 10 01:45:04 2020 Return-path: Envelope-to: guile-user@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1ipiP0-000GRl-Dm for guile-user@m.gmane.org; Fri, 10 Jan 2020 01:43:51 +0100 Original-Received: from localhost ([::1]:39066 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ipiOt-0003ml-Mq for guile-user@m.gmane.org; Thu, 09 Jan 2020 19:43:43 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:58367) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ipiOa-0003mc-RK for guile-user@gnu.org; Thu, 09 Jan 2020 19:43:26 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ipiOZ-0007hg-7Z for guile-user@gnu.org; Thu, 09 Jan 2020 19:43:24 -0500 Original-Received: from mout01.posteo.de ([185.67.36.65]:54343) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ipiOY-0007XR-If for guile-user@gnu.org; Thu, 09 Jan 2020 19:43:23 -0500 Original-Received: from submission (posteo.de [89.146.220.130]) by mout01.posteo.de (Postfix) with ESMTPS id 9B19516005F for ; Fri, 10 Jan 2020 01:43:20 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1578617000; bh=SiOLnotbTka8F713CY2DPBqHvXheRQJEUfjr+OqR5gQ=; h=Subject:To:Cc:From:Date:From; b=p489f2AvzR563a82xU2Wtum4ZSs3QTJ4tGi88d9SifquJTj7XNgI2X+KZvICJ+FRR BGdhSndDAy5teuEscmbHT5lAjcb8GKyoRCycgnDjjyNfadqYQJO+91BkHqu/kOyKpF fT1F97Kj99/YDCD620H7v5c7gxsjuSxDNY25CC5/Zd5IOOvrO3Pn/hTqmJEtcOBEDm nqRFd+LPHB9LWHyIcvBKE9YbWOXcxObCNxP11nP8F3ACyCGwsY0UUOIxSVvLFcIhzE d/9ux/P6f1OIC2nSwH8NIhIav57Yz4rRDt6GfIgMq7mRc7PndDVP5XvPpVMj82mDcr zjC/RwicWWLQA== Original-Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 47v43C5ZVnz6tm6; Fri, 10 Jan 2020 01:43:19 +0100 (CET) In-Reply-To: <20200108114402.2cabbd011c52456a73a2fca8@gmail.com> Content-Language: en-US X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 185.67.36.65 X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Original-Sender: "guile-user" Xref: news.gmane.org gmane.lisp.guile.user:16023 Archived-At: Hi Chris! On 1/8/20 12:44 PM, Chris Vine wrote: > On Wed, 8 Jan 2020 08:56:11 +0100 > Zelphir Kaltstahl wrote: > [snip] >> So my questions are: >> >> - Is there a default / recommended way to limit parallelism for >> recursive calls to parallel forms? >> >> - Is there a better way than a global counter with locking, to limit t= he >> number of futures created during recursive calls? I would dislike very >> much to have to do something like global state + mutex. >> >> - What do you recommend in general to solve this? > I think you have it wrong, and that futures use a global queue and a > global set of worker threads. I don't see how futures could work > without at least a global set of worker threads. Have a look at the > futures source code. So far I did not write any parallel code in my project. I am only exploring possibilities, finding out what I should be using. I also could not think of a good way to limit the number of threads to anything less than the (current-processor-count) value, but thought maybe someone knows something I did not see or read about Guile's futures yet : ) > If you want more control over the thread pool than guile's futures > provide, you could consider something like this: > https://github.com/ChrisVine/guile-a-sync2/blob/master/a-sync/thread-po= ol.scm > But then you would have to make your own futures if you want a graph > of futures rather than a graph of coroutines (which is what you would > get if you use the associated event loop). This seems like a lot of code for the purpose of being able to limit the maximum number of threads running at the same time. (Perhaps it is necessary?) I would not like to create some kind of futures alternative and whatever else I would have to do. I am not even sure what "making my own futures" would entail. Perhaps I can create something like a thread pool using a reused fibers scheduler, which I give the chosen limit in the #:parallelism keyword argument. It spawns only as many fibers at the same time, as are allowed, if that is possible (some kind of loop inside (run-fibers =E2=80= =A6)?) Whenever one spawned fiber finishes, it checks, if there is more work available (on some channel?) and put that work in a new spawned fiber. If I squint at the fibers library, it almost looks like a thread pool should be not too difficult to implement using the library, but maybe I am wrong about this and there is some fundamental difficulty. > The main thing is to get the parallel algorithm right, which can be > tricky. In my decision tree the fitting of the tree is purely functional code so far (as far as I know). If I can only find a way to limit the parallelism as I want to, I should be able to simply spawn some kind of unit of evaluation for each subtree recursively. Usually decision trees should not be too deep, as they tend to easily overfit, so I think the overhead would not be too much for spawning a unit of parallel evaluation= . In the end, if it turns out to be too difficult to limit the maximum number of threads running in parallel, I might choose to discard the idea and instead use all available cores, depending on the tree depth and number of cores. Initially I had 2 reasons for porting the whole program to GNU Guile: 1. I had difficulties using places in Racket, as I could not send normal lambdas to other places. This means I would have to predefine the procedures, instead of sending them on channels or I would have to use serializable-lambdas, which someone told me they did not manage to use for channels and then later someone on the Racket user list told me they can be used for sending over channels. I had already started creating a thread pool, but this whole serializable-lambda thing stopped me from finishing that project and it is a half-finished thing, waiting to be completed some day by anyone. Guile in comparison seems to have easier to use ways of achieving multicore usage. 2. I am using Guile more than Racket now and wanted my own code to be available there. Possibly develop more machine learning algorithms in Guile, getting the basics available. Then, during porting the code, I found more reasons for going on with porting: * I used many Racketisms, which are not available in Scheme dialects. My code would only ever work on Racket. * I separated out many modules, which in the Racket code were all in one big file. * I added tests for some new procedures and some separated modules. * I refactored some parts, made some display procedures testable with optional output port arguments. Regards, Zelphir