From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Damien Mattei Newsgroups: gmane.lisp.guile.user,gmane.lisp.guile.devel Subject: Re: map-par slower than map Date: Mon, 17 Oct 2022 15:17:04 +0200 Message-ID: References: <5608809c-89a2-118c-5c05-c46ac3a0e21b@posteo.de> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="39680"; mail-complaints-to="usenet@ciao.gmane.io" Cc: guile-user , guile-devel To: Zelphir Kaltstahl Original-X-From: guile-user-bounces+guile-user=m.gmane-mx.org@gnu.org Mon Oct 17 15:18:20 2022 Return-path: Envelope-to: guile-user@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1okQ0a-000A0Y-DH for guile-user@m.gmane-mx.org; Mon, 17 Oct 2022 15:18:20 +0200 Original-Received: from localhost ([::1]:60214 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1okQ0Y-0004Lc-SS for guile-user@m.gmane-mx.org; Mon, 17 Oct 2022 09:18:18 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:49878) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1okPzf-0004Kx-Ua; Mon, 17 Oct 2022 09:17:24 -0400 Original-Received: from mail-ed1-x532.google.com ([2a00:1450:4864:20::532]:42966) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1okPzc-0000g5-DM; Mon, 17 Oct 2022 09:17:23 -0400 Original-Received: by mail-ed1-x532.google.com with SMTP id u21so15976872edi.9; Mon, 17 Oct 2022 06:17:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=9c4mqmFRpQZBWeBNJlwtQp3msR5Y1NyCfJ9M5hrzr2E=; b=Cw3emkaCfN+6NdLjg6RCt+W3TkdSODHo0AYl2PxRvNENYwrf/ur4+BjYcN+KMByUMd 8Afo9AYa5YfE6EaXA0hxyXbs+IZh6+O1P7mD0T+qHcMoAubOyli2znaIuO5ibYTesgX9 mR+eM4DK9qtox98QAm04Hd2zFQXsk80N4nHf9WOqBZhsL4NLQ0JRnVjeMga/kb/Mfv/W GGZPEXVivxltjU8gS1dDrsl7vlspOp/XFFN6A6nLRw8ntUQ61np+2aOpll3juQ82ZGQL 2XwuOaF1owaSRPQaAyoYIeTm/2fg+qtKYpE1HNaurxxXFQcWkWFyNRb2KK5k73wnB+8U f4/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=9c4mqmFRpQZBWeBNJlwtQp3msR5Y1NyCfJ9M5hrzr2E=; b=0Fsh7K2/fUMnimwgCehISu6WIDlhc+MOyB7V7Vx77ClZ3f/8Rpgyp4E9WuxYdU5C3d pmyeaSq4jdG/y7WksGNVHlWuwKzIGq5eWRifA4fe/Ic7vPzLbb3wM8+Om86s2ac1zFQQ kQ29DSxvGiPL6UyOqPJv6tufqP4GnaJbKBkNOHfmh/3weXJcWc9U227mOiTMkCeTGdjQ vWgmIXdM7IXpVd0UBKur19gsqgoZtXbuwbXWX0iHtzFDoVKLs143wNGHuT+VZBIPf47H aCMUaxoD7otui2nyFmE/kJF/SHXnd8fumdQiJP8tJ88/hCe8So10VagEXlyzAVG7cZd4 WXEg== X-Gm-Message-State: ACrzQf3woj8RJ91C/9zyYePXMkQ/fl4L/uGDdH+Nt6/3EbMm38B//w9V wT8E1Q0K45uKUTpTy8sTbq9aGQx+J0vMHXmcUFM= X-Google-Smtp-Source: AMsMyM7RBSyUoNXSj6Rl1RfQwqHXQlnd3wpSfWcsejYLZjoizVf0yQleAltcF2P1CG2ox8BJVeCRLMFH+BO/fXFVT3M= X-Received: by 2002:a05:6402:3492:b0:45d:c00:ea8e with SMTP id v18-20020a056402349200b0045d0c00ea8emr10182552edc.150.1666012636441; Mon, 17 Oct 2022 06:17:16 -0700 (PDT) In-Reply-To: Received-SPF: pass client-ip=2a00:1450:4864:20::532; envelope-from=damien.mattei@gmail.com; helo=mail-ed1-x532.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane-mx.org@gnu.org Original-Sender: "guile-user" Xref: news.gmane.io gmane.lisp.guile.user:18661 gmane.lisp.guile.devel:21438 Archived-At: Hello, sorry for my late answer ( i wrote the code friday but i could only debug today, being busy and on travel last week-end) i modified my code to works with 'futures' , the speed is now incredible !!! (and it computes exact) the speed seems multiplied by even more than the number of CPUs (6): cat /proc/cpuinfo | grep processor processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki= %2B.scm#L1900 (declare minterms-vector unified-minterms-vector-1) (define (funct-unify-minterms-set-1-unit-future set1 set2) {function-unify-minterms-list <+ (=CE=BB (L) (apply function-unify-two-minterms-and-tag L))} ;; note : sorting is useless {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)} ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1 set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create list of pair of minterms {minterms-vector <- (list->vector minterms-set)} {minterms-vector-length <+ (vector-length minterms-vector)} {nb-procs <+ (current-processor-count)} {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute the segments {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)} (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)} {unified-minterms-set-2 <+ (filter (=CE=BB (x) x) unified-minterms-set-1)= } ;; remove #f results {unified-minterms-set <+ (remove-duplicates-sorted unified-minterms-set-2)} ;; uniq unified-minterms-set) i keeped your 'segment' definitions to index an array of data after convert it from list to vector (list->vector) which probably consume additional time on big data list ( i had more than 1000000 element list lengths at some point) i used a simplified version of run-in parallel (that do not do 'reduce after processing data): https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki= %2B.scm#L1794 the computation on a segment is done by those functions: https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki= %2B.scm#L1842 {function-unify-minterms-list <+ (=CE=BB (L) (apply function-unify-two-minterms-and-tag L))} ;; proc to be called with futures (define (proc-unify-minterms-seg seg) {start <+ (segment-start seg)} {end <+ (segment-end seg)} (for ({i <+ start} {i <=3D end} {i <- {i + 1}}) {mtL <+ {minterms-vector[i]}} {unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)})) i use those code in another program symbolically to compute a value named Cn: C0 =3D T C1 =3D T C2 =3D T C3 =3D (B1 =E2=88=A8 B2) C4 =3D (B2 =E2=88=A8 (B1 =E2=88=A7 B3)) C5 =3D ((B1 =E2=88=A7 B3) =E2=88=A8 (B2 =E2=88=A7 B3) =E2=88=A8 (B2 =E2=88= =A7 B4) =E2=88=A8 (B3 =E2=88=A7 B4)) C6 =3D ((B1 =E2=88=A7 B3 =E2=88=A7 B5) =E2=88=A8 (B2 =E2=88=A7 B3 =E2=88=A7= B5) =E2=88=A8 (B2 =E2=88=A7 B4) =E2=88=A8 (B3 =E2=88=A7 B4) =E2=88=A8 (B4 = =E2=88=A7 B5)) C7 =3D ((B1 =E2=88=A7 B3 =E2=88=A7 B5) =E2=88=A8 (B2 =E2=88=A7 B3 =E2=88=A7= B5) =E2=88=A8 (B2 =E2=88=A7 B4 =E2=88=A7 B6) =E2=88=A8 (B3 =E2=88=A7 B4 = =E2=88=A7 B6) =E2=88=A8 (B4 =E2=88=A7 B5) =E2=88=A8 (B5 =E2=88=A7 B6)) C8 =3D ((B1 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7) =E2=88=A8 (B2 =E2=88=A7= B3 =E2=88=A7 B5 =E2=88=A7 B7) =E2=88=A8 (B2 =E2=88=A7 B4 =E2=88=A7 B6) =E2= =88=A8 (B3 =E2=88=A7 B4 =E2=88=A7 B6) =E2=88=A8 (B4 =E2=88=A7 B5 =E2=88=A7 B7) =E2=88=A8 (B5 =E2=88= =A7 B6) =E2=88=A8 (B6 =E2=88=A7 B7)) C9 =3D ((B1 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7) =E2=88=A8 (B2 =E2=88=A7= B3 =E2=88=A7 B5 =E2=88=A7 B7) =E2=88=A8 (B2 =E2=88=A7 B4 =E2=88=A7 B6 =E2= =88=A7 B8) =E2=88=A8 (B3 =E2=88=A7 B4 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B4 =E2=88=A7 B5 =E2=88= =A7 B7) =E2=88=A8 (B5 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B6 =E2=88=A7 B7= ) =E2=88=A8 (B7 =E2=88=A7 B8)) from C1 to C9 used to be fast,less than a minute for the whole with or without // C10 =3D ((B1 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8= (B2 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B2 =E2= =88=A7 B4 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B3 =E2=88=A7 B4 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88= =A8 (B4 =E2=88=A7 B5 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B5 =E2=88=A7 B6 = =E2=88=A7 B8) =E2=88=A8 (B6 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B7 =E2=88=A7 B8) =E2=88=A8 (B8 =E2=88=A7 B9)) C10 takes a few minutes but C11 used to take one day before // with 'future' and i got now the result during less than one hour!!! C11 =3D ((B1 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8= (B10 =E2=88=A7 B2 =E2=88=A7 B4 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B10 = =E2=88=A7 B3 =E2=88=A7 B4 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B10 =E2=88=A7 B5 =E2=88=A7 B6 =E2= =88=A7 B8) =E2=88=A8 (B10 =E2=88=A7 B7 =E2=88=A7 B8) =E2=88=A8 (B10 =E2=88= =A7 B9) =E2=88=A8 (B2 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B4 =E2=88=A7 B5 =E2= =88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B6 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 = (B8 =E2=88=A7 B9)) i never got C12 in the past ,even with 7 days of computation! and i got it today during the lunchbreak !!! C12 =3D ((B1 =E2=88=A7 B11 =E2=88=A7 B3 =E2=88=A7 B5 =E2=88=A7 B7 =E2=88=A7= B9) =E2=88=A8 (B10 =E2=88=A7 B11) =E2=88=A8 (B10 =E2=88=A7 B2 =E2=88=A7 B4= =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B10 =E2=88=A7 B3 =E2=88=A7 B4 =E2=88=A7 B6 =E2=88=A7 B8) =E2= =88=A8 (B10 =E2=88=A7 B5 =E2=88=A7 B6 =E2=88=A7 B8) =E2=88=A8 (B10 =E2=88= =A7 B7 =E2=88=A7 B8) =E2=88=A8 (B10 =E2=88=A7 B9) =E2=88=A8 (B11 =E2=88=A7 B2 =E2=88=A7 B3 =E2=88=A7 B5 = =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B11 =E2=88=A7 B4 =E2=88=A7 B5 =E2=88= =A7 B7 =E2=88=A7 B9) =E2=88=A8 (B11 =E2=88=A7 B6 =E2=88=A7 B7 =E2=88=A7 B9) =E2=88=A8 (B11 =E2=88=A7 B8 = =E2=88=A7 B9)) (note that a wise people can find a general formula for Cn based on Cn-1 but that is another (mathematical) story....) i'm computing C13 but i see that it consumes 12gb out of 16gb of my system ! and stopped on the linux box : GC Warning: Repeated allocation of very large block (appr. size 510935040): May lead to memory leak and poor performance Processus arr=C3=AAt=C3=A9 but still running on the Mac laptop...ok will see that but i think that the data set is now huge and there is noting that can be done with that... note that there is still an hash table used in function-unify-two-minterms-and-tag and i will use another data structure and algo to avoid that, i feared that the concurrent access to hash table can cause the guile 'future' to crash or fails or to slow down the process but not! result is incredibly fast.Also i use continuation and i read it can cause problem with 'future' i will improve that too... I will see where i can improve the algo and data structure to optimize again but this is already really good. Thanks Damien On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hello Damien, > On 10/14/22 10:21, Damien Mattei wrote: > > yes Zelphir i think there is a problem in the threading part of guile, > whatever the code is ,it run well the first time, and after is unstable. > Long ago at the very begin of Java language on my master project at > university i experienced same problem with Java "green" threads, under > windows and/or linux ,i can't remember. > > I'm trying your code and future, which have perheaps also the portability > with other schemes, future can provide "light" // , with carefull code it > could be ok. > > in your segment.scm code ,is segment-count possibly replaced by the numbe= r > of available CPUs or nodes, if i understand well? > > Regards, > Damien > > Correct. For each segment one future is used. > > However, there are scenarios, where one would rather want to interleave > the numbers of the whole range, to have a "fairer" workload per future, f= or > example: > > (1 2 3 4 5 6 7 8 9) > > -> (1 4 7), (2 5 8), (3 6 9) > > instead of > > -> (1 2 3) (4 5 6) (7 8 9) > > (I seem to remember this to be the case for Mandelbrot set calculations, > but might be wrong.) > > But that is all a matter of forming some segments and what a segment is, > not really part of the logic of creating futures. So one could create the= n > in any way that fits ones use-case. I have not generalized that segment > logic that far, but it is doable. > > Anyway, if anyone more knowledgeable could chime in on what the > performance differences between futures and parallel map are, that would = be > great! Or pointing out some flaws that this kind of future usage might > have. Or when the point is reached to switch to guile-fibers library. > > Regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > >