after intense coding i finally got the good results, my assumption about the global variable hash table was true ,it is incompatible with 'future : the competition for the writing into the hash table breaks the code. If i do writing in hash table out of // region all is ok: a simple function to extract the hash table access: (define (tag-minterms i umt) (when umt {mt1 <+ (first {minterms-vector[i]})} {mt2 <+ (second {minterms-vector[i]})} {minterms-ht[mt1] <- #t} {minterms-ht[mt2] <- #t})) ;; proc to be called with futures (define (proc-unify-minterms-seg seg) {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms L))} ;;{function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))} {start <+ (segment-start seg)} {end <+ (segment-end seg)} (for ({i <+ start} {i <= end} {i <- {i + 1}}) {mtL <+ {minterms-vector[i]}} (nodebug (dv mtL)) {unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)})) (declare minterms-vector unified-minterms-vector-1) https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L2893 and the call to the code that use hash table is now completely out of the // region: (debug (display-nl "before //")) (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code (debug (display-nl "after //")) (vector-for-each tag-minterms unified-minterms-vector-1) ;; tag the minterms in the hash table with a partitioned vector it works because one cell is not accessed by different threads other stuff like continuation seems to work with 'future, i wanted to rewrite code with call/cc in 'future but it was too long and apparently it works,any way i do not see why call/cc would cause trouble in a thread... i think 'futures are good if you use them carefully like in OpenMP regions... in other languages. the code is very very fast now i reached C15,in a few minutes on an Apple M1 scheme@(guile-user)> (current-processor-count) $1 = 8 C15 = ((B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14)) and it consumes more than 2Gb in memory: PID COMMAND %CPU TIME #TH #WQ #PORT MEM PURG CMPRS PGRP PPID STATE BOOSTS %CPU_ME 15081 guile 442.8 57:24.28 17/8 0 44 *2423M* 0B 118M 15081 10551 running *0[1] 0.0000 i checked a part of the logical computation with Mathematica online and it was ok but i will do a Python program to check the logical expression again and compare speed with Sympy... and for the space complexity i do not think it can be reduced good night all On Sat, Oct 22, 2022 at 6:01 PM Damien Mattei wrote: > just for the fun things, i just find that the parallelized results are > false :-) even for low indices: > the non // result: C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4)) > is different than the parallelized result below: > C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) > > i have to check things again ,i think futures are easy to use but limited > in solving acute // problems, such as using a global variable like an hash > table... > > Damien > > > > On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei > wrote: > >> Hello, >> >> sorry for my late answer ( i wrote the code friday but i could only debug >> today, being busy and on travel last week-end) >> >> i modified my code to works with 'futures' , the speed is now incredible >> !!! (and it computes exact) >> the speed seems multiplied by even more than the number of CPUs (6): >> cat /proc/cpuinfo | grep processor >> processor : 0 >> processor : 1 >> processor : 2 >> processor : 3 >> processor : 4 >> processor : 5 >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900 >> >> (declare minterms-vector unified-minterms-vector-1) >> >> (define (funct-unify-minterms-set-1-unit-future set1 set2) >> >> {function-unify-minterms-list <+ (λ (L) (apply >> function-unify-two-minterms-and-tag L))} >> >> ;; note : sorting is useless >> >> {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)} >> ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1 >> set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create >> list of pair of minterms >> >> {minterms-vector <- (list->vector minterms-set)} >> >> {minterms-vector-length <+ (vector-length minterms-vector)} >> >> {nb-procs <+ (current-processor-count)} >> >> {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; >> compute the segments >> >> {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)} >> >> (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel >> code >> >> {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)} >> >> {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} >> ;; remove #f results >> >> {unified-minterms-set <+ (remove-duplicates-sorted >> unified-minterms-set-2)} ;; uniq >> >> unified-minterms-set) >> >> i keeped your 'segment' definitions to index an array of data after >> convert it from list to vector (list->vector) which probably consume >> additional time on big data list ( i had more than 1000000 element list >> lengths at some point) >> >> i used a simplified version of run-in parallel (that do not do 'reduce >> after processing data): >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794 >> >> the computation on a segment is done by those functions: >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842 >> >> {function-unify-minterms-list <+ (λ (L) (apply >> function-unify-two-minterms-and-tag L))} >> >> ;; proc to be called with futures >> (define (proc-unify-minterms-seg seg) >> {start <+ (segment-start seg)} >> {end <+ (segment-end seg)} >> (for ({i <+ start} {i <= end} {i <- {i + 1}}) >> {mtL <+ {minterms-vector[i]}} >> {unified-minterms-vector-1[i] <- (function-unify-minterms-list >> mtL)})) >> >> >> i use those code in another program symbolically to compute a value named >> Cn: >> >> C0 = T >> C1 = T >> C2 = T >> C3 = (B1 ∨ B2) >> C4 = (B2 ∨ (B1 ∧ B3)) >> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) >> C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5)) >> C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ >> (B4 ∧ B5) ∨ (B5 ∧ B6)) >> C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ >> B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7)) >> C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ >> (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ >> B8)) >> >> from C1 to C9 used to be fast,less than a minute for the whole with or >> without // >> >> C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ >> B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 >> ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9)) >> >> C10 takes a few minutes >> >> but C11 used to take one day before // with 'future' and i got now the >> result during less than one hour!!! >> >> C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ >> B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ >> B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9)) >> >> i never got C12 in the past ,even with 7 days of computation! and i got >> it today during the lunchbreak !!! >> >> C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 >> ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) >> ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ >> (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9)) >> >> (note that a wise people can find a general formula for Cn based on Cn-1 >> but that is another (mathematical) story....) >> >> i'm computing C13 but i see that it consumes 12gb out of 16gb of my >> system ! and stopped on the linux box : >> GC Warning: Repeated allocation of very large block (appr. size >> 510935040): >> May lead to memory leak and poor performance >> Processus arrêté >> but still running on the Mac laptop...ok will see that but i think that >> the data set is now huge and there is noting that can be done with that... >> >> note that there is still an hash table used in >> function-unify-two-minterms-and-tag and i will use another data structure >> and algo to avoid that, i feared that the concurrent access to hash table >> can cause the guile 'future' to crash or fails or to slow down the process >> but not! result is incredibly fast.Also i use continuation and i read it >> can cause problem with 'future' i will improve that too... >> >> I will see where i can improve the algo and data structure to optimize >> again but this is already really good. >> >> Thanks >> >> Damien >> >> >> On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl < >> zelphirkaltstahl@posteo.de> wrote: >> >>> Hello Damien, >>> On 10/14/22 10:21, Damien Mattei wrote: >>> >>> yes Zelphir i think there is a problem in the threading part of guile, >>> whatever the code is ,it run well the first time, and after is unstable. >>> Long ago at the very begin of Java language on my master project at >>> university i experienced same problem with Java "green" threads, under >>> windows and/or linux ,i can't remember. >>> >>> I'm trying your code and future, which have perheaps also the >>> portability with other schemes, future can provide "light" // , with >>> carefull code it could be ok. >>> >>> in your segment.scm code ,is segment-count possibly replaced by the >>> number of available CPUs or nodes, if i understand well? >>> >>> Regards, >>> Damien >>> >>> Correct. For each segment one future is used. >>> >>> However, there are scenarios, where one would rather want to interleave >>> the numbers of the whole range, to have a "fairer" workload per future, for >>> example: >>> >>> (1 2 3 4 5 6 7 8 9) >>> >>> -> (1 4 7), (2 5 8), (3 6 9) >>> >>> instead of >>> >>> -> (1 2 3) (4 5 6) (7 8 9) >>> >>> (I seem to remember this to be the case for Mandelbrot set calculations, >>> but might be wrong.) >>> >>> But that is all a matter of forming some segments and what a segment is, >>> not really part of the logic of creating futures. So one could create then >>> in any way that fits ones use-case. I have not generalized that segment >>> logic that far, but it is doable. >>> >>> Anyway, if anyone more knowledgeable could chime in on what the >>> performance differences between futures and parallel map are, that would be >>> great! Or pointing out some flaws that this kind of future usage might >>> have. Or when the point is reached to switch to guile-fibers library. >>> >>> Regards, >>> Zelphir >>> >>> -- >>> repositories: https://notabug.org/ZelphirKaltstahl >>> >>>