* map-par slower than map
@ 2022-10-12 17:19 Damien Mattei
  2022-10-12 18:45 ` Maxime Devos
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  0 siblings, 2 replies; 41+ messages in thread
From: Damien Mattei @ 2022-10-12 17:19 UTC (permalink / raw)
To: guile-user, guile-devel

Hello,

it is all in the title: I tested on a list of approximately 30000 elements and
got 9 s with map and 3 min 30 s with par-map on exactly the same piece of
code!? What I change in the code is shown in the comments:

{unified-minterms-set-1 <+ (map function-unify-minterms-list minterms-set)} ;;(par-map function-unify-minterms-list minterms-set)}

translated from Scheme+ to Scheme:

(define unified-minterms-set-1 (map function-unify-minterms-list minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))

best regards,
Damien

^ permalink raw reply	[flat|nested] 41+ messages in thread
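For context, a minimal benchmark of this kind of comparison might look as
follows. This is only a sketch: 'heavy-work' is a made-up stand-in for the real
per-element function (which is not shown in the message above), and the timing
helper is ad hoc.

(use-modules (ice-9 threads))   ;; provides par-map and current-processor-count

;; stand-in for the real per-element work
(define (heavy-work x)
  (let loop ((i 0) (acc (+ x 1)))
    (if (= i 10000)
        acc
        (loop (+ i 1) (modulo (* acc acc) 9973)))))

(define data (iota 30000))

;; crude wall-clock timer around a thunk
(define (timed label thunk)
  (let* ((t0 (get-internal-real-time))
         (result (thunk))
         (t1 (get-internal-real-time)))
    (format #t "~a: ~a s~%" label
            (exact->inexact (/ (- t1 t0) internal-time-units-per-second)))
    result))

(timed "map"     (lambda () (map heavy-work data)))
(timed "par-map" (lambda () (par-map heavy-work data)))

Whether par-map wins here depends on how much work each call does relative to
the threading overhead; for very cheap per-element work, plain map can easily be
faster.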
* Re: map-par slower than map
  2022-10-12 17:19 map-par slower than map Damien Mattei
@ 2022-10-12 18:45 ` Maxime Devos
  2022-10-12 20:20   ` Damien Mattei
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 1 reply; 41+ messages in thread
From: Maxime Devos @ 2022-10-12 18:45 UTC (permalink / raw)
To: Damien Mattei, guile-user, guile-devel

[-- Attachment #1.1.1: Type: text/plain, Size: 606 bytes --]

On 12-10-2022 19:19, Damien Mattei wrote:
> Hello,
> all is in the title, i test on a approximately 30000 element list , i got
> 9s with map and 3min 30s with par-map on exactly the same piece of code!?
>
> [...]
>
> translated from Scheme+ to Scheme:
> (define unified-minterms-set-1 (map function-unify-minterms-list
> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))

The definitions of 'function-unify-minterms-list' and 'minterms-set' are
missing. Without a test case, we can only speculate about what is going on
(e.g., maybe it grabs a mutex).

Greetings,
Maxime.

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-12 18:45 ` Maxime Devos
@ 2022-10-12 20:20   ` Damien Mattei
  2022-10-12 20:27     ` Damien Mattei
  2022-10-12 21:44     ` Maxime Devos
  0 siblings, 2 replies; 41+ messages in thread
From: Damien Mattei @ 2022-10-12 20:20 UTC (permalink / raw)
To: Maxime Devos; +Cc: guile-user, guile-devel

Mutex? I do not think the code has a situation where a deadlock could happen.
It is code for minimizing logic expressions; it uses minterms, and a minterms
set is a set of minterms, like this:

example: ((1 1 0) (1 1 1)) will be unified to (1 1 x), because the differing
0 and 1 are replaced by x.

The minterms-set can hold thousands of pairs (in the mathematical sense, not
Lisp pairs) of minterms to unify. If there is more than one x in the result
there is no need to continue, so I escape with a continuation:

minterms-set =
{
((1 0 1 0) (1 1 1 0))
((1 0 1 0) (1 1 0 1))
((1 0 1 0) (1 0 1 1))
((1 0 1 0) (0 1 1 1))
((0 1 1 0) (1 1 1 0))
((0 1 1 0) (1 1 0 1))
((0 1 1 0) (1 0 1 1))
((0 1 1 0) (0 1 1 1))
((0 1 0 1) (1 1 1 0))
((0 1 0 1) (1 1 0 1))
((0 1 0 1) (1 0 1 1))
((0 1 0 1) (0 1 1 1))
((0 0 1 1) (1 1 1 0))
((0 0 1 1) (1 1 0 1))
((0 0 1 1) (1 0 1 1))
((0 0 1 1) (0 1 1 1))
}

Replace { } with ( ) to get the list. Another example, at a later level:

minterms-set =
{
((0 x 1 1) (x 1 1 1))
((0 x 1 1) (1 x 1 1))
((0 x 1 1) (1 1 x 1))
((0 x 1 1) (1 1 1 x))
((x 0 1 1) (x 1 1 1))
((x 0 1 1) (1 x 1 1))
((x 0 1 1) (1 1 x 1))
((x 0 1 1) (1 1 1 x))
((0 1 x 1) (x 1 1 1))
((0 1 x 1) (1 x 1 1))
((0 1 x 1) (1 1 x 1))
((0 1 x 1) (1 1 1 x))
((x 1 0 1) (x 1 1 1))
((x 1 0 1) (1 x 1 1))
((x 1 0 1) (1 1 x 1))
((x 1 0 1) (1 1 1 x))
((0 1 1 x) (x 1 1 1))
((0 1 1 x) (1 x 1 1))
((0 1 1 x) (1 1 x 1))
((0 1 1 x) (1 1 1 x))
((x 1 1 0) (x 1 1 1))
((x 1 1 0) (1 x 1 1))
((x 1 1 0) (1 1 x 1))
((x 1 1 0) (1 1 1 x))
((1 0 1 x) (x 1 1 1))
((1 0 1 x) (1 x 1 1))
((1 0 1 x) (1 1 x 1))
((1 0 1 x) (1 1 1 x))
((1 x 1 0) (x 1 1 1))
((1 x 1 0) (1 x 1 1))
((1 x 1 0) (1 1 x 1))
((1 x 1 0) (1 1 1 x))
}

Here we see that some minterms are already unified.

It is not easy to read, even for me, because I wrote the code many years ago
and it is split across many files, but here it is:

(par-map function-unify-minterms-list minterms-set)

{function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}

(define (unify-two-minterms mt1 mt2)
  (function-map-with-escaping-by-kontinuation2
   (macro-function-compare-2-bits-with-continuation) mt1 mt2))

;; (function-map-with-escaping-by-kontinuation2 (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) '(1 1 0 1 1 1 1 1))
;; list1 = (1 1 0 1 0 1 1 0)
;; more-lists = ((1 1 0 1 1 1 1 1))
;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
;; #f
;;
;; (function-map-with-escaping-by-kontinuation2 (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) '(1 1 0 1 1 1 1 0))
;; list1 = (1 1 0 1 0 1 1 0)
;; more-lists = ((1 1 0 1 1 1 1 0))
;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
;; '(1 1 0 1 x 1 1 0)

(define (function-map-with-escaping-by-kontinuation2 clozure list1 . more-lists)
  (call/cc (lambda (kontinuation)
    (let ((lists (cons list1 more-lists))
          (funct-continu ;; this function has the kontinuation in its environment
           (lambda (arg1 . more-args)
             (let ((args (cons arg1 more-args)))
               (apply clozure kontinuation args))))) ;; to test: (apply clozure (cons conti args))
      ;; (newline)
      ;; (dv list1)
      ;; (dv more-lists)
      ;; (dv lists)
      ;; (dv clozure)
      ;; (newline)
      (apply map funct-continu lists)))))

(define-syntax macro-function-compare-2-bits-with-continuation ;; continuation version of macro-compare-2-bits
  ;; i need a macro because of the function external to the clozure
  (syntax-rules ()
    ((_) (let ((cnt 0)) ;; counter
           (lambda (continuation b1 b2)
             (if (equal? b1 b2)
                 b1
                 (begin
                   (set! cnt (add1 cnt)) ;; we leave with the continuation in case cnt > 1; a flag could have been used instead of a counter
                   (when (> cnt 1) (continuation #f)) ;; escaping with the continuation
                   'x))))))) ;; return x in case (b1,b2) = (0,1) or (1,0)

What could have caused a mutex? If, in the latter definition above, (let ((cnt 0)) ;; counter
had been defined at top level and shared by all threads, then yes, there could
have been some mutex contention, but this is not the case. I think all the
functions are even pure, so why is it slower with // than without?

Damien

On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> wrote:

> On 12-10-2022 19:19, Damien Mattei wrote:
> Hello,
> all is in the title, i test on a approximately 30000 element list , i got
> 9s with map and 3min 30s with par-map on exactly the same piece of code!?
>
> [...]
>
> translated from Scheme+ to Scheme:
> (define unified-minterms-set-1 (map function-unify-minterms-list
> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
>
> The definition of 'function-unify-minterms-list' and 'minterms-set' is
> missing. Without a test case, we can only speculate what's going on.
> (E.g., maybe it grabs a mutex).
>
> Greetings,
> Maxime.

^ permalink raw reply	[flat|nested] 41+ messages in thread
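For readers who find the continuation-based escape above hard to follow, the
per-pair operation can be pictured with a continuation-free sketch like the one
below. This is not the code from the thread, just a hypothetical equivalent:
replace the single differing bit by x, and give up (return #f) as soon as a
second difference shows up.

;; unify two minterms of equal length, tolerating at most one differing bit
(define (unify-two-minterms-simple mt1 mt2)
  (let loop ((a mt1) (b mt2) (diffs 0) (acc '()))
    (cond ((null? a) (reverse acc))            ;; zero or one difference found: success
          ((equal? (car a) (car b))
           (loop (cdr a) (cdr b) diffs (cons (car a) acc)))
          ((zero? diffs)                       ;; first difference: replace it by 'x
           (loop (cdr a) (cdr b) 1 (cons 'x acc)))
          (else #f))))                         ;; second difference: give up

;; (unify-two-minterms-simple '(1 1 0 1 0 1 1 0) '(1 1 0 1 1 1 1 0)) => (1 1 0 1 x 1 1 0)
;; (unify-two-minterms-simple '(1 1 0 1 0 1 1 0) '(1 1 0 1 1 1 1 1)) => #f

Since each call only touches its own arguments and returns a fresh list, this
version has no shared state at all, which is the property that matters when
running it under par-map or futures.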
* Re: map-par slower than map 2022-10-12 20:20 ` Damien Mattei @ 2022-10-12 20:27 ` Damien Mattei 2022-10-12 21:29 ` Zelphir Kaltstahl 2022-10-12 21:44 ` Maxime Devos 1 sibling, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-10-12 20:27 UTC (permalink / raw) To: Maxime Devos; +Cc: guile-user, guile-devel https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 i commited the current version of code here with all files but it is huge.... :-/ On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com> wrote: > Mutex? i do not think code has situation where dead lock could happen, it > is a code about minimalising logic expressions, it uses minterms , minterms > set is a set of minterms :like this: > > example: > ((1 1 0) (1 1 1)) will be unified : (1 1 x) > because 0 and 1 are replaced by x > the minterms-set could have thousands of pair (mathematic not lisp) > minterms to unify > if there is more than one x as result there is no need to continue so i > escape with a continuation: > > minterms-set = > { > ((1 0 1 0) (1 1 1 0)) > ((1 0 1 0) (1 1 0 1)) > ((1 0 1 0) (1 0 1 1)) > ((1 0 1 0) (0 1 1 1)) > ((0 1 1 0) (1 1 1 0)) > ((0 1 1 0) (1 1 0 1)) > ((0 1 1 0) (1 0 1 1)) > ((0 1 1 0) (0 1 1 1)) > ((0 1 0 1) (1 1 1 0)) > ((0 1 0 1) (1 1 0 1)) > ((0 1 0 1) (1 0 1 1)) > ((0 1 0 1) (0 1 1 1)) > ((0 0 1 1) (1 1 1 0)) > ((0 0 1 1) (1 1 0 1)) > ((0 0 1 1) (1 0 1 1)) > ((0 0 1 1) (0 1 1 1)) > } > > replace { } by () to have the list, other example at another level : > > minterms-set = > { > ((0 x 1 1) (x 1 1 1)) > ((0 x 1 1) (1 x 1 1)) > ((0 x 1 1) (1 1 x 1)) > ((0 x 1 1) (1 1 1 x)) > ((x 0 1 1) (x 1 1 1)) > ((x 0 1 1) (1 x 1 1)) > ((x 0 1 1) (1 1 x 1)) > ((x 0 1 1) (1 1 1 x)) > ((0 1 x 1) (x 1 1 1)) > ((0 1 x 1) (1 x 1 1)) > ((0 1 x 1) (1 1 x 1)) > ((0 1 x 1) (1 1 1 x)) > ((x 1 0 1) (x 1 1 1)) > ((x 1 0 1) (1 x 1 1)) > ((x 1 0 1) (1 1 x 1)) > ((x 1 0 1) (1 1 1 x)) > ((0 1 1 x) (x 1 1 1)) > ((0 1 1 x) (1 x 1 1)) > ((0 1 1 x) (1 1 x 1)) > ((0 1 1 x) (1 1 1 x)) > ((x 1 1 0) (x 1 1 1)) > ((x 1 1 0) (1 x 1 1)) > ((x 1 1 0) (1 1 x 1)) > ((x 1 1 0) (1 1 1 x)) > ((1 0 1 x) (x 1 1 1)) > ((1 0 1 x) (1 x 1 1)) > ((1 0 1 x) (1 1 x 1)) > ((1 0 1 x) (1 1 1 x)) > ((1 x 1 0) (x 1 1 1)) > ((1 x 1 0) (1 x 1 1)) > ((1 x 1 0) (1 1 x 1)) > ((1 x 1 0) (1 1 1 x)) > } > > here we see some minterms are already unified > > it is not easy to read even by me because i wrote the code many years ago > and is split in many files, but here it is: > > (par-map function-unify-minterms-list minterms-set) > > {function-unify-minterms-list <+ (λ (L) (apply > function-unify-two-minterms-and-tag L))} > > (define (unify-two-minterms mt1 mt2) > (function-map-with-escaping-by-kontinuation2 > (macro-function-compare-2-bits-with-continuation) mt1 mt2)) > > ;; (function-map-with-escaping-by-kontinuation2 > (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) '(1 > 1 0 1 1 1 1 1)) > > ;; list1 = (1 1 0 1 0 1 1 0) > ;; more-lists = ((1 1 0 1 1 1 1 1)) > ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) > ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > > ;; #f > ;; > ;; (function-map-with-escaping-by-kontinuation2 > (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) '(1 > 1 0 1 1 1 1 0)) > > ;; list1 = (1 1 0 1 0 1 1 0) > ;; more-lists = ((1 1 0 1 1 1 1 0)) > ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) > ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > > ;; '(1 1 0 1 x 1 1 0) > (define (function-map-with-escaping-by-kontinuation2 clozure 
list1 . > more-lists) > (call/cc (lambda (kontinuation) > (let ((lists (cons list1 more-lists)) > (funct-continu ;; this function have the kontinuation in his environment > (lambda (arg1 . more-args) > (let ((args (cons arg1 more-args))) > (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons > conti args)) > > ;; (newline) > ;; (dv list1) > ;; (dv more-lists) > ;; (dv lists) > ;; (dv clozure) > ;; (newline) > > (apply map funct-continu lists))))) > > (define-syntax macro-function-compare-2-bits-with-continuation ;; > continuation version of macro-compare-2-bits > ;; i need a macro because of external function to the clozure > (syntax-rules () > ((_) (let ((cnt 0)) ;; counter > (lambda (continuation b1 b2) (if (equal? b1 b2) > b1 > (begin > (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > 1, we > can have used a flag too instead of a counter > (when (> cnt 1) (continuation #f)) ;; escaping with the continuation > 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) > > what could have caused mutex if in the latter definition above (let ((cnt > 0)) ;; counter was defined at top level and shared by all threads!!! yes > there could have be some mutex but this is not the case, i think even all > function are pure so why is it more slow with // than without? > Damien > > On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> > wrote: > >> On 12-10-2022 19:19, Damien Mattei wrote: >> > Hello, >> > all is in the title, i test on a approximately 30000 element list , i >> got >> > 9s with map and 3min 30s with par-map on exactly the same piece of >> code!? >> > >> > [...] >> > >> > translated from Scheme+ to Scheme: >> > (define unified-minterms-set-1 (map function-unify-minterms-list >> > minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) >> >> The definition of 'function-unify-minterms-list' and 'minterms-set' is >> missing. Without a test case, we can only speculate what's going on. >> (E.g., maybe it grabs a mutex). >> >> Greetings, >> Maxime. >> > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-10-12 20:27 ` Damien Mattei @ 2022-10-12 21:29 ` Zelphir Kaltstahl 2022-10-14 8:21 ` Damien Mattei ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Zelphir Kaltstahl @ 2022-10-12 21:29 UTC (permalink / raw) To: Damien Mattei; +Cc: guile-user, guile-devel Hi! On 10/12/22 22:27, Damien Mattei wrote: > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 > > i commited the current version of code here with all files but it is > huge.... :-/ > > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com> > wrote: > >> Mutex? i do not think code has situation where dead lock could happen, it >> is a code about minimalising logic expressions, it uses minterms , minterms >> set is a set of minterms :like this: >> >> example: >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) >> because 0 and 1 are replaced by x >> the minterms-set could have thousands of pair (mathematic not lisp) >> minterms to unify >> if there is more than one x as result there is no need to continue so i >> escape with a continuation: >> >> minterms-set = >> { >> ((1 0 1 0) (1 1 1 0)) >> ((1 0 1 0) (1 1 0 1)) >> ((1 0 1 0) (1 0 1 1)) >> ((1 0 1 0) (0 1 1 1)) >> ((0 1 1 0) (1 1 1 0)) >> ((0 1 1 0) (1 1 0 1)) >> ((0 1 1 0) (1 0 1 1)) >> ((0 1 1 0) (0 1 1 1)) >> ((0 1 0 1) (1 1 1 0)) >> ((0 1 0 1) (1 1 0 1)) >> ((0 1 0 1) (1 0 1 1)) >> ((0 1 0 1) (0 1 1 1)) >> ((0 0 1 1) (1 1 1 0)) >> ((0 0 1 1) (1 1 0 1)) >> ((0 0 1 1) (1 0 1 1)) >> ((0 0 1 1) (0 1 1 1)) >> } >> >> replace { } by () to have the list, other example at another level : >> >> minterms-set = >> { >> ((0 x 1 1) (x 1 1 1)) >> ((0 x 1 1) (1 x 1 1)) >> ((0 x 1 1) (1 1 x 1)) >> ((0 x 1 1) (1 1 1 x)) >> ((x 0 1 1) (x 1 1 1)) >> ((x 0 1 1) (1 x 1 1)) >> ((x 0 1 1) (1 1 x 1)) >> ((x 0 1 1) (1 1 1 x)) >> ((0 1 x 1) (x 1 1 1)) >> ((0 1 x 1) (1 x 1 1)) >> ((0 1 x 1) (1 1 x 1)) >> ((0 1 x 1) (1 1 1 x)) >> ((x 1 0 1) (x 1 1 1)) >> ((x 1 0 1) (1 x 1 1)) >> ((x 1 0 1) (1 1 x 1)) >> ((x 1 0 1) (1 1 1 x)) >> ((0 1 1 x) (x 1 1 1)) >> ((0 1 1 x) (1 x 1 1)) >> ((0 1 1 x) (1 1 x 1)) >> ((0 1 1 x) (1 1 1 x)) >> ((x 1 1 0) (x 1 1 1)) >> ((x 1 1 0) (1 x 1 1)) >> ((x 1 1 0) (1 1 x 1)) >> ((x 1 1 0) (1 1 1 x)) >> ((1 0 1 x) (x 1 1 1)) >> ((1 0 1 x) (1 x 1 1)) >> ((1 0 1 x) (1 1 x 1)) >> ((1 0 1 x) (1 1 1 x)) >> ((1 x 1 0) (x 1 1 1)) >> ((1 x 1 0) (1 x 1 1)) >> ((1 x 1 0) (1 1 x 1)) >> ((1 x 1 0) (1 1 1 x)) >> } >> >> here we see some minterms are already unified >> >> it is not easy to read even by me because i wrote the code many years ago >> and is split in many files, but here it is: >> >> (par-map function-unify-minterms-list minterms-set) >> >> {function-unify-minterms-list <+ (λ (L) (apply >> function-unify-two-minterms-and-tag L))} >> >> (define (unify-two-minterms mt1 mt2) >> (function-map-with-escaping-by-kontinuation2 >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) >> >> ;; (function-map-with-escaping-by-kontinuation2 >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) '(1 >> 1 0 1 1 1 1 1)) >> >> ;; list1 = (1 1 0 1 0 1 1 0) >> ;; more-lists = ((1 1 0 1 1 1 1 1)) >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >> >> ;; #f >> ;; >> ;; (function-map-with-escaping-by-kontinuation2 >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) '(1 >> 1 0 1 1 1 1 0)) >> >> ;; list1 = (1 1 0 1 0 1 1 0) >> ;; more-lists = ((1 1 0 1 1 1 1 0)) >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 
1 1 0)) >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >> >> ;; '(1 1 0 1 x 1 1 0) >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . >> more-lists) >> (call/cc (lambda (kontinuation) >> (let ((lists (cons list1 more-lists)) >> (funct-continu ;; this function have the kontinuation in his environment >> (lambda (arg1 . more-args) >> (let ((args (cons arg1 more-args))) >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons >> conti args)) >> >> ;; (newline) >> ;; (dv list1) >> ;; (dv more-lists) >> ;; (dv lists) >> ;; (dv clozure) >> ;; (newline) >> >> (apply map funct-continu lists))))) >> >> (define-syntax macro-function-compare-2-bits-with-continuation ;; >> continuation version of macro-compare-2-bits >> ;; i need a macro because of external function to the clozure >> (syntax-rules () >> ((_) (let ((cnt 0)) ;; counter >> (lambda (continuation b1 b2) (if (equal? b1 b2) >> b1 >> (begin >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > 1, we >> can have used a flag too instead of a counter >> (when (> cnt 1) (continuation #f)) ;; escaping with the continuation >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) >> >> what could have caused mutex if in the latter definition above (let ((cnt >> 0)) ;; counter was defined at top level and shared by all threads!!! yes >> there could have be some mutex but this is not the case, i think even all >> function are pure so why is it more slow with // than without? >> Damien >> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> >> wrote: >> >>> On 12-10-2022 19:19, Damien Mattei wrote: >>>> Hello, >>>> all is in the title, i test on a approximately 30000 element list , i >>> got >>>> 9s with map and 3min 30s with par-map on exactly the same piece of >>> code!? >>> > [...] >>> > >>>> translated from Scheme+ to Scheme: >>>> (define unified-minterms-set-1 (map function-unify-minterms-list >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is >>> missing. Without a test case, we can only speculate what's going on. >>> (E.g., maybe it grabs a mutex). >>> >>> Greetings, >>> Maxime. I don't want to scare anyone, just maybe warn about parallel map. I once tried to use Guile's parallel map function for a decision tree implementation (https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), where each branch while learning the tree would call parallel map again for sub branches and so on. Somehow it made Guile crash (I don't have the error message any longer, but I did post about it on the mailing list back then.). I never figured out, what went wrong. All I had was pure function calls and math inside the thing that parallel map was supposed to run. Ultimately I simply tried other parallelism constructs and when I switched to using futures instead, everything worked fine, no crashes, no errors. Since that time, I did not use parallel map and instead used futures. Recently I made a parallelization thing for solving exercises of Project Euler using multiple cores, so that some solutions are calculated faster. Maybe this can help or can be adapted to another use case: https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 It expects ranges of things, which are called `segments` in the code. 
Usually ranges of numbers for Project Euler things. Here is the code to split a range into segments: https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm (Check any solution using it for an example.) So this might be a bit too specific for general parallel things, but I guess one could change the way futures are used in `run-in-parallel`, to fit any other purpose. Best regards, Zelphir -- repositories: https://notabug.org/ZelphirKaltstahl ^ permalink raw reply [flat|nested] 41+ messages in thread
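As a rough illustration of the futures-based pattern described above (a sketch
only, not the code from the linked repository; 'chunk' and 'future-map' are
made-up names, and the chunking here is contiguous rather than interleaved):

(use-modules (ice-9 futures)   ;; future, touch
             (ice-9 threads)   ;; current-processor-count
             (srfi srfi-1))    ;; split-at, concatenate

;; split lst into consecutive chunks of at most 'size' elements
(define (chunk lst size)
  (cond ((null? lst) '())
        ((<= (length lst) size) (list lst))
        (else (call-with-values (lambda () (split-at lst size))
                (lambda (head tail) (cons head (chunk tail size)))))))

;; behaves like (map f lst), but maps each chunk inside its own future
(define (future-map f lst)
  (let* ((n    (max 1 (current-processor-count)))
         (size (max 1 (quotient (+ (length lst) n -1) n)))  ;; ceiling division
         (fs   (map (lambda (part) (future (map f part)))
                    (chunk lst size))))
    (concatenate (map touch fs))))                          ;; wait and reassemble

Each future runs an ordinary map over its own slice and writes nothing shared,
so the partial results only have to be stitched back together with concatenate
at the end.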
* Re: map-par slower than map 2022-10-12 21:29 ` Zelphir Kaltstahl @ 2022-10-14 8:21 ` Damien Mattei 2022-10-14 8:38 ` Zelphir Kaltstahl 2022-10-25 9:07 ` Mikael Djurfeldt 2022-11-10 10:32 ` Damien Mattei 2 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-10-14 8:21 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel yes Zelphir i think there is a problem in the threading part of guile, whatever the code is ,it run well the first time, and after is unstable. Long ago at the very begin of Java language on my master project at university i experienced same problem with Java "green" threads, under windows and/or linux ,i can't remember. I'm trying your code and future, which have perheaps also the portability with other schemes, future can provide "light" // , with carefull code it could be ok. in your segment.scm code ,is segment-count possibly replaced by the number of available CPUs or nodes, if i understand well? Regards, Damien On Wed, Oct 12, 2022 at 11:29 PM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hi! > > On 10/12/22 22:27, Damien Mattei wrote: > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 > > > > i commited the current version of code here with all files but it is > > huge.... :-/ > > > > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com> > > wrote: > > > >> Mutex? i do not think code has situation where dead lock could happen, > it > >> is a code about minimalising logic expressions, it uses minterms , > minterms > >> set is a set of minterms :like this: > >> > >> example: > >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) > >> because 0 and 1 are replaced by x > >> the minterms-set could have thousands of pair (mathematic not lisp) > >> minterms to unify > >> if there is more than one x as result there is no need to continue so i > >> escape with a continuation: > >> > >> minterms-set = > >> { > >> ((1 0 1 0) (1 1 1 0)) > >> ((1 0 1 0) (1 1 0 1)) > >> ((1 0 1 0) (1 0 1 1)) > >> ((1 0 1 0) (0 1 1 1)) > >> ((0 1 1 0) (1 1 1 0)) > >> ((0 1 1 0) (1 1 0 1)) > >> ((0 1 1 0) (1 0 1 1)) > >> ((0 1 1 0) (0 1 1 1)) > >> ((0 1 0 1) (1 1 1 0)) > >> ((0 1 0 1) (1 1 0 1)) > >> ((0 1 0 1) (1 0 1 1)) > >> ((0 1 0 1) (0 1 1 1)) > >> ((0 0 1 1) (1 1 1 0)) > >> ((0 0 1 1) (1 1 0 1)) > >> ((0 0 1 1) (1 0 1 1)) > >> ((0 0 1 1) (0 1 1 1)) > >> } > >> > >> replace { } by () to have the list, other example at another level : > >> > >> minterms-set = > >> { > >> ((0 x 1 1) (x 1 1 1)) > >> ((0 x 1 1) (1 x 1 1)) > >> ((0 x 1 1) (1 1 x 1)) > >> ((0 x 1 1) (1 1 1 x)) > >> ((x 0 1 1) (x 1 1 1)) > >> ((x 0 1 1) (1 x 1 1)) > >> ((x 0 1 1) (1 1 x 1)) > >> ((x 0 1 1) (1 1 1 x)) > >> ((0 1 x 1) (x 1 1 1)) > >> ((0 1 x 1) (1 x 1 1)) > >> ((0 1 x 1) (1 1 x 1)) > >> ((0 1 x 1) (1 1 1 x)) > >> ((x 1 0 1) (x 1 1 1)) > >> ((x 1 0 1) (1 x 1 1)) > >> ((x 1 0 1) (1 1 x 1)) > >> ((x 1 0 1) (1 1 1 x)) > >> ((0 1 1 x) (x 1 1 1)) > >> ((0 1 1 x) (1 x 1 1)) > >> ((0 1 1 x) (1 1 x 1)) > >> ((0 1 1 x) (1 1 1 x)) > >> ((x 1 1 0) (x 1 1 1)) > >> ((x 1 1 0) (1 x 1 1)) > >> ((x 1 1 0) (1 1 x 1)) > >> ((x 1 1 0) (1 1 1 x)) > >> ((1 0 1 x) (x 1 1 1)) > >> ((1 0 1 x) (1 x 1 1)) > >> ((1 0 1 x) (1 1 x 1)) > >> ((1 0 1 x) (1 1 1 x)) > >> ((1 x 1 0) (x 1 1 1)) > >> ((1 x 1 0) (1 x 1 1)) > >> ((1 x 1 0) (1 1 x 1)) > >> ((1 x 1 0) (1 1 1 x)) > >> } > >> > >> here we see some minterms are already unified > >> > >> it is not easy to read even by me because i wrote the code many years > ago > >> and is split in many 
files, but here it is: > >> > >> (par-map function-unify-minterms-list minterms-set) > >> > >> {function-unify-minterms-list <+ (λ (L) (apply > >> function-unify-two-minterms-and-tag L))} > >> > >> (define (unify-two-minterms mt1 mt2) > >> (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) > >> > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) > '(1 > >> 1 0 1 1 1 1 1)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 1)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; #f > >> ;; > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) > '(1 > >> 1 0 1 1 1 1 0)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 0)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; '(1 1 0 1 x 1 1 0) > >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . > >> more-lists) > >> (call/cc (lambda (kontinuation) > >> (let ((lists (cons list1 more-lists)) > >> (funct-continu ;; this function have the kontinuation in his > environment > >> (lambda (arg1 . more-args) > >> (let ((args (cons arg1 more-args))) > >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons > >> conti args)) > >> > >> ;; (newline) > >> ;; (dv list1) > >> ;; (dv more-lists) > >> ;; (dv lists) > >> ;; (dv clozure) > >> ;; (newline) > >> > >> (apply map funct-continu lists))))) > >> > >> (define-syntax macro-function-compare-2-bits-with-continuation ;; > >> continuation version of macro-compare-2-bits > >> ;; i need a macro because of external function to the clozure > >> (syntax-rules () > >> ((_) (let ((cnt 0)) ;; counter > >> (lambda (continuation b1 b2) (if (equal? b1 b2) > >> b1 > >> (begin > >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > > 1, we > >> can have used a flag too instead of a counter > >> (when (> cnt 1) (continuation #f)) ;; escaping with the continuation > >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) > >> > >> what could have caused mutex if in the latter definition above (let > ((cnt > >> 0)) ;; counter was defined at top level and shared by all threads!!! yes > >> there could have be some mutex but this is not the case, i think even > all > >> function are pure so why is it more slow with // than without? > >> Damien > >> > >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> > >> wrote: > >> > >>> On 12-10-2022 19:19, Damien Mattei wrote: > >>>> Hello, > >>>> all is in the title, i test on a approximately 30000 element list , i > >>> got > >>>> 9s with map and 3min 30s with par-map on exactly the same piece of > >>> code!? > >>> > [...] > >>> > > >>>> translated from Scheme+ to Scheme: > >>>> (define unified-minterms-set-1 (map function-unify-minterms-list > >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) > >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is > >>> missing. Without a test case, we can only speculate what's going on. > >>> (E.g., maybe it grabs a mutex). > >>> > >>> Greetings, > >>> Maxime. > I don't want to scare anyone, just maybe warn about parallel map. 
I once > tried > to use Guile's parallel map function for a decision tree implementation > ( > https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), > > where each branch while learning the tree would call parallel map again > for sub > branches and so on. Somehow it made Guile crash (I don't have the error > message > any longer, but I did post about it on the mailing list back then.). I > never > figured out, what went wrong. All I had was pure function calls and math > inside > the thing that parallel map was supposed to run. > > Ultimately I simply tried other parallelism constructs and when I switched > to > using futures instead, everything worked fine, no crashes, no errors. > > Since that time, I did not use parallel map and instead used futures. > Recently I > made a parallelization thing for solving exercises of Project Euler using > multiple cores, so that some solutions are calculated faster. Maybe this > can > help or can be adapted to another use case: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 > > It expects ranges of things, which are called `segments` in the code. > Usually > ranges of numbers for Project Euler things. Here is the code to split a > range > into segments: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm > > (Check any solution using it for an example.) > > So this might be a bit too specific for general parallel things, but I > guess one > could change the way futures are used in `run-in-parallel`, to fit any > other > purpose. > > Best regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-14  8:21 ` Damien Mattei
@ 2022-10-14  8:38   ` Zelphir Kaltstahl
  2022-10-17 13:17     ` Damien Mattei
  0 siblings, 1 reply; 41+ messages in thread
From: Zelphir Kaltstahl @ 2022-10-14 8:38 UTC (permalink / raw)
To: Damien Mattei; +Cc: guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 1723 bytes --]

Hello Damien,

On 10/14/22 10:21, Damien Mattei wrote:
> yes Zelphir i think there is a problem in the threading part of guile,
> whatever the code is ,it run well the first time, and after is unstable. Long
> ago at the very begin of Java language on my master project at university i
> experienced same problem with Java "green" threads, under windows and/or linux
> ,i can't remember.
>
> I'm trying your code and future, which have perheaps also the portability with
> other schemes, future can provide "light" // , with carefull code it could be ok.
>
> in your segment.scm code ,is segment-count possibly replaced by the number of
> available CPUs or nodes, if i understand well?
>
> Regards,
> Damien

Correct. For each segment one future is used.

However, there are scenarios where one would rather want to interleave the
numbers of the whole range, to get a "fairer" workload per future, for example:

(1 2 3 4 5 6 7 8 9)

-> (1 4 7), (2 5 8), (3 6 9)

instead of

-> (1 2 3) (4 5 6) (7 8 9)

(I seem to remember this being the case for Mandelbrot set calculations, but I
might be wrong.)

But that is all a matter of forming some segments and of what a segment is, not
really part of the logic of creating futures. So one could create them in any
way that fits one's use case. I have not generalized that segment logic that
far, but it is doable.

Anyway, if anyone more knowledgeable could chime in on the performance
differences between futures and parallel map, that would be great! Or point out
some flaws this kind of future usage might have, or say when the point is
reached to switch to the guile-fibers library.

Regards,
Zelphir

--
repositories: https://notabug.org/ZelphirKaltstahl

[-- Attachment #2: Type: text/html, Size: 3110 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread
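The interleaving described above could be implemented, for instance, with a
small round-robin helper like the following (a hypothetical sketch; it is not
part of the linked segment.scm):

;; distribute the elements of lst round-robin over k buckets
(define (interleave lst k)
  (let ((buckets (make-vector k '())))
    (let loop ((rest lst) (i 0))
      (unless (null? rest)
        (let ((j (modulo i k)))
          (vector-set! buckets j (cons (car rest) (vector-ref buckets j))))
        (loop (cdr rest) (+ i 1))))
    (map reverse (vector->list buckets))))

;; (interleave '(1 2 3 4 5 6 7 8 9) 3) => ((1 4 7) (2 5 8) (3 6 9))

Each bucket then becomes the input of one future, so a "hard" region of the
range is spread across all workers instead of landing on a single one.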
* Re: map-par slower than map 2022-10-14 8:38 ` Zelphir Kaltstahl @ 2022-10-17 13:17 ` Damien Mattei 2022-10-22 16:01 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-10-17 13:17 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel Hello, sorry for my late answer ( i wrote the code friday but i could only debug today, being busy and on travel last week-end) i modified my code to works with 'futures' , the speed is now incredible !!! (and it computes exact) the speed seems multiplied by even more than the number of CPUs (6): cat /proc/cpuinfo | grep processor processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900 (declare minterms-vector unified-minterms-vector-1) (define (funct-unify-minterms-set-1-unit-future set1 set2) {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))} ;; note : sorting is useless {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)} ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1 set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create list of pair of minterms {minterms-vector <- (list->vector minterms-set)} {minterms-vector-length <+ (vector-length minterms-vector)} {nb-procs <+ (current-processor-count)} {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute the segments {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)} (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)} {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;; remove #f results {unified-minterms-set <+ (remove-duplicates-sorted unified-minterms-set-2)} ;; uniq unified-minterms-set) i keeped your 'segment' definitions to index an array of data after convert it from list to vector (list->vector) which probably consume additional time on big data list ( i had more than 1000000 element list lengths at some point) i used a simplified version of run-in parallel (that do not do 'reduce after processing data): https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794 the computation on a segment is done by those functions: https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842 {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))} ;; proc to be called with futures (define (proc-unify-minterms-seg seg) {start <+ (segment-start seg)} {end <+ (segment-end seg)} (for ({i <+ start} {i <= end} {i <- {i + 1}}) {mtL <+ {minterms-vector[i]}} {unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)})) i use those code in another program symbolically to compute a value named Cn: C0 = T C1 = T C2 = T C3 = (B1 ∨ B2) C4 = (B2 ∨ (B1 ∧ B3)) C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5)) C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5) ∨ (B5 ∧ B6)) C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7)) C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ B8)) from C1 to C9 used to be fast,less than a minute for the whole with or 
without // C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9)) C10 takes a few minutes but C11 used to take one day before // with 'future' and i got now the result during less than one hour!!! C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9)) i never got C12 in the past ,even with 7 days of computation! and i got it today during the lunchbreak !!! C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9)) (note that a wise people can find a general formula for Cn based on Cn-1 but that is another (mathematical) story....) i'm computing C13 but i see that it consumes 12gb out of 16gb of my system ! and stopped on the linux box : GC Warning: Repeated allocation of very large block (appr. size 510935040): May lead to memory leak and poor performance Processus arrêté but still running on the Mac laptop...ok will see that but i think that the data set is now huge and there is noting that can be done with that... note that there is still an hash table used in function-unify-two-minterms-and-tag and i will use another data structure and algo to avoid that, i feared that the concurrent access to hash table can cause the guile 'future' to crash or fails or to slow down the process but not! result is incredibly fast.Also i use continuation and i read it can cause problem with 'future' i will improve that too... I will see where i can improve the algo and data structure to optimize again but this is already really good. Thanks Damien On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hello Damien, > On 10/14/22 10:21, Damien Mattei wrote: > > yes Zelphir i think there is a problem in the threading part of guile, > whatever the code is ,it run well the first time, and after is unstable. > Long ago at the very begin of Java language on my master project at > university i experienced same problem with Java "green" threads, under > windows and/or linux ,i can't remember. > > I'm trying your code and future, which have perheaps also the portability > with other schemes, future can provide "light" // , with carefull code it > could be ok. > > in your segment.scm code ,is segment-count possibly replaced by the number > of available CPUs or nodes, if i understand well? > > Regards, > Damien > > Correct. For each segment one future is used. > > However, there are scenarios, where one would rather want to interleave > the numbers of the whole range, to have a "fairer" workload per future, for > example: > > (1 2 3 4 5 6 7 8 9) > > -> (1 4 7), (2 5 8), (3 6 9) > > instead of > > -> (1 2 3) (4 5 6) (7 8 9) > > (I seem to remember this to be the case for Mandelbrot set calculations, > but might be wrong.) > > But that is all a matter of forming some segments and what a segment is, > not really part of the logic of creating futures. So one could create then > in any way that fits ones use-case. I have not generalized that segment > logic that far, but it is doable. 
> > Anyway, if anyone more knowledgeable could chime in on what the > performance differences between futures and parallel map are, that would be > great! Or pointing out some flaws that this kind of future usage might > have. Or when the point is reached to switch to guile-fibers library. > > Regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > > ^ permalink raw reply [flat|nested] 41+ messages in thread
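For readers who do not know the Scheme+ brace syntax, the
'proc-unify-minterms-seg' shown above corresponds roughly to the following
plain Guile. This is an approximate translation, assuming the 'segment-start'
and 'segment-end' accessors and the globals 'minterms-vector',
'unified-minterms-vector-1' and 'function-unify-two-minterms-and-tag' from the
linked logiki+.scm; Scheme+'s {v[i] <- x} is a vector-set!.

(define (proc-unify-minterms-seg seg)
  (let ((start (segment-start seg))
        (end   (segment-end seg)))
    (let loop ((i start))
      (when (<= i end)
        ;; each future writes only into its own slice of the result vector
        (vector-set! unified-minterms-vector-1 i
                     (apply function-unify-two-minterms-and-tag
                            (vector-ref minterms-vector i)))
        (loop (+ i 1))))))

The important property is that the index ranges handed to the futures do not
overlap, so no two futures ever write to the same vector cell.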
* Re: map-par slower than map 2022-10-17 13:17 ` Damien Mattei @ 2022-10-22 16:01 ` Damien Mattei 2022-10-23 1:06 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-10-22 16:01 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel just for the fun things, i just find that the parallelized results are false :-) even for low indices: the non // result: C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4)) is different than the parallelized result below: C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) i have to check things again ,i think futures are easy to use but limited in solving acute // problems, such as using a global variable like an hash table... Damien On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei <damien.mattei@gmail.com> wrote: > Hello, > > sorry for my late answer ( i wrote the code friday but i could only debug > today, being busy and on travel last week-end) > > i modified my code to works with 'futures' , the speed is now incredible > !!! (and it computes exact) > the speed seems multiplied by even more than the number of CPUs (6): > cat /proc/cpuinfo | grep processor > processor : 0 > processor : 1 > processor : 2 > processor : 3 > processor : 4 > processor : 5 > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900 > > (declare minterms-vector unified-minterms-vector-1) > > (define (funct-unify-minterms-set-1-unit-future set1 set2) > > {function-unify-minterms-list <+ (λ (L) (apply > function-unify-two-minterms-and-tag L))} > > ;; note : sorting is useless > > {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)} > ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1 > set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create > list of pair of minterms > > {minterms-vector <- (list->vector minterms-set)} > > {minterms-vector-length <+ (vector-length minterms-vector)} > > {nb-procs <+ (current-processor-count)} > > {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute > the segments > > {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)} > > (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code > > {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)} > > {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;; > remove #f results > > {unified-minterms-set <+ (remove-duplicates-sorted > unified-minterms-set-2)} ;; uniq > > unified-minterms-set) > > i keeped your 'segment' definitions to index an array of data after > convert it from list to vector (list->vector) which probably consume > additional time on big data list ( i had more than 1000000 element list > lengths at some point) > > i used a simplified version of run-in parallel (that do not do 'reduce > after processing data): > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794 > > the computation on a segment is done by those functions: > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842 > > {function-unify-minterms-list <+ (λ (L) (apply > function-unify-two-minterms-and-tag L))} > > ;; proc to be called with futures > (define (proc-unify-minterms-seg seg) > {start <+ (segment-start seg)} > {end <+ (segment-end seg)} > (for ({i <+ start} {i <= end} {i <- {i + 1}}) > {mtL <+ {minterms-vector[i]}} > {unified-minterms-vector-1[i] <- (function-unify-minterms-list > mtL)})) > > > i use those code in another 
program symbolically to compute a value named > Cn: > > C0 = T > C1 = T > C2 = T > C3 = (B1 ∨ B2) > C4 = (B2 ∨ (B1 ∧ B3)) > C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) > C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5)) > C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ > (B4 ∧ B5) ∨ (B5 ∧ B6)) > C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ > B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7)) > C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ > (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ > B8)) > > from C1 to C9 used to be fast,less than a minute for the whole with or > without // > > C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6 > ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ > B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9)) > > C10 takes a few minutes > > but C11 used to take one day before // with 'future' and i got now the > result during less than one hour!!! > > C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ > B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ > B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9)) > > i never got C12 in the past ,even with 7 days of computation! and i got it > today during the lunchbreak !!! > > C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 > ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) > ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ > (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9)) > > (note that a wise people can find a general formula for Cn based on Cn-1 > but that is another (mathematical) story....) > > i'm computing C13 but i see that it consumes 12gb out of 16gb of my system > ! and stopped on the linux box : > GC Warning: Repeated allocation of very large block (appr. size 510935040): > May lead to memory leak and poor performance > Processus arrêté > but still running on the Mac laptop...ok will see that but i think that > the data set is now huge and there is noting that can be done with that... > > note that there is still an hash table used in > function-unify-two-minterms-and-tag and i will use another data structure > and algo to avoid that, i feared that the concurrent access to hash table > can cause the guile 'future' to crash or fails or to slow down the process > but not! result is incredibly fast.Also i use continuation and i read it > can cause problem with 'future' i will improve that too... > > I will see where i can improve the algo and data structure to optimize > again but this is already really good. > > Thanks > > Damien > > > On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl < > zelphirkaltstahl@posteo.de> wrote: > >> Hello Damien, >> On 10/14/22 10:21, Damien Mattei wrote: >> >> yes Zelphir i think there is a problem in the threading part of guile, >> whatever the code is ,it run well the first time, and after is unstable. >> Long ago at the very begin of Java language on my master project at >> university i experienced same problem with Java "green" threads, under >> windows and/or linux ,i can't remember. >> >> I'm trying your code and future, which have perheaps also the portability >> with other schemes, future can provide "light" // , with carefull code it >> could be ok. 
>> >> in your segment.scm code ,is segment-count possibly replaced by the >> number of available CPUs or nodes, if i understand well? >> >> Regards, >> Damien >> >> Correct. For each segment one future is used. >> >> However, there are scenarios, where one would rather want to interleave >> the numbers of the whole range, to have a "fairer" workload per future, for >> example: >> >> (1 2 3 4 5 6 7 8 9) >> >> -> (1 4 7), (2 5 8), (3 6 9) >> >> instead of >> >> -> (1 2 3) (4 5 6) (7 8 9) >> >> (I seem to remember this to be the case for Mandelbrot set calculations, >> but might be wrong.) >> >> But that is all a matter of forming some segments and what a segment is, >> not really part of the logic of creating futures. So one could create then >> in any way that fits ones use-case. I have not generalized that segment >> logic that far, but it is doable. >> >> Anyway, if anyone more knowledgeable could chime in on what the >> performance differences between futures and parallel map are, that would be >> great! Or pointing out some flaws that this kind of future usage might >> have. Or when the point is reached to switch to guile-fibers library. >> >> Regards, >> Zelphir >> >> -- >> repositories: https://notabug.org/ZelphirKaltstahl >> >> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-10-22 16:01 ` Damien Mattei @ 2022-10-23 1:06 ` Damien Mattei 2022-10-23 23:18 ` Zelphir Kaltstahl 0 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-10-23 1:06 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel [-- Attachment #1: Type: text/plain, Size: 14665 bytes --] after intense coding i finally got the good results, my assumption about the global variable hash table was true ,it is incompatible with 'future : the competition for the writing into the hash table breaks the code. If i do writing in hash table out of // region all is ok: a simple function to extract the hash table access: (define (tag-minterms i umt) (when umt {mt1 <+ (first {minterms-vector[i]})} {mt2 <+ (second {minterms-vector[i]})} {minterms-ht[mt1] <- #t} {minterms-ht[mt2] <- #t})) ;; proc to be called with futures (define (proc-unify-minterms-seg seg) {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms L))} ;;{function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))} {start <+ (segment-start seg)} {end <+ (segment-end seg)} (for ({i <+ start} {i <= end} {i <- {i + 1}}) {mtL <+ {minterms-vector[i]}} (nodebug (dv mtL)) {unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)})) (declare minterms-vector unified-minterms-vector-1) https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L2893 and the call to the code that use hash table is now completely out of the // region: (debug (display-nl "before //")) (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code (debug (display-nl "after //")) (vector-for-each tag-minterms unified-minterms-vector-1) ;; tag the minterms in the hash table with a partitioned vector it works because one cell is not accessed by different threads other stuff like continuation seems to work with 'future, i wanted to rewrite code with call/cc in 'future but it was too long and apparently it works,any way i do not see why call/cc would cause trouble in a thread... i think 'futures are good if you use them carefully like in OpenMP regions... in other languages. 
the code is very very fast now i reached C15,in a few minutes on an Apple M1 scheme@(guile-user)> (current-processor-count) $1 = 8 C15 = ((B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14)) and it consumes more than 2Gb in memory: PID COMMAND %CPU TIME #TH #WQ #PORT MEM PURG CMPRS PGRP PPID STATE BOOSTS %CPU_ME 15081 guile 442.8 57:24.28 17/8 0 44 *2423M* 0B 118M 15081 10551 running *0[1] 0.0000 i checked a part of the logical computation with Mathematica online and it was ok but i will do a Python program to check the logical expression again and compare speed with Sympy... and for the space complexity i do not think it can be reduced good night all On Sat, Oct 22, 2022 at 6:01 PM Damien Mattei <damien.mattei@gmail.com> wrote: > just for the fun things, i just find that the parallelized results are > false :-) even for low indices: > the non // result: C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4)) > is different than the parallelized result below: > C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) > > i have to check things again ,i think futures are easy to use but limited > in solving acute // problems, such as using a global variable like an hash > table... 
> > Damien > > > > On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei <damien.mattei@gmail.com> > wrote: > >> Hello, >> >> sorry for my late answer ( i wrote the code friday but i could only debug >> today, being busy and on travel last week-end) >> >> i modified my code to works with 'futures' , the speed is now incredible >> !!! (and it computes exact) >> the speed seems multiplied by even more than the number of CPUs (6): >> cat /proc/cpuinfo | grep processor >> processor : 0 >> processor : 1 >> processor : 2 >> processor : 3 >> processor : 4 >> processor : 5 >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900 >> >> (declare minterms-vector unified-minterms-vector-1) >> >> (define (funct-unify-minterms-set-1-unit-future set1 set2) >> >> {function-unify-minterms-list <+ (λ (L) (apply >> function-unify-two-minterms-and-tag L))} >> >> ;; note : sorting is useless >> >> {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)} >> ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1 >> set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create >> list of pair of minterms >> >> {minterms-vector <- (list->vector minterms-set)} >> >> {minterms-vector-length <+ (vector-length minterms-vector)} >> >> {nb-procs <+ (current-processor-count)} >> >> {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; >> compute the segments >> >> {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)} >> >> (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel >> code >> >> {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)} >> >> {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} >> ;; remove #f results >> >> {unified-minterms-set <+ (remove-duplicates-sorted >> unified-minterms-set-2)} ;; uniq >> >> unified-minterms-set) >> >> i keeped your 'segment' definitions to index an array of data after >> convert it from list to vector (list->vector) which probably consume >> additional time on big data list ( i had more than 1000000 element list >> lengths at some point) >> >> i used a simplified version of run-in parallel (that do not do 'reduce >> after processing data): >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794 >> >> the computation on a segment is done by those functions: >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842 >> >> {function-unify-minterms-list <+ (λ (L) (apply >> function-unify-two-minterms-and-tag L))} >> >> ;; proc to be called with futures >> (define (proc-unify-minterms-seg seg) >> {start <+ (segment-start seg)} >> {end <+ (segment-end seg)} >> (for ({i <+ start} {i <= end} {i <- {i + 1}}) >> {mtL <+ {minterms-vector[i]}} >> {unified-minterms-vector-1[i] <- (function-unify-minterms-list >> mtL)})) >> >> >> i use those code in another program symbolically to compute a value named >> Cn: >> >> C0 = T >> C1 = T >> C2 = T >> C3 = (B1 ∨ B2) >> C4 = (B2 ∨ (B1 ∧ B3)) >> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4)) >> C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5)) >> C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ >> (B4 ∧ B5) ∨ (B5 ∧ B6)) >> C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ >> B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7)) >> C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ >> (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) 
∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ >> B8)) >> >> from C1 to C9 used to be fast,less than a minute for the whole with or >> without // >> >> C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ >> B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 >> ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9)) >> >> C10 takes a few minutes >> >> but C11 used to take one day before // with 'future' and i got now the >> result during less than one hour!!! >> >> C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ >> B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ >> B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9)) >> >> i never got C12 in the past ,even with 7 days of computation! and i got >> it today during the lunchbreak !!! >> >> C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 >> ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) >> ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ >> (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9)) >> >> (note that a wise people can find a general formula for Cn based on Cn-1 >> but that is another (mathematical) story....) >> >> i'm computing C13 but i see that it consumes 12gb out of 16gb of my >> system ! and stopped on the linux box : >> GC Warning: Repeated allocation of very large block (appr. size >> 510935040): >> May lead to memory leak and poor performance >> Processus arrêté >> but still running on the Mac laptop...ok will see that but i think that >> the data set is now huge and there is noting that can be done with that... >> >> note that there is still an hash table used in >> function-unify-two-minterms-and-tag and i will use another data structure >> and algo to avoid that, i feared that the concurrent access to hash table >> can cause the guile 'future' to crash or fails or to slow down the process >> but not! result is incredibly fast.Also i use continuation and i read it >> can cause problem with 'future' i will improve that too... >> >> I will see where i can improve the algo and data structure to optimize >> again but this is already really good. >> >> Thanks >> >> Damien >> >> >> On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl < >> zelphirkaltstahl@posteo.de> wrote: >> >>> Hello Damien, >>> On 10/14/22 10:21, Damien Mattei wrote: >>> >>> yes Zelphir i think there is a problem in the threading part of guile, >>> whatever the code is ,it run well the first time, and after is unstable. >>> Long ago at the very begin of Java language on my master project at >>> university i experienced same problem with Java "green" threads, under >>> windows and/or linux ,i can't remember. >>> >>> I'm trying your code and future, which have perheaps also the >>> portability with other schemes, future can provide "light" // , with >>> carefull code it could be ok. >>> >>> in your segment.scm code ,is segment-count possibly replaced by the >>> number of available CPUs or nodes, if i understand well? >>> >>> Regards, >>> Damien >>> >>> Correct. For each segment one future is used. >>> >>> However, there are scenarios, where one would rather want to interleave >>> the numbers of the whole range, to have a "fairer" workload per future, for >>> example: >>> >>> (1 2 3 4 5 6 7 8 9) >>> >>> -> (1 4 7), (2 5 8), (3 6 9) >>> >>> instead of >>> >>> -> (1 2 3) (4 5 6) (7 8 9) >>> >>> (I seem to remember this to be the case for Mandelbrot set calculations, >>> but might be wrong.) 
>>> >>> But that is all a matter of forming some segments and what a segment is, >>> not really part of the logic of creating futures. So one could create then >>> in any way that fits ones use-case. I have not generalized that segment >>> logic that far, but it is doable. >>> >>> Anyway, if anyone more knowledgeable could chime in on what the >>> performance differences between futures and parallel map are, that would be >>> great! Or pointing out some flaws that this kind of future usage might >>> have. Or when the point is reached to switch to guile-fibers library. >>> >>> Regards, >>> Zelphir >>> >>> -- >>> repositories: https://notabug.org/ZelphirKaltstahl >>> >>> [-- Attachment #2: Type: text/html, Size: 21674 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
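The two ways of splitting work that Zelphir describes above -- contiguous blocks versus interleaved indices -- can be sketched in a few lines of Guile. The helper names below are hypothetical; this is not the segment.scm code linked in the thread, only an illustration of the two partitioning strategies.

;; Contiguous blocks: (1 2 3 4 5 6 7 8 9) with 3 workers -> ((1 2 3) (4 5 6) (7 8 9))
(define (block-partition lst n)
  (let* ((len  (length lst))
         (size (max 1 (quotient (+ len n -1) n))))   ; ceiling of len/n
    (let loop ((rest lst) (acc '()))
      (if (null? rest)
          (reverse acc)
          (let ((k (min size (length rest))))
            (loop (list-tail rest k)
                  (cons (list-head rest k) acc)))))))

;; Interleaved: (1 2 3 4 5 6 7 8 9) with 3 workers -> ((1 4 7) (2 5 8) (3 6 9))
(define (interleave-partition lst n)
  (let ((vec (list->vector lst)))
    (map (lambda (k)
           (let loop ((i k) (acc '()))
             (if (>= i (vector-length vec))
                 (reverse acc)
                 (loop (+ i n) (cons (vector-ref vec i) acc)))))
         (iota n))))

Either result can then be handed to futures, one future per sublist; the interleaved split only matters when the cost per item varies systematically along the list.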
* Re: map-par slower than map 2022-10-23 1:06 ` Damien Mattei @ 2022-10-23 23:18 ` Zelphir Kaltstahl 2022-10-24 3:56 ` Keith Wright 2022-10-24 4:39 ` Damien Mattei 0 siblings, 2 replies; 41+ messages in thread From: Zelphir Kaltstahl @ 2022-10-23 23:18 UTC (permalink / raw) To: Damien Mattei; +Cc: guile-user Hello Damien! On 10/23/22 03:06, Damien Mattei wrote: > after intense coding i finally got the good results, > my assumption about the global variable hash table was true ,it is > incompatible with 'future : the competition for the writing into the hash > table breaks the code. > > If i do writing in hash table out of // region all is ok: > > a simple function to extract the hash table access: > > (define (tag-minterms i umt) > (when umt > {mt1 <+ (first {minterms-vector[i]})} > {mt2 <+ (second {minterms-vector[i]})} > {minterms-ht[mt1] <- #t} > {minterms-ht[mt2] <- #t})) > [...] I am not sure what exactly the problem is, which you are trying to calculate, but it looks fairly mathematical to me. Maybe you do not need to share state globally at all? Maybe you can find a way to avoid global shared state? I am guessing, that you want to avoid duplicated computation of part terms? Of course,if you have global state and do not have a synchronization construct (!) for accessing the hash table, I would expect things to go wrong at some point, with non-reproducible results. I do not think that futures are to blame here, or parallel map in that case. With a synchronization construct, some kind of mutex, your bottle neck might just become that mutex. Up to you to measure that ; ) Would be nice, if you found a clever way to separate the problems into disjunct parts and then solve them without global state. Regards, Zelphir -- repositories:https://notabug.org/ZelphirKaltstahl ^ permalink raw reply [flat|nested] 41+ messages in thread
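For readers wondering what the synchronization construct mentioned above would look like in Guile: (ice-9 threads) provides mutexes, and wrapping every hash-table write in with-mutex makes concurrent writers safe, at the cost Zelphir notes -- the mutex itself can become the bottleneck when the per-item work is small. A minimal sketch, with a hypothetical table and hypothetical work items (not the thread's actual code):

(use-modules (ice-9 threads)    ; make-mutex, with-mutex
             (ice-9 futures))   ; future, touch

(define minterms-ht (make-hash-table))
(define ht-mutex (make-mutex))

;; Safe write from any future or thread: only one writer holds the lock at a time.
(define (tag-minterm! mt)
  (with-mutex ht-mutex
    (hash-set! minterms-ht mt #t)))

;; Hypothetical use inside parallel work:
(define fs
  (map (lambda (mt) (future (tag-minterm! mt)))
       '((1 0 1 0) (0 1 1 0) (0 1 0 1))))
(for-each touch fs)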
* Re: map-par slower than map 2022-10-23 23:18 ` Zelphir Kaltstahl @ 2022-10-24 3:56 ` Keith Wright 2022-10-24 7:03 ` Zelphir Kaltstahl 2022-10-24 4:39 ` Damien Mattei 1 sibling, 1 reply; 41+ messages in thread From: Keith Wright @ 2022-10-24 3:56 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: damien.mattei, guile-user Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> writes: > Of course,if you have global state and do not have a synchronization > construct for accessing the hash table, I would expect things to > go wrong at some point, with non-reproducible results. Indeed. > I do not think that futures are to blame here, > or parallel map in that case. Are futures to blame? Do you blame the plus sign for this wrong equation: 4+5=20? My teachers blamed me for that sort of thing. > With a synchronization construct, some kind of mutex, > your bottle neck might just become that mutex. But why make it parallel at all? Unless you have some serious parallel hardware you will just be wasting time in the scheduler when you could be computing the answer. In other news --- there are many kinds of people on this mailing list. I wonder what percentage are the kind that needs page-long disjunctive normal form expressions in their mail boxes? Not me. -- Keith PS: For page-long boolean expression, they are nice. But why? ^ permalink raw reply [flat|nested] 41+ messages in thread
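One cheap way to act on that remark is to parallelize only when more than one core is actually available, and otherwise avoid the scheduler entirely. A minimal sketch, assuming a hypothetical work-on procedure:

(use-modules (ice-9 threads))   ; current-processor-count, par-map

;; Hypothetical: fall back to plain map on a single-core machine,
;; where parallel scheduling would only add overhead.
(define (maybe-par-map work-on items)
  (if (> (current-processor-count) 1)
      (par-map work-on items)
      (map work-on items)))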
* Re: map-par slower than map 2022-10-24 3:56 ` Keith Wright @ 2022-10-24 7:03 ` Zelphir Kaltstahl 0 siblings, 0 replies; 41+ messages in thread From: Zelphir Kaltstahl @ 2022-10-24 7:03 UTC (permalink / raw) To: Keith Wright; +Cc: damien.mattei, guile-user Hello Keith! On 10/24/22 05:56, Keith Wright wrote: > Zelphir Kaltstahl <zelphirkaltstahl@posteo.de> writes: > >> Of course,if you have global state and do not have a synchronization >> construct for accessing the hash table, I would expect things to >> go wrong at some point, with non-reproducible results. > Indeed. > >> I do not think that futures are to blame here, >> or parallel map in that case. > Are futures to blame? Do you blame the plus sign for > this wrong equation: 4+5=20? > My teachers blamed me for that sort of thing. I do not know your teachers, but in my experience futures work well in Guile : ) Blaming a construct like futures for the results of using shared global state seems really unreasonable. >> With a synchronization construct, some kind of mutex, >> your bottle neck might just become that mutex. > But why make it parallel at all? Unless you have some serious > parallel hardware you will just be wasting time in the scheduler > when you could be computing the answer. Assuming, that parallelization is possible and a problem is not inherently of sequential nature, of course there is hope for a speedup to be achieved. When parallelizing properly (which might involve finding a different algorithm) there might not be any resource conflict between the futures or OS threads running, so time will not be "time wasted in the scheduler". According to what Damien has written the parallelization has already sped up his calculations a lot. I think it is fair to assume, that it is at least partly working : ) I agree though, that pages of boolean expressions are not really that interesting to have in the mailbox ; ) Regards, Zelphir -- repositories: https://notabug.org/ZelphirKaltstahl ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-10-23 23:18 ` Zelphir Kaltstahl 2022-10-24 3:56 ` Keith Wright @ 2022-10-24 4:39 ` Damien Mattei 1 sibling, 0 replies; 41+ messages in thread From: Damien Mattei @ 2022-10-24 4:39 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user Hello Zelphir, i should had written first about the mathematical problem first but i did not want to go into other than computer things. But i expose here the main idea briefly: i try to solve a mathematical conjecture ( https://en.wikipedia.org/wiki/Conjecture ) using logic expressions but not in the common sense of Boole's algebra, at some point i shift to Probability logic and i need to simplify, minimalize logic expressions in disjunctive form, this is my idea. I will give a few link for the curious and i hope to publish something in the next year. At the beginning it is a computer problem well know in logic: the minterms of the hash table i use are described here: https://en.wikipedia.org/wiki/Canonical_normal_form#Minterms and the algorithms are here: https://en.wikipedia.org/wiki/Quine%E2%80%93McCluskey_algorithm https://en.wikipedia.org/wiki/Petrick%27s_method The minterms come from a set product : https://en.wikipedia.org/wiki/Cartesian_product (space complexity is polynomial) but the whole logic problem is an NP-hard problem so i do not expect it to be fast for more than 10 variables, it works well on one processor, and i'm now checking it on multiple core. I could have used Mathematica or Python sympy but i choosed many years ago to build a system in Scheme because there is no free software, Maxima does not support well logic (there was a no more supported module for Zhegalkin polynomials https://en.wikipedia.org/wiki/Zhegalkin_polynomial ) i did not want to use a commercial product such as Mathematica and i did not know Python Sympy ten years ago... but now i will use it to cross-check my Scheme results. I hope to do that this week, if it is ok i will not modify the Scheme code any more. The hash table was the more obvious structure to use due to the nature of algoritms here, i can not use arrays. I did not expected the compiler to solve the concurrent access to the hash table, i know i was going into problems, and i solve the problem with arrays in the // region and put back data after in the hash table in a non // region.It is okay now. For the one interested (i apologize because this is out of subject in the Guile mailing list) here is a few interesting links about Probability and Logic: Theodore Hailperin wrote the best articles in my opinion: https://projecteuclid.org/journals/notre-dame-journal-of-formal-logic/volume-25/issue-3/Probability-logic/10.1305/ndjfl/1093870625.full https://plato.stanford.edu/entries/logic-probability/ https://www.amazon.fr/Booles-logic-probability-contemporary-foundations/dp/0444110372 Best regards, Damien On Mon, Oct 24, 2022 at 1:18 AM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hello Damien! > On 10/23/22 03:06, Damien Mattei wrote: > > after intense coding i finally got the good results, > my assumption about the global variable hash table was true ,it is > incompatible with 'future : the competition for the writing into the hash > table breaks the code. > > If i do writing in hash table out of // region all is ok: > > a simple function to extract the hash table access: > > (define (tag-minterms i umt) > (when umt > {mt1 <+ (first {minterms-vector[i]})} > {mt2 <+ (second {minterms-vector[i]})} > {minterms-ht[mt1] <- #t} > {minterms-ht[mt2] <- #t})) > [...] 
> > I am not sure what exactly the problem is, which you are trying to > calculate, but it looks fairly mathematical to me. Maybe you do not need to > share state globally at all? Maybe you can find a way to avoid global > shared state? I am guessing, that you want to avoid duplicated computation > of part terms? > > Of course,if you have global state and do not have a synchronization > construct (!) for accessing the hash table, I would expect things to go > wrong at some point, with non-reproducible results. I do not think that > futures are to blame here, or parallel map in that case. > > With a synchronization construct, some kind of mutex, your bottle neck > might just become that mutex. Up to you to measure that ; ) > > Would be nice, if you found a clever way to separate the problems into > disjunct parts and then solve them without global state. > > Regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > > ^ permalink raw reply [flat|nested] 41+ messages in thread
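The fix Damien describes -- doing the per-pair work into plain vectors inside the parallel region and writing to the shared hash table only afterwards, from a single thread -- follows a pattern like the sketch below. All names and data here are hypothetical stand-ins, not the code from logiki+.scm:

(use-modules (ice-9 futures))

(define minterms-vector
  (list->vector '(((1 0 1 0) (1 1 1 0))
                  ((0 1 1 0) (1 1 0 1))
                  ((0 1 0 1) (1 0 1 1))
                  ((0 0 1 1) (0 1 1 1)))))

(define (unify pair) pair)              ; placeholder for the real unification

(define results (make-vector (vector-length minterms-vector) #f))
(define minterms-ht (make-hash-table))

;; Parallel region: each future writes only to its own slots of `results`,
;; so no locking is needed.
(define (work-on-range start end)
  (let loop ((i start))
    (when (<= i end)
      (vector-set! results i (unify (vector-ref minterms-vector i)))
      (loop (+ i 1)))))

(let* ((n   (vector-length minterms-vector))
       (mid (quotient (- n 1) 2))
       (f1  (make-future (lambda () (work-on-range 0 mid))))
       (f2  (make-future (lambda () (work-on-range (+ mid 1) (- n 1))))))
  (touch f1)
  (touch f2))

;; Sequential region: only one thread touches the shared hash table.
(let loop ((i 0))
  (when (< i (vector-length results))
    (let ((umt (vector-ref results i)))
      (when umt
        (let ((pair (vector-ref minterms-vector i)))
          (hash-set! minterms-ht (car pair)  #t)
          (hash-set! minterms-ht (cadr pair) #t))))
    (loop (+ i 1))))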
* Re: map-par slower than map 2022-10-12 21:29 ` Zelphir Kaltstahl 2022-10-14 8:21 ` Damien Mattei @ 2022-10-25 9:07 ` Mikael Djurfeldt 2022-10-25 9:11 ` Mikael Djurfeldt 2022-11-10 10:32 ` Damien Mattei 2 siblings, 1 reply; 41+ messages in thread From: Mikael Djurfeldt @ 2022-10-25 9:07 UTC (permalink / raw) To: Zelphir Kaltstahl Cc: Damien Mattei, guile-user, guile-devel, Mikael Djurfeldt [-- Attachment #1: Type: text/plain, Size: 9302 bytes --] A piece of background on par-map: When I introduced par-map et al the only ambition was to have simple language constructs to invoke parallelism. The use case I had in mind was course grained parallelism where each piece of work is somewhat substantial. Back then, a thread was launched for each piece of work, however there was also a thread pool such that not all of the overhead of launching new threads always was required. Since then, par-map has been rewritten (by others) to be based on futures. (And now the thread pool is localized in the futures implementation---as "workers".) Looking in the code now, I think it is fair to say that it is still intended for coarse grained parallelism. There is some heavy lifting going on with mutexes and condition variables as well as shuffling around with list pairs. So, applying par-map on a huge list with small amount of work per item was never and still isn't the intended use case. It would of course be interesting if the code could be improved to support fine grained parallelism. :-) Best regards, Mikael On Wed, Oct 12, 2022 at 11:30 PM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hi! > > On 10/12/22 22:27, Damien Mattei wrote: > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 > > > > i commited the current version of code here with all files but it is > > huge.... :-/ > > > > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com> > > wrote: > > > >> Mutex? 
i do not think code has situation where dead lock could happen, > it > >> is a code about minimalising logic expressions, it uses minterms , > minterms > >> set is a set of minterms :like this: > >> > >> example: > >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) > >> because 0 and 1 are replaced by x > >> the minterms-set could have thousands of pair (mathematic not lisp) > >> minterms to unify > >> if there is more than one x as result there is no need to continue so i > >> escape with a continuation: > >> > >> minterms-set = > >> { > >> ((1 0 1 0) (1 1 1 0)) > >> ((1 0 1 0) (1 1 0 1)) > >> ((1 0 1 0) (1 0 1 1)) > >> ((1 0 1 0) (0 1 1 1)) > >> ((0 1 1 0) (1 1 1 0)) > >> ((0 1 1 0) (1 1 0 1)) > >> ((0 1 1 0) (1 0 1 1)) > >> ((0 1 1 0) (0 1 1 1)) > >> ((0 1 0 1) (1 1 1 0)) > >> ((0 1 0 1) (1 1 0 1)) > >> ((0 1 0 1) (1 0 1 1)) > >> ((0 1 0 1) (0 1 1 1)) > >> ((0 0 1 1) (1 1 1 0)) > >> ((0 0 1 1) (1 1 0 1)) > >> ((0 0 1 1) (1 0 1 1)) > >> ((0 0 1 1) (0 1 1 1)) > >> } > >> > >> replace { } by () to have the list, other example at another level : > >> > >> minterms-set = > >> { > >> ((0 x 1 1) (x 1 1 1)) > >> ((0 x 1 1) (1 x 1 1)) > >> ((0 x 1 1) (1 1 x 1)) > >> ((0 x 1 1) (1 1 1 x)) > >> ((x 0 1 1) (x 1 1 1)) > >> ((x 0 1 1) (1 x 1 1)) > >> ((x 0 1 1) (1 1 x 1)) > >> ((x 0 1 1) (1 1 1 x)) > >> ((0 1 x 1) (x 1 1 1)) > >> ((0 1 x 1) (1 x 1 1)) > >> ((0 1 x 1) (1 1 x 1)) > >> ((0 1 x 1) (1 1 1 x)) > >> ((x 1 0 1) (x 1 1 1)) > >> ((x 1 0 1) (1 x 1 1)) > >> ((x 1 0 1) (1 1 x 1)) > >> ((x 1 0 1) (1 1 1 x)) > >> ((0 1 1 x) (x 1 1 1)) > >> ((0 1 1 x) (1 x 1 1)) > >> ((0 1 1 x) (1 1 x 1)) > >> ((0 1 1 x) (1 1 1 x)) > >> ((x 1 1 0) (x 1 1 1)) > >> ((x 1 1 0) (1 x 1 1)) > >> ((x 1 1 0) (1 1 x 1)) > >> ((x 1 1 0) (1 1 1 x)) > >> ((1 0 1 x) (x 1 1 1)) > >> ((1 0 1 x) (1 x 1 1)) > >> ((1 0 1 x) (1 1 x 1)) > >> ((1 0 1 x) (1 1 1 x)) > >> ((1 x 1 0) (x 1 1 1)) > >> ((1 x 1 0) (1 x 1 1)) > >> ((1 x 1 0) (1 1 x 1)) > >> ((1 x 1 0) (1 1 1 x)) > >> } > >> > >> here we see some minterms are already unified > >> > >> it is not easy to read even by me because i wrote the code many years > ago > >> and is split in many files, but here it is: > >> > >> (par-map function-unify-minterms-list minterms-set) > >> > >> {function-unify-minterms-list <+ (λ (L) (apply > >> function-unify-two-minterms-and-tag L))} > >> > >> (define (unify-two-minterms mt1 mt2) > >> (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) > >> > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) > '(1 > >> 1 0 1 1 1 1 1)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 1)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; #f > >> ;; > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) > '(1 > >> 1 0 1 1 1 1 0)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 0)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; '(1 1 0 1 x 1 1 0) > >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . > >> more-lists) > >> (call/cc (lambda (kontinuation) > >> (let ((lists (cons list1 more-lists)) > >> (funct-continu ;; this function have the kontinuation in his > environment > >> (lambda (arg1 . 
more-args) > >> (let ((args (cons arg1 more-args))) > >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons > >> conti args)) > >> > >> ;; (newline) > >> ;; (dv list1) > >> ;; (dv more-lists) > >> ;; (dv lists) > >> ;; (dv clozure) > >> ;; (newline) > >> > >> (apply map funct-continu lists))))) > >> > >> (define-syntax macro-function-compare-2-bits-with-continuation ;; > >> continuation version of macro-compare-2-bits > >> ;; i need a macro because of external function to the clozure > >> (syntax-rules () > >> ((_) (let ((cnt 0)) ;; counter > >> (lambda (continuation b1 b2) (if (equal? b1 b2) > >> b1 > >> (begin > >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > > 1, we > >> can have used a flag too instead of a counter > >> (when (> cnt 1) (continuation #f)) ;; escaping with the continuation > >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) > >> > >> what could have caused mutex if in the latter definition above (let > ((cnt > >> 0)) ;; counter was defined at top level and shared by all threads!!! yes > >> there could have be some mutex but this is not the case, i think even > all > >> function are pure so why is it more slow with // than without? > >> Damien > >> > >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> > >> wrote: > >> > >>> On 12-10-2022 19:19, Damien Mattei wrote: > >>>> Hello, > >>>> all is in the title, i test on a approximately 30000 element list , i > >>> got > >>>> 9s with map and 3min 30s with par-map on exactly the same piece of > >>> code!? > >>> > [...] > >>> > > >>>> translated from Scheme+ to Scheme: > >>>> (define unified-minterms-set-1 (map function-unify-minterms-list > >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) > >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is > >>> missing. Without a test case, we can only speculate what's going on. > >>> (E.g., maybe it grabs a mutex). > >>> > >>> Greetings, > >>> Maxime. > I don't want to scare anyone, just maybe warn about parallel map. I once > tried > to use Guile's parallel map function for a decision tree implementation > ( > https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), > > where each branch while learning the tree would call parallel map again > for sub > branches and so on. Somehow it made Guile crash (I don't have the error > message > any longer, but I did post about it on the mailing list back then.). I > never > figured out, what went wrong. All I had was pure function calls and math > inside > the thing that parallel map was supposed to run. > > Ultimately I simply tried other parallelism constructs and when I switched > to > using futures instead, everything worked fine, no crashes, no errors. > > Since that time, I did not use parallel map and instead used futures. > Recently I > made a parallelization thing for solving exercises of Project Euler using > multiple cores, so that some solutions are calculated faster. Maybe this > can > help or can be adapted to another use case: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 > > It expects ranges of things, which are called `segments` in the code. > Usually > ranges of numbers for Project Euler things. 
Here is the code to split a > range > into segments: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm > > (Check any solution using it for an example.) > > So this might be a bit too specific for general parallel things, but I > guess one > could change the way futures are used in `run-in-parallel`, to fit any > other > purpose. > > Best regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > > > [-- Attachment #2: Type: text/html, Size: 12202 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
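Mikael's point suggests a simple workaround when the per-item work is cheap: hand par-map a handful of large chunks (one per core) instead of a million tiny elements, so the per-element locking overhead is amortized. A minimal sketch, assuming a hypothetical unify-pair procedure and the same contiguous chunking idea sketched earlier in this thread:

(use-modules (ice-9 threads))   ; par-map, current-processor-count

(define (unify-pair p) p)       ; hypothetical per-item work

;; Split LST into N contiguous chunks (the last chunk may be shorter).
(define (chunk lst n)
  (let* ((len  (length lst))
         (size (max 1 (quotient (+ len n -1) n))))
    (let loop ((rest lst) (acc '()))
      (if (null? rest)
          (reverse acc)
          (let ((k (min size (length rest))))
            (loop (list-tail rest k)
                  (cons (list-head rest k) acc)))))))

;; One par-map element per core, each doing a plain sequential map inside.
(define (coarse-par-map proc lst)
  (apply append
         (par-map (lambda (part) (map proc part))
                  (chunk lst (current-processor-count)))))

;; (coarse-par-map unify-pair minterms-set)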
* Re: map-par slower than map 2022-10-25 9:07 ` Mikael Djurfeldt @ 2022-10-25 9:11 ` Mikael Djurfeldt 2022-10-25 14:09 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Mikael Djurfeldt @ 2022-10-25 9:11 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: Damien Mattei, guile-user, guile-devel [-- Attachment #1: Type: text/plain, Size: 9889 bytes --] Also, I would believe that any crashes in this context are neither due to the futures implementation nor par-map et al. I would think that crashes are due to the Guile basic thread support itself. On Tue, Oct 25, 2022 at 11:07 AM Mikael Djurfeldt <mikael@djurfeldt.com> wrote: > A piece of background on par-map: > > When I introduced par-map et al the only ambition was to have simple > language constructs to invoke parallelism. The use case I had in mind was > course grained parallelism where each piece of work is somewhat > substantial. Back then, a thread was launched for each piece of work, > however there was also a thread pool such that not all of the overhead of > launching new threads always was required. > > Since then, par-map has been rewritten (by others) to be based on futures. > (And now the thread pool is localized in the futures implementation---as > "workers".) Looking in the code now, I think it is fair to say that it is > still intended for coarse grained parallelism. There is some heavy lifting > going on with mutexes and condition variables as well as shuffling around > with list pairs. > > So, applying par-map on a huge list with small amount of work per item was > never and still isn't the intended use case. > > It would of course be interesting if the code could be improved to support > fine grained parallelism. :-) > > Best regards, > Mikael > > On Wed, Oct 12, 2022 at 11:30 PM Zelphir Kaltstahl < > zelphirkaltstahl@posteo.de> wrote: > >> Hi! >> >> On 10/12/22 22:27, Damien Mattei wrote: >> > >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 >> > >> > i commited the current version of code here with all files but it is >> > huge.... :-/ >> > >> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com >> > >> > wrote: >> > >> >> Mutex? 
i do not think code has situation where dead lock could happen, >> it >> >> is a code about minimalising logic expressions, it uses minterms , >> minterms >> >> set is a set of minterms :like this: >> >> >> >> example: >> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) >> >> because 0 and 1 are replaced by x >> >> the minterms-set could have thousands of pair (mathematic not lisp) >> >> minterms to unify >> >> if there is more than one x as result there is no need to continue so i >> >> escape with a continuation: >> >> >> >> minterms-set = >> >> { >> >> ((1 0 1 0) (1 1 1 0)) >> >> ((1 0 1 0) (1 1 0 1)) >> >> ((1 0 1 0) (1 0 1 1)) >> >> ((1 0 1 0) (0 1 1 1)) >> >> ((0 1 1 0) (1 1 1 0)) >> >> ((0 1 1 0) (1 1 0 1)) >> >> ((0 1 1 0) (1 0 1 1)) >> >> ((0 1 1 0) (0 1 1 1)) >> >> ((0 1 0 1) (1 1 1 0)) >> >> ((0 1 0 1) (1 1 0 1)) >> >> ((0 1 0 1) (1 0 1 1)) >> >> ((0 1 0 1) (0 1 1 1)) >> >> ((0 0 1 1) (1 1 1 0)) >> >> ((0 0 1 1) (1 1 0 1)) >> >> ((0 0 1 1) (1 0 1 1)) >> >> ((0 0 1 1) (0 1 1 1)) >> >> } >> >> >> >> replace { } by () to have the list, other example at another level : >> >> >> >> minterms-set = >> >> { >> >> ((0 x 1 1) (x 1 1 1)) >> >> ((0 x 1 1) (1 x 1 1)) >> >> ((0 x 1 1) (1 1 x 1)) >> >> ((0 x 1 1) (1 1 1 x)) >> >> ((x 0 1 1) (x 1 1 1)) >> >> ((x 0 1 1) (1 x 1 1)) >> >> ((x 0 1 1) (1 1 x 1)) >> >> ((x 0 1 1) (1 1 1 x)) >> >> ((0 1 x 1) (x 1 1 1)) >> >> ((0 1 x 1) (1 x 1 1)) >> >> ((0 1 x 1) (1 1 x 1)) >> >> ((0 1 x 1) (1 1 1 x)) >> >> ((x 1 0 1) (x 1 1 1)) >> >> ((x 1 0 1) (1 x 1 1)) >> >> ((x 1 0 1) (1 1 x 1)) >> >> ((x 1 0 1) (1 1 1 x)) >> >> ((0 1 1 x) (x 1 1 1)) >> >> ((0 1 1 x) (1 x 1 1)) >> >> ((0 1 1 x) (1 1 x 1)) >> >> ((0 1 1 x) (1 1 1 x)) >> >> ((x 1 1 0) (x 1 1 1)) >> >> ((x 1 1 0) (1 x 1 1)) >> >> ((x 1 1 0) (1 1 x 1)) >> >> ((x 1 1 0) (1 1 1 x)) >> >> ((1 0 1 x) (x 1 1 1)) >> >> ((1 0 1 x) (1 x 1 1)) >> >> ((1 0 1 x) (1 1 x 1)) >> >> ((1 0 1 x) (1 1 1 x)) >> >> ((1 x 1 0) (x 1 1 1)) >> >> ((1 x 1 0) (1 x 1 1)) >> >> ((1 x 1 0) (1 1 x 1)) >> >> ((1 x 1 0) (1 1 1 x)) >> >> } >> >> >> >> here we see some minterms are already unified >> >> >> >> it is not easy to read even by me because i wrote the code many >> years ago >> >> and is split in many files, but here it is: >> >> >> >> (par-map function-unify-minterms-list minterms-set) >> >> >> >> {function-unify-minterms-list <+ (λ (L) (apply >> >> function-unify-two-minterms-and-tag L))} >> >> >> >> (define (unify-two-minterms mt1 mt2) >> >> (function-map-with-escaping-by-kontinuation2 >> >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) >> >> >> >> ;; (function-map-with-escaping-by-kontinuation2 >> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) >> '(1 >> >> 1 0 1 1 1 1 1)) >> >> >> >> ;; list1 = (1 1 0 1 0 1 1 0) >> >> ;; more-lists = ((1 1 0 1 1 1 1 1)) >> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) >> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >> >> >> >> ;; #f >> >> ;; >> >> ;; (function-map-with-escaping-by-kontinuation2 >> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 >> 0) '(1 >> >> 1 0 1 1 1 1 0)) >> >> >> >> ;; list1 = (1 1 0 1 0 1 1 0) >> >> ;; more-lists = ((1 1 0 1 1 1 1 0)) >> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) >> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >> >> >> >> ;; '(1 1 0 1 x 1 1 0) >> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . 
>> >> more-lists) >> >> (call/cc (lambda (kontinuation) >> >> (let ((lists (cons list1 more-lists)) >> >> (funct-continu ;; this function have the kontinuation in his >> environment >> >> (lambda (arg1 . more-args) >> >> (let ((args (cons arg1 more-args))) >> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons >> >> conti args)) >> >> >> >> ;; (newline) >> >> ;; (dv list1) >> >> ;; (dv more-lists) >> >> ;; (dv lists) >> >> ;; (dv clozure) >> >> ;; (newline) >> >> >> >> (apply map funct-continu lists))))) >> >> >> >> (define-syntax macro-function-compare-2-bits-with-continuation ;; >> >> continuation version of macro-compare-2-bits >> >> ;; i need a macro because of external function to the clozure >> >> (syntax-rules () >> >> ((_) (let ((cnt 0)) ;; counter >> >> (lambda (continuation b1 b2) (if (equal? b1 b2) >> >> b1 >> >> (begin >> >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > >> 1, we >> >> can have used a flag too instead of a counter >> >> (when (> cnt 1) (continuation #f)) ;; escaping with the >> continuation >> >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) >> >> >> >> what could have caused mutex if in the latter definition above (let >> ((cnt >> >> 0)) ;; counter was defined at top level and shared by all threads!!! >> yes >> >> there could have be some mutex but this is not the case, i think even >> all >> >> function are pure so why is it more slow with // than without? >> >> Damien >> >> >> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> >> >> wrote: >> >> >> >>> On 12-10-2022 19:19, Damien Mattei wrote: >> >>>> Hello, >> >>>> all is in the title, i test on a approximately 30000 element list , i >> >>> got >> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of >> >>> code!? >> >>> > [...] >> >>> > >> >>>> translated from Scheme+ to Scheme: >> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list >> >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) >> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is >> >>> missing. Without a test case, we can only speculate what's going on. >> >>> (E.g., maybe it grabs a mutex). >> >>> >> >>> Greetings, >> >>> Maxime. >> I don't want to scare anyone, just maybe warn about parallel map. I once >> tried >> to use Guile's parallel map function for a decision tree implementation >> ( >> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), >> >> where each branch while learning the tree would call parallel map again >> for sub >> branches and so on. Somehow it made Guile crash (I don't have the error >> message >> any longer, but I did post about it on the mailing list back then.). I >> never >> figured out, what went wrong. All I had was pure function calls and math >> inside >> the thing that parallel map was supposed to run. >> >> Ultimately I simply tried other parallelism constructs and when I >> switched to >> using futures instead, everything worked fine, no crashes, no errors. >> >> Since that time, I did not use parallel map and instead used futures. >> Recently I >> made a parallelization thing for solving exercises of Project Euler using >> multiple cores, so that some solutions are calculated faster. 
Maybe this >> can >> help or can be adapted to another use case: >> >> >> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 >> >> It expects ranges of things, which are called `segments` in the code. >> Usually >> ranges of numbers for Project Euler things. Here is the code to split a >> range >> into segments: >> >> >> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm >> >> (Check any solution using it for an example.) >> >> So this might be a bit too specific for general parallel things, but I >> guess one >> could change the way futures are used in `run-in-parallel`, to fit any >> other >> purpose. >> >> Best regards, >> Zelphir >> >> -- >> repositories: https://notabug.org/ZelphirKaltstahl >> >> >> [-- Attachment #2: Type: text/html, Size: 12781 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-10-25 9:11 ` Mikael Djurfeldt @ 2022-10-25 14:09 ` Damien Mattei 0 siblings, 0 replies; 41+ messages in thread From: Damien Mattei @ 2022-10-25 14:09 UTC (permalink / raw) To: mikael; +Cc: Zelphir Kaltstahl, guile-user, guile-devel [-- Attachment #1: Type: text/plain, Size: 12643 bytes --] Hello Mikael, your message comes very well because i decided yesterday,partly due to the discussion in mailing list, to review all the // method used in my code and to use them to perform benchmark to see which one is preferable to use with the algorithmic problem of putting a logic expression in its canonical minimal disjunctive or conjunctive form. I used 4 methods: -sequential (map with no //) -par-map (//) -threads (//) -futures (//) But, of course, the data structure is not the same now, it will be lists with sequential and par-map and vectors with threads and futures, as i already experimented problems (blocking, crashing,false results) with some (threads,futures) . Also i have solved the concurrent access problem to the hash table with lists or vectors depending if i use map or par-map or futures and threads, but the algorithm remains the same in all case. I computed from C1,C2,C3 ... to C12 (i do not write the logical expressions here, some peoples seeing those infix mathematical expression coming in the mailing list as drones in the peaceful sky of the lambda calculus expressions ;-) ) but here is the results: map: 10' 17" for C12 (5" for C10) par-map: 1'21" for C10, more than 2h 40' for C12 threads: blocks randomly before C7 (works but is not reentrant, perheaps a problem in the code i use with thread programming) futures:8" for C12, 25" for C13,2' for C14,7' for C15... the best performance are,from far, with futures and the code provided by Zelphir. on a 6 cores processor with 16Gb of memory.note at C15 computation the memory containing minterms and logical expression not simplified is about 3 Gb: PID UTIL. PR NI VIRT RES SHR S %CPU %MEM TEMPS+ COM. 61051 mattei 20 0 *3339388* 2,7g 22368 R 463,1 *17,2 * 33:27.92 *guile* cf : https://en.wikipedia.org/wiki/Quine%E2%80%93McCluskey_algorithm#Complexity what i will do is write a Python version of the computation of Cn that uses the Sympy minimal disjunctive normal form procedure of the Sympy library , first to check my scheme code result and perheaps compare the speed, i know python is slow but as it is a library (compiled? but written in python as far as i know?), well, i'm a bit nervous to compare the speed between Python and Scheme... Best regards, Damien On Tue, Oct 25, 2022 at 11:11 AM Mikael Djurfeldt <mikael@djurfeldt.com> wrote: > Also, I would believe that any crashes in this context are neither due to > the futures implementation nor par-map et al. I would think that crashes > are due to the Guile basic thread support itself. > > On Tue, Oct 25, 2022 at 11:07 AM Mikael Djurfeldt <mikael@djurfeldt.com> > wrote: > >> A piece of background on par-map: >> >> When I introduced par-map et al the only ambition was to have simple >> language constructs to invoke parallelism. The use case I had in mind was >> course grained parallelism where each piece of work is somewhat >> substantial. Back then, a thread was launched for each piece of work, >> however there was also a thread pool such that not all of the overhead of >> launching new threads always was required. >> >> Since then, par-map has been rewritten (by others) to be based on >> futures. 
(And now the thread pool is localized in the futures >> implementation---as "workers".) Looking in the code now, I think it is fair >> to say that it is still intended for coarse grained parallelism. There is >> some heavy lifting going on with mutexes and condition variables as well as >> shuffling around with list pairs. >> >> So, applying par-map on a huge list with small amount of work per item >> was never and still isn't the intended use case. >> >> It would of course be interesting if the code could be improved to >> support fine grained parallelism. :-) >> >> Best regards, >> Mikael >> >> On Wed, Oct 12, 2022 at 11:30 PM Zelphir Kaltstahl < >> zelphirkaltstahl@posteo.de> wrote: >> >>> Hi! >>> >>> On 10/12/22 22:27, Damien Mattei wrote: >>> > >>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 >>> > >>> > i commited the current version of code here with all files but it is >>> > huge.... :-/ >>> > >>> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei < >>> damien.mattei@gmail.com> >>> > wrote: >>> > >>> >> Mutex? i do not think code has situation where dead lock could >>> happen, it >>> >> is a code about minimalising logic expressions, it uses minterms , >>> minterms >>> >> set is a set of minterms :like this: >>> >> >>> >> example: >>> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) >>> >> because 0 and 1 are replaced by x >>> >> the minterms-set could have thousands of pair (mathematic not lisp) >>> >> minterms to unify >>> >> if there is more than one x as result there is no need to continue so >>> i >>> >> escape with a continuation: >>> >> >>> >> minterms-set = >>> >> { >>> >> ((1 0 1 0) (1 1 1 0)) >>> >> ((1 0 1 0) (1 1 0 1)) >>> >> ((1 0 1 0) (1 0 1 1)) >>> >> ((1 0 1 0) (0 1 1 1)) >>> >> ((0 1 1 0) (1 1 1 0)) >>> >> ((0 1 1 0) (1 1 0 1)) >>> >> ((0 1 1 0) (1 0 1 1)) >>> >> ((0 1 1 0) (0 1 1 1)) >>> >> ((0 1 0 1) (1 1 1 0)) >>> >> ((0 1 0 1) (1 1 0 1)) >>> >> ((0 1 0 1) (1 0 1 1)) >>> >> ((0 1 0 1) (0 1 1 1)) >>> >> ((0 0 1 1) (1 1 1 0)) >>> >> ((0 0 1 1) (1 1 0 1)) >>> >> ((0 0 1 1) (1 0 1 1)) >>> >> ((0 0 1 1) (0 1 1 1)) >>> >> } >>> >> >>> >> replace { } by () to have the list, other example at another level : >>> >> >>> >> minterms-set = >>> >> { >>> >> ((0 x 1 1) (x 1 1 1)) >>> >> ((0 x 1 1) (1 x 1 1)) >>> >> ((0 x 1 1) (1 1 x 1)) >>> >> ((0 x 1 1) (1 1 1 x)) >>> >> ((x 0 1 1) (x 1 1 1)) >>> >> ((x 0 1 1) (1 x 1 1)) >>> >> ((x 0 1 1) (1 1 x 1)) >>> >> ((x 0 1 1) (1 1 1 x)) >>> >> ((0 1 x 1) (x 1 1 1)) >>> >> ((0 1 x 1) (1 x 1 1)) >>> >> ((0 1 x 1) (1 1 x 1)) >>> >> ((0 1 x 1) (1 1 1 x)) >>> >> ((x 1 0 1) (x 1 1 1)) >>> >> ((x 1 0 1) (1 x 1 1)) >>> >> ((x 1 0 1) (1 1 x 1)) >>> >> ((x 1 0 1) (1 1 1 x)) >>> >> ((0 1 1 x) (x 1 1 1)) >>> >> ((0 1 1 x) (1 x 1 1)) >>> >> ((0 1 1 x) (1 1 x 1)) >>> >> ((0 1 1 x) (1 1 1 x)) >>> >> ((x 1 1 0) (x 1 1 1)) >>> >> ((x 1 1 0) (1 x 1 1)) >>> >> ((x 1 1 0) (1 1 x 1)) >>> >> ((x 1 1 0) (1 1 1 x)) >>> >> ((1 0 1 x) (x 1 1 1)) >>> >> ((1 0 1 x) (1 x 1 1)) >>> >> ((1 0 1 x) (1 1 x 1)) >>> >> ((1 0 1 x) (1 1 1 x)) >>> >> ((1 x 1 0) (x 1 1 1)) >>> >> ((1 x 1 0) (1 x 1 1)) >>> >> ((1 x 1 0) (1 1 x 1)) >>> >> ((1 x 1 0) (1 1 1 x)) >>> >> } >>> >> >>> >> here we see some minterms are already unified >>> >> >>> >> it is not easy to read even by me because i wrote the code many >>> years ago >>> >> and is split in many files, but here it is: >>> >> >>> >> (par-map function-unify-minterms-list minterms-set) >>> >> >>> >> {function-unify-minterms-list <+ (λ (L) (apply >>> >> function-unify-two-minterms-and-tag 
L))} >>> >> >>> >> (define (unify-two-minterms mt1 mt2) >>> >> (function-map-with-escaping-by-kontinuation2 >>> >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) >>> >> >>> >> ;; (function-map-with-escaping-by-kontinuation2 >>> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 >>> 0) '(1 >>> >> 1 0 1 1 1 1 1)) >>> >> >>> >> ;; list1 = (1 1 0 1 0 1 1 0) >>> >> ;; more-lists = ((1 1 0 1 1 1 1 1)) >>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) >>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >>> >> >>> >> ;; #f >>> >> ;; >>> >> ;; (function-map-with-escaping-by-kontinuation2 >>> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 >>> 0) '(1 >>> >> 1 0 1 1 1 1 0)) >>> >> >>> >> ;; list1 = (1 1 0 1 0 1 1 0) >>> >> ;; more-lists = ((1 1 0 1 1 1 1 0)) >>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) >>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >>> >> >>> >> ;; '(1 1 0 1 x 1 1 0) >>> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . >>> >> more-lists) >>> >> (call/cc (lambda (kontinuation) >>> >> (let ((lists (cons list1 more-lists)) >>> >> (funct-continu ;; this function have the kontinuation in his >>> environment >>> >> (lambda (arg1 . more-args) >>> >> (let ((args (cons arg1 more-args))) >>> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure >>> (cons >>> >> conti args)) >>> >> >>> >> ;; (newline) >>> >> ;; (dv list1) >>> >> ;; (dv more-lists) >>> >> ;; (dv lists) >>> >> ;; (dv clozure) >>> >> ;; (newline) >>> >> >>> >> (apply map funct-continu lists))))) >>> >> >>> >> (define-syntax macro-function-compare-2-bits-with-continuation ;; >>> >> continuation version of macro-compare-2-bits >>> >> ;; i need a macro because of external function to the clozure >>> >> (syntax-rules () >>> >> ((_) (let ((cnt 0)) ;; counter >>> >> (lambda (continuation b1 b2) (if (equal? b1 b2) >>> >> b1 >>> >> (begin >>> >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > >>> 1, we >>> >> can have used a flag too instead of a counter >>> >> (when (> cnt 1) (continuation #f)) ;; escaping with the >>> continuation >>> >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) >>> >> >>> >> what could have caused mutex if in the latter definition above (let >>> ((cnt >>> >> 0)) ;; counter was defined at top level and shared by all threads!!! >>> yes >>> >> there could have be some mutex but this is not the case, i think >>> even all >>> >> function are pure so why is it more slow with // than without? >>> >> Damien >>> >> >>> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> >>> >> wrote: >>> >> >>> >>> On 12-10-2022 19:19, Damien Mattei wrote: >>> >>>> Hello, >>> >>>> all is in the title, i test on a approximately 30000 element list , >>> i >>> >>> got >>> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of >>> >>> code!? >>> >>> > [...] >>> >>> > >>> >>>> translated from Scheme+ to Scheme: >>> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list >>> >>>> minterms-set)) ;;(par-map function-unify-minterms-list >>> minterms-set)) >>> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' >>> is >>> >>> missing. Without a test case, we can only speculate what's going on. >>> >>> (E.g., maybe it grabs a mutex). >>> >>> >>> >>> Greetings, >>> >>> Maxime. >>> I don't want to scare anyone, just maybe warn about parallel map. 
I once >>> tried >>> to use Guile's parallel map function for a decision tree implementation >>> ( >>> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), >>> >>> where each branch while learning the tree would call parallel map again >>> for sub >>> branches and so on. Somehow it made Guile crash (I don't have the error >>> message >>> any longer, but I did post about it on the mailing list back then.). I >>> never >>> figured out, what went wrong. All I had was pure function calls and math >>> inside >>> the thing that parallel map was supposed to run. >>> >>> Ultimately I simply tried other parallelism constructs and when I >>> switched to >>> using futures instead, everything worked fine, no crashes, no errors. >>> >>> Since that time, I did not use parallel map and instead used futures. >>> Recently I >>> made a parallelization thing for solving exercises of Project Euler >>> using >>> multiple cores, so that some solutions are calculated faster. Maybe this >>> can >>> help or can be adapted to another use case: >>> >>> >>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 >>> >>> It expects ranges of things, which are called `segments` in the code. >>> Usually >>> ranges of numbers for Project Euler things. Here is the code to split a >>> range >>> into segments: >>> >>> >>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm >>> >>> (Check any solution using it for an example.) >>> >>> So this might be a bit too specific for general parallel things, but I >>> guess one >>> could change the way futures are used in `run-in-parallel`, to fit any >>> other >>> purpose. >>> >>> Best regards, >>> Zelphir >>> >>> -- >>> repositories: https://notabug.org/ZelphirKaltstahl >>> >>> >>> [-- Attachment #2: Type: text/html, Size: 17537 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
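The timings quoted above are presumably wall-clock measurements; for anyone wanting to reproduce that kind of comparison, a minimal, generic timing helper in Guile looks like the sketch below. The two commented calls are hypothetical and only indicate how the map and futures variants would be compared.

(define (time-thunk label thunk)
  (let* ((t0     (get-internal-real-time))
         (result (thunk))
         (t1     (get-internal-real-time)))
    (display label)
    (display ": ")
    (display (exact->inexact (/ (- t1 t0) internal-time-units-per-second)))
    (display " s")
    (newline)
    result))

;; Hypothetical usage, comparing two variants of the same computation:
;; (time-thunk "map"     (lambda () (map function-unify-minterms-list minterms-set)))
;; (time-thunk "futures" (lambda () (funct-unify-minterms-set-1-unit-future set1 set2)))

At the REPL, the ,time meta-command gives similar numbers without any extra code.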
* Re: map-par slower than map 2022-10-12 21:29 ` Zelphir Kaltstahl 2022-10-14 8:21 ` Damien Mattei 2022-10-25 9:07 ` Mikael Djurfeldt @ 2022-11-10 10:32 ` Damien Mattei 2022-11-10 10:41 ` Damien Mattei 2022-11-10 17:07 ` Olivier Dion via General Guile related discussions 2 siblings, 2 replies; 41+ messages in thread From: Damien Mattei @ 2022-11-10 10:32 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user Hello Zelphir, i finally find a possible cause of no speed up of my code, i find that using your code the procedure keep blocked on the first 'touch at line 27 here: https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L27 if i add a 'display i got this output, see the second part ,i cut it waiting the rest of output , it is blockers on the first 'touch until it return ,after all the touch are fast as if all the job is done in the first 'touch unct-unify-minterms-set-1-unit-future : begin set1-length = 930 set2-length = 1270 before Cartesian product set after Cartesian product set minterms-set-length = 1181100 minterms-set-first = ((1 1 1 x x 0 0 0 0 1) (1 1 1 1 x x 0 0 0 1)) segmts = ((0 . 196850) (196851 . 393701) (393702 . 590552) (590553 . 787403) (787404 . 984254) (984255 . 1181099)) before // run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : touching future run-in-parallel : touching future run-in-parallel : touching future run-in-parallel : touching future run-in-parallel : touching future run-in-parallel : touching future after // unified-minterms-vector-1-length = 1181100 funct-unify-minterms-set-1-unit-future : end funct-unify-minterms-set-1-unit-future : begin set1-length = 1270 set2-length = 888 before Cartesian product set after Cartesian product set minterms-set-length = 1127760 minterms-set-first = ((1 1 1 1 x x 0 0 0 1) (1 1 1 1 1 x x 0 0 1)) segmts = ((0 . 187960) (187961 . 375921) (375922 . 563882) (563883 . 751843) (751844 . 939804) (939805 . 1127759)) before // run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : making future run-in-parallel : touching future blocking just above i find no explanation in Guile doc: Scheme Procedure: *touch* *f* Return the result of the expression embedded in future f. If the result was already computed in parallel, touch returns instantaneously. Otherwise, it waits for the computation to complete, if it already started, or initiates it. In the former case, the calling thread may process other futures in the meantime. perheaps 'map is not the good way to "launch" futures? here is my version of code with display that genrate the output above: (define run-in-parallel (λ (segments map-proc) ;;reduce-proc reduce-init) "Use futures to run a procedure in parallel, if multiple cores are available. Take a list of SEGMENTS as input, which are ranges of values to work on. MAP-PROC is applied to the SEGMENTS using map. When the MAP-PROC calls for all segments finished and returned values, the REDUCE-PROC is applied to the map result using reduce and the REDUCE-INIT argument." (let ([futures (map (λ (seg) (display-nl "run-in-parallel : making future") (make-future ;; Need to wrap in a thunk, to not ;; immediately start evaluating. 
(λ () (map-proc seg)))) segments)]) ;;(let ([segment-results (map touch futures)]) (let ([segment-results (map (lambda (f) (display-nl "run-in-parallel : touching future") (touch f)) futures)]) segment-results ;; (reduce reduce-proc ;; reduce-init ;; segment-results) )))) Best regards, Damien On Wed, Oct 12, 2022 at 11:29 PM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hi! > > On 10/12/22 22:27, Damien Mattei wrote: > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 > > > > i commited the current version of code here with all files but it is > > huge.... :-/ > > > > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com> > > wrote: > > > >> Mutex? i do not think code has situation where dead lock could happen, > it > >> is a code about minimalising logic expressions, it uses minterms , > minterms > >> set is a set of minterms :like this: > >> > >> example: > >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) > >> because 0 and 1 are replaced by x > >> the minterms-set could have thousands of pair (mathematic not lisp) > >> minterms to unify > >> if there is more than one x as result there is no need to continue so i > >> escape with a continuation: > >> > >> minterms-set = > >> { > >> ((1 0 1 0) (1 1 1 0)) > >> ((1 0 1 0) (1 1 0 1)) > >> ((1 0 1 0) (1 0 1 1)) > >> ((1 0 1 0) (0 1 1 1)) > >> ((0 1 1 0) (1 1 1 0)) > >> ((0 1 1 0) (1 1 0 1)) > >> ((0 1 1 0) (1 0 1 1)) > >> ((0 1 1 0) (0 1 1 1)) > >> ((0 1 0 1) (1 1 1 0)) > >> ((0 1 0 1) (1 1 0 1)) > >> ((0 1 0 1) (1 0 1 1)) > >> ((0 1 0 1) (0 1 1 1)) > >> ((0 0 1 1) (1 1 1 0)) > >> ((0 0 1 1) (1 1 0 1)) > >> ((0 0 1 1) (1 0 1 1)) > >> ((0 0 1 1) (0 1 1 1)) > >> } > >> > >> replace { } by () to have the list, other example at another level : > >> > >> minterms-set = > >> { > >> ((0 x 1 1) (x 1 1 1)) > >> ((0 x 1 1) (1 x 1 1)) > >> ((0 x 1 1) (1 1 x 1)) > >> ((0 x 1 1) (1 1 1 x)) > >> ((x 0 1 1) (x 1 1 1)) > >> ((x 0 1 1) (1 x 1 1)) > >> ((x 0 1 1) (1 1 x 1)) > >> ((x 0 1 1) (1 1 1 x)) > >> ((0 1 x 1) (x 1 1 1)) > >> ((0 1 x 1) (1 x 1 1)) > >> ((0 1 x 1) (1 1 x 1)) > >> ((0 1 x 1) (1 1 1 x)) > >> ((x 1 0 1) (x 1 1 1)) > >> ((x 1 0 1) (1 x 1 1)) > >> ((x 1 0 1) (1 1 x 1)) > >> ((x 1 0 1) (1 1 1 x)) > >> ((0 1 1 x) (x 1 1 1)) > >> ((0 1 1 x) (1 x 1 1)) > >> ((0 1 1 x) (1 1 x 1)) > >> ((0 1 1 x) (1 1 1 x)) > >> ((x 1 1 0) (x 1 1 1)) > >> ((x 1 1 0) (1 x 1 1)) > >> ((x 1 1 0) (1 1 x 1)) > >> ((x 1 1 0) (1 1 1 x)) > >> ((1 0 1 x) (x 1 1 1)) > >> ((1 0 1 x) (1 x 1 1)) > >> ((1 0 1 x) (1 1 x 1)) > >> ((1 0 1 x) (1 1 1 x)) > >> ((1 x 1 0) (x 1 1 1)) > >> ((1 x 1 0) (1 x 1 1)) > >> ((1 x 1 0) (1 1 x 1)) > >> ((1 x 1 0) (1 1 1 x)) > >> } > >> > >> here we see some minterms are already unified > >> > >> it is not easy to read even by me because i wrote the code many years > ago > >> and is split in many files, but here it is: > >> > >> (par-map function-unify-minterms-list minterms-set) > >> > >> {function-unify-minterms-list <+ (λ (L) (apply > >> function-unify-two-minterms-and-tag L))} > >> > >> (define (unify-two-minterms mt1 mt2) > >> (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) > >> > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) > '(1 > >> 1 0 1 1 1 1 1)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 1)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) > >> ;; clozure = 
#<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; #f > >> ;; > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) > '(1 > >> 1 0 1 1 1 1 0)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 0)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; '(1 1 0 1 x 1 1 0) > >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . > >> more-lists) > >> (call/cc (lambda (kontinuation) > >> (let ((lists (cons list1 more-lists)) > >> (funct-continu ;; this function have the kontinuation in his > environment > >> (lambda (arg1 . more-args) > >> (let ((args (cons arg1 more-args))) > >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons > >> conti args)) > >> > >> ;; (newline) > >> ;; (dv list1) > >> ;; (dv more-lists) > >> ;; (dv lists) > >> ;; (dv clozure) > >> ;; (newline) > >> > >> (apply map funct-continu lists))))) > >> > >> (define-syntax macro-function-compare-2-bits-with-continuation ;; > >> continuation version of macro-compare-2-bits > >> ;; i need a macro because of external function to the clozure > >> (syntax-rules () > >> ((_) (let ((cnt 0)) ;; counter > >> (lambda (continuation b1 b2) (if (equal? b1 b2) > >> b1 > >> (begin > >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > > 1, we > >> can have used a flag too instead of a counter > >> (when (> cnt 1) (continuation #f)) ;; escaping with the continuation > >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) > >> > >> what could have caused mutex if in the latter definition above (let > ((cnt > >> 0)) ;; counter was defined at top level and shared by all threads!!! yes > >> there could have be some mutex but this is not the case, i think even > all > >> function are pure so why is it more slow with // than without? > >> Damien > >> > >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> > >> wrote: > >> > >>> On 12-10-2022 19:19, Damien Mattei wrote: > >>>> Hello, > >>>> all is in the title, i test on a approximately 30000 element list , i > >>> got > >>>> 9s with map and 3min 30s with par-map on exactly the same piece of > >>> code!? > >>> > [...] > >>> > > >>>> translated from Scheme+ to Scheme: > >>>> (define unified-minterms-set-1 (map function-unify-minterms-list > >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) > >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is > >>> missing. Without a test case, we can only speculate what's going on. > >>> (E.g., maybe it grabs a mutex). > >>> > >>> Greetings, > >>> Maxime. > I don't want to scare anyone, just maybe warn about parallel map. I once > tried > to use Guile's parallel map function for a decision tree implementation > ( > https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), > > where each branch while learning the tree would call parallel map again > for sub > branches and so on. Somehow it made Guile crash (I don't have the error > message > any longer, but I did post about it on the mailing list back then.). I > never > figured out, what went wrong. All I had was pure function calls and math > inside > the thing that parallel map was supposed to run. > > Ultimately I simply tried other parallelism constructs and when I switched > to > using futures instead, everything worked fine, no crashes, no errors. 
> > Since that time, I did not use parallel map and instead used futures. > Recently I > made a parallelization thing for solving exercises of Project Euler using > multiple cores, so that some solutions are calculated faster. Maybe this > can > help or can be adapted to another use case: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 > > It expects ranges of things, which are called `segments` in the code. > Usually > ranges of numbers for Project Euler things. Here is the code to split a > range > into segments: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm > > (Check any solution using it for an example.) > > So this might be a bit too specific for general parallel things, but I > guess one > could change the way futures are used in `run-in-parallel`, to fit any > other > purpose. > > Best regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > > ^ permalink raw reply [flat|nested] 41+ messages in thread
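The behaviour described above -- the first touch taking nearly all the time and the later ones returning instantly -- is consistent with the manual text quoted: make-future returns immediately, and touch blocks until that particular future's value is ready. A minimal sketch of the pattern, with a hypothetical heavy-work procedure standing in for proc-unify-minterms-seg:

(use-modules (ice-9 futures))

;; Hypothetical per-segment computation.
(define (heavy-work seg)
  (let loop ((i (car seg)) (acc 0))
    (if (> i (cdr seg)) acc (loop (+ i 1) (+ acc i)))))

(define segments '((0 . 196850) (196851 . 393701) (393702 . 590552)))

;; make-future returns immediately; the thunks are handed to the worker pool.
(define fs
  (map (lambda (seg) (make-future (lambda () (heavy-work seg)))) segments))

;; Each touch blocks until *that* future's value is available.
(define results (map touch fs))

If all futures are started together and take roughly the same time, the first touch waits for about the whole duration and the remaining touches return almost immediately -- which is the output pattern shown above.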
* Re: map-par slower than map 2022-11-10 10:32 ` Damien Mattei @ 2022-11-10 10:41 ` Damien Mattei 2022-11-10 10:52 ` Zelphir Kaltstahl 2022-11-10 17:07 ` Olivier Dion via General Guile related discussions 1 sibling, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-11-10 10:41 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user note that it is not a Guile problem, the same code give also no speed up with Racket 'future ,i have not already test it but it should block also on 'touch future... On Thu, Nov 10, 2022 at 11:32 AM Damien Mattei <damien.mattei@gmail.com> wrote: > Hello Zelphir, > > i finally find a possible cause of no speed up of my code, i find that > using your code the procedure keep blocked on the first 'touch at line 27 > here: > > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L27 > > if i add a 'display i got this output, see the second part ,i cut it > waiting the rest of output , it is blockers on the first 'touch until it > return ,after all the touch are fast as if all the job is done in the first > 'touch > > unct-unify-minterms-set-1-unit-future : begin > set1-length = 930 > set2-length = 1270 > before Cartesian product set > after Cartesian product set > minterms-set-length = 1181100 > minterms-set-first = ((1 1 1 x x 0 0 0 0 1) (1 1 1 1 x x 0 0 0 1)) > segmts = ((0 . 196850) (196851 . 393701) (393702 . 590552) (590553 . > 787403) (787404 . 984254) (984255 . 1181099)) > before // > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > after // > unified-minterms-vector-1-length = 1181100 > > funct-unify-minterms-set-1-unit-future : end > funct-unify-minterms-set-1-unit-future : begin > set1-length = 1270 > set2-length = 888 > before Cartesian product set > after Cartesian product set > minterms-set-length = 1127760 > minterms-set-first = ((1 1 1 1 x x 0 0 0 1) (1 1 1 1 1 x x 0 0 1)) > segmts = ((0 . 187960) (187961 . 375921) (375922 . 563882) (563883 . > 751843) (751844 . 939804) (939805 . 1127759)) > before // > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : touching future > > blocking just above > > i find no explanation in Guile doc: > > Scheme Procedure: *touch* *f* > > Return the result of the expression embedded in future f. > > If the result was already computed in parallel, touch returns > instantaneously. Otherwise, it waits for the computation to complete, if it > already started, or initiates it. In the former case, the calling thread > may process other futures in the meantime. > perheaps 'map is not the good way to "launch" futures? > > here is my version of code with display that genrate the output above: > > (define run-in-parallel > (λ (segments map-proc) ;;reduce-proc reduce-init) > "Use futures to run a procedure in parallel, if > multiple cores are available. Take a list of SEGMENTS as > input, which are ranges of values to work on. MAP-PROC is > applied to the SEGMENTS using map. 
When the MAP-PROC calls > for all segments finished and returned values, the > REDUCE-PROC is applied to the map result using reduce and > the REDUCE-INIT argument." > (let ([futures > (map (λ (seg) > (display-nl "run-in-parallel : making future") > (make-future > ;; Need to wrap in a thunk, to not > ;; immediately start evaluating. > (λ () (map-proc seg)))) > segments)]) > ;;(let ([segment-results (map touch futures)]) > (let ([segment-results (map (lambda (f) > (display-nl "run-in-parallel : touching future") > (touch f)) > futures)]) > segment-results > ;; (reduce reduce-proc > ;; reduce-init > ;; segment-results) > )))) > > > Best regards, > > Damien > > On Wed, Oct 12, 2022 at 11:29 PM Zelphir Kaltstahl < > zelphirkaltstahl@posteo.de> wrote: > >> Hi! >> >> On 10/12/22 22:27, Damien Mattei wrote: >> > >> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 >> > >> > i commited the current version of code here with all files but it is >> > huge.... :-/ >> > >> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com >> > >> > wrote: >> > >> >> Mutex? i do not think code has situation where dead lock could happen, >> it >> >> is a code about minimalising logic expressions, it uses minterms , >> minterms >> >> set is a set of minterms :like this: >> >> >> >> example: >> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) >> >> because 0 and 1 are replaced by x >> >> the minterms-set could have thousands of pair (mathematic not lisp) >> >> minterms to unify >> >> if there is more than one x as result there is no need to continue so i >> >> escape with a continuation: >> >> >> >> minterms-set = >> >> { >> >> ((1 0 1 0) (1 1 1 0)) >> >> ((1 0 1 0) (1 1 0 1)) >> >> ((1 0 1 0) (1 0 1 1)) >> >> ((1 0 1 0) (0 1 1 1)) >> >> ((0 1 1 0) (1 1 1 0)) >> >> ((0 1 1 0) (1 1 0 1)) >> >> ((0 1 1 0) (1 0 1 1)) >> >> ((0 1 1 0) (0 1 1 1)) >> >> ((0 1 0 1) (1 1 1 0)) >> >> ((0 1 0 1) (1 1 0 1)) >> >> ((0 1 0 1) (1 0 1 1)) >> >> ((0 1 0 1) (0 1 1 1)) >> >> ((0 0 1 1) (1 1 1 0)) >> >> ((0 0 1 1) (1 1 0 1)) >> >> ((0 0 1 1) (1 0 1 1)) >> >> ((0 0 1 1) (0 1 1 1)) >> >> } >> >> >> >> replace { } by () to have the list, other example at another level : >> >> >> >> minterms-set = >> >> { >> >> ((0 x 1 1) (x 1 1 1)) >> >> ((0 x 1 1) (1 x 1 1)) >> >> ((0 x 1 1) (1 1 x 1)) >> >> ((0 x 1 1) (1 1 1 x)) >> >> ((x 0 1 1) (x 1 1 1)) >> >> ((x 0 1 1) (1 x 1 1)) >> >> ((x 0 1 1) (1 1 x 1)) >> >> ((x 0 1 1) (1 1 1 x)) >> >> ((0 1 x 1) (x 1 1 1)) >> >> ((0 1 x 1) (1 x 1 1)) >> >> ((0 1 x 1) (1 1 x 1)) >> >> ((0 1 x 1) (1 1 1 x)) >> >> ((x 1 0 1) (x 1 1 1)) >> >> ((x 1 0 1) (1 x 1 1)) >> >> ((x 1 0 1) (1 1 x 1)) >> >> ((x 1 0 1) (1 1 1 x)) >> >> ((0 1 1 x) (x 1 1 1)) >> >> ((0 1 1 x) (1 x 1 1)) >> >> ((0 1 1 x) (1 1 x 1)) >> >> ((0 1 1 x) (1 1 1 x)) >> >> ((x 1 1 0) (x 1 1 1)) >> >> ((x 1 1 0) (1 x 1 1)) >> >> ((x 1 1 0) (1 1 x 1)) >> >> ((x 1 1 0) (1 1 1 x)) >> >> ((1 0 1 x) (x 1 1 1)) >> >> ((1 0 1 x) (1 x 1 1)) >> >> ((1 0 1 x) (1 1 x 1)) >> >> ((1 0 1 x) (1 1 1 x)) >> >> ((1 x 1 0) (x 1 1 1)) >> >> ((1 x 1 0) (1 x 1 1)) >> >> ((1 x 1 0) (1 1 x 1)) >> >> ((1 x 1 0) (1 1 1 x)) >> >> } >> >> >> >> here we see some minterms are already unified >> >> >> >> it is not easy to read even by me because i wrote the code many >> years ago >> >> and is split in many files, but here it is: >> >> >> >> (par-map function-unify-minterms-list minterms-set) >> >> >> >> {function-unify-minterms-list <+ (λ (L) (apply >> >> function-unify-two-minterms-and-tag L))} >> >> >> >> (define 
(unify-two-minterms mt1 mt2) >> >> (function-map-with-escaping-by-kontinuation2 >> >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) >> >> >> >> ;; (function-map-with-escaping-by-kontinuation2 >> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 0) >> '(1 >> >> 1 0 1 1 1 1 1)) >> >> >> >> ;; list1 = (1 1 0 1 0 1 1 0) >> >> ;; more-lists = ((1 1 0 1 1 1 1 1)) >> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) >> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >> >> >> >> ;; #f >> >> ;; >> >> ;; (function-map-with-escaping-by-kontinuation2 >> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 >> 0) '(1 >> >> 1 0 1 1 1 1 0)) >> >> >> >> ;; list1 = (1 1 0 1 0 1 1 0) >> >> ;; more-lists = ((1 1 0 1 1 1 1 0)) >> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) >> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >> >> >> >> ;; '(1 1 0 1 x 1 1 0) >> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . >> >> more-lists) >> >> (call/cc (lambda (kontinuation) >> >> (let ((lists (cons list1 more-lists)) >> >> (funct-continu ;; this function have the kontinuation in his >> environment >> >> (lambda (arg1 . more-args) >> >> (let ((args (cons arg1 more-args))) >> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons >> >> conti args)) >> >> >> >> ;; (newline) >> >> ;; (dv list1) >> >> ;; (dv more-lists) >> >> ;; (dv lists) >> >> ;; (dv clozure) >> >> ;; (newline) >> >> >> >> (apply map funct-continu lists))))) >> >> >> >> (define-syntax macro-function-compare-2-bits-with-continuation ;; >> >> continuation version of macro-compare-2-bits >> >> ;; i need a macro because of external function to the clozure >> >> (syntax-rules () >> >> ((_) (let ((cnt 0)) ;; counter >> >> (lambda (continuation b1 b2) (if (equal? b1 b2) >> >> b1 >> >> (begin >> >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > >> 1, we >> >> can have used a flag too instead of a counter >> >> (when (> cnt 1) (continuation #f)) ;; escaping with the >> continuation >> >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) >> >> >> >> what could have caused mutex if in the latter definition above (let >> ((cnt >> >> 0)) ;; counter was defined at top level and shared by all threads!!! >> yes >> >> there could have be some mutex but this is not the case, i think even >> all >> >> function are pure so why is it more slow with // than without? >> >> Damien >> >> >> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> >> >> wrote: >> >> >> >>> On 12-10-2022 19:19, Damien Mattei wrote: >> >>>> Hello, >> >>>> all is in the title, i test on a approximately 30000 element list , i >> >>> got >> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of >> >>> code!? >> >>> > [...] >> >>> > >> >>>> translated from Scheme+ to Scheme: >> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list >> >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) >> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is >> >>> missing. Without a test case, we can only speculate what's going on. >> >>> (E.g., maybe it grabs a mutex). >> >>> >> >>> Greetings, >> >>> Maxime. >> I don't want to scare anyone, just maybe warn about parallel map. 
I once >> tried >> to use Guile's parallel map function for a decision tree implementation >> ( >> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), >> >> where each branch while learning the tree would call parallel map again >> for sub >> branches and so on. Somehow it made Guile crash (I don't have the error >> message >> any longer, but I did post about it on the mailing list back then.). I >> never >> figured out, what went wrong. All I had was pure function calls and math >> inside >> the thing that parallel map was supposed to run. >> >> Ultimately I simply tried other parallelism constructs and when I >> switched to >> using futures instead, everything worked fine, no crashes, no errors. >> >> Since that time, I did not use parallel map and instead used futures. >> Recently I >> made a parallelization thing for solving exercises of Project Euler using >> multiple cores, so that some solutions are calculated faster. Maybe this >> can >> help or can be adapted to another use case: >> >> >> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 >> >> It expects ranges of things, which are called `segments` in the code. >> Usually >> ranges of numbers for Project Euler things. Here is the code to split a >> range >> into segments: >> >> >> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm >> >> (Check any solution using it for an example.) >> >> So this might be a bit too specific for general parallel things, but I >> guess one >> could change the way futures are used in `run-in-parallel`, to fit any >> other >> purpose. >> >> Best regards, >> Zelphir >> >> -- >> repositories: https://notabug.org/ZelphirKaltstahl >> >> ^ permalink raw reply [flat|nested] 41+ messages in thread
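The behaviour described above (everything seems to happen inside the first 'touch) is consistent with the manual text quoted in the message: if no worker has started a future yet, touch initiates the computation in the calling thread. A self-contained sketch, independent of the minterm code (busy-work is just a made-up CPU-bound placeholder), that times each touch:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 futures) (ice-9 format))

;; pure, CPU-bound placeholder work
(define (busy-work n)
  (let loop ((i 0) (acc 0))
    (if (= i n) acc (loop (1+ i) (+ acc (* i i))))))

;; create the futures first, then touch them one by one and time each touch
(define futures
  (map (lambda (i) (make-future (lambda () (busy-work 20000000))))
       (iota 6)))

(for-each
 (lambda (f)
   (let ((t0 (get-internal-real-time)))
     (touch f)
     (format #t "touch took ~,2f s~%"
             (exact->inexact (/ (- (get-internal-real-time) t0)
                                internal-time-units-per-second)))))
 futures)
--8<---------------cut here---------------end--------------->8---

If the worker threads have had time to run, most touches return almost immediately; a touch on a future that nobody has started yet does the work itself in the touching thread.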
* Re: map-par slower than map 2022-11-10 10:41 ` Damien Mattei @ 2022-11-10 10:52 ` Zelphir Kaltstahl 2022-11-10 13:36 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Zelphir Kaltstahl @ 2022-11-10 10:52 UTC (permalink / raw) To: Damien Mattei; +Cc: guile-user Hi Damien! I think Racket futures and Guile futures are a bit different. According to the Racket documentation "The level of parallelism available from those constructs, however, is limited by several factors, and the current implementation is best suited to numerical tasks." (https://docs.racket-lang.org/guide/parallelism.html#%28part._effective-futures%29). So I am not sure, if they will work well for your use-case. I think I remember there having been a discussion on the Guile mailing list, where I asked, whether the Guile futures suffer from the same limitations, but I am not sure, that this question was sufficiently answered. I personally haven't noticed any blocking in my pure mathematical project Euler code. That said. If you can send me some example code, which does not require me to set up the whole thing of Scheme+, then I can take a look and check on my end, how what when blocks. Or at least send me some snippet, which I can run without setting up lots of things, maybe with 1 simple command, where the entry point to `run-in-parallel` is obvious. Regards, Zelphir On 11/10/22 11:41, Damien Mattei wrote: > note that it is not a Guile problem, the same code give also no speed up with > Racket 'future ,i have not already test it but it should block also on 'touch > future... > > On Thu, Nov 10, 2022 at 11:32 AM Damien Mattei <damien.mattei@gmail.com> wrote: > > Hello Zelphir, > > i finally find a possible cause of no speed up of my code, i find that > using your code the procedure keep blocked on the first 'touch at line 27 > here: > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L27 > > if i add a 'display i got this output, see the second part ,i cut it > waiting the rest of output , it is blockers on the first 'touch until it > return ,after all the touch are fast as if all the job is done in the > first 'touch > > unct-unify-minterms-set-1-unit-future : begin > set1-length = 930 > set2-length = 1270 > before Cartesian product set > after Cartesian product set > minterms-set-length = 1181100 > minterms-set-first = ((1 1 1 x x 0 0 0 0 1) (1 1 1 1 x x 0 0 0 1)) > segmts = ((0 . 196850) (196851 . 393701) (393702 . 590552) (590553 . > 787403) (787404 . 984254) (984255 . 1181099)) > before // > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > run-in-parallel : touching future > after // > unified-minterms-vector-1-length = 1181100 > > funct-unify-minterms-set-1-unit-future : end > funct-unify-minterms-set-1-unit-future : begin > set1-length = 1270 > set2-length = 888 > before Cartesian product set > after Cartesian product set > minterms-set-length = 1127760 > minterms-set-first = ((1 1 1 1 x x 0 0 0 1) (1 1 1 1 1 x x 0 0 1)) > segmts = ((0 . 187960) (187961 . 375921) (375922 . 563882) (563883 . > 751843) (751844 . 939804) (939805 . 
1127759)) > before // > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : making future > run-in-parallel : touching future > > blocking just above > > i find no explanation in Guile doc: > > Scheme Procedure: *touch* /f/ > > Return the result of the expression embedded in future f. > > If the result was already computed in parallel, |touch| returns > instantaneously. Otherwise, it waits for the computation to complete, > if it already started, or initiates it. In the former case, the > calling thread may process other futures in the meantime. > > perheaps 'map is not the good way to "launch" futures? > > here is my version of code with display that genrate the output above: > > (define run-in-parallel > (λ (segments map-proc) ;;reduce-proc reduce-init) > "Use futures to run a procedure in parallel, if > multiple cores are available. Take a list of SEGMENTS as > input, which are ranges of values to work on. MAP-PROC is > applied to the SEGMENTS using map. When the MAP-PROC calls > for all segments finished and returned values, the > REDUCE-PROC is applied to the map result using reduce and > the REDUCE-INIT argument." > (let ([futures > (map (λ (seg) > (display-nl "run-in-parallel : making future") > (make-future > ;; Need to wrap in a thunk, to not > ;; immediately start evaluating. > (λ () (map-proc seg)))) > segments)]) > ;;(let ([segment-results (map touch futures)]) > (let ([segment-results (map (lambda (f) > (display-nl "run-in-parallel : touching future") > (touch f)) > futures)]) > segment-results > ;; (reduce reduce-proc > ;; reduce-init > ;; segment-results) > )))) > > > Best regards, > > Damien > > On Wed, Oct 12, 2022 at 11:29 PM Zelphir Kaltstahl > <zelphirkaltstahl@posteo.de> wrote: > > Hi! > > On 10/12/22 22:27, Damien Mattei wrote: > > > https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 > > > > i commited the current version of code here with all files but it is > > huge.... :-/ > > > > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com> > > wrote: > > > >> Mutex? 
i do not think code has situation where dead lock could > happen, it > >> is a code about minimalising logic expressions, it uses minterms , > minterms > >> set is a set of minterms :like this: > >> > >> example: > >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) > >> because 0 and 1 are replaced by x > >> the minterms-set could have thousands of pair (mathematic not lisp) > >> minterms to unify > >> if there is more than one x as result there is no need to continue so i > >> escape with a continuation: > >> > >> minterms-set = > >> { > >> ((1 0 1 0) (1 1 1 0)) > >> ((1 0 1 0) (1 1 0 1)) > >> ((1 0 1 0) (1 0 1 1)) > >> ((1 0 1 0) (0 1 1 1)) > >> ((0 1 1 0) (1 1 1 0)) > >> ((0 1 1 0) (1 1 0 1)) > >> ((0 1 1 0) (1 0 1 1)) > >> ((0 1 1 0) (0 1 1 1)) > >> ((0 1 0 1) (1 1 1 0)) > >> ((0 1 0 1) (1 1 0 1)) > >> ((0 1 0 1) (1 0 1 1)) > >> ((0 1 0 1) (0 1 1 1)) > >> ((0 0 1 1) (1 1 1 0)) > >> ((0 0 1 1) (1 1 0 1)) > >> ((0 0 1 1) (1 0 1 1)) > >> ((0 0 1 1) (0 1 1 1)) > >> } > >> > >> replace { } by () to have the list, other example at another level : > >> > >> minterms-set = > >> { > >> ((0 x 1 1) (x 1 1 1)) > >> ((0 x 1 1) (1 x 1 1)) > >> ((0 x 1 1) (1 1 x 1)) > >> ((0 x 1 1) (1 1 1 x)) > >> ((x 0 1 1) (x 1 1 1)) > >> ((x 0 1 1) (1 x 1 1)) > >> ((x 0 1 1) (1 1 x 1)) > >> ((x 0 1 1) (1 1 1 x)) > >> ((0 1 x 1) (x 1 1 1)) > >> ((0 1 x 1) (1 x 1 1)) > >> ((0 1 x 1) (1 1 x 1)) > >> ((0 1 x 1) (1 1 1 x)) > >> ((x 1 0 1) (x 1 1 1)) > >> ((x 1 0 1) (1 x 1 1)) > >> ((x 1 0 1) (1 1 x 1)) > >> ((x 1 0 1) (1 1 1 x)) > >> ((0 1 1 x) (x 1 1 1)) > >> ((0 1 1 x) (1 x 1 1)) > >> ((0 1 1 x) (1 1 x 1)) > >> ((0 1 1 x) (1 1 1 x)) > >> ((x 1 1 0) (x 1 1 1)) > >> ((x 1 1 0) (1 x 1 1)) > >> ((x 1 1 0) (1 1 x 1)) > >> ((x 1 1 0) (1 1 1 x)) > >> ((1 0 1 x) (x 1 1 1)) > >> ((1 0 1 x) (1 x 1 1)) > >> ((1 0 1 x) (1 1 x 1)) > >> ((1 0 1 x) (1 1 1 x)) > >> ((1 x 1 0) (x 1 1 1)) > >> ((1 x 1 0) (1 x 1 1)) > >> ((1 x 1 0) (1 1 x 1)) > >> ((1 x 1 0) (1 1 1 x)) > >> } > >> > >> here we see some minterms are already unified > >> > >> it is not easy to read even by me because i wrote the code many > years ago > >> and is split in many files, but here it is: > >> > >> (par-map function-unify-minterms-list minterms-set) > >> > >> {function-unify-minterms-list <+ (λ (L) (apply > >> function-unify-two-minterms-and-tag L))} > >> > >> (define (unify-two-minterms mt1 mt2) > >> (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) > >> > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 > 0) '(1 > >> 1 0 1 1 1 1 1)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 1)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; #f > >> ;; > >> ;; (function-map-with-escaping-by-kontinuation2 > >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 > 1 0) '(1 > >> 1 0 1 1 1 1 0)) > >> > >> ;; list1 = (1 1 0 1 0 1 1 0) > >> ;; more-lists = ((1 1 0 1 1 1 1 0)) > >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) > >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> > >> > >> ;; '(1 1 0 1 x 1 1 0) > >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . > >> more-lists) > >> (call/cc (lambda (kontinuation) > >> (let ((lists (cons list1 more-lists)) > >> (funct-continu ;; this function have the kontinuation in his > environment > >> (lambda (arg1 . 
more-args) > >> (let ((args (cons arg1 more-args))) > >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons > >> conti args)) > >> > >> ;; (newline) > >> ;; (dv list1) > >> ;; (dv more-lists) > >> ;; (dv lists) > >> ;; (dv clozure) > >> ;; (newline) > >> > >> (apply map funct-continu lists))))) > >> > >> (define-syntax macro-function-compare-2-bits-with-continuation ;; > >> continuation version of macro-compare-2-bits > >> ;; i need a macro because of external function to the clozure > >> (syntax-rules () > >> ((_) (let ((cnt 0)) ;; counter > >> (lambda (continuation b1 b2) (if (equal? b1 b2) > >> b1 > >> (begin > >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > > 1, we > >> can have used a flag too instead of a counter > >> (when (> cnt 1) (continuation #f)) ;; escaping with the > continuation > >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) > >> > >> what could have caused mutex if in the latter definition above (let > ((cnt > >> 0)) ;; counter was defined at top level and shared by all > threads!!! yes > >> there could have be some mutex but this is not the case, i think > even all > >> function are pure so why is it more slow with // than without? > >> Damien > >> > >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> > >> wrote: > >> > >>> On 12-10-2022 19:19, Damien Mattei wrote: > >>>> Hello, > >>>> all is in the title, i test on a approximately 30000 element list , i > >>> got > >>>> 9s with map and 3min 30s with par-map on exactly the same piece of > >>> code!? > >>> > [...] > >>> > > >>>> translated from Scheme+ to Scheme: > >>>> (define unified-minterms-set-1 (map function-unify-minterms-list > >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set)) > >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is > >>> missing. Without a test case, we can only speculate what's going on. > >>> (E.g., maybe it grabs a mutex). > >>> > >>> Greetings, > >>> Maxime. > I don't want to scare anyone, just maybe warn about parallel map. I > once tried > to use Guile's parallel map function for a decision tree implementation > (https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), > > where each branch while learning the tree would call parallel map > again for sub > branches and so on. Somehow it made Guile crash (I don't have the > error message > any longer, but I did post about it on the mailing list back then.). I > never > figured out, what went wrong. All I had was pure function calls and > math inside > the thing that parallel map was supposed to run. > > Ultimately I simply tried other parallelism constructs and when I > switched to > using futures instead, everything worked fine, no crashes, no errors. > > Since that time, I did not use parallel map and instead used futures. > Recently I > made a parallelization thing for solving exercises of Project Euler using > multiple cores, so that some solutions are calculated faster. Maybe > this can > help or can be adapted to another use case: > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 > > It expects ranges of things, which are called `segments` in the code. > Usually > ranges of numbers for Project Euler things. 
Here is the code to split > a range > into segments: > > https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm > > (Check any solution using it for an example.) > > So this might be a bit too specific for general parallel things, but I > guess one > could change the way futures are used in `run-in-parallel`, to fit any > other > purpose. > > Best regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > -- repositories:https://notabug.org/ZelphirKaltstahl ^ permalink raw reply [flat|nested] 41+ messages in thread
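A minimal, self-contained stand-in of the kind asked for above, runnable in a plain Guile REPL without Scheme+ (fake-unify is a made-up pure, CPU-bound placeholder, not the real minterm unification): if par-map is also slower than map on this, the slowdown is in the scheduling; if it is faster, the problem is specific to the original code.

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads))   ; for par-map

;; made-up pure, CPU-bound placeholder for the real unification work
(define (fake-unify pair)
  (let loop ((i 0) (acc 0))
    (if (= i 20000) acc (loop (1+ i) (logxor acc (* i (car pair)))))))

(define input (map (lambda (i) (cons (1+ i) i)) (iota 30000)))

(define (timed thunk)
  (let ((t0 (get-internal-real-time)))
    (thunk)
    (exact->inexact (/ (- (get-internal-real-time) t0)
                       internal-time-units-per-second))))

(format #t "map:     ~a s~%" (timed (lambda () (map fake-unify input))))
(format #t "par-map: ~a s~%" (timed (lambda () (par-map fake-unify input))))
--8<---------------cut here---------------end--------------->8---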
* Re: map-par slower than map 2022-11-10 10:52 ` Zelphir Kaltstahl @ 2022-11-10 13:36 ` Damien Mattei 0 siblings, 0 replies; 41+ messages in thread From: Damien Mattei @ 2022-11-10 13:36 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user as it worked with your code it is incompatibility between my procedures and 'future. I read 'future require pure function,ok my function are not pure but the side effect they did are on partitioned sets of data, so there is no reason for concurrence accessing data between each threads. I will try to modify the code with a personal schema of // with thread or other. For now it fails with par-map,threads and future. The problem can not be recreate on a simple case becaus emy code needs: -the minimalizing procedures of logic (logiki+.scm and co) -Scheme+ (now with a modified version of 'def macro not online) -and the code used for benchmark,which is about unpublished science and experimental. Damien On Thu, Nov 10, 2022 at 11:53 AM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > Hi Damien! > > I think Racket futures and Guile futures are a bit different. According to > the Racket documentation "The level of parallelism available from those > constructs, however, is limited by several factors, and the current > implementation is best suited to numerical tasks." ( > https://docs.racket-lang.org/guide/parallelism.html#%28part._effective-futures%29). > So I am not sure, if they will work well for your use-case. I think I > remember there having been a discussion on the Guile mailing list, where I > asked, whether the Guile futures suffer from the same limitations, but I am > not sure, that this question was sufficiently answered. I personally > haven't noticed any blocking in my pure mathematical project Euler code. > > That said. If you can send me some example code, which does not require me > to set up the whole thing of Scheme+, then I can take a look and check on > my end, how what when blocks. Or at least send me some snippet, which I can > run without setting up lots of things, maybe with 1 simple command, where > the entry point to `run-in-parallel` is obvious. > > Regards, > Zelphir > On 11/10/22 11:41, Damien Mattei wrote: > > note that it is not a Guile problem, the same code give also no speed up > with Racket 'future ,i have not already test it but it should block also on > 'touch future... > > On Thu, Nov 10, 2022 at 11:32 AM Damien Mattei <damien.mattei@gmail.com> > wrote: > >> Hello Zelphir, >> >> i finally find a possible cause of no speed up of my code, i find that >> using your code the procedure keep blocked on the first 'touch at line 27 >> here: >> >> >> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L27 >> >> if i add a 'display i got this output, see the second part ,i cut it >> waiting the rest of output , it is blockers on the first 'touch until it >> return ,after all the touch are fast as if all the job is done in the first >> 'touch >> >> unct-unify-minterms-set-1-unit-future : begin >> set1-length = 930 >> set2-length = 1270 >> before Cartesian product set >> after Cartesian product set >> minterms-set-length = 1181100 >> minterms-set-first = ((1 1 1 x x 0 0 0 0 1) (1 1 1 1 x x 0 0 0 1)) >> segmts = ((0 . 196850) (196851 . 393701) (393702 . 590552) (590553 . >> 787403) (787404 . 984254) (984255 . 
1181099)) >> before // >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : touching future >> run-in-parallel : touching future >> run-in-parallel : touching future >> run-in-parallel : touching future >> run-in-parallel : touching future >> run-in-parallel : touching future >> after // >> unified-minterms-vector-1-length = 1181100 >> >> funct-unify-minterms-set-1-unit-future : end >> funct-unify-minterms-set-1-unit-future : begin >> set1-length = 1270 >> set2-length = 888 >> before Cartesian product set >> after Cartesian product set >> minterms-set-length = 1127760 >> minterms-set-first = ((1 1 1 1 x x 0 0 0 1) (1 1 1 1 1 x x 0 0 1)) >> segmts = ((0 . 187960) (187961 . 375921) (375922 . 563882) (563883 . >> 751843) (751844 . 939804) (939805 . 1127759)) >> before // >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : making future >> run-in-parallel : touching future >> >> blocking just above >> >> i find no explanation in Guile doc: >> >> Scheme Procedure: *touch* *f* >> >> Return the result of the expression embedded in future f. >> >> If the result was already computed in parallel, touch returns >> instantaneously. Otherwise, it waits for the computation to complete, if it >> already started, or initiates it. In the former case, the calling thread >> may process other futures in the meantime. >> perheaps 'map is not the good way to "launch" futures? >> >> here is my version of code with display that genrate the output above: >> >> (define run-in-parallel >> (λ (segments map-proc) ;;reduce-proc reduce-init) >> "Use futures to run a procedure in parallel, if >> multiple cores are available. Take a list of SEGMENTS as >> input, which are ranges of values to work on. MAP-PROC is >> applied to the SEGMENTS using map. When the MAP-PROC calls >> for all segments finished and returned values, the >> REDUCE-PROC is applied to the map result using reduce and >> the REDUCE-INIT argument." >> (let ([futures >> (map (λ (seg) >> (display-nl "run-in-parallel : making future") >> (make-future >> ;; Need to wrap in a thunk, to not >> ;; immediately start evaluating. >> (λ () (map-proc seg)))) >> segments)]) >> ;;(let ([segment-results (map touch futures)]) >> (let ([segment-results (map (lambda (f) >> (display-nl "run-in-parallel : touching future") >> (touch f)) >> futures)]) >> segment-results >> ;; (reduce reduce-proc >> ;; reduce-init >> ;; segment-results) >> )))) >> >> >> Best regards, >> >> Damien >> >> On Wed, Oct 12, 2022 at 11:29 PM Zelphir Kaltstahl < >> zelphirkaltstahl@posteo.de> wrote: >> >>> Hi! >>> >>> On 10/12/22 22:27, Damien Mattei wrote: >>> > >>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674 >>> > >>> > i commited the current version of code here with all files but it is >>> > huge.... :-/ >>> > >>> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei < >>> damien.mattei@gmail.com> >>> > wrote: >>> > >>> >> Mutex? 
i do not think code has situation where dead lock could >>> happen, it >>> >> is a code about minimalising logic expressions, it uses minterms , >>> minterms >>> >> set is a set of minterms :like this: >>> >> >>> >> example: >>> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x) >>> >> because 0 and 1 are replaced by x >>> >> the minterms-set could have thousands of pair (mathematic not lisp) >>> >> minterms to unify >>> >> if there is more than one x as result there is no need to continue so >>> i >>> >> escape with a continuation: >>> >> >>> >> minterms-set = >>> >> { >>> >> ((1 0 1 0) (1 1 1 0)) >>> >> ((1 0 1 0) (1 1 0 1)) >>> >> ((1 0 1 0) (1 0 1 1)) >>> >> ((1 0 1 0) (0 1 1 1)) >>> >> ((0 1 1 0) (1 1 1 0)) >>> >> ((0 1 1 0) (1 1 0 1)) >>> >> ((0 1 1 0) (1 0 1 1)) >>> >> ((0 1 1 0) (0 1 1 1)) >>> >> ((0 1 0 1) (1 1 1 0)) >>> >> ((0 1 0 1) (1 1 0 1)) >>> >> ((0 1 0 1) (1 0 1 1)) >>> >> ((0 1 0 1) (0 1 1 1)) >>> >> ((0 0 1 1) (1 1 1 0)) >>> >> ((0 0 1 1) (1 1 0 1)) >>> >> ((0 0 1 1) (1 0 1 1)) >>> >> ((0 0 1 1) (0 1 1 1)) >>> >> } >>> >> >>> >> replace { } by () to have the list, other example at another level : >>> >> >>> >> minterms-set = >>> >> { >>> >> ((0 x 1 1) (x 1 1 1)) >>> >> ((0 x 1 1) (1 x 1 1)) >>> >> ((0 x 1 1) (1 1 x 1)) >>> >> ((0 x 1 1) (1 1 1 x)) >>> >> ((x 0 1 1) (x 1 1 1)) >>> >> ((x 0 1 1) (1 x 1 1)) >>> >> ((x 0 1 1) (1 1 x 1)) >>> >> ((x 0 1 1) (1 1 1 x)) >>> >> ((0 1 x 1) (x 1 1 1)) >>> >> ((0 1 x 1) (1 x 1 1)) >>> >> ((0 1 x 1) (1 1 x 1)) >>> >> ((0 1 x 1) (1 1 1 x)) >>> >> ((x 1 0 1) (x 1 1 1)) >>> >> ((x 1 0 1) (1 x 1 1)) >>> >> ((x 1 0 1) (1 1 x 1)) >>> >> ((x 1 0 1) (1 1 1 x)) >>> >> ((0 1 1 x) (x 1 1 1)) >>> >> ((0 1 1 x) (1 x 1 1)) >>> >> ((0 1 1 x) (1 1 x 1)) >>> >> ((0 1 1 x) (1 1 1 x)) >>> >> ((x 1 1 0) (x 1 1 1)) >>> >> ((x 1 1 0) (1 x 1 1)) >>> >> ((x 1 1 0) (1 1 x 1)) >>> >> ((x 1 1 0) (1 1 1 x)) >>> >> ((1 0 1 x) (x 1 1 1)) >>> >> ((1 0 1 x) (1 x 1 1)) >>> >> ((1 0 1 x) (1 1 x 1)) >>> >> ((1 0 1 x) (1 1 1 x)) >>> >> ((1 x 1 0) (x 1 1 1)) >>> >> ((1 x 1 0) (1 x 1 1)) >>> >> ((1 x 1 0) (1 1 x 1)) >>> >> ((1 x 1 0) (1 1 1 x)) >>> >> } >>> >> >>> >> here we see some minterms are already unified >>> >> >>> >> it is not easy to read even by me because i wrote the code many >>> years ago >>> >> and is split in many files, but here it is: >>> >> >>> >> (par-map function-unify-minterms-list minterms-set) >>> >> >>> >> {function-unify-minterms-list <+ (λ (L) (apply >>> >> function-unify-two-minterms-and-tag L))} >>> >> >>> >> (define (unify-two-minterms mt1 mt2) >>> >> (function-map-with-escaping-by-kontinuation2 >>> >> (macro-function-compare-2-bits-with-continuation) mt1 mt2)) >>> >> >>> >> ;; (function-map-with-escaping-by-kontinuation2 >>> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 >>> 0) '(1 >>> >> 1 0 1 1 1 1 1)) >>> >> >>> >> ;; list1 = (1 1 0 1 0 1 1 0) >>> >> ;; more-lists = ((1 1 0 1 1 1 1 1)) >>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1)) >>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >>> >> >>> >> ;; #f >>> >> ;; >>> >> ;; (function-map-with-escaping-by-kontinuation2 >>> >> (macro-function-compare-2-bits-with-continuation) '(1 1 0 1 0 1 1 >>> 0) '(1 >>> >> 1 0 1 1 1 1 0)) >>> >> >>> >> ;; list1 = (1 1 0 1 0 1 1 0) >>> >> ;; more-lists = ((1 1 0 1 1 1 1 0)) >>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0)) >>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11> >>> >> >>> >> ;; '(1 1 0 1 x 1 1 0) >>> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 . 
>>> >> more-lists) >>> >> (call/cc (lambda (kontinuation) >>> >> (let ((lists (cons list1 more-lists)) >>> >> (funct-continu ;; this function have the kontinuation in his >>> environment >>> >> (lambda (arg1 . more-args) >>> >> (let ((args (cons arg1 more-args))) >>> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure >>> (cons >>> >> conti args)) >>> >> >>> >> ;; (newline) >>> >> ;; (dv list1) >>> >> ;; (dv more-lists) >>> >> ;; (dv lists) >>> >> ;; (dv clozure) >>> >> ;; (newline) >>> >> >>> >> (apply map funct-continu lists))))) >>> >> >>> >> (define-syntax macro-function-compare-2-bits-with-continuation ;; >>> >> continuation version of macro-compare-2-bits >>> >> ;; i need a macro because of external function to the clozure >>> >> (syntax-rules () >>> >> ((_) (let ((cnt 0)) ;; counter >>> >> (lambda (continuation b1 b2) (if (equal? b1 b2) >>> >> b1 >>> >> (begin >>> >> (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > >>> 1, we >>> >> can have used a flag too instead of a counter >>> >> (when (> cnt 1) (continuation #f)) ;; escaping with the >>> continuation >>> >> 'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0) >>> >> >>> >> what could have caused mutex if in the latter definition above (let >>> ((cnt >>> >> 0)) ;; counter was defined at top level and shared by all threads!!! >>> yes >>> >> there could have be some mutex but this is not the case, i think >>> even all >>> >> function are pure so why is it more slow with // than without? >>> >> Damien >>> >> >>> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> >>> >> wrote: >>> >> >>> >>> On 12-10-2022 19:19, Damien Mattei wrote: >>> >>>> Hello, >>> >>>> all is in the title, i test on a approximately 30000 element list , >>> i >>> >>> got >>> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of >>> >>> code!? >>> >>> > [...] >>> >>> > >>> >>>> translated from Scheme+ to Scheme: >>> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list >>> >>>> minterms-set)) ;;(par-map function-unify-minterms-list >>> minterms-set)) >>> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' >>> is >>> >>> missing. Without a test case, we can only speculate what's going on. >>> >>> (E.g., maybe it grabs a mutex). >>> >>> >>> >>> Greetings, >>> >>> Maxime. >>> I don't want to scare anyone, just maybe warn about parallel map. I once >>> tried >>> to use Guile's parallel map function for a decision tree implementation >>> ( >>> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), >>> >>> where each branch while learning the tree would call parallel map again >>> for sub >>> branches and so on. Somehow it made Guile crash (I don't have the error >>> message >>> any longer, but I did post about it on the mailing list back then.). I >>> never >>> figured out, what went wrong. All I had was pure function calls and math >>> inside >>> the thing that parallel map was supposed to run. >>> >>> Ultimately I simply tried other parallelism constructs and when I >>> switched to >>> using futures instead, everything worked fine, no crashes, no errors. >>> >>> Since that time, I did not use parallel map and instead used futures. >>> Recently I >>> made a parallelization thing for solving exercises of Project Euler >>> using >>> multiple cores, so that some solutions are calculated faster. 
Maybe this >>> can >>> help or can be adapted to another use case: >>> >>> >>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30 >>> >>> It expects ranges of things, which are called `segments` in the code. >>> Usually >>> ranges of numbers for Project Euler things. Here is the code to split a >>> range >>> into segments: >>> >>> >>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm >>> >>> (Check any solution using it for an example.) >>> >>> So this might be a bit too specific for general parallel things, but I >>> guess one >>> could change the way futures are used in `run-in-parallel`, to fit any >>> other >>> purpose. >>> >>> Best regards, >>> Zelphir >>> >>> -- >>> repositories: https://notabug.org/ZelphirKaltstahl >>> >>> -- > repositories: https://notabug.org/ZelphirKaltstahl > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-11-10 10:32 ` Damien Mattei 2022-11-10 10:41 ` Damien Mattei @ 2022-11-10 17:07 ` Olivier Dion via General Guile related discussions 2022-11-11 10:26 ` Damien Mattei 1 sibling, 1 reply; 41+ messages in thread From: Olivier Dion via General Guile related discussions @ 2022-11-10 17:07 UTC (permalink / raw) To: Damien Mattei, Zelphir Kaltstahl; +Cc: guile-user

On Thu, 10 Nov 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> Hello Zelphir,
>
> i finally find a possible cause of no speed up of my code, i find that
> using your code the procedure keep blocked on the first 'touch at line 27
> here:
>

I have found a possible deadlock with Guile's mutex. In some very rare cases, ice-9 futures deadlock because of it.

Maybe that's not the case here; I haven't read the code you've mentioned. Just letting you know.

--
Olivier Dion
oldiob.dev

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-11-10 17:07 ` Olivier Dion via General Guile related discussions @ 2022-11-11 10:26 ` Damien Mattei 2022-11-11 12:25 ` Zelphir Kaltstahl 0 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-11-11 10:26 UTC (permalink / raw) To: Olivier Dion; +Cc: Zelphir Kaltstahl, guile-user

I rewrote a complete threaded routine, both with Guile and Racket, creating the threads and waiting for them to finish:

;; run the parallel code
{threads <+ (map (λ (seg) (call-with-new-thread (λ () (proc-unify-minterms-seg seg)))) segmts)}

(nodebug
 (display-nl "waiting for threads to finish..."))

;; wait for threads to finish
(map (λ (thread) (join-thread thread)) ;;(+ start-time max-sleep)))
     threads)

It does not seem to block, but it is a bit slower than on a single CPU.

I have this config:
Chip: Apple M1
Total number of cores: 8 (4 performance and 4 efficiency)

It is better on a single CPU than with all the cores... I read all the POSIX threads documentation; I admit Scheme is not C, and I read a few forum discussions comparing C and Scheme. In any thread, local variables should live on that thread's own stack. The only doubt I have in Scheme (though the same question exists in C) is which portions of code are or are not copied into each thread. For this reason I tried to encapsulate all the procedures used in parallel as internal defines in the procedure passed to call-with-new-thread, hoping they are copied into each thread. I hope the basic Scheme procedures are reentrant... I have no explanation for why it is even a bit slower on multiple cores than on one.

Regards,
Damien

On Thu, Nov 10, 2022 at 6:07 PM Olivier Dion <olivier.dion@polymtl.ca> wrote:
> On Thu, 10 Nov 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > Hello Zelphir,
> >
> > i finally find a possible cause of no speed up of my code, i find that
> > using your code the procedure keep blocked on the first 'touch at line 27
> > here:
> >
>
> I have found a possible deadlock with Guile's mutex. In some very rare
> cases, ice-9 futures deadlock because of it.
>
> Maybe it's not the case here I haven't read the code you've mentioned.
> Just letting you know.
>
> --
> Olivier Dion
> oldiob.dev

^ permalink raw reply [flat|nested] 41+ messages in thread
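Two points of reference for the message above. Guile threads are plain POSIX threads sharing one heap: nothing is copied into a thread; each thread only gets its own stack and dynamic state, and a closure passed to call-with-new-thread sees the same top-level and lexical bindings as the rest of the program, whether or not its helpers are internal defines. The same spawn/join pattern in plain Guile, without the Scheme+ braces (proc-unify-minterms-seg stands in for the real per-segment worker), would be:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads))

;; spawn one thread per segment, then join them all;
;; join-thread returns each thread's value once it finishes
(define (map-segments-in-threads proc segmts)
  (let ((threads (map (lambda (seg)
                        (call-with-new-thread (lambda () (proc seg))))
                      segmts)))
    (map join-thread threads)))

;; (map-segments-in-threads proc-unify-minterms-seg segmts)
--8<---------------cut here---------------end--------------->8---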
* Re: map-par slower than map 2022-11-11 10:26 ` Damien Mattei @ 2022-11-11 12:25 ` Zelphir Kaltstahl 2022-11-11 13:36 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Zelphir Kaltstahl @ 2022-11-11 12:25 UTC (permalink / raw) To: Damien Mattei; +Cc: guile-user, Olivier Dion Hi! On 11/11/22 11:26, Damien Mattei wrote: > i rewrote a complete threaded routine both with guile and racket creating > threads and waiting they finish : > ;; run the parallel code > {threads <+ (map (λ (seg) (call-with-new-thread > (λ () (proc-unify-minterms-seg seg)))) > segmts)} > > (nodebug > (display-nl "waiting for threads to finish...")) > > ;; wait for threads to finish > (map (λ (thread) (join-thread thread)) ;;(+ start-time max-sleep))) > threads) > > it does not seems to block but it is a bit slower than on a single CPU. > > i have this config: > Puce : Apple M1 > Nombre total de cœurs : 8 (4 performance et 4 efficacité) > it is better on a single cpu than with all the cores... > > i read all the doc of Posix Thread, i admit Scheme is not C , i read on forums > a few about C and Scheme comparaison. > In any thread local variables should be on the stack of each thread. > The only doubt i have is in Scheme (but the same question exist in C) what > portions of code are or not copied in any thread? for this reason i tried to > encapsulate all the procedures used in // as internals defines in the > procedure passed at call-with-new-thread hoping they are copied in each > threads. I hope the basic scheme procedures are reentrant... > I have no explaination why is it even a bit slower on multiple core than on one. > > Regards, > Damien Note, that threads in Guile and Racket are different: https://docs.racket-lang.org/reference/eval-model.html#%28part._thread-model%29: > Racket supports multiple threads of evaluation. Threads run concurrently, in the sense that one thread can preempt another without its cooperation, but threads currently all run on the same processor (i.e., the same underlying operating system process and thread). https://www.gnu.org/software/guile/manual/html_node/Threads.html: > The procedures below manipulate Guile threads, which are wrappers around the system’s POSIX threads. For application-level parallelism, using higher-level constructs, such as futures, is recommended (see Futures). I believe another word for Racket's threads is "green threads". They are like (more like?) Python threads, and do not run on another core. If you start multiple Racket threads on the same Racket VM, they will run all on the same core. No speedup to be expected, unless you would be waiting for IO or something, if you did not use threads. Racket threads are concurrent, but not parallel. I think Racket's threads' nature is the answer to why it is slower than single threaded execution. Regards, Zelphir -- repositories:https://notabug.org/ZelphirKaltstahl ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-11-11 12:25 ` Zelphir Kaltstahl @ 2022-11-11 13:36 ` Damien Mattei 2022-11-11 13:37 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-11-11 13:36 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user, Olivier Dion Hi Zelphir, On Fri, Nov 11, 2022 at 1:25 PM Zelphir Kaltstahl < zelphirkaltstahl@posteo.de> wrote: > > Note, that threads in Guile and Racket are different: > > > https://docs.racket-lang.org/reference/eval-model.html#%28part._thread-model%29 > : > > > Racket supports multiple threads of evaluation. Threads run > concurrently, in the sense that one thread can preempt another without its > cooperation, but threads currently all run on the same processor (i.e., the > same underlying operating system process and thread). > oh! that is the reason with Racket of no speed up. https://www.gnu.org/software/guile/manual/html_node/Threads.html: > > > The procedures below manipulate Guile threads, which are wrappers around > the system’s POSIX threads. For application-level parallelism, using > higher-level constructs, such as futures, is recommended (see Futures). > yes but futures seems to block on touch with guile, the same code under Racket,do not show speed up, it display a different output: run-in-parallel : making future run-in-parallel : touching future run-in-parallel : making future run-in-parallel : touching future run-in-parallel : making future run-in-parallel : touching future run-in-parallel : making future run-in-parallel : touching future run-in-parallel : making future run-in-parallel : touching future run-in-parallel : making future run-in-parallel : touching future it is different from the guile ouput. It seems again that only with this code Guile can make another thread when the previous is finished (after touch)... The code is this one: https://github.com/damien-mattei/library-FunctProg/blob/master/racket/logiki%2B.rkt#L3012 https://github.com/damien-mattei/library-FunctProg/blob/master/racket/logiki%2B.rkt#L3331 i do not think it could work faster if i cannot access to others CPUs like in OpenMP or this: https://en.wikipedia.org/wiki/Processor_affinity but it is not existing with Scheme. Best Regards, Damien > I believe another word for Racket's threads is "green threads". They are > like (more like?) Python threads, and do not run on another core. If you > start multiple Racket threads on the same Racket VM, they will run all on > the same core. No speedup to be expected, unless you would be waiting for > IO or something, if you did not use threads. Racket threads are concurrent, > but not parallel. > > I think Racket's threads' nature is the answer to why it is slower than > single threaded execution. > > Regards, > Zelphir > > -- > repositories: https://notabug.org/ZelphirKaltstahl > > ^ permalink raw reply [flat|nested] 41+ messages in thread
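One note on the affinity point: Guile does expose getaffinity and setaffinity (POSIX "Processes" section of the manual), but only on GNU/Linux builds, so they are of no help on the macOS M1 machine mentioned earlier in the thread. A guarded sketch, assuming a bitvector mask with one bit per CPU:

--8<---------------cut here---------------start------------->8---
;; GNU/Linux only: pin the current process to CPUs 0-3.
;; Guard the calls so the file still loads where setaffinity is undefined.
(when (defined? 'setaffinity)
  (setaffinity (getpid) (make-bitvector 4 #t))  ; bits 0-3 set => CPUs 0-3
  (display (getaffinity (getpid)))              ; a bitvector, one bit per CPU
  (newline))
--8<---------------cut here---------------end--------------->8---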
* Re: map-par slower than map 2022-11-11 13:36 ` Damien Mattei @ 2022-11-11 13:37 ` Damien Mattei 2022-11-13 8:23 ` Damien Mattei 0 siblings, 1 reply; 41+ messages in thread From: Damien Mattei @ 2022-11-11 13:37 UTC (permalink / raw) To: Zelphir Kaltstahl; +Cc: guile-user, Olivier Dion in the previous mail ,read: It seems again that only with this code Guile Racket can make another thread only when the previous is finished (after touch)... On Fri, Nov 11, 2022 at 2:36 PM Damien Mattei <damien.mattei@gmail.com> wrote: > Hi Zelphir, > > > On Fri, Nov 11, 2022 at 1:25 PM Zelphir Kaltstahl < > zelphirkaltstahl@posteo.de> wrote: > >> >> Note, that threads in Guile and Racket are different: >> >> >> https://docs.racket-lang.org/reference/eval-model.html#%28part._thread-model%29 >> : >> >> > Racket supports multiple threads of evaluation. Threads run >> concurrently, in the sense that one thread can preempt another without its >> cooperation, but threads currently all run on the same processor (i.e., the >> same underlying operating system process and thread). >> > > oh! that is the reason with Racket of no speed up. > > https://www.gnu.org/software/guile/manual/html_node/Threads.html: >> >> > The procedures below manipulate Guile threads, which are wrappers >> around the system’s POSIX threads. For application-level parallelism, using >> higher-level constructs, such as futures, is recommended (see Futures). >> > yes but futures seems to block on touch with guile, the same code under > Racket,do not show speed up, it display a different output: > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : making future > run-in-parallel : touching future > run-in-parallel : making future > run-in-parallel : touching future > it is different from the guile ouput. > It seems again that only with this code Guile can make another thread when > the previous is finished (after touch)... > > The code is this one: > > https://github.com/damien-mattei/library-FunctProg/blob/master/racket/logiki%2B.rkt#L3012 > > > https://github.com/damien-mattei/library-FunctProg/blob/master/racket/logiki%2B.rkt#L3331 > > i do not think it could work faster if i cannot access to others CPUs like > in OpenMP > or this: > https://en.wikipedia.org/wiki/Processor_affinity > but it is not existing with Scheme. > > Best Regards, > Damien > >> I believe another word for Racket's threads is "green threads". They are >> like (more like?) Python threads, and do not run on another core. If you >> start multiple Racket threads on the same Racket VM, they will run all on >> the same core. No speedup to be expected, unless you would be waiting for >> IO or something, if you did not use threads. Racket threads are concurrent, >> but not parallel. >> >> I think Racket's threads' nature is the answer to why it is slower than >> single threaded execution. >> >> Regards, >> Zelphir >> >> -- >> repositories: https://notabug.org/ZelphirKaltstahl >> >> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-11-11 13:37 ` Damien Mattei @ 2022-11-13 8:23 ` Damien Mattei 0 siblings, 0 replies; 41+ messages in thread From: Damien Mattei @ 2022-11-13 8:23 UTC (permalink / raw) Cc: guile-user it just an idea, but i explore the way to fork() my code,use some shared memory (for the minterms vector) to get more system ressource (CPUs), i did not find the equivalent to fork() under unix in Guile or an API to manage shared memory (like in unix) but i read that C and guile can be used together: https://www.gnu.org/software/guile/manual/html_node/Combining-with-C.html but i can not find any example, is there a place i can find more documentation and tutorials about that ? Best Regards, Damien On Fri, Nov 11, 2022 at 2:37 PM Damien Mattei <damien.mattei@gmail.com> wrote: > in the previous mail ,read: > It seems again that only with this code Guile Racket can make another > thread only when the previous is finished (after touch)... > > On Fri, Nov 11, 2022 at 2:36 PM Damien Mattei <damien.mattei@gmail.com> > wrote: > >> Hi Zelphir, >> >> >> On Fri, Nov 11, 2022 at 1:25 PM Zelphir Kaltstahl < >> zelphirkaltstahl@posteo.de> wrote: >> >>> >>> Note, that threads in Guile and Racket are different: >>> >>> >>> https://docs.racket-lang.org/reference/eval-model.html#%28part._thread-model%29 >>> : >>> >>> > Racket supports multiple threads of evaluation. Threads run >>> concurrently, in the sense that one thread can preempt another without its >>> cooperation, but threads currently all run on the same processor (i.e., the >>> same underlying operating system process and thread). >>> >> >> oh! that is the reason with Racket of no speed up. >> >> https://www.gnu.org/software/guile/manual/html_node/Threads.html: >>> >>> > The procedures below manipulate Guile threads, which are wrappers >>> around the system’s POSIX threads. For application-level parallelism, using >>> higher-level constructs, such as futures, is recommended (see Futures). >>> >> yes but futures seems to block on touch with guile, the same code under >> Racket,do not show speed up, it display a different output: >> run-in-parallel : making future >> run-in-parallel : touching future >> run-in-parallel : making future >> run-in-parallel : touching future >> run-in-parallel : making future >> run-in-parallel : touching future >> run-in-parallel : making future >> run-in-parallel : touching future >> run-in-parallel : making future >> run-in-parallel : touching future >> run-in-parallel : making future >> run-in-parallel : touching future >> it is different from the guile ouput. >> It seems again that only with this code Guile can make another thread >> when the previous is finished (after touch)... >> >> The code is this one: >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/racket/logiki%2B.rkt#L3012 >> >> >> https://github.com/damien-mattei/library-FunctProg/blob/master/racket/logiki%2B.rkt#L3331 >> >> i do not think it could work faster if i cannot access to others CPUs >> like in OpenMP >> or this: >> https://en.wikipedia.org/wiki/Processor_affinity >> but it is not existing with Scheme. >> >> Best Regards, >> Damien >> >>> I believe another word for Racket's threads is "green threads". They are >>> like (more like?) Python threads, and do not run on another core. If you >>> start multiple Racket threads on the same Racket VM, they will run all on >>> the same core. No speedup to be expected, unless you would be waiting for >>> IO or something, if you did not use threads. 
Racket threads are concurrent, >>> but not parallel. >>> >>> I think Racket's threads' nature is the answer to why it is slower than >>> single threaded execution. >>> >>> Regards, >>> Zelphir >>> >>> -- >>> repositories: https://notabug.org/ZelphirKaltstahl >>> >>> ^ permalink raw reply [flat|nested] 41+ messages in thread
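Guile does provide the POSIX pieces asked about above: primitive-fork, pipe, waitpid and primitive-exit are all documented procedures, so a process-level fan-out can be sketched without dropping to C; the caveat is that the manual warns against forking after threads have been created. A minimal sketch (work-on-segment is a hypothetical placeholder; each child sends its result back over a pipe instead of using shared memory):

--8<---------------cut here---------------start------------->8---
;; hypothetical placeholder: seg is an inclusive (start . end) pair
(define (work-on-segment seg)
  (let loop ((i (car seg)) (acc 0))
    (if (> i (cdr seg)) acc (loop (1+ i) (+ acc (* i i))))))

(define (run-in-processes segments)
  (let ((children
         (map (lambda (seg)
                (let* ((p (pipe))                 ; (read-port . write-port)
                       (pid (primitive-fork)))
                  (if (zero? pid)
                      (begin                      ; child: compute, send, exit
                        (close-port (car p))
                        (write (work-on-segment seg) (cdr p))
                        (close-port (cdr p))
                        (primitive-exit 0))
                      (begin                      ; parent: keep the read end
                        (close-port (cdr p))
                        (cons pid (car p))))))
              segments)))
    ;; collect one result per child, then reap the child process
    (map (lambda (child)
           (let ((result (read (cdr child))))
             (close-port (cdr child))
             (waitpid (car child))
             result))
         children)))

;; (run-in-processes '((0 . 9999) (10000 . 19999) (20000 . 29999)))
--8<---------------cut here---------------end--------------->8---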
* Re: map-par slower than map 2022-10-12 20:20 ` Damien Mattei 2022-10-12 20:27 ` Damien Mattei @ 2022-10-12 21:44 ` Maxime Devos 1 sibling, 0 replies; 41+ messages in thread From: Maxime Devos @ 2022-10-12 21:44 UTC (permalink / raw) To: Damien Mattei; +Cc: guile-user, guile-devel [-- Attachment #1.1.1: Type: text/plain, Size: 1040 bytes --] On 12-10-2022 22:20, Damien Mattei wrote: > Mutex? i do not think code has situation where dead lock could happen, > it is a code about minimalising logic expressions, it uses minterms , > minterms set is a set of minterms :like this: I didn't mean deadlocks, I just meant a single mutex being taken by all the threads. Also, you seem to have a global hash table on line 1603 -- presumably there's a mutex in there (and even if not, likely there are CPU cache issues). > [...] I propose to minimise this complicated test case (doesn't need to be equivalent, just needs to demonstrate the 'map-par slower than map') and to change it to standard-ish Scheme instead of your Scheme+ variant. > On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be > <mailto:maximedevos@telenet.be>> wrote: > > [rest of copied previous message, without any comments] This is top-posting. For why not to top-post, https://wiki.lyx.org/FAQ/ListNetiquette has various information. Greetings, Maxime. [-- Attachment #1.1.2: OpenPGP public key --] [-- Type: application/pgp-keys, Size: 929 bytes --] [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 236 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: map-par slower than map 2022-10-12 17:19 map-par slower than map Damien Mattei 2022-10-12 18:45 ` Maxime Devos @ 2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library 2022-10-13 7:40 ` Damien Mattei 2022-10-13 10:44 ` Damien Mattei 1 sibling, 2 replies; 41+ messages in thread From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-12 21:55 UTC (permalink / raw) To: Damien Mattei, guile-user, guile-devel

On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> Hello,
> all is in the title, i test on a approximately 30000 element list , i got
> 9s with map and 3min 30s with par-map on exactly the same piece of
> code!?

I can only speculate here. But trying with a very simple example:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads) (statprof))
(statprof (lambda () (par-map 1+ (iota 300000))))
--8<---------------cut here---------------end--------------->8---

Performance is terrible. I don't know how par-map is implemented, but if it does 1-element static scheduling -- which it probably does because you pass a linked list and not a vector -- then yeah, you can be sure that thing will be very slow.

You're probably better off with dynamic scheduling with vectors. Here's a quick snippet I made for static scheduling but with vectors. Feel free to roll your own.

--8<---------------cut here---------------start------------->8---
(use-modules
 (srfi srfi-1)
 (ice-9 threads))

(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))
  (let* ((block-size (quotient (vector-length input) max-thread))
         (rest (remainder (vector-length input) max-thread))
         (output (make-vector (vector-length input) #f)))
    (when (not (zero? block-size))
      (let ((mtx (make-mutex))
            (cnd (make-condition-variable))
            (n 0))
        ;; one thread per block; each thread fills its own slice of OUTPUT
        (fold
         (lambda (scale output)
           (begin-thread
            (let lp ((i 0))
              (when (< i block-size)
                (let ((i (+ i (* scale block-size))))
                  (vector-set! output i (proc (vector-ref input i))))
                (lp (1+ i))))
            (with-mutex mtx
              (set! n (1+ n))
              (signal-condition-variable cnd)))
           output)
         output
         (iota max-thread))
        ;; wait until every worker thread has signalled completion
        (with-mutex mtx
          (while (< n max-thread)
            (wait-condition-variable cnd mtx)))))
    ;; handle the remaining elements (input length modulo max-thread) serially
    (let ((base (- (vector-length input) rest)))
      (let lp ((i 0))
        (when (< i rest)
          (let ((i (+ i base)))
            (vector-set! output i (proc (vector-ref input i))))
          (lp (1+ i)))))
    output))
--8<---------------cut here---------------end--------------->8---

--
Olivier Dion
oldiob.dev

^ permalink raw reply [flat|nested] 41+ messages in thread
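For reference, a usage sketch for the snippet above; the input has to be a vector, so a list built with iota must be converted first:

--8<---------------cut here---------------start------------->8---
;; usage sketch for par-map-vector above
(par-map-vector 1+ (list->vector (iota 300000)))
;; => a 300000-element vector #(1 2 3 ...)
--8<---------------cut here---------------end--------------->8---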
* Re: map-par slower than map
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
@ 2022-10-13  7:40   ` Damien Mattei
  2022-10-13  8:20     ` Damien Mattei
  2022-10-13  9:10     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 2 replies; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 7:40 UTC (permalink / raw)
  Cc: guile-user, guile-devel

OK, I think the problem comes both from my code and from Guile's par-map.
Obviously par-map can be slower on other code because of the nature of
lists, I think: it is hard to split a list into sublists, send them to
threads, and rebuild a single list afterwards. I had better use vectors.

As mentioned and then forgotten by me, I apologize: I use a hash table
which is a global variable, and a mutex can be set on it. There is no
deadlock here, but it slows the code down so much that it is dead for
speed, though not locked.

The examples given are good, but I do not want to write long and specific
code for //; // must be as simple as OpenMP directives (on arrays), not a
pain like the things I already did at work with GPUs in C or Fortran.
That is a nightmare and I do not want to have it in functional
programming.

OK, I will try to modify the code, but I do not think there is an easy
solution to replace the hash table: any structure I would use would be
global, and that is the problem (this function is not pure, it does a
side effect). In the end I do not think this code can be made //, but I
am not a specialist of //. I think this is a parallelization problem: how
can we deal with a global variable such as a hash table?

On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > Hello,
> > all is in the title, i test on a approximately 30000 element list , i got
> > 9s with map and 3min 30s with par-map on exactly the same piece of
> > code!?
>
> I can only speculate here.  [...]
>
> --
> Olivier Dion
> oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
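On the question of a global hash table in parallel code, the two usual shapes are sketched below with made-up names (`table`, `record!`, `tag-all!` are assumptions, not the thread's real procedures): either serialize every update behind a mutex, or keep the parallel phase pure and fill the table sequentially afterwards.

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads))

(define table (make-hash-table))
(define table-mutex (make-mutex))

;; Option 1: every worker takes the same lock around each update.
;; Correct, but the workers end up queueing on the mutex.
(define (record! key val)
  (with-mutex table-mutex
    (hash-set! table key val)))

;; Option 2: the parallel phase only computes (key . val) pairs;
;; a single thread writes them into the table afterwards.
(define (tag-all! pairs)
  (for-each (lambda (p) (hash-set! table (car p) (cdr p)))
            pairs))
--8<---------------cut here---------------end--------------->8---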
* Re: map-par slower than map
  2022-10-13  7:40 ` Damien Mattei
@ 2022-10-13  8:20   ` Damien Mattei
  2022-10-13  9:10   ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 0 replies; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 8:20 UTC (permalink / raw)
  To: guile-user, guile-devel

OK, 'parallelization code and hash table' in Google shows no real
solution. I have another idea: instead of
function-unify-two-minterms-and-tag, just unify the minterms in parallel
and tag them (which uses the hash table) later, outside the // region.
After that, if it is still too slow, I will drop lists for vectors. Doing
it step by step will show where the bottleneck is. Give me just a little
time and I will come back to the mailing lists with interesting results,
I hope....

On Thu, Oct 13, 2022 at 9:40 AM Damien Mattei <damien.mattei@gmail.com>
wrote:

> OK, I think the problem comes both from my code and from Guile's
> par-map. [...]
>
> On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
> wrote:
>
>> [...]
>>
>> --
>> Olivier Dion
>> oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
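The rough shape of that idea, with `unify-minterms` and `tag-unified!` standing in for the real procedures (they are assumptions here, not the actual definitions), is to keep the par-map side-effect free and move the hash-table work after it:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads))

;; Parallel phase: pure unification only, no global state touched.
(define unified (par-map unify-minterms minterms-set))

;; Sequential phase: the hash-table tagging, done by a single thread.
(for-each tag-unified! unified)
--8<---------------cut here---------------end--------------->8---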
* Re: map-par slower than map
  2022-10-13  7:40 ` Damien Mattei
  2022-10-13  8:20   ` Damien Mattei
@ 2022-10-13  9:10   ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 0 replies; 41+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 9:10 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> OK, I think the problem comes both from my code and from Guile's
> par-map. Obviously par-map can be slower on other code because of the
> nature of lists, I think: it is hard to split a list into sublists,
> send them to threads, and rebuild a single list afterwards. I had
> better use vectors.
> As mentioned and then forgotten by me, I apologize: I use a hash table
> which is a global variable, and a mutex can be set on it. There is no
> deadlock here, but it slows the code down so much that it is dead for
> speed, though not locked.

Your code is not the issue.  The fact that `(par-map 1+ (iota 10000000))`
is slow is the proof of that.

I think that you should use par-map when you want to do I/O or
subprocesses in parallel.  Fast computation on a huge number of items
should be done with vectors because of their random-access nature.
Lists are simply not the right data structure for parallelization.

> The examples given are good, but I do not want to write long and
> specific code for //; // must be as simple as OpenMP directives (on
> arrays), [...]

You're free to use my example of `par-map-vector` and tweak it to your
needs.  But maybe it's time for a `guile-parallel` library for fast,
high-level parallelism.  If others are interested in that, I can make
one.

--
Olivier Dion
oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
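If the surrounding program really wants lists, a thin wrapper around the vector version keeps the conversion cost in one place; the O(n) list/vector conversions are usually cheap next to the per-element work. This sketch assumes the `par-map-vector` from earlier in the thread has been loaded, and `par-map-via-vector` is a made-up name.

--8<---------------cut here---------------start------------->8---
(define (par-map-via-vector proc lst)
  ;; list -> vector, parallel map over the vector, vector -> list
  (vector->list (par-map-vector proc (list->vector lst))))
--8<---------------cut here---------------end--------------->8---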
* Re: map-par slower than map
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13  7:40   ` Damien Mattei
@ 2022-10-13 10:44   ` Damien Mattei
  2022-10-13 11:00     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 1 reply; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 10:44 UTC (permalink / raw)
  To: Olivier Dion, guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 2663 bytes --]

I am trying to use your code, but it seems there is a ')' mismatch
somewhere?

On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > Hello,
> > all is in the title, [...]
>
> [...]
>
> --
> Olivier Dion
> oldiob.dev

[-- Attachment #2: Type: text/html, Size: 3591 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 10:44   ` Damien Mattei
@ 2022-10-13 11:00     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  [not found]           ` <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com>
  0 siblings, 1 reply; 41+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 11:00 UTC (permalink / raw)
  To: Damien Mattei, guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> I am trying to use your code, but it seems there is a ')' mismatch
> somewhere?

Right, there's a missing ')' at the end to close the procedure, sorry
about that.

--
Olivier Dion
oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
[parent not found: <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com>]
* Fwd: map-par slower than map
  [not found] ` <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com>
@ 2022-10-13 11:57   ` Damien Mattei
  2022-10-13 12:36     ` Damien Mattei
  0 siblings, 1 reply; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 11:57 UTC (permalink / raw)
  To: guile-user, guile-devel

Sorry, google always does reply to a single author...

---------- Forwarded message ---------
From: Damien Mattei <damien.mattei@gmail.com>
Date: Thu, Oct 13, 2022 at 1:56 PM
Subject: Re: map-par slower than map
To: Olivier Dion <olivier.dion@polymtl.ca>

Ah, just at the end? I had noticed, but I get #unspecified? So my new
code is not good if your procedure is; I will check it, thanks.
Damien

On Thu, Oct 13, 2022 at 1:00 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > I am trying to use your code, but it seems there is a ')' mismatch
> > somewhere?
>
> Right, there's a missing ')' at the end to close the procedure, sorry
> about that.
>
> --
> Olivier Dion
> oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 11:57 ` Fwd: " Damien Mattei
@ 2022-10-13 12:36   ` Damien Mattei
  2022-10-13 12:41     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  0 siblings, 1 reply; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 12:36 UTC (permalink / raw)
  To: guile-user, guile-devel, Olivier Dion

The code did not work when the data length is smaller than the number of
CPUs (6 on my host); (iota 5) returns #<unspecified>:

scheme@(guile-user)> (use-modules (ice-9 threads))
scheme@(guile-user)> {v <+ (list->vector (iota 5))}
#(0 1 2 3 4)
scheme@(guile-user)> (par-map-vector sqrt v)
scheme@(guile-user)> {v <+ (list->vector (iota 20))}
#(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19)
scheme@(guile-user)> (par-map-vector sqrt v)
#(0 1 1.4142135623730951 1.7320508075688772 2 2.23606797749979 2.449489742783178 2.6457513110645907 2.8284271247461903 3 3.1622776601683795 3.3166247903554 3.4641016151377544 3.605551275463989 3.7416573867739413 3.872983346207417 4 4.123105625617661 4.242640687119285 4.358898943540674)
scheme@(guile-user)> (current-processor-count)
6
scheme@(guile-user)> {v <+ (list->vector (iota 6))}
#(0 1 2 3 4 5)
scheme@(guile-user)> (par-map-vector sqrt v)
#(0 1 1.4142135623730951 1.7320508075688772 2 2.23606797749979)

so I modified it this way:

(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))
  (if (< (vector-length input) max-thread)
      (list->vector (map proc (vector->list input))) ;; less data than threads or CPUs
      (let* ((block-size (quotient (vector-length input) max-thread))
             (rest (remainder (vector-length input) max-thread))
             (output (make-vector (vector-length input) #f)))
        (when (not (zero? block-size))
          (let ((mtx (make-mutex))
                (cnd (make-condition-variable))
                (n 0))
            (fold
             (lambda (scale output)
               (begin-thread
                (let lp ((i 0))
                  (when (< i block-size)
                    (let ((i (+ i (* scale block-size))))
                      (vector-set! output i (proc (vector-ref input i))))
                    (lp (1+ i))))
                (with-mutex mtx
                  (set! n (1+ n))
                  (signal-condition-variable cnd)))
               output)
             output
             (iota max-thread))
            (with-mutex mtx
              (while (not (< n max-thread))
                (wait-condition-variable cnd mtx))))
          (let ((base (- (vector-length input) rest)))
            (let lp ((i 0))
              (when (< i rest)
                (let ((i (+ i base)))
                  (vector-set! output i (proc (vector-ref input i))))
                (lp (1+ i)))))
          output))))

now it works, but it crashes randomly, and not even at the same stage; I
will continue to test and debug....

On Thu, Oct 13, 2022 at 1:57 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> Sorry, google always does reply to a single author...
> [...]

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 12:36   ` Damien Mattei
@ 2022-10-13 12:41     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13 13:43       ` Damien Mattei
  0 siblings, 1 reply; 41+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 12:41 UTC (permalink / raw)
  To: Damien Mattei, guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> The code did not work when the data length is smaller than the number
> of CPUs (6 on my host); (iota 5) returns #<unspecified>:

Yeah, sorry, I mis-indented the output and the rest.  Here's a version
that should work:
--8<---------------cut here---------------start------------->8---
(use-modules
 (srfi srfi-1)
 (ice-9 threads))

(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))

  (let* ((block-size (quotient (vector-length input) max-thread))
         (rest (remainder (vector-length input) max-thread))
         (output (make-vector (vector-length input) #f)))
    (when (not (zero? block-size))
      (let ((mtx (make-mutex))
            (cnd (make-condition-variable))
            (n 0))
        (fold
         (lambda (scale output)
           (begin-thread
            (let lp ((i 0))
              (when (< i block-size)
                (let ((i (+ i (* scale block-size))))
                  (vector-set! output i (proc (vector-ref input i))))
                (lp (1+ i))))
            (with-mutex mtx
              (set! n (1+ n))
              (signal-condition-variable cnd)))
           output)
         output
         (iota max-thread))
        (with-mutex mtx
          (while (not (< n max-thread))
            (wait-condition-variable cnd mtx)))))
    (let ((base (- (vector-length input) rest)))
      (let lp ((i 0))
        (when (< i rest)
          (let ((i (+ i base)))
            (vector-set! output i (proc (vector-ref input i))))
          (lp (1+ i)))))
    output))
--8<---------------cut here---------------end--------------->8---

--
Olivier Dion
oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 12:41     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
@ 2022-10-13 13:43       ` Damien Mattei
  2022-10-13 14:06         ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  0 siblings, 1 reply; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 13:43 UTC (permalink / raw)
  To: Olivier Dion; +Cc: guile-user, guile-devel

I do not see what has changed in your code? Really strange; even with bad
code, the moment it crashes should be the same. Sometimes it works,
sometimes it crashes or freezes....

On Thu, Oct 13, 2022 at 2:41 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > The code did not work when the data length is smaller than the number
> > of CPUs (6 on my host); (iota 5) returns #<unspecified>:
>
> Yeah, sorry, I mis-indented the output and the rest.  Here's a version
> that should work:
> [...]
>
> --
> Olivier Dion
> oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 13:43       ` Damien Mattei
@ 2022-10-13 14:06         ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13 14:10           ` Damien Mattei
  0 siblings, 1 reply; 41+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 14:06 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> I do not see what has changed in your code?

Just copy it.  Trust me, it has changed.

> Really strange; even with bad code, the moment it crashes should be the
> same. Sometimes it works, sometimes it crashes or freezes....

I don't get any of these problems.  Running with (iota 100000000) I
don't see any synchronization issue either.

--
Olivier Dion
oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 14:06         ` Olivier Dion via Developers list for Guile, the GNU extensibility library
@ 2022-10-13 14:10           ` Damien Mattei
  2022-10-13 14:21             ` Damien Mattei
  0 siblings, 1 reply; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 14:10 UTC (permalink / raw)
  To: Olivier Dion; +Cc: guile-user, guile-devel

On Thu, Oct 13, 2022 at 4:06 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > I do not see what has changed in your code?
>
> Just copy it.  Trust me, it has changed.

Really? :-) I read it 3 times, line by line. Yes, the end was cut; there
was still no ending ')'. I used it and it freezes, crashes, or works
randomly; the first times it works, but try 4 or 5 examples.... :-(

> > Really strange; even with bad code, the moment it crashes should be
> > the same. Sometimes it works, sometimes it crashes or freezes....
>
> I don't get any of these problems.  Running with (iota 100000000) I
> don't see any synchronization issue either.
>
> --
> Olivier Dion
> oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: map-par slower than map
  2022-10-13 14:10           ` Damien Mattei
@ 2022-10-13 14:21             ` Damien Mattei
  0 siblings, 0 replies; 41+ messages in thread
From: Damien Mattei @ 2022-10-13 14:21 UTC (permalink / raw)
  To: Olivier Dion; +Cc: guile-user, guile-devel

OK, I have the proof: this time (the second time) the process stopped
itself on your example ("processus stoppé" in French):

mattei@pc-mattei:~/Dropbox/git/library-FunctProg$ guile
GNU Guile 3.0.1
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (use-modules (srfi srfi-1) (ice-9 threads))
scheme@(guile-user)>
(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))
  (if (< (vector-length input) max-thread)
      (list->vector (map proc (vector->list input))) ;; less data than threads or CPUs
      (let* ((block-size (quotient (vector-length input) max-thread))
             (rest (remainder (vector-length input) max-thread))
             (output (make-vector (vector-length input) #f)))
        (when (not (zero? block-size))
          (let ((mtx (make-mutex))
                (cnd (make-condition-variable))
                (n 0))
            (fold
             (lambda (scale output)
               (begin-thread
                (let lp ((i 0))
                  (when (< i block-size)
                    (let ((i (+ i (* scale block-size))))
                      (vector-set! output i (proc (vector-ref input i))))
                    (lp (1+ i))))
                (with-mutex mtx
                  (set! n (1+ n))
                  (signal-condition-variable cnd)))
               output)
             output
             (iota max-thread))
            (with-mutex mtx
              (while (not (< n max-thread))
                (wait-condition-variable cnd mtx))))
          ;; (let ((base (- (vector-length input) rest)))
          ;;   (let lp ((i 0))
          ;;     (when (< i rest)
          ;;       (let ((i (+ i base)))
          ;;         (vector-set! output i (proc (vector-ref input i))))
          ;;       (lp (1+ i)))))
          output))))
scheme@(guile-user)> (define v (list->vector (iota 100000000) ))
scheme@(guile-user)> (define r (par-map-vector sqrt v))
scheme@(guile-user)> (define v (list->vector (iota 100000000) ))
scheme@(guile-user)> (define r (par-map-vector sqrt v))
Processus arrêté

The first time it is OK, but the second time Guile quits to the terminal.
It seems something makes the Guile system unstable after one run. I
tested it outside of my program and outside of my Scheme+, but my .guile
in this directory is like this:

; Guile config file

;; history
(use-modules (ice-9 readline)
             (ice-9 history)
             (srfi srfi-43) ;; vector
             ;; guile object oriented programming system
             (oop goops)
             (oop goops describe))

(activate-readline)
;;(disable-value-history!)

;; curly infix as in srfi-105
(read-enable 'curly-infix)

;; set current path in load path
(set! %load-path (reverse (cons "." (reverse %load-path))))
;; other solution is to put this in shell:
;; export GUILE_LOAD_PATH="...:."

On Thu, Oct 13, 2022 at 4:10 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> On Thu, Oct 13, 2022 at 4:06 PM Olivier Dion <olivier.dion@polymtl.ca>
> wrote:
>
>> [...]
>>
>> --
>> Olivier Dion
>> oldiob.dev

^ permalink raw reply	[flat|nested] 41+ messages in thread
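One detail worth noting when debugging the intermittent results: in the snippets quoted above, the wait loop's test `(not (< n max-thread))` is already false while `n` is 0, so the caller appears never to block on the condition variable and may return before the workers have finished writing. A simpler variant keeps the thread objects returned by `begin-thread` and joins them, which removes the mutex and condition-variable bookkeeping entirely. This is only a sketch, not the code the thread settled on, and `par-map-vector/join` is a made-up name; it also falls back to purely sequential work when the input is smaller than the thread count.

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads))

(define* (par-map-vector/join proc input
                              #:optional
                              (max-thread (current-processor-count)))
  (let* ((len        (vector-length input))
         (block-size (quotient len max-thread))
         (rest       (remainder len max-thread))
         (output     (make-vector len #f)))
    ;; One worker per block; keep the thread objects so we can join them.
    (let ((workers
           (map (lambda (scale)
                  (begin-thread
                   (let lp ((i 0))
                     (when (< i block-size)
                       (let ((j (+ i (* scale block-size))))
                         (vector-set! output j (proc (vector-ref input j))))
                       (lp (1+ i))))))
                (iota max-thread))))
      (for-each join-thread workers))
    ;; Leftover elements (the whole input when block-size is 0) are done
    ;; sequentially.
    (let ((base (- len rest)))
      (let lp ((i 0))
        (when (< i rest)
          (let ((j (+ i base)))
            (vector-set! output j (proc (vector-ref input j))))
          (lp (1+ i)))))
    output))
--8<---------------cut here---------------end--------------->8---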
end of thread, other threads:[~2022-11-13 8:23 UTC | newest] Thread overview: 41+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-10-12 17:19 map-par slower than map Damien Mattei 2022-10-12 18:45 ` Maxime Devos 2022-10-12 20:20 ` Damien Mattei 2022-10-12 20:27 ` Damien Mattei 2022-10-12 21:29 ` Zelphir Kaltstahl 2022-10-14 8:21 ` Damien Mattei 2022-10-14 8:38 ` Zelphir Kaltstahl 2022-10-17 13:17 ` Damien Mattei 2022-10-22 16:01 ` Damien Mattei 2022-10-23 1:06 ` Damien Mattei 2022-10-23 23:18 ` Zelphir Kaltstahl 2022-10-24 3:56 ` Keith Wright 2022-10-24 7:03 ` Zelphir Kaltstahl 2022-10-24 4:39 ` Damien Mattei 2022-10-25 9:07 ` Mikael Djurfeldt 2022-10-25 9:11 ` Mikael Djurfeldt 2022-10-25 14:09 ` Damien Mattei 2022-11-10 10:32 ` Damien Mattei 2022-11-10 10:41 ` Damien Mattei 2022-11-10 10:52 ` Zelphir Kaltstahl 2022-11-10 13:36 ` Damien Mattei 2022-11-10 17:07 ` Olivier Dion via General Guile related discussions 2022-11-11 10:26 ` Damien Mattei 2022-11-11 12:25 ` Zelphir Kaltstahl 2022-11-11 13:36 ` Damien Mattei 2022-11-11 13:37 ` Damien Mattei 2022-11-13 8:23 ` Damien Mattei 2022-10-12 21:44 ` Maxime Devos 2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library 2022-10-13 7:40 ` Damien Mattei 2022-10-13 8:20 ` Damien Mattei 2022-10-13 9:10 ` Olivier Dion via Developers list for Guile, the GNU extensibility library 2022-10-13 10:44 ` Damien Mattei 2022-10-13 11:00 ` Olivier Dion via Developers list for Guile, the GNU extensibility library [not found] ` <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com> 2022-10-13 11:57 ` Fwd: " Damien Mattei 2022-10-13 12:36 ` Damien Mattei 2022-10-13 12:41 ` Olivier Dion via Developers list for Guile, the GNU extensibility library 2022-10-13 13:43 ` Damien Mattei 2022-10-13 14:06 ` Olivier Dion via Developers list for Guile, the GNU extensibility library 2022-10-13 14:10 ` Damien Mattei 2022-10-13 14:21 ` Damien Mattei
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).