map-par slower than map

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

* map-par slower than map
@ 2022-10-12 17:19 Damien Mattei
  2022-10-12 18:45 ` Maxime Devos
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  0 siblings, 2 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-12 17:19 UTC (permalink / raw)
  To: guile-user, guile-devel

Hello,
all is in the title, i test on a approximately 30000 element list , i got
9s with map and 3min 30s with par-map on exactly the same piece of code!?

what i change in code is in comments:
{unified-minterms-set-1 <+ (map function-unify-minterms-list minterms-set)}
;;(par-map function-unify-minterms-list minterms-set)}

translated from Scheme+ to Scheme:
(define unified-minterms-set-1 (map function-unify-minterms-list
minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))

best regards,

Damien


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 17:19 map-par slower than map Damien Mattei
@ 2022-10-12 18:45 ` Maxime Devos
  2022-10-12 20:20   ` Damien Mattei
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 1 reply; 27+ messages in thread
From: Maxime Devos @ 2022-10-12 18:45 UTC (permalink / raw)
  To: Damien Mattei, guile-user, guile-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 606 bytes --]

On 12-10-2022 19:19, Damien Mattei wrote:
> Hello,
> all is in the title, i test on a approximately 30000 element list , i got
> 9s with map and 3min 30s with par-map on exactly the same piece of code!?
>
 > [...]
 >
> translated from Scheme+ to Scheme:
> (define unified-minterms-set-1 (map function-unify-minterms-list
> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))

The definition of 'function-unify-minterms-list' and 'minterms-set' is 
missing.  Without a test case, we can only speculate what's going on. 
(E.g., maybe it grabs a mutex).

Greetings,
Maxime.

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 18:45 ` Maxime Devos
@ 2022-10-12 20:20   ` Damien Mattei
  2022-10-12 20:27     ` Damien Mattei
  2022-10-12 21:44     ` Maxime Devos
  0 siblings, 2 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-12 20:20 UTC (permalink / raw)
  To: Maxime Devos; +Cc: guile-user, guile-devel

Mutex? i do not think code has situation where dead lock could happen, it
is a code about minimalising logic expressions, it uses minterms , minterms
set is a set of minterms :like this:

example:
((1 1 0) (1 1 1)) will be unified : (1 1 x)
because 0 and 1 are replaced by x
the minterms-set could have thousands of pair (mathematic not lisp)
minterms to unify
if there is more than one x as result there is no need to continue so i
escape with a continuation:

minterms-set =
{
((1 0 1 0) (1 1 1 0))
((1 0 1 0) (1 1 0 1))
((1 0 1 0) (1 0 1 1))
((1 0 1 0) (0 1 1 1))
((0 1 1 0) (1 1 1 0))
((0 1 1 0) (1 1 0 1))
((0 1 1 0) (1 0 1 1))
((0 1 1 0) (0 1 1 1))
((0 1 0 1) (1 1 1 0))
((0 1 0 1) (1 1 0 1))
((0 1 0 1) (1 0 1 1))
((0 1 0 1) (0 1 1 1))
((0 0 1 1) (1 1 1 0))
((0 0 1 1) (1 1 0 1))
((0 0 1 1) (1 0 1 1))
((0 0 1 1) (0 1 1 1))
}

replace { } by () to have the list, other example at another level :

minterms-set =
{
((0 x 1 1) (x 1 1 1))
((0 x 1 1) (1 x 1 1))
((0 x 1 1) (1 1 x 1))
((0 x 1 1) (1 1 1 x))
((x 0 1 1) (x 1 1 1))
((x 0 1 1) (1 x 1 1))
((x 0 1 1) (1 1 x 1))
((x 0 1 1) (1 1 1 x))
((0 1 x 1) (x 1 1 1))
((0 1 x 1) (1 x 1 1))
((0 1 x 1) (1 1 x 1))
((0 1 x 1) (1 1 1 x))
((x 1 0 1) (x 1 1 1))
((x 1 0 1) (1 x 1 1))
((x 1 0 1) (1 1 x 1))
((x 1 0 1) (1 1 1 x))
((0 1 1 x) (x 1 1 1))
((0 1 1 x) (1 x 1 1))
((0 1 1 x) (1 1 x 1))
((0 1 1 x) (1 1 1 x))
((x 1 1 0) (x 1 1 1))
((x 1 1 0) (1 x 1 1))
((x 1 1 0) (1 1 x 1))
((x 1 1 0) (1 1 1 x))
((1 0 1 x) (x 1 1 1))
((1 0 1 x) (1 x 1 1))
((1 0 1 x) (1 1 x 1))
((1 0 1 x) (1 1 1 x))
((1 x 1 0) (x 1 1 1))
((1 x 1 0) (1 x 1 1))
((1 x 1 0) (1 1 x 1))
((1 x 1 0) (1 1 1 x))
}

here we see some minterms are already unified

 it is not easy to read even by me because i wrote the code many years ago
and is split in many files, but here it is:

(par-map function-unify-minterms-list minterms-set)

{function-unify-minterms-list <+ (λ (L) (apply
function-unify-two-minterms-and-tag L))}

(define (unify-two-minterms mt1 mt2)
  (function-map-with-escaping-by-kontinuation2
 (macro-function-compare-2-bits-with-continuation) mt1 mt2))

;; (function-map-with-escaping-by-kontinuation2
(macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1 0) '(1
1 0 1 1 1 1 1))

;; list1 = (1 1 0 1 0 1 1 0)
;; more-lists = ((1 1 0 1 1 1 1 1))
;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
;; clozure = #<procedure:...gos-DrRacket.scm:195:11>

;; #f
;;
;;  (function-map-with-escaping-by-kontinuation2
(macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1 0) '(1
1 0 1 1 1 1 0))

;; list1 = (1 1 0 1 0 1 1 0)
;; more-lists = ((1 1 0 1 1 1 1 0))
;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
;; clozure = #<procedure:...gos-DrRacket.scm:195:11>

;; '(1 1 0 1 x 1 1 0)
(define (function-map-with-escaping-by-kontinuation2 clozure list1 .
more-lists)
  (call/cc (lambda (kontinuation)
    (let ((lists (cons list1 more-lists))
  (funct-continu ;; this function have the kontinuation in his environment
   (lambda (arg1 . more-args)
     (let ((args (cons arg1 more-args)))
(apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons
conti args))

         ;; (newline)
         ;; (dv list1)
         ;; (dv more-lists)
         ;; (dv lists)
 ;; (dv clozure)
         ;; (newline)

      (apply map funct-continu lists)))))

(define-syntax macro-function-compare-2-bits-with-continuation ;;
continuation version of macro-compare-2-bits
  ;; i need a macro because of external function to the clozure
  (syntax-rules ()
    ((_) (let ((cnt 0)) ;; counter
  (lambda (continuation b1 b2) (if (equal? b1 b2)
 b1
 (begin
   (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > 1, we
can have used a flag too instead of a counter
   (when (> cnt 1) (continuation #f)) ;; escaping with the continuation
   'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)

what could have caused mutex if in the latter definition above (let ((cnt
0)) ;; counter was defined at top level and shared by all threads!!! yes
there could have be some mutex  but this is not the case, i think even all
function are pure so why is it more slow with // than without?
Damien

On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be> wrote:

> On 12-10-2022 19:19, Damien Mattei wrote:
> > Hello,
> > all is in the title, i test on a approximately 30000 element list , i got
> > 9s with map and 3min 30s with par-map on exactly the same piece of code!?
> >
>  > [...]
>  >
> > translated from Scheme+ to Scheme:
> > (define unified-minterms-set-1 (map function-unify-minterms-list
> > minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
>
> The definition of 'function-unify-minterms-list' and 'minterms-set' is
> missing.  Without a test case, we can only speculate what's going on.
> (E.g., maybe it grabs a mutex).
>
> Greetings,
> Maxime.
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 20:20   ` Damien Mattei
@ 2022-10-12 20:27     ` Damien Mattei
  2022-10-12 21:29       ` Zelphir Kaltstahl
  2022-10-12 21:44     ` Maxime Devos
  1 sibling, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-12 20:27 UTC (permalink / raw)
  To: Maxime Devos; +Cc: guile-user, guile-devel

https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674

i commited the current version of code here with all files but it is
huge.... :-/

On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> Mutex? i do not think code has situation where dead lock could happen, it
> is a code about minimalising logic expressions, it uses minterms , minterms
> set is a set of minterms :like this:
>
> example:
> ((1 1 0) (1 1 1)) will be unified : (1 1 x)
> because 0 and 1 are replaced by x
> the minterms-set could have thousands of pair (mathematic not lisp)
> minterms to unify
> if there is more than one x as result there is no need to continue so i
> escape with a continuation:
>
> minterms-set =
> {
> ((1 0 1 0) (1 1 1 0))
> ((1 0 1 0) (1 1 0 1))
> ((1 0 1 0) (1 0 1 1))
> ((1 0 1 0) (0 1 1 1))
> ((0 1 1 0) (1 1 1 0))
> ((0 1 1 0) (1 1 0 1))
> ((0 1 1 0) (1 0 1 1))
> ((0 1 1 0) (0 1 1 1))
> ((0 1 0 1) (1 1 1 0))
> ((0 1 0 1) (1 1 0 1))
> ((0 1 0 1) (1 0 1 1))
> ((0 1 0 1) (0 1 1 1))
> ((0 0 1 1) (1 1 1 0))
> ((0 0 1 1) (1 1 0 1))
> ((0 0 1 1) (1 0 1 1))
> ((0 0 1 1) (0 1 1 1))
> }
>
> replace { } by () to have the list, other example at another level :
>
> minterms-set =
> {
> ((0 x 1 1) (x 1 1 1))
> ((0 x 1 1) (1 x 1 1))
> ((0 x 1 1) (1 1 x 1))
> ((0 x 1 1) (1 1 1 x))
> ((x 0 1 1) (x 1 1 1))
> ((x 0 1 1) (1 x 1 1))
> ((x 0 1 1) (1 1 x 1))
> ((x 0 1 1) (1 1 1 x))
> ((0 1 x 1) (x 1 1 1))
> ((0 1 x 1) (1 x 1 1))
> ((0 1 x 1) (1 1 x 1))
> ((0 1 x 1) (1 1 1 x))
> ((x 1 0 1) (x 1 1 1))
> ((x 1 0 1) (1 x 1 1))
> ((x 1 0 1) (1 1 x 1))
> ((x 1 0 1) (1 1 1 x))
> ((0 1 1 x) (x 1 1 1))
> ((0 1 1 x) (1 x 1 1))
> ((0 1 1 x) (1 1 x 1))
> ((0 1 1 x) (1 1 1 x))
> ((x 1 1 0) (x 1 1 1))
> ((x 1 1 0) (1 x 1 1))
> ((x 1 1 0) (1 1 x 1))
> ((x 1 1 0) (1 1 1 x))
> ((1 0 1 x) (x 1 1 1))
> ((1 0 1 x) (1 x 1 1))
> ((1 0 1 x) (1 1 x 1))
> ((1 0 1 x) (1 1 1 x))
> ((1 x 1 0) (x 1 1 1))
> ((1 x 1 0) (1 x 1 1))
> ((1 x 1 0) (1 1 x 1))
> ((1 x 1 0) (1 1 1 x))
> }
>
> here we see some minterms are already unified
>
>  it is not easy to read even by me because i wrote the code many years ago
> and is split in many files, but here it is:
>
> (par-map function-unify-minterms-list minterms-set)
>
> {function-unify-minterms-list <+ (λ (L) (apply
> function-unify-two-minterms-and-tag L))}
>
> (define (unify-two-minterms mt1 mt2)
>   (function-map-with-escaping-by-kontinuation2
>  (macro-function-compare-2-bits-with-continuation) mt1 mt2))
>
> ;; (function-map-with-escaping-by-kontinuation2
> (macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1 0) '(1
> 1 0 1 1 1 1 1))
>
> ;; list1 = (1 1 0 1 0 1 1 0)
> ;; more-lists = ((1 1 0 1 1 1 1 1))
> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>
> ;; #f
> ;;
> ;;  (function-map-with-escaping-by-kontinuation2
> (macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1 0) '(1
> 1 0 1 1 1 1 0))
>
> ;; list1 = (1 1 0 1 0 1 1 0)
> ;; more-lists = ((1 1 0 1 1 1 1 0))
> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>
> ;; '(1 1 0 1 x 1 1 0)
> (define (function-map-with-escaping-by-kontinuation2 clozure list1 .
> more-lists)
>   (call/cc (lambda (kontinuation)
>     (let ((lists (cons list1 more-lists))
>   (funct-continu ;; this function have the kontinuation in his environment
>    (lambda (arg1 . more-args)
>      (let ((args (cons arg1 more-args)))
> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons
> conti args))
>
>          ;; (newline)
>          ;; (dv list1)
>          ;; (dv more-lists)
>          ;; (dv lists)
>  ;; (dv clozure)
>          ;; (newline)
>
>       (apply map funct-continu lists)))))
>
> (define-syntax macro-function-compare-2-bits-with-continuation ;;
> continuation version of macro-compare-2-bits
>   ;; i need a macro because of external function to the clozure
>   (syntax-rules ()
>     ((_) (let ((cnt 0)) ;; counter
>   (lambda (continuation b1 b2) (if (equal? b1 b2)
>  b1
>  (begin
>    (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > 1, we
> can have used a flag too instead of a counter
>    (when (> cnt 1) (continuation #f)) ;; escaping with the continuation
>    'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)
>
> what could have caused mutex if in the latter definition above (let ((cnt
> 0)) ;; counter was defined at top level and shared by all threads!!! yes
> there could have be some mutex  but this is not the case, i think even all
> function are pure so why is it more slow with // than without?
> Damien
>
> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be>
> wrote:
>
>> On 12-10-2022 19:19, Damien Mattei wrote:
>> > Hello,
>> > all is in the title, i test on a approximately 30000 element list , i
>> got
>> > 9s with map and 3min 30s with par-map on exactly the same piece of
>> code!?
>> >
>>  > [...]
>>  >
>> > translated from Scheme+ to Scheme:
>> > (define unified-minterms-set-1 (map function-unify-minterms-list
>> > minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
>>
>> The definition of 'function-unify-minterms-list' and 'minterms-set' is
>> missing.  Without a test case, we can only speculate what's going on.
>> (E.g., maybe it grabs a mutex).
>>
>> Greetings,
>> Maxime.
>>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 20:27     ` Damien Mattei
@ 2022-10-12 21:29       ` Zelphir Kaltstahl
  2022-10-14  8:21         ` Damien Mattei
  2022-10-25  9:07         ` Mikael Djurfeldt
  0 siblings, 2 replies; 27+ messages in thread
From: Zelphir Kaltstahl @ 2022-10-12 21:29 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

Hi!

On 10/12/22 22:27, Damien Mattei wrote:
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674
>
> i commited the current version of code here with all files but it is
> huge.... :-/
>
> On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com>
> wrote:
>
>> Mutex? i do not think code has situation where dead lock could happen, it
>> is a code about minimalising logic expressions, it uses minterms , minterms
>> set is a set of minterms :like this:
>>
>> example:
>> ((1 1 0) (1 1 1)) will be unified : (1 1 x)
>> because 0 and 1 are replaced by x
>> the minterms-set could have thousands of pair (mathematic not lisp)
>> minterms to unify
>> if there is more than one x as result there is no need to continue so i
>> escape with a continuation:
>>
>> minterms-set =
>> {
>> ((1 0 1 0) (1 1 1 0))
>> ((1 0 1 0) (1 1 0 1))
>> ((1 0 1 0) (1 0 1 1))
>> ((1 0 1 0) (0 1 1 1))
>> ((0 1 1 0) (1 1 1 0))
>> ((0 1 1 0) (1 1 0 1))
>> ((0 1 1 0) (1 0 1 1))
>> ((0 1 1 0) (0 1 1 1))
>> ((0 1 0 1) (1 1 1 0))
>> ((0 1 0 1) (1 1 0 1))
>> ((0 1 0 1) (1 0 1 1))
>> ((0 1 0 1) (0 1 1 1))
>> ((0 0 1 1) (1 1 1 0))
>> ((0 0 1 1) (1 1 0 1))
>> ((0 0 1 1) (1 0 1 1))
>> ((0 0 1 1) (0 1 1 1))
>> }
>>
>> replace { } by () to have the list, other example at another level :
>>
>> minterms-set =
>> {
>> ((0 x 1 1) (x 1 1 1))
>> ((0 x 1 1) (1 x 1 1))
>> ((0 x 1 1) (1 1 x 1))
>> ((0 x 1 1) (1 1 1 x))
>> ((x 0 1 1) (x 1 1 1))
>> ((x 0 1 1) (1 x 1 1))
>> ((x 0 1 1) (1 1 x 1))
>> ((x 0 1 1) (1 1 1 x))
>> ((0 1 x 1) (x 1 1 1))
>> ((0 1 x 1) (1 x 1 1))
>> ((0 1 x 1) (1 1 x 1))
>> ((0 1 x 1) (1 1 1 x))
>> ((x 1 0 1) (x 1 1 1))
>> ((x 1 0 1) (1 x 1 1))
>> ((x 1 0 1) (1 1 x 1))
>> ((x 1 0 1) (1 1 1 x))
>> ((0 1 1 x) (x 1 1 1))
>> ((0 1 1 x) (1 x 1 1))
>> ((0 1 1 x) (1 1 x 1))
>> ((0 1 1 x) (1 1 1 x))
>> ((x 1 1 0) (x 1 1 1))
>> ((x 1 1 0) (1 x 1 1))
>> ((x 1 1 0) (1 1 x 1))
>> ((x 1 1 0) (1 1 1 x))
>> ((1 0 1 x) (x 1 1 1))
>> ((1 0 1 x) (1 x 1 1))
>> ((1 0 1 x) (1 1 x 1))
>> ((1 0 1 x) (1 1 1 x))
>> ((1 x 1 0) (x 1 1 1))
>> ((1 x 1 0) (1 x 1 1))
>> ((1 x 1 0) (1 1 x 1))
>> ((1 x 1 0) (1 1 1 x))
>> }
>>
>> here we see some minterms are already unified
>>
>>   it is not easy to read even by me because i wrote the code many years ago
>> and is split in many files, but here it is:
>>
>> (par-map function-unify-minterms-list minterms-set)
>>
>> {function-unify-minterms-list <+ (λ (L) (apply
>> function-unify-two-minterms-and-tag L))}
>>
>> (define (unify-two-minterms mt1 mt2)
>>    (function-map-with-escaping-by-kontinuation2
>>   (macro-function-compare-2-bits-with-continuation) mt1 mt2))
>>
>> ;; (function-map-with-escaping-by-kontinuation2
>> (macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1 0) '(1
>> 1 0 1 1 1 1 1))
>>
>> ;; list1 = (1 1 0 1 0 1 1 0)
>> ;; more-lists = ((1 1 0 1 1 1 1 1))
>> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
>> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>>
>> ;; #f
>> ;;
>> ;;  (function-map-with-escaping-by-kontinuation2
>> (macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1 0) '(1
>> 1 0 1 1 1 1 0))
>>
>> ;; list1 = (1 1 0 1 0 1 1 0)
>> ;; more-lists = ((1 1 0 1 1 1 1 0))
>> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
>> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>>
>> ;; '(1 1 0 1 x 1 1 0)
>> (define (function-map-with-escaping-by-kontinuation2 clozure list1 .
>> more-lists)
>>    (call/cc (lambda (kontinuation)
>>      (let ((lists (cons list1 more-lists))
>>    (funct-continu ;; this function have the kontinuation in his environment
>>     (lambda (arg1 . more-args)
>>       (let ((args (cons arg1 more-args)))
>> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons
>> conti args))
>>
>>           ;; (newline)
>>           ;; (dv list1)
>>           ;; (dv more-lists)
>>           ;; (dv lists)
>>   ;; (dv clozure)
>>           ;; (newline)
>>
>>        (apply map funct-continu lists)))))
>>
>> (define-syntax macro-function-compare-2-bits-with-continuation ;;
>> continuation version of macro-compare-2-bits
>>    ;; i need a macro because of external function to the clozure
>>    (syntax-rules ()
>>      ((_) (let ((cnt 0)) ;; counter
>>    (lambda (continuation b1 b2) (if (equal? b1 b2)
>>   b1
>>   (begin
>>     (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt > 1, we
>> can have used a flag too instead of a counter
>>     (when (> cnt 1) (continuation #f)) ;; escaping with the continuation
>>     'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)
>>
>> what could have caused mutex if in the latter definition above (let ((cnt
>> 0)) ;; counter was defined at top level and shared by all threads!!! yes
>> there could have be some mutex  but this is not the case, i think even all
>> function are pure so why is it more slow with // than without?
>> Damien
>>
>> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be>
>> wrote:
>>
>>> On 12-10-2022 19:19, Damien Mattei wrote:
>>>> Hello,
>>>> all is in the title, i test on a approximately 30000 element list , i
>>> got
>>>> 9s with map and 3min 30s with par-map on exactly the same piece of
>>> code!?
>>>   > [...]
>>>   >
>>>> translated from Scheme+ to Scheme:
>>>> (define unified-minterms-set-1 (map function-unify-minterms-list
>>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
>>> The definition of 'function-unify-minterms-list' and 'minterms-set' is
>>> missing.  Without a test case, we can only speculate what's going on.
>>> (E.g., maybe it grabs a mutex).
>>>
>>> Greetings,
>>> Maxime.
I don't want to scare anyone, just maybe warn about parallel map. I once tried 
to use Guile's parallel map function for a decision tree implementation 
(https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm), 
where each branch while learning the tree would call parallel map again for sub 
branches and so on. Somehow it made Guile crash (I don't have the error message 
any longer, but I did post about it on the mailing list back then.). I never 
figured out, what went wrong. All I had was pure function calls and math inside 
the thing that parallel map was supposed to run.

Ultimately I simply tried other parallelism constructs and when I switched to 
using futures instead, everything worked fine, no crashes, no errors.

Since that time, I did not use parallel map and instead used futures. Recently I 
made a parallelization thing for solving exercises of Project Euler using 
multiple cores, so that some solutions are calculated faster. Maybe this can 
help or can be adapted to another use case:

https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30

It expects ranges of things, which are called `segments` in the code. Usually 
ranges of numbers for Project Euler things. Here is the code to split a range 
into segments:

https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm

(Check any solution using it for an example.)

So this might be a bit too specific for general parallel things, but I guess one 
could change the way futures are used in `run-in-parallel`, to fit any other 
purpose.

Best regards,
Zelphir

-- 
repositories: https://notabug.org/ZelphirKaltstahl




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 20:20   ` Damien Mattei
  2022-10-12 20:27     ` Damien Mattei
@ 2022-10-12 21:44     ` Maxime Devos
  1 sibling, 0 replies; 27+ messages in thread
From: Maxime Devos @ 2022-10-12 21:44 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

[-- Attachment #1.1.1: Type: text/plain, Size: 1040 bytes --]

On 12-10-2022 22:20, Damien Mattei wrote:
> Mutex? i do not think code has situation where dead lock could happen, 
> it is a code about minimalising logic expressions, it uses minterms , 
> minterms set is a set of minterms :like this:

I didn't mean deadlocks, I just meant a single mutex being taken by all 
the threads.

Also, you seem to have a global hash table on line 1603 -- presumably 
there's a mutex in there (and even if not, likely there are CPU cache 
issues).

> [...]

I propose to minimise this complicated test case (doesn't need to be 
equivalent, just needs to demonstrate the 'map-par slower than map') and 
to change it to standard-ish Scheme instead of your Scheme+ variant.

> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be 
> <mailto:maximedevos@telenet.be>> wrote:
>
> [rest of copied previous message, without any comments]

This is top-posting.  For why not to top-post, 
https://wiki.lyx.org/FAQ/ListNetiquette has various information.

Greetings,
Maxime.

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 17:19 map-par slower than map Damien Mattei
  2022-10-12 18:45 ` Maxime Devos
@ 2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13  7:40   ` Damien Mattei
  2022-10-13 10:44   ` Damien Mattei
  1 sibling, 2 replies; 27+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-12 21:55 UTC (permalink / raw)
  To: Damien Mattei, guile-user, guile-devel

On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> Hello,
> all is in the title, i test on a approximately 30000 element list , i got
> 9s with map and 3min 30s with par-map on exactly the same piece of
> code!?

I can only speculate here.  But trying with a very simple example here:
--8<---------------cut here---------------start------------->8---
(use-modules (statprof))
(statprof (lambda () (par-map 1+ (iota 300000))))
--8<---------------cut here---------------end--------------->8---

Performance are terrible.  I don't know how par-map is implemented, but
if it does 1 element static scheduling -- which it probably does because
you pass a linked list and not a vector -- then yeah you can assure that
thing will be very slow.

You're probably better off with dynamic scheduling with vectors.  Here's
a quick snippet I made for static scheduling but with vectors.  Feel
free to roll your own.

--8<---------------cut here---------------start------------->8---
(use-modules
 (srfi srfi-1)
 (ice-9 threads))

(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))

  (let* ((block-size (quotient (vector-length input) max-thread))
         (rest (remainder (vector-length input) max-thread))
         (output (make-vector (vector-length input) #f)))
    (when (not (zero? block-size))
      (let ((mtx (make-mutex))
            (cnd (make-condition-variable))
            (n 0))
        (fold
         (lambda (scale output)
           (begin-thread
            (let lp ((i 0))
              (when (< i block-size)
                (let ((i (+ i (* scale block-size))))
                  (vector-set! output i (proc (vector-ref input i))))
                (lp (1+ i))))
            (with-mutex mtx
              (set! n (1+ n))
              (signal-condition-variable cnd)))
           output)
         output
         (iota max-thread))
        (with-mutex mtx
          (while (not (< n max-thread))
            (wait-condition-variable cnd mtx))))
    (let ((base (- (vector-length input) rest)))
      (let lp ((i 0))
        (when (< i rest)
          (let ((i (+ i base)))
            (vector-set! output i (proc (vector-ref input i))))
          (lp (1+ i)))))
    output))
--8<---------------cut here---------------end--------------->8---

-- 
Olivier Dion
oldiob.dev



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
@ 2022-10-13  7:40   ` Damien Mattei
  2022-10-13  8:20     ` Damien Mattei
  2022-10-13  9:10     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13 10:44   ` Damien Mattei
  1 sibling, 2 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-13  7:40 UTC (permalink / raw)
  Cc: guile-user, guile-devel

ok , i think the problem comes both from my code and from guile parmap so.
Obviously parmap could be slower on other codes because of the nature of
list i think, it is hard to split a list in sublist and send them to thread
and after redo a single list, i better use vector.
As mentioned and forget by me, i apologize, i use an hash table which is a
global variable and mutex can be set on it , no dead lock , here but it
slow down the code than it is dead for speed , but not locked.

The examples given are good but i do not want to write long and specific
code for //, for // must a ssimple as OpenMP directives (on arrays) not be
a pain in the ass like things i already did at work with GPUs in C or
Fortran,that's nightmare and i do not want to have this in functional
programming.

ok i will try to modify the code but i do not think there is easy solution
to replace the hash table, any structure i would use would be global and
the problem (this function is not pure, it does a side-effect),at the end i
do not think this code could be // but i'm not a specialist of //. I think
this is a parallelization problem, how can we deal with a global variable
such as an hash table?

On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > Hello,
> > all is in the title, i test on a approximately 30000 element list , i got
> > 9s with map and 3min 30s with par-map on exactly the same piece of
> > code!?
>
> I can only speculate here.  But trying with a very simple example here:
> --8<---------------cut here---------------start------------->8---
> (use-modules (statprof))
> (statprof (lambda () (par-map 1+ (iota 300000))))
> --8<---------------cut here---------------end--------------->8---
>
> Performance are terrible.  I don't know how par-map is implemented, but
> if it does 1 element static scheduling -- which it probably does because
> you pass a linked list and not a vector -- then yeah you can assure that
> thing will be very slow.
>
> You're probably better off with dynamic scheduling with vectors.  Here's
> a quick snippet I made for static scheduling but with vectors.  Feel
> free to roll your own.
>
> --8<---------------cut here---------------start------------->8---
> (use-modules
>  (srfi srfi-1)
>  (ice-9 threads))
>
> (define* (par-map-vector proc input
>                          #:optional
>                          (max-thread (current-processor-count)))
>
>   (let* ((block-size (quotient (vector-length input) max-thread))
>          (rest (remainder (vector-length input) max-thread))
>          (output (make-vector (vector-length input) #f)))
>     (when (not (zero? block-size))
>       (let ((mtx (make-mutex))
>             (cnd (make-condition-variable))
>             (n 0))
>         (fold
>          (lambda (scale output)
>            (begin-thread
>             (let lp ((i 0))
>               (when (< i block-size)
>                 (let ((i (+ i (* scale block-size))))
>                   (vector-set! output i (proc (vector-ref input i))))
>                 (lp (1+ i))))
>             (with-mutex mtx
>               (set! n (1+ n))
>               (signal-condition-variable cnd)))
>            output)
>          output
>          (iota max-thread))
>         (with-mutex mtx
>           (while (not (< n max-thread))
>             (wait-condition-variable cnd mtx))))
>     (let ((base (- (vector-length input) rest)))
>       (let lp ((i 0))
>         (when (< i rest)
>           (let ((i (+ i base)))
>             (vector-set! output i (proc (vector-ref input i))))
>           (lp (1+ i)))))
>     output))
> --8<---------------cut here---------------end--------------->8---
>
> --
> Olivier Dion
> oldiob.dev
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13  7:40   ` Damien Mattei
@ 2022-10-13  8:20     ` Damien Mattei
  2022-10-13  9:10     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 0 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-13  8:20 UTC (permalink / raw)
  To: guile-user, guile-devel

ok 'parallelization code and hash table' in google show no real solution, i
have another idea,instead of function-unify-two-minterms-and-tag ,just
unify the minterms and tag them (which use hash table) later out of the //
region, after if it still sucks i will drop list for vectors, making it
step by step will show where is the bottleneck, let me just a few time and
i will go back to the mailing lists with interesting results i hope....

On Thu, Oct 13, 2022 at 9:40 AM Damien Mattei <damien.mattei@gmail.com>
wrote:

> ok , i think the problem comes both from my code and from guile parmap so.
> Obviously parmap could be slower on other codes because of the nature of
> list i think, it is hard to split a list in sublist and send them to thread
> and after redo a single list, i better use vector.
> As mentioned and forget by me, i apologize, i use an hash table which is a
> global variable and mutex can be set on it , no dead lock , here but it
> slow down the code than it is dead for speed , but not locked.
>
> The examples given are good but i do not want to write long and specific
> code for //, for // must a ssimple as OpenMP directives (on arrays) not be
> a pain in the ass like things i already did at work with GPUs in C or
> Fortran,that's nightmare and i do not want to have this in functional
> programming.
>
> ok i will try to modify the code but i do not think there is easy solution
> to replace the hash table, any structure i would use would be global and
> the problem (this function is not pure, it does a side-effect),at the end i
> do not think this code could be // but i'm not a specialist of //. I think
> this is a parallelization problem, how can we deal with a global variable
> such as an hash table?
>
> On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
> wrote:
>
>> On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
>> > Hello,
>> > all is in the title, i test on a approximately 30000 element list , i
>> got
>> > 9s with map and 3min 30s with par-map on exactly the same piece of
>> > code!?
>>
>> I can only speculate here.  But trying with a very simple example here:
>> --8<---------------cut here---------------start------------->8---
>> (use-modules (statprof))
>> (statprof (lambda () (par-map 1+ (iota 300000))))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Performance are terrible.  I don't know how par-map is implemented, but
>> if it does 1 element static scheduling -- which it probably does because
>> you pass a linked list and not a vector -- then yeah you can assure that
>> thing will be very slow.
>>
>> You're probably better off with dynamic scheduling with vectors.  Here's
>> a quick snippet I made for static scheduling but with vectors.  Feel
>> free to roll your own.
>>
>> --8<---------------cut here---------------start------------->8---
>> (use-modules
>>  (srfi srfi-1)
>>  (ice-9 threads))
>>
>> (define* (par-map-vector proc input
>>                          #:optional
>>                          (max-thread (current-processor-count)))
>>
>>   (let* ((block-size (quotient (vector-length input) max-thread))
>>          (rest (remainder (vector-length input) max-thread))
>>          (output (make-vector (vector-length input) #f)))
>>     (when (not (zero? block-size))
>>       (let ((mtx (make-mutex))
>>             (cnd (make-condition-variable))
>>             (n 0))
>>         (fold
>>          (lambda (scale output)
>>            (begin-thread
>>             (let lp ((i 0))
>>               (when (< i block-size)
>>                 (let ((i (+ i (* scale block-size))))
>>                   (vector-set! output i (proc (vector-ref input i))))
>>                 (lp (1+ i))))
>>             (with-mutex mtx
>>               (set! n (1+ n))
>>               (signal-condition-variable cnd)))
>>            output)
>>          output
>>          (iota max-thread))
>>         (with-mutex mtx
>>           (while (not (< n max-thread))
>>             (wait-condition-variable cnd mtx))))
>>     (let ((base (- (vector-length input) rest)))
>>       (let lp ((i 0))
>>         (when (< i rest)
>>           (let ((i (+ i base)))
>>             (vector-set! output i (proc (vector-ref input i))))
>>           (lp (1+ i)))))
>>     output))
>> --8<---------------cut here---------------end--------------->8---
>>
>> --
>> Olivier Dion
>> oldiob.dev
>>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13  7:40   ` Damien Mattei
  2022-10-13  8:20     ` Damien Mattei
@ 2022-10-13  9:10     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 0 replies; 27+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13  9:10 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> ok , i think the problem comes both from my code and from guile parmap so.
> Obviously parmap could be slower on other codes because of the nature of
> list i think, it is hard to split a list in sublist and send them to thread
> and after redo a single list, i better use vector.
> As mentioned and forget by me, i apologize, i use an hash table which is a
> global variable and mutex can be set on it , no dead lock , here but it
> slow down the code than it is dead for speed , but not locked.

Your code is not the issue.  The fact that `(par-map 1+ (iota
10000000))` is slow is the proof of that.  I think that you should use
par-map when you want to do I/O/subprocesses in parallel.  Fast
computation on a huge number of items should be done with vector because
of its random access nature.  List are simply no the good data structure
for parallelization.

> The examples given are good but i do not want to write long and specific
> code for //, for // must a ssimple as OpenMP directives (on arrays) not be
> a pain in the ass like things i already did at work with GPUs in C or
> Fortran,that's nightmare and i do not want to have this in functional
> programming.

You're free to use my example of `par-map-vector` and tweak it to your needs.
But maybe it's time for a `guile-parallel` for fast high level
parallel.  If others are interested in that, I can make one.

-- 
Olivier Dion
oldiob.dev



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13  7:40   ` Damien Mattei
@ 2022-10-13 10:44   ` Damien Mattei
  2022-10-13 11:00     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  1 sibling, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-13 10:44 UTC (permalink / raw)
  To: Olivier Dion, guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 2663 bytes --]

i trying to use your code but it seems there is a ) mismatch somewhere?

On Wed, Oct 12, 2022 at 11:55 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Wed, 12 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > Hello,
> > all is in the title, i test on a approximately 30000 element list , i got
> > 9s with map and 3min 30s with par-map on exactly the same piece of
> > code!?
>
> I can only speculate here.  But trying with a very simple example here:
> --8<---------------cut here---------------start------------->8---
> (use-modules (statprof))
> (statprof (lambda () (par-map 1+ (iota 300000))))
> --8<---------------cut here---------------end--------------->8---
>
> Performance are terrible.  I don't know how par-map is implemented, but
> if it does 1 element static scheduling -- which it probably does because
> you pass a linked list and not a vector -- then yeah you can assure that
> thing will be very slow.
>
> You're probably better off with dynamic scheduling with vectors.  Here's
> a quick snippet I made for static scheduling but with vectors.  Feel
> free to roll your own.
>
> --8<---------------cut here---------------start------------->8---
> (use-modules
>  (srfi srfi-1)
>  (ice-9 threads))
>
> (define* (par-map-vector proc input
>                          #:optional
>                          (max-thread (current-processor-count)))
>
>   (let* ((block-size (quotient (vector-length input) max-thread))
>          (rest (remainder (vector-length input) max-thread))
>          (output (make-vector (vector-length input) #f)))
>     (when (not (zero? block-size))
>       (let ((mtx (make-mutex))
>             (cnd (make-condition-variable))
>             (n 0))
>         (fold
>          (lambda (scale output)
>            (begin-thread
>             (let lp ((i 0))
>               (when (< i block-size)
>                 (let ((i (+ i (* scale block-size))))
>                   (vector-set! output i (proc (vector-ref input i))))
>                 (lp (1+ i))))
>             (with-mutex mtx
>               (set! n (1+ n))
>               (signal-condition-variable cnd)))
>            output)
>          output
>          (iota max-thread))
>         (with-mutex mtx
>           (while (not (< n max-thread))
>             (wait-condition-variable cnd mtx))))
>     (let ((base (- (vector-length input) rest)))
>       (let lp ((i 0))
>         (when (< i rest)
>           (let ((i (+ i base)))
>             (vector-set! output i (proc (vector-ref input i))))
>           (lp (1+ i)))))
>     output))
> --8<---------------cut here---------------end--------------->8---
>
> --
> Olivier Dion
> oldiob.dev
>

[-- Attachment #2: Type: text/html, Size: 3591 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 10:44   ` Damien Mattei
@ 2022-10-13 11:00     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
       [not found]       ` <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com>
  0 siblings, 1 reply; 27+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 11:00 UTC (permalink / raw)
  To: Damien Mattei, guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> i trying to use your code but it seems there is a ) mismatch
> somewhere?

Right there's a missing ')' a the end to close the procedure, sorry
about that.

-- 
Olivier Dion
oldiob.dev



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Fwd: map-par slower than map
       [not found]       ` <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com>
@ 2022-10-13 11:57         ` Damien Mattei
  2022-10-13 12:36           ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-13 11:57 UTC (permalink / raw)
  To: guile-user, guile-devel

sorry google always do reply to single author...

---------- Forwarded message ---------
From: Damien Mattei <damien.mattei@gmail.com>
Date: Thu, Oct 13, 2022 at 1:56 PM
Subject: Re: map-par slower than map
To: Olivier Dion <olivier.dion@polymtl.ca>


ah just at the end? i had noticed, but i get #unspecifeid ? so my new code
is not good if your procedure is, i will check it,thanks.
Damien

On Thu, Oct 13, 2022 at 1:00 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > i trying to use your code but it seems there is a ) mismatch
> > somewhere?
>
> Right there's a missing ')' a the end to close the procedure, sorry
> about that.
>
> --
> Olivier Dion
> oldiob.dev
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 11:57         ` Fwd: " Damien Mattei
@ 2022-10-13 12:36           ` Damien Mattei
  2022-10-13 12:41             ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  0 siblings, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-13 12:36 UTC (permalink / raw)
  To: guile-user, guile-devel, Olivier Dion

the code did not worked when data length were more little than number of
cpus (6 on my host) (iota 5) returns #unsepcified:

scheme@(guile-user)> (use-modules (ice-9 threads))
scheme@(guile-user)> {v <+ (list->vector (iota 5))}
#(0 1 2 3 4)
scheme@(guile-user)> (par-map-vector sqrt v)
scheme@(guile-user)> {v <+ (list->vector (iota 20))}
#(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19)
scheme@(guile-user)> (par-map-vector sqrt v)
#(0 1 1.4142135623730951 1.7320508075688772 2 2.23606797749979
2.449489742783178 2.6457513110645907 2.8284271247461903 3
3.1622776601683795 3.3166247903554 3.4641016151377544 3.605551275463989
3.7416573867739413 3.872983346207417 4 4.123105625617661 4.242640687119285
4.358898943540674)
scheme@(guile-user)> (current-processor-count)
6
scheme@(guile-user)> {v <+ (list->vector (iota 6))}
#(0 1 2 3 4 5)
scheme@(guile-user)> (par-map-vector sqrt v)
#(0 1 1.4142135623730951 1.7320508075688772 2 2.23606797749979)

so i modify it this way:

(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))

  (if (< (vector-length input) max-thread)

      (list->vector (map proc (vector->list input))) ;; less data than
threads or CPUs

      (let* ((block-size (quotient (vector-length input) max-thread))
    (rest (remainder (vector-length input) max-thread))
    (output (make-vector (vector-length input) #f)))

(when (not (zero? block-size))

 (let ((mtx (make-mutex))
(cnd (make-condition-variable))
(n 0))

   (fold

    (lambda (scale output)

      (begin-thread

(let lp ((i 0))
 (when (< i block-size)
   (let ((i (+ i (* scale block-size))))
     (vector-set! output i (proc (vector-ref input i))))
   (lp (1+ i))))

(with-mutex mtx
   (set! n (1+ n))
   (signal-condition-variable cnd)))
      output)

    output
    (iota max-thread))

   (with-mutex mtx
(while (not (< n max-thread))
      (wait-condition-variable cnd mtx))))

 (let ((base (- (vector-length input) rest)))
   (let lp ((i 0))
     (when (< i rest)
(let ((i (+ i base)))
 (vector-set! output i (proc (vector-ref input i))))
(lp (1+ i)))))

 output))))

now it works but crash randomly, not even at the same stage, i continue to
test and debug....


On Thu, Oct 13, 2022 at 1:57 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> sorry google always do reply to single author...
>
> ---------- Forwarded message ---------
> From: Damien Mattei <damien.mattei@gmail.com>
> Date: Thu, Oct 13, 2022 at 1:56 PM
> Subject: Re: map-par slower than map
> To: Olivier Dion <olivier.dion@polymtl.ca>
>
>
> ah just at the end? i had noticed, but i get #unspecifeid ? so my new code
> is not good if your procedure is, i will check it,thanks.
> Damien
>
> On Thu, Oct 13, 2022 at 1:00 PM Olivier Dion <olivier.dion@polymtl.ca>
> wrote:
>
>> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
>> > i trying to use your code but it seems there is a ) mismatch
>> > somewhere?
>>
>> Right there's a missing ')' a the end to close the procedure, sorry
>> about that.
>>
>> --
>> Olivier Dion
>> oldiob.dev
>>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 12:36           ` Damien Mattei
@ 2022-10-13 12:41             ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13 13:43               ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 12:41 UTC (permalink / raw)
  To: Damien Mattei, guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> the code did not worked when data length were more little than number of
> cpus (6 on my host) (iota 5) returns #unsepcified:

Yeah sorry I miss indended the output and the rest.  Here's a version
that should work:
--8<---------------cut here---------------start------------->8---
(use-modules
 (srfi srfi-1)
 (ice-9 threads))

(define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))

  (let* ((block-size (quotient (vector-length input) max-thread))
         (rest (remainder (vector-length input) max-thread))
         (output (make-vector (vector-length input) #f)))
    (when (not (zero? block-size))
      (let ((mtx (make-mutex))
            (cnd (make-condition-variable))
            (n 0))
        (fold
         (lambda (scale output)
           (begin-thread
            (let lp ((i 0))
              (when (< i block-size)
                (let ((i (+ i (* scale block-size))))
                  (vector-set! output i (proc (vector-ref input i))))
                (lp (1+ i))))
            (with-mutex mtx
              (set! n (1+ n))
              (signal-condition-variable cnd)))
           output)
         output
         (iota max-thread))
        (with-mutex mtx
          (while (not (< n max-thread))
            (wait-condition-variable cnd mtx)))))
    (let ((base (- (vector-length input) rest)))
      (let lp ((i 0))
        (when (< i rest)
          (let ((i (+ i base)))
            (vector-set! output i (proc (vector-ref input i))))
          (lp (1+ i)))))
    output))
--8<---------------cut here---------------end--------------->8---

-- 
Olivier Dion
oldiob.dev



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 12:41             ` Olivier Dion via Developers list for Guile, the GNU extensibility library
@ 2022-10-13 13:43               ` Damien Mattei
  2022-10-13 14:06                 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  0 siblings, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-13 13:43 UTC (permalink / raw)
  To: Olivier Dion; +Cc: guile-user, guile-devel

i do not see what has changed in your code ?
really strange,even with bad code the moment it crash should be the same,
sometimes works,crash or freeze....

On Thu, Oct 13, 2022 at 2:41 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > the code did not worked when data length were more little than number of
> > cpus (6 on my host) (iota 5) returns #unsepcified:
>
> Yeah sorry I miss indended the output and the rest.  Here's a version
> that should work:
> --8<---------------cut here---------------start------------->8---
> (use-modules
>  (srfi srfi-1)
>  (ice-9 threads))
>
> (define* (par-map-vector proc input
>                          #:optional
>                          (max-thread (current-processor-count)))
>
>   (let* ((block-size (quotient (vector-length input) max-thread))
>          (rest (remainder (vector-length input) max-thread))
>          (output (make-vector (vector-length input) #f)))
>     (when (not (zero? block-size))
>       (let ((mtx (make-mutex))
>             (cnd (make-condition-variable))
>             (n 0))
>         (fold
>          (lambda (scale output)
>            (begin-thread
>             (let lp ((i 0))
>               (when (< i block-size)
>                 (let ((i (+ i (* scale block-size))))
>                   (vector-set! output i (proc (vector-ref input i))))
>                 (lp (1+ i))))
>             (with-mutex mtx
>               (set! n (1+ n))
>               (signal-condition-variable cnd)))
>            output)
>          output
>          (iota max-thread))
>         (with-mutex mtx
>           (while (not (< n max-thread))
>             (wait-condition-variable cnd mtx)))))
>     (let ((base (- (vector-length input) rest)))
>       (let lp ((i 0))
>         (when (< i rest)
>           (let ((i (+ i base)))
>             (vector-set! output i (proc (vector-ref input i))))
>           (lp (1+ i)))))
>     output))
> --8<---------------cut here---------------end--------------->8---
>
> --
> Olivier Dion
> oldiob.dev
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 13:43               ` Damien Mattei
@ 2022-10-13 14:06                 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
  2022-10-13 14:10                   ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Olivier Dion via Developers list for Guile, the GNU extensibility library @ 2022-10-13 14:06 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> i do not see what has changed in your code ?

Just copy it.  Trust me it has changed.

> really strange,even with bad code the moment it crash should be the same,
> sometimes works,crash or freeze....

I don't get any of these problem .. running with (iota 100000000)  I
don't see any synchornization issue also.

-- 
Olivier Dion
oldiob.dev



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 14:06                 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
@ 2022-10-13 14:10                   ` Damien Mattei
  2022-10-13 14:21                     ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-13 14:10 UTC (permalink / raw)
  To: Olivier Dion; +Cc: guile-user, guile-devel

On Thu, Oct 13, 2022 at 4:06 PM Olivier Dion <olivier.dion@polymtl.ca>
wrote:

> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
> > i do not see what has changed in your code ?
>
> Just copy it.  Trust me it has changed.
>
really ? :-) i read it 3 times line by line, yes the end is cut ,still no
ending ) and i use it and freeze,crash or ok randomly, the first times it
works but try 4 or 5 examples.... :-(

>
> > really strange,even with bad code the moment it crash should be the same,
> > sometimes works,crash or freeze....
>
> I don't get any of these problem .. running with (iota 100000000)  I
> don't see any synchornization issue also.
>
> --
> Olivier Dion
> oldiob.dev
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-13 14:10                   ` Damien Mattei
@ 2022-10-13 14:21                     ` Damien Mattei
  0 siblings, 0 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-13 14:21 UTC (permalink / raw)
  To: Olivier Dion; +Cc: guile-user, guile-devel

ok i have the proof, this time (second time) the process stopped itself on
your example ("processus stoppé " in french) :

mattei@pc-mattei:~/Dropbox/git/library-FunctProg$ guile
GNU Guile 3.0.1
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (use-modules (srfi srfi-1) (ice-9 threads))
scheme@(guile-user)> (define* (par-map-vector proc input
                         #:optional
                         (max-thread (current-processor-count)))

  (if (< (vector-length input) max-thread)

      (list->vector (map proc (vector->list input))) ;; less data than
threads or CPUs

      (let* ((block-size (quotient (vector-length input) max-thread))
             (rest (remainder (vector-length input) max-thread))
             (output (make-vector (vector-length input) #f)))

        (when (not (zero? block-size))

          (let ((mtx (make-mutex))
                (cnd (make-condition-variable))
                (n 0))

            (fold

             (lambda (scale output)

               (begin-thread

                (let lp ((i 0))
                  (when (< i block-size)
                    (let ((i (+ i (* scale block-size))))
                      (vector-set! output i (proc (vector-ref input i))))
                    (lp (1+ i))))

                (with-mutex mtx
                            (set! n (1+ n))
                            (signal-condition-variable cnd)))
               output)

             output
             (iota max-thread))

            (with-mutex mtx
                        (while (not (< n max-thread))
                               (wait-condition-variable cnd mtx))))

          ;; (let ((base (- (vector-length input) rest)))
          ;;   (let lp ((i 0))
          ;;     (when (< i rest)
          ;;    (let ((i (+ i base)))
          ;;      (vector-set! output i (proc (vector-ref input i))))
          ;;    (lp (1+ i)))))

          output))))
scheme@(guile-user)> (define v (list->vector  (iota 100000000) ))
scheme@(guile-user)> (define r (par-map-vector sqrt v))
scheme@(guile-user)> (define v (list->vector  (iota 100000000) ))
scheme@(guile-user)> (define r (par-map-vector sqrt v))
Processus arrêté

the first time is ok but the second time, Guile quit to the terminal.
seems something makes the guile system instable after one run,
i test it out of my program and out of my Scheme+
but my .guile is in this directory like that:
; Guile config file

;; history
(use-modules (ice-9 readline)
    (ice-9 history)
    (srfi srfi-43) ;; vector
    ;; guile object oriented programming system
    (oop goops)
    (oop goops describe))

(activate-readline)
;;(disable-value-history!)

;; curly infix as in srfi-105
(read-enable 'curly-infix)

;; set current path in load path
(set! %load-path (reverse (cons "." (reverse %load-path))))
;; other solution is to put this in shell:
;; export GUILE_LOAD_PATH="...:."


On Thu, Oct 13, 2022 at 4:10 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

>
>
> On Thu, Oct 13, 2022 at 4:06 PM Olivier Dion <olivier.dion@polymtl.ca>
> wrote:
>
>> On Thu, 13 Oct 2022, Damien Mattei <damien.mattei@gmail.com> wrote:
>> > i do not see what has changed in your code ?
>>
>> Just copy it.  Trust me it has changed.
>>
> really ? :-) i read it 3 times line by line, yes the end is cut ,still no
> ending ) and i use it and freeze,crash or ok randomly, the first times it
> works but try 4 or 5 examples.... :-(
>
>>
>> > really strange,even with bad code the moment it crash should be the
>> same,
>> > sometimes works,crash or freeze....
>>
>> I don't get any of these problem .. running with (iota 100000000)  I
>> don't see any synchornization issue also.
>>
>> --
>> Olivier Dion
>> oldiob.dev
>>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 21:29       ` Zelphir Kaltstahl
@ 2022-10-14  8:21         ` Damien Mattei
  2022-10-14  8:38           ` Zelphir Kaltstahl
  2022-10-25  9:07         ` Mikael Djurfeldt
  1 sibling, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-14  8:21 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel

yes Zelphir i think there is a problem in the threading part of guile,
whatever the code is ,it run well the first time, and after is unstable.
Long ago at the very begin of Java language on my master project at
university i experienced same problem with Java "green" threads, under
windows and/or linux ,i can't remember.

I'm trying your code and future, which have perheaps also the portability
with other schemes, future can provide "light" // , with carefull code it
could be ok.

in your segment.scm code ,is segment-count possibly replaced by the number
of available CPUs or nodes, if i understand well?

Regards,
Damien


On Wed, Oct 12, 2022 at 11:29 PM Zelphir Kaltstahl <
zelphirkaltstahl@posteo.de> wrote:

> Hi!
>
> On 10/12/22 22:27, Damien Mattei wrote:
> >
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674
> >
> > i commited the current version of code here with all files but it is
> > huge.... :-/
> >
> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com>
> > wrote:
> >
> >> Mutex? i do not think code has situation where dead lock could happen,
> it
> >> is a code about minimalising logic expressions, it uses minterms ,
> minterms
> >> set is a set of minterms :like this:
> >>
> >> example:
> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x)
> >> because 0 and 1 are replaced by x
> >> the minterms-set could have thousands of pair (mathematic not lisp)
> >> minterms to unify
> >> if there is more than one x as result there is no need to continue so i
> >> escape with a continuation:
> >>
> >> minterms-set =
> >> {
> >> ((1 0 1 0) (1 1 1 0))
> >> ((1 0 1 0) (1 1 0 1))
> >> ((1 0 1 0) (1 0 1 1))
> >> ((1 0 1 0) (0 1 1 1))
> >> ((0 1 1 0) (1 1 1 0))
> >> ((0 1 1 0) (1 1 0 1))
> >> ((0 1 1 0) (1 0 1 1))
> >> ((0 1 1 0) (0 1 1 1))
> >> ((0 1 0 1) (1 1 1 0))
> >> ((0 1 0 1) (1 1 0 1))
> >> ((0 1 0 1) (1 0 1 1))
> >> ((0 1 0 1) (0 1 1 1))
> >> ((0 0 1 1) (1 1 1 0))
> >> ((0 0 1 1) (1 1 0 1))
> >> ((0 0 1 1) (1 0 1 1))
> >> ((0 0 1 1) (0 1 1 1))
> >> }
> >>
> >> replace { } by () to have the list, other example at another level :
> >>
> >> minterms-set =
> >> {
> >> ((0 x 1 1) (x 1 1 1))
> >> ((0 x 1 1) (1 x 1 1))
> >> ((0 x 1 1) (1 1 x 1))
> >> ((0 x 1 1) (1 1 1 x))
> >> ((x 0 1 1) (x 1 1 1))
> >> ((x 0 1 1) (1 x 1 1))
> >> ((x 0 1 1) (1 1 x 1))
> >> ((x 0 1 1) (1 1 1 x))
> >> ((0 1 x 1) (x 1 1 1))
> >> ((0 1 x 1) (1 x 1 1))
> >> ((0 1 x 1) (1 1 x 1))
> >> ((0 1 x 1) (1 1 1 x))
> >> ((x 1 0 1) (x 1 1 1))
> >> ((x 1 0 1) (1 x 1 1))
> >> ((x 1 0 1) (1 1 x 1))
> >> ((x 1 0 1) (1 1 1 x))
> >> ((0 1 1 x) (x 1 1 1))
> >> ((0 1 1 x) (1 x 1 1))
> >> ((0 1 1 x) (1 1 x 1))
> >> ((0 1 1 x) (1 1 1 x))
> >> ((x 1 1 0) (x 1 1 1))
> >> ((x 1 1 0) (1 x 1 1))
> >> ((x 1 1 0) (1 1 x 1))
> >> ((x 1 1 0) (1 1 1 x))
> >> ((1 0 1 x) (x 1 1 1))
> >> ((1 0 1 x) (1 x 1 1))
> >> ((1 0 1 x) (1 1 x 1))
> >> ((1 0 1 x) (1 1 1 x))
> >> ((1 x 1 0) (x 1 1 1))
> >> ((1 x 1 0) (1 x 1 1))
> >> ((1 x 1 0) (1 1 x 1))
> >> ((1 x 1 0) (1 1 1 x))
> >> }
> >>
> >> here we see some minterms are already unified
> >>
> >>   it is not easy to read even by me because i wrote the code many years
> ago
> >> and is split in many files, but here it is:
> >>
> >> (par-map function-unify-minterms-list minterms-set)
> >>
> >> {function-unify-minterms-list <+ (λ (L) (apply
> >> function-unify-two-minterms-and-tag L))}
> >>
> >> (define (unify-two-minterms mt1 mt2)
> >>    (function-map-with-escaping-by-kontinuation2
> >>   (macro-function-compare-2-bits-with-continuation) mt1 mt2))
> >>
> >> ;; (function-map-with-escaping-by-kontinuation2
> >> (macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1 0)
> '(1
> >> 1 0 1 1 1 1 1))
> >>
> >> ;; list1 = (1 1 0 1 0 1 1 0)
> >> ;; more-lists = ((1 1 0 1 1 1 1 1))
> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
> >>
> >> ;; #f
> >> ;;
> >> ;;  (function-map-with-escaping-by-kontinuation2
> >> (macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1 0)
> '(1
> >> 1 0 1 1 1 1 0))
> >>
> >> ;; list1 = (1 1 0 1 0 1 1 0)
> >> ;; more-lists = ((1 1 0 1 1 1 1 0))
> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
> >>
> >> ;; '(1 1 0 1 x 1 1 0)
> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 .
> >> more-lists)
> >>    (call/cc (lambda (kontinuation)
> >>      (let ((lists (cons list1 more-lists))
> >>    (funct-continu ;; this function have the kontinuation in his
> environment
> >>     (lambda (arg1 . more-args)
> >>       (let ((args (cons arg1 more-args)))
> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons
> >> conti args))
> >>
> >>           ;; (newline)
> >>           ;; (dv list1)
> >>           ;; (dv more-lists)
> >>           ;; (dv lists)
> >>   ;; (dv clozure)
> >>           ;; (newline)
> >>
> >>        (apply map funct-continu lists)))))
> >>
> >> (define-syntax macro-function-compare-2-bits-with-continuation ;;
> >> continuation version of macro-compare-2-bits
> >>    ;; i need a macro because of external function to the clozure
> >>    (syntax-rules ()
> >>      ((_) (let ((cnt 0)) ;; counter
> >>    (lambda (continuation b1 b2) (if (equal? b1 b2)
> >>   b1
> >>   (begin
> >>     (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt >
> 1, we
> >> can have used a flag too instead of a counter
> >>     (when (> cnt 1) (continuation #f)) ;; escaping with the continuation
> >>     'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)
> >>
> >> what could have caused mutex if in the latter definition above (let
> ((cnt
> >> 0)) ;; counter was defined at top level and shared by all threads!!! yes
> >> there could have be some mutex  but this is not the case, i think even
> all
> >> function are pure so why is it more slow with // than without?
> >> Damien
> >>
> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be>
> >> wrote:
> >>
> >>> On 12-10-2022 19:19, Damien Mattei wrote:
> >>>> Hello,
> >>>> all is in the title, i test on a approximately 30000 element list , i
> >>> got
> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of
> >>> code!?
> >>>   > [...]
> >>>   >
> >>>> translated from Scheme+ to Scheme:
> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list
> >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is
> >>> missing.  Without a test case, we can only speculate what's going on.
> >>> (E.g., maybe it grabs a mutex).
> >>>
> >>> Greetings,
> >>> Maxime.
> I don't want to scare anyone, just maybe warn about parallel map. I once
> tried
> to use Guile's parallel map function for a decision tree implementation
> (
> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm),
>
> where each branch while learning the tree would call parallel map again
> for sub
> branches and so on. Somehow it made Guile crash (I don't have the error
> message
> any longer, but I did post about it on the mailing list back then.). I
> never
> figured out, what went wrong. All I had was pure function calls and math
> inside
> the thing that parallel map was supposed to run.
>
> Ultimately I simply tried other parallelism constructs and when I switched
> to
> using futures instead, everything worked fine, no crashes, no errors.
>
> Since that time, I did not use parallel map and instead used futures.
> Recently I
> made a parallelization thing for solving exercises of Project Euler using
> multiple cores, so that some solutions are calculated faster. Maybe this
> can
> help or can be adapted to another use case:
>
>
> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30
>
> It expects ranges of things, which are called `segments` in the code.
> Usually
> ranges of numbers for Project Euler things. Here is the code to split a
> range
> into segments:
>
>
> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm
>
> (Check any solution using it for an example.)
>
> So this might be a bit too specific for general parallel things, but I
> guess one
> could change the way futures are used in `run-in-parallel`, to fit any
> other
> purpose.
>
> Best regards,
> Zelphir
>
> --
> repositories: https://notabug.org/ZelphirKaltstahl
>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-14  8:21         ` Damien Mattei
@ 2022-10-14  8:38           ` Zelphir Kaltstahl
  2022-10-17 13:17             ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Zelphir Kaltstahl @ 2022-10-14  8:38 UTC (permalink / raw)
  To: Damien Mattei; +Cc: guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 1723 bytes --]

Hello Damien,

On 10/14/22 10:21, Damien Mattei wrote:
> yes Zelphir i think there is a problem in the threading part of guile, 
> whatever the code is ,it run well the first time, and after is unstable. Long 
> ago at the very begin of Java language on my master project at university i 
> experienced same problem with Java "green" threads, under windows and/or linux 
> ,i can't remember.
>
> I'm trying your code and future, which have perheaps also the portability with 
> other schemes, future can provide "light" // , with carefull code it could be ok.
>
> in your segment.scm code ,is segment-count possibly replaced by the number of 
> available CPUs or nodes, if i understand well?
>
> Regards,
> Damien

Correct. For each segment one future is used.

However, there are scenarios, where one would rather want to interleave the 
numbers of the whole range, to have a "fairer" workload per future, for example:

(1 2 3 4 5 6 7 8 9)

-> (1 4 7), (2 5 8), (3 6 9)

instead of

-> (1 2 3) (4 5 6) (7 8 9)

(I seem to remember this to be the case for Mandelbrot set calculations, but 
might be wrong.)

But that is all a matter of forming some segments and what a segment is, not 
really part of the logic of creating futures. So one could create then in any 
way that fits ones use-case. I have not generalized that segment logic that far, 
but it is doable.

Anyway, if anyone more knowledgeable could chime in on what the performance 
differences between futures and parallel map are, that would be great! Or 
pointing out some flaws that this kind of future usage might have. Or when the 
point is reached to switch to guile-fibers library.

Regards,
Zelphir

-- 
repositories:https://notabug.org/ZelphirKaltstahl

[-- Attachment #2: Type: text/html, Size: 3110 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-14  8:38           ` Zelphir Kaltstahl
@ 2022-10-17 13:17             ` Damien Mattei
  2022-10-22 16:01               ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-17 13:17 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel

Hello,

sorry for my late answer ( i wrote the code friday but i could only debug
today, being busy and on travel last week-end)

i modified my code to works with 'futures' , the speed is now incredible
!!! (and it computes exact)
the speed seems multiplied by even more than the number of CPUs (6):
cat /proc/cpuinfo  | grep processor
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5

https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900

(declare minterms-vector unified-minterms-vector-1)

(define (funct-unify-minterms-set-1-unit-future set1 set2)

  {function-unify-minterms-list <+ (λ (L) (apply
function-unify-two-minterms-and-tag L))}

  ;; note : sorting is useless

  {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)}
;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1
set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create
list of pair of minterms

  {minterms-vector <- (list->vector minterms-set)}

  {minterms-vector-length <+ (vector-length minterms-vector)}

  {nb-procs <+ (current-processor-count)}

  {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute
the segments

  {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)}

  (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code

  {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)}

  {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;;
remove #f results

  {unified-minterms-set <+ (remove-duplicates-sorted
unified-minterms-set-2)} ;; uniq

  unified-minterms-set)

i keeped your 'segment' definitions to index an array of data after convert
it from list to vector (list->vector) which probably consume additional
time on big data list ( i had more than 1000000 element list lengths at
some point)

i used a simplified version of run-in parallel (that do not do 'reduce
after processing data):

https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794

the computation on a segment is done by those functions:

https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842

{function-unify-minterms-list <+ (λ (L) (apply
function-unify-two-minterms-and-tag L))}

;; proc to be called with futures
(define (proc-unify-minterms-seg seg)
  {start <+ (segment-start seg)}
  {end <+ (segment-end seg)}
  (for ({i <+ start} {i <= end} {i <- {i + 1}})
       {mtL <+ {minterms-vector[i]}}
       {unified-minterms-vector-1[i] <- (function-unify-minterms-list
mtL)}))


i use those code in another program symbolically to compute a value named
Cn:

C0 = T
C1 = T
C2 = T
C3 = (B1 ∨ B2)
C4 = (B2 ∨ (B1 ∧ B3))
C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5))
C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨
(B4 ∧ B5) ∨ (B5 ∧ B6))
C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4
∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7))
C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3
∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ B8))

from C1 to C9 used to be fast,less than  a minute for the whole with or
without //

C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6
∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧
B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9))

C10 takes a few minutes

but C11 used to take one day before // with 'future' and i got now the
result during less than one hour!!!

C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧
B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧
B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9))

i never got C12 in the past ,even with 7 days of computation! and i got it
today during the lunchbreak !!!

C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧
B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨
(B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨
(B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9))

(note that a wise people can find a general formula for Cn based on Cn-1
but that is another (mathematical) story....)

i'm computing C13 but i see that it consumes 12gb out of 16gb of my system
! and stopped on the linux box :
GC Warning: Repeated allocation of very large block (appr. size 510935040):
May lead to memory leak and poor performance
Processus arrêté
but still running on the Mac laptop...ok will see that but i think that the
data set is now huge and there is noting that can be done with that...

note that there is still an hash table used in
function-unify-two-minterms-and-tag and i will use another data structure
and algo to avoid that, i feared that the concurrent access to hash table
can cause the guile 'future' to crash or fails or to slow down the process
but not! result is incredibly fast.Also i use continuation and i read it
can cause problem with 'future' i will improve that too...

I will see where i can improve the algo and data structure to optimize
again but this is already really good.

Thanks

Damien


On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl <
zelphirkaltstahl@posteo.de> wrote:

> Hello Damien,
> On 10/14/22 10:21, Damien Mattei wrote:
>
> yes Zelphir i think there is a problem in the threading part of guile,
> whatever the code is ,it run well the first time, and after is unstable.
> Long ago at the very begin of Java language on my master project at
> university i experienced same problem with Java "green" threads, under
> windows and/or linux ,i can't remember.
>
> I'm trying your code and future, which have perheaps also the portability
> with other schemes, future can provide "light" // , with carefull code it
> could be ok.
>
> in your segment.scm code ,is segment-count possibly replaced by the number
> of available CPUs or nodes, if i understand well?
>
> Regards,
> Damien
>
> Correct. For each segment one future is used.
>
> However, there are scenarios, where one would rather want to interleave
> the numbers of the whole range, to have a "fairer" workload per future, for
> example:
>
> (1 2 3 4 5 6 7 8 9)
>
> -> (1 4 7), (2 5 8), (3 6 9)
>
> instead of
>
> -> (1 2 3) (4 5 6) (7 8 9)
>
> (I seem to remember this to be the case for Mandelbrot set calculations,
> but might be wrong.)
>
> But that is all a matter of forming some segments and what a segment is,
> not really part of the logic of creating futures. So one could create then
> in any way that fits ones use-case. I have not generalized that segment
> logic that far, but it is doable.
>
> Anyway, if anyone more knowledgeable could chime in on what the
> performance differences between futures and parallel map are, that would be
> great! Or pointing out some flaws that this kind of future usage might
> have. Or when the point is reached to switch to guile-fibers library.
>
> Regards,
> Zelphir
>
> --
> repositories: https://notabug.org/ZelphirKaltstahl
>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-17 13:17             ` Damien Mattei
@ 2022-10-22 16:01               ` Damien Mattei
  2022-10-23  1:06                 ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Damien Mattei @ 2022-10-22 16:01 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel

just for the fun things, i just find that the parallelized results are
false :-) even for low indices:
the non // result: C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4))
is different than the parallelized result below:
C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))

i have to check things again ,i think futures are easy to use but limited
in solving acute // problems, such as using a global variable like an hash
table...

Damien



On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> Hello,
>
> sorry for my late answer ( i wrote the code friday but i could only debug
> today, being busy and on travel last week-end)
>
> i modified my code to works with 'futures' , the speed is now incredible
> !!! (and it computes exact)
> the speed seems multiplied by even more than the number of CPUs (6):
> cat /proc/cpuinfo  | grep processor
> processor : 0
> processor : 1
> processor : 2
> processor : 3
> processor : 4
> processor : 5
>
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900
>
> (declare minterms-vector unified-minterms-vector-1)
>
> (define (funct-unify-minterms-set-1-unit-future set1 set2)
>
>   {function-unify-minterms-list <+ (λ (L) (apply
> function-unify-two-minterms-and-tag L))}
>
>   ;; note : sorting is useless
>
>   {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)}
> ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1
> set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create
> list of pair of minterms
>
>   {minterms-vector <- (list->vector minterms-set)}
>
>   {minterms-vector-length <+ (vector-length minterms-vector)}
>
>   {nb-procs <+ (current-processor-count)}
>
>   {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute
> the segments
>
>   {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)}
>
>   (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code
>
>   {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)}
>
>   {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;;
> remove #f results
>
>   {unified-minterms-set <+ (remove-duplicates-sorted
> unified-minterms-set-2)} ;; uniq
>
>   unified-minterms-set)
>
> i keeped your 'segment' definitions to index an array of data after
> convert it from list to vector (list->vector) which probably consume
> additional time on big data list ( i had more than 1000000 element list
> lengths at some point)
>
> i used a simplified version of run-in parallel (that do not do 'reduce
> after processing data):
>
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794
>
> the computation on a segment is done by those functions:
>
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842
>
> {function-unify-minterms-list <+ (λ (L) (apply
> function-unify-two-minterms-and-tag L))}
>
> ;; proc to be called with futures
> (define (proc-unify-minterms-seg seg)
>   {start <+ (segment-start seg)}
>   {end <+ (segment-end seg)}
>   (for ({i <+ start} {i <= end} {i <- {i + 1}})
>        {mtL <+ {minterms-vector[i]}}
>        {unified-minterms-vector-1[i] <- (function-unify-minterms-list
> mtL)}))
>
>
> i use those code in another program symbolically to compute a value named
> Cn:
>
> C0 = T
> C1 = T
> C2 = T
> C3 = (B1 ∨ B2)
> C4 = (B2 ∨ (B1 ∧ B3))
> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
> C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5))
> C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨
> (B4 ∧ B5) ∨ (B5 ∧ B6))
> C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧
> B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7))
> C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨
> (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧
> B8))
>
> from C1 to C9 used to be fast,less than  a minute for the whole with or
> without //
>
> C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6
> ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧
> B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9))
>
> C10 takes a few minutes
>
> but C11 used to take one day before // with 'future' and i got now the
> result during less than one hour!!!
>
> C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧
> B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧
> B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9))
>
> i never got C12 in the past ,even with 7 days of computation! and i got it
> today during the lunchbreak !!!
>
> C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6
> ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8)
> ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨
> (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9))
>
> (note that a wise people can find a general formula for Cn based on Cn-1
> but that is another (mathematical) story....)
>
> i'm computing C13 but i see that it consumes 12gb out of 16gb of my system
> ! and stopped on the linux box :
> GC Warning: Repeated allocation of very large block (appr. size 510935040):
> May lead to memory leak and poor performance
> Processus arrêté
> but still running on the Mac laptop...ok will see that but i think that
> the data set is now huge and there is noting that can be done with that...
>
> note that there is still an hash table used in
> function-unify-two-minterms-and-tag and i will use another data structure
> and algo to avoid that, i feared that the concurrent access to hash table
> can cause the guile 'future' to crash or fails or to slow down the process
> but not! result is incredibly fast.Also i use continuation and i read it
> can cause problem with 'future' i will improve that too...
>
> I will see where i can improve the algo and data structure to optimize
> again but this is already really good.
>
> Thanks
>
> Damien
>
>
> On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl <
> zelphirkaltstahl@posteo.de> wrote:
>
>> Hello Damien,
>> On 10/14/22 10:21, Damien Mattei wrote:
>>
>> yes Zelphir i think there is a problem in the threading part of guile,
>> whatever the code is ,it run well the first time, and after is unstable.
>> Long ago at the very begin of Java language on my master project at
>> university i experienced same problem with Java "green" threads, under
>> windows and/or linux ,i can't remember.
>>
>> I'm trying your code and future, which have perheaps also the portability
>> with other schemes, future can provide "light" // , with carefull code it
>> could be ok.
>>
>> in your segment.scm code ,is segment-count possibly replaced by the
>> number of available CPUs or nodes, if i understand well?
>>
>> Regards,
>> Damien
>>
>> Correct. For each segment one future is used.
>>
>> However, there are scenarios, where one would rather want to interleave
>> the numbers of the whole range, to have a "fairer" workload per future, for
>> example:
>>
>> (1 2 3 4 5 6 7 8 9)
>>
>> -> (1 4 7), (2 5 8), (3 6 9)
>>
>> instead of
>>
>> -> (1 2 3) (4 5 6) (7 8 9)
>>
>> (I seem to remember this to be the case for Mandelbrot set calculations,
>> but might be wrong.)
>>
>> But that is all a matter of forming some segments and what a segment is,
>> not really part of the logic of creating futures. So one could create then
>> in any way that fits ones use-case. I have not generalized that segment
>> logic that far, but it is doable.
>>
>> Anyway, if anyone more knowledgeable could chime in on what the
>> performance differences between futures and parallel map are, that would be
>> great! Or pointing out some flaws that this kind of future usage might
>> have. Or when the point is reached to switch to guile-fibers library.
>>
>> Regards,
>> Zelphir
>>
>> --
>> repositories: https://notabug.org/ZelphirKaltstahl
>>
>>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-22 16:01               ` Damien Mattei
@ 2022-10-23  1:06                 ` Damien Mattei
  0 siblings, 0 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-23  1:06 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 14665 bytes --]

after intense coding i finally got the good results,
my assumption about the global variable hash table was true ,it is
incompatible with 'future : the competition for the writing into the hash
table breaks the code.

If i do writing in hash table out of // region all is ok:

a simple function to extract the hash table access:

(define (tag-minterms i umt)
  (when umt
     {mt1 <+ (first {minterms-vector[i]})}
     {mt2 <+ (second {minterms-vector[i]})}
     {minterms-ht[mt1] <- #t}
     {minterms-ht[mt2] <- #t}))



;; proc to be called with futures
(define (proc-unify-minterms-seg seg)

  {function-unify-minterms-list <+ (λ (L) (apply
function-unify-two-minterms L))}
  ;;{function-unify-minterms-list <+ (λ (L) (apply
function-unify-two-minterms-and-tag L))}


  {start <+ (segment-start seg)}
  {end <+ (segment-end seg)}
  (for ({i <+ start} {i <= end} {i <- {i + 1}})
       {mtL <+ {minterms-vector[i]}}
       (nodebug
          (dv mtL))
       {unified-minterms-vector-1[i] <- (function-unify-minterms-list
mtL)}))


(declare minterms-vector unified-minterms-vector-1)

https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L2893

and the call to the code that use hash table is now completely out of the
// region:

 (debug
   (display-nl "before //"))

  (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code

  (debug
   (display-nl "after //"))

  (vector-for-each tag-minterms unified-minterms-vector-1) ;; tag the
minterms in the hash table

with a partitioned vector it works because one cell is not accessed by
different threads

other stuff like continuation seems to work with 'future, i wanted to
rewrite code with call/cc in 'future but it was too long and apparently it
works,any way i do not see why call/cc would cause trouble in a thread...

i think 'futures are good if you use them carefully like in OpenMP
regions... in other languages.

the code is very very fast now i reached C15,in a few minutes on an Apple M1
scheme@(guile-user)>  (current-processor-count)
$1 = 8

C15 = ((B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6
∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧
B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8
∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B1 ∧
B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12
∧ B13) ∨ (B1 ∧ B3 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧
B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨
(B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10
∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧
B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨
(B1 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10
∧ B12 ∧ B13) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧
B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧
B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B9 ∧
B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧
B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨
(B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10
∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧
B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨
(B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9 ∧ B11
∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧
B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B8 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧
B4 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B10 ∧ B12
∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B5 ∧ B7 ∧
B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧
B6 ∧ B7 ∧ B9 ∧ B10 ∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B12 ∧
B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B7 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11
∧ B12 ∧ B14) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B9 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧
B10 ∧ B11 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧ B8 ∧ B10 ∧ B12 ∧ B13) ∨ (B2 ∧ B4 ∧ B6 ∧
B8 ∧ B10 ∧ B12 ∧ B14))

and it consumes more than 2Gb in memory:


PID    COMMAND      %CPU  TIME     #TH   #WQ  #PORT MEM    PURG   CMPRS
 PGRP  PPID  STATE    BOOSTS          %CPU_ME
15081  guile        442.8 57:24.28 17/8  0    44    *2423M*  0B     118M
15081 10551 running  *0[1]           0.0000

i checked a part of the logical computation with Mathematica online and it
was ok but i will do a Python program to check the logical expression again
and compare speed with Sympy...

and for the space complexity i do not think it can be reduced

good night all

On Sat, Oct 22, 2022 at 6:01 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> just for the fun things, i just find that the parallelized results are
> false :-) even for low indices:
> the non // result: C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4))
> is different than the parallelized result below:
> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
>
> i have to check things again ,i think futures are easy to use but limited
> in solving acute // problems, such as using a global variable like an hash
> table...
>
> Damien
>
>
>
> On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei <damien.mattei@gmail.com>
> wrote:
>
>> Hello,
>>
>> sorry for my late answer ( i wrote the code friday but i could only debug
>> today, being busy and on travel last week-end)
>>
>> i modified my code to works with 'futures' , the speed is now incredible
>> !!! (and it computes exact)
>> the speed seems multiplied by even more than the number of CPUs (6):
>> cat /proc/cpuinfo  | grep processor
>> processor : 0
>> processor : 1
>> processor : 2
>> processor : 3
>> processor : 4
>> processor : 5
>>
>>
>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900
>>
>> (declare minterms-vector unified-minterms-vector-1)
>>
>> (define (funct-unify-minterms-set-1-unit-future set1 set2)
>>
>>   {function-unify-minterms-list <+ (λ (L) (apply
>> function-unify-two-minterms-and-tag L))}
>>
>>   ;; note : sorting is useless
>>
>>   {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)}
>> ;;(product-set-with-set-imperative set1 set2)} ;;(product-set-with-set set1
>> set2)} ;;(associate-set-with-set set1 set2)} ;; set multiplication : create
>> list of pair of minterms
>>
>>   {minterms-vector <- (list->vector minterms-set)}
>>
>>   {minterms-vector-length <+ (vector-length minterms-vector)}
>>
>>   {nb-procs <+ (current-processor-count)}
>>
>>   {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;;
>> compute the segments
>>
>>   {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)}
>>
>>   (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel
>> code
>>
>>   {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)}
>>
>>   {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)}
>> ;; remove #f results
>>
>>   {unified-minterms-set <+ (remove-duplicates-sorted
>> unified-minterms-set-2)} ;; uniq
>>
>>   unified-minterms-set)
>>
>> i keeped your 'segment' definitions to index an array of data after
>> convert it from list to vector (list->vector) which probably consume
>> additional time on big data list ( i had more than 1000000 element list
>> lengths at some point)
>>
>> i used a simplified version of run-in parallel (that do not do 'reduce
>> after processing data):
>>
>>
>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794
>>
>> the computation on a segment is done by those functions:
>>
>>
>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842
>>
>> {function-unify-minterms-list <+ (λ (L) (apply
>> function-unify-two-minterms-and-tag L))}
>>
>> ;; proc to be called with futures
>> (define (proc-unify-minterms-seg seg)
>>   {start <+ (segment-start seg)}
>>   {end <+ (segment-end seg)}
>>   (for ({i <+ start} {i <= end} {i <- {i + 1}})
>>        {mtL <+ {minterms-vector[i]}}
>>        {unified-minterms-vector-1[i] <- (function-unify-minterms-list
>> mtL)}))
>>
>>
>> i use those code in another program symbolically to compute a value named
>> Cn:
>>
>> C0 = T
>> C1 = T
>> C2 = T
>> C3 = (B1 ∨ B2)
>> C4 = (B2 ∨ (B1 ∧ B3))
>> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
>> C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5))
>> C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨
>> (B4 ∧ B5) ∨ (B5 ∧ B6))
>> C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧
>> B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7))
>> C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨
>> (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧
>> B8))
>>
>> from C1 to C9 used to be fast,less than  a minute for the whole with or
>> without //
>>
>> C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧
>> B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6
>> ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9))
>>
>> C10 takes a few minutes
>>
>> but C11 used to take one day before // with 'future' and i got now the
>> result during less than one hour!!!
>>
>> C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧
>> B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧
>> B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9))
>>
>> i never got C12 in the past ,even with 7 days of computation! and i got
>> it today during the lunchbreak !!!
>>
>> C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6
>> ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8)
>> ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨
>> (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9))
>>
>> (note that a wise people can find a general formula for Cn based on Cn-1
>> but that is another (mathematical) story....)
>>
>> i'm computing C13 but i see that it consumes 12gb out of 16gb of my
>> system ! and stopped on the linux box :
>> GC Warning: Repeated allocation of very large block (appr. size
>> 510935040):
>> May lead to memory leak and poor performance
>> Processus arrêté
>> but still running on the Mac laptop...ok will see that but i think that
>> the data set is now huge and there is noting that can be done with that...
>>
>> note that there is still an hash table used in
>> function-unify-two-minterms-and-tag and i will use another data structure
>> and algo to avoid that, i feared that the concurrent access to hash table
>> can cause the guile 'future' to crash or fails or to slow down the process
>> but not! result is incredibly fast.Also i use continuation and i read it
>> can cause problem with 'future' i will improve that too...
>>
>> I will see where i can improve the algo and data structure to optimize
>> again but this is already really good.
>>
>> Thanks
>>
>> Damien
>>
>>
>> On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl <
>> zelphirkaltstahl@posteo.de> wrote:
>>
>>> Hello Damien,
>>> On 10/14/22 10:21, Damien Mattei wrote:
>>>
>>> yes Zelphir i think there is a problem in the threading part of guile,
>>> whatever the code is ,it run well the first time, and after is unstable.
>>> Long ago at the very begin of Java language on my master project at
>>> university i experienced same problem with Java "green" threads, under
>>> windows and/or linux ,i can't remember.
>>>
>>> I'm trying your code and future, which have perheaps also the
>>> portability with other schemes, future can provide "light" // , with
>>> carefull code it could be ok.
>>>
>>> in your segment.scm code ,is segment-count possibly replaced by the
>>> number of available CPUs or nodes, if i understand well?
>>>
>>> Regards,
>>> Damien
>>>
>>> Correct. For each segment one future is used.
>>>
>>> However, there are scenarios, where one would rather want to interleave
>>> the numbers of the whole range, to have a "fairer" workload per future, for
>>> example:
>>>
>>> (1 2 3 4 5 6 7 8 9)
>>>
>>> -> (1 4 7), (2 5 8), (3 6 9)
>>>
>>> instead of
>>>
>>> -> (1 2 3) (4 5 6) (7 8 9)
>>>
>>> (I seem to remember this to be the case for Mandelbrot set calculations,
>>> but might be wrong.)
>>>
>>> But that is all a matter of forming some segments and what a segment is,
>>> not really part of the logic of creating futures. So one could create then
>>> in any way that fits ones use-case. I have not generalized that segment
>>> logic that far, but it is doable.
>>>
>>> Anyway, if anyone more knowledgeable could chime in on what the
>>> performance differences between futures and parallel map are, that would be
>>> great! Or pointing out some flaws that this kind of future usage might
>>> have. Or when the point is reached to switch to guile-fibers library.
>>>
>>> Regards,
>>> Zelphir
>>>
>>> --
>>> repositories: https://notabug.org/ZelphirKaltstahl
>>>
>>>

[-- Attachment #2: Type: text/html, Size: 21674 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-12 21:29       ` Zelphir Kaltstahl
  2022-10-14  8:21         ` Damien Mattei
@ 2022-10-25  9:07         ` Mikael Djurfeldt
  2022-10-25  9:11           ` Mikael Djurfeldt
  1 sibling, 1 reply; 27+ messages in thread
From: Mikael Djurfeldt @ 2022-10-25  9:07 UTC (permalink / raw)
  To: Zelphir Kaltstahl
  Cc: Damien Mattei, guile-user, guile-devel, Mikael Djurfeldt

[-- Attachment #1: Type: text/plain, Size: 9302 bytes --]

A piece of background on par-map:

When I introduced par-map et al the only ambition was to have simple
language constructs to invoke parallelism. The use case I had in mind was
course grained parallelism where each piece of work is somewhat
substantial. Back then, a thread was launched for each piece of work,
however there was also a thread pool such that not all of the overhead of
launching new threads always was required.

Since then, par-map has been rewritten (by others) to be based on futures.
(And now the thread pool is localized in the futures implementation---as
"workers".) Looking in the code now, I think it is fair to say that it is
still intended for coarse grained parallelism. There is some heavy lifting
going on with mutexes and condition variables as well as shuffling around
with list pairs.

So, applying par-map on a huge list with small amount of work per item was
never and still isn't the intended use case.

It would of course be interesting if the code could be improved to support
fine grained parallelism. :-)

Best regards,
Mikael

On Wed, Oct 12, 2022 at 11:30 PM Zelphir Kaltstahl <
zelphirkaltstahl@posteo.de> wrote:

> Hi!
>
> On 10/12/22 22:27, Damien Mattei wrote:
> >
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674
> >
> > i commited the current version of code here with all files but it is
> > huge.... :-/
> >
> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com>
> > wrote:
> >
> >> Mutex? i do not think code has situation where dead lock could happen,
> it
> >> is a code about minimalising logic expressions, it uses minterms ,
> minterms
> >> set is a set of minterms :like this:
> >>
> >> example:
> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x)
> >> because 0 and 1 are replaced by x
> >> the minterms-set could have thousands of pair (mathematic not lisp)
> >> minterms to unify
> >> if there is more than one x as result there is no need to continue so i
> >> escape with a continuation:
> >>
> >> minterms-set =
> >> {
> >> ((1 0 1 0) (1 1 1 0))
> >> ((1 0 1 0) (1 1 0 1))
> >> ((1 0 1 0) (1 0 1 1))
> >> ((1 0 1 0) (0 1 1 1))
> >> ((0 1 1 0) (1 1 1 0))
> >> ((0 1 1 0) (1 1 0 1))
> >> ((0 1 1 0) (1 0 1 1))
> >> ((0 1 1 0) (0 1 1 1))
> >> ((0 1 0 1) (1 1 1 0))
> >> ((0 1 0 1) (1 1 0 1))
> >> ((0 1 0 1) (1 0 1 1))
> >> ((0 1 0 1) (0 1 1 1))
> >> ((0 0 1 1) (1 1 1 0))
> >> ((0 0 1 1) (1 1 0 1))
> >> ((0 0 1 1) (1 0 1 1))
> >> ((0 0 1 1) (0 1 1 1))
> >> }
> >>
> >> replace { } by () to have the list, other example at another level :
> >>
> >> minterms-set =
> >> {
> >> ((0 x 1 1) (x 1 1 1))
> >> ((0 x 1 1) (1 x 1 1))
> >> ((0 x 1 1) (1 1 x 1))
> >> ((0 x 1 1) (1 1 1 x))
> >> ((x 0 1 1) (x 1 1 1))
> >> ((x 0 1 1) (1 x 1 1))
> >> ((x 0 1 1) (1 1 x 1))
> >> ((x 0 1 1) (1 1 1 x))
> >> ((0 1 x 1) (x 1 1 1))
> >> ((0 1 x 1) (1 x 1 1))
> >> ((0 1 x 1) (1 1 x 1))
> >> ((0 1 x 1) (1 1 1 x))
> >> ((x 1 0 1) (x 1 1 1))
> >> ((x 1 0 1) (1 x 1 1))
> >> ((x 1 0 1) (1 1 x 1))
> >> ((x 1 0 1) (1 1 1 x))
> >> ((0 1 1 x) (x 1 1 1))
> >> ((0 1 1 x) (1 x 1 1))
> >> ((0 1 1 x) (1 1 x 1))
> >> ((0 1 1 x) (1 1 1 x))
> >> ((x 1 1 0) (x 1 1 1))
> >> ((x 1 1 0) (1 x 1 1))
> >> ((x 1 1 0) (1 1 x 1))
> >> ((x 1 1 0) (1 1 1 x))
> >> ((1 0 1 x) (x 1 1 1))
> >> ((1 0 1 x) (1 x 1 1))
> >> ((1 0 1 x) (1 1 x 1))
> >> ((1 0 1 x) (1 1 1 x))
> >> ((1 x 1 0) (x 1 1 1))
> >> ((1 x 1 0) (1 x 1 1))
> >> ((1 x 1 0) (1 1 x 1))
> >> ((1 x 1 0) (1 1 1 x))
> >> }
> >>
> >> here we see some minterms are already unified
> >>
> >>   it is not easy to read even by me because i wrote the code many years
> ago
> >> and is split in many files, but here it is:
> >>
> >> (par-map function-unify-minterms-list minterms-set)
> >>
> >> {function-unify-minterms-list <+ (λ (L) (apply
> >> function-unify-two-minterms-and-tag L))}
> >>
> >> (define (unify-two-minterms mt1 mt2)
> >>    (function-map-with-escaping-by-kontinuation2
> >>   (macro-function-compare-2-bits-with-continuation) mt1 mt2))
> >>
> >> ;; (function-map-with-escaping-by-kontinuation2
> >> (macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1 0)
> '(1
> >> 1 0 1 1 1 1 1))
> >>
> >> ;; list1 = (1 1 0 1 0 1 1 0)
> >> ;; more-lists = ((1 1 0 1 1 1 1 1))
> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
> >>
> >> ;; #f
> >> ;;
> >> ;;  (function-map-with-escaping-by-kontinuation2
> >> (macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1 0)
> '(1
> >> 1 0 1 1 1 1 0))
> >>
> >> ;; list1 = (1 1 0 1 0 1 1 0)
> >> ;; more-lists = ((1 1 0 1 1 1 1 0))
> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
> >>
> >> ;; '(1 1 0 1 x 1 1 0)
> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 .
> >> more-lists)
> >>    (call/cc (lambda (kontinuation)
> >>      (let ((lists (cons list1 more-lists))
> >>    (funct-continu ;; this function have the kontinuation in his
> environment
> >>     (lambda (arg1 . more-args)
> >>       (let ((args (cons arg1 more-args)))
> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons
> >> conti args))
> >>
> >>           ;; (newline)
> >>           ;; (dv list1)
> >>           ;; (dv more-lists)
> >>           ;; (dv lists)
> >>   ;; (dv clozure)
> >>           ;; (newline)
> >>
> >>        (apply map funct-continu lists)))))
> >>
> >> (define-syntax macro-function-compare-2-bits-with-continuation ;;
> >> continuation version of macro-compare-2-bits
> >>    ;; i need a macro because of external function to the clozure
> >>    (syntax-rules ()
> >>      ((_) (let ((cnt 0)) ;; counter
> >>    (lambda (continuation b1 b2) (if (equal? b1 b2)
> >>   b1
> >>   (begin
> >>     (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt >
> 1, we
> >> can have used a flag too instead of a counter
> >>     (when (> cnt 1) (continuation #f)) ;; escaping with the continuation
> >>     'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)
> >>
> >> what could have caused mutex if in the latter definition above (let
> ((cnt
> >> 0)) ;; counter was defined at top level and shared by all threads!!! yes
> >> there could have be some mutex  but this is not the case, i think even
> all
> >> function are pure so why is it more slow with // than without?
> >> Damien
> >>
> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be>
> >> wrote:
> >>
> >>> On 12-10-2022 19:19, Damien Mattei wrote:
> >>>> Hello,
> >>>> all is in the title, i test on a approximately 30000 element list , i
> >>> got
> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of
> >>> code!?
> >>>   > [...]
> >>>   >
> >>>> translated from Scheme+ to Scheme:
> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list
> >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is
> >>> missing.  Without a test case, we can only speculate what's going on.
> >>> (E.g., maybe it grabs a mutex).
> >>>
> >>> Greetings,
> >>> Maxime.
> I don't want to scare anyone, just maybe warn about parallel map. I once
> tried
> to use Guile's parallel map function for a decision tree implementation
> (
> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm),
>
> where each branch while learning the tree would call parallel map again
> for sub
> branches and so on. Somehow it made Guile crash (I don't have the error
> message
> any longer, but I did post about it on the mailing list back then.). I
> never
> figured out, what went wrong. All I had was pure function calls and math
> inside
> the thing that parallel map was supposed to run.
>
> Ultimately I simply tried other parallelism constructs and when I switched
> to
> using futures instead, everything worked fine, no crashes, no errors.
>
> Since that time, I did not use parallel map and instead used futures.
> Recently I
> made a parallelization thing for solving exercises of Project Euler using
> multiple cores, so that some solutions are calculated faster. Maybe this
> can
> help or can be adapted to another use case:
>
>
> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30
>
> It expects ranges of things, which are called `segments` in the code.
> Usually
> ranges of numbers for Project Euler things. Here is the code to split a
> range
> into segments:
>
>
> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm
>
> (Check any solution using it for an example.)
>
> So this might be a bit too specific for general parallel things, but I
> guess one
> could change the way futures are used in `run-in-parallel`, to fit any
> other
> purpose.
>
> Best regards,
> Zelphir
>
> --
> repositories: https://notabug.org/ZelphirKaltstahl
>
>
>

[-- Attachment #2: Type: text/html, Size: 12202 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-25  9:07         ` Mikael Djurfeldt
@ 2022-10-25  9:11           ` Mikael Djurfeldt
  2022-10-25 14:09             ` Damien Mattei
  0 siblings, 1 reply; 27+ messages in thread
From: Mikael Djurfeldt @ 2022-10-25  9:11 UTC (permalink / raw)
  To: Zelphir Kaltstahl; +Cc: Damien Mattei, guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 9889 bytes --]

Also, I would believe that any crashes in this context are neither due to
the futures implementation nor par-map et al. I would think that crashes
are due to the Guile basic thread support itself.

On Tue, Oct 25, 2022 at 11:07 AM Mikael Djurfeldt <mikael@djurfeldt.com>
wrote:

> A piece of background on par-map:
>
> When I introduced par-map et al the only ambition was to have simple
> language constructs to invoke parallelism. The use case I had in mind was
> course grained parallelism where each piece of work is somewhat
> substantial. Back then, a thread was launched for each piece of work,
> however there was also a thread pool such that not all of the overhead of
> launching new threads always was required.
>
> Since then, par-map has been rewritten (by others) to be based on futures.
> (And now the thread pool is localized in the futures implementation---as
> "workers".) Looking in the code now, I think it is fair to say that it is
> still intended for coarse grained parallelism. There is some heavy lifting
> going on with mutexes and condition variables as well as shuffling around
> with list pairs.
>
> So, applying par-map on a huge list with small amount of work per item was
> never and still isn't the intended use case.
>
> It would of course be interesting if the code could be improved to support
> fine grained parallelism. :-)
>
> Best regards,
> Mikael
>
> On Wed, Oct 12, 2022 at 11:30 PM Zelphir Kaltstahl <
> zelphirkaltstahl@posteo.de> wrote:
>
>> Hi!
>>
>> On 10/12/22 22:27, Damien Mattei wrote:
>> >
>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674
>> >
>> > i commited the current version of code here with all files but it is
>> > huge.... :-/
>> >
>> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <damien.mattei@gmail.com
>> >
>> > wrote:
>> >
>> >> Mutex? i do not think code has situation where dead lock could happen,
>> it
>> >> is a code about minimalising logic expressions, it uses minterms ,
>> minterms
>> >> set is a set of minterms :like this:
>> >>
>> >> example:
>> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x)
>> >> because 0 and 1 are replaced by x
>> >> the minterms-set could have thousands of pair (mathematic not lisp)
>> >> minterms to unify
>> >> if there is more than one x as result there is no need to continue so i
>> >> escape with a continuation:
>> >>
>> >> minterms-set =
>> >> {
>> >> ((1 0 1 0) (1 1 1 0))
>> >> ((1 0 1 0) (1 1 0 1))
>> >> ((1 0 1 0) (1 0 1 1))
>> >> ((1 0 1 0) (0 1 1 1))
>> >> ((0 1 1 0) (1 1 1 0))
>> >> ((0 1 1 0) (1 1 0 1))
>> >> ((0 1 1 0) (1 0 1 1))
>> >> ((0 1 1 0) (0 1 1 1))
>> >> ((0 1 0 1) (1 1 1 0))
>> >> ((0 1 0 1) (1 1 0 1))
>> >> ((0 1 0 1) (1 0 1 1))
>> >> ((0 1 0 1) (0 1 1 1))
>> >> ((0 0 1 1) (1 1 1 0))
>> >> ((0 0 1 1) (1 1 0 1))
>> >> ((0 0 1 1) (1 0 1 1))
>> >> ((0 0 1 1) (0 1 1 1))
>> >> }
>> >>
>> >> replace { } by () to have the list, other example at another level :
>> >>
>> >> minterms-set =
>> >> {
>> >> ((0 x 1 1) (x 1 1 1))
>> >> ((0 x 1 1) (1 x 1 1))
>> >> ((0 x 1 1) (1 1 x 1))
>> >> ((0 x 1 1) (1 1 1 x))
>> >> ((x 0 1 1) (x 1 1 1))
>> >> ((x 0 1 1) (1 x 1 1))
>> >> ((x 0 1 1) (1 1 x 1))
>> >> ((x 0 1 1) (1 1 1 x))
>> >> ((0 1 x 1) (x 1 1 1))
>> >> ((0 1 x 1) (1 x 1 1))
>> >> ((0 1 x 1) (1 1 x 1))
>> >> ((0 1 x 1) (1 1 1 x))
>> >> ((x 1 0 1) (x 1 1 1))
>> >> ((x 1 0 1) (1 x 1 1))
>> >> ((x 1 0 1) (1 1 x 1))
>> >> ((x 1 0 1) (1 1 1 x))
>> >> ((0 1 1 x) (x 1 1 1))
>> >> ((0 1 1 x) (1 x 1 1))
>> >> ((0 1 1 x) (1 1 x 1))
>> >> ((0 1 1 x) (1 1 1 x))
>> >> ((x 1 1 0) (x 1 1 1))
>> >> ((x 1 1 0) (1 x 1 1))
>> >> ((x 1 1 0) (1 1 x 1))
>> >> ((x 1 1 0) (1 1 1 x))
>> >> ((1 0 1 x) (x 1 1 1))
>> >> ((1 0 1 x) (1 x 1 1))
>> >> ((1 0 1 x) (1 1 x 1))
>> >> ((1 0 1 x) (1 1 1 x))
>> >> ((1 x 1 0) (x 1 1 1))
>> >> ((1 x 1 0) (1 x 1 1))
>> >> ((1 x 1 0) (1 1 x 1))
>> >> ((1 x 1 0) (1 1 1 x))
>> >> }
>> >>
>> >> here we see some minterms are already unified
>> >>
>> >>   it is not easy to read even by me because i wrote the code many
>> years ago
>> >> and is split in many files, but here it is:
>> >>
>> >> (par-map function-unify-minterms-list minterms-set)
>> >>
>> >> {function-unify-minterms-list <+ (λ (L) (apply
>> >> function-unify-two-minterms-and-tag L))}
>> >>
>> >> (define (unify-two-minterms mt1 mt2)
>> >>    (function-map-with-escaping-by-kontinuation2
>> >>   (macro-function-compare-2-bits-with-continuation) mt1 mt2))
>> >>
>> >> ;; (function-map-with-escaping-by-kontinuation2
>> >> (macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1 0)
>> '(1
>> >> 1 0 1 1 1 1 1))
>> >>
>> >> ;; list1 = (1 1 0 1 0 1 1 0)
>> >> ;; more-lists = ((1 1 0 1 1 1 1 1))
>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>> >>
>> >> ;; #f
>> >> ;;
>> >> ;;  (function-map-with-escaping-by-kontinuation2
>> >> (macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1
>> 0) '(1
>> >> 1 0 1 1 1 1 0))
>> >>
>> >> ;; list1 = (1 1 0 1 0 1 1 0)
>> >> ;; more-lists = ((1 1 0 1 1 1 1 0))
>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>> >>
>> >> ;; '(1 1 0 1 x 1 1 0)
>> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 .
>> >> more-lists)
>> >>    (call/cc (lambda (kontinuation)
>> >>      (let ((lists (cons list1 more-lists))
>> >>    (funct-continu ;; this function have the kontinuation in his
>> environment
>> >>     (lambda (arg1 . more-args)
>> >>       (let ((args (cons arg1 more-args)))
>> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure (cons
>> >> conti args))
>> >>
>> >>           ;; (newline)
>> >>           ;; (dv list1)
>> >>           ;; (dv more-lists)
>> >>           ;; (dv lists)
>> >>   ;; (dv clozure)
>> >>           ;; (newline)
>> >>
>> >>        (apply map funct-continu lists)))))
>> >>
>> >> (define-syntax macro-function-compare-2-bits-with-continuation ;;
>> >> continuation version of macro-compare-2-bits
>> >>    ;; i need a macro because of external function to the clozure
>> >>    (syntax-rules ()
>> >>      ((_) (let ((cnt 0)) ;; counter
>> >>    (lambda (continuation b1 b2) (if (equal? b1 b2)
>> >>   b1
>> >>   (begin
>> >>     (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt >
>> 1, we
>> >> can have used a flag too instead of a counter
>> >>     (when (> cnt 1) (continuation #f)) ;; escaping with the
>> continuation
>> >>     'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)
>> >>
>> >> what could have caused mutex if in the latter definition above (let
>> ((cnt
>> >> 0)) ;; counter was defined at top level and shared by all threads!!!
>> yes
>> >> there could have be some mutex  but this is not the case, i think even
>> all
>> >> function are pure so why is it more slow with // than without?
>> >> Damien
>> >>
>> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be>
>> >> wrote:
>> >>
>> >>> On 12-10-2022 19:19, Damien Mattei wrote:
>> >>>> Hello,
>> >>>> all is in the title, i test on a approximately 30000 element list , i
>> >>> got
>> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of
>> >>> code!?
>> >>>   > [...]
>> >>>   >
>> >>>> translated from Scheme+ to Scheme:
>> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list
>> >>>> minterms-set)) ;;(par-map function-unify-minterms-list minterms-set))
>> >>> The definition of 'function-unify-minterms-list' and 'minterms-set' is
>> >>> missing.  Without a test case, we can only speculate what's going on.
>> >>> (E.g., maybe it grabs a mutex).
>> >>>
>> >>> Greetings,
>> >>> Maxime.
>> I don't want to scare anyone, just maybe warn about parallel map. I once
>> tried
>> to use Guile's parallel map function for a decision tree implementation
>> (
>> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm),
>>
>> where each branch while learning the tree would call parallel map again
>> for sub
>> branches and so on. Somehow it made Guile crash (I don't have the error
>> message
>> any longer, but I did post about it on the mailing list back then.). I
>> never
>> figured out, what went wrong. All I had was pure function calls and math
>> inside
>> the thing that parallel map was supposed to run.
>>
>> Ultimately I simply tried other parallelism constructs and when I
>> switched to
>> using futures instead, everything worked fine, no crashes, no errors.
>>
>> Since that time, I did not use parallel map and instead used futures.
>> Recently I
>> made a parallelization thing for solving exercises of Project Euler using
>> multiple cores, so that some solutions are calculated faster. Maybe this
>> can
>> help or can be adapted to another use case:
>>
>>
>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30
>>
>> It expects ranges of things, which are called `segments` in the code.
>> Usually
>> ranges of numbers for Project Euler things. Here is the code to split a
>> range
>> into segments:
>>
>>
>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm
>>
>> (Check any solution using it for an example.)
>>
>> So this might be a bit too specific for general parallel things, but I
>> guess one
>> could change the way futures are used in `run-in-parallel`, to fit any
>> other
>> purpose.
>>
>> Best regards,
>> Zelphir
>>
>> --
>> repositories: https://notabug.org/ZelphirKaltstahl
>>
>>
>>

[-- Attachment #2: Type: text/html, Size: 12781 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: map-par slower than map
  2022-10-25  9:11           ` Mikael Djurfeldt
@ 2022-10-25 14:09             ` Damien Mattei
  0 siblings, 0 replies; 27+ messages in thread
From: Damien Mattei @ 2022-10-25 14:09 UTC (permalink / raw)
  To: mikael; +Cc: Zelphir Kaltstahl, guile-user, guile-devel

[-- Attachment #1: Type: text/plain, Size: 12643 bytes --]

Hello Mikael,

your message comes very well because i decided yesterday,partly due to the
discussion in mailing list, to review all the // method used in my code and
to use them to perform benchmark to see which one is preferable to use with
the algorithmic problem of putting a logic expression in its canonical
minimal disjunctive or conjunctive form.

I used 4 methods:
-sequential (map with no //)
-par-map (//)
-threads  (//)
-futures (//)

But, of course, the data structure is not the same now, it will be lists
with sequential and par-map and vectors with threads and futures, as i
already experimented problems (blocking, crashing,false results) with some
(threads,futures) . Also i have solved the concurrent access problem to the
hash table with lists or vectors depending if i use map or par-map or
futures and threads, but the algorithm remains the same in all case.

I computed from C1,C2,C3 ... to C12 (i do not write the logical expressions
here, some peoples seeing those infix mathematical expression coming in the
mailing list as drones in the peaceful sky of the lambda calculus
expressions ;-) ) but here is the results:

map: 10' 17" for C12 (5" for C10)
par-map: 1'21" for C10, more than 2h 40' for  C12
threads: blocks randomly before C7 (works but is not reentrant, perheaps a
problem in the code i use with thread programming)
futures:8" for C12, 25" for C13,2' for C14,7' for C15...

the best performance are,from far, with futures and the code provided by
Zelphir.

on a 6 cores processor with 16Gb of memory.note at C15 computation the
memory containing minterms and logical expression not simplified is about 3
Gb:
  PID UTIL.     PR  NI    VIRT    RES    SHR S  %CPU  %MEM    TEMPS+ COM.


  61051 mattei    20   0 *3339388*   2,7g  22368 R 463,1  *17,2 * 33:27.92
*guile*

cf :
https://en.wikipedia.org/wiki/Quine%E2%80%93McCluskey_algorithm#Complexity

what i will do is write a Python version of the computation of Cn that uses
the Sympy minimal disjunctive normal form procedure of the Sympy library ,
first to check my scheme code result and perheaps compare the speed, i know
python is slow but as it is a library (compiled? but written in python as
far as i know?), well, i'm a bit nervous to compare the speed between
Python and Scheme...

Best regards,
Damien

On Tue, Oct 25, 2022 at 11:11 AM Mikael Djurfeldt <mikael@djurfeldt.com>
wrote:

> Also, I would believe that any crashes in this context are neither due to
> the futures implementation nor par-map et al. I would think that crashes
> are due to the Guile basic thread support itself.
>
> On Tue, Oct 25, 2022 at 11:07 AM Mikael Djurfeldt <mikael@djurfeldt.com>
> wrote:
>
>> A piece of background on par-map:
>>
>> When I introduced par-map et al the only ambition was to have simple
>> language constructs to invoke parallelism. The use case I had in mind was
>> course grained parallelism where each piece of work is somewhat
>> substantial. Back then, a thread was launched for each piece of work,
>> however there was also a thread pool such that not all of the overhead of
>> launching new threads always was required.
>>
>> Since then, par-map has been rewritten (by others) to be based on
>> futures. (And now the thread pool is localized in the futures
>> implementation---as "workers".) Looking in the code now, I think it is fair
>> to say that it is still intended for coarse grained parallelism. There is
>> some heavy lifting going on with mutexes and condition variables as well as
>> shuffling around with list pairs.
>>
>> So, applying par-map on a huge list with small amount of work per item
>> was never and still isn't the intended use case.
>>
>> It would of course be interesting if the code could be improved to
>> support fine grained parallelism. :-)
>>
>> Best regards,
>> Mikael
>>
>> On Wed, Oct 12, 2022 at 11:30 PM Zelphir Kaltstahl <
>> zelphirkaltstahl@posteo.de> wrote:
>>
>>> Hi!
>>>
>>> On 10/12/22 22:27, Damien Mattei wrote:
>>> >
>>> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1674
>>> >
>>> > i commited the current version of code here with all files but it is
>>> > huge.... :-/
>>> >
>>> > On Wed, Oct 12, 2022 at 10:20 PM Damien Mattei <
>>> damien.mattei@gmail.com>
>>> > wrote:
>>> >
>>> >> Mutex? i do not think code has situation where dead lock could
>>> happen, it
>>> >> is a code about minimalising logic expressions, it uses minterms ,
>>> minterms
>>> >> set is a set of minterms :like this:
>>> >>
>>> >> example:
>>> >> ((1 1 0) (1 1 1)) will be unified : (1 1 x)
>>> >> because 0 and 1 are replaced by x
>>> >> the minterms-set could have thousands of pair (mathematic not lisp)
>>> >> minterms to unify
>>> >> if there is more than one x as result there is no need to continue so
>>> i
>>> >> escape with a continuation:
>>> >>
>>> >> minterms-set =
>>> >> {
>>> >> ((1 0 1 0) (1 1 1 0))
>>> >> ((1 0 1 0) (1 1 0 1))
>>> >> ((1 0 1 0) (1 0 1 1))
>>> >> ((1 0 1 0) (0 1 1 1))
>>> >> ((0 1 1 0) (1 1 1 0))
>>> >> ((0 1 1 0) (1 1 0 1))
>>> >> ((0 1 1 0) (1 0 1 1))
>>> >> ((0 1 1 0) (0 1 1 1))
>>> >> ((0 1 0 1) (1 1 1 0))
>>> >> ((0 1 0 1) (1 1 0 1))
>>> >> ((0 1 0 1) (1 0 1 1))
>>> >> ((0 1 0 1) (0 1 1 1))
>>> >> ((0 0 1 1) (1 1 1 0))
>>> >> ((0 0 1 1) (1 1 0 1))
>>> >> ((0 0 1 1) (1 0 1 1))
>>> >> ((0 0 1 1) (0 1 1 1))
>>> >> }
>>> >>
>>> >> replace { } by () to have the list, other example at another level :
>>> >>
>>> >> minterms-set =
>>> >> {
>>> >> ((0 x 1 1) (x 1 1 1))
>>> >> ((0 x 1 1) (1 x 1 1))
>>> >> ((0 x 1 1) (1 1 x 1))
>>> >> ((0 x 1 1) (1 1 1 x))
>>> >> ((x 0 1 1) (x 1 1 1))
>>> >> ((x 0 1 1) (1 x 1 1))
>>> >> ((x 0 1 1) (1 1 x 1))
>>> >> ((x 0 1 1) (1 1 1 x))
>>> >> ((0 1 x 1) (x 1 1 1))
>>> >> ((0 1 x 1) (1 x 1 1))
>>> >> ((0 1 x 1) (1 1 x 1))
>>> >> ((0 1 x 1) (1 1 1 x))
>>> >> ((x 1 0 1) (x 1 1 1))
>>> >> ((x 1 0 1) (1 x 1 1))
>>> >> ((x 1 0 1) (1 1 x 1))
>>> >> ((x 1 0 1) (1 1 1 x))
>>> >> ((0 1 1 x) (x 1 1 1))
>>> >> ((0 1 1 x) (1 x 1 1))
>>> >> ((0 1 1 x) (1 1 x 1))
>>> >> ((0 1 1 x) (1 1 1 x))
>>> >> ((x 1 1 0) (x 1 1 1))
>>> >> ((x 1 1 0) (1 x 1 1))
>>> >> ((x 1 1 0) (1 1 x 1))
>>> >> ((x 1 1 0) (1 1 1 x))
>>> >> ((1 0 1 x) (x 1 1 1))
>>> >> ((1 0 1 x) (1 x 1 1))
>>> >> ((1 0 1 x) (1 1 x 1))
>>> >> ((1 0 1 x) (1 1 1 x))
>>> >> ((1 x 1 0) (x 1 1 1))
>>> >> ((1 x 1 0) (1 x 1 1))
>>> >> ((1 x 1 0) (1 1 x 1))
>>> >> ((1 x 1 0) (1 1 1 x))
>>> >> }
>>> >>
>>> >> here we see some minterms are already unified
>>> >>
>>> >>   it is not easy to read even by me because i wrote the code many
>>> years ago
>>> >> and is split in many files, but here it is:
>>> >>
>>> >> (par-map function-unify-minterms-list minterms-set)
>>> >>
>>> >> {function-unify-minterms-list <+ (λ (L) (apply
>>> >> function-unify-two-minterms-and-tag L))}
>>> >>
>>> >> (define (unify-two-minterms mt1 mt2)
>>> >>    (function-map-with-escaping-by-kontinuation2
>>> >>   (macro-function-compare-2-bits-with-continuation) mt1 mt2))
>>> >>
>>> >> ;; (function-map-with-escaping-by-kontinuation2
>>> >> (macro-function-compare-2-bits-with-continuation)   '(1 1 0 1 0 1 1
>>> 0) '(1
>>> >> 1 0 1 1 1 1 1))
>>> >>
>>> >> ;; list1 = (1 1 0 1 0 1 1 0)
>>> >> ;; more-lists = ((1 1 0 1 1 1 1 1))
>>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 1))
>>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>>> >>
>>> >> ;; #f
>>> >> ;;
>>> >> ;;  (function-map-with-escaping-by-kontinuation2
>>> >> (macro-function-compare-2-bits-with-continuation)    '(1 1 0 1 0 1 1
>>> 0) '(1
>>> >> 1 0 1 1 1 1 0))
>>> >>
>>> >> ;; list1 = (1 1 0 1 0 1 1 0)
>>> >> ;; more-lists = ((1 1 0 1 1 1 1 0))
>>> >> ;; lists = ((1 1 0 1 0 1 1 0) (1 1 0 1 1 1 1 0))
>>> >> ;; clozure = #<procedure:...gos-DrRacket.scm:195:11>
>>> >>
>>> >> ;; '(1 1 0 1 x 1 1 0)
>>> >> (define (function-map-with-escaping-by-kontinuation2 clozure list1 .
>>> >> more-lists)
>>> >>    (call/cc (lambda (kontinuation)
>>> >>      (let ((lists (cons list1 more-lists))
>>> >>    (funct-continu ;; this function have the kontinuation in his
>>> environment
>>> >>     (lambda (arg1 . more-args)
>>> >>       (let ((args (cons arg1 more-args)))
>>> >> (apply clozure kontinuation args))))) ;; a tester: (apply clozure
>>> (cons
>>> >> conti args))
>>> >>
>>> >>           ;; (newline)
>>> >>           ;; (dv list1)
>>> >>           ;; (dv more-lists)
>>> >>           ;; (dv lists)
>>> >>   ;; (dv clozure)
>>> >>           ;; (newline)
>>> >>
>>> >>        (apply map funct-continu lists)))))
>>> >>
>>> >> (define-syntax macro-function-compare-2-bits-with-continuation ;;
>>> >> continuation version of macro-compare-2-bits
>>> >>    ;; i need a macro because of external function to the clozure
>>> >>    (syntax-rules ()
>>> >>      ((_) (let ((cnt 0)) ;; counter
>>> >>    (lambda (continuation b1 b2) (if (equal? b1 b2)
>>> >>   b1
>>> >>   (begin
>>> >>     (set! cnt (add1 cnt)) ;; we leave with continuation in case cpt >
>>> 1, we
>>> >> can have used a flag too instead of a counter
>>> >>     (when (> cnt 1) (continuation #f)) ;; escaping with the
>>> continuation
>>> >>     'x))))))) ;; return x in case of (b1,b2) = (O,1) or (1,0)
>>> >>
>>> >> what could have caused mutex if in the latter definition above (let
>>> ((cnt
>>> >> 0)) ;; counter was defined at top level and shared by all threads!!!
>>> yes
>>> >> there could have be some mutex  but this is not the case, i think
>>> even all
>>> >> function are pure so why is it more slow with // than without?
>>> >> Damien
>>> >>
>>> >> On Wed, Oct 12, 2022 at 8:45 PM Maxime Devos <maximedevos@telenet.be>
>>> >> wrote:
>>> >>
>>> >>> On 12-10-2022 19:19, Damien Mattei wrote:
>>> >>>> Hello,
>>> >>>> all is in the title, i test on a approximately 30000 element list ,
>>> i
>>> >>> got
>>> >>>> 9s with map and 3min 30s with par-map on exactly the same piece of
>>> >>> code!?
>>> >>>   > [...]
>>> >>>   >
>>> >>>> translated from Scheme+ to Scheme:
>>> >>>> (define unified-minterms-set-1 (map function-unify-minterms-list
>>> >>>> minterms-set)) ;;(par-map function-unify-minterms-list
>>> minterms-set))
>>> >>> The definition of 'function-unify-minterms-list' and 'minterms-set'
>>> is
>>> >>> missing.  Without a test case, we can only speculate what's going on.
>>> >>> (E.g., maybe it grabs a mutex).
>>> >>>
>>> >>> Greetings,
>>> >>> Maxime.
>>> I don't want to scare anyone, just maybe warn about parallel map. I once
>>> tried
>>> to use Guile's parallel map function for a decision tree implementation
>>> (
>>> https://notabug.org/ZelphirKaltstahl/guile-ml/src/cf666801fea91c9fa8fa290764ff6c60b7f3949d/decision-tree.scm),
>>>
>>> where each branch while learning the tree would call parallel map again
>>> for sub
>>> branches and so on. Somehow it made Guile crash (I don't have the error
>>> message
>>> any longer, but I did post about it on the mailing list back then.). I
>>> never
>>> figured out, what went wrong. All I had was pure function calls and math
>>> inside
>>> the thing that parallel map was supposed to run.
>>>
>>> Ultimately I simply tried other parallelism constructs and when I
>>> switched to
>>> using futures instead, everything worked fine, no crashes, no errors.
>>>
>>> Since that time, I did not use parallel map and instead used futures.
>>> Recently I
>>> made a parallelization thing for solving exercises of Project Euler
>>> using
>>> multiple cores, so that some solutions are calculated faster. Maybe this
>>> can
>>> help or can be adapted to another use case:
>>>
>>>
>>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/parallelism.scm#L11-L30
>>>
>>> It expects ranges of things, which are called `segments` in the code.
>>> Usually
>>> ranges of numbers for Project Euler things. Here is the code to split a
>>> range
>>> into segments:
>>>
>>>
>>> https://notabug.org/ZelphirKaltstahl/guile-project-euler-solutions/src/ebb19b11b465903105924adb6252f1e2ecf63859/lib/segment.scm
>>>
>>> (Check any solution using it for an example.)
>>>
>>> So this might be a bit too specific for general parallel things, but I
>>> guess one
>>> could change the way futures are used in `run-in-parallel`, to fit any
>>> other
>>> purpose.
>>>
>>> Best regards,
>>> Zelphir
>>>
>>> --
>>> repositories: https://notabug.org/ZelphirKaltstahl
>>>
>>>
>>>

[-- Attachment #2: Type: text/html, Size: 17537 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-10-25 14:09 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-12 17:19 map-par slower than map Damien Mattei
2022-10-12 18:45 ` Maxime Devos
2022-10-12 20:20   ` Damien Mattei
2022-10-12 20:27     ` Damien Mattei
2022-10-12 21:29       ` Zelphir Kaltstahl
2022-10-14  8:21         ` Damien Mattei
2022-10-14  8:38           ` Zelphir Kaltstahl
2022-10-17 13:17             ` Damien Mattei
2022-10-22 16:01               ` Damien Mattei
2022-10-23  1:06                 ` Damien Mattei
2022-10-25  9:07         ` Mikael Djurfeldt
2022-10-25  9:11           ` Mikael Djurfeldt
2022-10-25 14:09             ` Damien Mattei
2022-10-12 21:44     ` Maxime Devos
2022-10-12 21:55 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
2022-10-13  7:40   ` Damien Mattei
2022-10-13  8:20     ` Damien Mattei
2022-10-13  9:10     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
2022-10-13 10:44   ` Damien Mattei
2022-10-13 11:00     ` Olivier Dion via Developers list for Guile, the GNU extensibility library
     [not found]       ` <CADEOadfovi8s3OxRcssWOuOW8jjHoL9Z7pD_5FstSm=ZkBHP8g@mail.gmail.com>
2022-10-13 11:57         ` Fwd: " Damien Mattei
2022-10-13 12:36           ` Damien Mattei
2022-10-13 12:41             ` Olivier Dion via Developers list for Guile, the GNU extensibility library
2022-10-13 13:43               ` Damien Mattei
2022-10-13 14:06                 ` Olivier Dion via Developers list for Guile, the GNU extensibility library
2022-10-13 14:10                   ` Damien Mattei
2022-10-13 14:21                     ` Damien Mattei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).