From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Daniel Llorens Newsgroups: gmane.lisp.guile.devel Subject: Re: array-copy! is slow & array-map.c (was: Extremly slow for format & string-join) Date: Tue, 2 Apr 2013 16:06:16 +0200 Message-ID: References: <919253B8-A19B-418F-9F2E-603A2AE9DC7E@bluewin.ch> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: multipart/mixed; boundary=Apple-Mail-143-57319837 X-Trace: ger.gmane.org 1364912822 1832 80.91.229.3 (2 Apr 2013 14:27:02 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 2 Apr 2013 14:27:02 +0000 (UTC) To: guile-devel Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Tue Apr 02 16:27:30 2013 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1UN2BN-0002Ux-8t for guile-devel@m.gmane.org; Tue, 02 Apr 2013 16:27:29 +0200 Original-Received: from localhost ([::1]:58120 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UN2Ay-0005BW-Eg for guile-devel@m.gmane.org; Tue, 02 Apr 2013 10:27:04 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:46301) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UN2Ap-0005B3-Io for guile-devel@gnu.org; Tue, 02 Apr 2013 10:27:00 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UN2Am-0004JE-59 for guile-devel@gnu.org; Tue, 02 Apr 2013 10:26:55 -0400 Original-Received: from smtp3.infomaniak.ch ([2001:1600:2:5:92b1:1cff:fe01:147]:47692) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UN2Al-0004IC-Nn for guile-devel@gnu.org; Tue, 02 Apr 2013 10:26:52 -0400 Original-Received: from [172.16.96.17] (mail.infoklick.ch [62.2.203.131]) (authenticated bits=0) by smtp3.infomaniak.ch (8.14.5/8.14.5) with ESMTP id r32E6Ff2024601 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO) for ; Tue, 2 Apr 2013 16:06:15 +0200 In-Reply-To: <919253B8-A19B-418F-9F2E-603A2AE9DC7E@bluewin.ch> X-Mailer: Apple Mail (2.1085) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2001:1600:2:5:92b1:1cff:fe01:147 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:16105 Archived-At: --Apple-Mail-143-57319837 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On Apr 2, 2013, at 12:19, Daniel Llorens wrote: > On Apr 1, 2013, at 19:15, Daniel Llorens wrote: >=20 >> Third, none of the above are causing the slowness of array-copy!. I = noticed that there's a double indirection in racp(). The second patch = removes it. Actually this double indirection goes on all over = array-map.c and I don't understand why it's needed... >=20 > This patch does the same for array-fill!. The improvement is similar. These two patches do it for array-map!. The first patch avoids cons for the 1-argument case. The second patch = removes the double indirection for the first two arguments in every = case. Since there's some work done inside the loop, the improvement is = smaller than for array-copy! or array-fill!. Before the first patch: scheme@(guile-user)> (define a (make-array 0. 1000000 10)) scheme@(guile-user)> (define b (make-array 0. 1000000 10)) scheme@(guile-user)> (define c (make-array *unspecified* 1000000 10)) scheme@(guile-user)> (define d (transpose-array (make-array = *unspecified* 10 1000000) 1 0)) scheme@(guile-user)> ,time (array-map! c (const 0.)) ;; 0.558498s real time, 0.556974s run time. 0.000000s spent in GC. scheme@(guile-user)> ,time (array-map! c values a) ;; 1.467717s real time, 1.463470s run time. 0.269786s spent in GC. scheme@(guile-user)> ,time (array-map! c values d) ;; 1.500785s real time, 1.496482s run time. 0.254156s spent in GC. scheme@(guile-user)> ,time (array-map! c (lambda (a b) (+ a b)) a b) ;; 2.278655s real time, 2.271948s run time. 0.528379s spent in GC. After the first patch: scheme@(guile-user)> (define a (make-array 0. 1000000 10)) scheme@(guile-user)> (define b (make-array 0. 1000000 10)) scheme@(guile-user)> (define c (make-array *unspecified* 1000000 10)) scheme@(guile-user)> (define d (transpose-array (make-array = *unspecified* 10 1000000) 1 0)) scheme@(guile-user)> ,time (array-map! c (const 0.)) ;; 0.559337s real time, 0.557811s run time. 0.000000s spent in GC. scheme@(guile-user)> ,time (array-map! c values a) ;; 1.038836s real time, 1.035859s run time. 0.134621s spent in GC. scheme@(guile-user)> ,time (array-map! c values d) ;; 1.041937s real time, 1.039000s run time. 0.125787s spent in GC. scheme@(guile-user)> ,time (array-map! c (lambda (a b) (+ a b)) a b) ;; 2.294594s real time, 2.287883s run time. 0.518576s spent in GC. After both patches: scheme@(guile-user)> (define a (make-array 0. 1000000 10)) scheme@(guile-user)> (define b (make-array 0. 1000000 10)) scheme@(guile-user)> (define c (make-array *unspecified* 1000000 10)) scheme@(guile-user)> (define d (transpose-array (make-array = *unspecified* 10 1000000) 1 0)) scheme@(guile-user)> ,time (array-map! c (const 0.)) ;; 0.467621s real time, 0.466343s run time. 0.000000s spent in GC. scheme@(guile-user)> ,time (array-map! c values a) ;; 0.823515s real time, 0.821152s run time. 0.135106s spent in GC. scheme@(guile-user)> ,time (array-map! c values d) ;; 0.825138s real time, 0.822763s run time. 0.124970s spent in GC. scheme@(guile-user)> ,time (array-map! c (lambda (a b) (+ a b)) a b) ;; 2.066735s real time, 2.060635s run time. 0.525089s spent in GC. --Apple-Mail-143-57319837 Content-Disposition: attachment; filename=0005-Avoid-per-element-cons-for-1-arg-case-of-array-map.patch Content-Type: application/octet-stream; name="0005-Avoid-per-element-cons-for-1-arg-case-of-array-map.patch" Content-Transfer-Encoding: quoted-printable =46rom=201bdf84a420defad44721e7ba1915e16621745b57=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20Daniel=20Llorens=20=0A= Date:=20Tue,=202=20Apr=202013=2015:23:55=20+0200=0ASubject:=20[PATCH=20= 5/6]=20Avoid=20per-element=20cons=20for=201-arg=20case=20of=20array-map!=0A= =0A*=20libguile/array-map.c:=20(ramap):=20special=20case=20when=20ras=20= is=20a=201-element=20list.=0A---=0A=20libguile/array-map.c=20|=20=20=20= 45=20++++++++++++++++++++++++++-------------------=0A=201=20file=20= changed,=2026=20insertions(+),=2019=20deletions(-)=0A=0Adiff=20--git=20= a/libguile/array-map.c=20b/libguile/array-map.c=0Aindex=20= f591419..079e3b1=20100644=0A---=20a/libguile/array-map.c=0A+++=20= b/libguile/array-map.c=0A@@=20-390,31=20+390,38=20@@=20SCM_DEFINE=20= (scm_array_copy_x,=20"array-copy!",=202,=200,=200,=0A=20static=20int=0A=20= ramap=20(SCM=20ra0,=20SCM=20proc,=20SCM=20ras)=0A=20{=0A-=20=20long=20i=20= =3D=20SCM_I_ARRAY_DIMS=20(ra0)->lbnd;=0A-=20=20long=20inc=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->inc;=0A-=20=20long=20n=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->ubnd;=0A-=20=20long=20base=20=3D=20= SCM_I_ARRAY_BASE=20(ra0)=20-=20i=20*=20inc;=0A+=20=20ssize_t=20i=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->lbnd;=0A+=0A+=20=20size_t=20i0=20=3D=20= SCM_I_ARRAY_BASE=20(ra0);=0A+=20=20ssize_t=20inc0=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->inc;=0A+=20=20size_t=20n=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->ubnd=20-=20SCM_I_ARRAY_DIMS=20(ra0)->lbnd=20+=20= 1;=0A+=20=20size_t=20i0end=20=3D=20i0=20+=20n*inc0;=0A=20=20=20ra0=20=3D=20= SCM_I_ARRAY_V=20(ra0);=0A=20=20=20if=20(scm_is_null=20(ras))=0A-=20=20=20= =20for=20(;=20i=20<=3D=20n;=20i++)=0A-=20=20=20=20=20=20GVSET=20(ra0,=20= i*inc+base,=20scm_call_0=20(proc));=0A+=20=20=20=20for=20(;=20i0=20<=20= i0end;=20i0=20+=3D=20inc0)=0A+=20=20=20=20=20=20GVSET=20(ra0,=20i0,=20= scm_call_0=20(proc));=0A=20=20=20else=0A=20=20=20=20=20{=0A=20=20=20=20=20= =20=20SCM=20ra1=20=3D=20SCM_CAR=20(ras);=0A-=20=20=20=20=20=20SCM=20= args;=0A-=20=20=20=20=20=20unsigned=20long=20k,=20i1=20=3D=20= SCM_I_ARRAY_BASE=20(ra1);=0A-=20=20=20=20=20=20long=20inc1=20=3D=20= SCM_I_ARRAY_DIMS=20(ra1)->inc;=0A+=20=20=20=20=20=20size_t=20i1=20=3D=20= SCM_I_ARRAY_BASE=20(ra1);=0A+=20=20=20=20=20=20ssize_t=20inc1=20=3D=20= SCM_I_ARRAY_DIMS=20(ra1)->inc;=0A=20=20=20=20=20=20=20ra1=20=3D=20= SCM_I_ARRAY_V=20(ra1);=0A-=20=20=20=20=20=20ras=20=3D=20scm_vector=20= (SCM_CDR=20(ras));=0A-=20=20=20=20=20=20=0A-=20=20=20=20=20=20for=20(;=20= i=20<=3D=20n;=20i++,=20i1=20+=3D=20inc1)=0A-=09{=0A-=09=20=20args=20=3D=20= SCM_EOL;=0A-=09=20=20for=20(k=20=3D=20scm_c_vector_length=20(ras);=20= k--;)=0A-=09=20=20=20=20args=20=3D=20scm_cons=20(GVREF=20= (scm_c_vector_ref=20(ras,=20k),=20i),=20args);=0A-=09=20=20args=20=3D=20= scm_cons=20(GVREF=20(ra1,=20i1),=20args);=0A-=09=20=20GVSET=20(ra0,=20= i*inc+base,=20scm_apply_0=20(proc,=20args));=0A-=09}=0A+=20=20=20=20=20=20= ras=20=3D=20SCM_CDR=20(ras);=0A+=20=20=20=20=20=20if=20= (scm_is_null(ras))=0A+=20=20=20=20=20=20=20=20=20=20for=20(;=20i0=20<=20= i0end;=20i0=20+=3D=20inc0,=20i1=20+=3D=20inc1)=0A+=20=20=20=20=20=20=20=20= =20=20=20=20GVSET=20(ra0,=20i0,=20scm_call_1=20(proc,=20GVREF=20(ra1,=20= i1)));=0A+=20=20=20=20=20=20else=0A+=20=20=20=20=20=20=20=20{=0A+=20=20=20= =20=20=20=20=20=20=20ras=20=3D=20scm_vector=20(ras);=0A+=20=20=20=20=20=20= =20=20=20=20for=20(;=20i0=20<=20i0end;=20i0=20+=3D=20inc0,=20i1=20+=3D=20= inc1,=20++i)=0A+=20=20=20=20=20=20=20=20=20=20=20=20{=0A+=20=20=20=20=20=20= =20=20=20=20=20=20=20=20SCM=20args=20=3D=20SCM_EOL;=0A+=20=20=20=20=20=20= =20=20=20=20=20=20=20=20unsigned=20long=20k;=0A+=20=20=20=20=20=20=20=20=20= =20=20=20=20=20for=20(k=20=3D=20scm_c_vector_length=20(ras);=20k--;)=0A+=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20args=20=3D=20scm_cons=20= (GVREF=20(scm_c_vector_ref=20(ras,=20k),=20i),=20args);=0A+=20=20=20=20=20= =20=20=20=20=20=20=20=20=20GVSET=20(ra0,=20i0,=20scm_apply_1=20(proc,=20= GVREF=20(ra1,=20i1),=20args));=0A+=20=20=20=20=20=20=20=20=20=20=20=20}=0A= +=20=20=20=20=20=20=20=20}=0A=20=20=20=20=20}=0A=20=20=20return=201;=0A=20= }=0A--=20=0A1.7.9.5=0A=0A= --Apple-Mail-143-57319837 Content-Disposition: attachment; filename=0006-Remove-double-indirection-in-array-map-with-2-args.patch Content-Type: application/octet-stream; name="0006-Remove-double-indirection-in-array-map-with-2-args.patch" Content-Transfer-Encoding: quoted-printable =46rom=2092ecf4b1724ef2db958cf9b513822e6c62c95536=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20Daniel=20Llorens=20=0A= Date:=20Tue,=202=20Apr=202013=2015:53:22=20+0200=0ASubject:=20[PATCH=20= 6/6]=20Remove=20double=20indirection=20in=20array-map!=20with=20<2=20= args=0A=0A*=20libguile/array-map.c:=20(ramap):=20factor=20GVSET/GVREF=20= out=20of=20rank-1=20loop=0A=20=20for=20ra0=20and=20the=20first=20element=20= of=20ras.=0A---=0A=20libguile/array-map.c=20|=20=20=2032=20= +++++++++++++++++++-------------=0A=201=20file=20changed,=2019=20= insertions(+),=2013=20deletions(-)=0A=0Adiff=20--git=20= a/libguile/array-map.c=20b/libguile/array-map.c=0Aindex=20= 079e3b1..7d5b19f=20100644=0A---=20a/libguile/array-map.c=0A+++=20= b/libguile/array-map.c=0A@@=20-391,25=20+391,31=20@@=20static=20int=0A=20= ramap=20(SCM=20ra0,=20SCM=20proc,=20SCM=20ras)=0A=20{=0A=20=20=20ssize_t=20= i=20=3D=20SCM_I_ARRAY_DIMS=20(ra0)->lbnd;=0A-=0A-=20=20size_t=20i0=20=3D=20= SCM_I_ARRAY_BASE=20(ra0);=0A-=20=20ssize_t=20inc0=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->inc;=0A-=20=20size_t=20n=20=3D=20= SCM_I_ARRAY_DIMS=20(ra0)->ubnd=20-=20SCM_I_ARRAY_DIMS=20(ra0)->lbnd=20+=20= 1;=0A-=20=20size_t=20i0end=20=3D=20i0=20+=20n*inc0;=0A-=20=20ra0=20=3D=20= SCM_I_ARRAY_V=20(ra0);=0A+=20=20size_t=20n=20=3D=20SCM_I_ARRAY_DIMS=20= (ra0)->ubnd=20-=20i=20+=201;=0A+=0A+=20=20scm_t_array_handle=20h0;=0A+=20= =20size_t=20i0,=20i0end;=0A+=20=20ssize_t=20inc0;=0A+=20=20= scm_generalized_vector_get_handle=20(SCM_I_ARRAY_V=20(ra0),=20&h0);=0A+=20= =20i0=20=3D=20h0.base=20+=20h0.dims[0].lbnd=20+=20SCM_I_ARRAY_BASE=20= (ra0)*h0.dims[0].inc;=0A+=20=20inc0=20=3D=20SCM_I_ARRAY_DIMS=20= (ra0)->inc=20*=20h0.dims[0].inc;=0A+=20=20i0end=20=3D=20i0=20+=20n*inc0;=0A= =20=20=20if=20(scm_is_null=20(ras))=0A=20=20=20=20=20for=20(;=20i0=20<=20= i0end;=20i0=20+=3D=20inc0)=0A-=20=20=20=20=20=20GVSET=20(ra0,=20i0,=20= scm_call_0=20(proc));=0A+=20=20=20=20=20=20h0.impl->vset=20(&h0,=20i0,=20= scm_call_0=20(proc));=0A=20=20=20else=0A=20=20=20=20=20{=0A=20=20=20=20=20= =20=20SCM=20ra1=20=3D=20SCM_CAR=20(ras);=0A-=20=20=20=20=20=20size_t=20= i1=20=3D=20SCM_I_ARRAY_BASE=20(ra1);=0A-=20=20=20=20=20=20ssize_t=20inc1=20= =3D=20SCM_I_ARRAY_DIMS=20(ra1)->inc;=0A-=20=20=20=20=20=20ra1=20=3D=20= SCM_I_ARRAY_V=20(ra1);=0A+=20=20=20=20=20=20scm_t_array_handle=20h1;=0A+=20= =20=20=20=20=20size_t=20i1;=0A+=20=20=20=20=20=20ssize_t=20inc1;=0A+=20=20= =20=20=20=20scm_generalized_vector_get_handle=20(SCM_I_ARRAY_V=20(ra1),=20= &h1);=0A+=20=20=20=20=20=20i1=20=3D=20h1.base=20+=20h1.dims[0].lbnd=20+=20= SCM_I_ARRAY_BASE=20(ra1)*h1.dims[0].inc;=0A+=20=20=20=20=20=20inc1=20=3D=20= SCM_I_ARRAY_DIMS=20(ra1)->inc=20*=20h1.dims[0].inc;=0A=20=20=20=20=20=20=20= ras=20=3D=20SCM_CDR=20(ras);=0A-=20=20=20=20=20=20if=20= (scm_is_null(ras))=0A+=20=20=20=20=20=20if=20(scm_is_null=20(ras))=0A=20=20= =20=20=20=20=20=20=20=20=20for=20(;=20i0=20<=20i0end;=20i0=20+=3D=20= inc0,=20i1=20+=3D=20inc1)=0A-=20=20=20=20=20=20=20=20=20=20=20=20GVSET=20= (ra0,=20i0,=20scm_call_1=20(proc,=20GVREF=20(ra1,=20i1)));=0A+=20=20=20=20= =20=20=20=20=20=20=20=20h0.impl->vset=20(&h0,=20i0,=20scm_call_1=20= (proc,=20h1.impl->vref=20(&h1,=20i1)));=0A=20=20=20=20=20=20=20else=0A=20= =20=20=20=20=20=20=20=20{=0A=20=20=20=20=20=20=20=20=20=20=20ras=20=3D=20= scm_vector=20(ras);=0A@@=20-419,7=20+425,7=20@@=20ramap=20(SCM=20ra0,=20= SCM=20proc,=20SCM=20ras)=0A=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= unsigned=20long=20k;=0A=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20for=20= (k=20=3D=20scm_c_vector_length=20(ras);=20k--;)=0A=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20args=20=3D=20scm_cons=20(GVREF=20= (scm_c_vector_ref=20(ras,=20k),=20i),=20args);=0A-=20=20=20=20=20=20=20=20= =20=20=20=20=20=20GVSET=20(ra0,=20i0,=20scm_apply_1=20(proc,=20GVREF=20= (ra1,=20i1),=20args));=0A+=20=20=20=20=20=20=20=20=20=20=20=20=20=20= h0.impl->vset=20(&h0,=20i0,=20scm_apply_1=20(proc,=20h1.impl->vref=20= (&h1,=20i1),=20args));=0A=20=20=20=20=20=20=20=20=20=20=20=20=20}=0A=20=20= =20=20=20=20=20=20=20}=0A=20=20=20=20=20}=0A--=20=0A1.7.9.5=0A=0A= --Apple-Mail-143-57319837--