From: Stefan Israelsson Tampe
Newsgroups: gmane.lisp.guile.devel
Subject: Re: more advanced bytevector => supervectors
Date: Wed, 8 Sep 2021 13:32:53 +0200
To: lloda
Cc: guile-devel
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000d5935405cb7a3e5a"

--000000000000d5935405cb7a3e5a
Content-Type: text/plain; charset="UTF-8"

Once I have this code working I will start examining the impact of the
design. I suspect that for complex cases it will, if not compiled, be way
slower than the compiled part, and this is really the hot path. For one
thing, we will create 2 closures that allocate memory at each iteration if
they are not inlined. The best solution is that wingo designs a very fast
compiler targeting this case from the beginning, meaning that Guile just
handles it perfectly even with potentially 1000s of cases. A second
possibility is that Guile had a fast compiler that would optimise a lambda
when one is fed to it. We could simulate this by simply passing lists
representing code and using compile to compile them, but my experience is
that this is very time consuming. I can experiment a little here to see the
actual timing.
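
A minimal sketch of how that timing experiment could look, assuming Guile's
`compile` from `(system base compile)`. The names `loop-source`,
`seconds-taken` and `compiled-loop` are made up for illustration, and the
loop is only a stand-in for the code the framework would generate:

```scheme
(use-modules (system base compile))

;; Hypothetical stand-in for generated loop code, passed as a list.
(define loop-source
  '(lambda (v n)
     (let lp ((i 0) (sum 0))
       (if (< i n)
           (lp (+ i 1) (+ sum (vector-ref v i)))
           sum))))

;; Wall-clock time of one call to THUNK, in seconds.
(define (seconds-taken thunk)
  (let ((start (get-internal-real-time)))
    (thunk)
    (/ (- (get-internal-real-time) start)
       (exact->inexact internal-time-units-per-second))))

(define compiled-loop #f)

;; Compile the list to a procedure and record how long that took.
(define compile-seconds
  (seconds-taken
   (lambda ()
     (set! compiled-loop (compile loop-source #:to 'value)))))

(format #t "compile took ~a s~%" compile-seconds)
(format #t "sum: ~a~%" (compiled-loop (vector 1 2 3 4) 4))
```

The interesting number is `compile-seconds` relative to the time one run of
the loop takes; it depends on the machine and the Guile version, so no
figure is claimed here.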

Anyhow, the idea with a fast compiler is that the first compiler could
prepare the setup so that compiling is really fast compared to starting
from scratch. Here advice from wingo would be appreciated.

A final possibility, which is not too bad speed-wise, is to do the
following inside the loop and create one big dispatch, like this, that is
executed each iteration.

;; Read one element; endian is a symbol, so compare it with eq?.
(let ((val (if (eq? endian 'little)
               (if float?
                   (if (= m 4) (get-f32 v1 k1 'little) (get-d64 v1 k1 'little))
                   ...))))
  ;; Write it back with the same dispatch.
  (if (eq? endian 'little)
      (if float?
          (if (= m 4) (set-f32 v1 k1 val 'little) (set-d64 v1 k1 val 'little))
          ...)))

Ideally, this is what the code should compile to if it can't create all
possible loops.
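
A runnable sketch of such a per-iteration dispatch, assuming the standard
`(rnrs bytevectors)` accessors stand in for get-f32/get-d64 and
set-f32/set-d64. The helpers `ref-element`, `set-element!` and
`copy-elements!` are hypothetical, and only the float cases are filled in:

```scheme
(use-modules (rnrs bytevectors))

;; Read one element of M bytes at byte offset K (hypothetical helper).
(define (ref-element bv k endian float? m)
  (if (eq? endian 'little)
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-ref bv k 'little)
              (bytevector-ieee-double-ref bv k 'little))
          (error "integer cases elided in this sketch"))
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-ref bv k 'big)
              (bytevector-ieee-double-ref bv k 'big))
          (error "integer cases elided in this sketch"))))

;; Write VAL with the same dispatch (hypothetical helper).
(define (set-element! bv k val endian float? m)
  (if (eq? endian 'little)
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-set! bv k val 'little)
              (bytevector-ieee-double-set! bv k val 'little))
          (error "integer cases elided in this sketch"))
      (if float?
          (if (= m 4)
              (bytevector-ieee-single-set! bv k val 'big)
              (bytevector-ieee-double-set! bv k val 'big))
          (error "integer cases elided in this sketch"))))

;; Copy N elements; both dispatches run on every iteration.
(define (copy-elements! src dst n endian float? m)
  (let lp ((i 0))
    (when (< i n)
      (let* ((k (* i m))
             (val (ref-element src k endian float? m)))
        (set-element! dst k val endian float? m))
      (lp (+ i 1)))))

;; Usage: copy two big-endian doubles.
(define src (make-bytevector 16 0))
(bytevector-ieee-double-set! src 0 1.5 'big)
(bytevector-ieee-double-set! src 8 -2.25 'big)
(define dst (make-bytevector 16 0))
(copy-elements! src dst 2 'big #t 8)
```

The tests on endian, float? and m never change inside the loop, which is
exactly the cost the rest of this mail tries to move out of it.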

Now, I do not like adjusting my code to output this, as it makes the
framework less powerful and useful, since every case will be a special
case. But what if you could mark a piece of code as less important? What we
want is a dispatch like so:

(if (eq? endian 'little) #:level-2 ...)

In the first pass, the if will be handled if endian is known (which reduces
complexity); otherwise the first pass will freeze this one and continue
with the whole shebang. Level 2 will be the basic compiler, where the
#:level-2 tag is ignored. Maybe this is a non-issue and the compiler
handles this gracefully.
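
As a rough user-level analogy only (this is not the proposed #:level-2
compiler feature), a macro can already resolve the test when the endianness
is a literal at expansion time and keep the runtime dispatch otherwise. The
macro name `ref-f32` is hypothetical:

```scheme
(use-modules (rnrs bytevectors))

(define-syntax ref-f32
  (lambda (stx)
    (syntax-case stx (quote)
      ;; Endianness is a quoted literal: no runtime test is emitted.
      ((_ bv k (quote e))
       #'(bytevector-ieee-single-ref bv k (quote e)))
      ;; Endianness only known at run time: keep ("freeze") the dispatch.
      ((_ bv k e)
       #'(let ((endian e))
           (if (eq? endian 'little)
               (bytevector-ieee-single-ref bv k 'little)
               (bytevector-ieee-single-ref bv k 'big)))))))

;; Usage: the first call expands without a test, the second keeps it.
(define bv (make-bytevector 4 0))
(bytevector-ieee-single-set! bv 0 1.5 'little)
(define known (ref-f32 bv 0 'little))
(define unknown (let ((e 'little)) (ref-f32 bv 0 e)))
```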

Also, the compiler could note that endian, nbits, single?, float?, etc. are
really created outside the loop, and prepare the code for handling all
cases: essentially, make sure to compile all nodes and make an area in the
code to modify. Then, before the loop, the code can decide which version to
use (here we can use padding, or a goto in case the padded area is so large
that a goto saves time). This means that the compiler has 33 cases for the
ref and 33 cases for the set! part in my most general version, which is OK
as each of them is typically small. So if I were the compiler, I would
produce the following layout, in pseudocode:

(if ... (copy StubRef1 to StubX ...))
(if ... (copy StubSet1 to StubY ...))
goto loop
StubRef1
...
StubRefN

StubSet1
...
StubSetN
loop:
(let lp (...)
  (let ((val StubX))
    StubY)
  (when ... (lp ...)))

This can be quite fast.
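
Portable Scheme cannot patch code in place, but the layout can be
approximated by picking the stubs once, before the loop. In this sketch the
helpers `select-ref-stub`, `select-set-stub` and `copy-elements/stubs!` are
hypothetical, `stub-x` and `stub-y` play the roles of StubX and StubY, and
only the float stubs are written out:

```scheme
(use-modules (rnrs bytevectors))

;; Choose the ref stub once, outside the loop.
(define (select-ref-stub endian float? m)
  (cond
   ((and float? (= m 4))
    (if (eq? endian 'little)
        (lambda (bv k) (bytevector-ieee-single-ref bv k 'little))
        (lambda (bv k) (bytevector-ieee-single-ref bv k 'big))))
   ((and float? (= m 8))
    (if (eq? endian 'little)
        (lambda (bv k) (bytevector-ieee-double-ref bv k 'little))
        (lambda (bv k) (bytevector-ieee-double-ref bv k 'big))))
   (else (error "stub elided in this sketch" float? m))))

;; Choose the set stub once, outside the loop.
(define (select-set-stub endian float? m)
  (cond
   ((and float? (= m 4))
    (if (eq? endian 'little)
        (lambda (bv k v) (bytevector-ieee-single-set! bv k v 'little))
        (lambda (bv k v) (bytevector-ieee-single-set! bv k v 'big))))
   ((and float? (= m 8))
    (if (eq? endian 'little)
        (lambda (bv k v) (bytevector-ieee-double-set! bv k v 'little))
        (lambda (bv k v) (bytevector-ieee-double-set! bv k v 'big))))
   (else (error "stub elided in this sketch" float? m))))

;; The loop body contains no dispatch, only calls to the chosen stubs.
(define (copy-elements/stubs! src dst n endian float? m)
  (let ((stub-x (select-ref-stub endian float? m))
        (stub-y (select-set-stub endian float? m)))
    (let lp ((i 0))
      (when (< i n)
        (let ((k (* i m)))
          (stub-y dst k (stub-x src k)))
        (lp (+ i 1))))))

;; Usage: copy three little-endian f32 values.
(define src32 (make-bytevector 12 0))
(bytevector-ieee-single-set! src32 0 1.0 'little)
(bytevector-ieee-single-set! src32 4 2.5 'little)
(bytevector-ieee-single-set! src32 8 -3.0 'little)
(define dst32 (make-bytevector 12 0))
(copy-elements/stubs! src32 dst32 3 'little #t 4)
```

Unlike real self-modifying code, this still pays for a procedure call per
access unless the compiler manages to inline the chosen closures.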

Self modifying code rocks!!!


--000000000000d5935405cb7a3e5a--