From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Stefan Israelsson Tampe <stefan.itampe@gmail.com>
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Register VM WIP
Date: Fri, 11 May 2012 22:29:11 +0200
Message-ID: <CAGua6m0=cav13eX29vyTsAAf0KFdgRaO48_cmwY8mDBFqY5L-w@mail.gmail.com>
References: <871umqr8q0.fsf@pobox.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=14dae9340c45ce6d6904bfc8968b
X-Trace: dough.gmane.org 1336768163 13639 80.91.229.3 (11 May 2012 20:29:23 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Fri, 11 May 2012 20:29:23 +0000 (UTC)
Cc: guile-devel <guile-devel@gnu.org>
To: Andy Wingo <wingo@pobox.com>
Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Fri May 11 22:29:23 2012
Return-path: <guile-devel-bounces+guile-devel=m.gmane.org@gnu.org>
Envelope-to: guile-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <guile-devel-bounces+guile-devel=m.gmane.org@gnu.org>)
	id 1SSwSn-0002Px-9C
	for guile-devel@m.gmane.org; Fri, 11 May 2012 22:29:21 +0200
Original-Received: from localhost ([::1]:42765 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <guile-devel-bounces+guile-devel=m.gmane.org@gnu.org>)
	id 1SSwSm-0006MT-Dh
	for guile-devel@m.gmane.org; Fri, 11 May 2012 16:29:20 -0400
Original-Received: from eggs.gnu.org ([208.118.235.92]:34977)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stefan.itampe@gmail.com>) id 1SSwSi-0006MD-R9
	for guile-devel@gnu.org; Fri, 11 May 2012 16:29:19 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <stefan.itampe@gmail.com>) id 1SSwSg-0004Io-AK
	for guile-devel@gnu.org; Fri, 11 May 2012 16:29:16 -0400
Original-Received: from mail-yx0-f169.google.com ([209.85.213.169]:58500)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stefan.itampe@gmail.com>) id 1SSwSg-0004IY-2B
	for guile-devel@gnu.org; Fri, 11 May 2012 16:29:14 -0400
Original-Received: by yenm7 with SMTP id m7so3797811yen.0
	for <guile-devel@gnu.org>; Fri, 11 May 2012 13:29:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	bh=EbJyu7xwW4Z+0pAB3TBnMaXAkDWgBsSql8yj0JSRby8=;
	b=VCkvbX+HUemWekosnGg77NSnJ02Bt7omhh8TOCd3pCFrPVwLUHwuiXyJmUNK8hHz5O
	oY0EvG631jUF/y3ugB3YgGm8bLLwRE2vZuvT+RVPJMoLSLsZgSePWmPDyPkttUwmZT/p
	GrkeDt27fE2VuWYOqnytYr8eLoyCsDyTz+dUYl+oV+KmsEO9wFPnymj8j24JMN9VSv87
	WS9Nf9I+6idlqur0NRGgQUCRwXYfz/pod3uio7j0qM3xK4SFVcyOt8TBnEH2bpVq1foI
	sB56A46aAoBZIowN3b4quKg/X8ewVIzNyXwXgCQZUOpjZ4K9Em3Upv9b4x9NaYu+g++n
	oSIw==
Original-Received: by 10.50.15.137 with SMTP id x9mr2446631igc.8.1336768151317; Fri, 11
	May 2012 13:29:11 -0700 (PDT)
Original-Received: by 10.50.242.102 with HTTP; Fri, 11 May 2012 13:29:11 -0700 (PDT)
In-Reply-To: <871umqr8q0.fsf@pobox.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-Received-From: 209.85.213.169
X-BeenThere: guile-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Developers list for Guile,
	the GNU extensibility library" <guile-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guile-devel>,
	<mailto:guile-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/guile-devel>
List-Post: <mailto:guile-devel@gnu.org>
List-Help: <mailto:guile-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guile-devel>,
	<mailto:guile-devel-request@gnu.org?subject=subscribe>
Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org
Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.lisp.guile.devel:14396
Archived-At: <http://permalink.gmane.org/gmane.lisp.guile.devel/14396>

--14dae9340c45ce6d6904bfc8968b
Content-Type: text/plain; charset=ISO-8859-1

Hi,

This looks very good. i like the hole approach and this approach has the
potential to address most of the issues I have seen when disassembling
guile-2.0 output. A few notes.

1. What about growing stacks any coments if they will be easier to manage
for this setup. Can one copy the C stack logic?

2. Is there an instruction that does what call does but can be used for
tail call's
when it needs it e.g. the code
 for (n = 0; n < nargs; n++)
        LOCAL_SET (n, old_fp[ip[4 + n]]);
that is missing for the tail code

3. I would appriciate if the frame is always below say 256 SCM:s of the fp
stack limit
that way when preparing tail calling one doesn't usally need to check if
the argument fit's
when issuing a tail call. If you compile a function that tail call more
then 254 (?) arguments
then you can as well check because then be free relative the argument
handling.

4. I think the logic code hook I recently investigated could easily fit
into this VM engine with
using similar techniques as I described in previous mails.

Thanks for your work on this
Stefan


On Fri, May 11, 2012 at 6:19 PM, Andy Wingo <wingo@pobox.com> wrote:

> Hi all,
>
> This mail announces some very early work on a register VM.  The code is
> in wip-rtl ("work in progress, register transfer language".  The latter
> bit is something of a misnomer.).  There is not much there yet:
> basically just the VM, an assembler, and a disassembler.  Still, it's
> interesting, and I thought people might want to hear more about it.
>
> So, the deal: why is it interesting to switch from a stack VM, which is
> what we have, to a register VM?  There are three overriding
> disadvantages to the current VM.
>
>  1) With our stack VM, instructions are smaller.  They do less, so you
>     need more of them.  This increases dispatch cost, which is the
>     largest cost of a VM.
>
>  2) On a stack VM, there is a penalty to naming values.  Since the only
>     values that are accessible to an instruction are the ones on the
>     top of the stack, whenever you want to use more names, you end up
>     doing a lot of local-ref / local-set operations.  In contrast an
>     instruction for a register VM can address many more operands, so
>     there is much less penalty to storing something on the stack.  (The
>     penalty is not so much in the storage, but in the pushing and
>     popping needed to access it.)
>
>  3) Our stack VM has variable-sized stack frames, so we need to check
>     for overflow every time we push a value on the stack.  This is
>     quite costly.
>
> The WIP register VM fixes all of these issues.
>
> The basic design of the VM is: 32-bit instruction words, 8-bit opcodes,
> variable-length instructions, maximum of 24-bit register addressing, and
> static, relocatable allocation of constants.
>
> Also, with the wip-rtl VM there is no stack pointer: locals are
> addressed directly via the frame pointer, and the call frame for a
> function is of a fixed size.  Indeed the IP and FP are the only state
> variables of the VM, which makes it much easier to think about native
> compilation, given the scarcity of CPU registers on some architectures.
>
> See vm-engine.c from around line 1000 for a commented set of
> instructions.  It's messy in many ways now, but hey.
>
> As far as performance goes, we won't know yet.  But at least for a
> simple loop, counting down from a billion, the register VM is a few
> times faster than the stack VM.  Still, I would be happy if the general
> speedup were on the order of 40%.  We'll see.
>
> Here's that loop in rtl VM:
>
>   (use-modules (system vm rtl))
>
>   (assemble-rtl-program
>     0
>     '((assert-nargs-ee/locals 1 2)
>       (br fix-body)
>       loop-head
>       (make-short-immediate 2 0)
>       (br-if-= 1 2 out)
>       (sub1 1 1)
>       (br loop-head)
>       fix-body
>       (mov 1 0)
>       (br loop-head)
>       out
>       (make-short-immediate 0 #t)
>       (return 0)))
>
> There are various ways to improve this, but its structure is like what
> the stack VM produces.
>
> Compare to the current opcode:
>
>    scheme@(guile-user)> (define (countdown n) (let lp ((n n)) (or (zero?
> n) (lp (1- n)))))
>    scheme@(guile-user)> ,x countdown
>    Disassembly of #<procedure countdown (n)>:
>
>       0    (assert-nargs-ee/locals 17)     ;; 1 arg, 2 locals
>       2    (br :L186)                      ;; -> 30
>       6    (local-ref 1)                   ;; `n'
>       8    (make-int8:0)                   ;; 0
>       9    (ee?)
>      10    (local-set 2)                   ;; `t'
>      12    (local-ref 2)                   ;; `t'
>      14    (br-if-not :L187)               ;; -> 21
>      18    (local-ref 2)                   ;; `t'
>      20    (return)
>      21    (local-ref 1)                   ;; `n'
>      23    (sub1)
>      24    (local-set 1)                   ;; `n'
>      26    (br :L188)                      ;; -> 6
>      30    (local-ref 0)                   ;; `n'
>      32    (local-set 1)
>      34    (br :L188)                      ;; -> 6
>
> OK, time to set down the keyboard; been working far too much on this in
> recent days.  I still need to adapt the compiler to produce RTL
> bytecode.  I am going to let it sit for a week or two before touching it
> again.  Comments welcome.
>
> Regards,
>
> Andy
> --
> http://wingolog.org/
>
>

--14dae9340c45ce6d6904bfc8968b
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi,<br><br>This looks very good. i like the hole approach and this approach=
 has the <br>potential to address most of the issues I have seen when disas=
sembling<br>guile-2.0 output. A few notes.<br><br>1. What about growing sta=
cks any coments if they will be easier to manage<br>
for this setup. Can one copy the C stack logic?<br><br>2. Is there an instr=
uction that does what call does but can be used for tail call&#39;s<br>when=
 it needs it e.g. the code<br>=A0for (n =3D 0; n &lt; nargs; n++)<br>=A0=A0=
=A0=A0=A0=A0=A0 LOCAL_SET (n, old_fp[ip[4 + n]]);<br>
that is missing for the tail code<br><br>3. I would appriciate if the frame=
 is always below say 256 SCM:s of the fp stack limit<br>that way when prepa=
ring tail calling one doesn&#39;t usally need to check if the argument fit&=
#39;s<br>
when issuing a tail call. If you compile a function that tail call more the=
n 254 (?) arguments<br>then you can as well check because then be free rela=
tive the argument handling.<br><br>4. I think the logic code hook I recentl=
y investigated could easily fit into this VM engine with<br>
using similar techniques as I described in previous mails.<br><br>Thanks fo=
r your work on this<br>Stefan<br><br><br><div class=3D"gmail_quote">On Fri,=
 May 11, 2012 at 6:19 PM, Andy Wingo <span dir=3D"ltr">&lt;<a href=3D"mailt=
o:wingo@pobox.com" target=3D"_blank">wingo@pobox.com</a>&gt;</span> wrote:<=
br>
<blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p=
x #ccc solid;padding-left:1ex">Hi all,<br>
<br>
This mail announces some very early work on a register VM. =A0The code is<b=
r>
in wip-rtl (&quot;work in progress, register transfer language&quot;. =A0Th=
e latter<br>
bit is something of a misnomer.). =A0There is not much there yet:<br>
basically just the VM, an assembler, and a disassembler. =A0Still, it&#39;s=
<br>
interesting, and I thought people might want to hear more about it.<br>
<br>
So, the deal: why is it interesting to switch from a stack VM, which is<br>
what we have, to a register VM? =A0There are three overriding<br>
disadvantages to the current VM.<br>
<br>
 =A01) With our stack VM, instructions are smaller. =A0They do less, so you=
<br>
 =A0 =A0 need more of them. =A0This increases dispatch cost, which is the<b=
r>
 =A0 =A0 largest cost of a VM.<br>
<br>
 =A02) On a stack VM, there is a penalty to naming values. =A0Since the onl=
y<br>
 =A0 =A0 values that are accessible to an instruction are the ones on the<b=
r>
 =A0 =A0 top of the stack, whenever you want to use more names, you end up<=
br>
 =A0 =A0 doing a lot of local-ref / local-set operations. =A0In contrast an=
<br>
 =A0 =A0 instruction for a register VM can address many more operands, so<b=
r>
 =A0 =A0 there is much less penalty to storing something on the stack. =A0(=
The<br>
 =A0 =A0 penalty is not so much in the storage, but in the pushing and<br>
 =A0 =A0 popping needed to access it.)<br>
<br>
 =A03) Our stack VM has variable-sized stack frames, so we need to check<br=
>
 =A0 =A0 for overflow every time we push a value on the stack. =A0This is<b=
r>
 =A0 =A0 quite costly.<br>
<br>
The WIP register VM fixes all of these issues.<br>
<br>
The basic design of the VM is: 32-bit instruction words, 8-bit opcodes,<br>
variable-length instructions, maximum of 24-bit register addressing, and<br=
>
static, relocatable allocation of constants.<br>
<br>
Also, with the wip-rtl VM there is no stack pointer: locals are<br>
addressed directly via the frame pointer, and the call frame for a<br>
function is of a fixed size. =A0Indeed the IP and FP are the only state<br>
variables of the VM, which makes it much easier to think about native<br>
compilation, given the scarcity of CPU registers on some architectures.<br>
<br>
See vm-engine.c from around line 1000 for a commented set of<br>
instructions. =A0It&#39;s messy in many ways now, but hey.<br>
<br>
As far as performance goes, we won&#39;t know yet. =A0But at least for a<br=
>
simple loop, counting down from a billion, the register VM is a few<br>
times faster than the stack VM. =A0Still, I would be happy if the general<b=
r>
speedup were on the order of 40%. =A0We&#39;ll see.<br>
<br>
Here&#39;s that loop in rtl VM:<br>
<br>
 =A0 (use-modules (system vm rtl))<br>
<br>
 =A0 (assemble-rtl-program<br>
 =A0 =A0 0<br>
 =A0 =A0 &#39;((assert-nargs-ee/locals 1 2)<br>
 =A0 =A0 =A0 (br fix-body)<br>
 =A0 =A0 =A0 loop-head<br>
 =A0 =A0 =A0 (make-short-immediate 2 0)<br>
 =A0 =A0 =A0 (br-if-=3D 1 2 out)<br>
 =A0 =A0 =A0 (sub1 1 1)<br>
 =A0 =A0 =A0 (br loop-head)<br>
 =A0 =A0 =A0 fix-body<br>
 =A0 =A0 =A0 (mov 1 0)<br>
 =A0 =A0 =A0 (br loop-head)<br>
 =A0 =A0 =A0 out<br>
 =A0 =A0 =A0 (make-short-immediate 0 #t)<br>
 =A0 =A0 =A0 (return 0)))<br>
<br>
There are various ways to improve this, but its structure is like what<br>
the stack VM produces.<br>
<br>
Compare to the current opcode:<br>
<br>
 =A0 =A0scheme@(guile-user)&gt; (define (countdown n) (let lp ((n n)) (or (=
zero? n) (lp (1- n)))))<br>
 =A0 =A0scheme@(guile-user)&gt; ,x countdown<br>
 =A0 =A0Disassembly of #&lt;procedure countdown (n)&gt;:<br>
<br>
 =A0 =A0 =A0 0 =A0 =A0(assert-nargs-ee/locals 17) =A0 =A0 ;; 1 arg, 2 local=
s<br>
 =A0 =A0 =A0 2 =A0 =A0(br :L186) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0;; -&gt; 30<br>
 =A0 =A0 =A0 6 =A0 =A0(local-ref 1) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`n&#39;<br>
 =A0 =A0 =A0 8 =A0 =A0(make-int8:0) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
0<br>
 =A0 =A0 =A0 9 =A0 =A0(ee?)<br>
 =A0 =A0 =A010 =A0 =A0(local-set 2) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`t&#39;<br>
 =A0 =A0 =A012 =A0 =A0(local-ref 2) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`t&#39;<br>
 =A0 =A0 =A014 =A0 =A0(br-if-not :L187) =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; -&gt=
; 21<br>
 =A0 =A0 =A018 =A0 =A0(local-ref 2) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`t&#39;<br>
 =A0 =A0 =A020 =A0 =A0(return)<br>
 =A0 =A0 =A021 =A0 =A0(local-ref 1) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`n&#39;<br>
 =A0 =A0 =A023 =A0 =A0(sub1)<br>
 =A0 =A0 =A024 =A0 =A0(local-set 1) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`n&#39;<br>
 =A0 =A0 =A026 =A0 =A0(br :L188) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0;; -&gt; 6<br>
 =A0 =A0 =A030 =A0 =A0(local-ref 0) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ;; =
`n&#39;<br>
 =A0 =A0 =A032 =A0 =A0(local-set 1)<br>
 =A0 =A0 =A034 =A0 =A0(br :L188) =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0;; -&gt; 6<br>
<br>
OK, time to set down the keyboard; been working far too much on this in<br>
recent days. =A0I still need to adapt the compiler to produce RTL<br>
bytecode. =A0I am going to let it sit for a week or two before touching it<=
br>
again. =A0Comments welcome.<br>
<br>
Regards,<br>
<br>
Andy<br>
<span class=3D"HOEnZb"><font color=3D"#888888">--<br>
<a href=3D"http://wingolog.org/" target=3D"_blank">http://wingolog.org/</a>=
<br>
<br>
</font></span></blockquote></div><br>

--14dae9340c45ce6d6904bfc8968b--