From: Noah Lavine
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Register VM WIP
Date: Wed, 16 May 2012 10:54:37 -0400
To: Mark H Weaver
Cc: Andy Wingo, Ludovic Courtès, guile-devel@gnu.org
Hi Mark,

You are thinking along lines very similar to how I used to think. But I have a different way of thinking about it that might make the design seem better.

In our current VM, we have two stacks: the local-variable stack, which holds frames for the different function calls and is generally what you'd think of as a stack, and the temporary-variable stack, which is literally a stack in the sense that you only ever operate on its top.
The temporary-variable stack makes us do a lot of unnecessary work, because we have to load things from the local-variable stack onto the temporary-variable stack. I think what Andy is proposing is to get rid of the temporary-variable stack and operate directly on the local-variable stack. We shouldn't think of these registers as being like machine registers, and in fact maybe "registers" is not a good name for these objects. They are really just variables in the topmost stack frame. This should only reduce memory usage, because the local-variable stack stays the same and the temporary-variable stack goes away (some temporaries might move to the local-variable stack, but there can't be more of them than were on the temporary-variable stack, so that's still a win).

The machine I was initially thinking of, and I imagine you were too, is different. I had imagined a machine where the number of registers was limited, ideally to the length of a processor cache line, and was separate from the local-variable stack. In such a machine, the registers are used as a cache for the local variables, and you get to deal with all the register-allocation problems that a standard compiler would. That would accomplish the goal of keeping more things in cache.

The "registers as cache" idea may result in faster code than the "directly addressing local variables" idea, but it's also more complicated to implement. So it makes sense to me that we would try directly addressing local variables first, and maybe later move to using a fixed-size cache of registers.

It also occurs to me that the RTL intermediate language, which is really just a language for directly addressing an arbitrary number of local variables, is a standard sort of compiler intermediate language. So it might be useful to have around anyway, because we could more easily feed its output into, for instance, GCC.

Andy, is this an accurate description of the register VM?
And Mark and everyone else, does it seem better when you look at it this way?

Noah

On Wed, May 16, 2012 at 9:44 AM, Mark H Weaver wrote:
> Hi Andy!
>
> Andy Wingo writes:
>> On Wed 16 May 2012 06:23, Mark H Weaver writes:
>>
>>> It's surprising to me for another reason: in order to make the
>>> instructions reasonably compact, only a limited number of bits are
>>> available in each instruction to specify which registers to use.
>>
>> It turns out that being reasonably compact isn't terribly important --
>> more important is the number of opcodes it takes to get something done,
>> which translates to the number of dispatches.  Have you seen the "direct
>> threading" VM implementation strategy?  In that case the opcode is not
>> an index into a jump table, it's a word that encodes the pointer
>> directly.  So it's a word wide, just for the opcode.  That's what
>> JavaScriptCore does, for example.  The opcode is a word wide, and each
>> operand is a word as well.
>>
>> The design of the wip-rtl VM is to allow 16M registers (24-bit
>> addressing).  However many instructions can just address 2**8 registers
>> (8-bit addressing) or 2**12 registers (12-bit addressing).  We will
>> reserve registers 253 to 255 as temporaries.  If you have so many
>> registers as to need more than that, then you have to shuffle operands
>> down into the temporaries.  That's the plan, anyway.
>
> I'm very concerned about this design, for the same reason that I was
> concerned about NaN-boxing on 32-bit platforms.  Efficient use of memory
> is extremely important on modern architectures, because of the vast (and
> increasing) disparity between cache speed and RAM speed.  If you can fit
> the active set into the cache, that often makes a profound difference in
> the speed of a program.
>
> I agree that with VMs, minimizing the number of dispatches is crucial,
> but beyond a certain point, having more registers is not going to save
> you any dispatches, because they will almost never be used anyway.
> 2^12 registers is _far_ beyond that point.
>
> As I wrote before concerning NaN-boxing, I suspect that the reason these
> memory-bloated designs are so successful in the JavaScript world is that
> they are specifically optimized for use within a modern web browser,
> which is already a memory hog anyway.  Therefore, if the language
> implementation wastes yet more memory it will hardly be noticed.
>
> If I were designing this VM, I'd work hard to allow as many loops as
> possible to run completely in the cache.  That means that three things
> have to fit into the cache together: the VM itself, the user loop code,
> and the user data.  IMO, the sum of these three things should be made as
> small as possible.
>
> I certainly agree that we should have a generous number of registers,
> but I suspect that the sweet spot for a VM is 256, because it enables
> more compact dispatching code in the VM, and yet is more than enough to
> allow a decent register allocator to generate good code.
>
> That's my educated guess anyway.  Feel free to prove me wrong :)
>
>     Regards,
>       Mark