From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Mark H Weaver <mhw@netris.org>
Newsgroups: gmane.lisp.guile.devel
Subject: Re: Register VM WIP
Date: Wed, 16 May 2012 09:44:28 -0400
Message-ID: <87sjf09r5v.fsf@netris.org>
References: <871umqr8q0.fsf@pobox.com> <873972zczy.fsf@gnu.org>
	<CADoGzsfSmPmYzvtKtiDkspLVUnjZ+Qm5cUFKtk5ij229jR-xdQ@mail.gmail.com>
	<87bolpmgew.fsf@pobox.com>
	<CA+U71=Psa=j5goZDCXE0cORu8Ly7ApbQH94XCkjNMM3aTrZA0A@mail.gmail.com>
	<871umkbvp3.fsf@netris.org> <87fwb0k35g.fsf@pobox.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: dough.gmane.org 1337175974 14701 80.91.229.3 (16 May 2012 13:46:14 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Wed, 16 May 2012 13:46:14 +0000 (UTC)
Cc: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@gnu.org>, guile-devel@gnu.org
To: Andy Wingo <wingo@pobox.com>
In-Reply-To: <87fwb0k35g.fsf@pobox.com> (Andy Wingo's message of "Wed, 16 May
	2012 09:15:23 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)
Xref: news.gmane.org gmane.lisp.guile.devel:14464
Archived-At: <http://permalink.gmane.org/gmane.lisp.guile.devel/14464>

Hi Andy!

Andy Wingo <wingo@pobox.com> writes:
> On Wed 16 May 2012 06:23, Mark H Weaver <mhw@netris.org> writes:
>
>> It's surprising to me for another reason: in order to make the
>> instructions reasonably compact, only a limited number of bits are
>> available in each instruction to specify which registers to use.
>
> It turns out that being reasonably compact isn't terribly important --
> more important is the number of opcodes it takes to get something done,
> which translates to the number of dispatches.  Have you seen the "direct
> threading" VM implementation strategy?  In that case the opcode is not
> an index into a jump table, it's a word that encodes the pointer
> directly.  So it's a word wide, just for the opcode.  That's what
> JavaScriptCore does, for example.  The opcode is a word wide, and each
> operand is a word as well.
>
> The design of the wip-rtl VM is to allow 16M registers (24-bit
> addressing).  However many instructions can just address 2**8 registers
> (8-bit addressing) or 2**12 registers (12-bit addressing).  We will
> reserve registers 253 to 255 as temporaries.  If you have so many
> registers as to need more than that, then you have to shuffle operands
> down into the temporaries.  That's the plan, anyway.

I'm very concerned about this design, for the same reason that I was
concerned about NaN-boxing on 32-bit platforms.  Efficient use of memory
is extremely important on modern architectures, because of the vast (and
increasing) disparity between cache speed and RAM speed.  If the working
set fits in the cache, that often makes a profound difference in the
speed of a program.

I agree that with VMs, minimizing the number of dispatches is crucial,
but beyond a certain point, having more registers is not going to save
you any dispatches, because they will almost never be used anyway.
2^12 registers is _far_ beyond that point.
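
To put rough numbers on the size difference, here is a minimal sketch in
C.  Both layouts are assumptions made for illustration: `packed_insn`
(an 8-bit opcode plus three 8-bit register operands in one 32-bit word)
and `wide_insn` (one pointer-sized word for the opcode and one for each
of three operands, in the style described above for direct threading)
are hypothetical names, not the actual wip-rtl or JavaScriptCore
encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact encoding: opcode and three register indices,
   one byte each, packed into a single 32-bit word. */
typedef uint32_t packed_insn;

/* Hypothetical word-wide encoding: a pointer-sized opcode followed by
   three pointer-sized operands. */
typedef struct {
    uintptr_t opcode;
    uintptr_t operand[3];
} wide_insn;

/* Pack an opcode and three 8-bit register indices, opcode in the low byte. */
static packed_insn pack(uint8_t op, uint8_t dst, uint8_t a, uint8_t b)
{
    return (uint32_t)op | ((uint32_t)dst << 8)
         | ((uint32_t)a << 16) | ((uint32_t)b << 24);
}

/* Bytes of code needed for n three-operand instructions in each encoding. */
static size_t packed_bytes(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(packed_insn));  /* guard against overflow */
    return n * sizeof(packed_insn);
}

static size_t wide_bytes(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(wide_insn));    /* guard against overflow */
    return n * sizeof(wide_insn);
}
```

Under these assumptions the wide form takes four times the space of the
packed form with 32-bit pointers and eight times with 64-bit pointers,
for the same number of dispatches.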

As I wrote before concerning NaN-boxing, I suspect that the reason these
memory-bloated designs are so successful in the JavaScript world is that
they are specifically optimized for use within a modern web browser,
which is already a memory hog anyway.  Therefore, if the language
implementation wastes yet more memory it will hardly be noticed.

If I were designing this VM, I'd work hard to allow as many loops as
possible to run completely in the cache.  That means that three things
have to fit into the cache together: the VM itself, the user loop code,
and the user data.  IMO, the sum of these three things should be made as
small as possible.
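
The budget described here can be written down as a small check.  This is
only a sketch: the `working_set` struct and `fits_in_cache` function are
hypothetical names, and any sizes passed in would have to come from
measurement rather than from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three things that must share the cache while a loop runs. */
typedef struct {
    size_t vm_bytes;    /* interpreter code the loop actually exercises */
    size_t code_bytes;  /* bytecode of the user loop */
    size_t data_bytes;  /* user data the loop touches */
} working_set;

static size_t working_set_total(working_set ws)
{
    return ws.vm_bytes + ws.code_bytes + ws.data_bytes;
}

/* True if the whole working set fits in a cache of cache_bytes bytes. */
static bool fits_in_cache(working_set ws, size_t cache_bytes)
{
    assert(cache_bytes > 0);
    return working_set_total(ws) <= cache_bytes;
}
```

As a made-up example, with 16 KiB of VM code and 12 KiB of data, a 2 KiB
loop fits in a 32 KiB cache, while the same loop in a bytecode eight
times larger (16 KiB) does not.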

I certainly agree that we should have a generous number of registers,
but I suspect that the sweet spot for a VM is 256: a register index then
fits in a single byte, which enables more compact dispatching code in
the VM, and 256 registers is still more than enough to allow a decent
register allocator to generate good code.
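
For illustration, here is a minimal sketch of what a 256-register design
might look like.  Everything in it is an assumption: the opcode names,
the `INSN` macro, and the 32-bit instruction layout are hypothetical,
and it uses a plain `switch` rather than direct threading to stay short.
The point it shows is that an 8-bit operand can never index outside a
256-entry register file, so decoding is a shift and a mask with no range
check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_HALT, OP_LOADI, OP_ADD, OP_SUB, OP_JNZ };  /* hypothetical opcodes */

/* Opcode in the low byte, then three 8-bit operands. */
#define INSN(op, a, b, c) \
    ((uint32_t)(op) | ((uint32_t)(a) << 8) | \
     ((uint32_t)(b) << 16) | ((uint32_t)(c) << 24))

/* Run code until OP_HALT and return the contents of result_reg. */
static int64_t run(const uint32_t *code, size_t len, uint8_t result_reg)
{
    int64_t reg[256] = {0};  /* every 8-bit index is valid */
    size_t pc = 0;
    for (;;) {
        assert(pc < len);
        uint32_t insn = code[pc++];
        uint8_t a = (insn >> 8) & 0xff;
        uint8_t b = (insn >> 16) & 0xff;
        uint8_t c = (insn >> 24) & 0xff;
        switch (insn & 0xff) {
        case OP_HALT:  return reg[result_reg];
        case OP_LOADI: reg[a] = b; break;               /* 8-bit immediate */
        case OP_ADD:   reg[a] = reg[b] + reg[c]; break;
        case OP_SUB:   reg[a] = reg[b] - reg[c]; break;
        case OP_JNZ:   if (reg[a] != 0) pc = b; break;  /* absolute target */
        default:       return -1;                       /* unknown opcode */
        }
    }
}

/* Sum the integers 10 down to 1 in a seven-instruction loop. */
static int64_t sum_to_ten(void)
{
    const uint32_t code[] = {
        INSN(OP_LOADI, 0, 0, 0),   /* r0 = 0  (accumulator) */
        INSN(OP_LOADI, 1, 10, 0),  /* r1 = 10 (counter) */
        INSN(OP_LOADI, 2, 1, 0),   /* r2 = 1 */
        INSN(OP_ADD, 0, 0, 1),     /* r0 = r0 + r1 */
        INSN(OP_SUB, 1, 1, 2),     /* r1 = r1 - r2 */
        INSN(OP_JNZ, 1, 3, 0),     /* back to index 3 while r1 != 0 */
        INSN(OP_HALT, 0, 0, 0),
    };
    return run(code, sizeof code / sizeof code[0], 0);
}
```

The whole loop occupies 28 bytes of bytecode in this encoding.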

That's my educated guess anyway.  Feel free to prove me wrong :)

    Regards,
      Mark