From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Noah Lavine Newsgroups: gmane.lisp.guile.devel Subject: Re: The Road to 2.2 Date: Sat, 18 May 2013 09:44:08 -0400 Message-ID: References: <878v3dgtv0.fsf@pobox.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=bcaec52bec2371f40a04dcfe4cbe X-Trace: ger.gmane.org 1368884703 1483 80.91.229.3 (18 May 2013 13:45:03 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sat, 18 May 2013 13:45:03 +0000 (UTC) Cc: guile-devel To: Andy Wingo Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Sat May 18 15:45:04 2013 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1UdhRX-00045R-VP for guile-devel@m.gmane.org; Sat, 18 May 2013 15:45:04 +0200 Original-Received: from localhost ([::1]:34506 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UdhRX-0004fV-EO for guile-devel@m.gmane.org; Sat, 18 May 2013 09:45:03 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:42711) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UdhRD-0004eN-Qq for guile-devel@gnu.org; Sat, 18 May 2013 09:44:58 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UdhR1-0006WP-65 for guile-devel@gnu.org; Sat, 18 May 2013 09:44:43 -0400 Original-Received: from mail-pb0-x230.google.com ([2607:f8b0:400e:c01::230]:51193) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UdhR0-0006VV-Jc for guile-devel@gnu.org; Sat, 18 May 2013 09:44:31 -0400 Original-Received: by mail-pb0-f48.google.com with SMTP id md12so1414698pbc.35 for ; Sat, 18 May 2013 06:44:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=OI+JnSx2v10/LLaKGa9zqVITI1sd8M79+hfUCtpw/Y0=; b=iBmuNocJJ0JZ3GbpuT60FMrR8ZGjt0JCy0LmCw3qkieAVxpJRPYgcC/UmcudpdPRqW xTrSvqEPo7bmEzY6KtDTxtg3THoyCNyMHAGUUXyf4bLPisvRX6ZHiRcytEd76xZ0MN65 PyWZzR6MScasTZUGwxiPwi6nZnufND3XBC+gQ2ky4rFDlsW2K1Bi+RV7wJhOVqAwYaTp r/X7FqEcUd6gHyBRD9yVvZBZj3eTEcEmrIS+9dw8Kre6Trwhr8Kgx0FsnWR52x9kPIys lLp1H6wYbU8kZprtd1NONRL1ccpBfN6Fcn//2szkcyEJqz+m8fyd+ToH69yx/Ca1vv2p LuvA== X-Received: by 10.66.27.68 with SMTP id r4mr54559671pag.151.1368884669180; Sat, 18 May 2013 06:44:29 -0700 (PDT) Original-Received: by 10.68.139.98 with HTTP; Sat, 18 May 2013 06:44:08 -0700 (PDT) In-Reply-To: <878v3dgtv0.fsf@pobox.com> X-Google-Sender-Auth: ZFWfUQxxDdsM3yk4iO11_iDo9kM X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2607:f8b0:400e:c01::230 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:16373 Archived-At: --bcaec52bec2371f40a04dcfe4cbe Content-Type: text/plain; charset=ISO-8859-1 This is very exciting! I have a very small question, based on something I think you said earlier - since the container will be ELF, will we call our files .so now? Noah On Fri, May 17, 2013 at 5:49 PM, Andy Wingo wrote: > Friends! Schemers! Gentle Guilefolk! The time has come to begin in > earnest on the road to Guile 2.2! > > I know all of you are wondering, "did that paragraph really need four > exclamation marks?" Well the answer is yes, yes it did, and the rest of > this mail lays out the reasons. > > So, what does 2.2 mean, why is it appropriate now? 2.2 means many > things on many levels, but on its most basic level, it will result in a > new series of Guile deployments. This series will be installed in > parallel with Guile 2.0 [0]. It shouldn't disrupt people that develop > or distribute C extensions to Guile or programs written in Guile. True, > the binary named "guile" on your system might be 2.0 or 2.2, depending > on what you have installed, though one can get around that with passing > --program-suffix to ./configure. In total, the disruptive impact of a > new series is low. > > [0] > http://www.gnu.org/software/guile/manual/html_node/Parallel-Installations.html > > On the other hand, creating a new development series does have a cost in > splitting our development effort away from the stable branch, and so we > have to justify that. This is the second part of the question, "why > now?" The answer is that we have important improvements to make to the > Guile implementation that can't be made in a strictly binary-compatible > way. We think these things are so exciting as to be worth the > inconvenience of a new stable series. We might still make a few more > 2.0 maintenance releases, but when I describe what we have on tap for > 2.2 you will see why we're excited to move to the next phase. > > RTL VM > ====== > > Guile 2.0 added a compiler and a virtual machine to what was previously > a pure-interpreter implementation of Scheme. I know we all are pretty > stoked about that, but some time has passed and it's time to look at > what we have and see how we can do better. > > The Guile 2.0 VM is a stack machine. That means that its instructions > usually take their values from the stack, and produce values (if > appropriate) by pushing values onto the stack. > > The problem with stack machines is that they penalize named values. If > I realize that a computation is happening twice and I factor it out to a > variable, that means in practice that I allocate a stack frame slot to > the value. So far so good. However, to use the value, I have to emit > an instruction to fetch the value for use by some other instruction; and > to store it, I likewise have to have another instruction to do that. > > For example, in a stack machine, compiling "z = x + y" might mean: > > (local-ref 0) ; x > (local-ref 1) ; y > (add) > (local-set! 2) ; z > > The cost of interpreting a bytecode is largely dispatch cost, which is > linear in the number of instructions executed. So factoring out a > computation to a local variable can sometimes actually cause code to run > slower, depending on the relative cost of the computation versus > accessing its operands. Pushing and popping values can also be > expensive as we are updating a stack pointer all the time, possibly > checking for overflow as well. > > The solution to this problem is to use a "register machine". I use > scare quotes because in fact this is a virtual machine, so unlike a CPU, > the number of "registers" is unlimited, and in fact they are just stack > slots accessed by index. > > So in a register machine, "z = x + y" might be: > > (add 2 0 1) ; register 2 = register 0 + register 1; > > Of course the code for a register VM has to be bigger: instead of _byte_ > code that doesn't need to encode its operands because they are expected > on the stack, register VM code has to encode the "register names", so it > ends up being better to implement it as _word_ code. But this cost is > much less significant than the dispatch cost. > > Anyway, that's the back-story. For Guile 2.2, I have implemented a new > register-based virtual machine. For no good reason I called it the > "RTL" VM, for "register transfer language". RTL is a name for a > particular kind of intermediate representation in compilers, but perhaps > it isn't the most appropriate denomination for what we're doing. In any > case the name seems to have stuck. Perhaps we need a marketing guru > here to find some other compelling name :) > > Given that we're changing the bytecode... > ========================================= > > So RTL is a new bytecode, a new VM, and will ultimately need a new > compiler. I'll talk a bit more about the compiler in a minute, but I > want to mention another aspect of RTL. > > Besides being faster than the Guile 2.0 VM, RTL opens up other > interesting possibilities. One of the most exciting is the ability to > statically allocate all kinds of data. In Guile 2.0, when a .go file > loads from disk, it allocates all of the constants that it needs on the > heap. But that's sub-optimal in many ways. For example, a pair of > immediates like '(1 . 2) doesn't actually need to be allocated at all -- > it can just be part of the loaded image. Likewise a pair with > non-immediates like ("foo" . "bar") can also be allocated in the image, > but it needs fixing up at runtime so that its fields point to the "foo" > and "bar" values. In this particular case the "foo" and "bar" would be > values also in the image. > > What I mean to say is that we can, at compile time, allocate and lay out > the memory for the constants that a program will need, compute the > minimal set of initializations that those constants will need, and make > the right links from the compiled code to the constants. This is > exactly like what the C compiler, linker, and dynamic loader do -- so we > take pointers from them and do exactly the same thing. The RTL branch > writes out compiled data to ELF files. When a .go file is loaded, it > runs a thunk to do the run-time initializations, then returns the "main" > thunk for the .go (which the caller can run or not). > > Note that we use ELF for all platforms. Since we don't rely on the > platform's linker or dynamic loader, there's no sense in using the > format of each individual platform. That said, the files that Guile > produces are readable with readelf, and it should be possible to operate > on them with standard ELF tools. > > Guile's linker and loader are careful to separate read-only data like > bytecode and writable data like those pairs that need runtime > initialization. In this way we can maximize sharing between processes. > Also, Guile puts a bunch of data that's not needed in normal execution > at the end of the file, in the read-only section -- like docstrings, for > example. If a program never asks for procedure-documentation of a > procedure, those docstrings will never be paged in by the kernel; but if > a program does do a lot of documentation grovelling, they are readily > accessible. > > Separating the different metadata into different sections also makes it > possible to strip metadata. For example when deploying to systems with > not much disk space, you might strip docstrings or line number > information. You can use standard ELF tools to do this, or do it using > Guile's ELF tools. > > Finally, using a well-structured format like ELF brings in the > possibility of linking together a number of modules in one file, and > possibly even an entire application. We don't have this working yet, > and it's not in the immediate plans, but it is a great possibility to > keep in mind. > > What about the compiler? > ======================== > > So we have a new VM, assembler, disassembler, and all the things! It's > pretty close to finished: it's just missing the part that lets a > debugger know about line numbers and local variables. But how do we > actually produce RTL assembly from Scheme? > > Well, there we don't have a finished story. Noah Lavine made a branch > to compile tree-il, our high-level intermediate language, to a > "continuation-passing-style" (CPS) form. This is a useful > transformation for many reasons, but among them it is convenient for > giving a "name" to all values -- a property that a register VM prizes. > > His CPS compiler isn't done yet, though when I last looked at it, it was > looking great. So the path goes Scheme -> Tree-IL (something we already > have), then -> CPS (something Noah did), then -> RTL Assembly (again, > Noah), then -> bytecode (the piece I did). It doesn't yet work with all > of Guile but it's getting there. > > How to get there? > ================= > > However, I think we've done all we can in branches. I think we should > bless this RTL experiment as the way to do Guile 2.2. (Thoughts or > objections welcome.) To that end, I think we need to start merging > wip-rtl into master. > > What I propose is that, if we agree, I start merging in trivial stuff. > When I get to something interesting that's not just a refactoring of > existing things, I'll submit that patch to the list. We have a few days > to look at it, we go through some review, and then we figure out how to > move forward -- usually merging some version of that patch. Then we > repeat. > > Once the RTL branch is all merged in, we can start doing the same with > Noah's wip-rtl-cps branch. > > Then eventually some glorious day comes and we start using the CPS/RTL > toolchain, everything is working great and fast, and we start deleting > the old code. > > What do you all think? > > I love it when a plan comes together! > > Cheers, > > Andy > -- > http://wingolog.org/ > > --bcaec52bec2371f40a04dcfe4cbe Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
This is very exciting! I have a very small question, = based on something I think you said earlier - since the container will be E= LF, will we call our files .so now?

Noah


On Fri, May 17, 2013 at 5:49 PM, Andy Wi= ngo <wingo@pobox.com> wrote:
Friends! =A0Schemers! =A0Gentle Guilefolk! =A0The time has come to begin in=
earnest on the road to Guile 2.2!

I know all of you are wondering, "did that paragraph really need four<= br> exclamation marks?" =A0Well the answer is yes, yes it did, and the res= t of
this mail lays out the reasons.

So, what does 2.2 mean, why is it appropriate now? =A02.2 means many
things on many levels, but on its most basic level, it will result in a
new series of Guile deployments. =A0This series will be installed in
parallel with Guile 2.0 [0]. =A0It shouldn't disrupt people that develo= p
or distribute C extensions to Guile or programs written in Guile. =A0True,<= br> the binary named "guile" on your system might be 2.0 or 2.2, depe= nding
on what you have installed, though one can get around that with passing
--program-suffix to ./configure. =A0In total, the disruptive impact of a new series is low.

[0] http://www.gnu.org/software/guile/man= ual/html_node/Parallel-Installations.html

On the other hand, creating a new development series does have a cost in splitting our development effort away from the stable branch, and so we
have to justify that. =A0This is the second part of the question, "why=
now?" =A0The answer is that we have important improvements to make to = the
Guile implementation that can't be made in a strictly binary-compatible=
way. =A0We think these things are so exciting as to be worth the
inconvenience of a new stable series. =A0We might still make a few more
2.0 maintenance releases, but when I describe what we have on tap for
2.2 you will see why we're excited to move to the next phase.

RTL VM
=3D=3D=3D=3D=3D=3D

Guile 2.0 added a compiler and a virtual machine to what was previously
a pure-interpreter implementation of Scheme. =A0I know we all are pretty stoked about that, but some time has passed and it's time to look at what we have and see how we can do better.

The Guile 2.0 VM is a stack machine. =A0That means that its instructions usually take their values from the stack, and produce values (if
appropriate) by pushing values onto the stack.

The problem with stack machines is that they penalize named values. =A0If I realize that a computation is happening twice and I factor it out to a variable, that means in practice that I allocate a stack frame slot to
the value. =A0So far so good. =A0However, to use the value, I have to emit<= br> an instruction to fetch the value for use by some other instruction; and to store it, I likewise have to have another instruction to do that.

For example, in a stack machine, compiling "z =3D x + y" might me= an:

=A0 (local-ref 0) ; x
=A0 (local-ref 1) ; y
=A0 (add)
=A0 (local-set! 2) ; z

The cost of interpreting a bytecode is largely dispatch cost, which is
linear in the number of instructions executed. =A0So factoring out a
computation to a local variable can sometimes actually cause code to run slower, depending on the relative cost of the computation versus
accessing its operands. =A0Pushing and popping values can also be
expensive as we are updating a stack pointer all the time, possibly
checking for overflow as well.

The solution to this problem is to use a "register machine". =A0I= use
scare quotes because in fact this is a virtual machine, so unlike a CPU, the number of "registers" is unlimited, and in fact they are just= stack
slots accessed by index.

So in a register machine, "z =3D x + y" might be:

=A0 (add 2 0 1) ; register 2 =3D register 0 + register 1;

Of course the code for a register VM has to be bigger: instead of _byte_ code that doesn't need to encode its operands because they are expected=
on the stack, register VM code has to encode the "register names"= , so it
ends up being better to implement it as _word_ code. =A0But this cost is much less significant than the dispatch cost.

Anyway, that's the back-story. =A0For Guile 2.2, I have implemented a n= ew
register-based virtual machine. =A0For no good reason I called it the
"RTL" VM, for "register transfer language". =A0RTL is a= name for a
particular kind of intermediate representation in compilers, but perhaps it isn't the most appropriate denomination for what we're doing. = =A0In any
case the name seems to have stuck. =A0Perhaps we need a marketing guru
here to find some other compelling name :)

Given that we're changing the bytecode...
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D

So RTL is a new bytecode, a new VM, and will ultimately need a new
compiler. =A0I'll talk a bit more about the compiler in a minute, but I=
want to mention another aspect of RTL.

Besides being faster than the Guile 2.0 VM, RTL opens up other
interesting possibilities. =A0One of the most exciting is the ability to statically allocate all kinds of data. =A0In Guile 2.0, when a .go file
loads from disk, it allocates all of the constants that it needs on the
heap. =A0But that's sub-optimal in many ways. =A0For example, a pair of=
immediates like '(1 . 2) doesn't actually need to be allocated at a= ll --
it can just be part of the loaded image. =A0Likewise a pair with
non-immediates like ("foo" . "bar") can also be allocat= ed in the image,
but it needs fixing up at runtime so that its fields point to the "foo= "
and "bar" values. =A0In this particular case the "foo" = and "bar" would be
values also in the image.

What I mean to say is that we can, at compile time, allocate and lay out the memory for the constants that a program will need, compute the
minimal set of initializations that those constants will need, and make
the right links from the compiled code to the constants. =A0This is
exactly like what the C compiler, linker, and dynamic loader do -- so we take pointers from them and do exactly the same thing. =A0The RTL branch writes out compiled data to ELF files. =A0When a .go file is loaded, it
runs a thunk to do the run-time initializations, then returns the "mai= n"
thunk for the .go (which the caller can run or not).

Note that we use ELF for all platforms. =A0Since we don't rely on the platform's linker or dynamic loader, there's no sense in using the<= br> format of each individual platform. =A0That said, the files that Guile
produces are readable with readelf, and it should be possible to operate on them with standard ELF tools.

Guile's linker and loader are careful to separate read-only data like bytecode and writable data like those pairs that need runtime
initialization. =A0In this way we can maximize sharing between processes. Also, Guile puts a bunch of data that's not needed in normal execution<= br> at the end of the file, in the read-only section -- like docstrings, for example. =A0If a program never asks for procedure-documentation of a
procedure, those docstrings will never be paged in by the kernel; but if a program does do a lot of documentation grovelling, they are readily
accessible.

Separating the different metadata into different sections also makes it
possible to strip metadata. =A0For example when deploying to systems with not much disk space, you might strip docstrings or line number
information. =A0You can use standard ELF tools to do this, or do it using Guile's ELF tools.

Finally, using a well-structured format like ELF brings in the
possibility of linking together a number of modules in one file, and
possibly even an entire application. =A0We don't have this working yet,=
and it's not in the immediate plans, but it is a great possibility to keep in mind.

What about the compiler?
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
So we have a new VM, assembler, disassembler, and all the things! =A0It'= ;s
pretty close to finished: it's just missing the part that lets a
debugger know about line numbers and local variables. =A0But how do we
actually produce RTL assembly from Scheme?

Well, there we don't have a finished story. =A0Noah Lavine made a branc= h
to compile tree-il, our high-level intermediate language, to a
"continuation-passing-style" (CPS) form. =A0This is a useful
transformation for many reasons, but among them it is convenient for
giving a "name" to all values -- a property that a register VM pr= izes.

His CPS compiler isn't done yet, though when I last looked at it, it wa= s
looking great. =A0So the path goes Scheme -> Tree-IL (something we alrea= dy
have), then -> CPS (something Noah did), then -> RTL Assembly (again,=
Noah), then -> bytecode (the piece I did). =A0It doesn't yet work wi= th all
of Guile but it's getting there.

How to get there?
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D

However, I think we've done all we can in branches. =A0I think we shoul= d
bless this RTL experiment as the way to do Guile 2.2. =A0(Thoughts or
objections welcome.) =A0To that end, I think we need to start merging
wip-rtl into master.

What I propose is that, if we agree, I start merging in trivial stuff.
When I get to something interesting that's not just a refactoring of existing things, I'll submit that patch to the list. =A0We have a few d= ays
to look at it, we go through some review, and then we figure out how to
move forward -- usually merging some version of that patch. =A0Then we
repeat.

Once the RTL branch is all merged in, we can start doing the same with
Noah's wip-rtl-cps branch.

Then eventually some glorious day comes and we start using the CPS/RTL
toolchain, everything is working great and fast, and we start deleting
the old code.

What do you all think?

I love it when a plan comes together!

Cheers,

Andy
--
http://wingolog.org/=


--bcaec52bec2371f40a04dcfe4cbe--