From: Stefan Israelsson Tampe
Newsgroups: gmane.lisp.guile.devel
Subject: vm ops and threading discussion for guile-log
Date: Tue, 24 Apr 2012 22:09:09 +0200
To: guile-devel
Hi,

I hope I am not boring you by continuing to talk about guile-log.

But one of the things brought up by Mark, and rightly so, is that basing the fundamentals of logic programming on a stack, compared to assq lists or similar more scalable functional data structures, makes it complicated to take advantage of multiple cores and threads.

To answer this I have now enabled the possibility of using multiple stacks, blending the two techniques together. The reasons for taking such a complicated approach are threefold.

1. Stack based techniques are faster, and considerably so if we can take advantage of special VM operations or are able to compile the code. I find that the assq based approach saturates, and for the uniprocessor case the speed difference would be around a factor of 10.

2.
Stack based approaches can take advantage of the fact that there is a stack, and implement variables that behave like global variables but also like stack variables, and that can be reconstructed from a stored state. These data structures look non-functional, but they are similar to variables living on the stack; e.g. for reentrancy to work they need to reference functional data structures. Anyway, I find that with them one can formulate many problems in a natural way that a functional approach cannot match, and hence we get an expressive framework that still preserves functional behavior: running stored code that does not reference global variables always returns the same result, all code can be put in tail position, and user space code can yield a return value simply by returning a value from the user space function.

3. We are able to trace the rebuilding and unwinding. This is nice for debugging, but it is also an algorithmic enabler, because you can design functionality that takes advantage of this fact.

There are some drawbacks, though:

1. Things are a little more complicated.

2. There can be some overhead in some situations, although selecting the right mix of the stack based approach and the assq list based one will give good performance in most cases.

-----------------------------------------------------------------------

How the simple multithreading works:

Each thread has a stack of its own, and each created variable has an id indicating which stack it belongs to. If a thread tries to set a variable belonging to another thread, it does so by consing onto an assq list instead of modifying the other stack, mimicking Kanren's strategy.

-----------------------------------------------------------------------

VM OPS.
--------------

Here one can work on different levels.
But I chose to look at just reducing the overhead of function calls out into C land. You might write a direct op code for each such function, but that means a lot of VM ops. I therefore settled on a smaller set of op-codes that use function pointers stored in an array reachable from the VM, and then used op-codes like

(fast-call-0 index)
(fast-call-1 index a1)
(fast-call-2 index a1 a2)

Then, when we load C functions like gp-car from the shared library, I add a stub like

(fast-call-set! gp-car 2 5)
(define-inlinable (gp-car x s)
  (fast-call-2 5 x s))

The first form is a VM op that transfers the gp-car function pointer into the function pointer array, with arity 2, at index 5; then I overwrite gp-car with a suitable inline function.

With this technique I can increase the speed of the stack based logic code by a factor of 2-3. The assq based one gets only marginally faster.

I think I could win a factor of 2 in the number of VM ops if I could design into this framework the possibility of returning several values. E.g. consider destructuring a cons cell into a car and a cdr. Currently this is done like

(let ((s (gp-pair!? x s)))
  (if s
      (let ((x-car (gp-car x s))
            (x-cdr (gp-cdr x s)))
        ...

I would like to be able to write

(let-values (((s x-car x-cdr) (gp-pair-fast!? x s)))
  (if s ...))

and the op-codes would align quite nicely and reduce the op count. There are further improvements that could be made, but then everything becomes special, special VM op-codes would be needed, and that bloat should instead be handled by implementing a JIT engine or native compilation.

My conclusion is that it is possible to implement a tool in Guile that would make it quite easy to improve the speed of VM based Guile code by a factor of 2-5 by using C based elements, without the need to be a VM wizard, and it would be nice to have something along the lines discussed above but with better integration into Guile.
I also redesigned the code, so I am currently at 80 ms (for stack based guile-log) for the Einstein case, which basically tests the speed of destructuring a pair into its elements.

Cheers,
Stefan