From: Stefan Israelsson Tampe
Newsgroups: gmane.lisp.guile.devel
Subject: Re: vhash speed thread safeness
Date: Tue, 29 Oct 2013 15:21:34 +0100
To: Ludovic Courtès
Cc: guile-devel
In-Reply-To: <87zjpsp86c.fsf@gnu.org>

On Tue, Oct 29, 2013 at 1:34 PM, Ludovic Courtès wrote:

> Hi, Stefan,
>
> Stefan Israelsson Tampe skribis:
>
> > I did some tests with a C-based vhash implementation; it's possible to
> > increase the speed by 30x compared to the current vlist implementation
> > in Guile. It would be possible to get this speed today if one
> > implemented vhash-assoc as a VM op. Anyway, using the source as a
> > standard lib will get you about a 10x speed increase.
>
> As we discussed, I don't really like the idea of implementing that much
> in C.
Me neither, but it is good to see that in cache-friendly cases we can
improve the situation 30x; it gives us a goal to strive for. Also, the main
intention is to use vhashes for guile-log. For this we can note:

1. With this I managed to make an indexer that can find any ordered set of
matches for a 10000-cons pattern (x, y) in about 0.5 µs, e.g. matches for
the forms (x y), (_ y), (x _), (_ _). The fun part is that the indexer
indexes any list of lists of lists, including ones where some elements are
variables. This is cool because if you want to solve first-order logic
problems you will sometimes have many hundreds of matching patterns
representing the compiled rules, and hence you get a speedup when searching
for rule matches.

2. A thread-safe version of vhashes could be used for Kanren or guile-log,
to improve scalability. In order to use this sanely we need to be able to
truncate the vhash, else the lookup complexity will not be that great for
many applications. A truncation means that one needs to check whether the
state has been stored, and if it has, do a no-op.

> Also, I'd really like the optimization effort to be driven by actual
> profiling data, such as C-level profiles of repeated calls to the
> current 'vhash-assoc'.  Have you tried that?

No. But a lot can be gained by unpacking the vectors and using pure C
vector lookups. I think we will get that in the second or third version of
the native compiler; at least it's too much to ask to get it in the first
one, no? Then there is the overhead of VM jumps as usual, which is probably
the main contributor.

> > Another pressing need is that vhashes are not thread safe,
>
> Section 2.8 of Bagwell's paper proposes a simple solution.  All that is
> missing AFAICS is an atomic test-and-set VM op to implement it (which
> may also be useful in other places.)
>
> What do you think of this approach?

For vlists it's probably a good idea; I don't know if it's enough for
vhashes though.
Maybe you need a mutex. But lock overhead will be significant, and I
suspect that it is often faster to start afresh in all threads. But until
we get a well-optimized native compiler, lock overhead is not that
pressing.

BTW, I have a good enough solution implemented now that I will use for
parallelizing logic programs, which is what I will concentrate on while
getting the ISO Prolog into shape.

> Thanks,
> Ludo'.

/Stefan