From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Ian Grant Newsgroups: gmane.lisp.guile.devel Subject: Re: implementation idea for infinite cons lists aka scon(e)s lists. Date: Mon, 15 Sep 2014 14:38:01 -0400 Message-ID: References: NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=001a11c3504e3cae8a05031eefcd X-Trace: ger.gmane.org 1410806306 20790 80.91.229.3 (15 Sep 2014 18:38:26 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 15 Sep 2014 18:38:26 +0000 (UTC) Cc: guile-devel To: Stefan Israelsson Tampe Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Mon Sep 15 20:38:21 2014 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1XTbAD-0005Ck-7K for guile-devel@m.gmane.org; Mon, 15 Sep 2014 20:38:13 +0200 Original-Received: from localhost ([::1]:33488 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XTbAC-0003Yt-Ht for guile-devel@m.gmane.org; Mon, 15 Sep 2014 14:38:12 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:49102) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XTbA7-0003Ya-OQ for guile-devel@gnu.org; Mon, 15 Sep 2014 14:38:09 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XTbA5-0002yZ-M8 for guile-devel@gnu.org; Mon, 15 Sep 2014 14:38:07 -0400 Original-Received: from mail-wi0-x230.google.com ([2a00:1450:400c:c05::230]:34639) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XTbA5-0002yH-BL for guile-devel@gnu.org; Mon, 15 Sep 2014 14:38:05 -0400 Original-Received: by mail-wi0-f176.google.com with SMTP id ex7so4847189wid.3 for ; Mon, 15 Sep 2014 11:38:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=jvr8ejwBnNEHZk1wKyuD36erSsloIR2x+2RzV7sY8eE=; b=otwMjcAWoii4/NFd3jES/nn8rLHWvmob9kuVSz8NjwwHjFr1Zjvq5NBbZEtMk4Z3dm KJ/JOK9u/5AEL8yIVeIWF7XSx+ljHzgOnZbcokAsahgZufEbaq2ie5JcB9YVeev/2lpf +Su++5Ph1F9531aicGF+ks2S+FgqMEE0DIx1i/yrBzfYL3oc49mwnfDyhD+asUX5ZYg3 wonykuS9CLcUwfdAzcanbmZ2Yt4CY4kWbO07ZIPQ1vn8MEJKgUuQQDZ8bCozPTKV/Ui4 zg7KKkMBDsCVmt1vSZkULqSgN9p04VksmYPx5tLm6LgnkMjEEU3w/XcOhOXMZERG2kTA SSxw== X-Received: by 10.180.184.40 with SMTP id er8mr26575486wic.31.1410806281185; Mon, 15 Sep 2014 11:38:01 -0700 (PDT) Original-Received: by 10.194.81.194 with HTTP; Mon, 15 Sep 2014 11:38:01 -0700 (PDT) In-Reply-To: X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2a00:1450:400c:c05::230 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:17452 Archived-At: --001a11c3504e3cae8a05031eefcd Content-Type: text/plain; charset=UTF-8 On Sat, Sep 13, 2014 at 7:13 AM, Stefan Israelsson Tampe < stefan.itampe@gmail.com> wrote: > it is saved from gc by havinga special reference and that we also can see > if the cons cell have been referenced outside of the > special references. Then in the sweep phase one can decide to modify the > cons list to set the cdr to null in say 1000 cons from the > head. But this modding is only done if there is now referncing of it > outside the special references. This means that the rest of the tail > will gc normally and be reclaimed. If however a function wants to analyze > a part of the list it just references the head and by that the gc > will not mod any part of the rest of the list and the algorithm can safely > do it's work. It's in the gc's sweep phase that the list is cut with a > setcdr! > > I think I am beginning to get the picture. This sounds a little like something I started trying to do with the CAML gc. I called it "recycling garbage" or "resurrection" because the idea is to have a process whereby one can reclaim "weak" references which were not really dead. The theological difficulties with this idea might be considerable, but I thought this would be good thing to do because some structures are expensive to allocate: double-precision floats and GMP mpzs, for example. And in evaluating arithmetic expressions the CAML runtime repeatedly throws intermediate values away, and immediately does a far malloc call to allocate another. I thought if we could keep an array of weak references and recycle some or all of them between the mark and sweep phases, then these could make a 'pool' of pre-allocated objects that could reclaimed just by pointing at them. Does this fit with your idea? Could we combine these as two reasons to do this change? I implemented it, but you may not be too interested in the horrible details of the ancient CAML gc. https://github.com/IanANGrant/red-october/commit/1e76f6746eab2f0afa7dbbcd78d3013029e8187b On a related theme, I have a suggestion for Guile's memory allocation strategy, which is to document a method of preallocating a page-sized block of cons cells, for example, Then when one has a fragment of machine code that is constructing s-exp representations of something or other, that code can do its own memory management just by switching pointers. I think this is a sort of simple-minded slab allocator maybe? I had a brief look at the way the BDW collector does things, and it seemed to me that this could be done right now, just by directly calling GC_MALLOC from the machine code, and SCM references which were pointers to the inside of the GC_ALLOCed block would be enough to keep it from being swept up. And it looks as if whatever memory was optimistically allocated this way, and unused at the end of the construction process, could be freed. Provided it was a contiguous region, which it usually will be, the BDW collector would split the block, and free the unused part. So I don't think there's anything we would need to do to actually implement this. The only thing is that if it is not enshrined in the Guile API spec, then it is vulnerable to being 'patched out' without warning one day. So maybe it only needs a comment in the code somewhere, and a paragraph in the manual. Since you've also spent some time down there in the garbage chute, maybe you can confirm/deny this? Ian --001a11c3504e3cae8a05031eefcd Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
On Sat, Sep 13, 2014 at 7:13 AM, Stefan Israelsson Tampe <= span dir=3D"ltr"><stefan.itampe@gmail.com> wrote:
it is saved from gc by havinga special reference and t= hat we also can see if the cons cell have been referenced outside of the=C2= =A0
special references. Then in the sweep phase one can decide to= modify the cons list to set the cdr to null in say 1000 cons from the
head. But this modding is only done if there is now referncing of it = outside the special references. This means that the rest of the tail
<= div>will gc normally and be reclaimed. If however a function wants to analy= ze a part of the list it just references the head and by that the gc
<= div>will not mod any part of the rest of the list and the algorithm can saf= ely do it's work. It's in the gc's sweep phase that the list is= cut with a setcdr!

I think I am beginning= to get the picture. This sounds a little like something I started trying t= o do with the CAML gc. I called it "recycling garbage" or "r= esurrection" because the idea is to have a process whereby one can rec= laim "weak" references which were not really dead. The theologica= l difficulties with this idea might be considerable, but I thought this wou= ld be good thing to do because some structures are expensive to allocate: d= ouble-precision floats and GMP mpzs, for example. And in evaluating arithme= tic expressions the=C2=A0 CAML runtime repeatedly throws intermediate value= s away, and immediately does a far malloc call to allocate another. I thoug= ht if we could keep an array of weak references and recycle some or all of = them between the mark and sweep phases, then these could make a 'pool&#= 39; of pre-allocated objects that could reclaimed just by pointing at them.=

Does this fit with your idea? Could we combine these as = two reasons to do this change?

I implemented i= t, but you may not be too interested in the horrible details of the ancient= CAML gc.=C2=A0 https://github.com/IanANGrant/re= d-october/commit/1e76f6746eab2f0afa7dbbcd78d3013029e8187b

=
On a related theme, I have a suggestion for Guile's memory allocat= ion strategy, which is to document a method of preallocating a page-sized b= lock of cons cells, for example, Then when one has a fragment of machine co= de that is constructing s-exp representations of something or other, that c= ode can do its own memory management just by switching pointers. I think th= is is a sort of simple-minded slab allocator maybe? I had a brief look at t= he way the BDW collector does things, and it seemed to me that this could b= e done right now, just by directly calling GC_MALLOC from the machine code,= and SCM references which were pointers to the inside of the GC_ALLOCed blo= ck would be enough to keep it from being swept up.=C2=A0 And it looks as if= whatever memory was optimistically allocated this way, and unused at the e= nd of the construction process, could be freed. Provided it was a contiguou= s region, which it usually will be, the BDW collector would split the block= , and free the unused part.

So I don't think there's anythin= g we would need to do to actually implement this. The only thing is that if= it is not enshrined in the Guile API spec, then it is vulnerable to being= 'patched out' without warning one day. So maybe it only needs a co= mment in the code somewhere, and a paragraph in the manual.

Since you've also=C2=A0 spent some time down there in the garbage ch= ute, maybe you can confirm/deny this?

Ian

--001a11c3504e3cae8a05031eefcd--