From: Stefan Israelsson Tampe
Newsgroups: gmane.lisp.guile.devel
Subject: Re: implementation idea for infinite cons lists aka scon(e)s lists.
Date: Sat, 13 Sep 2014 13:13:09 +0200
To: Ian Grant
Cc: guile-devel
Xref: news.gmane.org gmane.lisp.guile.devel:17443

Hi,

The idea is to modify the GC so that, while tracing, it can tell for each
cons cell of the data structure whether the cell is being kept alive by a
special reference, and whether it has also been referenced from outside the
special references. Then, in the sweep phase, the GC can decide to cut the
list by setting the cdr to null at, say, 1000 conses from the head. This cut
is only made if nothing references the list outside the special references.
The rest of the tail is then garbage collected normally and reclaimed.

If, however, a function wants to analyze part of the list, it just holds a
reference to the head; the GC will then not modify any part of the rest of
the list, and the algorithm can safely do its work. It is in the GC's sweep
phase that the list is cut, with a set-cdr!.

/Stefan

On Fri, Sep 12, 2014 at 10:08 PM, Ian Grant wrote:

> > #|
> > Basically a stream is
> > [x,y,z,...]
> >
> > But we want to gc the tail of the stream if not reachable. How to do
> > this?
>
> I don't understand. The tail is infinitely long. When do you want to GC
> it? When your infinite memory is 50% full, or 75% full :-)
>
> I think you probably have a good idea, but it's just not at all clear from
> these two messages.
>
> Do you know about co-induction and co-data? An ordinary (proper) list is
> an example of an inductive structure: you have two constructors, a nullary
> one '() which is like a constant or a constant function, it takes no
> arguments and makes a list out of nothing. And you have a binary
> constructor, cons, which takes a head and a tail as arguments and makes a
> new list. And then you can use these a bit like an induction proof in
> mathematics: '() is the base case, and cons is the induction step which
> takes you from a list to a new, longer list. This is a concrete datatype:
> the elements it is made of are all represented in memory.
>
> The dual of this idea is a co-datatype like a stream, where you don't have
> the concrete data structures anymore, you have what is called an _abstract
> datatype_ which is a datatype that has no actual representation in the
> machine: so you don't have the constructors '() and cons anymore, you just
> have a single deconstructor, snoc, which, when it is applied to a stream,
> maybe returns an element and a new stream, which is the tail, otherwise it
> just returns something like #f which says "it ended!" In languages like
> Scheme and Standard ML which do eager evaluation, streams are modelled
> using either references (i.e. mutable cons cells) or eta-expansions
> (thunks, (lambda () ...) with a 'unit' argument to delay evaluation), but
> in lazy languages like Haskell (and untyped lambda calculus under normal
> order evaluation) you don't need any tricks, you can just write the co-data
> types as ordinary lambda expressions, and the 'call by name' semantics mean
> that these 'infinite tails' only get expanded (i.e. represented in the
> memory) when they are 'observed' by applying the deconstructor. So like in
> real life, the only garbage you have to deal with is the stuff that results
> from what you make: the whole infinite substructure is all 'enfolded'
> underneath and takes no space at all. It's just your observing it that
> makes it concrete.
>
> There is a huge body of theory and an awful lot of scribbling has been done
> about this. There are mathematical texts where it's called 'category
> theory' or 'non well-founded set theory'. And it comes up in order theory as
> 'fixedpoint calculus' and the theory of Galois connections. And in other
> areas of computer science it's called bisimulation. To me it all seems to
> be the same thing: "consing up one list while cdr'ing down another", but
> there's probably no research mileage in saying things like that.
>
> Ian
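
Editorial sketch of the sweep-time cut described in Stefan's reply. This is a
toy model in Python, not Guile's collector: the names `Cons`, `mark`,
`collect`, `build`, `roots`, `special_heads` and `limit` are all hypothetical,
only cdr pointers are traced, and "referenced outside the special references"
is read here as "some cell of the list is reachable from an ordinary root".
Under those assumptions the idea might look roughly like this:

```python
class Cons:
    """A mutable cons cell in a toy heap (illustrative only)."""

    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr


def mark(starts):
    """Return the set of cells reachable from the given heads via cdr."""
    seen = set()
    for cell in starts:
        while cell is not None and cell not in seen:
            seen.add(cell)
            cell = cell.cdr
    return seen


def collect(heap, roots, special_heads, limit=1000):
    """Run one toy mark-and-sweep cycle and return the surviving cells.

    roots         -- ordinary references held by running code
    special_heads -- list heads held only through "special" references
    limit         -- number of cells kept when a list is cut (>= 1)
    """
    ordinary = mark(roots)

    for head in special_heads:
        # If any cell of this list is visible from an ordinary root,
        # some function is inspecting it: leave the list untouched.
        if mark([head]) & ordinary:
            continue
        # Otherwise walk to cell number `limit` and cut the list there.
        cell = head
        for _ in range(limit - 1):
            if cell is None:
                break
            cell = cell.cdr
        if cell is not None:
            cell.cdr = None  # plays the role of set-cdr! in the sweep phase

    # Sweep: keep whatever is still reachable after the cut.
    live = ordinary | mark(special_heads)
    return [cell for cell in heap if cell in live]


def build(n):
    """Build the list (0 1 ... n-1); return (head, all_cells)."""
    cells = [Cons(i) for i in range(n)]
    for a, b in zip(cells, cells[1:]):
        a.cdr = b
    return cells[0], cells


head, heap = build(10)
heap = collect(heap, roots=[], special_heads=[head], limit=3)
print([cell.car for cell in heap])  # [0, 1, 2]
```

Passing `roots=[head]` instead leaves all ten cells in place, which models the
case where a function holds the head while it analyzes the list.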
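
Editorial sketch of the thunk-based stream that Ian's quoted text describes
for eager languages. It is written in Python rather than Scheme, and the
helper names `integers_from`, `from_list` and `take` are hypothetical; `snoc`
follows the name used in the quoted message. A stream is modelled as a
zero-argument function, and observing it yields either an (element, tail)
pair or `None`, standing in for #f:

```python
def integers_from(n):
    """An infinite stream n, n+1, ...; nothing is built until observed."""
    return lambda: (n, integers_from(n + 1))


def from_list(items):
    """A finite stream over a Python list; ends by yielding None."""
    def thunk():
        if not items:
            return None
        return items[0], from_list(items[1:])
    return thunk


def snoc(stream):
    """Deconstructor: return (element, tail), or None if the stream ended."""
    return stream()


def take(stream, k):
    """Observe at most k elements and return them as a list."""
    out = []
    while k > 0:
        step = snoc(stream)
        if step is None:
            break
        element, stream = step  # drop our reference to the old position
        out.append(element)
        k -= 1
    return out


print(take(integers_from(0), 5))            # [0, 1, 2, 3, 4]
print(take(from_list(["x", "y", "z"]), 5))  # ['x', 'y', 'z']
```

Because `take` rebinds `stream` to the tail at every step and nothing is
memoized, the thunks for earlier positions become unreferenced as the
traversal moves on. Only the observed part of the stream ever exists in
memory.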