implementation idea for infinite cons lists aka scon(e)s lists.

unofficial mirror of guile-devel@gnu.org 

* implementation idea for infinite cons lists aka scon(e)s lists.
@ 2014-09-10 20:10 Stefan Israelsson Tampe
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Israelsson Tampe @ 2014-09-10 20:10 UTC (permalink / raw)
  To: guile-devel, guile-user@gnu.org


#|
Basically a stream is
[x,y,z,...]

But we want to gc the tail of the stream if not reachable. How to do this?

Well, consider a cons cell of
[tag,car,cdr]

and that the tag can be marked or not, just as with the marking of
guile-log's logical variables.

The technique to create persistent parts of the list is to just reference
the head of the part under analysis. Then the tail of that is never
reclaimed. This is a simple version of gc-able lists. Better versions might
exist. But hey, let's have fun ....
|#

;; This is a referential structure, but it will not gc the end
(define id->data (make-weak-key-hash-table))

(define (new-cstream)
  (cons (cons 0 '()) (vector (cons 'cstream-id '()) '() '())))
(define (cstream-i       cs) (caar   cs))
(define (cstream-data    cs) (cdar  cs))
(define (cstream-id      cs) (vector-ref (cdr cs) 0))
(define (cstream-backlog cs) (vector-ref (cdr cs) 1))
(define (cstream-logback cs) (vector-ref (cdr cs) 2))
(define (update-cstream cs a b c d)
  (cons (cons a b) (vector (cstream-id cs) c d)))
(define (hash-new-cstream cs)
  (hashq-set! id->data (cstream-id cs) cs))

(define-syntax-rule (increment-cstream-count cstream val)
  (cons (cons (+ 1 (cstream-i cstream))
              (c-ref (scons val (c-unref (cstream-data cstream)))))
        (cdr cstream)))

(define (update cstream val)
  (let ((i (cstream-i cstream)))
    (if (> i 30)
        ;; let* because backlog refers to data
        (let* ((data    (c-ref (scons val (c-unref (cstream-data cstream)))))
               (backlog (cons data (cstream-backlog cstream)))
               (logback (cstream-logback cstream)))
          (hash-new-cstream
           (if (null? logback)
               (update-cstream cstream 0 data '() (cdr (reverse backlog)))
               (update-cstream cstream 0 data backlog (cdr logback)))))
        (increment-cstream-count cstream val))))

(define (get-cstream-data cstream) (cstream-data cstream))

;; This is executed after the mark phase assuming the gc hack!
(define (sweep)
  ;; Guile's hash-for-each takes the procedure first, then the table
  (hash-for-each
   (lambda (id cs)
     (let* ((data (cstream-data cs))
            (lb   (cstream-logback cs))
            (lb   (if (pair? lb) (car lb) '())))
       (let lp ((d data))
         (if (eq? d lb)
             ;; past the logback point: cut at the first unmarked cell
             (let lp ((d d))
               (if (and (pair? d) (marked-bit? d))
                   (lp (cdr d))
                   (if (pair? d)
                       (set-cdr! d '()))))
             (lp (cdr d))))))
   id->data))

;; we have a WMARK procedure and a normal MARK procedure. WMARK will not set
;; the IS_MARKED bit of the containing scons, but that is what MARK will do
;; that is the normal marking procedure in the modded bdw-gc

;; c-ref makes a reference that is a box that will make sure that we WMARK
;; the scons list and c-unref will unbox the value

;; What is needed is special mark procedures
#|
Here is the schematic of the C mark procedures.
void mark_scons(SCM s)
{
  SCM tag = s[0];
  SCM x1  = s[1];
  SCM x2  = s[2];

  if (IS_MARKED(tag)) {
    MARK(x1);
    MARK(x2);
  } else {
    MARK(x1);
    WMARK(x2);
  }
}

void mark_ref(SCM s)
{
  SCM tag = s[0];
  SCM d   = s[1];
  WMARK(d);
}
|#


* implementation idea for infinite cons lists aka scon(e)s lists.
@ 2014-09-12 20:08 Ian Grant
  2014-09-13  1:22 ` Ian Grant
  2014-09-13 11:13 ` Stefan Israelsson Tampe
  0 siblings, 2 replies; 7+ messages in thread
From: Ian Grant @ 2014-09-12 20:08 UTC (permalink / raw)
  To: stefan.itampe, guile-devel


> #|
> Basically a stream is
> [x,y,z,...]

> But we want to gc the tail of the stream if not reachable. How to do this?

I don't understand. The tail is infinitely long. When do you want to GC it?
When your infinite memory is 50% full, or 75% full :-)

I think you probably have a good idea, but it's just not at all clear from
these two messages.

Do you know about co-induction and co-data? An ordinary (proper) list is an
example of an inductive structure: you have two constructors, a nullary one
'() which is like a constant or a constant function, it takes no arguments
and makes a list out of nothing. And you have a binary constructor, cons,
which takes a head and a tail as arguments and makes a new list. And then you
can use these a bit like an induction proof in mathematics: '() is the base
case, and cons is the induction step which takes you from a list, to a new
longer list. This is a concrete datatype: the elements it is made of are
all represented in memory.

The dual of this idea is a co-datatype like a stream, where you don't have
the concrete data structures anymore, you have what is called an _abstract
datatype_ which is a datatype that has no actual representation in the
machine: so you don't have the constructors '() and cons anymore, you just
have a single deconstructor, snoc, which, when it is applied to a stream,
maybe returns an element and a new stream, which is the tail, otherwise it
just returns something like #f which says "it ended!" In languages like
Scheme and Standard ML which do eager evaluation, streams are modelled
using either references (i.e. mutable cons cells) or eta-expansions
(thunks, (lambda () ...) with a 'unit' argument to delay evaluation), but
in lazy languages like Haskell (and untyped lambda calculus under normal
order evaluation) you don't need any tricks, you can just write the co-data
types as ordinary lambda expressions, and the 'call by name' semantics mean
that these 'infinite tails' only get expanded (i.e. represented in the
memory) when they are 'observed' by applying the deconstructor. So like in
real life, the only garbage you have to deal with is the stuff that results
from what you make: the whole infinite substructure is all 'enfolded'
underneath and takes no space at all. It's just your observing it that
makes it concrete.

There is a huge body of theory and an awful lot of scribbling been done
about this. There are mathematical texts where it's called 'category
theory' or 'non well-founded set theory'. And it comes up in order theory as
'fixedpoint calculus' and the theory of Galois connections. And in other
areas of computer science it's called bisimulation. To me it all seems to
be the same thing: "consing up one list while cdr'ing down another", but
there's probably no research mileage in saying things like that.
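
A minimal sketch of the thunk encoding described above, in plain Scheme.
The names ints-from, snoc and take-n are made up for this illustration and
are not taken from any library; the representation (a pair of an element
and a thunk, or #f for "it ended") is just one possible choice.

```scheme
;; A stream is either #f (ended) or a pair (element . thunk), where
;; calling the thunk yields the rest of the stream.
(define (ints-from n)
  (cons n (lambda () (ints-from (+ n 1)))))

;; The single deconstructor: returns two values, the element and the
;; tail stream, or #f and #f when the stream has ended.
(define (snoc s)
  (if s
      (values (car s) ((cdr s)))
      (values #f #f)))

;; Observe the first k elements. Only the observed prefix (plus the one
;; cell forced by the last snoc) is ever represented in memory.
(define (take-n s k)
  (if (or (zero? k) (not s))
      '()
      (call-with-values (lambda () (snoc s))
        (lambda (x rest) (cons x (take-n rest (- k 1)))))))
```

For example, (take-n (ints-from 0) 5) observes five elements of the
"infinite" stream of integers and leaves the rest unexpanded.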

Ian


* Re: implementation idea for infinite cons lists aka scon(e)s lists.
  2014-09-12 20:08 Ian Grant
@ 2014-09-13  1:22 ` Ian Grant
  2014-09-13 11:19   ` Stefan Israelsson Tampe
  2014-09-13 11:13 ` Stefan Israelsson Tampe
  1 sibling, 1 reply; 7+ messages in thread
From: Ian Grant @ 2014-09-13  1:22 UTC (permalink / raw)
  To: stefan.itampe, guile-devel


On the c-lambda page, you write:

   "In a sense C is simplistic and in many ways a glorified
    assembler but I actually like it."

I first learned C quite a while ago. And up until three months ago, I would
have agreed with this. But then I started using assembler in a way that I'd
never used it before: I had written a bunch of C functions to interface
lightning and libffi with Moscow ML. And it was hard work. The CAML
byte-code interpreter uses a similar sort of cell encoding scheme, folding
longs and pointers together, and passing 'vectors' of pointers around,
often several deep. It's really easy to make a mistake with the casts and
end up de-referencing a pointer to a pointer to a string reference, say,
too many or too few times.

Once I had the lightning primitives going, though, I found it got a lot
easier. And to my genuine surprise I found the assembler easier to write
and easier to understand than the C had been. And there was another benefit
that wasn't so surprising, because I think it was one of the reasons why I
was doing this: the assembler was much easier to generalise. It was almost
trivial to compose assembler-generating ML functions together, and the
result was that the fragments of assembler were automatically stacked up to
implement, for me, quite complicated C function bindings for ML. But
because the fragments of code were all being composed mechanically, they
either worked first time (most of the time) or they failed every time. I
have had very, very few of those bugs that show up once in a blue moon. One
was, I think, a fairly predictable case of some SCM pointers that guile's
GC wasn't protecting, because they were only referenced from CAML bytecode
environments. And another was, I think, a bug in my C code implementing
off-heap malloc'ed buffers for ML functions. And there haven't been any
others. But this is hundreds, if not thousands of 'lines' of assembler,
because I use the same pieces of code over and over again, in different
contexts.

You might be interested to try it out. I have made some examples of how to
do the Guile bindings for lightning, and you won't have any problem at all
generating the scheme code to implement the whole lot. There are about 350
functions to bind, so it needs to be done programmatically, but I have set
up an ML list which will translate trivially to a scheme s-exp to implement
the definitions. And there are some examples of lightning code to access
and create SCMs here:

https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/testguile.sml

The example of multiple-use starts around line 450.

I posted the example.scm.gz as an attachment to this message:

  http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00042.html

If you are interested, then I would like to ask you about how or if it
might be possible to use Prolog to solve some logical equations and do
automatic register allocation once a whole lot of assembler fragments have
been stacked up. If you try it, you will see that it is easy to assign
registers as scheme parameters and fix them in the caller, but there are
some compromises one has to make, which are less than 100% efficient. And
I think maybe a bit of AI would be able to make them dynamically and
according to context, which is definitely not what we would want to do
manually, because it would destroy the simplicity and reliability of
function composition for 'stacking fragments.' The manually specified steps
have to be very uniform and regular, because we want the results to be
_very_ reliable.

Happy hacking.


* Re: implementation idea for infinite cons lists aka scon(e)s lists.
  2014-09-13  1:22 ` Ian Grant
@ 2014-09-13 11:19   ` Stefan Israelsson Tampe
  2014-09-15 14:37     ` Ian Grant
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Israelsson Tampe @ 2014-09-13 11:19 UTC (permalink / raw)
  To: Ian Grant; +Cc: guile-devel


The c-lambda page at c-lambda.se reflects my mind 5 years ago or so. So
it is outdated :-)

Anyhow, today I think that programming in C should be done like in my
c-lambda repo at gitorious.

Otherwise you supply some interesting ideas to follow. I will spend some
time on those and return a better reply later on.

I have actually copied sbcl's assembler for amd64. You can find it in the
aschm repo at gitorious, and here again the metaprogramming you get with
Scheme is fantastic.

Cheers!


* Re: implementation idea for infinite cons lists aka scon(e)s lists.
  2014-09-13 11:19   ` Stefan Israelsson Tampe
@ 2014-09-15 14:37     ` Ian Grant
  0 siblings, 0 replies; 7+ messages in thread
From: Ian Grant @ 2014-09-15 14:37 UTC (permalink / raw)
  To: Stefan Israelsson Tampe; +Cc: guile-devel


On Sat, Sep 13, 2014 at 7:19 AM, Stefan Israelsson Tampe <
stefan.itampe@gmail.com> wrote:

> Anyhow today I think that programming in C should be done like in my
> c-lambda repo at gitorious.
>

I can't find it. Can you send me a URL?

> I have actually copied sbcl's assembler for amd64. you can find it in the
> aschm repo at gitorious, and here again the meta programming you get with
> scheme is fantastic
>

I found this. It looks really good. I like the way it handles labels. I
haven't yet looked up how you do it, I guess it's a syntax macro of some
kind - I didn't know they could do that though.

It looks like SBCL hackers have been playing with virtual machines
metaprogrammed in LISP for a while. But they don't do abstract syntax for
their VM, so every implementation is a concrete LISP program implementing
the VM in some particular assembler or other. But writing a common,
abstract VM spec, as an s-exp, for example, and then interpreting that one
common abstract spec in each of the particular concrete assembler languages
is only a small step to make. But it gets you a completely portable LISP
runtime very quickly, if all your LISP primitives were specified in that
abstract assembler.

Ian


* Re: implementation idea for infinite cons lists aka scon(e)s lists.
  2014-09-12 20:08 Ian Grant
  2014-09-13  1:22 ` Ian Grant
@ 2014-09-13 11:13 ` Stefan Israelsson Tampe
  2014-09-15 18:38   ` Ian Grant
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Israelsson Tampe @ 2014-09-13 11:13 UTC (permalink / raw)
  To: Ian Grant; +Cc: guile-devel


Hi,

The idea is to mod the gc and use that to trace the data structure (a cons
cell), so that one can know that it is saved from gc by having a special
reference, and also see whether the cons cell has been referenced from
outside the special references. Then in the sweep phase one can decide to
modify the cons list and set the cdr to null at, say, 1000 conses from the
head. But this modding is only done if there is no referencing of it
outside the special references. This means that the rest of the tail will
gc normally and be reclaimed. If, however, a function wants to analyze a
part of the list, it just references the head, and by that the gc will not
mod any part of the rest of the list and the algorithm can safely do its
work. It's in the gc's sweep phase that the list is cut with a set-cdr!
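
The cut described above can be simulated in plain Scheme without touching
the collector. This is only a sketch under stated assumptions: the cell
layout #(marked? car cdr) and the names scons, scons-mark!, cut-tail! and
scons->list are hypothetical stand-ins, and the "marked" flag is set by
hand here instead of by a modified mark phase.

```scheme
;; Hypothetical stand-in for an scons cell: #(marked? car cdr).
(define (scons x tail)    (vector #f x tail))
(define (scons? c)        (vector? c))
(define (scons-marked? c) (vector-ref c 0))
(define (scons-mark! c)   (vector-set! c 0 #t))
(define (scons-car c)     (vector-ref c 1))
(define (scons-cdr c)     (vector-ref c 2))
(define (scons-cut! c)    (vector-set! c 2 '()))

;; Skip the first KEEP cells, then walk on while cells are marked and
;; cut the list at the first unmarked cell. Returns #t if a cut was made.
(define (cut-tail! head keep)
  (let lp ((c head) (n keep))
    (cond ((not (scons? c)) #f)                 ; reached the end, no cut
          ((> n 0) (lp (scons-cdr c) (- n 1)))
          ((scons-marked? c) (lp (scons-cdr c) 0))
          (else (scons-cut! c) #t))))

;; Helper to inspect the result as an ordinary list.
(define (scons->list c)
  (if (scons? c)
      (cons (scons-car c) (scons->list (scons-cdr c)))
      '()))
```

With a five-cell list and keep = 2, an unmarked third cell gets its cdr
set to '(), so everything after it becomes ordinary garbage. If the third
cell is marked (something outside the special references points at it),
the walk continues and the cut moves to the next unmarked cell.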

/Stefan


* Re: implementation idea for infinite cons lists aka scon(e)s lists.
  2014-09-13 11:13 ` Stefan Israelsson Tampe
@ 2014-09-15 18:38   ` Ian Grant
  0 siblings, 0 replies; 7+ messages in thread
From: Ian Grant @ 2014-09-15 18:38 UTC (permalink / raw)
  To: Stefan Israelsson Tampe; +Cc: guile-devel


On Sat, Sep 13, 2014 at 7:13 AM, Stefan Israelsson Tampe <
stefan.itampe@gmail.com> wrote:

> it is saved from gc by having a special reference and that we also can see
> if the cons cell have been referenced outside of the
> special references. Then in the sweep phase one can decide to modify the
> cons list to set the cdr to null in say 1000 cons from the
> head. But this modding is only done if there is no referencing of it
> outside the special references. This means that the rest of the tail
> will gc normally and be reclaimed. If however a function wants to analyze
> a part of the list it just references the head and by that the gc
> will not mod any part of the rest of the list and the algorithm can safely
> do its work. It's in the gc's sweep phase that the list is cut with a
> set-cdr!
>

I think I am beginning to get the picture. This sounds a little like
something I started trying to do with the CAML gc. I called it "recycling
garbage" or "resurrection" because the idea is to have a process whereby
one can reclaim "weak" references which were not really dead. The
theological difficulties with this idea might be considerable, but I
thought this would be good thing to do because some structures are
expensive to allocate: double-precision floats and GMP mpzs, for example.
And in evaluating arithmetic expressions the  CAML runtime repeatedly
throws intermediate values away, and immediately does a far malloc call to
allocate another. I thought if we could keep an array of weak references
and recycle some or all of them between the mark and sweep phases, then
these could make a 'pool' of pre-allocated objects that could be reclaimed
just by pointing at them.

Does this fit with your idea? Could we combine these as two reasons to do
this change?

I implemented it, but you may not be too interested in the horrible details
of the ancient CAML gc.
https://github.com/IanANGrant/red-october/commit/1e76f6746eab2f0afa7dbbcd78d3013029e8187b

On a related theme, I have a suggestion for Guile's memory allocation
strategy, which is to document a method of preallocating a page-sized block
of cons cells, for example. Then when one has a fragment of machine code
that is constructing s-exp representations of something or other, that code
can do its own memory management just by switching pointers. I think this
is a sort of simple-minded slab allocator maybe? I had a brief look at the
way the BDW collector does things, and it seemed to me that this could be
done right now, just by directly calling GC_MALLOC from the machine code,
and SCM references which were pointers to the inside of the GC_MALLOCed
block would be enough to keep it from being swept up.  And it looks as if
whatever memory was optimistically allocated this way, and unused at the
end of the construction process, could be freed. Provided it was a
contiguous region, which it usually will be, the BDW collector would split
the block, and free the unused part.
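
To make the bookkeeping concrete, here is a rough model of the "switching
pointers" idea in Scheme. It is only an analogy: the proposal above is
about machine code calling GC_MALLOC directly, whereas this sketch
preallocates ordinary pairs in a vector and hands them out with a bump
index. The names make-slab and slab-cons! are hypothetical.

```scheme
;; Hypothetical slab: (next-free-index . vector-of-preallocated-pairs).
(define (make-slab n)
  (let ((cells (make-vector n #f)))
    (let fill ((i 0))
      (when (< i n)
        (vector-set! cells i (cons #f '()))   ; a fresh, mutable pair
        (fill (+ i 1))))
    (cons 0 cells)))

;; Hand out the next preallocated pair, filled in with A and D. When the
;; slab is exhausted, fall back to an ordinary cons.
(define (slab-cons! slab a d)
  (let ((i (car slab))
        (cells (cdr slab)))
    (if (< i (vector-length cells))
        (let ((cell (vector-ref cells i)))
          (set-car! slab (+ i 1))             ; bump the "pointer"
          (set-car! cell a)
          (set-cdr! cell d)
          cell)
        (cons a d))))
```

Cells beyond the bump index at the end of construction are the part that,
in the GC_MALLOC version, one would hope to hand back to the collector.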

So I don't think there's anything we would need to do to actually implement
this. The only thing is that if it is not enshrined in the Guile API spec,
then it is vulnerable to being 'patched out' without warning one day. So
maybe it only needs a comment in the code somewhere, and a paragraph in the
manual.

Since you've also spent some time down there in the garbage chute, maybe
you can confirm/deny this?

Ian
