Re: implementation idea for infinite cons lists aka scon(e)s lists.

From: Ian Grant <ian.a.n.grant@googlemail.com>
To: stefan.itampe@gmail.com, guile-devel@gnu.org
Subject: Re: implementation idea for infinite cons lists aka scon(e)s lists.
Date: Fri, 12 Sep 2014 21:22:19 -0400
Message-ID: <CAKFjmdwMFfA1Ap9DLxgofWKycSpdUjYV7UNwUUkCk1vGWBTsvg@mail.gmail.com>
In-Reply-To: <CAKFjmdx2NyEEBh7JyWvM=kD=votFZn_ejY8_p=G0igr+P1jq8g@mail.gmail.com>

On the c-lambda page, you write:

   "In a sense C is simplistic and in many ways a glorified
    assembler but I actually like it."

I first learned C quite a while ago. And up until three months ago, I would
have agreed with this. But then I started using assembler in a way that I'd
never used it before: I had written a bunch of C functions to interface
lightning and libffi with Moscow ML. And it was hard work. The CAML
byte-code interpreter uses a similar sort of cell encoding scheme, folding
longs and pointers together, and passing 'vectors' of pointers around,
often several deep. It's really easy to make a mistake with the casts and
end up de-referencing a pointer to a pointer to a string reference, say,
too many or too few times.

Once I had the lightning primitives going, though, I found it got a lot
easier. And to my genuine surprise I found the assembler easier to write
and easier to understand than the C had been. And there was another benefit
that wasn't so surprising, because I think it was one of the reasons why I
was doing this: the assembler was much easier to generalise. It was almost
trivial to compose assembler-generating ML functions together, and the
result was that the fragments of assembler were automatically stacked up to
implement, for me, quite complicated C function bindings for ML. But
because the fragments of code were all being composed mechanically, they
either worked first time (most of the time) or they failed every time. I
have had very, very few of those bugs that show up once in a blue moon. One
was, I think, a fairly predictable case of some SCM pointers that guile's
GC wasn't protecting, because they were only referenced from CAML bytecode
environments. And another was, I think, a bug in my C code implementing
off-heap malloc'ed buffers for ML functions. And there haven't been any
others. But this is hundreds, if not thousands of 'lines' of assembler,
because I use the same pieces of code over and over again, in different
contexts.
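
To make the "stacking fragments" idea concrete, here is a minimal sketch in
Python rather than ML. Everything in it is an assumption for illustration:
the instruction tuples, the helper names (load_arg, untag_long, call_c, seq,
assemble), the C function name "puts_long" and the one-bit integer tag are
stand-ins invented here, not the GNU lightning API or the Moscow ML bindings.

```python
from typing import Callable, List, Tuple

# A "fragment" is a function that appends pseudo-instructions to a buffer.
# The instruction tuples are hypothetical stand-ins, not real lightning calls.
Instr = Tuple
Fragment = Callable[[List[Instr]], None]

def load_arg(reg: str, index: int) -> Fragment:
    """Fragment that loads argument number `index` into register `reg`."""
    def emit(buf: List[Instr]) -> None:
        buf.append(("getarg", reg, index))
    return emit

def untag_long(reg: str) -> Fragment:
    """Fragment that strips an assumed one-bit tag from a boxed integer."""
    def emit(buf: List[Instr]) -> None:
        buf.append(("rshi", reg, reg, 1))
    return emit

def call_c(name: str, reg: str) -> Fragment:
    """Fragment that calls a (hypothetical) C function with one register."""
    def emit(buf: List[Instr]) -> None:
        buf.append(("call", name, reg))
    return emit

def seq(*fragments: Fragment) -> Fragment:
    """Compose fragments: the result emits each one in order."""
    def emit(buf: List[Instr]) -> None:
        for fragment in fragments:
            fragment(buf)
    return emit

def assemble(fragment: Fragment) -> List[Instr]:
    """Run a fragment against an empty buffer and return the instructions."""
    buf: List[Instr] = []
    fragment(buf)
    return buf

# The same small pieces are reused to build a larger binding.
unbox_arg0 = seq(load_arg("r0", 0), untag_long("r0"))
binding = seq(unbox_arg0, call_c("puts_long", "r0"))
code = assemble(binding)
```

Because `seq` only ever concatenates what the smaller fragments emit, a
mistake in one fragment shows up in every binding that uses it, which is the
"works first time or fails every time" behaviour described above.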

You might be interested to try it out. I have made some examples of how to
do the Guile bindings for lightning, and you won't have any problem at all
generating the scheme code to implement the whole lot. There are about 350
functions to bind, so it needs to be done programmatically, but I have set
up an ML list which will translate trivially to a scheme s-exp to implement
the definitions. And there are some examples of lightning code to access
and create SCMs here:

https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/testguile.sml

The example of multiple-use starts around line 450.

I posted the example.scm.gz as an attachment to this message:

  http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00042.html

If you are interested, then I would like to ask you about how or if it
might be possible to use Prolog to solve some logical equations and do
automatic register allocation once a whole lot of assembler fragments have
been stacked up. If you try it, you will see that it is easy to assign
registers as scheme parameters and fix them in the caller, but there are
some compromises one has to make, which are less than 100% efficient. And
I think maybe a bit of AI would be able to make them dynamically and
according to context, which is definitely not what we would want to do
manually, because it would destroy the simplicity and reliability of
function composition for 'stacking fragments.' The manually specified steps
have to be very uniform and regular, because we want the results to be
_very_ reliable.
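
A rough sketch of what "registers as parameters, fixed in the caller" could
look like, again in Python with invented names. The instruction tuples and
the function increment_twice are hypothetical, and the inefficiency shown is
only one guess at the kind of compromise meant above.

```python
from typing import List, Tuple

Instr = Tuple

def add_const(dst: str, src: str, k: int) -> List[Instr]:
    """Hypothetical fragment: dst := src + k."""
    return [("addi", dst, src, k)]

def increment_twice(result: str, scratch: str, arg: str) -> List[Instr]:
    """Fragment whose registers are ordinary parameters.

    The fragment itself never chooses a register; the caller does.
    """
    return add_const(scratch, arg, 1) + add_const(result, scratch, 1)

# A uniform, manually specified convention: always hand over a separate
# scratch register. Simple and regular, but it ties up r1.
uniform = increment_twice(result="r0", scratch="r1", arg="r2")

# An allocator that looked at the context could notice that the scratch
# register may safely alias the result register here, freeing r1.
tighter = increment_twice(result="r0", scratch="r0", arg="r2")

def registers_used(code: List[Instr]) -> set:
    """Collect every register name mentioned in a list of instructions."""
    return {x for instr in code for x in instr[1:] if isinstance(x, str)}
```

Both calls compute arg + 2 into r0; `registers_used(uniform)` contains three
registers and `registers_used(tighter)` only two. Deciding when the aliasing
is safe is the sort of constraint a solver could be asked about.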

Happy hacking.

On Fri, Sep 12, 2014 at 4:08 PM, Ian Grant <ian.a.n.grant@googlemail.com>
wrote:

> > #|
> > Basically a stream is
> > [x,y,z,...]
>
> > But we want to gc the tail of the stream if not reachable. How to do
> > this?
>
> I don't understand. The tail is infinitely long. When do you want to GC
> it? When your infinite memory is 50% full, or 75% full :-)
>
> I think you probably have a good idea, but it's just not at all clear from
> these two messages.
>
> Do you know about co-induction and co-data? An ordinary (proper) list is
> an example of an inductive structure: you have two constructors, a nullary
> one '() which is like a constant or a constant function, it takes no
> arguments and makes a list out of nothing. And you have a binary
> constructor, cons, which takes a head and a tail as arguments and makes a new
> list. And then you can use these a bit like an induction proof in
> mathematics: '() is the base case, and cons is the induction step which
> takes you from a list, to a new longer list. This is a concrete datatype:
> the elements it is made of are all represented in memory.
>
> The dual of this idea is a co-datatype like a stream, where you don't have
> the concrete data structures anymore, you have what is called an _abstract
> datatype_ which is a datatype that has no actual representation in the
> machine: so you don't have the constructors '() and cons anymore, you just
> have a single deconstructor, snoc, which, when it is applied to a stream,
> maybe returns an element and a new stream, which is the tail, otherwise it
> just returns something like #f which says "it ended!" In languages like
> Scheme and Standard ML which do eager evaluation, streams are modelled
> using either references (i.e. mutable cons cells) or eta-expansions
> (thunks, (lambda () ...) with a 'unit' argument to delay evaluation), but
> in lazy languages like Haskell (and untyped lambda calculus under normal
> order evaluation) you don't need any tricks, you can just write the co-data
> types as ordinary lambda expressions, and the 'call by name' semantics mean
> that these 'infinite tails' only get expanded (i.e. represented in the
> memory) when they are 'observed' by applying the deconstructor. So like in
> real life, the only garbage you have to deal with is the stuff that results
> from what you make: the whole infinite substructure is all 'enfolded'
> underneath and takes no space at all. It's just your observing it that
> makes it concrete.
>
> There is a huge body of theory and an awful lot of scribbling been done
> about this. There are mathematical texts where it's called 'category
> theory' or 'non well-founded set theory'. And it comes up in order theory as
> 'fixedpoint calculus' and the theory of Galois connections. And in other
> areas of computer science it's called bisimulation. To me it all seems to
> be the same thing: "consing up one list while cdr'ing down another", but
> there's probably no research mileage in saying things like that.
>
> Ian
>
>
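
The thunk encoding of a stream described in the quoted message can be
sketched as follows. This is Python standing in for Scheme, so a zero-argument
lambda plays the role of (lambda () ...) and None plays the role of #f. The
deconstructor is called snoc only to match the wording above; the other names
(naturals_from, from_list, take) are made up for the example.

```python
from typing import Any, Callable, List, Optional, Tuple

# A stream is a thunk. Observing it yields either (head, tail_stream)
# or None, meaning "it ended".
Stream = Callable[[], Optional[Tuple[Any, Any]]]

def naturals_from(n: int) -> Stream:
    """Infinite stream n, n+1, n+2, ... Nothing is built until observed."""
    return lambda: (n, naturals_from(n + 1))

def from_list(xs: List[Any]) -> Stream:
    """Finite stream over an ordinary (inductive) list."""
    def observe() -> Optional[Tuple[Any, Any]]:
        if not xs:
            return None
        return (xs[0], from_list(xs[1:]))
    return observe

def snoc(stream: Stream) -> Optional[Tuple[Any, Any]]:
    """The single deconstructor: force one step of the stream."""
    return stream()

def take(stream: Stream, k: int) -> List[Any]:
    """Observe at most k elements and return them as a concrete list."""
    out: List[Any] = []
    while k > 0:
        cell = snoc(stream)
        if cell is None:
            break
        head, stream = cell  # rebinding drops our reference to the old thunk
        out.append(head)
        k -= 1
    return out

first_five = take(naturals_from(0), 5)
short = take(from_list([1, 2]), 5)
```

Only the cells that `take` actually observes are ever allocated, and each
earlier thunk becomes ordinary garbage once nothing refers to it. The
unobserved tail is just one small closure, however long the stream is.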
