From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Stefan Israelsson Tampe Newsgroups: gmane.lisp.guile.devel Subject: Re: implementation idea for infinite cons lists aka scon(e)s lists. Date: Sat, 13 Sep 2014 13:19:01 +0200 Message-ID: References: NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=001a1133221c901d950502f091cd X-Trace: ger.gmane.org 1410607155 28973 80.91.229.3 (13 Sep 2014 11:19:15 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sat, 13 Sep 2014 11:19:15 +0000 (UTC) Cc: guile-devel To: Ian Grant Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Sat Sep 13 13:19:11 2014 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1XSlME-0001ge-Md for guile-devel@m.gmane.org; Sat, 13 Sep 2014 13:19:11 +0200 Original-Received: from localhost ([::1]:49274 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XSlMD-0004yM-Ua for guile-devel@m.gmane.org; Sat, 13 Sep 2014 07:19:09 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:58918) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XSlM8-0004wI-Vh for guile-devel@gnu.org; Sat, 13 Sep 2014 07:19:07 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XSlM6-0005cs-QC for guile-devel@gnu.org; Sat, 13 Sep 2014 07:19:04 -0400 Original-Received: from mail-pa0-x231.google.com ([2607:f8b0:400e:c03::231]:63667) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XSlM6-0005co-D9 for guile-devel@gnu.org; Sat, 13 Sep 2014 07:19:02 -0400 Original-Received: by mail-pa0-f49.google.com with SMTP id lf10so3151444pab.22 for ; Sat, 13 Sep 2014 04:19:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; 
h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=0bvot1A7P8y2kZJbJfr31YZ0FwdnhEyFmFbf5crL7YM=; b=0Hr+CX8IS8y4kGzexMfOQa2KUMb+jFgsPgR1gqr97l6MqZYw++V7Bcx6jyiw+H0uaD IRq3u4ir4RB+XfMW/5H1v4nFPUjfXezpnbaXQlg4dLVwZc8iPCQJMcuQBIrGuS8V9Kz9 +NimJNNDr5tfiiG/RiTEWskck9J6BTkdEVXMKux+jCEDDvTdBt7GYMSP3w85pb38BTlp ONNtmpvqmniqkfbYjKNmveCoZ30D7S+qYdmMsH8lqhsFh1Q0NxFgyQihacNnYWktHlL5 uXc4l0MW4vnJ3Kbo9QcTqwNpr0/aZTDmW8uQicCmMSoQKoHXCpUX5zHbYy2LGFGzcbpy Mmig== X-Received: by 10.70.131.199 with SMTP id oo7mr24149525pdb.95.1410607141098; Sat, 13 Sep 2014 04:19:01 -0700 (PDT) Original-Received: by 10.70.36.48 with HTTP; Sat, 13 Sep 2014 04:19:01 -0700 (PDT) In-Reply-To: X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2607:f8b0:400e:c03::231 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:17444 Archived-At: --001a1133221c901d950502f091cd Content-Type: text/plain; charset=UTF-8

The c-lambda page at c-lambda.se reflects my thinking from about 5 years ago, so it is outdated :-)
Anyhow, today I think that programming in C should be done as in my c-lambda repo at Gitorious.

Otherwise, you supply some interesting ideas to follow up. I will spend some time on them and return with a better reply later.

I have actually copied SBCL's assembler for amd64. You can find it in the aschm repo at Gitorious, and here again the metaprogramming you get with Scheme is fantastic.

Cheers!

On Sat, Sep 13, 2014 at 3:22 AM, Ian Grant wrote:
> On the c-lambda page, you write:
>
>    "In a sense C is simplistic and in many ways a glorified
>    assembler but I actually like it."
>
> I first learned C quite a while ago. And up until three months ago, I would have agreed with this. But then I started using assembler in a way that I'd never used it before: I had written a bunch of C functions to interface lightning and libffi with Moscow ML. And it was hard work. The CAML byte-code interpreter uses a similar sort of cell encoding scheme, folding longs and pointers together, and passing 'vectors' of pointers around, often several deep. It's really easy to make a mistake with the casts and end up dereferencing a pointer to a pointer to a string reference, say, too many or too few times.
>
> Once I had the lightning primitives going, though, I found it got a lot easier. And to my genuine surprise I found the assembler easier to write and easier to understand than the C had been. And there was another benefit that wasn't so surprising, because I think it was one of the reasons why I was doing this: the assembler was much easier to generalise. It was almost trivial to compose assembler-generating ML functions together, and the result was that the fragments of assembler were automatically stacked up to implement, for me, quite complicated C function bindings for ML. But because the fragments of code were all being composed mechanically, they either worked first time (most of the time) or they failed every time. I have had very, very few of those bugs that show up once in a blue moon. One was, I think, a fairly predictable case of some SCM pointers that Guile's GC wasn't protecting, because they were only referenced from CAML bytecode environments. And another was, I think, a bug in my C code implementing off-heap malloc'ed buffers for ML functions. And there haven't been any others. But this is hundreds, if not thousands, of 'lines' of assembler, because I use the same pieces of code over and over again, in different contexts.
>
> You might be interested to try it out.
> I have made some examples of how to do the Guile bindings for lightning, and you won't have any problem at all generating the Scheme code to implement the whole lot. There are about 350 functions to bind, so it needs to be done programmatically, but I have set up an ML list which will translate trivially to a Scheme s-exp to implement the definitions. And there are some examples of lightning code to access and create SCMs here:
>
>    https://github.com/IanANGrant/red-october/blob/master/src/dynlibs/ffi/testguile.sml
>
> The example of multiple-use starts around line 450.
>
> I posted the example.scm.gz as an attachment to this message:
>
>    http://lists.gnu.org/archive/html/guile-devel/2014-09/msg00042.html
>
> If you are interested, then I would like to ask you about how or if it might be possible to use Prolog to solve some logical equations and do automatic register allocation once a whole lot of assembler fragments have been stacked up. If you try it, you will see that it is easy to assign registers as Scheme parameters and fix them in the caller, but there are some compromises one has to make, which are less than 100% efficient. And I think maybe a bit of AI would be able to make them dynamically and according to context, which is definitely not what we would want to do manually, because it would destroy the simplicity and reliability of function composition for 'stacking fragments.' The manually specified steps have to be very uniform and regular, because we want the results to be _very_ reliable.
>
> Happy hacking.
>
>
> On Fri, Sep 12, 2014 at 4:08 PM, Ian Grant
> wrote:
>
>> > #|
>> > Basically a stream is
>> > [x,y,z,...]
>>
>> > But we want to gc the tail of the stream if not reachable. How to do this?
>>
>> I don't understand. The tail is infinitely long. When do you want to GC it?
>> When your infinite memory is 50% full, or 75% full :-)
>>
>> I think you probably have a good idea, but it's just not at all clear from these two messages.
>>
>> Do you know about co-induction and co-data? An ordinary (proper) list is an example of an inductive structure: you have two constructors, a nullary one '() which is like a constant or a constant function, it takes no arguments and makes a list out of nothing. And you have a binary constructor, cons, which takes a head and a tail as arguments and makes a new list. And then you can use these a bit like an induction proof in mathematics: '() is the base case, and cons is the induction step which takes you from a list to a new, longer list. This is a concrete datatype: the elements it is made of are all represented in memory.
>>
>> The dual of this idea is a co-datatype like a stream, where you don't have the concrete data structures anymore, you have what is called an _abstract datatype_ which is a datatype that has no actual representation in the machine: so you don't have the constructors '() and cons anymore, you just have a single deconstructor, snoc, which, when it is applied to a stream, maybe returns an element and a new stream, which is the tail, otherwise it just returns something like #f which says "it ended!" In languages like Scheme and Standard ML which do eager evaluation, streams are modelled using either references (i.e. mutable cons cells) or eta-expansions (thunks, (lambda () ...) with a 'unit' argument to delay evaluation), but in lazy languages like Haskell (and untyped lambda calculus under normal order evaluation) you don't need any tricks, you can just write the co-data types as ordinary lambda expressions, and the 'call by name' semantics mean that these 'infinite tails' only get expanded (i.e. represented in the memory) when they are 'observed' by applying the deconstructor.
>> So like in real life, the only garbage you have to deal with is the stuff that results from what you make: the whole infinite substructure is all 'enfolded' underneath and takes no space at all. It's just your observing it that makes it concrete.
>>
>> There is a huge body of theory and an awful lot of scribbling has been done about this. There are mathematical texts where it's called 'category theory' or 'non well-founded set theory'. And it comes up in order theory as 'fixedpoint calculus' and the theory of Galois connections. And in other areas of computer science it's called bisimulation. To me it all seems to be the same thing: "consing up one list while cdr'ing down another", but there's probably no research mileage in saying things like that.
>>
>> Ian
>>
>>
>
--001a1133221c901d950502f091cd Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a1133221c901d950502f091cd--
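
The thunk-based encoding of streams that the quoted message describes for eager languages can be sketched roughly as follows. This is only an illustration, written in Python rather than Scheme or ML, with `lambda: ...` standing in for `(lambda () ...)`. The names `integers_from`, `from_list` and `take` are hypothetical and do not come from the thread, and calling the stream plays the role of the single deconstructor (what the message calls snoc). `None` is used where the message suggests `#f`.

```python
# A stream is represented as a thunk (a zero-argument function).
# Calling it "observes" one step and returns either
#   None                  -> the stream has ended, or
#   (head, tail_stream)   -> one element plus the rest, still unevaluated.

def integers_from(n):
    """Infinite stream n, n+1, n+2, ...; no cell exists until it is observed."""
    return lambda: (n, integers_from(n + 1))

def from_list(xs):
    """Finite stream over a Python list, to show the 'it ended' case."""
    return lambda: None if not xs else (xs[0], from_list(xs[1:]))

def take(k, stream):
    """Observe at most k elements by repeatedly applying the deconstructor."""
    out = []
    while k > 0:
        step = stream()        # apply the deconstructor once
        if step is None:       # the stream ended
            break
        head, stream = step    # rebind to the tail; the old thunk is dropped
        out.append(head)
        k -= 1
    return out

if __name__ == "__main__":
    print(take(3, integers_from(5)))
    print(take(10, from_list([1, 2])))
```

Because `take` rebinds `stream` to the tail at each step, nothing here keeps a reference to the part of the stream already consumed, so an ordinary garbage collector is free to reclaim it. Only the cells that have actually been observed are ever built.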