unofficial mirror of guile-user@gnu.org 
* Bytestructures: a "type system" for bytevectors
@ 2015-08-30 16:32 Taylan Ulrich Bayırlı/Kammer
  2015-08-31 13:50 ` Taylan Ulrich Bayırlı/Kammer
  2016-06-20 13:24 ` Andy Wingo
  0 siblings, 2 replies; 11+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2015-08-30 16:32 UTC
  To: guile-user

https://github.com/taylanub/scheme-bytestructures

(I don't endorse GitHub, but I gave in after Gitorious went down.)

I had started working on this project around two years ago, but it had a
pretty strange and complex API, an unreadable README, and high overhead
for something you might want to use in high-performance settings (byte
crunching), and it was a bit too hung up on being standards-compliant.

I resumed working on it around a week ago, and solved all these issues!

(I still do my best to keep R7RS compliance, but Guile is my prime
*real* target.)

So what is it?

=== Structured access to bytevector contents ===

Don't tell C-standard pedants, but the type system of C is nothing more
than a thin layer over your computer memory: a huge sequence of bytes.

Compilers are allowed some crazy things as per the strict standards
language, but if you define a struct { uint16_t x; int8_t y[3]; }, you
essentially declare that you will be working with memory-snippets of 5
bytes, whose first two bytes stand for an unsigned 16-bit integer, and
the other three bytes for an array of three signed 8-bit integers.
That's all there is to it, and you could populate those bytes directly,
one by one:

    the_struct_t my_struct = { a, b, c, d, e, f }

Not sensible in practice for various reasons (endianness, compiler
optimizations, unreadable code, etc.), but it's good to be aware of the
basic idea: structs, arrays, unions, integer types, whatever, all types
in C are just windows upon byte-sequences.

Some programs will offer a stable ABI, indeed making promises about the
byte-by-byte structure of their data types.  It might also be a piece of
hardware giving you bytes under such a structure, and you can declare
the structure via C's type system to put some sanity into your
hardware-interfacing code.

Now in Scheme we have some *proper* abstraction.  We can fully forget
about the bytes and work with a purely logical notion of objects.  But
still we have bytevectors, a type encapsulating a raw sequence of bytes,
because they are useful for various purposes, be it talking with a C
library, or with a piece of hardware.

But Scheme doesn't offer anything like C's type system for bytevectors.
There are the bytevector-foo-ref/set! functions, and there's SRFI-4, but
there's nothing as generic as C's type system.

Now there is. :-)

(Bit-fields not yet supported.)

The struct example from above is now:

    (bs:struct `((x ,uint16) (y ,(bs:vector 3 int8))))

and that returns an object (a "bytestructure descriptor") which you can
use as part of other type definitions (the uint16 and int8 there are
types provided for convenience by the library; they could have been
defined by the user as well), or use to access data in a bytevector
logically via struct field names and vector indices.

    (bytestructure my-struct 'y 2)  ;the last int8

There is even a "pointer" type, made possible by our FFI module.  Be
warned though, you can write Scheme programs that segfault!

For when you need maximal performance from your code, there's a
macro-generating macro which will arrange for all the type stuff to
happen at compile time.  Granted, you lose some flexibility.

For more details, refer to the README.

I'm still shy of offering an absolutely final API, but I'm pretty happy
with the state of things and would love it if some people played with
it.  Consider this a beta phase announcement or so.  Tell me any
nontrivial issues you see with the library.


That's all for now.  Thanks in advance for any feedback!

Happy hacking,
Taylan




* Re: Bytestructures: a "type system" for bytevectors
  2015-08-30 16:32 Bytestructures: a "type system" for bytevectors Taylan Ulrich Bayırlı/Kammer
@ 2015-08-31 13:50 ` Taylan Ulrich Bayırlı/Kammer
  2015-09-07 22:57   ` neil
  2016-06-20 13:24 ` Andy Wingo
  1 sibling, 1 reply; 11+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2015-08-31 13:50 UTC
  To: guile-user

taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:

> That's all there is to it, and you could populate those bytes directly,
> one by one:
>
>     the_struct_t my_struct = { a, b, c, d, e, f }
>

Whoops, C isn't as dumb as I remembered.  You'll need to memcpy it
into a char[] to be able to hack on the bytes that freely.  Anyway.

So I've been made aware that if I want my library to work with C data
structures, I'll probably want to add alignment support. :-)

I did that now, though I don't know if I did it right because I couldn't
find precise information on what an FFI system should support
wrt. data structure alignment.  I read a little on Wikipedia and peeked
into the documentation of Haskell's FFI and CL's CFFI.

The struct constructor takes an `align?' argument now, which if true
will enable default alignment for struct fields.

E.g.

  struct { uint8_t a; uint16_t b; uint64_t c; }

becomes:

    1: 1 byte uint8
    2: 1 byte padding so uint16 is 2-byte aligned
  3-4: 2 bytes uint16
  5-8: 4 bytes padding so uint64 is 8-byte aligned
 9-16: 8 bytes uint64

16 bytes in total.  The struct's own alignment is 8, equal to the
alignment of its element with the highest alignment.

A vector's alignment is equal to that of its element type.  A union's
alignment is equal to that of its member with the highest alignment.

I see C compilers support stuff like "pack(2)" to force 2-byte alignment
for >2 byte sized fields.  I might add support for that too; I don't
know if C libraries typically use that for their ABI?

---

Next up I'll see if I can implement some example programs using the
library.  Something parsing a binary file format, something using the
FFI to work with some C data structures, etc.

In the meantime, testers and feedback welcome.

Taylan




* Re: Bytestructures: a "type system" for bytevectors
  2015-08-31 13:50 ` Taylan Ulrich Bayırlı/Kammer
@ 2015-09-07 22:57   ` neil
  2015-09-08  7:59     ` Taylan Ulrich Bayırlı/Kammer
  0 siblings, 1 reply; 11+ messages in thread
From: neil @ 2015-09-07 22:57 UTC
  To: Taylan Ulrich Bayırlı/Kammer, guile-user

Sorry for not mentioning this before, but have you seen make-c-struct and parse-c-struct at https://www.gnu.org/software/guile/manual/html_node/Foreign-Structs.html#Foreign-Structs ?







* Re: Bytestructures: a "type system" for bytevectors
  2015-09-07 22:57   ` neil
@ 2015-09-08  7:59     ` Taylan Ulrich Bayırlı/Kammer
  0 siblings, 0 replies; 11+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2015-09-08  7:59 UTC
  To: neil; +Cc: guile-user

neil@ossau.homelinux.net writes:

> Sorry for not mentioning this before, but have you seen make-c-struct
> and parse-c-struct at
> https://www.gnu.org/software/guile/manual/html_node/Foreign-Structs.html
> ?

Yup.  They're somewhat primitive, not supporting nested structs or
arrays.  Unions aren't supported at all.  Bytestructures imitates C's
type system in full w.r.t. numeric, array (vector), struct, union, and
pointer types.  Though there are two main missing things now: tight
packing and bit fields.

Tight packing should be simple to implement by allowing an integer for
the optional argument to the struct constructor (currently only accepts
#f to mean no alignment-padding whatsoever, and #t (the default) for
conventional packing).

Bit fields would be implemented by allowing an optional third element,
the width, in each field of the fields alist passed to the struct
constructor.  The only problem is how to implement fields smaller than 8
bits when the bytevector API only allows granularity down to 8 bits.  I
guess it would be implemented via bitwise operations or by converting
ints to bit-arrays.  In any case it shouldn't be too hard to abstract it
out entirely for the user.

I'm working on a couple of other things in parallel, but I should find
time to implement those some time in the not too distant future.

Taylan




* Re: Bytestructures: a "type system" for bytevectors
  2015-08-30 16:32 Bytestructures: a "type system" for bytevectors Taylan Ulrich Bayırlı/Kammer
  2015-08-31 13:50 ` Taylan Ulrich Bayırlı/Kammer
@ 2016-06-20 13:24 ` Andy Wingo
  2016-06-20 22:05   ` Taylan Ulrich Bayırlı/Kammer
  1 sibling, 1 reply; 11+ messages in thread
From: Andy Wingo @ 2016-06-20 13:24 UTC
  To: Taylan Ulrich "Bayırlı/Kammer"; +Cc: guile-user

On Sun 30 Aug 2015 18:32, taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:

> https://github.com/taylanub/scheme-bytestructures
>
> (I don't endorse GitHub, but I gave in after Gitorious went down.)
>
> I had started working on this project around two years ago but it had a
> pretty strange and complex API, an unreadable README, high overhead for
> something you might want to use in high-performance settings (byte
> crunching), and a bit too hung up on being standards compliant.
>
> I resumed working on it around a week ago, and solved all these issues!
>
> (I still do my best to keep R7RS compliance, but Guile is my prime
> *real* target.)
>
> So what is it?
>
> === Structured access to bytevector contents ===

I really want something like this BTW :) I was thinking that we can do
this on a proper low level, applying type tags to bytevectors.  I want
something like http://luajit.org/ext_ffi.html for Guile, and data access
is part of it.

Andy




* Re: Bytestructures: a "type system" for bytevectors
  2016-06-20 13:24 ` Andy Wingo
@ 2016-06-20 22:05   ` Taylan Ulrich Bayırlı/Kammer
  2016-06-20 23:47     ` Matt Wette
  0 siblings, 1 reply; 11+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2016-06-20 22:05 UTC
  To: Andy Wingo; +Cc: guile-user

Andy Wingo <wingo@pobox.com> writes:

> On Sun 30 Aug 2015 18:32, taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
>
>> https://github.com/taylanub/scheme-bytestructures
>>
>> (I don't endorse GitHub, but I gave in after Gitorious went down.)
>>
>> I had started working on this project around two years ago but it had a
>> pretty strange and complex API, an unreadable README, high overhead for
>> something you might want to use in high-performance settings (byte
>> crunching), and a bit too hung up on being standards compliant.
>>
>> I resumed working on it around a week ago, and solved all these issues!
>>
>> (I still do my best to keep R7RS compliance, but Guile is my prime
>> *real* target.)
>>
>> So what is it?
>>
>> === Structured access to bytevector contents ===
>
> I really want something like this BTW :) I was thinking that we can do
> this on a proper low level, applying type tags to bytevectors.  I want
> something like http://luajit.org/ext_ffi.html for Guile, and data access
> is part of it.
>
> Andy

Hi Andy :-)

Have you also seen the thread titled "Bytestructures, FFI"?  (Maybe I
should have dug up this thread and replied to it instead of creating
that one.)

In light of that new feature that lets one easily wrap C functions from
Scheme, the missing features relative to Lua's FFI that I can spot are:

- parsing of C syntax

- automatic handling of strings

Is there more?

The second should be relatively easy.  The first is something I think
would be *great* and always had in mind, but it scares me. :-) Any tips
on how to begin to implement it welcome.  Would I have to write a C
parser in Scheme, or can we cheat somehow?

Taylan




* Re: Bytestructures: a "type system" for bytevectors
  2016-06-20 22:05   ` Taylan Ulrich Bayırlı/Kammer
@ 2016-06-20 23:47     ` Matt Wette
  2016-06-21  7:50       ` Taylan Ulrich Bayırlı/Kammer
  0 siblings, 1 reply; 11+ messages in thread
From: Matt Wette @ 2016-06-20 23:47 UTC
  To: Taylan Ulrich Bayırlı/Kammer; +Cc: Andy Wingo, guile-user


> On Jun 20, 2016, at 3:05 PM, Taylan Ulrich Bayırlı/Kammer <taylanbayirli@gmail.com> wrote:
> [SNIP]
>  Would I have to write a C parser in Scheme, or can we cheat somehow?


scheme@(guile-user)> (use-modules (nyacc lang c99 parser))
scheme@(guile-user)> (use-modules (ice-9 pretty-print))
scheme@(guile-user)> (pretty-print (with-input-from-string "int printf(char *fmt, ...);" parse-c99))
(trans-unit
  (decl (decl-spec-list (type-spec (fixed-type "int")))
        (init-declr-list
          (init-declr
            (ftn-declr
              (ident "printf")
              (param-list
                (param-decl
                  (decl-spec-list (type-spec (fixed-type "char")))
                  (param-declr (ptr-declr (pointer) (ident "fmt"))))
                (ellipis)))))))

nyacc is an all-guile implementation of yacc and comes with a c99 parser, available from www.nongnu.org.   
The parser outputs parse trees in sxml format.   It is beta-level code.

Matt




* Re: Bytestructures: a "type system" for bytevectors
  2016-06-20 23:47     ` Matt Wette
@ 2016-06-21  7:50       ` Taylan Ulrich Bayırlı/Kammer
  2016-06-21 12:53         ` Matt Wette
  0 siblings, 1 reply; 11+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2016-06-21  7:50 UTC
  To: Matt Wette; +Cc: Andy Wingo, guile-user

Matt Wette <matt.wette@gmail.com> writes:

> nyacc is an all-guile implementation of yacc and comes with a c99
> parser, available from www.nongnu.org. 
> The parser outputs parse trees in sxml format. It is beta-level code.
>
> Matt

Wow!  That covers a big chunk of the task, if I implement it from
scratch.  In fact, given I don't have to deal with typedefs and such for
doing something like Lua's FFI, it covers most of the task.

Thank you!

Taylan




* Re: Bytestructures: a "type system" for bytevectors
  2016-06-21  7:50       ` Taylan Ulrich Bayırlı/Kammer
@ 2016-06-21 12:53         ` Matt Wette
  2016-06-24 14:33           ` Matt Wette
  2016-06-24 14:47           ` Matt Wette
  0 siblings, 2 replies; 11+ messages in thread
From: Matt Wette @ 2016-06-21 12:53 UTC
  To: Taylan Ulrich Bayırlı/Kammer; +Cc: Andy Wingo, guile-user


> On Jun 21, 2016, at 12:50 AM, Taylan Ulrich Bayırlı/Kammer <taylanbayirli@gmail.com> wrote:
> 
> Matt Wette <matt.wette@gmail.com> writes:
> 
>> nyacc is an all-guile implementation of yacc and comes with a c99
>> parser, available from www.nongnu.org. 
>> The parser outputs parse trees in sxml format. It is beta-level code.
>> 
>> Matt
> 
> Wow!  That covers a big chunk of the task, if I implement it from
> scratch.  In fact, given I don't have to deal with typedefs and such for
> doing something like Lua's FFI, it covers most of the task.

There is code to expand typedefs.   Check “stripdown” routine in (nyacc lang c99 util2).  It also provides a keyword arg (#:keep) to provide a list of typedefs to not expand.

Matt





* Re: Bytestructures: a "type system" for bytevectors
  2016-06-21 12:53         ` Matt Wette
@ 2016-06-24 14:33           ` Matt Wette
  2016-06-24 14:47           ` Matt Wette
  1 sibling, 0 replies; 11+ messages in thread
From: Matt Wette @ 2016-06-24 14:33 UTC
  To: guile-user

 
> On Jun 21, 2016, at 5:53 AM, Matt Wette <matt.wette@gmail.com> wrote:
> 
>> On Jun 21, 2016, at 12:50 AM, Taylan Ulrich Bayırlı/Kammer <taylanbayirli@gmail.com> wrote:
>> 
>> Matt Wette <matt.wette@gmail.com> writes:
>> 
>>> nyacc is an all-guile implementation of yacc and comes with a c99
>>> parser, available from www.nongnu.org. 
>>> The parser outputs parse trees in sxml format. It is beta-level code.
>>> 
>>> Matt
>> 
>> Wow!  That covers a big chunk of the task, if I implement it from
>> scratch.  In fact, given I don't have to deal with typedefs and such for
>> doing something like Lua's FFI, it covers most of the task.
> 
> There is code to expand typedefs.   Check “stripdown” routine in (nyacc lang c99 util2).  It also provides a keyword arg (#:keep) to provide a list of typedefs to not expand.

I should expand a little.  The nyacc c99 parser provides functionality that I think will be valuable to FFI:
1) preserves file context: the parse tree for included files is stuffed under an associated cpp-stmt include node
2) provides utility to expand typedef references into base types (with “keepers” arg to select which typedef references to preserve)
3) provides parser argument to choose which defines get expanded
4) provides utility to unwrap declarations

There is still a lot to do to support FFI, e.g., what to do about system dependencies (is “int” 32 bits or 64 bits?).

My stuff is all still pretty rough right now.   Here is some demo code for features (1) and (2).  

;; demo.scm

(use-modules (nyacc lang c99 parser))
(use-modules (nyacc lang c99 util1))
(use-modules (nyacc lang c99 util2))
(use-modules (ice-9 pretty-print))

(let* ((tree (with-input-from-file "ex1.h" parse-c99))
       (ud1 (reverse (tree->udict tree)))) ;; decl's only in ex1.h
  (display "\ntree:\n")
  (pretty-print tree)
  (display "\nud1:\n")
  (pretty-print ud1)
  (let ((ud0 (tree->udict (merge-inc-trees! tree)))) ;; decl's in all
    (for-each
     (lambda (pair)
       (let* ((udecl (cdr pair))
              (tdexp (expand-decl-typerefs udecl ud0 #:keep '("int32_t")))
              (mspec (udecl->mspec tdexp)))
         (display "\ntdexp:\n")
         (pretty-print tdexp)))
     ud1)))

=================================================
// ex1.h
#include "ex0.h"

typedef struct {
  double d;
  foo_t x;
} bar_t;

bar_t ftn1(int*);
======================================================
// ex0.h
typedef int int32_t;
typedef int32_t foo_t;
======================================================
tree:
(trans-unit
  (comment " ex1.h")
  (cpp-stmt
    (include
      "\"ex0.h\""
      (trans-unit
        (comment " ex0.h")
        (decl (decl-spec-list
                (stor-spec (typedef))
                (type-spec (fixed-type "int")))
              (init-declr-list (init-declr (ident "int32_t"))))
        (decl (decl-spec-list
                (stor-spec (typedef))
                (type-spec (typename "int32_t")))
              (init-declr-list (init-declr (ident "foo_t")))))))
  (decl (decl-spec-list
          (stor-spec (typedef))
          (type-spec
            (struct-def
              (field-list
                (comp-decl
                  (decl-spec-list
                    (type-spec (float-type "double")))
                  (comp-declr-list (comp-declr (ident "d"))))
                (comp-decl
                  (decl-spec-list (type-spec (typename "foo_t")))
                  (comp-declr-list (comp-declr (ident "x"))))))))
        (init-declr-list (init-declr (ident "bar_t"))))
  (decl (decl-spec-list (type-spec (typename "bar_t")))
        (init-declr-list
          (init-declr
            (ftn-declr
              (ident "ftn1")
              (param-list
                (param-decl
                  (decl-spec-list (type-spec (fixed-type "int")))
                  (param-declr (abs-declr (pointer))))))))))
ud1:
(("bar_t"
  decl
  (decl-spec-list
    (stor-spec (typedef))
    (type-spec
      (struct-def
        (field-list
          (comp-decl
            (decl-spec-list
              (type-spec (float-type "double")))
            (comp-declr-list (comp-declr (ident "d"))))
          (comp-decl
            (decl-spec-list (type-spec (typename "foo_t")))
            (comp-declr-list (comp-declr (ident "x"))))))))
  (init-declr (ident "bar_t")))
 ("ftn1"
  decl
  (decl-spec-list (type-spec (typename "bar_t")))
  (init-declr
    (ftn-declr
      (ident "ftn1")
      (param-list
        (param-decl
          (decl-spec-list (type-spec (fixed-type "int")))
          (param-declr (abs-declr (pointer)))))))))

tdexp:
(decl (decl-spec-list
        (stor-spec (typedef))
        (type-spec
          (struct-def
            (field-list
              (comp-decl
                (decl-spec-list
                  (type-spec (float-type "double")))
                (comp-declr (ident "d")))
              (comp-decl
                (decl-spec-list (type-spec (typename "int32_t")))
                (comp-declr (ident "x")))))))
      (init-declr (ident "bar_t")))

tdexp:
(decl (decl-spec-list
        (type-spec
          (struct-def
            (field-list
              (comp-decl
                (decl-spec-list
                  (type-spec (float-type "double")))
                (comp-declr (ident "d")))
              (comp-decl
                (decl-spec-list (type-spec (typename "int32_t")))
                (comp-declr (ident "x")))))))
      (init-declr
        (ftn-declr
          (ident "ftn1")
          (param-list
            (param-decl
              (decl-spec-list (type-spec (fixed-type "int")))
              (param-declr (abs-declr (pointer))))))))




* Re: Bytestructures: a "type system" for bytevectors
  2016-06-21 12:53         ` Matt Wette
  2016-06-24 14:33           ` Matt Wette
@ 2016-06-24 14:47           ` Matt Wette
  1 sibling, 0 replies; 11+ messages in thread
From: Matt Wette @ 2016-06-24 14:47 UTC
  To: guile-user, guile-devel

 

