unofficial mirror of guile-devel@gnu.org 
* ffi docs
@ 2010-04-06 21:36 Andy Wingo
  2010-04-07 21:38 ` Ludovic Courtès
  2010-04-15 22:36 ` Neil Jerram
  0 siblings, 2 replies; 9+ messages in thread
From: Andy Wingo @ 2010-04-06 21:36 UTC
  To: guile-devel

Hi,

I reorganized the manual sections on dynamic linking, and wrote some
sections on the dynamic FFI. I'm appending a text rendering of the
documentation. Comments welcome!

Andy

0.1 Foreign Function Interface
==============================

The more one hacks in Scheme, the more one realizes that there are
actually two computational worlds: one which is warm and alive, that
land of parentheses, and one cold and dead, the land of C and its ilk.

   Yet we as programmers live in both worlds, and Guile itself is half
implemented in C. So it is that Guile's living half pays respect to its
dead counterpart, via a spectrum of interfaces to C ranging from dynamic
loading of Scheme primitives to dynamic binding of stock C library
procedures.

0.1.1 Foreign Libraries
-----------------------

Most modern Unices have something called "shared libraries".  This
ordinarily means that they have the capability to share the executable
image of a library between several running programs to save memory and
disk space.  But generally, shared libraries give a lot of additional
flexibility compared to the traditional static libraries.  In fact,
calling them `dynamic' libraries is as correct as calling them `shared'.

   This flexibility comes from deferring part of the linking.  When you
link a program against a shared library, that library is not copied
into the final
executable.  Instead, the executable of your program only contains
enough information to find the needed shared libraries when the program
is actually run.  Only then, when the program is starting, is the final
step of the linking process performed.  This means that you need not
recompile all programs when you install a new, only slightly modified
version of a shared library.  The programs will pick up the changes
automatically the next time they are run.

   Now, when all the necessary machinery is there to perform part of the
linking at run-time, why not take the next step and allow the programmer
to explicitly take advantage of it from within the program?  Of course,
many operating systems that support shared libraries do just that, and
chances are that Guile will allow you to access this feature from within
your Scheme programs.  As you might have guessed already, this feature
is called "dynamic linking".(1)

   We titled this section "foreign libraries" because although the name
"foreign" doesn't leak into the API, the world of C really is foreign
to Scheme - and that estrangement extends to components of foreign
libraries as well, as we will see in later sections.

 -- Scheme Procedure: dynamic-link [library]
 -- C Function: scm_dynamic_link (library)
     Find the shared library denoted by LIBRARY (a string) and link it
     into the running Guile application.  When everything works out,
     return a Scheme object suitable for representing the linked object
     file.  Otherwise an error is thrown.  How object files are
     searched is system dependent.

     Normally, LIBRARY is just the name of some shared library file
     that will be searched for in the places where shared libraries
     usually reside, such as in `/usr/lib' and `/usr/local/lib'.

     When LIBRARY is omitted, a "global symbol handle" is returned.
     This handle provides access to the symbols available to the
     program at run-time, including those exported by the program
     itself and the shared libraries already loaded.

 -- Scheme Procedure: dynamic-object? obj
 -- C Function: scm_dynamic_object_p (obj)
     Return `#t' if OBJ is a dynamic library handle, or `#f' otherwise.

 -- Scheme Procedure: dynamic-unlink dobj
 -- C Function: scm_dynamic_unlink (dobj)
     Unlink the indicated object file from the application.  The
     argument DOBJ must have been obtained by a call to `dynamic-link'.
     After `dynamic-unlink' has been called on DOBJ, its content is no
     longer accessible.

     (define libGL-obj (dynamic-link "libGL"))
     libGL-obj
     => #<dynamic-object "libGL">
     (dynamic-unlink libGL-obj)
     libGL-obj
     => #<dynamic-object "libGL" (unlinked)>

   As you can see, after calling `dynamic-unlink' on a dynamically
linked library, it is marked as `(unlinked)' and you are no longer able
to use it with `dynamic-call', etc.  Whether the library is really
removed from your program is system-dependent and will generally not
happen when some other parts of your program still use it.

   When dynamic linking is disabled or not supported on your system,
the above functions throw errors, but they are still available.
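
   For readers who know the C side, `dynamic-link' and `dynamic-unlink'
behave much like the POSIX calls `dlopen' and `dlclose'.  The following
is only a sketch under that assumption: it is not Guile's
implementation, which may go through a portability layer, and the
helper names `open_library' and `close_library' are made up for
illustration.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: open the shared library NAME.  When NAME is
   NULL, return a handle for the running program and the libraries it
   has already loaded (compare `dynamic-link' with no argument).
   Returns NULL on failure.  */
void *
open_library (const char *name)
{
  void *handle = dlopen (name, RTLD_LAZY | RTLD_GLOBAL);
  if (handle == NULL)
    fprintf (stderr, "dlopen failed: %s\n", dlerror ());
  return handle;
}

/* Hypothetical helper: release HANDLE.  Returns 0 on success.  Whether
   the library is really unmapped is up to the system.  */
int
close_library (void *handle)
{
  return dlclose (handle);
}
```

   On some systems a program using these calls must be linked with
`-ldl'.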

   ---------- Footnotes ----------

   (1) Some people also refer to the final linking stage at program
startup as `dynamic linking', so if you want to make yourself perfectly
clear, it is probably best to use the more technical term "dlopening",
as suggested by Gordon Matzigkeit in his libtool documentation.

0.1.2 Foreign Functions
-----------------------

The most natural thing to do with a dynamic library is to grovel around
in it for a function pointer: a "foreign function".  `dynamic-func'
exists for that purpose.

 -- Scheme Procedure: dynamic-func name dobj
 -- C Function: scm_dynamic_func (name, dobj)
     Return a "handle" for the function NAME in the shared object
     referred to by DOBJ. The handle can be passed to `dynamic-call' to
     actually call the function.

     Regardless of whether your C compiler prepends an underscore `_'
     to the global names in a program, you should *not* include this
     underscore in NAME since it will be added automatically when
     necessary.

   Guile has built-in support for calling functions with no arguments:
`dynamic-call'.

 -- Scheme Procedure: dynamic-call func dobj
 -- C Function: scm_dynamic_call (func, dobj)
     Call the C function indicated by FUNC and DOBJ.  The function is
     passed no arguments and its return value is ignored.  When FUNC
     is something returned by `dynamic-func', call that function and
     ignore DOBJ.  When FUNC is a string, look it up in DOBJ; this is
     equivalent to
          (dynamic-call (dynamic-func FUNC DOBJ) #f)

     Interrupts are deferred while the C function is executing (with
     `SCM_DEFER_INTS'/`SCM_ALLOW_INTS').
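
   In C terms, looking a function up by name and calling it with no
arguments might look like the sketch below.  It assumes a POSIX system
with `dlsym'; the helper name `call_by_name' is hypothetical, and this
is not how Guile itself is written.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look NAME up in HANDLE and call it with no
   arguments, ignoring any result.  Returns 0 if the symbol was found
   and called, -1 otherwise.  */
int
call_by_name (void *handle, const char *name)
{
  void (*fn) (void);

  /* POSIX lets the object pointer returned by dlsym be used as a
     function pointer; ISO C alone does not define that conversion,
     hence the indirect assignment.  */
  *(void **) (&fn) = dlsym (handle, name);
  if (fn == NULL)
    return -1;

  fn ();
  return 0;
}
```

   The caller is responsible for making sure that the named symbol
really is a function taking no arguments; nothing checks this.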

   `dynamic-call' is not very powerful. It is mostly intended to be
used for calling specially written initialization functions that will
then add new primitives to Guile. For example, we do not expect that you
will dynamically link `libX11' with `dynamic-link' and then construct a
beautiful graphical user interface just by using `dynamic-call'.
Instead, the usual way would be to write a special Guile-to-X11 glue
library that has intimate knowledge about both Guile and X11 and does
whatever is necessary to make them inter-operate smoothly. This glue
library could then be dynamically linked into a vanilla Guile
interpreter and activated by calling its initialization function. That
function would add all the new types and primitives to the Guile
interpreter that it has to offer.

   (There is actually another, better option: simply to create a
`libX11' wrapper in Scheme via the dynamic FFI. *Note Dynamic FFI::,
for more information.)

   Given some set of C extensions to Guile, the next logical step is to
integrate these glue libraries into the module system of Guile so that
you can load new primitives into a running system just as you can load
new Scheme code.

 -- Scheme Procedure: load-extension lib init
 -- C Function: scm_load_extension (lib, init)
     Load and initialize the extension designated by LIB and INIT.
     When there is no pre-registered function for LIB/INIT, this is
     equivalent to

          (dynamic-call INIT (dynamic-link LIB))

     When there is a pre-registered function, that function is called
     instead.

     Normally, there is no pre-registered function.  This option exists
     only for situations where dynamic linking is unavailable or
     unwanted.  In that case, you would statically link your program
     with the desired library, and register its init function right
     after Guile has been initialized.

     LIB should be a string denoting a shared library without any file
     type suffix such as ".so".  The suffix is provided automatically.
     It should also not contain any directory components.  Libraries
     that implement Guile Extensions should be put into the normal
     locations for shared libraries.  We recommend using the naming
     convention libguile-bla-blum for an extension related to a module
     `(bla blum)'.

     The normal way for an extension to be used is to write a small
     Scheme file that defines a module, and to load the extension into
     this module.  When the module is auto-loaded, the extension is
     loaded as well.  For example,

          (define-module (bla blum))

          (load-extension "libguile-bla-blum" "bla_init_blum")

0.1.3 C Extensions
------------------

The most interesting application of dynamically linked libraries is
probably to use them for providing _compiled code modules_ to Scheme
programs.  As much fun as programming in Scheme is, every now and then
comes the need to write some low-level C stuff to make Scheme even more
fun.

   Not only can you put these new primitives into their own module (see
the previous section), you can even put them into a shared library that
is linked to your running Guile image only when it is actually needed.

   An example will hopefully make everything clear.  Suppose we want to
make the Bessel functions of the C library available to Scheme in the
module `(math bessel)'.  First we need to write the appropriate glue
code to convert the arguments and return values of the functions from
Scheme to C and back.  Additionally, we need a function that will add
them to the set of Guile primitives.  Because this is just an example,
we will only implement this for the `j0' function.

     #include <math.h>
     #include <libguile.h>

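     /* Glue: convert the Scheme argument to a C double, call j0, and
        convert the result back to a Scheme number.  */
     SCM
     j0_wrapper (SCM x)
     {
       return scm_from_double (j0 (scm_to_double (x)));
     }

     /* Register j0_wrapper as the Scheme primitive `j0', taking one
        required argument.  */
     void
     init_math_bessel ()
     {
       scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
     }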

   We can already try to bring this into action by manually calling the
low level functions for performing dynamic linking.  The C source file
needs to be compiled into a shared library.  Here is how to do it on
GNU/Linux; please refer to the `libtool' documentation for how to
create dynamically linkable libraries portably.

     gcc -shared -o libbessel.so -fPIC bessel.c

   Now fire up Guile:

     (define bessel-lib (dynamic-link "./libbessel.so"))
     (dynamic-call "init_math_bessel" bessel-lib)
     (j0 2)
     => 0.223890779141236

   The filename `./libbessel.so' should be pointing to the shared
library produced with the `gcc' command above, of course.  The second
line of the Guile interaction will call the `init_math_bessel' function
which in turn will register the C function `j0_wrapper' with the Guile
interpreter under the name `j0'.  This function becomes immediately
available and we can call it from Scheme.

   Fun, isn't it?  But we are only half way there.  This is what
`apropos' has to say about `j0':

     (apropos "j0")
     -| (guile-user): j0     #<primitive-procedure j0>

   As you can see, `j0' ended up in `(guile-user)', the module that was
current at the REPL, rather than in a module of its own.  In general, a
primitive is put into whatever module is the "current module" at the
time `scm_c_define_gsubr' is called.

   A compiled module should have a specially named "module init
function".  Guile knows about this special name and will call that
function automatically after having linked in the shared library.  For
our example, we replace `init_math_bessel' with the following code in
`bessel.c':

     void
     init_math_bessel (void *unused)
     {
       scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
       scm_c_export ("j0", NULL);
     }

     void
     scm_init_math_bessel_module ()
     {
       scm_c_define_module ("math bessel", init_math_bessel, NULL);
     }

   The general pattern for the name of a module init function is:
`scm_init_', followed by the name of the module where the individual
hierarchical components are concatenated with underscores, followed by
`_module'.
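
   The naming pattern can be spelled out as a small C helper.  This is
a sketch for illustration only: `module_init_name' and `append' are
hypothetical names, and the sketch assumes every module name component
is already a valid C identifier fragment.

```c
#include <stddef.h>
#include <string.h>

/* Append S to the NUL-terminated string in BUF (capacity SIZE).
   Returns 0 on success, -1 if the result would not fit.  */
static int
append (char *buf, size_t size, const char *s)
{
  size_t used = strlen (buf);
  size_t len = strlen (s);
  if (used + len + 1 > size)
    return -1;
  memcpy (buf + used, s, len + 1);
  return 0;
}

/* Hypothetical helper: write the module init function name for the
   module whose name components are PARTS[0..N-1] into BUF.
   Returns 0 on success, -1 if BUF is too small.  */
int
module_init_name (const char *const *parts, size_t n,
                  char *buf, size_t size)
{
  size_t i;

  if (size == 0)
    return -1;
  buf[0] = '\0';

  if (append (buf, size, "scm_init") != 0)
    return -1;
  for (i = 0; i < n; i++)
    if (append (buf, size, "_") != 0
        || append (buf, size, parts[i]) != 0)
      return -1;
  return append (buf, size, "_module");
}
```

   Given the components `math' and `bessel', the helper produces the
name used in the example above, `scm_init_math_bessel_module'.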

   After `libbessel.so' has been rebuilt, we need to install the shared
library in the right place.

   Once the module has been correctly installed, it should be possible
to use it like this:

     guile> (load-extension "./libbessel.so" "scm_init_math_bessel_module")
     guile> (use-modules (math bessel))
     guile> (j0 2)
     0.223890779141236
     guile> (apropos "j0")
     -| (math bessel): j0      #<primitive-procedure j0>

   That's it!

0.1.4 Modules and Extensions
----------------------------

The new primitives that you add to Guile with `scm_c_define_gsubr'
(*note Primitive Procedures::) or with any of the other mechanisms are
placed into the module that is current when `scm_c_define_gsubr' is
executed. Extensions loaded from the REPL, for example, will be placed
into the `(guile-user)' module, if the REPL module was not changed.

   To define C primitives within a specific module, the simplest way is:

     (define-module (foo bar))
     (load-extension "foobar-c-code" "foo_bar_init")

   When loaded with `(use-modules (foo bar))', the `load-extension'
call looks for the `foobar-c-code.so' (etc) object file in the standard
system locations, such as `/usr/lib' or `/usr/local/lib'.

   If someone installs your module to a non-standard location then the
object file won't be found.  You can address this by inserting the
install location in the `foo/bar.scm' file.  This is convenient for the
user and also guarantees the intended object is read, even if stray
older or newer versions are in the loader's path.

   The usual way to specify an install location is with a `prefix' at
the configure stage; for instance, `./configure prefix=/opt' results in
library files such as `/opt/lib/foobar-c-code.so'.  When using Autoconf
(*note Introduction: (autoconf)Top.), the library location is in a
`libdir' variable.  Its value is intended to be expanded by `make', and
can be substituted into a source file like `foo.scm.in'

     (define-module (foo bar))
     (load-extension "XXlibdirXX/foobar-c-code" "foo_bar_init")

with the following in a `Makefile', using `sed' (*note Introduction:
(sed)Top. A Stream Editor),

     foo.scm: foo.scm.in
             sed 's|XXlibdirXX|$(libdir)|' <foo.scm.in >foo.scm

   The actual pattern `XXlibdirXX' is arbitrary; it's only something
which doesn't otherwise occur.  If several modules need the value, it
can be easier to create one `foo/config.scm' with a define of the
`libdir' location, and use that as required.

     (define-module (foo config))
     (define-public foo-config-libdir "XXlibdirXX")

   Such a file might have other locations too, for instance a data
directory for auxiliary files, or `localedir' if the module has its own
`gettext' message catalogue (*note Internationalization::).

   When installing multiple C code objects, it can be convenient to put
them in a subdirectory of `libdir', thus giving for example
`/usr/lib/foo/some-obj.so'.  If the objects are only meant to be used
through the module, then a subdirectory keeps them out of sight.

   Note that all of the above requires the Scheme code to be found in
`%load-path' (*note Build Config::).  Presently it's left up
to the system administrator or each user to augment that path when
installing Guile modules in non-default locations.  But having reached
the Scheme code, that code should take care of hitting any of its own
private files etc.

   Presently there's no convention for having a Guile version number in
module C code filenames or directories.  This is primarily because
there are no established principles for two versions of Guile to be
installed under the same prefix (e.g. both under `/usr').  Assuming
upward compatibility is maintained then this should be unnecessary, and
if compatibility is not maintained then it's highly likely a package
will need to be revisited anyway.

   The present suggestion is that modules should assume when they're
installed under a particular `prefix' that there's a single version of
Guile there, and the `guile-config' at build time has the necessary
information about it.  C code or Scheme code might adapt itself
accordingly (allowing for features not available in an older version
for instance).

0.1.5 Foreign Pointers
----------------------

The previous sections have shown how Guile can be extended at runtime by
loading compiled C extensions. This approach is all well and good, but
wouldn't it be nice if we didn't have to write any C at all? This
section takes up the problem of accessing C values from Scheme, and the
next discusses C functions.

0.1.5.1 Foreign Types
.....................

The first impedance mismatch that one sees between C and Scheme is that
in C, the storage locations (variables) are typed, but in Scheme types
are associated with values, not variables. *Note Values and Variables::.

   So when accessing a C value through a Scheme pointer, we must give
the type of the pointed-to value explicitly, as a parameter to any
Scheme procedure that accesses the value.

   These "C type values" may be constructed using the constants and
procedures from the `(system foreign)' module, which may be loaded like
this:

     (use-modules (system foreign))

   `(system foreign)' exports a number of values expressing the basic C
types:

 -- Scheme Variable: int8
 -- Scheme Variable: uint8
 -- Scheme Variable: uint16
 -- Scheme Variable: int16
 -- Scheme Variable: uint32
 -- Scheme Variable: int32
 -- Scheme Variable: uint64
 -- Scheme Variable: int64
 -- Scheme Variable: float
 -- Scheme Variable: double
     Values exported by the `(system foreign)' module, representing C
     numeric types of the specified sizes and signednesses.

   In addition there are some convenience bindings for indicating types
of platform-dependent size:

 -- Scheme Variable: int
 -- Scheme Variable: unsigned-int
 -- Scheme Variable: long
 -- Scheme Variable: unsigned-long
 -- Scheme Variable: size_t
     Values exported by the `(system foreign)' module, representing C
     numeric types. For example, `long' may be `equal?' to `int64' on a
     64-bit platform.
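
   To see which fixed-size type a platform-dependent C type corresponds
to, one can ask the C compiler for its size.  The helper below is a
hypothetical sketch, not part of Guile; it only maps a byte count to
the name of the signed fixed-size type of the same width.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: name the fixed-size signed type that is SIZE
   bytes wide, or return NULL if none of the four applies.  */
const char *
fixed_width_name (size_t size)
{
  switch (size * CHAR_BIT)
    {
    case 8:  return "int8";
    case 16: return "int16";
    case 32: return "int32";
    case 64: return "int64";
    default: return NULL;
    }
}
```

   For example, `fixed_width_name (sizeof (long))' yields "int64" on a
platform where `long' is eight 8-bit bytes wide, and "int32" where it
is four.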

0.1.5.2 Foreign Variables
.........................

Given the types defined in the previous section, C pointers may be
looked up dynamically using `dynamic-pointer'.

 -- Scheme Procedure: dynamic-pointer name type dobj [len]
 -- C Function: scm_dynamic_pointer (name, type, dobj, len)
     Return a "handle" for the pointer NAME in the shared object
     referred to by DOBJ. The handle aliases a C value, and is declared
     to be of type TYPE. Valid types are defined in the `(system
     foreign)' module.

     This facility works by asking the dynamic linker for the address
     of a symbol, then assuming that it aliases a value of a given
     type. Obviously, the user must be very careful to ensure that the
     value actually is of the declared type, or bad things will happen.

     Regardless of whether your C compiler prepends an underscore `_'
     to the global names in a program, you should *not* include this
     underscore in NAME since it will be added automatically when
     necessary.

   For example, currently Guile has a variable, `scm_numptob', as part
of its API. It is declared as a C `long'. So, to create a handle
pointing to that foreign value, we do:

     (use-modules (system foreign))
     (define numptob (dynamic-pointer "scm_numptob" long (dynamic-link)))
     numptob
     => #<foreign int32 8>

   A value returned by `dynamic-pointer' is a Scheme wrapper for a C
pointer, with additional type information. A foreign pointer prints
according to its type. This example showed that a `long' on this
platform is an `int32', and that the value pointed to by `numptob' is 8.

   Typed pointers may be referenced using the `foreign-ref' and
`foreign-set!' functions.

 -- Scheme Procedure: foreign-ref foreign
 -- C Function: scm_foreign_ref (foreign)
     Reference the foreign value pointed to by FOREIGN.

     The value will be referenced according to its type.

          (foreign-ref numptob) => 8 ; YMMV

 -- Scheme Procedure: foreign-set! foreign val
 -- C Function: scm_foreign_set_x (foreign, val)
     Set the foreign value pointed to by FOREIGN.

     The value will be set according to its type.

          (foreign-set! numptob 120) ; Don't try this at home!

   If we wanted to corrupt Guile's internal state, we could set
`scm_numptob' to another value; but we shouldn't, because that variable
is not meant to be set. Indeed this point applies more widely: the C
API is a dangerous place to be. Not only might setting a value crash
your program, simply referencing a value with a wrong-sized type can
prove equally disastrous.

0.1.5.3 Void Pointers and Byte Access
.....................................

As a special case, a dynamic pointer may be declared to point to type
`void', in which case it is treated as a void pointer. A void pointer
prints its value as a pointer, without dereferencing the pointer.

   It's important at this point to conceptually separate foreign values
from foreign pointers. `dynamic-pointer' gives you a foreign pointer. A
foreign value is the semantic meaning of the bytes pointed to by a
pointer. Only foreign pointers may be wrapped in Scheme. One may make a
pointer to a foreign value, and wrap that as a Scheme object, but a
bare foreign value may not be wrapped.

   When you call `dynamic-pointer', the TYPE argument indicates the
type to which the given symbol points, but sometimes you don't know
that type. Sometimes you have a pointer, and you don't know what kind of
object it references. It's simply a pointer out into the ether, into the
`void'.

   Guile can wrap such a pointer, by declaring that it points to `void'.

 -- Scheme Variable: void
     A foreign type value representing nothing.

     `void' has two uses: for a foreign pointer, declaring it to be of
     type `void' is like having a `void*' in C. For a function, a
     return type of `void' indicates that the function returns no
     values. A function argument type of `void' is invalid.

   As an example, `(dynamic-pointer "foo" void bar-lib)' links in the
FOO symbol in the BAR-LIB library as a pointer to `void': a `void*'.

   Void pointers may be accessed as bytevectors.

 -- Scheme Procedure: foreign->bytevector foreign [uvec_type [offset
          [len]]]
 -- C Function: scm_foreign_to_bytevector (foreign, uvec_type, offset, len)
     Return a bytevector aliasing the memory pointed to by FOREIGN.

     FOREIGN must be a void pointer, a foreign whose type is VOID. By
     default, the resulting bytevector will alias all of the memory
     pointed to by FOREIGN, from beginning to end, treated as a `vu8'
     array.

     The user may specify an alternate default interpretation for the
     memory by passing the UVEC_TYPE argument, to indicate that the
     memory is an array of elements of that type.  UVEC_TYPE should be
     something that `uniform-vector-element-type' would return, like
     `f32' or `s16'.

     Users may also specify that the bytevector should only alias a
     subset of the memory, by specifying OFFSET and LEN arguments.

     Mutating the returned bytevector mutates the memory pointed to by
     FOREIGN, so buckle your seatbelts.

 -- Scheme Procedure: bytevector->foreign bv [offset [len]]
 -- C Function: scm_bytevector_to_foreign (bv, offset, len)
     Return a foreign pointer aliasing the memory pointed to by BV.

     The resulting foreign will be a void pointer, a foreign whose type
     is `void'. By default it will alias all of the memory pointed to
     by BV, from beginning to end.

     Users may explicitly specify that the foreign should only alias a
     subset of the memory, by specifying OFFSET and LEN arguments.
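
   The word "alias" is the important one here: no bytes are copied.  A
C sketch of the idea follows.  The `byte_view' type and its helpers are
hypothetical stand-ins for a bytevector, not Guile data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bytevector: a pointer and a length,
   with no copy of the underlying memory.  */
struct byte_view
{
  uint8_t *data;
  size_t len;
};

/* Alias LEN bytes starting OFFSET bytes into the memory at BASE.
   The caller must make sure the range lies inside the object.  */
struct byte_view
view_of (void *base, size_t offset, size_t len)
{
  struct byte_view v;
  v.data = (uint8_t *) base + offset;
  v.len = len;
  return v;
}

/* Fill the viewed bytes with VALUE.  The write lands in the original
   object, because the view shares its storage.  */
void
view_fill (struct byte_view v, uint8_t value)
{
  size_t i;
  for (i = 0; i < v.len; i++)
    v.data[i] = value;
}
```

   Filling a view created with an offset of 2 and a length of 3 changes
exactly bytes 2, 3 and 4 of the original memory and leaves the rest
alone.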

0.1.5.4 Foreign Structs
.......................

Finally, one last note on foreign values before moving on to actually
calling foreign functions. Sometimes you need to deal with C structs,
which requires interpreting each element of the struct according to its
type, offset, and alignment. Guile has some primitives to support this.

 -- Scheme Procedure: sizeof type
 -- C Function: scm_sizeof (type)
     Return the size of TYPE, in bytes.

     TYPE should be a valid C type, like `int'.  Alternately TYPE may
     be the symbol `*', in which case the size of a pointer is
     returned. TYPE may also be a list of types, in which case the size
     of a `struct' with ABI-conventional packing is returned.

 -- Scheme Procedure: alignof type
 -- C Function: scm_alignof (type)
     Return the alignment of TYPE, in bytes.

     TYPE should be a valid C type, like `int'.  Alternately TYPE may
     be the symbol `*', in which case the alignment of a pointer is
     returned. TYPE may also be a list of types, in which case the
     alignment of a `struct' with ABI-conventional packing is returned.
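
   As a rough guide to what conventional packing means, typical C ABIs
place each field at the next offset that is a multiple of its
alignment, and pad the end of the struct to a multiple of the largest
field alignment.  The sketch below implements that rule.  It is an
assumption about common ABIs, not a statement of what Guile computes,
it ignores bit-fields and packed structs, and `struct_layout' is a
hypothetical name.

```c
#include <stddef.h>

/* Round OFFSET up to the next multiple of ALIGN (ALIGN > 0).  */
static size_t
round_up (size_t offset, size_t align)
{
  return (offset + align - 1) / align * align;
}

/* Hypothetical helper: given the size and alignment of each of N
   fields, store each field's offset in OFFSETS and return the total
   struct size.  *STRUCT_ALIGN receives the struct's alignment.  */
size_t
struct_layout (const size_t *sizes, const size_t *aligns, size_t n,
               size_t *offsets, size_t *struct_align)
{
  size_t offset = 0, max_align = 1, i;

  for (i = 0; i < n; i++)
    {
      offset = round_up (offset, aligns[i]);  /* padding before field */
      offsets[i] = offset;
      offset += sizes[i];
      if (aligns[i] > max_align)
        max_align = aligns[i];
    }

  *struct_align = max_align;
  return round_up (offset, max_align);        /* tail padding */
}
```

   The result can be compared against the C compiler's own `sizeof',
`_Alignof' and `offsetof' for the same field list.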

   Guile also provides some convenience methods to pack and unpack
foreign pointers wrapping C structs.

 -- Scheme Procedure: make-c-struct types vals
     Create a foreign pointer to a C struct containing VALS with types
     TYPES.

     VALS and TYPES should be lists of the same length.

 -- Scheme Procedure: parse-c-struct foreign types
     Parse a foreign pointer to a C struct, returning a list of values.

     TYPES should be a list of C types.

   For example, to create and parse the equivalent of a `struct {
int64_t a; uint8_t b; }':

     (parse-c-struct (make-c-struct (list int64 uint8)
                                    (list 300 43))
                     (list int64 uint8))
     => (300 43)

   As yet, Guile only has convenience routines to support
conventionally-packed structs. But given the `bytevector->foreign' and
`foreign->bytevector' routines, one can create and parse tightly packed
structs and unions by hand. See the code for `(system foreign)' for
details.

0.1.6 Dynamic FFI
-----------------

Of course, the land of C is not all nouns and no verbs: there are
functions too, and Guile allows you to call them.

 -- Scheme Procedure: make-foreign-function return_type func_ptr
          arg_types
 -- C Function: scm_make_foreign_function (return_type, func_ptr,
          arg_types)
     Make a foreign function.

     Given the foreign void pointer FUNC_PTR, its argument and return
     types ARG_TYPES and RETURN_TYPE, return a procedure that will pass
     arguments to the foreign function and return appropriate values.

     ARG_TYPES should be a list of foreign types.  RETURN_TYPE should
     be a foreign type. *Note Foreign Types::, for more information on
     foreign types.

   Here is a better definition of `(math bessel)':

     (define-module (math bessel)
       #:use-module (system foreign)
       #:export (j0))

     (define libm (dynamic-link "libm"))

     (define j0
       (make-foreign-function double
                              (dynamic-func "j0" libm)
                              (list double)))

   That's it! No C at all.

   Numeric arguments and return values from foreign functions are
represented as Scheme values. For example, `j0' in the above example
takes a Scheme number as its argument, and returns a Scheme number.

   Pointers may be passed to and returned from foreign functions as
well.  In that case the type of the argument or return value should be
the symbol `*', indicating a pointer. For example, the following code
makes `memcpy' available to Scheme:

     (define memcpy
       (let ((this (dynamic-link)))
         (make-foreign-function '*
                                (dynamic-func "memcpy" this)
                                (list '* '* size_t))))

   To invoke `memcpy', one must pass it foreign pointers:

     (use-modules (rnrs bytevectors))

     (define src
       (bytevector->foreign (u8-list->bytevector '(0 1 2 3 4 5 6 7))))
     (define dest
       (bytevector->foreign (make-bytevector 16 0)))

     (memcpy dest src (bytevector-length (foreign->bytevector src)))

     (bytevector->u8-list (foreign->bytevector dest))
     => (0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0)

   One may also pass structs as values, passing structs as foreign
pointers. *Note Foreign Structs::, for more information on how to
express struct types and struct values.

   "Out" arguments are passed as foreign pointers. The memory pointed to
by the foreign pointer is mutated in place.

     ;; struct timeval {
     ;;      time_t      tv_sec;     /* seconds */
     ;;      suseconds_t tv_usec;    /* microseconds */
     ;; };
     ;; assuming fields are of type "long"

     (define gettimeofday
       (let ((f (make-foreign-function
                 int
                 (dynamic-func "gettimeofday" (dynamic-link))
                 (list '* '*)))
             (tv-type (list long long)))
         (lambda ()
           (let* ((timeval (make-c-struct tv-type (list 0 0)))
                  (ret (f timeval %null-pointer)))
             (if (zero? ret)
                 (apply values (parse-c-struct timeval tv-type))
                 (error "gettimeofday returned an error" ret))))))

     (gettimeofday)
     => 1270587589
     => 499553

   This example also shows use of `%null-pointer', which is a null
foreign pointer, exported by `(system foreign)'.

 -- Scheme Variable: %null-pointer
     A foreign pointer whose value is 0.
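
   The `gettimeofday' example above assumes that both fields of `struct
timeval' are of type `long'.  Nothing checks this from Scheme, so it
can be worth confirming with the C compiler on the target platform.
The sketch below does so by comparing sizes only; the function names
are hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>

/* Returns 1 if both fields of `struct timeval' have the size of a C
   `long' and the struct is exactly two longs wide, 0 otherwise.
   Only sizes are compared, not signedness.  */
int
timeval_matches_two_longs (void)
{
  struct timeval tv;
  return sizeof tv.tv_sec == sizeof (long)
         && sizeof tv.tv_usec == sizeof (long)
         && sizeof tv == 2 * sizeof (long);
}

/* Hypothetical wrapper mirroring the Scheme example: fetch the
   current time through an "out" argument.  Returns 0 on success.  */
int
current_time (long *seconds, long *microseconds)
{
  struct timeval tv;
  if (gettimeofday (&tv, NULL) != 0)
    return -1;
  *seconds = (long) tv.tv_sec;
  *microseconds = (long) tv.tv_usec;
  return 0;
}
```

   If `timeval_matches_two_longs' returns 0, the type list `(list long
long)' in the Scheme wrapper does not describe the struct on that
platform and needs adjusting.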

   As you can see, this interface to foreign functions is at a very low,
somewhat dangerous level. A contribution to Guile in the form of a
high-level FFI would be most welcome.


-- 
http://wingolog.org/





* Re: ffi docs
  2010-04-06 21:36 ffi docs Andy Wingo
@ 2010-04-07 21:38 ` Ludovic Courtès
  2010-04-07 22:01   ` Andy Wingo
  2010-04-15 22:36 ` Neil Jerram
  1 sibling, 1 reply; 9+ messages in thread
From: Ludovic Courtès @ 2010-04-07 21:38 UTC
  To: guile-devel

Hello,

Andy Wingo <wingo@pobox.com> writes:

> I reorganized the manual sections on dynamic linking, and wrote some
> sections on the dynamic FFI. I'm appending a text rendering of the
> documentation. Comments welcome!

Not much to say, apart from the fact that I’m happy with all this work!
:-)

I like the introductory text about shared libraries & co., and I like
the tone.

> Most modern Unices have something called "shared libraries".

... or “dynamic shared objects” (DSOs).  [Drepper et al. use this
terminology.]

> to Scheme - and that estrangement extends to components of foreign

This should be an em dash (‘---’ with no space around in Texinfo).

>    (1) Some people also refer to the final linking stage at program
> startup as `dynamic linking', so if you want to make yourself perfectly
> clear, it is probably best to use the more technical term "dlopening",
> as suggested by Gordon Matzigkeit in his libtool documentation.

s/libtool/Libtool/, and an xref to the Libtool manual?

>      ;; struct timeval {
>      ;;      time_t      tv_sec;     /* seconds */
>      ;;      suseconds_t tv_usec;    /* microseconds */
>      ;; };
>      ;; assuming fields are of type "long"

I think this assumption is one of the main pitfalls of a dynamic FFI.
More generally, a problem is that there’s no compile-time check that the
bindings correspond to the C code.  How about adding a paragraph to
mention it?

BTW, I remember seeing occurrences of “her” in new material, where I’d
personally prefer “their” (see <http://aetherlumina.com/gnp/>).

Thanks,
Ludo’.






* Re: ffi docs
  2010-04-07 21:38 ` Ludovic Courtès
@ 2010-04-07 22:01   ` Andy Wingo
  0 siblings, 0 replies; 9+ messages in thread
From: Andy Wingo @ 2010-04-07 22:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi,

On Wed 07 Apr 2010 23:38, ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> I reorganized the manual sections on dynamic linking, and wrote some
>> sections on the dynamic FFI. I'm appending a text rendering of the
>> documentation. Comments welcome!
>
> Not much to say, apart from the fact that I’m happy with all this work!
> :-)

Thanks! Much of it, including some bits you pointed out, was already
there; but it's always good to revise :)

>>      ;; struct timeval {
>>      ;;      time_t      tv_sec;     /* seconds */
>>      ;;      suseconds_t tv_usec;    /* microseconds */
>>      ;; };
>>      ;; assuming fields are of type "long"
>
> I think this assumption is one of the main pitfalls of a dynamic FFI.
> More generally, a problem is that there’s no compile-time check that the
> bindings correspond to the C code.  How about adding a paragraph to
> mention it?

Good point. (FWIW, the assumption about `long' actually comes from the
man page.)

> BTW, I remember seeing occurrences of “her” in new material, where I’d
> personally prefer “their” (see <http://aetherlumina.com/gnp/>).

FWIW I like to alternate. Sometimes "their" sounds a bit muddy. "Her" is
somewhat jarring but in a good way I think. Generally agreed though.

A
-- 
http://wingolog.org/





* Re: ffi docs
  2010-04-06 21:36 ffi docs Andy Wingo
  2010-04-07 21:38 ` Ludovic Courtès
@ 2010-04-15 22:36 ` Neil Jerram
  2010-04-16  8:43   ` Ludovic Courtès
                     ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Neil Jerram @ 2010-04-15 22:36 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Hi,

Hi Andy,

I agree with Ludo that this work is really great.  I had some thoughts
when reading through, as far as the start of the Foreign Structs
section, as follows.  I'll try to comment on the rest tomorrow.

         Neil


>    But yet we as programmers live in both worlds, and Guile itself is
> half implemented in C. So it is that Guile's living half pays respect
> to its dead counterpart, via a spectrum of interfaces to C ranging from
> dynamic loading of Scheme primitives to dynamic binding of stock C
> library prodedures.

c -----------^

>    We titled this section "foreign libraries" because although the name
> "foreign" doesn't leak into the API, the world of C really is foreign
> to Scheme - and that estrangement extends to components of foreign
> libraries as well, as we see in future sections.

I'm not sure what the message is here.

>  -- Scheme Procedure: dynamic-link [library]
>  -- C Function: scm_dynamic_link (library)

Code below implies that library can be omitted, and that this -
i.e. '(dynamic-link)' - means to return an object representing libguile
itself.  Should that be mentioned in the following doc?

>      Find the shared library denoted by LIBRARY (a string) and link it
>      into the running Guile application.  When everything works out,
>      return a Scheme object suitable for representing the linked object
>      file.  Otherwise an error is thrown.  How object files are
>      searched is system dependent.
>
>      Normally, LIBRARY is just the name of some shared library file
>      that will be searched for in the places where shared libraries
>      usually reside, such as in `/usr/lib' and `/usr/local/lib'.
>
>      When LIBRARY is omitted, a "global symbol handle" is returned.
>      This handle provides access to the symbols available to the
>      program at run-time, including those exported by the program
>      itself and the shared libraries already loaded.

>    Given some set of C extensions to Guile, the next logical step is to
> integrate these glue libraries into the module system of Guile so that
> you can load new primitives into a running system just as you can load
> new Scheme code.
>
>  -- Scheme Procedure: load-extension lib init
>  -- C Function: scm_load_extension (lib, init)
>      Load and initialize the extension designated by LIB and INIT.
>      When there is no pre-registered function for LIB/INIT, this is
>      equivalent to
>
>           (dynamic-call INIT (dynamic-link LIB))
>
>      When there is a pre-registered function, that function is called
>      instead.
>
>      Normally, there is no pre-registered function.  This option exists
>      only for situations where dynamic linking is unavailable or
>      unwanted.  In that case, you would statically link your program
>      with the desired library, and register its init function right
>      after Guile has been initialized.

Should there be a reference from here to wherever the registration API
is covered?

>      LIB should be a string denoting a shared library without any file
>      type suffix such as ".so".  The suffix is provided automatically.
>      It should also not contain any directory components.  Libraries
>      that implement Guile Extensions should be put into the normal
>      locations for shared libraries.  We recommend to use the naming
>      convention libguile-bla-blum for a extension related to a module
>      `(bla blum)'.

I believe this will shortly be out of date, won't it? - given our desire
to support parallel installs.

>    A compiled module should have a specially named "module init
> function".  Guile knows about this special name and will call that
> function automatically after having linked in the shared library.  For
> our example, we replace `init_math_bessel' with the following code in
> `bessel.c':
>
>      void
>      init_math_bessel (void *unused)
>      {
>        scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
>        scm_c_export ("j0", NULL);
>      }
>
>      void
>      scm_init_math_bessel_module ()
>      {
>        scm_c_define_module ("math bessel", init_math_bessel, NULL);
>      }
>
>    The general pattern for the name of a module init function is:
> `scm_init_', followed by the name of the module where the individual
> hierarchical components are concatenated with underscores, followed by
> `_module'.

Is this still correct?  IIUC it only makes sense as part of the ability
we once had for a (use-modules (...)) call to find a .so and bootstrap
it automatically.  (Unless that has been reinstated...)

>    Presently there's no convention for having a Guile version number in
> module C code filenames or directories.  This is primarily because
> there's no established principles for two versions of Guile to be
> installed under the same prefix (eg. two both under `/usr').  Assuming
> upward compatibility is maintained then this should be unnecessary, and
> if compatibility is not maintained then it's highly likely a package
> will need to be revisited anyway.
>
>    The present suggestion is that modules should assume when they're
> installed under a particular `prefix' that there's a single version of
> Guile there, and the `guile-config' at build time has the necessary
> information about it.  C code or Scheme code might adapt itself
> accordingly (allowing for features not available in an older version
> for instance).

I guess this also needs updating, for the new parallel install vision.

> 0.1.5 Foreign Pointers
> ----------------------
>
> The previous sections have shown how Guile can be extended at runtime by
> loading compiled C extensions. This approach is all well and good, but
> wouldn't it be nice if we didn't have to write any C at all? This
> section takes up the problem of accessing C values from Scheme, and the
> next discusses C functions.
>
> 0.1.5.1 Foreign Types
> .....................
>
> The first impedance mismatch that one sees between C and Scheme is that
> in C, the storage locations (variables) are typed, but in Scheme types
> are associated with values, not variables. *Note Values and Variables::.

Fine, but...

>    So when accessing a C value through a Scheme pointer, we must give
> the type of the pointed-to value explicitly, as a parameter to any
> Scheme procedure that accesses the value.

This confused me at first.  I think I understand the point now, but

- isn't it actually much more to do with the ELF binary format, rather
  than with C?  If libguile could read and parse C, it would be able to
  infer the type of any variable that the Scheme layer might request.
  The problem is precisely that what we are linking with is *not* C
  anymore...  It's just untyped pointers.

- I think "give the type ... as a parameter to any Scheme procedure that
  accesses the value" is misleading, because we don't do that!  Rather,
  we construct a box that includes both the pointer and the type, and
  then pass the box around.
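
The "box" idea can be modelled in a few lines of C. This is only a
sketch under assumptions: the struct, the enum and the function names
foreign_ref and foreign_set are hypothetical stand-ins, not Guile's
actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; the real set lives in (system foreign).  */
enum foreign_type { FOREIGN_VOID, FOREIGN_INT32, FOREIGN_INT64 };

/* The "box": an address together with the type it is declared to
   point to.  */
struct foreign
{
  void *address;
  enum foreign_type type;
};

/* Read through the box.  The type comes from the box, not from the
   caller.  */
int64_t
foreign_ref (struct foreign f)
{
  switch (f.type)
    {
    case FOREIGN_INT32:
      return *(int32_t *) f.address;
    case FOREIGN_INT64:
      return *(int64_t *) f.address;
    default:
      assert (!"a void pointer cannot be dereferenced");
      return 0;
    }
}

/* Write through the box: this is *p = val, not p = val.  */
void
foreign_set (struct foreign f, int64_t val)
{
  switch (f.type)
    {
    case FOREIGN_INT32:
      *(int32_t *) f.address = (int32_t) val;
      break;
    case FOREIGN_INT64:
      *(int64_t *) f.address = val;
      break;
    default:
      assert (!"a void pointer cannot be assigned through");
    }
}
```

The accessors take no type argument; the declared type is fixed once,
when the box is built, and nothing checks that it matches the memory
behind the address.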

> 0.1.5.2 Foreign Variables
> .........................
>
> Given the types defined in the previous section, C pointers may be
> looked up dynamically using `dynamic-pointer'.
>
>  -- Scheme Procedure: dynamic-pointer name type dobj [len]
>  -- C Function: scm_dynamic_pointer (name, type, dobj, len)
>      Return a "handle" for the pointer NAME in the shared object
>      referred to by DOBJ. The handle aliases a C value, and is declared
>      to be of type TYPE. Valid types are defined in the `(system
>      foreign)' module.
>
>      This facility works by asking the dynamic linker for the address
>      of a symbol, then assuming that it aliases a value of a given
>      type. Obviously, the user must be very careful to ensure that the
>      value actually is of the declared type, or bad things will happen.
>
>      Regardless whether your C compiler prepends an underscore `_' to
>      the global names in a program, you should *not* include this
>      underscore in NAME since it will be added automatically when
>      necessary.
>
>    For example, currently Guile has a variable, `scm_numptob', as part
> of its API. It is declared as a C `long'. So, to create a handle
> pointing to that foreign value, we do:
>
>      (use-modules (system foreign))
>      (define numptob (dynamic-pointer "scm_numptob" long (dynamic-link)))
>      numptob
>      => #<foreign int32 8>
>
>    A value returned by `dynamic-pointer' is a Scheme wrapper for a C
> pointer, with additional type information. A foreign pointer prints
> according to its type. This example showed that a `long' on this
> platform is an `int32', and that the value pointed to by `numptob' is 8.

I think the terminology is confusing here in two ways.

1. The API and the doc call these objects pointers, but because of the
automatic dereference they don't behave like pointers at all.  (Their
print function prints *p, not p, and foreign-set! does *p = val, not p =
val.)

I think that "reference" might be a less surprising name - as in C++
references, and "call by reference".

2. An object created by '(dynamic-pointer ...)' prints as '#<foreign
...>'.  If you think that foreign is the best word for this whole
area (and I think it's fine), I think you should bite the bullet and
make all the APIs say 'foreign' instead of 'dynamic'.  (And obviously
keep the 'dynamic' names of 1.8.x APIs as aliases.)

> 0.1.5.3 Void Pointers and Byte Access
> .....................................
>
> As a special case, a dynamic pointer may be declared to point to type
> `void', in which case it is treated as a void pointer. A void pointer
> prints its value as a pointer, without dereferencing the pointer.
>
>    It's important at this point to conceptually separate foreign values
> from foreign pointers. `dynamic-pointer' gives you a foreign pointer. A
> foreign value is the semantic meaning of the bytes pointed to by a
> pointer. Only foreign pointers may be wrapped in Scheme. One may make a
> pointer to a foreign value, and wrap that as a Scheme object, but a
> bare foreign value may not be wrapped.

I'm not getting the distinction here at all.  Is it important for what
follows?

>    When you call `dynamic-pointer', the TYPE argument indicates the
> type to which the given symbol points, but sometimes you don't know
> that type. Sometimes you have a pointer, and you don't know what kind of
> object it references. It's simply a pointer out into the ether, into the
> `void'.
>
>    Guile can wrap such a pointer, by declaring that it points to `void'.
>
>  -- Scheme Variable: void
>      A foreign type value representing nothing.
>
>      `void' has two uses: for a foreign pointer, declaring it to be of
>      type `void' is like having a `void*' in C. For a function, a
>      return type of `void' indicates that the function returns no
>      values. A function argument type of `void' is invalid.

This is fine.

>    As an example, `(dynamic-pointer "foo" void bar-lib)' links in the
> FOO symbol in the BAR-LIB library as a pointer to `void': a `void*'.
>
>    Void pointers may be accessed as bytevectors.
>
>  -- Scheme Procedure: foreign->bytevector foreign [uvec_type [offset
>           [len]]]
>  -- C Function: scm_foreign_to_bytevector foreign uvec_type offset len
>      Return a bytevector aliasing the memory pointed to by FOREIGN.
>
>      FOREIGN must be a void pointer, a foreign whose type is VOID. By
>      default, the resulting bytevector will alias all of the memory
>      pointed to by FOREIGN, from beginning to end, treated as a `vu8'
>      array.

It feels like we're missing a unification trick here.

Thought #1: if we have, e.g., an int8 pointer ip, why not just use
(foreign-ref ip n) to interpret the pointer as pointing to an array, and
get its nth element?

Thought #2: but if we do that we'll be duplicating the bytevector API.
So instead, shouldn't the fundamental operation be (foreign->bytevector
NAME TYPE LIBRARY [LEN]), and get/set then done using the bytevector
API?

I'm not sure either of those thoughts is right, but the current API
doesn't feel as elegant as I think it could be.
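
The overlap between the two thoughts shows up in a small C sketch. It
uses int16 rather than int8 so that the decoding step is visible; both
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Thought #1: treat the pointer as an array of int16 and index it.  */
int16_t
ref_indexed (const int16_t *p, size_t n)
{
  return p[n];
}

/* Thought #2: view the same memory as bytes and decode element N in
   native byte order, as a bytevector accessor would.  */
int16_t
ref_from_bytes (const unsigned char *bytes, size_t n)
{
  int16_t value;

  memcpy (&value, bytes + n * sizeof value, sizeof value);
  return value;
}
```

Both functions return the same element for the same memory, which is
the duplication in question: the second is roughly what
bytevector-s16-native-ref does at byte offset 2n.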





* Re: ffi docs
  2010-04-15 22:36 ` Neil Jerram
@ 2010-04-16  8:43   ` Ludovic Courtès
  2010-04-16  9:33   ` Andy Wingo
  2010-07-27  8:24   ` Ludovic Courtès
  2 siblings, 0 replies; 9+ messages in thread
From: Ludovic Courtès @ 2010-04-16  8:43 UTC (permalink / raw)
  To: guile-devel

Hi Neil,

A few answers/comments and I’ll leave the rest to Andy.  ;-)

Neil Jerram <neil@ossau.uklinux.net> writes:

> Andy Wingo <wingo@pobox.com> writes:

>>  -- Scheme Procedure: dynamic-link [library]
>>  -- C Function: scm_dynamic_link (library)
>
> Code below implies that library can be omitted, and that this -
> i.e. '(dynamic-link)' - means to return an object representing libguile
> itself.  Should that be mentioned in the following doc?

This is actually documented:

>>      When LIBRARY is omitted, a "global symbol handle" is returned.
>>      This handle provides access to the symbols available to the
>>      program at run-time, including those exported by the program
>>      itself and the shared libraries already loaded.

[...]

>>    A compiled module should have a specially named "module init
>> function".  Guile knows about this special name and will call that
>> function automatically after having linked in the shared library.  For
>> our example, we replace `init_math_bessel' with the following code in
>> `bessel.c':
>>
>>      void
>>      init_math_bessel (void *unused)
>>      {
>>        scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
>>        scm_c_export ("j0", NULL);
>>      }
>>
>>      void
>>      scm_init_math_bessel_module ()
>>      {
>>        scm_c_define_module ("math bessel", init_math_bessel, NULL);
>>      }
>>
>>    The general pattern for the name of a module init function is:
>> `scm_init_', followed by the name of the module where the individual
>> hierarchical components are concatenated with underscores, followed by
>> `_module'.
>
> Is this still correct?

The bit that says “Guile knows about this special name” has been
incorrect since 1.8 AFAIK, because the name of the init function has to
be explicitly given to ‘load-extension’.  So this part can be removed.

The bit about the “general pattern for the name” is correct, but it
should probably be made clear that it’s just a convention.
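
Since the name is only a convention, it can be derived mechanically. A
minimal sketch in C, assuming a made-up helper module_init_name that is
not part of libguile:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the conventional init function name for a
   module whose name has N components, e.g. {"math", "bessel"} gives
   "scm_init_math_bessel_module".  Returns 0 on success, -1 if BUF is
   too small.  */
int
module_init_name (char *buf, size_t size,
                  const char *const *components, size_t n)
{
  size_t used;
  size_t i;
  int len = snprintf (buf, size, "scm_init");

  if (len < 0 || (size_t) len >= size)
    return -1;
  used = (size_t) len;

  for (i = 0; i < n; i++)
    {
      len = snprintf (buf + used, size - used, "_%s", components[i]);
      if (len < 0 || (size_t) len >= size - used)
        return -1;
      used += (size_t) len;
    }

  len = snprintf (buf + used, size - used, "_module");
  if (len < 0 || (size_t) len >= size - used)
    return -1;
  return 0;
}
```

The resulting string is what would be passed explicitly as the INIT
argument of ‘load-extension’; Guile does not work it out by itself.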

>>    So when accessing a C value through a Scheme pointer, we must give
>> the type of the pointed-to value explicitly, as a parameter to any
>> Scheme procedure that accesses the value.
>
> This confused me at first.  I think I understand the point now, but
>
> - isn't it actually much more to do with the ELF binary format, rather
>   than with C?  If libguile could read and parse C, it would be able to
>   infer the type of any variable that the Scheme layer might request.
>   The problem is precisely that what we are linking with is *not* C
>   anymore...  It's just untyped pointers.

C itself is very weakly typed: just about anything can be cast to
anything else.

> - I think "give the type ... as a parameter to any Scheme procedure that
>   accesses the value" is misleading, because we don't do that!  Rather,
>   we construct a box that includes both the pointer and the type, and
>   then pass the box around.

Agreed.

[...]

>>    As an example, `(dynamic-pointer "foo" void bar-lib)' links in the
>> FOO symbol in the BAR-LIB library as a pointer to `void': a `void*'.
>>
>>    Void pointers may be accessed as bytevectors.
>>
>>  -- Scheme Procedure: foreign->bytevector foreign [uvec_type [offset
>>           [len]]]
>>  -- C Function: scm_foreign_to_bytevector foreign uvec_type offset len
>>      Return a bytevector aliasing the memory pointed to by FOREIGN.
>>
>>      FOREIGN must be a void pointer, a foreign whose type is VOID. By
>>      default, the resulting bytevector will alias all of the memory
>>      pointed to by FOREIGN, from beginning to end, treated as a `vu8'
>>      array.
>
> It feels like we're missing a unification trick here.

What do you mean?

> Thought #1: if we have, e.g., an int8 pointer ip, why not just use
> (foreign-ref ip n) to interpret the pointer as pointing to an array, and
> get its nth element?
>
> Thought #2: but if we do that we'll be duplicating the bytevector API.
> So instead, shouldn't the fundamental operation be (foreign->bytevector
> NAME TYPE LIBRARY [LEN]), and get/set then done using the bytevector
> API?

What would be LIBRARY here?

FWIW I’m fine with this procedure.

Thanks,
Ludo’.






* Re: ffi docs
  2010-04-15 22:36 ` Neil Jerram
  2010-04-16  8:43   ` Ludovic Courtès
@ 2010-04-16  9:33   ` Andy Wingo
  2010-04-16 22:34     ` Neil Jerram
  2010-07-27  8:24   ` Ludovic Courtès
  2 siblings, 1 reply; 9+ messages in thread
From: Andy Wingo @ 2010-04-16  9:33 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

Hi,

Thanks for the feedback!

On Fri 16 Apr 2010 00:36, Neil Jerram <neil@ossau.uklinux.net> writes:

>>    But yet we as programmers live in both worlds, and Guile itself is
>> half implemented in C. So it is that Guile's living half pays respect
>> to its dead counterpart, via a spectrum of interfaces to C ranging from
>> dynamic loading of Scheme primitives to dynamic binding of stock C
>> library prodedures.
>
> c -----------^

What does this mean?

>>    We titled this section "foreign libraries" because although the name
>> "foreign" doesn't leak into the API, the world of C really is foreign
>> to Scheme - and that estrangement extends to components of foreign
>> libraries as well, as we see in future sections.
>
> I'm not sure what the message is here.

Probably me being too cutesy, I would imagine. The facility is typically
called a "foreign function interface", but that name doesn't appear in
e.g. "dynamic-link", so I was trying to explain.

Beyond that I guess I meant to say that "native" depends on where you're
coming from; that Scheme calls are native to Scheme, and C calls are
foreign to Scheme.

>>  -- Scheme Procedure: dynamic-link [library]
>>  -- C Function: scm_dynamic_link (library)
>
> Code below implies that library can be omitted, and that this -
> i.e. '(dynamic-link)' - means to return an object representing libguile
> itself.  Should that be mentioned in the following doc?
>
>>      Find the shared library denoted by LIBRARY (a string) and link it
>>      into the running Guile application.  When everything works out,
>>      return a Scheme object suitable for representing the linked object
>>      file.  Otherwise an error is thrown.  How object files are
>>      searched is system dependent.
>>
>>      Normally, LIBRARY is just the name of some shared library file
>>      that will be searched for in the places where shared libraries
>>      usually reside, such as in `/usr/lib' and `/usr/local/lib'.
>>
>>      When LIBRARY is omitted, a "global symbol handle" is returned.
>>      This handle provides access to the symbols available to the
>>      program at run-time, including those exported by the program
>>      itself and the shared libraries already loaded.

I think it is mentioned, no? Is there a way that it can be more clear?

>>    Given some set of C extensions to Guile, the next logical step is to
>> integrate these glue libraries into the module system of Guile so that
>> you can load new primitives into a running system just as you can load
>> new Scheme code.
>>
>>  -- Scheme Procedure: load-extension lib init
>>  -- C Function: scm_load_extension (lib, init)
>>      Load and initialize the extension designated by LIB and INIT.
>>      When there is no pre-registered function for LIB/INIT, this is
>>      equivalent to
>>
>>           (dynamic-call INIT (dynamic-link LIB))
>>
>>      When there is a pre-registered function, that function is called
>>      instead.
>>
>>      Normally, there is no pre-registered function.  This option exists
>>      only for situations where dynamic linking is unavailable or
>>      unwanted.  In that case, you would statically link your program
>>      with the desired library, and register its init function right
>>      after Guile has been initialized.
>
> Should there be a reference from here to wherever the registration API
> is covered?

Probably. Is it documented somewhere? :) I think no. I would doc it
here, fwiw...

>>      LIB should be a string denoting a shared library without any file
>>      type suffix such as ".so".  The suffix is provided automatically.
>>      It should also not contain any directory components.  Libraries
>>      that implement Guile Extensions should be put into the normal
>>      locations for shared libraries.  We recommend to use the naming
>>      convention libguile-bla-blum for a extension related to a module
>>      `(bla blum)'.
>
> I believe this will shortly be out of date, won't it? - given our desire
> to support parallel installs.

Hm, good point; though if it is installed into the extensionsdir as
suggested below, we do work around this issue.

>>    A compiled module should have a specially named "module init
>> function".  Guile knows about this special name and will call that
>> function automatically after having linked in the shared library.  For
>> our example, we replace `init_math_bessel' with the following code in
>> `bessel.c':
>>
>>      void
>>      init_math_bessel (void *unused)
>>      {
>>        scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
>>        scm_c_export ("j0", NULL);
>>      }
>>
>>      void
>>      scm_init_math_bessel_module ()
>>      {
>>        scm_c_define_module ("math bessel", init_math_bessel, NULL);
>>      }
>>
>>    The general pattern for the name of a module init function is:
>> `scm_init_', followed by the name of the module where the individual
>> hierarchical components are concatenated with underscores, followed by
>> `_module'.
>
> Is this still correct?  IIUC it only makes sense as part of the ability
> we once had for a (use-modules (...)) call to find a .so and bootstrap
> it automatically.  (Unless that has been reinstated...)

It has not been reinstated. However SWIG seems to use this facility --
doing a (load-extension ...) call to load up a module, then you use the
module.

>>    Presently there's no convention for having a Guile version number in
>> module C code filenames or directories.  This is primarily because
>> there's no established principles for two versions of Guile to be
>> installed under the same prefix (eg. two both under `/usr').  Assuming
>> upward compatibility is maintained then this should be unnecessary, and
>> if compatibility is not maintained then it's highly likely a package
>> will need to be revisited anyway.
>>
>>    The present suggestion is that modules should assume when they're
>> installed under a particular `prefix' that there's a single version of
>> Guile there, and the `guile-config' at build time has the necessary
>> information about it.  C code or Scheme code might adapt itself
>> accordingly (allowing for features not available in an older version
>> for instance).
>
> I guess this also needs updating, for the new parallel install vision.

Probably; there is $extensionsdir, but I am not finding it in this
chapter; durnit. Here's the NEWS entry:

** Dynamically loadable extensions may be placed in a Guile-specific path

Before, Guile only searched the system library paths for extensions
(e.g. /usr/lib), which meant that the names of Guile extensions had to
be globally unique. Installing them to a Guile-specific extensions
directory is cleaner. Use `pkg-config --variable=extensionsdir
guile-2.0' to get the location of the extensions directory.


>> 0.1.5 Foreign Pointers
>> ----------------------
>>
>> The previous sections have shown how Guile can be extended at runtime by
>> loading compiled C extensions. This approach is all well and good, but
>> wouldn't it be nice if we didn't have to write any C at all? This
>> section takes up the problem of accessing C values from Scheme, and the
>> next discusses C functions.
>>
>> 0.1.5.1 Foreign Types
>> .....................
>>
>> The first impedance mismatch that one sees between C and Scheme is that
>> in C, the storage locations (variables) are typed, but in Scheme types
>> are associated with values, not variables. *Note Values and Variables::.
>
> Fine, but...
>
>>    So when accessing a C value through a Scheme pointer, we must give
>> the type of the pointed-to value explicitly, as a parameter to any
>> Scheme procedure that accesses the value.
>
> This confused me at first.  I think I understand the point now, but
>
> - isn't it actually much more to do with the ELF binary format, rather
>   than with C?  If libguile could read and parse C, it would be able to
>   infer the type of any variable that the Scheme layer might request.
>   The problem is precisely that what we are linking with is *not* C
>   anymore...  It's just untyped pointers.

I guess you're right, this is confusing. C doesn't really exist at
runtime, and this API is all about accessing runtime values.

> - I think "give the type ... as a parameter to any Scheme procedure that
>   accesses the value" is misleading, because we don't do that!  Rather,
>   we construct a box that includes both the pointer and the type, and
>   then pass the box around.

True, though there are void pointers, which can be treated as raw memory
arrays, and parsed with the bytevector functions. But agreed, "as a
parameter" is incorrect.

>> 0.1.5.2 Foreign Variables
>> .........................
>>
>> Given the types defined in the previous section, C pointers may be
>> looked up dynamically using `dynamic-pointer'.
>>
>>  -- Scheme Procedure: dynamic-pointer name type dobj [len]
>>  -- C Function: scm_dynamic_pointer (name, type, dobj, len)
>>      Return a "handle" for the pointer NAME in the shared object
>>      referred to by DOBJ. The handle aliases a C value, and is declared
>>      to be of type TYPE. Valid types are defined in the `(system
>>      foreign)' module.
>>
>>      This facility works by asking the dynamic linker for the address
>>      of a symbol, then assuming that it aliases a value of a given
>>      type. Obviously, the user must be very careful to ensure that the
>>      value actually is of the declared type, or bad things will happen.
>>
>>      Regardless whether your C compiler prepends an underscore `_' to
>>      the global names in a program, you should *not* include this
>>      underscore in NAME since it will be added automatically when
>>      necessary.
>>
>>    For example, currently Guile has a variable, `scm_numptob', as part
>> of its API. It is declared as a C `long'. So, to create a handle
>> pointing to that foreign value, we do:
>>
>>      (use-modules (system foreign))
>>      (define numptob (dynamic-pointer "scm_numptob" long (dynamic-link)))
>>      numptob
>>      => #<foreign int32 8>
>>
>>    A value returned by `dynamic-pointer' is a Scheme wrapper for a C
>> pointer, with additional type information. A foreign pointer prints
>> according to its type. This example showed that a `long' on this
>> platform is an `int32', and that the value pointed to by `numptob' is 8.
>
> I think the terminology is confusing here in two ways.
>
> 1. The API and the doc call these objects pointers, but because of the
> automatic dereference they don't behave like pointers at all.  (Their
> print function prints *p, not p, and foreign-set! does *p = val, not p =
> val.)

That is the case for non-void pointers, yes; but dynamic-pointer does
not give you a value. Perhaps as you mention a "reference" would be less
ambiguous; or perhaps more?

Perhaps we should make these things print like #<foreign-pointer
*0xdeadbeef = (int32)8> or something? (Or, as mentioned below, just as
#<foreign-pointer 0xdeadbeef> ?)
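
As a rough illustration of the first format, here is a C sketch of a
printer that shows both the address and the dereferenced value. The
function name is hypothetical, and the exact text produced by %p is
implementation-defined, so the address part will not look the same
everywhere.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical printer for a pointer to int32: shows the address and,
   after "=", the dereferenced value.  Returns what snprintf returns.  */
int
print_foreign_int32 (char *buf, size_t size, const int32_t *p)
{
  return snprintf (buf, size,
                   "#<foreign-pointer *%p = (int32)%" PRId32 ">",
                   (const void *) p, *p);
}
```

Printing both halves keeps the object looking like a pointer while
still showing what it currently points at.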

> 2. An object created by '(dynamic-pointer ...)' prints as '#<foreign
> ...>'.  If you think that foreign is the best word for this whole
> area (and I think it's fine), I think you should bite the bullet and
> make all the APIs say 'foreign' instead of 'dynamic'.  (And obviously
> keep the 'dynamic' names of 1.8.x APIs as aliases.)

Hmmmmmmmmmmmmmmmmmmmmm. But you can make foreign pointers in other ways
than from dlsym() -- for example, the return value of a function. I
agree, though, that "dynamic-pointer" is confusing, probably because
"pointer" is a noun and not a verb like "link".

Can you think of a better name for "dynamic-pointer"?

>> 0.1.5.3 Void Pointers and Byte Access
>> .....................................
>>
>> As a special case, a dynamic pointer may be declared to point to type
>> `void', in which case it is treated as a void pointer. A void pointer
>> prints its value as a pointer, without dereferencing the pointer.
>>
>>    It's important at this point to conceptually separate foreign values
>> from foreign pointers. `dynamic-pointer' gives you a foreign pointer. A
>> foreign value is the semantic meaning of the bytes pointed to by a
>> pointer. Only foreign pointers may be wrapped in Scheme. One may make a
>> pointer to a foreign value, and wrap that as a Scheme object, but a
>> bare foreign value may not be wrapped.
>
> I'm not getting the distinction here at all.  Is it important for what
> follows?

Maybe not. Perhaps it's just a vestigial remnant of my personal process
of understanding these things. But you haven't gotten to functions yet,
in which foreign values need to be passed as values and not as pointers.

>>    As an example, `(dynamic-pointer "foo" void bar-lib)' links in the
>> FOO symbol in the BAR-LIB library as a pointer to `void': a `void*'.
>>
>>    Void pointers may be accessed as bytevectors.
>>
>>  -- Scheme Procedure: foreign->bytevector foreign [uvec_type [offset
>>           [len]]]
>>  -- C Function: scm_foreign_to_bytevector foreign uvec_type offset len
>>      Return a bytevector aliasing the memory pointed to by FOREIGN.
>>
>>      FOREIGN must be a void pointer, a foreign whose type is VOID. By
>>      default, the resulting bytevector will alias all of the memory
>>      pointed to by FOREIGN, from beginning to end, treated as a `vu8'
>>      array.
>
> It feels like we're missing a unification trick here.
>
> Thought #1: if we have, e.g., an int8 pointer ip, why not just use
> (foreign-ref ip n) to interpret the pointer as pointing to an array, and
> get its nth element?
>
> Thought #2: but if we do that we'll be duplicating the bytevector API.

Right.

> So instead, shouldn't the fundamental operation be (foreign->bytevector
> NAME TYPE LIBRARY [LEN]), and get/set then done using the bytevector
> API?

Perhaps. foreign-ref and foreign-set! aren't actually used anywhere in
Guile, so perhaps they should go. They just seemed convenient. But maybe
convenience shouldn't be a concern of a low-level FFI. I am inclined to
agree with you.
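
To make the "alias the memory, then use the bytevector API" idea
concrete, here is a minimal C sketch of what such a view amounts to. It
is only an illustration, not Guile's implementation: `byte_view',
`view_bytes', `view_ref' and `view_set' are hypothetical names, and the
caller is assumed to know how large the underlying region really is,
since an untyped pointer carries no length of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a bounded window onto foreign memory.
   It owns nothing; it only aliases bytes that live elsewhere. */
typedef struct {
    uint8_t *data;
    size_t   len;
} byte_view;

/* Alias LEN bytes starting OFFSET bytes past PTR.  The caller must
   guarantee that the region is at least OFFSET + LEN bytes long. */
static byte_view view_bytes(void *ptr, size_t offset, size_t len)
{
    byte_view v = { (uint8_t *)ptr + offset, len };
    return v;
}

/* Bounds-checked read, the analogue of a bytevector "ref". */
static uint8_t view_ref(byte_view v, size_t i)
{
    assert(i < v.len);
    return v.data[i];
}

/* Bounds-checked write; it changes the original memory, because the
   view is an alias and not a copy. */
static void view_set(byte_view v, size_t i, uint8_t x)
{
    assert(i < v.len);
    v.data[i] = x;
}
```

With a view like this, a separate element accessor on the pointer
itself adds nothing: indexing, bounds checks and writes all go through
the one byte-oriented interface.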

> I'm not sure either of those thoughts is right, but the current API
> doesn't feel as elegant as I think it could be.

I agree that it has that kind of "off" feel, but I think that if you
read on to structs and functions, those sections will clarify your
objections.

How are we to handle these changes? I feel like the manual would end up
better if you did it, because my mind is clouded with the
implementation; yours is fresh, and would do a better job explaining.
Also, those docs were quite a slog to write in the first place ;) What
do you think?

Awaiting your next dispatch!

Andy
-- 
http://wingolog.org/





* Re: ffi docs
  2010-04-16  9:33   ` Andy Wingo
@ 2010-04-16 22:34     ` Neil Jerram
  2010-04-17 10:38       ` Andy Wingo
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Jerram @ 2010-04-16 22:34 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Hi,
>
> Thanks for the feedback!

Thanks for the quick responses.  As far as most of them are concerned,
I'll read the section on foreign functions first before offering more
thoughts, as you suggest.  Also I agree that it will make sense for me
to handle these doc changes.

On a couple of specific points....

> On Fri 16 Apr 2010 00:36, Neil Jerram <neil@ossau.uklinux.net> writes:
>
>>>    But yet we as programmers live in both worlds, and Guile itself is
>>> half implemented in C. So it is that Guile's living half pays respect
>>> to its dead counterpart, via a spectrum of interfaces to C ranging from
>>> dynamic loading of Scheme primitives to dynamic binding of stock C
>>> library prodedures.
>>
>> c -----------^
>
> What does this mean?

I'm sorry!  It was supposed to indicate a typo, and that there should be
a "c" at the place the caret points - i.e. "procedures" rather than
"prodedures".  Anyway, I'll handle this.

>>>    We titled this section "foreign libraries" because although the name
>>> "foreign" doesn't leak into the API, the world of C really is foreign
>>> to Scheme - and that estrangement extends to components of foreign
>>> libraries as well, as we see in future sections.
>>
>> I'm not sure what the message is here.
>
> Probably me being too cutesy, I would imagine.

I didn't think that!

> The facility is typically
> called a "foreign function interface", but that name doesn't appear in
> e.g. "dynamic-link", so I was trying to explain.

Ah yes, I see now.  In that case I think it's just the last clause that
doesn't quite work for me.  I would say that the _immediately_ following
text is about "components of foreign libraries", so why say "as we see
in future sections"?

Maybe: "... really is foreign to Scheme.  Foreign function and data
pointers, obtained via `dynamic-func' and `dynamic-pointer', or as
return values from a foreign function call, are inherently untyped, and
depend on the Scheme programmer using them in a way that is consistent
with the library's documented interface.  Any other usage is unsafe and
can easily cause the containing Scheme program to crash, and the Guile
low-level FFI cannot protect against this.  (This is quite different
from computations on Scheme values; Scheme values are typed, and so
operations on them can check in advance that the value types are as
expected.)"

>> Code below implies that library can be omitted, and that this -
>> i.e. '(dynamic-link)' - means to return an object representing libguile
>> itself.  Should that be mentioned in the following doc?
>>
>>>      Find the shared library denoted by LIBRARY (a string) and link it
>>>      into the running Guile application.  When everything works out,
>>>      return a Scheme object suitable for representing the linked object
>>>      file.  Otherwise an error is thrown.  How object files are
>>>      searched is system dependent.
>>>
>>>      Normally, LIBRARY is just the name of some shared library file
>>>      that will be searched for in the places where shared libraries
>>>      usually reside, such as in `/usr/lib' and `/usr/local/lib'.
>>>
>>>      When LIBRARY is omitted, a "global symbol handle" is returned.
>>>      This handle provides access to the symbols available to the
>>>      program at run-time, including those exported by the program
>>>      itself and the shared libraries already loaded.
>
> I think it is mentioned, no? Is there a way that it can be more clear?

I'm sorry.  I can't believe I missed that.  I looked for it so
carefully!
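
For anyone following along at the C level, the omitted-LIBRARY case
looks roughly like the sketch below. It assumes a POSIX system and
assumes that the "global symbol handle" corresponds to what
dlopen(NULL, ...) returns; that mapping is my assumption, not something
stated in the doc text. `lookup_global' and `strlen_via_dlsym' are
hypothetical helper names. Note that dlsym() hands back a bare `void *':
the caller alone supplies the type.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look NAME up among the symbols visible to the
   running program (the program itself plus the shared libraries it
   has already loaded).  Returns NULL if the lookup fails. */
static void *lookup_global(const char *name)
{
    void *handle = dlopen(NULL, RTLD_LAZY);  /* NULL: the program itself */
    if (handle == NULL)
        return NULL;
    void *sym = dlsym(handle, name);
    dlclose(handle);  /* drops our reference; the program stays loaded */
    return sym;
}

/* The type the caller claims for the symbol.  Nothing checks it. */
typedef size_t (*strlen_fn)(const char *);

/* Call the C library's strlen through an untyped lookup.
   Returns (size_t)-1 if the symbol could not be found. */
static size_t strlen_via_dlsym(const char *s)
{
    strlen_fn fn;
    /* Store through a void ** because ISO C defines no conversion
       from an object pointer to a function pointer. */
    *(void **)(&fn) = lookup_global("strlen");
    return fn ? fn(s) : (size_t)-1;
}
```

On older glibc versions this needs `-ldl' at link time. Claiming a
different signature for the same symbol would compile just as happily,
and calling through it would be undefined behaviour.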

>> - isn't it actually much more to do with the ELF binary format, rather
>>   than with C?  If libguile could read and parse C, it would be able to
>>   infer the type of any variable that the Scheme layer might request.
>>   The problem is precisely that what we are linking with is *not* C
>>   anymore...  It's just untyped pointers.
>
> I guess you're right, this is confusing. C doesn't really exist at
> runtime, and this API is all about accessing runtime values.

On further reflection, I think I'm not completely right.  Functions
assume that they will be called according to well known C calling
conventions.  So I guess there are vestiges of C.

Out of interest, do other languages that compile to library format use
different calling conventions, and if so can dlopen/dlsym and FFIs work
with them?

More tomorrow...

Regards,
      Neil





* Re: ffi docs
  2010-04-16 22:34     ` Neil Jerram
@ 2010-04-17 10:38       ` Andy Wingo
  0 siblings, 0 replies; 9+ messages in thread
From: Andy Wingo @ 2010-04-17 10:38 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

Greets,

On Sat 17 Apr 2010 00:34, Neil Jerram <neil@ossau.uklinux.net> writes:

> I agree that it will make sense for me to handle these doc changes.

Thank you very much for this; I have been writing too many docs
recently, and long for the hack.

>> The facility is typically
>> called a "foreign function interface", but that name doesn't appear in
>> e.g. "dynamic-link", so I was trying to explain.
>
> Ah yes, I see now.  In that case I think it's just the last clause that
> doesn't quite work for me.  I would say that the _immediately_ following
> text is about "components of foreign libraries", so why say "as we see
> in future sections"?
>
> Maybe: "... really is foreign to Scheme.  Foreign function and data
> pointers, obtained via `dynamic-func' and `dynamic-pointer', or as
> return values from a foreign function call, are inherently untyped, and
> depend on the Scheme programmer using them in a way that is consistent
> with the library's documented interface.  Any other usage is unsafe and
> can easily cause the containing Scheme program to crash, and the Guile
> low-level FFI cannot protect against this.  (This is quite different
> from computations on Scheme values; Scheme values are typed, and so
> operations on them can check in advance that the value types are as
> expected.)"

That looks great to me. I also think it's nice to preface this section,
as you do, with a brief mention of unsafety.

>>> - isn't it actually much more to do with the ELF binary format, rather
>>>   than with C?  If libguile could read and parse C, it would be able to
>>>   infer the type of any variable that the Scheme layer might request.
>>>   The problem is precisely that what we are linking with is *not* C
>>>   anymore...  It's just untyped pointers.
>>
>> I guess you're right, this is confusing. C doesn't really exist at
>> runtime, and this API is all about accessing runtime values.
>
> On further reflection, I think I'm not completely right.  Functions
> assume that they will be called according to well known C calling
> conventions.  So I guess there are vestiges of C.
>
> Out of interest, do other languages that compile to library format use
> different calling conventions, and if so can dlopen/dlsym and FFIs work
> with them?

AFAIK, the only thing you should rely on is the standard ABI for your
architecture. For example for Intel 386, there is

  http://www.sco.com/developers/devspecs/abi386-4.pdf

Yes, hosted on SCO. See chapter 3. I believe there are similar documents
for other architectures. There are mentions of C there, but there are
also attempts to make it known that it's the convention that matters,
not the language.

Not sure if that helps, though. It certainly doesn't have to do with
ELF, as far as I know.

Andy
-- 
http://wingolog.org/





* Re: ffi docs
  2010-04-15 22:36 ` Neil Jerram
  2010-04-16  8:43   ` Ludovic Courtès
  2010-04-16  9:33   ` Andy Wingo
@ 2010-07-27  8:24   ` Ludovic Courtès
  2 siblings, 0 replies; 9+ messages in thread
From: Ludovic Courtès @ 2010-07-27  8:24 UTC (permalink / raw)
  To: guile-devel

Hi!

Neil Jerram <neil@ossau.uklinux.net> writes:

>>    Void pointers may be accessed as bytevectors.
>>
>>  -- Scheme Procedure: foreign->bytevector foreign [uvec_type [offset
>>           [len]]]
>>  -- C Function: scm_foreign_to_bytevector foreign uvec_type offset len
>>      Return a bytevector aliasing the memory pointed to by FOREIGN.
>>
>>      FOREIGN must be a void pointer, a foreign whose type is VOID. By
>>      default, the resulting bytevector will alias all of the memory
>>      pointed to by FOREIGN, from beginning to end, treated as a `vu8'
>>      array.
>
> It feels like we're missing a unification trick here.
>
> Thought #1: if we have, e.g., an int8 pointer ip, why not just use
> (foreign-ref ip n) to interpret the pointer as pointing to an array, and
> get its nth element?
>
> Thought #2: but if we do that we'll be duplicating the bytevector API.
> So instead, shouldn't the fundamental operation be (foreign->bytevector
> NAME TYPE LIBRARY [LEN]), and get/set then done using the bytevector
> API?

Andy and I discussed it at GHM, which led to this simplification of the API:

  http://git.savannah.gnu.org/cgit/guile.git/commit/?id=d4149a510e4a87915b625255f4de3301510d810c

There are a few more changes coming: renaming some of the procedures
from ‘foreign’ to ‘pointer’, and adding more convenience procedures
(C string manipulation notably).

Thanks,
Ludo’.





Thread overview: 9+ messages
2010-04-06 21:36 ffi docs Andy Wingo
2010-04-07 21:38 ` Ludovic Courtès
2010-04-07 22:01   ` Andy Wingo
2010-04-15 22:36 ` Neil Jerram
2010-04-16  8:43   ` Ludovic Courtès
2010-04-16  9:33   ` Andy Wingo
2010-04-16 22:34     ` Neil Jerram
2010-04-17 10:38       ` Andy Wingo
2010-07-27  8:24   ` Ludovic Courtès
