From: Marius Vollmer <mvo@zagadka.de>
Subject: The future: accessing vectors, arrays, etc from C
Date: 30 Dec 2004 15:56:06 +0100
Message-ID: <87acrvq1h5.fsf@zagadka.de>
Hi,
after some procrastination, I have more or less convinced myself to
make accessing vectors and arrays (including uniform numeric vectors
and arrays) more difficult from C. Here is how and why. Please
comment!
Right now, Guile's implementation of vectors and arrays is a bit
upside down: arrays are built on top of vectors, with the consequence
that not all one-dimensional arrays can be treated as vectors. While
fixing this, we should also allow future improvements like
copy-on-write sub-arrays, growable vectors etc.
The first step is to come up with a C API that supports rich array
(and vector) implementations. (We already have such an API for
strings, but it is not exported since the string implementation needs
to change significantly once more for Unicode support.)
One observation is that we cannot have an API that locks arrays while
the locker is allowed to run arbitrary code. That would probably very
quickly lead to unmanageable deadlocks.
However, we do want to allow C code access to the raw storage block of an
array, so that it can pass it to external code such as an image
processing library or linear algebra routines.
This can be achieved with a two-level scheme where an array points to
a storage object. The storage object of an array can change over time
(when the array is copied-on-write, say), but storage objects
themselves always point to the same raw memory.
When accessing an array from C, one extracts the storage object and
then only works with that object. The raw memory of the storage
object is guaranteed to stay in place as long as the storage object
itself is protected.
This mechanism is abstracted away via an 'array handle'. You need to
get an array handle and can then access the array memory through this
handle. The handle needs to be protected: this is most easily done by
just placing it on the stack, and then you don't need to 'release' it,
which is very comfortable.
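To make the two-level scheme concrete, here is a rough model in plain C. It is not Guile code: the types 'storage', 'array' and 'handle' and the functions around them are invented for this illustration, and malloc stands in for the garbage collector. It only shows that memory reached through a handle stays in place, but may go stale, once the array switches to a new storage object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model; none of these names exist in Guile.  */

/* A storage object: its memory block never moves or changes size.  */
typedef struct storage
{
  size_t len;
  double *mem;
} storage;

/* An array merely points to its current storage object.  */
typedef struct array
{
  storage *store;
} array;

/* A handle remembers the storage object seen when it was filled.  */
typedef struct handle
{
  storage *store;
} handle;

static storage *
storage_new (size_t len)
{
  storage *s = malloc (sizeof *s);
  assert (s != NULL);
  s->len = len;
  /* Allocate at least one element so the pointer is never NULL.  */
  s->mem = calloc (len ? len : 1, sizeof *s->mem);
  assert (s->mem != NULL);
  return s;
}

/* Extract the storage object once; later accesses go through H only.  */
static void
array_get_handle (const array *a, handle *h)
{
  h->store = a->store;
}

/* What a copy-on-write step might do: give the array fresh storage
   and leave the old storage object, and its memory, untouched.  */
static void
array_unshare (array *a)
{
  storage *fresh = storage_new (a->store->len);
  memcpy (fresh->mem, a->store->mem, fresh->len * sizeof *fresh->mem);
  a->store = fresh;
}
```

In this model, a pointer taken from a handle before array_unshare still reads the old values afterwards, while writes made through the array land in the fresh storage.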
Here are procedures for dealing with such handles.
- void scm_array_get_handle (SCM array, scm_t_array_handle *h);
Fill the array handle H so that it can be used with the procedures
below. The handle H must be visible to the garbage collector,
therefore H must point to a struct on the stack. See also
scm_array_handle_copy.
When ARRAY is not an array, an error is signalled. All kinds of
arrays are acceptable, including uniform numeric arrays, strings,
and bitvectors. To restrict the type of the array, use one of the
scm_array_handle_*_elements functions.
- void scm_vector_get_handle (SCM vec, scm_t_array_handle *);
Like scm_array_get_handle, but only accepts one-dimensional arrays.
- scm_t_array_handle *scm_array_handle_copy (scm_t_array_handle *H)
Make a copy of H and return it. This copy must be freed with
scm_array_handle_free. This function might be useful in
situations where you cannot allocate your handle on the stack.
- void scm_array_handle_free (scm_t_array_handle *H)
Free the array handle H, which _must_ have been created with
scm_array_handle_copy. Normal handles that are allocated on the
stack _must_ _not_ be freed with this procedure.
- size_t scm_array_handle_rank (scm_t_array_handle *);
- scm_t_array_dimension *scm_array_handle_dims (scm_t_array_handle *);
These procedures deliver information about the storage layout of the
array, to be detailed elsewhere.
- SCM scm_array_handle_ref (scm_t_array_handle *, size_t pos);
Return the value at position POS in the storage vector of the
handle. POS can be computed from the layout information above and
must be valid; no range checking is done.
This function works for all kinds of arrays.
- void scm_array_handle_set (scm_t_array_handle *, size_t pos, SCM val);
Set the value at position POS in the storage vector of the handle to
VAL.
- const SCM *scm_array_handle_elements (scm_t_array_handle *);
Return a pointer to the raw memory of a generic (non-uniform) array
for reading. When the array is not a generic one, signal an error.
This pointer is valid as long as the handle is protected. It is
possible that the representation of the original array changes (in a
copy-on-write operation, say) and in that case the pointer returned
by this function will still be valid, but will no longer belong to
the original array. Thus, you might miss modifications to the
array. It is therefore best to refresh the pointer by a new call to
this function from time to time. Exactly how often is up to you.
- SCM *scm_array_handle_writable_elements (scm_t_array_handle *);
Like scm_array_handle_elements, but returns a pointer that is good
for reading and writing.
- size_t scm_array_handle_element_size (scm_t_array_handle *);
- const void *scm_array_handle_untyped_elements (scm_t_array_handle *);
- void *scm_array_handle_untyped_writable_elements (scm_t_array_handle *);
Like above, but works with any kind of array. You are not allowed
to interpret the values, but you can copy them around with memcpy,
say.
- const scm_t_uint8 *scm_array_handle_u8_elements (scm_t_array_handle *);
- scm_t_uint8 *scm_array_handle_u8_writable_elements (scm_t_array_handle *);
- const scm_t_int8 *scm_array_handle_s8_elements (scm_t_array_handle *);
- scm_t_int8 *scm_array_handle_s8_writable_elements (scm_t_array_handle *);
- ETC
Like scm_array_handle_elements and scm_array_handle_writable_elements,
but for uniform numeric arrays.
- const scm_t_uint32 *scm_array_handle_bit_elements (scm_t_array_handle *);
- scm_t_uint32 *scm_array_handle_bit_writable_elements (scm_t_array_handle *);
For bitvectors.
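Since the storage layout is still to be detailed, here is only a guess at how POS for scm_array_handle_ref might be computed from it. The struct 'dimension' and the function 'position' are made up for this sketch; I assume each dimension carries a length and an increment counted in elements, and that the first element sits at position 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the layout information.  */
typedef struct dimension
{
  size_t len;     /* number of elements along this dimension */
  ptrdiff_t inc;  /* distance, in elements, between neighbours */
} dimension;

/* Turn a multi-dimensional index into a position in the storage
   vector by summing index * increment over all dimensions.  */
static size_t
position (const dimension *dims, size_t rank, const size_t *idx)
{
  ptrdiff_t pos = 0;
  size_t k;

  for (k = 0; k < rank; k++)
    {
      assert (idx[k] < dims[k].len);
      pos += (ptrdiff_t) idx[k] * dims[k].inc;
    }

  /* With the first element at position 0, the result must not be
     negative.  */
  assert (pos >= 0);
  return (size_t) pos;
}
```

For a 2x3 array stored row by row, the dimensions would be {2, 3} and {3, 1}, and index (1, 2) maps to position 5. A transposed view of the same storage just swaps the two entries.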
A typical function that optimizes for f64vectors:
    #include <math.h>
    #include <libguile.h>

    double
    vector_norm (SCM vec)
    {
      scm_t_array_handle h;
      scm_t_array_dimension *dim;
      size_t i;
      double sum = 0;

      /* The handle lives on the stack, so it needs no explicit release.  */
      scm_vector_get_handle (vec, &h);
      dim = scm_array_handle_dims (&h);

      if (scm_is_true (scm_f64vector_p (vec)))
        {
          /* Fast path: walk the raw doubles, stepping by the increment.  */
          const double *elts;

          elts = scm_array_handle_f64_elements (&h);
          for (i = 0; i < dim->len; i++, elts += dim->inc)
            sum += elts[0]*elts[0];
        }
      else
        {
          /* Generic path: works for any kind of one-dimensional array.  */
          size_t pos = 0;

          for (i = 0; i < dim->len; i++, pos += dim->inc)
            {
              double elt = scm_to_double (scm_array_handle_ref (&h, pos));
              sum += elt*elt;
            }
        }

      return sqrt (sum);
    }
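The increment is what lets one loop serve both plain vectors and slices of larger arrays. As a small stand-alone illustration, outside Guile and with a made-up function name, here is the same kind of loop over a bare block of doubles. It indexes from the start of the block rather than advancing the pointer, and returns the sum of squares instead of the norm.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the squares of LEN elements, starting at ELTS and stepping
   INC elements at a time.  Hypothetical helper, not a Guile function.  */
static double
strided_sum_of_squares (const double *elts, size_t len, ptrdiff_t inc)
{
  double sum = 0;
  size_t i;

  assert (elts != NULL || len == 0);
  for (i = 0; i < len; i++)
    {
      double elt = elts[(ptrdiff_t) i * inc];
      sum += elt * elt;
    }
  return sum;
}
```

With a 2x3 matrix stored row by row, a column is reached with a length of 2 and an increment of 3, and a row with a length of 3 and an increment of 1.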
There are and will be alternative and simpler ways to access vectors.
The first is just to use scm_c_vector_ref and scm_c_vector_set_x. A
second is to only work with 'simple' vectors. A simple vector is what
we have now: a simple, non-changing pointer to memory. You can use
the macros SCM_SIMPLE_VECTOR_REF and SCM_SIMPLE_VECTOR_SET with them
but you don't get the full generality.
So what do you say? Is something like the above acceptable? Too
involved? Are there holes in the thinking?
Again, the point is to make it relatively easy to write very general
code when dealing with vectors and arrays, to allow for future
improvements to the array implementation (maybe to the point of
unifying it with the string implementation) and to be thread-safe.
--
GPG: D5D4E405 - 2F9B BCCC 8527 692A 04E3 331E FAF8 226A D5D4 E405
_______________________________________________
Guile-user mailing list
Guile-user@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-user