From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Neil Jerram Newsgroups: gmane.lisp.guile.devel Subject: Re: review/merge request: wip-array-refactor Date: Wed, 22 Jul 2009 22:48:14 +0100 Message-ID: <874ot48k4h.fsf@arudy.ossau.uklinux.net> References: NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1248299359 25040 80.91.229.12 (22 Jul 2009 21:49:19 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 22 Jul 2009 21:49:19 +0000 (UTC) Cc: guile-devel To: Andy Wingo Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Wed Jul 22 23:49:11 2009 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1MTjgR-0006Tp-Pk for guile-devel@m.gmane.org; Wed, 22 Jul 2009 23:49:10 +0200 Original-Received: from localhost ([127.0.0.1]:48365 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MTjgR-00088n-8S for guile-devel@m.gmane.org; Wed, 22 Jul 2009 17:49:07 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1MTjgJ-00087O-3r for guile-devel@gnu.org; Wed, 22 Jul 2009 17:48:59 -0400 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1MTjgE-000855-J3 for guile-devel@gnu.org; Wed, 22 Jul 2009 17:48:58 -0400 Original-Received: from [199.232.76.173] (port=60078 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MTjgE-00084u-DW for guile-devel@gnu.org; Wed, 22 Jul 2009 17:48:54 -0400 Original-Received: from mail3.uklinux.net ([80.84.72.33]:47497) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1MTjgD-0003nz-Cw for guile-devel@gnu.org; Wed, 22 Jul 2009 17:48:54 -0400 Original-Received: from arudy (host86-152-99-133.range86-152.btcentralplus.com [86.152.99.133]) by mail3.uklinux.net (Postfix) with ESMTP id B31801F67FF; Wed, 22 Jul 2009 22:48:15 +0100 (BST) Original-Received: from arudy.ossau.uklinux.net (arudy [127.0.0.1]) by arudy (Postfix) with ESMTP id BEC1A3801F; Wed, 22 Jul 2009 22:48:14 +0100 (BST) User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux) X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.4-2.6 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:8919 Archived-At: Andy Wingo writes: > Hi all, > > I've finished up my refactor of Guile's arrays. To my eye it's much > nicer now. Yes, to me too. But I have two overall questions in mind. - What do you have in mind as regards releasing this? Even though it looks good, I think it would be better to let it mature for a while, and hence not to put it into 1.9.x/2.0. (And we're not short of new stuff in 1.9.x/2.0!) - What about the header file incompatibilities? I.e. the problem that if there are applications out that that #include , or another of the renamed ones, they will stop working. Do you think we are OK just to document this well, or should we do more than that? > The only bits I could anticipate being controversial would be the last > two or three patches, in which bytevectors are given an "element type" > field. This is so that I can make the srfi-4 uniform vector code use > bytevectors as their representation -- it's one less orthogonal type > that the VM would have to deal with. > > Those patches also allow uniform vectors of one type to be accessed > using accessors for other types: > > (u8vector-ref #u32(#xffffffff) 0) => 255 > > Although: > > (u8vector? #u32(#xffffffff)) => #f > > But: > > (bytevector? #u32(#xffffffff)) => #t What is the detailed rationale for that? If it can be explained in a way that doesn't sound like nonsense, fine. > Anyway, logs are here, in chronological order. Let me know what you > think. Make check should pass at each of these patches. This is on the > "wip-array-refactor" branch, which I had been rebasing, but I don't plan > to rebase it again. > > Andy > > commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320 > Author: Andy Wingo > Date: Thu Jul 16 21:51:47 2009 +0200 > > remove deprecated functions from unif.c > > * libguile/unif.h: > * libguile/unif.c: Remove deprecated functions. > > * module/ice-9/deprecated.scm: Remove array-related deprecated > functions. > > * NEWS: Update. I can understand why a lot of the NEWS deltas evaporate when they are logically merged into the 1.8-2.0 section, but I'm not sure about - the entry for Brainfuck - the entry on scm_set_port_seek and scm_set_port_truncate Aren't these entries still of interest? The rest of this commit looks fine. > commit 4b126598445c4f12c0aebca2adfaa45f3edd86ab > Author: Andy Wingo > Date: Thu Jul 16 22:02:25 2009 +0200 > > remove convert.{c,i.c,h} > > * libguile/convert.c: > * libguile/convert.h: > * libguile/convert.i.c: Remove these functions, which were undocumented, > not in the libguile/ header, and thus unlikely to have been used. I'm happy with removing these; but in case anyone _is_ using them it would be helpful to provide (in NEWS) a mapping from them to whatever people should use instead. Is that possible? > commit a4a0d399c877cb802cdaf2c48713d3377a412c4f > Author: Andy Wingo > Date: Thu Jul 16 22:15:04 2009 +0200 > > clean up libguile/Makefile.am > > * libguile/Makefile.am: Clean up some of the file lists, should make > future diffs easier to parse. Fine. > commit b6149d8d9f35c8091a31b12fb3aeecee0e3a470c > Author: Andy Wingo > Date: Fri Jul 17 00:16:43 2009 +0200 > > rename scm_i_make_ra to scm_i_make_array > > * libguile/unif.c (scm_i_make_array): Rename from scm_i_make_ra. All > callers changed. Fine. > commit 5d1b3b2db9349b615baac313ae5a111fa68573ac > Author: Andy Wingo > Date: Fri Jul 17 00:25:49 2009 +0200 > > rename ramap.[ch] to array-map.[ch] > > * libguile/array-map.c: > * libguile/array-map.h: Rename from ramap.c and ramap.h. > > * libguile.h: > * libguile/Makefile.am: > * libguile/eq.c: > * libguile/init.c: > * libguile/sort.c: > * libguile/unif.c: > * libguile/vectors.c: All referrers changed. Compatibility issue here: #include will stop working. How easy is it for an application writer to test for ramap.h or array-map.h and #include the right one? Depending on the answer to that, I guess we should either cover this change in NEWS, or install an ramap.h that just #includes array-map.h. > commit c53c0893a3bad3312230003707f71c2f441460d4 > Author: Andy Wingo > Date: Fri Jul 17 00:47:31 2009 +0200 > > parts of unif.[ch] to array-handle.[ch] > > * libguile/array-handle.c: > * libguile/array-handle.h: Move some parts of unif.c and unif.h to these > new files. > > * libguile/unif.c: > * libguile/unif.h: Update includers. Since unif.h depends on the array > handle type, we include array-handle.h, which also means that there > will be no difference for our callers. > > * libguile/init.c: Call scm_init_array_handle, though it does nothing as > of yet. > > * libguile/Makefile.am: Adapt for new files. Nice. > commit cf396142405d9076cc20eca9bf53376e80359a56 > Author: Andy Wingo > Date: Fri Jul 17 00:58:32 2009 +0200 > > bitvector exodus from unif.[ch] > > * libguile/Makefile.am: > * libguile/unif.c: > * libguile/unif.h: > * libguile/bitvectors.c: > * libguile/bitvectors.h: Move bitvector functionality out of unif.[ch]. > > * libguile/array-handle.c: > * libguile/array-map.c: > * libguile/init.c: > * libguile/read.c: > * libguile/srfi-4.c: > * libguile/vectors.c: Oh, what a tangled web we weave... Nice, but again a #include issue. Should we make unif.h include bitvectors.h? > commit 2fa901a51f62da8a01112aefbf687530f4bff160 > Author: Andy Wingo > Date: Fri Jul 17 01:08:35 2009 +0200 > > rename unif.[ch] to arrays.[ch] > > * libguile/Makefile.am: > * libguile/unif.c: > * libguile/unif.h: > * libguile/arrays.c: > * libguile/arrays.h: Rename unif.[ch] to arrays.[ch]. > > * libguile.h: > * libguile/array-handle.c: > * libguile/array-map.c: > * libguile/bitvectors.c: > * libguile/bytevectors.c: > * libguile/eq.c: > * libguile/gc-card.c: > * libguile/gc-malloc.c: > * libguile/gc-mark.c: > * libguile/gc.c: > * libguile/init.c: > * libguile/inline.h: > * libguile/print.c: > * libguile/random.c: > * libguile/read.c: > * libguile/socket.c: > * libguile/sort.c: > * libguile/srfi-4.c: > * libguile/srfi-4.h: > * libguile/strports.c: > * libguile/vectors.c: > * libguile/vectors.h: Update includers. Same again; no more #include . What to do about that? > commit 2a610be59412a9d633a373c6f6ec4d4794c40fd8 > Author: Andy Wingo > Date: Sun Jul 19 15:04:40 2009 +0200 > > add generic array implementation facility > > * libguile/array-handle.c (scm_i_register_array_implementation): > (scm_i_array_implementation_for_obj): Add generic array facility, > which will (in a few commits) detangle the array code. > (scm_array_get_handle): Use the generic array facility. Note that > scm_t_array_handle no longer has ref and set function pointers; > instead it has a pointer to the array implementation. It is unlikely > that code out there used these functions, however, as the supported > way was through scm_array_handle_ref/set_x. > (scm_array_handle_pos): Move this function here from arrays.c. > (scm_array_handle_element_type): New function, returns a Scheme value > representing the type of element stored in this array. > > * libguile/array-handle.h (scm_t_array_element_type): New enum, for > generically determining the type of an array. > (scm_array_handle_rank): > (scm_array_handle_dims): These are now just #defines. > > * libguile/arrays.c: > * libguile/bitvectors.c: > * libguile/bytevectors.c: > * libguile/srfi-4.c: > * libguile/strings.c: > * libguile/vectors.c: Register array implementations for all of these. > > * libguile/inline.h: Update for array_handle_ref/set change. > * libguile/deprecated.h: Need to include arrays.h now. const SCM * scm_array_handle_elements (scm_t_array_handle *h) { - SCM vec = h->array; - if (SCM_I_ARRAYP (vec)) - vec = SCM_I_ARRAY_V (vec); - if (SCM_I_IS_VECTOR (vec)) - return SCM_I_VECTOR_ELTS (vec) + h->base; - scm_wrong_type_arg_msg (NULL, 0, h->array, "non-uniform array"); + if (h->element_type != SCM_ARRAY_ELEMENT_TYPE_SCM) + scm_wrong_type_arg_msg (NULL, 0, h->array, "non-uniform array"); + return ((const SCM*)h->elements) + h->base; Isn't this a significant C API change - i.e. that this function now only works for non-uniform arrays? Ditto scm_array_handle_writable_elements. +#define SCM_ARRAY_IMPLEMENTATION(tag_,mask_,vref_,vset_,handle_) \ + SCM_SNARF_INIT ({ \ + scm_t_array_implementation impl; \ + impl.tag = tag_; impl.mask = mask_; \ + impl.vref = vref_; impl.vset = vset_; \ + impl.get_handle = handle_; \ + scm_i_register_array_implementation (&impl); \ + }) I like this! Much nicer than equivalent documentation or comments saying how an implementation has to register itself. + /* perhaps should catch overflow here too */ Don't understand. Could you expand this comment a bit? + scm_out_of_range ("vector-handle-ref", scm_from_size_t (idx)); I worry slightly here that "vector-handle-ref" will appear in a Scheme level error message, and that no one will know what that means. Everything else here looks great. > commit 66b9d7d304a349d5bb4f763a47143f09da58d97f > Author: Andy Wingo > Date: Fri Jul 17 12:45:24 2009 +0200 > > remove enclosed arrays > > * libguile/arrays.h: > * libguile/array-map.c: > * libguile/arrays.c: > * libguile/deprecated.c: Remove "enclosed arrays". The only user-facing > procedures that this affects are scm_enclose_array / enclose-array. If > enclosed arrays are added back, it should be through the generic array > interface; but really, it sounds like something that would be better > implemented in Scheme. I'm not sure about that. I also haven't fully understood them yet! Do the array mapping functions work on enclosed arrays? If so, wouldn't that be a lot of work to reimplement with a Scheme-level implementation? Did you manage to find out anything about who added enclosed arrays, and why? Also, is it definitely a problem for the new framework to support them? > commit 1030b45049f564f4abd459abd8e59db34c7867cc > Author: Andy Wingo > Date: Fri Jul 17 18:54:06 2009 +0200 > > move generic array foo out to its own file > > * libguile/arrays.h: > * libguile/arrays.c: > * libguile/generalized-arrays.h: > * libguile/generalized-arrays.c: Move some generic functionality out of > arrays.c to a new file. > > * libguile/array-map.c: > * libguile/deprecated.c: > * libguile/init.c: Update includers. Fine. (Although another header file issue.) > commit f332e9571703ddcd27c51ebe3c847459c2a649b7 > Author: Andy Wingo > Date: Fri Jul 17 19:05:32 2009 +0200 > > generic vector ops to own file > > * libguile/Makefile.am: > * libguile/vectors.c: > * libguile/vectors.h: > * libguile/generalized-vectors.c: > * libguile/generalized-vectors.h: Move generic vector ops off into their > own file too. The implementation is now based on the generic > array-handle infrastructure. > > * libguile.h: > * libguile/array-map.c: > * libguile/bitvectors.c: > * libguile/random.c: > * libguile/srfi-4.c: Update includers. So now a generalized vector is exactly the same as a generalized array with rank 1? Cool. Also seems like we then have a lot more API than we need! > commit 476b894c71b436f3befb8af46b899aaf244763e2 > Author: Andy Wingo > Date: Sat Jul 18 12:18:15 2009 +0200 > > uniform vector functions to their own file > > * libguile/uniform.c: > * libguile/uniform.h: > * libguile/srfi-4.c: > * libguile/srfi-4.h: > * libguile/Makefile.am: Move uniform vector funcs out of srfi-4 to their > own file. > > * libguile.h: > * libguile/arrays.c: > * libguile/bytevectors.c: Update includers. +const size_t scm_i_array_element_type_sizes[SCM_ARRAY_ELEMENT_TYPE_LAST + 1] = { + 0, + 0, + 1, + 8, + 8, 8, + 16, 16, + 32, 32, + 64, 64, + 32, 64, + 64, 128 +}; That's a bit hard to read and maintain on its own. Could it go next to where the element types are defined? + ret = SCM_ARRAY_ELEMENT_TYPE_IS_UNBOXED (h.element_type); Why not use scm_array_handle_uniform_element_size (h) here too? > commit f45eccffa73c043466a4cc0f5037132ee5795eee > Author: Andy Wingo > Date: Sat Jul 18 12:43:54 2009 +0200 > > add registry of vector constructors, make-generalized-vector > > * libguile/generalized-vectors.h: > * libguile/generalized-vectors.c: Add a registry of vector constructors. > (scm_make_generalized_vector): New public function, constructs a > vector of a given type. > > * libguile/bitvectors.c: > * libguile/bytevectors.c: > * libguile/srfi-4.c: > * libguile/strings.c: > * libguile/vectors.c: Register vector constructors. > > * libguile/extensions.c (scm_init_extensions): No need to NULL the list > of registered extensions here, the static init does it for us. Allows > scm_c_register_extension to be called before scm_init_extensions. > > * libguile/init.c (scm_i_init_guile): Move array initialization earlier, > so e.g. scm_init_strings has access to a valid list of array element > types when registering its vector constructor. Can you say more about why we need this? Given the new equivalence between generalized vectors and arrays, won't the array API suffice? +SCM_DEFINE (scm_make_generalized_vector, "make-generalized-vector", 2, 1, 0, + (SCM type, SCM len, SCM fill), + "Make a generalized vector") +#define FUNC_NAME s_scm_make_generalized_vector +{ + int i; + for (i = 0; i < num_vector_ctors_registered; i++) + if (vector_ctors[i].tag == type) I don't see how a Scheme caller can get a type value (which I believe is a string) that is == the registered tag. + scm_init_bitvectors (); + scm_bootstrap_bytevectors (); Can you now combine these two? That would also allow using the nice ARRAY_IMPLEMENTATION and VECTOR_IMPLEMENTATION macros in bytevectors.c. +#define REGISTER(tag, TAG) \ + scm_i_register_vector_constructor \ + (scm_i_array_element_types[SCM_ARRAY_ELEMENT_TYPE_##TAG], \ + scm_make_##tag##vector) + + REGISTER (u8, U8); + REGISTER (s8, S8); + REGISTER (u16, U16); + REGISTER (s16, S16); + REGISTER (u32, U32); + REGISTER (s32, S32); + REGISTER (u64, U64); + REGISTER (s64, S64); + REGISTER (f32, F32); + REGISTER (f64, F64); + REGISTER (c32, C32); + REGISTER (c64, C64); I'd rather see 12 SCM_VECTOR_IMPLEMENTATION calls here. It's a touch more code, but much nicer for someone looking for vector implementations. > commit 943a0a8759504c4a367c1904bef4a8afbc6208dd > Author: Andy Wingo > Date: Sat Jul 18 12:58:37 2009 +0200 > > make-typed-array builds backing vector via make-generalized-vector > > * libguile/arrays.c: Rework to use scm_make_generalized_vector instead > of our own type table. > > * libguile/bitvectors.c: Fix some includes. Nice. Really starting to make good sense now! > commit ac8ed3db31769d7ede5e75fba1697e8dde55fae4 > Author: Andy Wingo > Date: Sat Jul 18 13:26:18 2009 +0200 > > any->u8vector and family now implemented in Scheme > > * module/Makefile.am: > * module/srfi/srfi-4/gnu.scm: New module, for extensions to srfi-4. > Currently defines the any->FOOvector family. > > * libguile/srfi-4.c: > * libguile/srfi-4.i.c: Dispatch scm_any_to_FOOvector calls to the > scheme-implemented functions in (srfi srfi-4 gnu). +DEFINE_SRFI_4_GNU_PROXIES (u8); +DEFINE_SRFI_4_GNU_PROXIES (s8); +DEFINE_SRFI_4_GNU_PROXIES (u16); +DEFINE_SRFI_4_GNU_PROXIES (s16); +DEFINE_SRFI_4_GNU_PROXIES (u32); +DEFINE_SRFI_4_GNU_PROXIES (s32); +DEFINE_SRFI_4_GNU_PROXIES (u64); +DEFINE_SRFI_4_GNU_PROXIES (s64); +DEFINE_SRFI_4_GNU_PROXIES (f32); +DEFINE_SRFI_4_GNU_PROXIES (f64); +DEFINE_SRFI_4_GNU_PROXIES (c32); +DEFINE_SRFI_4_GNU_PROXIES (c64); Again, for the sake of grepping, I'd rather see 12 DEFPROXY100 calls here. > commit f332089ed43761440a2a8c272ee61a709b38cc24 > Author: Andy Wingo > Date: Sat Jul 18 13:46:29 2009 +0200 > > bytevector inlinedness indicated by flag, not length > > * libguile/bytevectors.h (SCM_BYTEVECTOR_INLINE_P): Change to check a > flag instead of checking the length of the bytevector. Why do that? This means we have two pieces of information that always have to be in sync - the length and the flag - instead of just one (the length), and I don't believe that accessing the flag is any quicker than the length. Also you've had to add more code to keep the flag set correctly. > * libguile/bytevectors.c (make_bytevector_from_buffer): Handle the len > <= inline threshold case as well. Set the inline flag as > appropriate. OK. > (make_bytevector): Updat the inline flag as appropriate. Your changes here don't seem to add anything, or make the code clearer. Am I missing something? > (scm_c_take_bytevector): Just dispatch to > make_bytevector_from_buffer. Nice. > (scm_i_shrink_bytevector): Update the inline flag as appropriate. > Update the length when shrinking an already-inlined vector. Cool (except general query above about the inline flag). > (STRING_TO_UTF): Fix some indentation. OK, I'll stop here for now, and finish the rest (hopefully) tomorrow. Thanks for giving this area some serious attention! Neil