unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: guile-devel@gnu.org
Subject: Re: review/merge request: wip-array-refactor
Date: Thu, 23 Jul 2009 23:08:47 +0200	[thread overview]
Message-ID: <87ab2vyun4.fsf@gnu.org> (raw)
In-Reply-To: m3my70kc41.fsf@pobox.com

Hello!

Andy Wingo <wingo@pobox.com> writes:

> I've finished up my refactor of Guile's arrays. To my eye it's much
> nicer now.

Hey, great!

Several general remarks:

  * Since I'm conservative and lazy, I'd have happily let this code rest
    in peace.  ;-)

    In particular, last time I checked[*], we had poor coverage of
    {ramap,srfi-4,unif}.c so I'd be afraid of breaking something without
    noticing.

    [*] http://www.fdn.fr/~lcourtes/software/guile/coverage/libguile/index.html

  * As we discussed on IRC, I don't fully understand the rationale
    behind the SRFI-4/bytevector unification.  For anything binary I/O,
    the bytevector API is probably better suited.

    A use case you once mentioned was graphics applications: they may
    get pixel streams (for which f32 vectors are a better fit than
    bytevectors) extracted after doing bytevector-based binary I/O.  I'm
    not fully convinced that this use case justifies jointing the SRFI-4
    and bytevector types.

  * Polymorphism is quite unusual in Scheme APIs.

  * There's definitely room for improvement, and factorization or the
    array/vector code, and a large part of what you've been doing here
    goes in the right direction.

I agree with most of what Neil said, so I won't repeat it here.

> commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 21:51:47 2009 +0200
>
>     remove deprecated functions from unif.c
>     
>     * libguile/unif.h:
>     * libguile/unif.c: Remove deprecated functions.
>     
>     * module/ice-9/deprecated.scm: Remove array-related deprecated
>       functions.
>     
>     * NEWS: Update.

OK.  (I assume the change in `NEWS' is to only keep "Changes in 1.9.2"
at the top and remove "Changes in 1.9.1", right?)

Can already be pushed to `master'.

> commit 4b126598445c4f12c0aebca2adfaa45f3edd86ab
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 22:02:25 2009 +0200
>
>     remove convert.{c,i.c,h}
>     
>     * libguile/convert.c:
>     * libguile/convert.h:
>     * libguile/convert.i.c: Remove these functions, which were undocumented,
>       not in the libguile/ header, and thus unlikely to have been used.

OK; push to `master'.

> commit a4a0d399c877cb802cdaf2c48713d3377a412c4f
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 22:15:04 2009 +0200
>
>     clean up libguile/Makefile.am
>     
>     * libguile/Makefile.am: Clean up some of the file lists, should make
>       future diffs easier to parse.

Good in principle.

(In practice, I fear the day when I'll have to merge that one in the
BDW-GC branch...)

> commit 5d1b3b2db9349b615baac313ae5a111fa68573ac
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 00:25:49 2009 +0200
>
>     rename ramap.[ch] to array-map.[ch]

Neil is concerned about the header renaming.  I think it's OK because
individual headers are not meant to be included individually, i.e., one
is supposed to always #include <libguile.h> (except for
<libguile/i18n.h>).

> commit 2a610be59412a9d633a373c6f6ec4d4794c40fd8
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:04:40 2009 +0200
>
>     add generic array implementation facility

Excellent!

I would just add comments to `scm_t_array_implementation', and pipe the
changes through GNU Ident.  :-)

Also:

+typedef enum {    
+  SCM_ARRAY_ELEMENT_TYPE_SCM = 0, /* SCM values */
+  SCM_ARRAY_ELEMENT_TYPE_CHAR = 1, /* characters */

[...]

+  SCM_ARRAY_ELEMENT_TYPE_C64 = 15,
+  SCM_ARRAY_ELEMENT_TYPE_LAST = 15,
+} scm_t_array_element_type;

Per C99 (and previous versions) §6.7.2.2, this is not necessary:

  If the first enumerator has no =, the value of its enumeration
  constant is 0. Each subsequent enumerator with no = defines its
  enumeration constant as the value of the constant expression obtained
  by adding 1 to the value of the previous enumeration constant.

Up-to-here, I think the changes are mostly non-controversial and would
be OK for 1.9 IMO.

> commit 66b9d7d304a349d5bb4f763a47143f09da58d97f
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 12:45:24 2009 +0200
>
>     remove enclosed arrays

Same as Neil: I'd like to avoid gratuitous incompatibilities if it can
be avoided.

> commit f45eccffa73c043466a4cc0f5037132ee5795eee
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:43:54 2009 +0200
>
>     add registry of vector constructors, make-generalized-vector
>     
>     * libguile/generalized-vectors.h:
>     * libguile/generalized-vectors.c: Add a registry of vector constructors.
>       (scm_make_generalized_vector): New public function, constructs a
>       vector of a given type.
>     
>     * libguile/bitvectors.c:
>     * libguile/bytevectors.c:
>     * libguile/srfi-4.c:
>     * libguile/strings.c:
>     * libguile/vectors.c: Register vector constructors.
>     
>     * libguile/extensions.c (scm_init_extensions): No need to NULL the list
>       of registered extensions here, the static init does it for us. Allows
>       scm_c_register_extension to be called before scm_init_extensions.

This `extensions.c' change should be in a separate commit, I think.

>     * libguile/init.c (scm_i_init_guile): Move array initialization earlier,
>       so e.g. scm_init_strings has access to a valid list of array element
>       types when registering its vector constructor.

This:

-static extension_t *registered_extensions;
+static extension_t *registered_extensions = NULL;

is not needed.

Otherwise OK.

> commit 943a0a8759504c4a367c1904bef4a8afbc6208dd
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:58:37 2009 +0200
>
>     make-typed-array builds backing vector via make-generalized-vector
>     
>     * libguile/arrays.c: Rework to use scm_make_generalized_vector instead
>       of our own type table.
>     
>     * libguile/bitvectors.c: Fix some includes.

OK.  (Is the latter related to the former?)

> commit ac8ed3db31769d7ede5e75fba1697e8dde55fae4
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 13:26:18 2009 +0200
>
>     any->u8vector and family now implemented in Scheme
>     
>     * module/Makefile.am:
>     * module/srfi/srfi-4/gnu.scm: New module, for extensions to srfi-4.
>       Currently defines the any->FOOvector family.
>     
>     * libguile/srfi-4.c:
>     * libguile/srfi-4.i.c: Dispatch scm_any_to_FOOvector calls to the
>       scheme-implemented functions in (srfi srfi-4 gnu).

+#define DEFINE_SCHEME_PROXY100(cname, modname, scmname)                 \

I'd remove "100"...

+#define DEFPROXY100(cname, scmname)               \
+  DEFINE_SCHEME_PROXY100 (cname, MOD, scmname)
+
+#define DEFINE_SRFI_4_GNU_PROXIES(tag)                              \
+  DEFPROXY100 (scm_any_to_##tag##vector, "any->" #tag "vector")

... and keep only one macro, as Neil suggested.

+DEFINE_SRFI_4_GNU_PROXIES (u8);

Remove extraneous semicolons.

+;;; Extensions to SRFI-4
+
+;; 	Copyright (C) 2001, 2002, 2004, 2006, 2009 Free Software Foundation, Inc.

Should be just "2009".

> commit f332089ed43761440a2a8c272ee61a709b38cc24
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 13:46:29 2009 +0200
>
>     bytevector inlinedness indicated by flag, not length
>     
>     * libguile/bytevectors.h (SCM_BYTEVECTOR_INLINE_P): Change to check a
>       flag instead of checking the length of the bytevector.
>     
>     * libguile/bytevectors.c (make_bytevector_from_buffer): Handle the len
>       <= inline threshold case as well. Set the inline flag as appropriate.
>       (make_bytevector): Updat the inline flag as appropriate.
>       (scm_c_take_bytevector): Just dispatch to make_bytevector_from_buffer.
>       (scm_i_shrink_bytevector): Update the inline flag as appropriate.
>       Update the length when shrinking an already-inlined vector.
>       (STRING_TO_UTF): Fix some indentation.

The scm_c_take_bytevector -> make_bytevector_from_buffer shuffling can
be avoided.

 static inline SCM
 make_bytevector (size_t len)
 {
-  SCM bv;
-
   if (SCM_UNLIKELY (len == 0))
-    bv = scm_null_bytevector;
+    return scm_null_bytevector;
+
+  if (SCM_BYTEVECTOR_INLINEABLE_SIZE_P (len))
+    {
+      SCM ret;
+      SCM_NEWSMOB2 (ret, scm_tc16_bytevector, len, NULL);
+      SCM_BYTEVECTOR_SET_INLINE (ret);
+      return ret;
+    }

The code initially had a single `return' statement per function, to
improve readability (IMO).  Do you think otherwise?

> commit e286c973fcd63c0930d9302cc5f1a280b9b22615
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:11:53 2009 +0200
>
>     bytevectors have "element type" field, e.g. for generalized-vector-ref
>     
>     Bytevectors have a very close relationship to other forms of uniform
>     vectors. Often you want to view a u64vector as a series of bytes, for
>     writing over a socket; or to process an incoming stream using the
>     convenient and less error-prone s16vector-ref API rather than
>     bytevector-s16-native-ref.
>     
>     The essential needs of the representation of a bytevector and an
>     s64vector are the same, so we take advantage of that and extend the
>     bytevector implementation to have a "native type" field, which defaults
>     to VU8.

This comments belong to a header (info "(standards) Change Log
Concepts").

>     This commit doesn't actually expose any user-noticeable changes,
>     however.
>     
>     * libguile/bytevectors.h (SCM_BYTEVECTOR_ELEMENT_TYPE): New internal
>       defines.
>       (scm_i_make_typed_bytevector, scm_c_take_typed_bytevector): New
>       internal functions.
>     
>     * libguile/bytevectors.c (SCM_BYTEVECTOR_SET_ELEMENT_TYPE):
>       (SCM_BYTEVECTOR_TYPE_SIZE):
>       (SCM_BYTEVECTOR_TYPED_LENGTH): New internal macros.
>       (make_bytevector, make_bytevector_from_buffer): Take an extra
>       argument, the element type. The length argument is interpreted as
>       being the number of elements, which corresponds to the number of bytes
>       in the default VU8 case. Doing it this way eliminates a class of bugs
>       -- e.g. a u32vector of length 3 bytes doesn't make sense. We do have
>       to check for another class of bugs: overflow. The length stored on the
>       bytevector itself is still the byte length, though.
>       (scm_i_make_typed_bytevector):
>       (scm_c_take_typed_bytevector): New internal functions.
>       (scm_i_shrink_bytevector): Make sure the new size is valid for the
>       bytevector's type.
>       (scm_i_bytevector_generalized_set_x): Remove this function, the
>       array-handle infrastructure takes care of this for us.
>       (print_bytevector): Print the bytevector according to its type.
>       (scm_make_bytevector, scm_bytevector_copy)
>       (scm_uniform_array_to_bytevector)
>       (scm_u8_list_to_bytevector, scm_bytevector_to_uint_list): Adapt to
>       make_bytevector extra arg.
>       (bv_handle_ref, bv_handle_set_x): Adapt to ref and set based on the
>       type of the bytevector, e.g. f64 or u8.
>       (bytevector_get_handle): Set the typed length of the vector, not the
>       byte length.
>     
>     Conflicts:
>     
>     	libguile/bytevectors.c

At this point, bytevectors are aware of all the other vector types.

In principle, I don't like this idea.  I don't have a clear picture of
the array/vector dependency graph, though.

This:

+  if (SCM_UNLIKELY (len == 0 && element_type == 0))

should be "element_type == SCM_ARRAY_ELEMENT_TYPE_SCM".

> commit cd43fdc5b7a7c851ee0f2b4e96a1f394fb50d869
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 19:03:28 2009 +0200
>
>     fix (bytevector-ieee-single-native-set! x 0 0)
>     
>     * libguile/bytevectors.c (VALIDATE_REAL): SCM_VALIDATE_REAL is not what
>       we need for checking values for bytevector-ieee-single-native-set! et
>       al, so define our own validator.
>       (IEEE754_SET, IEEE754_NATIVE_SET): Use it.

Can you please push it to `master', preferably with a test case?

> commit 6f4895577734bb107d488f31dca82d5ad4c25a65
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:35:33 2009 +0200
>
>     reimplement srfi-4 vectors on top of bytevectors
>     
>     * libguile/srfi-4.h:
>     * libguile/srfi-4.c (scm_make_srfi_4_vector): New function, exported by
>       (srfi srfi-4 gnu).
>     * libguile/srfi-4.i.c: Removed.
>     * module/srfi/srfi-4.scm:
>     * module/srfi/srfi-4/gnu.scm: Reimplement srfi-4 vectors on top of
>       bytevectors. The implementation is mostly in Scheme now.

I appreciate the move to Scheme.  ;-)

>     * module/rnrs/bytevector.scm:
>     * libguile/bytevectors.c (scm_uniform_array_to_bytevector): No more need
>       for this, as uniform vectors are bytevectors, and we can get the
>       backing vector of an array via shared-array-root.
>     
>     * libguile/bytevectors.c (bytevector_ref_c32, bytevector_ref_c64)
>       (bytevector_set_c32, bytevector_set_c64): Fix some embarrassing bugs.
>       Still need to do an upper bounds check.

Hmm...

>     * libguile/deprecated.h: Remove deprecated array functions:
>       scm_i_arrayp, scm_i_array_ndim, scm_i_array_mem, scm_i_array_v,
>       scm_i_array_base, scm_i_array_dims, and the deprecated macros:
>       SCM_ARRAYP, SCM_ARRAY_NDIM, SCM_ARRAY_CONTP, SCM_ARRAY_MEM,
>       SCM_ARRAY_V, SCM_ARRAY_BASE, SCM_ARRAY_DIMS.
>     * libguile/deprecated.c (scm_uniform_vector_read_x)
>       (scm_uniform_vector_write, scm_uniform_array_read_x)
>       (scm_uniform_array_write): Newly deprecated functions.
>     
>     * libguile/generalized-arrays.c (scm_array_type): Remove the bytevector
>       hack. This does introduce the bug that #vu8(1 2 3) will compile to
>       #u8(1 2 3). I'm working on that.
>     
>     * libguile/objcodes.c (scm_bytecode_to_objcode, scm_objcode_to_bytecode):
>       Rework to operate on bytevectors, as scm_make_u8vector now causes a
>       module lookup, which can't be done e.g. when loading the VM boot
>       program for psyntax-pp.go on a fresh bootstrap.
>     
>     * libguile/objcodes.h (SCM_F_OBJCODE_IS_BYTEVECTOR):
>       (SCM_OBJCODE_IS_BYTEVECTOR): s/U8VECTOR/BYTEVECTOR/.
>     
>     * module/ice-9/boot-9.scm (the-scm-module): A terrible hack to pull in
>       (srfi srfi-4), as the bindings are primarily there now. We'll worry
>       about this later.

The issue is that `make-c32vector' et al., which used to be in
`the-scm-module' are now confined in `(srfi srfi-4 gnu)'.  There's a
compatibility issue here.  Perhaps it should be pulled in as well?

>     * module/language/glil/compile-assembly.scm (dump-object): Update to
>       work with array-contents and shared-array-root.
>     
>     * test-suite/tests/bytevectors.test: Remove uniform-array->bytevector
>       test.


> commit ae3543b1f27e973b25060f71f095bf23ef149aee
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 19:08:43 2009 +0200
>
>     clean up includes in vectors.[ch]
>     
>     * libguile/vectors.h:
>     * libguile/vectors.c: Clean up the includes... mostly.

"Mostly"?  What does it clean up?

+#include "libguile/arrays.h" /* Hit me with the ugly stick */

Does that mean there are circular dependencies or something like that?


Waouh, it's a lot of work, and reviewing it takes some time, too.
Honestly, I'd rather spend the small amount of time I spend on Guile
these days in other areas with higher priorities.

Thanks,
Ludo'.





  parent reply	other threads:[~2009-07-23 21:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-19 13:59 review/merge request: wip-array-refactor Andy Wingo
2009-07-22 21:48 ` Neil Jerram
2009-07-28 22:41   ` Andy Wingo
2009-07-30 21:10     ` Neil Jerram
2009-08-04 12:21       ` Andy Wingo
2009-08-09 16:41         ` Ludovic Courtès
2009-08-12 22:03           ` Andy Wingo
2009-08-13  9:16             ` Ludovic Courtès
2009-07-23 21:08 ` Ludovic Courtès [this message]
2009-07-24 22:01   ` Neil Jerram

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ab2vyun4.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=guile-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).