review/merge request: wip-array-refactor

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

* review/merge request: wip-array-refactor
@ 2009-07-19 13:59 Andy Wingo
  2009-07-22 21:48 ` Neil Jerram
  2009-07-23 21:08 ` Ludovic Courtès
  0 siblings, 2 replies; 10+ messages in thread
From: Andy Wingo @ 2009-07-19 13:59 UTC (permalink / raw)
  To: guile-devel

Hi all,

I've finished up my refactor of Guile's arrays. To my eye it's much
nicer now.

The only bits I could anticipate being controversial would be the last
two or three patches, in which bytevectors are given an "element type"
field. This is so that I can make the srfi-4 uniform vector code use
bytevectors as their representation -- it's one less orthogonal type
that the VM would have to deal with.

Those patches also allow uniform vectors of one type to be accessed
using accessors for other types:

  (u8vector-ref #u32(#xffffffff) 0) => 255

Although:

  (u8vector? #u32(#xffffffff)) => #f

But:

  (bytevector? #u32(#xffffffff)) => #t

Anyway, logs are here, in chronological order. Let me know what you
think. Make check should pass at each of these patches. This is on the
"wip-array-refactor" branch, which I had been rebasing, but I don't plan
to rebase it again.

Andy

commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
Author: Andy Wingo <wingo@pobox.com>
Date:   Thu Jul 16 21:51:47 2009 +0200

    remove deprecated functions from unif.c
    
    * libguile/unif.h:
    * libguile/unif.c: Remove deprecated functions.
    
    * module/ice-9/deprecated.scm: Remove array-related deprecated
      functions.
    
    * NEWS: Update.

commit 4b126598445c4f12c0aebca2adfaa45f3edd86ab
Author: Andy Wingo <wingo@pobox.com>
Date:   Thu Jul 16 22:02:25 2009 +0200

    remove convert.{c,i.c,h}
    
    * libguile/convert.c:
    * libguile/convert.h:
    * libguile/convert.i.c: Remove these functions, which were undocumented,
      not in the libguile/ header, and thus unlikely to have been used.

commit a4a0d399c877cb802cdaf2c48713d3377a412c4f
Author: Andy Wingo <wingo@pobox.com>
Date:   Thu Jul 16 22:15:04 2009 +0200

    clean up libguile/Makefile.am
    
    * libguile/Makefile.am: Clean up some of the file lists, should make
      future diffs easier to parse.

commit b6149d8d9f35c8091a31b12fb3aeecee0e3a470c
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 00:16:43 2009 +0200

    rename scm_i_make_ra to scm_i_make_array
    
    * libguile/unif.c (scm_i_make_array): Rename from scm_i_make_ra. All
      callers changed.

commit 5d1b3b2db9349b615baac313ae5a111fa68573ac
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 00:25:49 2009 +0200

    rename ramap.[ch] to array-map.[ch]
    
    * libguile/array-map.c:
    * libguile/array-map.h: Rename from ramap.c and ramap.h.
    
    * libguile.h:
    * libguile/Makefile.am:
    * libguile/eq.c:
    * libguile/init.c:
    * libguile/sort.c:
    * libguile/unif.c:
    * libguile/vectors.c: All referrers changed.

commit c53c0893a3bad3312230003707f71c2f441460d4
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 00:47:31 2009 +0200

    parts of unif.[ch] to array-handle.[ch]
    
    * libguile/array-handle.c:
    * libguile/array-handle.h: Move some parts of unif.c and unif.h to these
      new files.
    
    * libguile/unif.c:
    * libguile/unif.h: Update includers. Since unif.h depends on the array
      handle type, we include array-handle.h, which also means that there
      will be no difference for our callers.
    
    * libguile/init.c: Call scm_init_array_handle, though it does nothing as
      of yet.
    
    * libguile/Makefile.am: Adapt for new files.

commit cf396142405d9076cc20eca9bf53376e80359a56
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 00:58:32 2009 +0200

    bitvector exodus from unif.[ch]
    
    * libguile/Makefile.am:
    * libguile/unif.c:
    * libguile/unif.h:
    * libguile/bitvectors.c:
    * libguile/bitvectors.h: Move bitvector functionality out of unif.[ch].
    
    * libguile/array-handle.c:
    * libguile/array-map.c:
    * libguile/init.c:
    * libguile/read.c:
    * libguile/srfi-4.c:
    * libguile/vectors.c: Oh, what a tangled web we weave...

commit 2fa901a51f62da8a01112aefbf687530f4bff160
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 01:08:35 2009 +0200

    rename unif.[ch] to arrays.[ch]
    
    * libguile/Makefile.am:
    * libguile/unif.c:
    * libguile/unif.h:
    * libguile/arrays.c:
    * libguile/arrays.h: Rename unif.[ch] to arrays.[ch].
    
    * libguile.h:
    * libguile/array-handle.c:
    * libguile/array-map.c:
    * libguile/bitvectors.c:
    * libguile/bytevectors.c:
    * libguile/eq.c:
    * libguile/gc-card.c:
    * libguile/gc-malloc.c:
    * libguile/gc-mark.c:
    * libguile/gc.c:
    * libguile/init.c:
    * libguile/inline.h:
    * libguile/print.c:
    * libguile/random.c:
    * libguile/read.c:
    * libguile/socket.c:
    * libguile/sort.c:
    * libguile/srfi-4.c:
    * libguile/srfi-4.h:
    * libguile/strports.c:
    * libguile/vectors.c:
    * libguile/vectors.h: Update includers.

commit 2a610be59412a9d633a373c6f6ec4d4794c40fd8
Author: Andy Wingo <wingo@pobox.com>
Date:   Sun Jul 19 15:04:40 2009 +0200

    add generic array implementation facility
    
    * libguile/array-handle.c (scm_i_register_array_implementation):
      (scm_i_array_implementation_for_obj): Add generic array facility,
      which will (in a few commits) detangle the array code.
      (scm_array_get_handle): Use the generic array facility. Note that
      scm_t_array_handle no longer has ref and set function pointers;
      instead it has a pointer to the array implementation. It is unlikely
      that code out there used these functions, however, as the supported
      way was through scm_array_handle_ref/set_x.
      (scm_array_handle_pos): Move this function here from arrays.c.
      (scm_array_handle_element_type): New function, returns a Scheme value
      representing the type of element stored in this array.
    
    * libguile/array-handle.h (scm_t_array_element_type): New enum, for
      generically determining the type of an array.
      (scm_array_handle_rank):
      (scm_array_handle_dims): These are now just #defines.
    
    * libguile/arrays.c:
    * libguile/bitvectors.c:
    * libguile/bytevectors.c:
    * libguile/srfi-4.c:
    * libguile/strings.c:
    * libguile/vectors.c: Register array implementations for all of these.
    
    * libguile/inline.h: Update for array_handle_ref/set change.
    * libguile/deprecated.h: Need to include arrays.h now.

commit 66b9d7d304a349d5bb4f763a47143f09da58d97f
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 12:45:24 2009 +0200

    remove enclosed arrays
    
    * libguile/arrays.h:
    * libguile/array-map.c:
    * libguile/arrays.c:
    * libguile/deprecated.c: Remove "enclosed arrays". The only user-facing
      procedures that this affects are scm_enclose_array / enclose-array. If
      enclosed arrays are added back, it should be through the generic array
      interface; but really, it sounds like something that would be better
      implemented in Scheme.

commit 1030b45049f564f4abd459abd8e59db34c7867cc
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 18:54:06 2009 +0200

    move generic array foo out to its own file
    
    * libguile/arrays.h:
    * libguile/arrays.c:
    * libguile/generalized-arrays.h:
    * libguile/generalized-arrays.c: Move some generic functionality out of
      arrays.c to a new file.
    
    * libguile/array-map.c:
    * libguile/deprecated.c:
    * libguile/init.c: Update includers.

commit f332e9571703ddcd27c51ebe3c847459c2a649b7
Author: Andy Wingo <wingo@pobox.com>
Date:   Fri Jul 17 19:05:32 2009 +0200

    generic vector ops to own file
    
    * libguile/Makefile.am:
    * libguile/vectors.c:
    * libguile/vectors.h:
    * libguile/generalized-vectors.c:
    * libguile/generalized-vectors.h: Move generic vector ops off into their
      own file too. The implementation is now based on the generic
      array-handle infrastructure.
    
    * libguile.h:
    * libguile/array-map.c:
    * libguile/bitvectors.c:
    * libguile/random.c:
    * libguile/srfi-4.c: Update includers.

commit 476b894c71b436f3befb8af46b899aaf244763e2
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 12:18:15 2009 +0200

    uniform vector functions to their own file
    
    * libguile/uniform.c:
    * libguile/uniform.h:
    * libguile/srfi-4.c:
    * libguile/srfi-4.h:
    * libguile/Makefile.am: Move uniform vector funcs out of srfi-4 to their
      own file.
    
    * libguile.h:
    * libguile/arrays.c:
    * libguile/bytevectors.c: Update includers.

commit f45eccffa73c043466a4cc0f5037132ee5795eee
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 12:43:54 2009 +0200

    add registry of vector constructors, make-generalized-vector
    
    * libguile/generalized-vectors.h:
    * libguile/generalized-vectors.c: Add a registry of vector constructors.
      (scm_make_generalized_vector): New public function, constructs a
      vector of a given type.
    
    * libguile/bitvectors.c:
    * libguile/bytevectors.c:
    * libguile/srfi-4.c:
    * libguile/strings.c:
    * libguile/vectors.c: Register vector constructors.
    
    * libguile/extensions.c (scm_init_extensions): No need to NULL the list
      of registered extensions here, the static init does it for us. Allows
      scm_c_register_extension to be called before scm_init_extensions.
    
    * libguile/init.c (scm_i_init_guile): Move array initialization earlier,
      so e.g. scm_init_strings has access to a valid list of array element
      types when registering its vector constructor.

commit 943a0a8759504c4a367c1904bef4a8afbc6208dd
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 12:58:37 2009 +0200

    make-typed-array builds backing vector via make-generalized-vector
    
    * libguile/arrays.c: Rework to use scm_make_generalized_vector instead
      of our own type table.
    
    * libguile/bitvectors.c: Fix some includes.

commit ac8ed3db31769d7ede5e75fba1697e8dde55fae4
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 13:26:18 2009 +0200

    any->u8vector and family now implemented in Scheme
    
    * module/Makefile.am:
    * module/srfi/srfi-4/gnu.scm: New module, for extensions to srfi-4.
      Currently defines the any->FOOvector family.
    
    * libguile/srfi-4.c:
    * libguile/srfi-4.i.c: Dispatch scm_any_to_FOOvector calls to the
      scheme-implemented functions in (srfi srfi-4 gnu).

commit f332089ed43761440a2a8c272ee61a709b38cc24
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 13:46:29 2009 +0200

    bytevector inlinedness indicated by flag, not length
    
    * libguile/bytevectors.h (SCM_BYTEVECTOR_INLINE_P): Change to check a
      flag instead of checking the length of the bytevector.
    
    * libguile/bytevectors.c (make_bytevector_from_buffer): Handle the len
      <= inline threshold case as well. Set the inline flag as appropriate.
      (make_bytevector): Updat the inline flag as appropriate.
      (scm_c_take_bytevector): Just dispatch to make_bytevector_from_buffer.
      (scm_i_shrink_bytevector): Update the inline flag as appropriate.
      Update the length when shrinking an already-inlined vector.
      (STRING_TO_UTF): Fix some indentation.

commit e286c973fcd63c0930d9302cc5f1a280b9b22615
Author: Andy Wingo <wingo@pobox.com>
Date:   Sun Jul 19 15:11:53 2009 +0200

    bytevectors have "element type" field, e.g. for generalized-vector-ref
    
    Bytevectors have a very close relationship to other forms of uniform
    vectors. Often you want to view a u64vector as a series of bytes, for
    writing over a socket; or to process an incoming stream using the
    convenient and less error-prone s16vector-ref API rather than
    bytevector-s16-native-ref.
    
    The essential needs of the representation of a bytevector and an
    s64vector are the same, so we take advantage of that and extend the
    bytevector implementation to have a "native type" field, which defaults
    to VU8.
    
    This commit doesn't actually expose any user-noticeable changes,
    however.
    
    * libguile/bytevectors.h (SCM_BYTEVECTOR_ELEMENT_TYPE): New internal
      defines.
      (scm_i_make_typed_bytevector, scm_c_take_typed_bytevector): New
      internal functions.
    
    * libguile/bytevectors.c (SCM_BYTEVECTOR_SET_ELEMENT_TYPE):
      (SCM_BYTEVECTOR_TYPE_SIZE):
      (SCM_BYTEVECTOR_TYPED_LENGTH): New internal macros.
      (make_bytevector, make_bytevector_from_buffer): Take an extra
      argument, the element type. The length argument is interpreted as
      being the number of elements, which corresponds to the number of bytes
      in the default VU8 case. Doing it this way eliminates a class of bugs
      -- e.g. a u32vector of length 3 bytes doesn't make sense. We do have
      to check for another class of bugs: overflow. The length stored on the
      bytevector itself is still the byte length, though.
      (scm_i_make_typed_bytevector):
      (scm_c_take_typed_bytevector): New internal functions.
      (scm_i_shrink_bytevector): Make sure the new size is valid for the
      bytevector's type.
      (scm_i_bytevector_generalized_set_x): Remove this function, the
      array-handle infrastructure takes care of this for us.
      (print_bytevector): Print the bytevector according to its type.
      (scm_make_bytevector, scm_bytevector_copy)
      (scm_uniform_array_to_bytevector)
      (scm_u8_list_to_bytevector, scm_bytevector_to_uint_list): Adapt to
      make_bytevector extra arg.
      (bv_handle_ref, bv_handle_set_x): Adapt to ref and set based on the
      type of the bytevector, e.g. f64 or u8.
      (bytevector_get_handle): Set the typed length of the vector, not the
      byte length.
    
    Conflicts:
    
    	libguile/bytevectors.c

commit cd43fdc5b7a7c851ee0f2b4e96a1f394fb50d869
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 19:03:28 2009 +0200

    fix (bytevector-ieee-single-native-set! x 0 0)
    
    * libguile/bytevectors.c (VALIDATE_REAL): SCM_VALIDATE_REAL is not what
      we need for checking values for bytevector-ieee-single-native-set! et
      al, so define our own validator.
      (IEEE754_SET, IEEE754_NATIVE_SET): Use it.

commit 6f4895577734bb107d488f31dca82d5ad4c25a65
Author: Andy Wingo <wingo@pobox.com>
Date:   Sun Jul 19 15:35:33 2009 +0200

    reimplement srfi-4 vectors on top of bytevectors
    
    * libguile/srfi-4.h:
    * libguile/srfi-4.c (scm_make_srfi_4_vector): New function, exported by
      (srfi srfi-4 gnu).
    * libguile/srfi-4.i.c: Removed.
    * module/srfi/srfi-4.scm:
    * module/srfi/srfi-4/gnu.scm: Reimplement srfi-4 vectors on top of
      bytevectors. The implementation is mostly in Scheme now.
    
    * module/rnrs/bytevector.scm:
    * libguile/bytevectors.c (scm_uniform_array_to_bytevector): No more need
      for this, as uniform vectors are bytevectors, and we can get the
      backing vector of an array via shared-array-root.
    
    * libguile/bytevectors.c (bytevector_ref_c32, bytevector_ref_c64)
      (bytevector_set_c32, bytevector_set_c64): Fix some embarrassing bugs.
      Still need to do an upper bounds check.
    
    * libguile/deprecated.h: Remove deprecated array functions:
      scm_i_arrayp, scm_i_array_ndim, scm_i_array_mem, scm_i_array_v,
      scm_i_array_base, scm_i_array_dims, and the deprecated macros:
      SCM_ARRAYP, SCM_ARRAY_NDIM, SCM_ARRAY_CONTP, SCM_ARRAY_MEM,
      SCM_ARRAY_V, SCM_ARRAY_BASE, SCM_ARRAY_DIMS.
    * libguile/deprecated.c (scm_uniform_vector_read_x)
      (scm_uniform_vector_write, scm_uniform_array_read_x)
      (scm_uniform_array_write): Newly deprecated functions.
    
    * libguile/generalized-arrays.c (scm_array_type): Remove the bytevector
      hack. This does introduce the bug that #vu8(1 2 3) will compile to
      #u8(1 2 3). I'm working on that.
    
    * libguile/objcodes.c (scm_bytecode_to_objcode, scm_objcode_to_bytecode):
      Rework to operate on bytevectors, as scm_make_u8vector now causes a
      module lookup, which can't be done e.g. when loading the VM boot
      program for psyntax-pp.go on a fresh bootstrap.
    
    * libguile/objcodes.h (SCM_F_OBJCODE_IS_BYTEVECTOR):
      (SCM_OBJCODE_IS_BYTEVECTOR): s/U8VECTOR/BYTEVECTOR/.
    
    * module/ice-9/boot-9.scm (the-scm-module): A terrible hack to pull in
      (srfi srfi-4), as the bindings are primarily there now. We'll worry
      about this later.
    
    * module/language/glil/compile-assembly.scm (dump-object): Update to
      work with array-contents and shared-array-root.
    
    * test-suite/tests/bytevectors.test: Remove uniform-array->bytevector
      test.

commit ae3543b1f27e973b25060f71f095bf23ef149aee
Author: Andy Wingo <wingo@pobox.com>
Date:   Sat Jul 18 19:08:43 2009 +0200

    clean up includes in vectors.[ch]
    
    * libguile/vectors.h:
    * libguile/vectors.c: Clean up the includes... mostly.

-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-07-19 13:59 review/merge request: wip-array-refactor Andy Wingo
@ 2009-07-22 21:48 ` Neil Jerram
  2009-07-28 22:41   ` Andy Wingo
  2009-07-23 21:08 ` Ludovic Courtès
  1 sibling, 1 reply; 10+ messages in thread
From: Neil Jerram @ 2009-07-22 21:48 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Hi all,
>
> I've finished up my refactor of Guile's arrays. To my eye it's much
> nicer now.

Yes, to me too.  But I have two overall questions in mind.

- What do you have in mind as regards releasing this?  Even though it
  looks good, I think it would be better to let it mature for a while,
  and hence not to put it into 1.9.x/2.0.  (And we're not short of new
  stuff in 1.9.x/2.0!)

- What about the header file incompatibilities?  I.e. the problem that
  if there are applications out that that #include <libguile/ramap.h>,
  or another of the renamed ones, they will stop working.  Do you
  think we are OK just to document this well, or should we do more
  than that?

> The only bits I could anticipate being controversial would be the last
> two or three patches, in which bytevectors are given an "element type"
> field. This is so that I can make the srfi-4 uniform vector code use
> bytevectors as their representation -- it's one less orthogonal type
> that the VM would have to deal with.
>
> Those patches also allow uniform vectors of one type to be accessed
> using accessors for other types:
>
>   (u8vector-ref #u32(#xffffffff) 0) => 255
>
> Although:
>
>   (u8vector? #u32(#xffffffff)) => #f
>
> But:
>
>   (bytevector? #u32(#xffffffff)) => #t

What is the detailed rationale for that?  If it can be explained in a
way that doesn't sound like nonsense, fine.

> Anyway, logs are here, in chronological order. Let me know what you
> think. Make check should pass at each of these patches. This is on the
> "wip-array-refactor" branch, which I had been rebasing, but I don't plan
> to rebase it again.
>
> Andy
>
> commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 21:51:47 2009 +0200
>
>     remove deprecated functions from unif.c
>     
>     * libguile/unif.h:
>     * libguile/unif.c: Remove deprecated functions.
>     
>     * module/ice-9/deprecated.scm: Remove array-related deprecated
>       functions.
>     
>     * NEWS: Update.

I can understand why a lot of the NEWS deltas evaporate when they are
logically merged into the 1.8-2.0 section, but I'm not sure about

- the entry for Brainfuck

- the entry on scm_set_port_seek and scm_set_port_truncate

Aren't these entries still of interest?

The rest of this commit looks fine.

> commit 4b126598445c4f12c0aebca2adfaa45f3edd86ab
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 22:02:25 2009 +0200
>
>     remove convert.{c,i.c,h}
>     
>     * libguile/convert.c:
>     * libguile/convert.h:
>     * libguile/convert.i.c: Remove these functions, which were undocumented,
>       not in the libguile/ header, and thus unlikely to have been used.

I'm happy with removing these; but in case anyone _is_ using them it
would be helpful to provide (in NEWS) a mapping from them to whatever
people should use instead.  Is that possible?

> commit a4a0d399c877cb802cdaf2c48713d3377a412c4f
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 22:15:04 2009 +0200
>
>     clean up libguile/Makefile.am
>     
>     * libguile/Makefile.am: Clean up some of the file lists, should make
>       future diffs easier to parse.

Fine.

> commit b6149d8d9f35c8091a31b12fb3aeecee0e3a470c
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 00:16:43 2009 +0200
>
>     rename scm_i_make_ra to scm_i_make_array
>     
>     * libguile/unif.c (scm_i_make_array): Rename from scm_i_make_ra. All
>       callers changed.

Fine.

> commit 5d1b3b2db9349b615baac313ae5a111fa68573ac
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 00:25:49 2009 +0200
>
>     rename ramap.[ch] to array-map.[ch]
>     
>     * libguile/array-map.c:
>     * libguile/array-map.h: Rename from ramap.c and ramap.h.
>     
>     * libguile.h:
>     * libguile/Makefile.am:
>     * libguile/eq.c:
>     * libguile/init.c:
>     * libguile/sort.c:
>     * libguile/unif.c:
>     * libguile/vectors.c: All referrers changed.

Compatibility issue here: #include <libguile/ramap.h> will stop
working.  How easy is it for an application writer to test for ramap.h
or array-map.h and #include the right one?

Depending on the answer to that, I guess we should either cover this
change in NEWS, or install an ramap.h that just #includes array-map.h.

> commit c53c0893a3bad3312230003707f71c2f441460d4
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 00:47:31 2009 +0200
>
>     parts of unif.[ch] to array-handle.[ch]
>     
>     * libguile/array-handle.c:
>     * libguile/array-handle.h: Move some parts of unif.c and unif.h to these
>       new files.
>     
>     * libguile/unif.c:
>     * libguile/unif.h: Update includers. Since unif.h depends on the array
>       handle type, we include array-handle.h, which also means that there
>       will be no difference for our callers.
>     
>     * libguile/init.c: Call scm_init_array_handle, though it does nothing as
>       of yet.
>     
>     * libguile/Makefile.am: Adapt for new files.

Nice.

> commit cf396142405d9076cc20eca9bf53376e80359a56
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 00:58:32 2009 +0200
>
>     bitvector exodus from unif.[ch]
>     
>     * libguile/Makefile.am:
>     * libguile/unif.c:
>     * libguile/unif.h:
>     * libguile/bitvectors.c:
>     * libguile/bitvectors.h: Move bitvector functionality out of unif.[ch].
>     
>     * libguile/array-handle.c:
>     * libguile/array-map.c:
>     * libguile/init.c:
>     * libguile/read.c:
>     * libguile/srfi-4.c:
>     * libguile/vectors.c: Oh, what a tangled web we weave...

Nice, but again a #include issue.  Should we make unif.h include
bitvectors.h?

> commit 2fa901a51f62da8a01112aefbf687530f4bff160
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 01:08:35 2009 +0200
>
>     rename unif.[ch] to arrays.[ch]
>     
>     * libguile/Makefile.am:
>     * libguile/unif.c:
>     * libguile/unif.h:
>     * libguile/arrays.c:
>     * libguile/arrays.h: Rename unif.[ch] to arrays.[ch].
>     
>     * libguile.h:
>     * libguile/array-handle.c:
>     * libguile/array-map.c:
>     * libguile/bitvectors.c:
>     * libguile/bytevectors.c:
>     * libguile/eq.c:
>     * libguile/gc-card.c:
>     * libguile/gc-malloc.c:
>     * libguile/gc-mark.c:
>     * libguile/gc.c:
>     * libguile/init.c:
>     * libguile/inline.h:
>     * libguile/print.c:
>     * libguile/random.c:
>     * libguile/read.c:
>     * libguile/socket.c:
>     * libguile/sort.c:
>     * libguile/srfi-4.c:
>     * libguile/srfi-4.h:
>     * libguile/strports.c:
>     * libguile/vectors.c:
>     * libguile/vectors.h: Update includers.

Same again; no more #include <libguile/unif.h>.  What to do about
that?

> commit 2a610be59412a9d633a373c6f6ec4d4794c40fd8
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:04:40 2009 +0200
>
>     add generic array implementation facility
>     
>     * libguile/array-handle.c (scm_i_register_array_implementation):
>       (scm_i_array_implementation_for_obj): Add generic array facility,
>       which will (in a few commits) detangle the array code.
>       (scm_array_get_handle): Use the generic array facility. Note that
>       scm_t_array_handle no longer has ref and set function pointers;
>       instead it has a pointer to the array implementation. It is unlikely
>       that code out there used these functions, however, as the supported
>       way was through scm_array_handle_ref/set_x.
>       (scm_array_handle_pos): Move this function here from arrays.c.
>       (scm_array_handle_element_type): New function, returns a Scheme value
>       representing the type of element stored in this array.
>     
>     * libguile/array-handle.h (scm_t_array_element_type): New enum, for
>       generically determining the type of an array.
>       (scm_array_handle_rank):
>       (scm_array_handle_dims): These are now just #defines.
>     
>     * libguile/arrays.c:
>     * libguile/bitvectors.c:
>     * libguile/bytevectors.c:
>     * libguile/srfi-4.c:
>     * libguile/strings.c:
>     * libguile/vectors.c: Register array implementations for all of these.
>     
>     * libguile/inline.h: Update for array_handle_ref/set change.
>     * libguile/deprecated.h: Need to include arrays.h now.

 const SCM *
 scm_array_handle_elements (scm_t_array_handle *h)
 {
-  SCM vec = h->array;
-  if (SCM_I_ARRAYP (vec))
-    vec = SCM_I_ARRAY_V (vec);
-  if (SCM_I_IS_VECTOR (vec))
-    return SCM_I_VECTOR_ELTS (vec) + h->base;
-  scm_wrong_type_arg_msg (NULL, 0, h->array, "non-uniform array");
+  if (h->element_type != SCM_ARRAY_ELEMENT_TYPE_SCM)
+    scm_wrong_type_arg_msg (NULL, 0, h->array, "non-uniform array");
+  return ((const SCM*)h->elements) + h->base;

Isn't this a significant C API change - i.e. that this function now
only works for non-uniform arrays?  Ditto
scm_array_handle_writable_elements.

+#define SCM_ARRAY_IMPLEMENTATION(tag_,mask_,vref_,vset_,handle_) \
+  SCM_SNARF_INIT ({                                                     \
+      scm_t_array_implementation impl;                                  \
+      impl.tag = tag_; impl.mask = mask_;                               \
+      impl.vref = vref_; impl.vset = vset_;                             \
+      impl.get_handle = handle_;                                        \
+      scm_i_register_array_implementation (&impl);                      \
+  })

I like this!  Much nicer than equivalent documentation or comments
saying how an implementation has to register itself.

+  /* perhaps should catch overflow here too */

Don't understand.  Could you expand this comment a bit?

+    scm_out_of_range ("vector-handle-ref", scm_from_size_t (idx));

I worry slightly here that "vector-handle-ref" will appear in a Scheme
level error message, and that no one will know what that means.

Everything else here looks great.

> commit 66b9d7d304a349d5bb4f763a47143f09da58d97f
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 12:45:24 2009 +0200
>
>     remove enclosed arrays
>     
>     * libguile/arrays.h:
>     * libguile/array-map.c:
>     * libguile/arrays.c:
>     * libguile/deprecated.c: Remove "enclosed arrays". The only user-facing
>       procedures that this affects are scm_enclose_array / enclose-array. If
>       enclosed arrays are added back, it should be through the generic array
>       interface; but really, it sounds like something that would be better
>       implemented in Scheme.

I'm not sure about that.  I also haven't fully understood them yet!

Do the array mapping functions work on enclosed arrays?  If so,
wouldn't that be a lot of work to reimplement with a Scheme-level
implementation?

Did you manage to find out anything about who added enclosed arrays,
and why?

Also, is it definitely a problem for the new framework to support
them?

> commit 1030b45049f564f4abd459abd8e59db34c7867cc
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 18:54:06 2009 +0200
>
>     move generic array foo out to its own file
>     
>     * libguile/arrays.h:
>     * libguile/arrays.c:
>     * libguile/generalized-arrays.h:
>     * libguile/generalized-arrays.c: Move some generic functionality out of
>       arrays.c to a new file.
>     
>     * libguile/array-map.c:
>     * libguile/deprecated.c:
>     * libguile/init.c: Update includers.

Fine.  (Although another header file issue.)

> commit f332e9571703ddcd27c51ebe3c847459c2a649b7
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 19:05:32 2009 +0200
>
>     generic vector ops to own file
>     
>     * libguile/Makefile.am:
>     * libguile/vectors.c:
>     * libguile/vectors.h:
>     * libguile/generalized-vectors.c:
>     * libguile/generalized-vectors.h: Move generic vector ops off into their
>       own file too. The implementation is now based on the generic
>       array-handle infrastructure.
>     
>     * libguile.h:
>     * libguile/array-map.c:
>     * libguile/bitvectors.c:
>     * libguile/random.c:
>     * libguile/srfi-4.c: Update includers.

So now a generalized vector is exactly the same as a generalized array
with rank 1?

Cool.  Also seems like we then have a lot more API than we need!

> commit 476b894c71b436f3befb8af46b899aaf244763e2
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:18:15 2009 +0200
>
>     uniform vector functions to their own file
>     
>     * libguile/uniform.c:
>     * libguile/uniform.h:
>     * libguile/srfi-4.c:
>     * libguile/srfi-4.h:
>     * libguile/Makefile.am: Move uniform vector funcs out of srfi-4 to their
>       own file.
>     
>     * libguile.h:
>     * libguile/arrays.c:
>     * libguile/bytevectors.c: Update includers.

+const size_t scm_i_array_element_type_sizes[SCM_ARRAY_ELEMENT_TYPE_LAST + 1] = {
+  0,
+  0,
+  1,
+  8,
+  8, 8,
+  16, 16,
+  32, 32,
+  64, 64,
+  32, 64,
+  64, 128
+};

That's a bit hard to read and maintain on its own.  Could it go next
to where the element types are defined?

+      ret = SCM_ARRAY_ELEMENT_TYPE_IS_UNBOXED (h.element_type);

Why not use scm_array_handle_uniform_element_size (h) here too?

> commit f45eccffa73c043466a4cc0f5037132ee5795eee
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:43:54 2009 +0200
>
>     add registry of vector constructors, make-generalized-vector
>     
>     * libguile/generalized-vectors.h:
>     * libguile/generalized-vectors.c: Add a registry of vector constructors.
>       (scm_make_generalized_vector): New public function, constructs a
>       vector of a given type.
>     
>     * libguile/bitvectors.c:
>     * libguile/bytevectors.c:
>     * libguile/srfi-4.c:
>     * libguile/strings.c:
>     * libguile/vectors.c: Register vector constructors.
>     
>     * libguile/extensions.c (scm_init_extensions): No need to NULL the list
>       of registered extensions here, the static init does it for us. Allows
>       scm_c_register_extension to be called before scm_init_extensions.
>     
>     * libguile/init.c (scm_i_init_guile): Move array initialization earlier,
>       so e.g. scm_init_strings has access to a valid list of array element
>       types when registering its vector constructor.

Can you say more about why we need this?  Given the new equivalence
between generalized vectors and arrays, won't the array API suffice?

+SCM_DEFINE (scm_make_generalized_vector, "make-generalized-vector", 2, 1, 0,
+            (SCM type, SCM len, SCM fill),
+            "Make a generalized vector")
+#define FUNC_NAME s_scm_make_generalized_vector
+{
+  int i;
+  for (i = 0; i < num_vector_ctors_registered; i++)
+    if (vector_ctors[i].tag == type)

I don't see how a Scheme caller can get a type value (which I believe
is a string) that is == the registered tag.

+  scm_init_bitvectors ();
+  scm_bootstrap_bytevectors ();

Can you now combine these two?  That would also allow using the nice
ARRAY_IMPLEMENTATION and VECTOR_IMPLEMENTATION macros in bytevectors.c.

+#define REGISTER(tag, TAG)                                       \
+  scm_i_register_vector_constructor                              \
+    (scm_i_array_element_types[SCM_ARRAY_ELEMENT_TYPE_##TAG],    \
+     scm_make_##tag##vector)
+
+  REGISTER (u8, U8); 
+  REGISTER (s8, S8); 
+  REGISTER (u16, U16);
+  REGISTER (s16, S16);
+  REGISTER (u32, U32);
+  REGISTER (s32, S32);
+  REGISTER (u64, U64);
+  REGISTER (s64, S64);
+  REGISTER (f32, F32);
+  REGISTER (f64, F64);
+  REGISTER (c32, C32);
+  REGISTER (c64, C64);

I'd rather see 12 SCM_VECTOR_IMPLEMENTATION calls here.  It's a touch
more code, but much nicer for someone looking for vector
implementations.

> commit 943a0a8759504c4a367c1904bef4a8afbc6208dd
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:58:37 2009 +0200
>
>     make-typed-array builds backing vector via make-generalized-vector
>     
>     * libguile/arrays.c: Rework to use scm_make_generalized_vector instead
>       of our own type table.
>     
>     * libguile/bitvectors.c: Fix some includes.

Nice.  Really starting to make good sense now!

> commit ac8ed3db31769d7ede5e75fba1697e8dde55fae4
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 13:26:18 2009 +0200
>
>     any->u8vector and family now implemented in Scheme
>     
>     * module/Makefile.am:
>     * module/srfi/srfi-4/gnu.scm: New module, for extensions to srfi-4.
>       Currently defines the any->FOOvector family.
>     
>     * libguile/srfi-4.c:
>     * libguile/srfi-4.i.c: Dispatch scm_any_to_FOOvector calls to the
>       scheme-implemented functions in (srfi srfi-4 gnu).

+DEFINE_SRFI_4_GNU_PROXIES (u8);
+DEFINE_SRFI_4_GNU_PROXIES (s8);
+DEFINE_SRFI_4_GNU_PROXIES (u16);
+DEFINE_SRFI_4_GNU_PROXIES (s16);
+DEFINE_SRFI_4_GNU_PROXIES (u32);
+DEFINE_SRFI_4_GNU_PROXIES (s32);
+DEFINE_SRFI_4_GNU_PROXIES (u64);
+DEFINE_SRFI_4_GNU_PROXIES (s64);
+DEFINE_SRFI_4_GNU_PROXIES (f32);
+DEFINE_SRFI_4_GNU_PROXIES (f64);
+DEFINE_SRFI_4_GNU_PROXIES (c32);
+DEFINE_SRFI_4_GNU_PROXIES (c64);

Again, for the sake of grepping, I'd rather see 12 DEFPROXY100 calls
here.

> commit f332089ed43761440a2a8c272ee61a709b38cc24
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 13:46:29 2009 +0200
>
>     bytevector inlinedness indicated by flag, not length
>     
>     * libguile/bytevectors.h (SCM_BYTEVECTOR_INLINE_P): Change to check a
>       flag instead of checking the length of the bytevector.

Why do that?  This means we have two pieces of information that always
have to be in sync - the length and the flag - instead of just one
(the length), and I don't believe that accessing the flag is any
quicker than the length.  Also you've had to add more code to keep the
flag set correctly.

>     * libguile/bytevectors.c (make_bytevector_from_buffer): Handle the len
>       <= inline threshold case as well. Set the inline flag as
>       appropriate.

OK.

>       (make_bytevector): Updat the inline flag as appropriate.

Your changes here don't seem to add anything, or make the code
clearer.  Am I missing something?

>       (scm_c_take_bytevector): Just dispatch to
>       make_bytevector_from_buffer.

Nice.

>       (scm_i_shrink_bytevector): Update the inline flag as appropriate.
>       Update the length when shrinking an already-inlined vector.

Cool (except general query above about the inline flag).

>       (STRING_TO_UTF): Fix some indentation.

OK, I'll stop here for now, and finish the rest (hopefully) tomorrow.

Thanks for giving this area some serious attention!

     Neil




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-07-19 13:59 review/merge request: wip-array-refactor Andy Wingo
  2009-07-22 21:48 ` Neil Jerram
@ 2009-07-23 21:08 ` Ludovic Courtès
  2009-07-24 22:01   ` Neil Jerram
  1 sibling, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2009-07-23 21:08 UTC (permalink / raw)
  To: guile-devel

Hello!

Andy Wingo <wingo@pobox.com> writes:

> I've finished up my refactor of Guile's arrays. To my eye it's much
> nicer now.

Hey, great!

Several general remarks:

  * Since I'm conservative and lazy, I'd have happily let this code rest
    in peace.  ;-)

    In particular, last time I checked[*], we had poor coverage of
    {ramap,srfi-4,unif}.c so I'd be afraid of breaking something without
    noticing.

    [*] http://www.fdn.fr/~lcourtes/software/guile/coverage/libguile/index.html

  * As we discussed on IRC, I don't fully understand the rationale
    behind the SRFI-4/bytevector unification.  For anything binary I/O,
    the bytevector API is probably better suited.

    A use case you once mentioned was graphics applications: they may
    get pixel streams (for which f32 vectors are a better fit than
    bytevectors) extracted after doing bytevector-based binary I/O.  I'm
    not fully convinced that this use case justifies jointing the SRFI-4
    and bytevector types.

  * Polymorphism is quite unusual in Scheme APIs.

  * There's definitely room for improvement, and factorization or the
    array/vector code, and a large part of what you've been doing here
    goes in the right direction.

I agree with most of what Neil said, so I won't repeat it here.

> commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 21:51:47 2009 +0200
>
>     remove deprecated functions from unif.c
>     
>     * libguile/unif.h:
>     * libguile/unif.c: Remove deprecated functions.
>     
>     * module/ice-9/deprecated.scm: Remove array-related deprecated
>       functions.
>     
>     * NEWS: Update.

OK.  (I assume the change in `NEWS' is to only keep "Changes in 1.9.2"
at the top and remove "Changes in 1.9.1", right?)

Can already be pushed to `master'.

> commit 4b126598445c4f12c0aebca2adfaa45f3edd86ab
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 22:02:25 2009 +0200
>
>     remove convert.{c,i.c,h}
>     
>     * libguile/convert.c:
>     * libguile/convert.h:
>     * libguile/convert.i.c: Remove these functions, which were undocumented,
>       not in the libguile/ header, and thus unlikely to have been used.

OK; push to `master'.

> commit a4a0d399c877cb802cdaf2c48713d3377a412c4f
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Jul 16 22:15:04 2009 +0200
>
>     clean up libguile/Makefile.am
>     
>     * libguile/Makefile.am: Clean up some of the file lists, should make
>       future diffs easier to parse.

Good in principle.

(In practice, I fear the day when I'll have to merge that one in the
BDW-GC branch...)

> commit 5d1b3b2db9349b615baac313ae5a111fa68573ac
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 00:25:49 2009 +0200
>
>     rename ramap.[ch] to array-map.[ch]

Neil is concerned about the header renaming.  I think it's OK because
individual headers are not meant to be included individually, i.e., one
is supposed to always #include <libguile.h> (except for
<libguile/i18n.h>).

> commit 2a610be59412a9d633a373c6f6ec4d4794c40fd8
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:04:40 2009 +0200
>
>     add generic array implementation facility

Excellent!

I would just add comments to `scm_t_array_implementation', and pipe the
changes through GNU Ident.  :-)

Also:

+typedef enum {    
+  SCM_ARRAY_ELEMENT_TYPE_SCM = 0, /* SCM values */
+  SCM_ARRAY_ELEMENT_TYPE_CHAR = 1, /* characters */

[...]

+  SCM_ARRAY_ELEMENT_TYPE_C64 = 15,
+  SCM_ARRAY_ELEMENT_TYPE_LAST = 15,
+} scm_t_array_element_type;

Per C99 (and previous versions) §6.7.2.2, this is not necessary:

  If the first enumerator has no =, the value of its enumeration
  constant is 0. Each subsequent enumerator with no = defines its
  enumeration constant as the value of the constant expression obtained
  by adding 1 to the value of the previous enumeration constant.

Up-to-here, I think the changes are mostly non-controversial and would
be OK for 1.9 IMO.

> commit 66b9d7d304a349d5bb4f763a47143f09da58d97f
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Fri Jul 17 12:45:24 2009 +0200
>
>     remove enclosed arrays

Same as Neil: I'd like to avoid gratuitous incompatibilities if it can
be avoided.

> commit f45eccffa73c043466a4cc0f5037132ee5795eee
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:43:54 2009 +0200
>
>     add registry of vector constructors, make-generalized-vector
>     
>     * libguile/generalized-vectors.h:
>     * libguile/generalized-vectors.c: Add a registry of vector constructors.
>       (scm_make_generalized_vector): New public function, constructs a
>       vector of a given type.
>     
>     * libguile/bitvectors.c:
>     * libguile/bytevectors.c:
>     * libguile/srfi-4.c:
>     * libguile/strings.c:
>     * libguile/vectors.c: Register vector constructors.
>     
>     * libguile/extensions.c (scm_init_extensions): No need to NULL the list
>       of registered extensions here, the static init does it for us. Allows
>       scm_c_register_extension to be called before scm_init_extensions.

This `extensions.c' change should be in a separate commit, I think.

>     * libguile/init.c (scm_i_init_guile): Move array initialization earlier,
>       so e.g. scm_init_strings has access to a valid list of array element
>       types when registering its vector constructor.

This:

-static extension_t *registered_extensions;
+static extension_t *registered_extensions = NULL;

is not needed.

Otherwise OK.

> commit 943a0a8759504c4a367c1904bef4a8afbc6208dd
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 12:58:37 2009 +0200
>
>     make-typed-array builds backing vector via make-generalized-vector
>     
>     * libguile/arrays.c: Rework to use scm_make_generalized_vector instead
>       of our own type table.
>     
>     * libguile/bitvectors.c: Fix some includes.

OK.  (Is the latter related to the former?)

> commit ac8ed3db31769d7ede5e75fba1697e8dde55fae4
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 13:26:18 2009 +0200
>
>     any->u8vector and family now implemented in Scheme
>     
>     * module/Makefile.am:
>     * module/srfi/srfi-4/gnu.scm: New module, for extensions to srfi-4.
>       Currently defines the any->FOOvector family.
>     
>     * libguile/srfi-4.c:
>     * libguile/srfi-4.i.c: Dispatch scm_any_to_FOOvector calls to the
>       scheme-implemented functions in (srfi srfi-4 gnu).

+#define DEFINE_SCHEME_PROXY100(cname, modname, scmname)                 \

I'd remove "100"...

+#define DEFPROXY100(cname, scmname)               \
+  DEFINE_SCHEME_PROXY100 (cname, MOD, scmname)
+
+#define DEFINE_SRFI_4_GNU_PROXIES(tag)                              \
+  DEFPROXY100 (scm_any_to_##tag##vector, "any->" #tag "vector")

... and keep only one macro, as Neil suggested.

+DEFINE_SRFI_4_GNU_PROXIES (u8);

Remove extraneous semicolons.

+;;; Extensions to SRFI-4
+
+;; 	Copyright (C) 2001, 2002, 2004, 2006, 2009 Free Software Foundation, Inc.

Should be just "2009".

> commit f332089ed43761440a2a8c272ee61a709b38cc24
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 13:46:29 2009 +0200
>
>     bytevector inlinedness indicated by flag, not length
>     
>     * libguile/bytevectors.h (SCM_BYTEVECTOR_INLINE_P): Change to check a
>       flag instead of checking the length of the bytevector.
>     
>     * libguile/bytevectors.c (make_bytevector_from_buffer): Handle the len
>       <= inline threshold case as well. Set the inline flag as appropriate.
>       (make_bytevector): Updat the inline flag as appropriate.
>       (scm_c_take_bytevector): Just dispatch to make_bytevector_from_buffer.
>       (scm_i_shrink_bytevector): Update the inline flag as appropriate.
>       Update the length when shrinking an already-inlined vector.
>       (STRING_TO_UTF): Fix some indentation.

The scm_c_take_bytevector -> make_bytevector_from_buffer shuffling can
be avoided.

 static inline SCM
 make_bytevector (size_t len)
 {
-  SCM bv;
-
   if (SCM_UNLIKELY (len == 0))
-    bv = scm_null_bytevector;
+    return scm_null_bytevector;
+
+  if (SCM_BYTEVECTOR_INLINEABLE_SIZE_P (len))
+    {
+      SCM ret;
+      SCM_NEWSMOB2 (ret, scm_tc16_bytevector, len, NULL);
+      SCM_BYTEVECTOR_SET_INLINE (ret);
+      return ret;
+    }

The code initially had a single `return' statement per function, to
improve readability (IMO).  Do you think otherwise?

> commit e286c973fcd63c0930d9302cc5f1a280b9b22615
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:11:53 2009 +0200
>
>     bytevectors have "element type" field, e.g. for generalized-vector-ref
>     
>     Bytevectors have a very close relationship to other forms of uniform
>     vectors. Often you want to view a u64vector as a series of bytes, for
>     writing over a socket; or to process an incoming stream using the
>     convenient and less error-prone s16vector-ref API rather than
>     bytevector-s16-native-ref.
>     
>     The essential needs of the representation of a bytevector and an
>     s64vector are the same, so we take advantage of that and extend the
>     bytevector implementation to have a "native type" field, which defaults
>     to VU8.

This comments belong to a header (info "(standards) Change Log
Concepts").

>     This commit doesn't actually expose any user-noticeable changes,
>     however.
>     
>     * libguile/bytevectors.h (SCM_BYTEVECTOR_ELEMENT_TYPE): New internal
>       defines.
>       (scm_i_make_typed_bytevector, scm_c_take_typed_bytevector): New
>       internal functions.
>     
>     * libguile/bytevectors.c (SCM_BYTEVECTOR_SET_ELEMENT_TYPE):
>       (SCM_BYTEVECTOR_TYPE_SIZE):
>       (SCM_BYTEVECTOR_TYPED_LENGTH): New internal macros.
>       (make_bytevector, make_bytevector_from_buffer): Take an extra
>       argument, the element type. The length argument is interpreted as
>       being the number of elements, which corresponds to the number of bytes
>       in the default VU8 case. Doing it this way eliminates a class of bugs
>       -- e.g. a u32vector of length 3 bytes doesn't make sense. We do have
>       to check for another class of bugs: overflow. The length stored on the
>       bytevector itself is still the byte length, though.
>       (scm_i_make_typed_bytevector):
>       (scm_c_take_typed_bytevector): New internal functions.
>       (scm_i_shrink_bytevector): Make sure the new size is valid for the
>       bytevector's type.
>       (scm_i_bytevector_generalized_set_x): Remove this function, the
>       array-handle infrastructure takes care of this for us.
>       (print_bytevector): Print the bytevector according to its type.
>       (scm_make_bytevector, scm_bytevector_copy)
>       (scm_uniform_array_to_bytevector)
>       (scm_u8_list_to_bytevector, scm_bytevector_to_uint_list): Adapt to
>       make_bytevector extra arg.
>       (bv_handle_ref, bv_handle_set_x): Adapt to ref and set based on the
>       type of the bytevector, e.g. f64 or u8.
>       (bytevector_get_handle): Set the typed length of the vector, not the
>       byte length.
>     
>     Conflicts:
>     
>     	libguile/bytevectors.c

At this point, bytevectors are aware of all the other vector types.

In principle, I don't like this idea.  I don't have a clear picture of
the array/vector dependency graph, though.

This:

+  if (SCM_UNLIKELY (len == 0 && element_type == 0))

should be "element_type == SCM_ARRAY_ELEMENT_TYPE_SCM".

> commit cd43fdc5b7a7c851ee0f2b4e96a1f394fb50d869
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 19:03:28 2009 +0200
>
>     fix (bytevector-ieee-single-native-set! x 0 0)
>     
>     * libguile/bytevectors.c (VALIDATE_REAL): SCM_VALIDATE_REAL is not what
>       we need for checking values for bytevector-ieee-single-native-set! et
>       al, so define our own validator.
>       (IEEE754_SET, IEEE754_NATIVE_SET): Use it.

Can you please push it to `master', preferably with a test case?

> commit 6f4895577734bb107d488f31dca82d5ad4c25a65
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sun Jul 19 15:35:33 2009 +0200
>
>     reimplement srfi-4 vectors on top of bytevectors
>     
>     * libguile/srfi-4.h:
>     * libguile/srfi-4.c (scm_make_srfi_4_vector): New function, exported by
>       (srfi srfi-4 gnu).
>     * libguile/srfi-4.i.c: Removed.
>     * module/srfi/srfi-4.scm:
>     * module/srfi/srfi-4/gnu.scm: Reimplement srfi-4 vectors on top of
>       bytevectors. The implementation is mostly in Scheme now.

I appreciate the move to Scheme.  ;-)

>     * module/rnrs/bytevector.scm:
>     * libguile/bytevectors.c (scm_uniform_array_to_bytevector): No more need
>       for this, as uniform vectors are bytevectors, and we can get the
>       backing vector of an array via shared-array-root.
>     
>     * libguile/bytevectors.c (bytevector_ref_c32, bytevector_ref_c64)
>       (bytevector_set_c32, bytevector_set_c64): Fix some embarrassing bugs.
>       Still need to do an upper bounds check.

Hmm...

>     * libguile/deprecated.h: Remove deprecated array functions:
>       scm_i_arrayp, scm_i_array_ndim, scm_i_array_mem, scm_i_array_v,
>       scm_i_array_base, scm_i_array_dims, and the deprecated macros:
>       SCM_ARRAYP, SCM_ARRAY_NDIM, SCM_ARRAY_CONTP, SCM_ARRAY_MEM,
>       SCM_ARRAY_V, SCM_ARRAY_BASE, SCM_ARRAY_DIMS.
>     * libguile/deprecated.c (scm_uniform_vector_read_x)
>       (scm_uniform_vector_write, scm_uniform_array_read_x)
>       (scm_uniform_array_write): Newly deprecated functions.
>     
>     * libguile/generalized-arrays.c (scm_array_type): Remove the bytevector
>       hack. This does introduce the bug that #vu8(1 2 3) will compile to
>       #u8(1 2 3). I'm working on that.
>     
>     * libguile/objcodes.c (scm_bytecode_to_objcode, scm_objcode_to_bytecode):
>       Rework to operate on bytevectors, as scm_make_u8vector now causes a
>       module lookup, which can't be done e.g. when loading the VM boot
>       program for psyntax-pp.go on a fresh bootstrap.
>     
>     * libguile/objcodes.h (SCM_F_OBJCODE_IS_BYTEVECTOR):
>       (SCM_OBJCODE_IS_BYTEVECTOR): s/U8VECTOR/BYTEVECTOR/.
>     
>     * module/ice-9/boot-9.scm (the-scm-module): A terrible hack to pull in
>       (srfi srfi-4), as the bindings are primarily there now. We'll worry
>       about this later.

The issue is that `make-c32vector' et al., which used to be in
`the-scm-module' are now confined in `(srfi srfi-4 gnu)'.  There's a
compatibility issue here.  Perhaps it should be pulled in as well?

>     * module/language/glil/compile-assembly.scm (dump-object): Update to
>       work with array-contents and shared-array-root.
>     
>     * test-suite/tests/bytevectors.test: Remove uniform-array->bytevector
>       test.


> commit ae3543b1f27e973b25060f71f095bf23ef149aee
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Sat Jul 18 19:08:43 2009 +0200
>
>     clean up includes in vectors.[ch]
>     
>     * libguile/vectors.h:
>     * libguile/vectors.c: Clean up the includes... mostly.

"Mostly"?  What does it clean up?

+#include "libguile/arrays.h" /* Hit me with the ugly stick */

Does that mean there are circular dependencies or something like that?


Waouh, it's a lot of work, and reviewing it takes some time, too.
Honestly, I'd rather spend the small amount of time I spend on Guile
these days in other areas with higher priorities.

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-07-23 21:08 ` Ludovic Courtès
@ 2009-07-24 22:01   ` Neil Jerram
  0 siblings, 0 replies; 10+ messages in thread
From: Neil Jerram @ 2009-07-24 22:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Waouh, it's a lot of work, and reviewing it takes some time, too.
> Honestly, I'd rather spend the small amount of time I spend on Guile
> these days in other areas with higher priorities.

Hehe; yes, it certainly is a `mixed blessing' to have so many
contributions to Guile as we have at the moment.  I have my doubts too
about being able to review everything effectively, and I apologise to
people (Daniel, Mark, Julian, ...) who have to wait a long time for a
response, and may wonder if their patch has fallen into a black
hole...

But that said, I'm sure that I wouldn't have things any other way.
Guile has never stood still, during the 10 years that I've been a bit
involved, but I don't think it's ever had quite the buzz that it seems
to have at the moment.

Also, refactoring is worthwhile.  I don't have details ready to prove
it, but I'm sure that a lot of our recent work has benefitted greatly
(if unknowingly) from cleanups and refactoring that Marius did (IIRC,
affecting symbols, modules and vectors) in the run up to 1.8.

Regards,
        Neil

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-07-22 21:48 ` Neil Jerram
@ 2009-07-28 22:41   ` Andy Wingo
  2009-07-30 21:10     ` Neil Jerram
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Wingo @ 2009-07-28 22:41 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

Hi Neil,

Thanks for the review.

On Wed 22 Jul 2009 23:48, Neil Jerram <neil@ossau.uklinux.net> writes:

> I have two overall questions in mind.
>
> - What do you have in mind as regards releasing this?  Even though it
>   looks good, I think it would be better to let it mature for a while,
>   and hence not to put it into 1.9.x/2.0.  (And we're not short of new
>   stuff in 1.9.x/2.0!)

Personally I would prefer that it come out in 2.0. I'm fairly (but not
entirely) confident of its consistency as it is, and quite confident
that it is a more maintainable, and hence fixable, codebase.

The reason that I want it in is that bytevectors are a nice api for I/O,
but once you have data in memory, often it's best to treat it as a
uniform array -- be it of u8 color components, or s32 audio samples.

Uniform vectors are almost by nature "in flight" between two places. But
it is the bytevector that is most equipped to "fly", via e.g. (rnrs io
ports); but bytevectors do not provide an ideal API for manipulation.
s32vector-ref is a better API than bytevector-s32-native-ref, if only
that the offset of the former is in native units (s32 elements), and of
the second in bytes.

I guess what I wanted to do was break uniform vectors out of their type
ghettoes. Like Perlis said, "Pascal is for building pyramids --
imposing, breathtaking, static structures built by armies pushing heavy
blocks into place. Lisp is for building organisms -- imposing,
breathtaking, dynamic structures built by squads fitting fluctuating
myriads of simpler organisms into place." I should be able to put a
64-bit float into two 32-bit words, when I'm writing an instruction
sequence for a 32-bit machine, without having to change my data type --
they are just bytes anyway.

> - What about the header file incompatibilities?  I.e. the problem that
>   if there are applications out that that #include <libguile/ramap.h>,
>   or another of the renamed ones, they will stop working.  Do you
>   think we are OK just to document this well, or should we do more
>   than that?

I thought that our only support API was that provided by #include
<libguile.h>.

We can add back compatibility header files if necessary. Do you think it
is necessary?

>> The only bits I could anticipate being controversial would be the last
>> two or three patches, in which bytevectors are given an "element type"
>> field. This is so that I can make the srfi-4 uniform vector code use
>> bytevectors as their representation -- it's one less orthogonal type
>> that the VM would have to deal with.
>>
>> Those patches also allow uniform vectors of one type to be accessed
>> using accessors for other types:
>>
> What is the detailed rationale for [the following examples]?

>>   (u8vector-ref #u32(#xffffffff) 0) => 255

I ought to be able to get at the bits of a packed (uniform) vector. The
whole point of being a uniform vector is to specify a certain bit layout
of your data.

>>   (u8vector? #u32(#xffffffff)) => #f

However, we need to preserve type dispatch:

  (cond
    ((u8vector? x) ...)
    ((s8vector? x) ...)
    ...)

>>   (bytevector? #u32(#xffffffff)) => #t

This to me is like allowing (integer? 1) and (integer? 1.0); in
/essence/ a #u32 is an array of bytes, interpreted in a certain way. In
practice, one might use real? / complex? / integer? / etc to distinguish
elements of the numeric tower; in this case we have a short tower:
uniform vectors are bytevectors.

>> commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
>> Author: Andy Wingo <wingo@pobox.com>
>> Date:   Thu Jul 16 21:51:47 2009 +0200
>>
>>     remove deprecated functions from unif.c
>>     
>>     * libguile/unif.h:
>>     * libguile/unif.c: Remove deprecated functions.
>>     
>>     * module/ice-9/deprecated.scm: Remove array-related deprecated
>>       functions.
>>     
>>     * NEWS: Update.
>
> I can understand why a lot of the NEWS deltas evaporate when they are
> logically merged into the 1.8-2.0 section, but I'm not sure about
>
> - the entry for Brainfuck
>
> - the entry on scm_set_port_seek and scm_set_port_truncate
>
> Aren't these entries still of interest?
>
> The rest of this commit looks fine.

Yes they are. Good catch.

I'll finish this response, and the response to Ludovic, later -- but I
wanted to get out the rationale now. Thanks again for the review!

Andy

-- 
http://wingolog.org/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-07-28 22:41   ` Andy Wingo
@ 2009-07-30 21:10     ` Neil Jerram
  2009-08-04 12:21       ` Andy Wingo
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Jerram @ 2009-07-30 21:10 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Hi Neil,
>
> Thanks for the review.
>
> On Wed 22 Jul 2009 23:48, Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> I have two overall questions in mind.
>>
>> - What do you have in mind as regards releasing this?  Even though it
>>   looks good, I think it would be better to let it mature for a while,
>>   and hence not to put it into 1.9.x/2.0.  (And we're not short of new
>>   stuff in 1.9.x/2.0!)
>
> Personally I would prefer that it come out in 2.0. I'm fairly (but not
> entirely) confident of its consistency as it is, and quite confident
> that it is a more maintainable, and hence fixable, codebase.

I could be wrong, but I don't intuitively feel comfortable with that.
It just feels too quick/early.

On the other hand, I think this is really valuable work, and
absolutely don't want an interval of years or months before it gets
out there.

What is our release plan after 2.0?  I don't know.  I'd like something
more dynamic than the very long intervals between major releases that
we've had in the past.  But it seems there is a conflict between

- major releases being the points at which we can break the API/ABI
  (with accompanying documentation)

- wanting to have such releases more frequently than in the past, so
  that good new stuff gets out quicker

- wanting not to create grief for Guile users by changing the API/ABI
  frequently.

Is there a solution?

> The reason that I want it in is that bytevectors are a nice api for I/O,
> but once you have data in memory, often it's best to treat it as a
> uniform array -- be it of u8 color components, or s32 audio samples.
>
> Uniform vectors are almost by nature "in flight" between two places.

(Not sure I agree.  I'd say uniform vectors are mostly holding numbers
in a computation, or for plotting on a graph.)

> But
> it is the bytevector that is most equipped to "fly", via e.g. (rnrs io
> ports); but bytevectors do not provide an ideal API for manipulation.
> s32vector-ref is a better API than bytevector-s32-native-ref, if only
> that the offset of the former is in native units (s32 elements), and of
> the second in bytes.
>
> I guess what I wanted to do was break uniform vectors out of their type
> ghettoes. Like Perlis said, "Pascal is for building pyramids --
> imposing, breathtaking, static structures built by armies pushing heavy
> blocks into place. Lisp is for building organisms -- imposing,
> breathtaking, dynamic structures built by squads fitting fluctuating
> myriads of simpler organisms into place." I should be able to put a
> 64-bit float into two 32-bit words, when I'm writing an instruction
> sequence for a 32-bit machine, without having to change my data type --
> they are just bytes anyway.

That sounds to me like motivation for adding a richer API to
bytevectors (and which could all be in Scheme), not necessarily for
the deep unification of uniform and byte vectors that you have coded.

TBH, with your refactoring up to this point, I still don't have the
overall picture (arrays, uniform vectors, bitvectors, strings etc)
firmly in my head.  I'd like to do that and then reconsider your
points above.

>> - What about the header file incompatibilities?  I.e. the problem that
>>   if there are applications out that that #include <libguile/ramap.h>,
>>   or another of the renamed ones, they will stop working.  Do you
>>   think we are OK just to document this well, or should we do more
>>   than that?
>
> I thought that our only support API was that provided by #include
> <libguile.h>.

Yeah, actually (and given that Ludovic has said the same) I'm happy
with that too.  Sorry for raising unnecessary concerns!

>>>   (u8vector-ref #u32(#xffffffff) 0) => 255

Note that using #xffffffff here glosses over the endianness problem.

(I think my inclination at this point is that I'd prefer explicit
conversions.)

> I ought to be able to get at the bits of a packed (uniform) vector. The
> whole point of being a uniform vector is to specify a certain bit layout
> of your data.

Huh?  I would say it is to be able to store numbers with a given range
(or precision) efficiently, and to be able to access them efficiently
from both Scheme and C.

>>>   (u8vector? #u32(#xffffffff)) => #f
>
> However, we need to preserve type dispatch:
>
>   (cond
>     ((u8vector? x) ...)
>     ((s8vector? x) ...)
>     ...)
>
>>>   (bytevector? #u32(#xffffffff)) => #t
>
> This to me is like allowing (integer? 1) and (integer? 1.0); in
> /essence/ a #u32 is an array of bytes, interpreted in a certain way.

I think you have in mind that all uniform vectors are filled by
reading from a port, or are destined for writing out to a port.

That is an important use, but there is another one: preparing
numerical data for handling in both C and Scheme.  In this use, the
concept of an underlying array of bytes plays no part.

>>> commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
>>> Author: Andy Wingo <wingo@pobox.com>
>>> Date:   Thu Jul 16 21:51:47 2009 +0200
>>>
>>>     remove deprecated functions from unif.c
>>>     
>>>     * libguile/unif.h:
>>>     * libguile/unif.c: Remove deprecated functions.
>>>     
>>>     * module/ice-9/deprecated.scm: Remove array-related deprecated
>>>       functions.
>>>     
>>>     * NEWS: Update.
>>
>> I can understand why a lot of the NEWS deltas evaporate when they are
>> logically merged into the 1.8-2.0 section, but I'm not sure about
>>
>> - the entry for Brainfuck
>>
>> - the entry on scm_set_port_seek and scm_set_port_truncate
>>
>> Aren't these entries still of interest?
>>
>> The rest of this commit looks fine.
>
> Yes they are. Good catch.

Cool, thanks.

> I'll finish this response, and the response to Ludovic, later -- but I
> wanted to get out the rationale now. Thanks again for the review!

No problem, it was a good read!

Regards,
        Neil




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-07-30 21:10     ` Neil Jerram
@ 2009-08-04 12:21       ` Andy Wingo
  2009-08-09 16:41         ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Wingo @ 2009-08-04 12:21 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

Hi Neil,

On Thu 30 Jul 2009 23:10, Neil Jerram <neil@ossau.uklinux.net> writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> On Wed 22 Jul 2009 23:48, Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>>> I have two overall questions in mind.
>>>
>>> - What do you have in mind as regards releasing this?  Even though it
>>>   looks good, I think it would be better to let it mature for a while,
>>>   and hence not to put it into 1.9.x/2.0.  (And we're not short of new
>>>   stuff in 1.9.x/2.0!)
>>
>> Personally I would prefer that it come out in 2.0. I'm fairly (but not
>> entirely) confident of its consistency as it is, and quite confident
>> that it is a more maintainable, and hence fixable, codebase.
>
> I could be wrong, but I don't intuitively feel comfortable with that.
> It just feels too quick/early.
>
> On the other hand, I think this is really valuable work, and
> absolutely don't want an interval of years or months before it gets
> out there.
>
> What is our release plan after 2.0?  I don't know.  I'd like something
> more dynamic than the very long intervals between major releases that
> we've had in the past.  But it seems there is a conflict between
>
> - major releases being the points at which we can break the API/ABI
>   (with accompanying documentation)
>
> - wanting to have such releases more frequently than in the past, so
>   that good new stuff gets out quicker
>
> - wanting not to create grief for Guile users by changing the API/ABI
>   frequently.
>
> Is there a solution?

I don't know of one, no.

I know of two models that work: one, when you're just starting
developing a library, and downstream users are making the first cuts at
their software too, and everything is in froth and people are willing to
adapt to API or ABI changes. The library is not widely distributed, so
changes don't affect many people besides the developers, like
distributors or users.

The second model is when you already have a wide deployed base. You can
make additions to your API and ABI, and deprecated old API or ABI, but
you can't remove old API or change the ABI. Incompatible breaks are
painful, and the switching-over time is somewhere between a year and
three years. The right length of a stable series seems to be about 4 or
5 years.

So in the second condition, where Guile seems to be, we need to mostly
preserve API and ABI, though we can remove the deprecated bits every few
years. But new API or ABI has to be accompanied with lots of thought,
because you have to support it for 5 years or more.

Dunno, I'm babbling, but the thing is that I feel like if there are
changes that need making, we should make them now. Like Mark's %nil
work. My perception is that we won't have another chance for another few
years.

Unless of course, distros miss 2.0 altogether, like Python has done with
3.0 and 3.1... We could do that. Seems like needless churn, but perhaps
it's necessary to get the wider exposure.

>> The reason that I want it in is that bytevectors are a nice api for I/O,
>> but once you have data in memory, often it's best to treat it as a
>> uniform array -- be it of u8 color components, or s32 audio samples.
>>
>> Uniform vectors are almost by nature "in flight" between two places.
>
> (Not sure I agree.  I'd say uniform vectors are mostly holding numbers
> in a computation, or for plotting on a graph.)

But how do you plot? If you use some sort of external software, you have
two options: code your plotting in C, and loop over the data with the C
API. Or do it in Scheme, and... loop over the s16vector, writing each
sample individually? How do you get at the bits of the s16vector so they
can be written to a port? Use the impoverished uniform-vector-write ?

(rnrs bytevectors) combined with (rnrs io ports) is the best way to get
numeric data into and out of a process, from Scheme. But -- the uniform
vector API is the best API for dealing with that data from Scheme.

> That sounds to me like motivation for adding a richer API to
> bytevectors (and which could all be in Scheme), not necessarily for
> the deep unification of uniform and byte vectors that you have coded.

There is e.g.:

   (bytevector-s16-native-ref bv n)

or

   (bytevector-s16-ref bv n (endianness big))

But that `n' is in bytes, not in elements. If you really want to treat
the bytevector as a numeric array, you're better off with the SRFI-4
API. It is a better API. There's no reason why the SRFI-4 API could not
apply to bytevectors:

   (s16vector-ref bv n) == (bytevector-s16-native-ref bv (* n 2))

Also, only srfi-4 vectors have a read syntax like #s16(1 2 3). You can't
express that with bytevectors, because you would have to encode the
endianness into your source file.

> TBH, with your refactoring up to this point, I still don't have the
> overall picture (arrays, uniform vectors, bitvectors, strings etc)
> firmly in my head.  I'd like to do that and then reconsider your
> points above.

There are two things. One is a generic API for accessing arrays, using
array handles. The second is a rebase of srfi-4 vectors on top of
bytevectors.

>>>>   (u8vector-ref #u32(#xffffffff) 0) => 255
>
> Note that using #xffffffff here glosses over the endianness problem.

Of course. Fortunately there is a sensible interpretation -- that the
u32vector is in native-endianness. The alternative is this:

   (let ((bv (make-bytevector 4)))
     (bytevector-u32-native-set! bv 0 #xffffffff)
     (bytevector-u8-ref bv 0))

Which is actually less efficient. You could of course do:

   (bytevector-u8-ref #u32(#xffffffff) 0) => 255
   (bytevector-u8-ref #u32(#x01234567) 0) => ?

if that would be your preference; the latter answer is just as
endianness-dependent as if you used the `let' idiom above to ref the
value.

> (I think my inclination at this point is that I'd prefer explicit
> conversions.)

When it matters, I would think that the bytevector API is sufficiently
explicit for anyone. Note that referencing values that are more than 8
bits wide have two flavors:

  bytevector-s16-ref bv n endianness
  bytevector-s16-native-ref bv n

So you have all the power available to you.

Or... we could indeed prohibit (u8vector-ref #u32(0) 0). But there
doesn't seem to be a point. Why bother?

>> I ought to be able to get at the bits of a packed (uniform) vector. The
>> whole point of being a uniform vector is to specify a certain bit layout
>> of your data.
>
> Huh?  I would say it is to be able to store numbers with a given range
> (or precision) efficiently, and to be able to access them efficiently
> from both Scheme and C.

Note that for access, an f64vector is almost certainly less efficient
than a Scheme vector of reals, from Scheme, due to the need to
heap-allocate the f64 values as you ref them.

I've written lots of code that deals with srfi-4 vectors. I have three
kinds of use cases. First is data being shoved around in a
dynamically-typed system: dbus messages, gconf values, a system we 
at work, etc. Second, but related, is dealing with chunks of data that
come from elsewhere, like GDK pixbufs, or GStreamer buffers. Third is
hacking compilers, as in Guile itself, or emitting machine code for
other machines.

In all of these cases, the data doesn't just stay in Guile. It either
comes from somewhere else or ends up going somewhere else. The semantics
that are implemented in this patch set actually help all of these cases,
and make Scheme more powerful -- it's not just C any more that can get
at the bits of an array. It allows me to code less in C and more in
Scheme.

>>>>   (u8vector? #u32(#xffffffff)) => #f
>>
>> However, we need to preserve type dispatch:
>>
>>   (cond
>>     ((u8vector? x) ...)
>>     ((s8vector? x) ...)
>>     ...)
>>
>>>>   (bytevector? #u32(#xffffffff)) => #t
>>
>> This to me is like allowing (integer? 1) and (integer? 1.0); in
>> /essence/ a #u32 is an array of bytes, interpreted in a certain way.
>
> I think you have in mind that all uniform vectors are filled by
> reading from a port, or are destined for writing out to a port.
>
> That is an important use, but there is another one: preparing
> numerical data for handling in both C and Scheme.  In this use, the
> concept of an underlying array of bytes plays no part.

You are correct. But the other use cases I mentioned are no less valid.

In summary... I don't mean to be a bore, but I really don't like the
existing unif.c and srfi-4.c. They are painful to understand and to hack
on. I think those bits should be merged.

I also think that srfi-4 vectors should be implemented in terms of
bytevectors, for the reasons above. If you really want to, we can
prohibit u8vector-ref from operating on u32vectors, but that seems
unnecessary to me. I also think that the behavior as implemented in
wip-array-refactor should go in, for 2.0 -- because we just won't have
another chance in the next few years. Not enough testing isn't really a
valid concern IMO, because how else is it going to get testing?

But I do appreciate your input, and decisions.

Cheers,

Andy
-- 
http://wingolog.org/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-08-04 12:21       ` Andy Wingo
@ 2009-08-09 16:41         ` Ludovic Courtès
  2009-08-12 22:03           ` Andy Wingo
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2009-08-09 16:41 UTC (permalink / raw)
  To: guile-devel

Hi Andy!

Andy Wingo <wingo@pobox.com> writes:

> The second model is when you already have a wide deployed base. You can
> make additions to your API and ABI, and deprecated old API or ABI, but
> you can't remove old API or change the ABI. Incompatible breaks are
> painful, and the switching-over time is somewhere between a year and
> three years. The right length of a stable series seems to be about 4 or
> 5 years.

I'm in favor of sticking to this model, i.e., paying attention to both
source and binary compatibility.  That sounds important to me as Guile
is an old piece of software for which users may expect a relative
stability and clear upgrade path when that is needed.

>> (Not sure I agree.  I'd say uniform vectors are mostly holding numbers
>> in a computation, or for plotting on a graph.)
>
> But how do you plot? If you use some sort of external software, you have
> two options: code your plotting in C, and loop overx the data with the C
> API. Or do it in Scheme, and... loop over the s16vector, writing each
> sample individually? How do you get at the bits of the s16vector so they
> can be written to a port? Use the impoverished uniform-vector-write ?

My feeling is that if you plot using, say, GNUplot or Ploticus, you give
them ASCII-formatted numbers, so SRFI-4 is pointless; in other cases, I
suspect the application may use higher-level data types than u8vectors,
which means some conversion is needed anyway when exporting data
bit-wise.

> I've written lots of code that deals with srfi-4 vectors. I have three
> kinds of use cases. First is data being shoved around in a
> dynamically-typed system: dbus messages, gconf values, a system we 
> at work, etc. Second, but related, is dealing with chunks of data that
> come from elsewhere, like GDK pixbufs, or GStreamer buffers. Third is
> hacking compilers, as in Guile itself, or emitting machine code for
> other machines.

My feeling is that the 1st and 3rd use cases are what bytevectors were
written for in the first place.  They allow applications to avoid
reimplementing number serialization as in `write-bytecode' in
`assembly/compile/bytecode.scm'.

SRFI-4 is a good fit for the 2nd use case as you're dealing with
fixed-width native-endianness numbers coming from C code.  But in this
case, I don't think bytevectors are needed at all.

> In summary... I don't mean to be a bore, but I really don't like the
> existing unif.c and srfi-4.c. They are painful to understand and to hack
> on. I think those bits should be merged.

Agreed.

> I also think that srfi-4 vectors should be implemented in terms of
> bytevectors, for the reasons above.

I'm not convinced, but OTOH, I don't think it hurts.

Like Neil, the one thing that I'm not fond of is the switch from
disjoint SRFI-4 types to polymorphic types, because programming errors
that currently yield a `wrong-type-arg' error will be silently ignored.
The SRFI text allows it, but the rationale says that "the use of
homogeneous vectors allows certain errors to be caught earlier."

Thanks,
Ludo'.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-08-09 16:41         ` Ludovic Courtès
@ 2009-08-12 22:03           ` Andy Wingo
  2009-08-13  9:16             ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Wingo @ 2009-08-12 22:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hello Ludovic :)

On Sun 09 Aug 2009 18:41, ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> The second model is when you already have a wide deployed base. You can
>> make additions to your API and ABI, and deprecated old API or ABI, but
>> you can't remove old API or change the ABI. Incompatible breaks are
>> painful, and the switching-over time is somewhere between a year and
>> three years. The right length of a stable series seems to be about 4 or
>> 5 years.
>
> I'm in favor of sticking to this model, i.e., paying attention to both
> source and binary compatibility.  That sounds important to me as Guile
> is an old piece of software for which users may expect a relative
> stability and clear upgrade path when that is needed.

I agree.

>> I've written lots of code that deals with srfi-4 vectors. I have three
>> kinds of use cases. First is data being shoved around in a
>> dynamically-typed system: dbus messages, gconf values, a system we 
>> at work, etc. Second, but related, is dealing with chunks of data that
>> come from elsewhere, like GDK pixbufs, or GStreamer buffers. Third is
>> hacking compilers, as in Guile itself, or emitting machine code for
>> other machines.
>
> My feeling is that the 1st and 3rd use cases are what bytevectors were
> written for in the first place.

Agreed.

> SRFI-4 is a good fit for the 2nd use case as you're dealing with
> fixed-width native-endianness numbers coming from C code.

Agreed, modulo the possibility for this data to be embedded within some
other stream.

> But in this case, I don't think bytevectors are needed at all.

I think they are needed whenever you want to *do* something with this
data -- i/o for example.

>> In summary... I don't mean to be a bore, but I really don't like the
>> existing unif.c and srfi-4.c. They are painful to understand and to hack
>> on. I think those bits should be merged.
>
> Agreed.

OK, I'll see about merging up until the polymorphic change after the
release.

>> I also think that srfi-4 vectors should be implemented in terms of
>> bytevectors, for the reasons above.
>
> I'm not convinced, but OTOH, I don't think it hurts.
>
> Like Neil, the one thing that I'm not fond of is the switch from
> disjoint SRFI-4 types to polymorphic types, because programming errors
> that currently yield a `wrong-type-arg' error will be silently ignored.
> The SRFI text allows it, but the rationale says that "the use of
> homogeneous vectors allows certain errors to be caught earlier."

OK. Hopefully when I do my merge, the advantages/disadvantages of the
various approaches will be more clear to all of us (including myself).

Cheers,

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: review/merge request: wip-array-refactor
  2009-08-12 22:03           ` Andy Wingo
@ 2009-08-13  9:16             ` Ludovic Courtès
  0 siblings, 0 replies; 10+ messages in thread
From: Ludovic Courtès @ 2009-08-13  9:16 UTC (permalink / raw)
  To: guile-devel

Hi Andy!

Andy Wingo <wingo@pobox.com> writes:

> On Sun 09 Aug 2009 18:41, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Andy Wingo <wingo@pobox.com> writes:

[...]

>>> I've written lots of code that deals with srfi-4 vectors. I have three
>>> kinds of use cases. First is data being shoved around in a
>>> dynamically-typed system: dbus messages, gconf values, a system we 
>>> at work, etc. Second, but related, is dealing with chunks of data that
>>> come from elsewhere, like GDK pixbufs, or GStreamer buffers. Third is
>>> hacking compilers, as in Guile itself, or emitting machine code for
>>> other machines.

[...]

>> SRFI-4 is a good fit for the 2nd use case as you're dealing with
>> fixed-width native-endianness numbers coming from C code.
>
> Agreed, modulo the possibility for this data to be embedded within some
> other stream.
>
>> But in this case, I don't think bytevectors are needed at all.
>
> I think they are needed whenever you want to *do* something with this
> data -- i/o for example.

In the 2nd use case (GDK pixbufs, GStreamer buffers), I suppose you
don't do I/O with the data; it just travels back and forth between C and
Scheme code.  Is this correct?

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-08-13  9:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-19 13:59 review/merge request: wip-array-refactor Andy Wingo
2009-07-22 21:48 ` Neil Jerram
2009-07-28 22:41   ` Andy Wingo
2009-07-30 21:10     ` Neil Jerram
2009-08-04 12:21       ` Andy Wingo
2009-08-09 16:41         ` Ludovic Courtès
2009-08-12 22:03           ` Andy Wingo
2009-08-13  9:16             ` Ludovic Courtès
2009-07-23 21:08 ` Ludovic Courtès
2009-07-24 22:01   ` Neil Jerram

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).