From: Mark H Weaver <mhw@netris.org>
To: Daniel Llorens <daniel.llorens@bluewin.ch>
Cc: guile-devel@gnu.org
Subject: Re: vectors are something else
Date: Tue, 16 Apr 2013 02:19:36 -0400 [thread overview]
Message-ID: <877gk32e13.fsf@tines.lan> (raw)
In-Reply-To: <9282A433-1C3C-4D65-BDCB-9824F9227815@bluewin.ch> (Daniel Llorens's message of "Tue, 16 Apr 2013 06:10:21 +0200")
Daniel Llorens <daniel.llorens@bluewin.ch> writes:
> On Apr 16, 2013, at 04:00, Mark H Weaver wrote:
>
>>> [1] have vector- ops only accept vector- types. This removes container
>>> type dispatch from the vector- ops but not from the array- ops. The
>>> last patch set I've sent to the list goes in this direction. It
>>> matches the behavior of other vector-like types such as bytevectors
>>> and strings.
>>
>> I very much favor this approach, but unfortunately I don't think we can
>> do it in 2.0. It'll have to wait until 2.2.
>
> What should we do then about the consistency bugs? like vector? being
> #f on something that is accepted by vector-ref and so on.
I don't know.
> Did you have a look at the table I posted?
Yes, I did, and generally I strongly prefer column [1], but we have to
be very careful what changes we make in a stable release series that
might break compatibility with existing code. I find this frustrating
as well, which is one reason why I agree with Andy that shifting focus
to 2.2 soon would be a good idea.
>> I consider this option completely unacceptable. Arrays are much more
>> complex than vectors, and will inevitably be *much* slower in the common
>> case where the compiler doesn't know much.
>
> ...
>
>> Most egregiously,
>> 'make-shared-array' requires that array references apply an arbitrary
>> affine transformation that's unknown at compile time, which means
>> multiplications and additions for each index.
>
> make-shared-array itself is a very slow function and I'd like to have
> some alternatives, but it's only called when setting up the shared
> array. In any case, it's not slow because of any multiplications, but
> because of how much it conses.
I think you misunderstood me. I'm not talking about the cost of the
'make-shared-array' call itself. I'm talking about the fact that in
order to support arrays that might have been created using
'make-shared-array', things like 'array-ref' must apply an arbitrary
affine transformation to the indices in order to compute the offset into
the linear storage.
> You overestimate (by a lot) the cost of the index computation. One
> works on arrays by looping over them. In these loops the strides are
> always constant. No matter the rank of the array, in the inner loop
> only one index moves. The cost of this stepping is negligible on any
> scenario that isn't hugely contrived.
Here you're assuming either a single core procedure that loops over the
entire array (e.g. something like 'array-map!'), or a Scheme loop that
the compiler is able to analyze in detail. Of course, when we have a
fancy native compiler, and if you write your Scheme carefully, you could
certainly arrange to make this optimization possible.
However, there's a lot of Scheme code that uses vectors and is not coded
with such concerns in mind. In such code, you might have a top-level
procedure that receives an argument and applies 'vector-ref' to it. In
such cases, if all vectors are actually arrays, the compiler cannot
assume anything about what kind of array is passed in. Such an array
might have been created by 'make-shared-array', and thus might use an
arbitrary affine transformation. If 'vector-ref' can be used on such an
array, then this procedure would have to fetch the stride and offset and
do the multiplication each time. If there's a loop that calls this
top-level procedure, then the compiler can't do anything with this.
In the context of Scheme, with no static type system, with most
procedures accessed via mutable top-level variables, and where programs
are often written in terms of higher-order procedures, analyzing
programs can be very difficult or impossible. Therefore, it's a bad
idea to have complicated primitives that are slow in the general case,
in the hope that a fancy compiler will be able to analyze entire loops
and make them fast through non-trivial analyses.
>> Don't get me wrong: I'm all in favor of providing flexible, generic,
>> extensible procedures. They have their place. I think that's why
>> there's still a justification for keeping 'generalized-vectors' around.
>
> They didn't do anything that the array functions didn't do just as
> well.
Well, generalized vector accessors don't have to cope with multiple
dimensions, nor do they require performing multiplications for most of
the supported vector types.
> Plus they were buggy —they are buggy in stable-2.0.
Bugs are not a good reason to remove an API.
> And the names took half the screen.
This is a weak argument. You can always make shorter names for them.
Having said all this, I haven't yet thought much about this issue, and
don't (yet) have a strong opinion on whether the generic vector
procedures should be kept.
>> However, efficiency demands that we also have simple, non-generic,
>> non-extensible procedures as well. IMO, we should keep the vector,
>> bitvector, and bytevector procedures as simple as possible.
>
> Other than the bizarre idea that arrays have to be slow, I think this
> is a sensible position. After all, arrays have both an array
> descriptor and a reference to linear storage, so there must be a type
> for this linear storage. (E.g. in stable-2.0, vector- operations take
> either arrays, in a buggy way, or the underlying storage type, which
> is called 'simple vector' and isn't exposed to Scheme.)
>
> What's your take on Daniel Hartwig's position? For example a and b on
> the table I posted —b is an array, but it prints as #(2 4). How do you
> justify forbidding vector-ref for something that prints #(2 4) ?
I don't think that arrays should ever be printed as #(2 4).
Thanks for the discussion.
Regards,
Mark
next prev parent reply other threads:[~2013-04-16 6:19 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <mailman.1288755.1365813667.854.guile-devel@gnu.org>
2013-04-15 11:29 ` vectors are something else Daniel Llorens
2013-04-15 12:28 ` Daniel Hartwig
2013-04-15 14:08 ` Daniel Llorens
2013-04-15 14:17 ` Daniel Hartwig
2013-04-15 14:10 ` vector types poll Daniel Llorens
2013-04-15 22:47 ` Daniel Hartwig
2013-04-16 0:16 ` Daniel Llorens
2013-04-16 2:00 ` vectors are something else Mark H Weaver
2013-04-16 4:10 ` Daniel Llorens
2013-04-16 6:19 ` Mark H Weaver [this message]
2013-04-16 8:31 ` Daniel Llorens
2013-04-17 15:29 ` Mutable top-level bindings (was: vectors are something else) Chris K. Jester-Young
2013-04-17 17:53 ` Mutable top-level bindings Mark H Weaver
2013-04-17 20:25 ` Ian Price
2013-04-20 14:00 ` Ludovic Courtès
[not found] <mailman.197.1365782461.8676.guile-devel@gnu.org>
2013-04-12 23:12 ` vectors are something else Daniel Llorens
[not found] <mailman.1287634.1365761713.854.guile-devel@gnu.org>
2013-04-12 12:37 ` Daniel Llorens
2013-04-12 14:06 ` Daniel Hartwig
2013-04-13 0:40 ` Daniel Llorens
2013-04-10 23:07 Daniel Llorens
2013-04-11 7:29 ` Daniel Llorens
2013-04-11 23:53 ` Daniel Hartwig
2013-04-12 7:23 ` Daniel Llorens
2013-04-12 10:15 ` Daniel Hartwig
2013-04-12 10:41 ` Daniel Hartwig
2013-04-12 21:43 ` Mark H Weaver
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877gk32e13.fsf@tines.lan \
--to=mhw@netris.org \
--cc=daniel.llorens@bluewin.ch \
--cc=guile-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).