* [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Ludovic Courtès @ 2009-09-01 0:14 UTC (permalink / raw)
To: guile-devel
Hello!
Stringbufs and bytevectors are now always "inlined" in the BDW-GC
branch [0, 1]: there is no longer any cell->buffer indirection. This
greatly simplifies the code, takes less room, and may slightly improve
performance.
The `scm_take_' functions for strings/symbols/bytevectors are now
essentially aliases for the corresponding `scm_from_' functions,
because we cannot advantageously reuse the provided storage.
Should these functions be deprecated or discouraged?
Thanks,
Ludo'.
[0] http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=ba54a2026beaadb4e7566d4b9e2c9e4c7cd793e6
[1] http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=0665b3ffcb7ec5232a51ff632a818a638dfd4054
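For readers unfamiliar with the distinction, the two layouts discussed in the message above can be sketched in plain C. This is only an illustration under stated assumptions, not Guile's actual code: the names `indirect_buf`, `inline_buf` and `counted_malloc` are hypothetical, and `malloc' stands in for the collector's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts allocations so the two layouts can be compared.  */
static int allocations;

static void *
counted_malloc (size_t size)
{
  allocations++;
  return malloc (size > 0 ? size : 1);
}

/* Hypothetical indirect layout: the header points at a separate buffer.  */
struct indirect_buf
{
  size_t len;
  char *data;
};

/* Hypothetical inlined layout: header and contents share one block.  */
struct inline_buf
{
  size_t len;
  char data[];                  /* C99 flexible array member */
};

/* Two allocations: one for the header, one for the contents.  */
static struct indirect_buf *
indirect_buf_new (const char *src, size_t len)
{
  struct indirect_buf *b = counted_malloc (sizeof *b);
  if (b == NULL)
    return NULL;
  b->data = counted_malloc (len);
  if (b->data == NULL)
    {
      free (b);
      return NULL;
    }
  b->len = len;
  memcpy (b->data, src, len);
  return b;
}

/* One allocation: the contents live right after the header.  */
static struct inline_buf *
inline_buf_new (const char *src, size_t len)
{
  struct inline_buf *b = counted_malloc (sizeof *b + len);
  if (b == NULL)
    return NULL;
  b->len = len;
  memcpy (b->data, src, len);
  return b;
}
```

In the inlined layout there is no `data' pointer that could be redirected at a caller-supplied buffer, so a "take" operation has nothing to adopt and ends up copying, just like "from".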
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Mike Gran @ 2009-09-01 0:48 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:
> Hello!
>
> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).
Neat!
>
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.
>
> Should these functions be deprecated or discouraged?
>
codesearch.google.com says that the `scm_take_' functions aren't often
used by other projects, but they are used by LilyPond. I think that's
reason enough to leave them in. I'd vote for keeping them and adjusting
the docs to say something like
    Like `scm_from_locale_string' and `scm_from_locale_stringn',
    respectively, but also immediately frees STR after creating
    the Guile string.
Or something like that.
-Mike
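The documentation wording proposed above amounts to "copy, then free". A minimal sketch of that contract in plain C follows. It assumes a toy string object rather than Guile's real representation: `toy_string`, `toy_from_stringn` and `toy_take_stringn` are hypothetical names chosen for illustration, and `malloc' stands in for the collector's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inlined string object: length followed by the bytes.  */
struct toy_string
{
  size_t len;
  char data[];
};

/* "from": copy LEN bytes of STR into a fresh object; STR stays with
   the caller.  */
static struct toy_string *
toy_from_stringn (const char *str, size_t len)
{
  struct toy_string *s = malloc (sizeof *s + len);
  if (s == NULL)
    return NULL;
  s->len = len;
  memcpy (s->data, str, len);
  return s;
}

/* "take": same result as "from", but ownership of STR passes to the
   callee, which frees it immediately after copying.  STR must have
   been obtained from malloc.  */
static struct toy_string *
toy_take_stringn (char *str, size_t len)
{
  struct toy_string *s = toy_from_stringn (str, len);
  free (str);
  return s;
}
```

Under this contract existing callers keep working unchanged: they still hand over a malloc'd buffer and must not touch it afterwards, whether or not the callee reuses the storage.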
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Ludovic Courtès @ 2009-09-01 8:20 UTC (permalink / raw)
To: guile-devel
Hi,
Mike Gran <spk121@yahoo.com> writes:
> On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:
[...]
>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>>
>> Should these functions be deprecated or discouraged?
>>
>
> codesearch.google.com says that scm_take_ isn't often used by other
> projects, but, it is used by lilypond. I think that's reason enough to
> leave it in. I'd vote for keeping them and adjusting the docs to say
> something like
>
> Like `scm_from_locale_string' and `scm_from_locale_stringn',
> respectively, but also immediately frees STR after creating
> the Guile string.
>
> Or something like that.
Of course, I meant "keep them, but possibly move them into
{discouraged,deprecated}.c". Your doc suggestion also looks good to me.
Thanks,
Ludo'.
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Neil Jerram @ 2009-09-08 23:54 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
ludo@gnu.org (Ludovic Courtès) writes:
> Hello!
Hi!
> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).
>
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.
That seems a bit of a shame. (i.e. that we can't advantageously keep
the caller's string or vector data)
Did you consider the option of
- always having an indirection from the stringbuf/bytevector object to
  the underlying data
- optimizing the scm_from_... case by doing a single
  scm_gc_malloc_pointerless (), and making the "underlying data
  pointer" point into the same malloc'd block.
The first point should allow a similar simplification of the code as
you have in your commits - by not having to handle both the inline and
indirected cases everywhere - but the indirection would allow us to
keep meaningful scm_take_... functions.
Neil
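The option described in the message above can be sketched as follows. This is a rough model of the idea only, not a proposal for the actual Guile data structures: `bv`, `bv_from`, `bv_take` and `bv_free` are hypothetical names, and `malloc' stands in for the single scm_gc_malloc_pointerless () call mentioned in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bytevector header that is always read through `data'.  */
struct bv
{
  size_t len;
  unsigned char *data;   /* just past the header, or a taken buffer */
  int owns_external;     /* nonzero when `data' came from the caller */
};

/* "from" case: a single allocation holds header and contents, and
   `data' points into that same block.  */
static struct bv *
bv_from (const unsigned char *src, size_t len)
{
  struct bv *b = malloc (sizeof *b + len);
  if (b == NULL)
    return NULL;
  b->len = len;
  b->data = (unsigned char *) (b + 1);
  b->owns_external = 0;
  memcpy (b->data, src, len);
  return b;
}

/* "take" case: only the header is allocated; the caller's malloc'd
   buffer is adopted without copying.  */
static struct bv *
bv_take (unsigned char *buf, size_t len)
{
  struct bv *b = malloc (sizeof *b);
  if (b == NULL)
    return NULL;
  b->len = len;
  b->data = buf;
  b->owns_external = 1;
  return b;
}

/* Release the object, and the adopted buffer if there is one.  */
static void
bv_free (struct bv *b)
{
  if (b == NULL)
    return;
  if (b->owns_external)
    free (b->data);
  free (b);
}
```

Code that reads the contents only ever dereferences `data', so it does not need separate paths for the inline and indirected cases.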
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Ludovic Courtès @ 2009-09-09 8:03 UTC (permalink / raw)
To: guile-devel
Hi Neil!
Neil Jerram <neil@ossau.uklinux.net> writes:
> ludo@gnu.org (Ludovic Courtès) writes:
>> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
>> branch [0, 1], which means that there's no cell->buffer indirection,
>> which greatly simplifies code (it also takes less room and may slightly
>> improve performance).
>>
>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>
> That seems a bit of a shame. (i.e. that we can't advantageously keep
> the caller's string or vector data)
It’s not such a shame IMO because:
  * You have to allocate anyway, to store the (double) cell, and
    allocating the whole thing may be just as costly as allocating the
    cell, at least for small stringbufs/bytevectors.
  * For stringbufs, the user-provided buffer can be reused only if it’s
    either Latin-1 or UCS-4, anyway.
  * Removing the indirection and using only GC-managed memory is
    beneficial for Scheme code (which doesn’t use ‘scm_take’).
  * Reusing the malloc(3)-allocated buffer means that we have to
    register a finalizer to later free(3) that buffer (see, e.g., commit
    d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see, e.g.,
    http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).
That said...
> Did you consider the option of
>
> - always having an indirection from the stringbuf/bytevector object to
> the underlying data
... this may be valuable (Andy pointed it out as well), at least for
bytevectors. The indirection is a requirement for Andy’s
SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
be supported; it’s also required if we want to provide mmap(3) bindings,
for instance, that return a bytevector.
For stringbufs, though, I’m happy if we can leave the code as it is.
Thanks,
Ludo’.
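The finalizer point and the mmap(3) use case in the message above both come down to the same thing: an object that adopts external memory has to remember how to give it back. A minimal plain-C model of that is sketched below. Everything in it is an assumption made for illustration: `ext_bv`, `release_fn`, `ext_bv_take` and `ext_bv_collect` are hypothetical names, and `ext_bv_collect' merely plays the role of the collector running a finalizer; no real GC is involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical release hook, playing the role of a finalizer.  */
typedef void (*release_fn) (void *buf, size_t len);

/* Hypothetical bytevector that refers to externally owned memory.  */
struct ext_bv
{
  size_t len;
  unsigned char *data;
  release_fn release;    /* NULL when nothing has to be released */
};

/* Counts hook invocations, for demonstration only.  */
static int releases_run;

/* Hook for buffers that came from malloc.  */
static void
release_with_free (void *buf, size_t len)
{
  (void) len;
  free (buf);
  releases_run++;
}

/* Adopt BUF without copying and remember how to release it.  */
static struct ext_bv *
ext_bv_take (unsigned char *buf, size_t len, release_fn release)
{
  struct ext_bv *b = malloc (sizeof *b);
  if (b == NULL)
    return NULL;
  b->len = len;
  b->data = buf;
  b->release = release;
  return b;
}

/* Stand-in for what a collector does once the object is unreachable:
   run the hook, then reclaim the header.  */
static void
ext_bv_collect (struct ext_bv *b)
{
  if (b == NULL)
    return;
  if (b->release != NULL)
    b->release (b->data, b->len);
  free (b);
}
```

A mapping obtained from mmap(3) would use a hook that calls munmap(2) with the stored length instead of free(3). The per-object hook is the extra bookkeeping that the inlined, GC-only layout avoids.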
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Neil Jerram @ 2009-09-09 21:38 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
ludo@gnu.org (Ludovic Courtès) writes:
> It’s not such a shame IMO because:
>
> * You have to allocate anyway, to store the (double) cell, and
> allocating the whole thing may be just as costly as allocating the
> cell, at least for small stringbufs/bytevectors.
>
> * For stringbufs, the user-provided buffer can be reused only if it’s
> either Latin-1 or UCS-4, anyway.
>
> * Removing the indirection and using only GC-managed memory is
> beneficial for Scheme code (which doesn’t use ‘scm_take’).
>
> * Reusing the malloc(3)-allocated buffer means that we have to
> register a finalizer to later free(3) that buffer (see, e.g., commit
> d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see, e.g.,
> http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).
All good points.
> That said...
>
>> Did you consider the option of
>>
>> - always having an indirection from the stringbuf/bytevector object to
>> the underlying data
>
> ... this may be valuable (Andy pointed it out as well), at least for
> bytevectors. The indirection is a requirement for Andy’s
> SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
> be supported; it’s also required if we want to provide mmap(3) bindings,
> for instance, that return a bytevector.
OK, cool. It was actually large bytevectors that I was mostly
thinking about, and IIUC it sounds quite likely that we will end up
keeping meaningful scm_take_... functions there.
> For stringbufs, though, I’m happy if we can leave the code as it is.
Yes, fine. For stringbufs reallocating feels less painful, especially
given the encoding restriction.
Thanks!
Neil