unofficial mirror of guile-devel@gnu.org 
* [BDW-GC] "Inlined" storage; `scm_take_' functions
@ 2009-09-01  0:14 Ludovic Courtès
  2009-09-01  0:48 ` Mike Gran
  2009-09-08 23:54 ` Neil Jerram
  0 siblings, 2 replies; 6+ messages in thread
From: Ludovic Courtès @ 2009-09-01  0:14 UTC (permalink / raw)
  To: guile-devel

Hello!

Stringbufs and bytevectors are now always "inlined" in the BDW-GC
branch [0, 1], which means that there's no cell->buffer indirection,
which greatly simplifies code (it also takes less room and may slightly
improve performance).
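
The difference between an indirected and an "inlined" buffer can be sketched in plain C.  This is only an illustration: the struct and function names are hypothetical, malloc(3) stands in for the GC allocator, and nothing here reproduces libguile's actual stringbuf or bytevector layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical indirected layout: header and data live in two blocks.  */
struct indirect_buf
{
  size_t len;
  char *data;                   /* points to a second allocation */
};

/* Hypothetical inlined layout: the data follows the header directly.  */
struct inline_buf
{
  size_t len;
  char data[];                  /* flexible array member, same block */
};

/* Two allocations, and one extra pointer dereference on every access.  */
struct indirect_buf *
make_indirect (const char *src, size_t len)
{
  assert (src != NULL);
  struct indirect_buf *b = malloc (sizeof *b);
  if (b == NULL)
    return NULL;
  b->data = malloc (len + 1);
  if (b->data == NULL)
    {
      free (b);
      return NULL;
    }
  memcpy (b->data, src, len);
  b->data[len] = '\0';
  b->len = len;
  return b;
}

/* One allocation; the characters sit right after the length field.  */
struct inline_buf *
make_inline (const char *src, size_t len)
{
  assert (src != NULL);
  struct inline_buf *b = malloc (sizeof *b + len + 1);
  if (b == NULL)
    return NULL;
  memcpy (b->data, src, len);
  b->data[len] = '\0';
  b->len = len;
  return b;
}
```

In the inlined variant there is no pointer field that could refer to caller-provided storage, which is why a constructor for it can only copy.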

The `scm_take_' functions for strings/symbols/bytevectors are now
essentially aliases to the corresponding `scm_from_' because we cannot
advantageously reuse the provided storage.

Should these functions be deprecated or discouraged?

Thanks,
Ludo'.

[0] http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=ba54a2026beaadb4e7566d4b9e2c9e4c7cd793e6
[1] http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=0665b3ffcb7ec5232a51ff632a818a638dfd4054





* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
  2009-09-01  0:14 [BDW-GC] "Inlined" storage; `scm_take_' functions Ludovic Courtès
@ 2009-09-01  0:48 ` Mike Gran
  2009-09-01  8:20   ` Ludovic Courtès
  2009-09-08 23:54 ` Neil Jerram
  1 sibling, 1 reply; 6+ messages in thread
From: Mike Gran @ 2009-09-01  0:48 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:
> Hello!
> 
> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).

Neat!

> 
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.
> 
> Should these functions be deprecated or discouraged?
> 

codesearch.google.com says that the scm_take_ functions aren't often used
by other projects, but they are used by LilyPond.  I think that's reason
enough to leave them in.  I'd vote for keeping them and adjusting the docs
to say something like

     Like `scm_from_locale_string' and `scm_from_locale_stringn',
     respectively, but also immediately frees STR after creating
     the Guile string.

Or something like that.
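
The semantics described in that proposed wording ("copy, then free STR") can be sketched as follows.  This is a toy model under stated assumptions: `toy_string', `toy_from_stringn' and `toy_take_stringn' are hypothetical names, not libguile functions, and malloc(3) stands in for the GC allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an inlined string object.  */
struct toy_string
{
  size_t len;
  char chars[];
};

/* `scm_from_'-style constructor: copies LEN bytes from STR.
   The caller keeps ownership of STR.  */
struct toy_string *
toy_from_stringn (const char *str, size_t len)
{
  assert (str != NULL);
  struct toy_string *s = malloc (sizeof *s + len);
  if (s == NULL)
    return NULL;
  memcpy (s->chars, str, len);
  s->len = len;
  return s;
}

/* `scm_take_'-style constructor when storage is inlined: it cannot
   adopt STR, so it copies and then frees it.  STR must come from
   malloc(3), and the caller must not use it after a successful call.
   On failure (NULL result) STR is left untouched.  */
struct toy_string *
toy_take_stringn (char *str, size_t len)
{
  struct toy_string *s = toy_from_stringn (str, len);
  if (s != NULL)
    free (str);
  return s;
}
```

From the caller's point of view the ownership contract is unchanged: the buffer is handed over and must not be touched again.  Only the hoped-for saving of one copy is lost.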

-Mike







* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
  2009-09-01  0:48 ` Mike Gran
@ 2009-09-01  8:20   ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2009-09-01  8:20 UTC (permalink / raw)
  To: guile-devel

Hi,

Mike Gran <spk121@yahoo.com> writes:

> On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:

[...]

>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>> 
>> Should these functions be deprecated or discouraged?
>> 
>
> codesearch.google.com says that the scm_take_ functions aren't often used
> by other projects, but they are used by LilyPond.  I think that's reason
> enough to leave them in.  I'd vote for keeping them and adjusting the docs
> to say something like
>
>      Like `scm_from_locale_string' and `scm_from_locale_stringn',
>      respectively, but also immediately frees STR after creating
>      the Guile string.
>
> Or something like that.

Of course, I meant "keep them, but possibly move them into
{discouraged,deprecated}.c".  Your doc suggestion also looks good to me.

Thanks,
Ludo'.





* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
  2009-09-01  0:14 [BDW-GC] "Inlined" storage; `scm_take_' functions Ludovic Courtès
  2009-09-01  0:48 ` Mike Gran
@ 2009-09-08 23:54 ` Neil Jerram
  2009-09-09  8:03   ` Ludovic Courtès
  1 sibling, 1 reply; 6+ messages in thread
From: Neil Jerram @ 2009-09-08 23:54 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello!

Hi!

> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).
>
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.

That seems a bit of a shame (i.e., that we can't advantageously keep
the caller's string or vector data).

Did you consider the option of:

- always having an indirection from the stringbuf/bytevector object to
  the underlying data, and

- optimizing the scm_from_... case by doing a single
  scm_gc_malloc_pointerless (), and making the "underlying data
  pointer" point into the same malloc'd block?

The first point should allow a similar simplification of the code as
you have in your commits - by not having to handle both the inline and
indirected cases everywhere - but the indirection would allow us to
keep meaningful scm_take_... functions.
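
A minimal sketch of this proposal, assuming a hypothetical `toy_bv' type and using malloc(3) where the real code would call scm_gc_malloc_pointerless (): every object carries a data pointer, the "from" constructor makes it point into its own block, and the "take" constructor makes it point at the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bytevector with an unconditional indirection.  */
struct toy_bv
{
  size_t len;
  unsigned char *contents;      /* always dereferenced by accessors */
  int owns_external;            /* nonzero: CONTENTS is a separate block */
};

/* "from" case: one allocation, CONTENTS points just past the header.  */
struct toy_bv *
toy_bv_from (const unsigned char *src, size_t len)
{
  struct toy_bv *bv = malloc (sizeof *bv + len);
  if (bv == NULL)
    return NULL;
  bv->len = len;
  bv->contents = (unsigned char *) (bv + 1);
  bv->owns_external = 0;
  if (len > 0)
    {
      assert (src != NULL);
      memcpy (bv->contents, src, len);
    }
  return bv;
}

/* "take" case: adopt the caller's malloc'd buffer without copying.  */
struct toy_bv *
toy_bv_take (unsigned char *buf, size_t len)
{
  struct toy_bv *bv = malloc (sizeof *bv);
  if (bv == NULL)
    return NULL;
  bv->len = len;
  bv->contents = buf;
  bv->owns_external = 1;
  return bv;
}

/* What a finalizer would have to do for the "take" case.  */
void
toy_bv_free (struct toy_bv *bv)
{
  if (bv->owns_external)
    free (bv->contents);
  free (bv);
}
```

Accessors only ever go through `contents', so they do not need to distinguish the two cases; only construction and release do.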

    Neil




* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
  2009-09-08 23:54 ` Neil Jerram
@ 2009-09-09  8:03   ` Ludovic Courtès
  2009-09-09 21:38     ` Neil Jerram
  0 siblings, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2009-09-09  8:03 UTC (permalink / raw)
  To: guile-devel

Hi Neil!

Neil Jerram <neil@ossau.uklinux.net> writes:

> ludo@gnu.org (Ludovic Courtès) writes:

>> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
>> branch [0, 1], which means that there's no cell->buffer indirection,
>> which greatly simplifies code (it also takes less room and may slightly
>> improve performance).
>>
>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>
> That seems a bit of a shame (i.e., that we can't advantageously keep
> the caller's string or vector data).

It’s not such a shame IMO because:

  * You have to allocate anyway, to store the (double) cell, and
    allocating the whole thing may be just as costly as allocating the
    cell, at least for small stringbufs/bytevectors.

  * For stringbufs, the user-provided buffer can be reused only if it’s
    either Latin-1 or UCS-4, anyway.

  * Removing the indirection and using only GC-managed memory is
    beneficial for Scheme code (which doesn’t use ‘scm_take’).

  * Reusing the malloc(3)-allocated buffer means that we have to
    register a finalizer to later free(3) that buffer (see, e.g., commit
    d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see, e.g.,
    http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).
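
The bookkeeping behind the last point can be illustrated with a toy finalizer registry.  This is not how BDW-GC implements finalization; all names below are hypothetical, and the sketch only shows that each taken buffer needs its own registration, which an inlined, GC-allocated buffer avoids entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only; a real collector keeps its own finalizer tables.  */
typedef void (*toy_finalizer) (void *obj, void *data);

struct toy_fin_entry
{
  void *obj;
  toy_finalizer fn;
  void *data;
  struct toy_fin_entry *next;
};

struct toy_fin_entry *toy_finalizers = NULL;

/* Record that FN (OBJ, DATA) must run when OBJ dies.  Costs one extra
   allocation and one list node per object.  Returns 0 on success.  */
int
toy_register_finalizer (void *obj, toy_finalizer fn, void *data)
{
  assert (fn != NULL);
  struct toy_fin_entry *e = malloc (sizeof *e);
  if (e == NULL)
    return -1;
  e->obj = obj;
  e->fn = fn;
  e->data = data;
  e->next = toy_finalizers;
  toy_finalizers = e;
  return 0;
}

/* Pretend every registered object is dead: run and drop all entries.
   Returns the number of finalizers that were run.  */
size_t
toy_run_finalizers (void)
{
  size_t n = 0;
  while (toy_finalizers != NULL)
    {
      struct toy_fin_entry *e = toy_finalizers;
      toy_finalizers = e->next;
      e->fn (e->obj, e->data);
      free (e);
      n++;
    }
  return n;
}

/* Finalizer for an object that "took" a malloc'd buffer (DATA).  */
void
toy_free_taken_buffer (void *obj, void *data)
{
  (void) obj;
  free (data);
}
```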

That said...

> Did you consider the option of
>
> - always having an indirection from the stringbuf/bytevector object to
> the underlying data

... this may be valuable (Andy pointed it out as well), at least for
bytevectors.  The indirection is a requirement for Andy’s
SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
be supported; it’s also required if we want to provide mmap(3) bindings,
for instance, that return a bytevector.

For stringbufs, though, I’m happy if we can leave the code as it is.

Thanks,
Ludo’.





* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
  2009-09-09  8:03   ` Ludovic Courtès
@ 2009-09-09 21:38     ` Neil Jerram
  0 siblings, 0 replies; 6+ messages in thread
From: Neil Jerram @ 2009-09-09 21:38 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> It’s not such a shame IMO because:
>
>   * You have to allocate anyway, to store the (double) cell, and
>     allocating the whole thing may be just as costly as allocating the
>     cell, at least for small stringbufs/bytevectors.
>
>   * For stringbufs, the user-provided buffer can be reused only if it’s
>     either Latin-1 or UCS-4, anyway.
>
>   * Removing the indirection and using only GC-managed memory is
>     beneficial for Scheme code (which doesn’t use ‘scm_take’).
>
>   * Reusing the malloc(3)-allocated buffer means that we have to
>     register a finalizer to later free(3) that buffer (see, e.g., commit
>     d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see, e.g.,
>     http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).

All good points.

> That said...
>
>> Did you consider the option of
>>
>> - always having an indirection from the stringbuf/bytevector object to
>> the underlying data
>
> ... this may be valuable (Andy pointed it out as well), at least for
> bytevectors.  The indirection is a requirement for Andy’s
> SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
> be supported; it’s also required if we want to provide mmap(3) bindings,
> for instance, that return a bytevector.

OK, cool.  It was actually large bytevectors that I was mostly
thinking about, and IIUC it sounds quite likely that we will end up
keeping meaningful scm_take_... functions there.

> For stringbufs, though, I’m happy if we can leave the code as it is.

Yes, fine.  For stringbufs reallocating feels less painful, especially
given the encoding restriction.

Thanks!
        Neil



