* [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Ludovic Courtès @ 2009-09-01 0:14 UTC
To: guile-devel

Hello!

Stringbufs and bytevectors are now always "inlined" in the BDW-GC
branch [0, 1], which means that there's no cell->buffer indirection,
which greatly simplifies code (it also takes less room and may slightly
improve performance).

The `scm_take_' functions for strings/symbols/bytevectors are now
essentially aliases to the corresponding `scm_from_' because we cannot
advantageously reuse the provided storage.

Should these functions be deprecated or discouraged?

Thanks,
Ludo'.

[0] http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=ba54a2026beaadb4e7566d4b9e2c9e4c7cd793e6
[1] http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=0665b3ffcb7ec5232a51ff632a818a638dfd4054
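To make the "inlined" versus indirected distinction concrete, here is a minimal sketch in plain C. It is not libguile code: the struct and function names (`indirect_buf`, `inline_buf`, and their helpers) are hypothetical, and `malloc` stands in for the collector's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Indirected layout: the header points at a separately allocated
   buffer, so there are two allocations and one extra pointer hop.  */
struct indirect_buf
{
  size_t len;
  char *data;                   /* separate allocation */
};

/* Inlined layout: the bytes follow the header in the same block.  */
struct inline_buf
{
  size_t len;
  char data[];                  /* C99 flexible array member */
};

/* Copy LEN bytes of SRC into an indirected buffer (two allocations).
   Returns NULL on allocation failure.  */
static struct indirect_buf *
indirect_buf_new (const char *src, size_t len)
{
  struct indirect_buf *buf = malloc (sizeof *buf);
  if (buf == NULL)
    return NULL;
  buf->data = malloc (len + 1);
  if (buf->data == NULL)
    {
      free (buf);
      return NULL;
    }
  memcpy (buf->data, src, len);
  buf->data[len] = '\0';
  buf->len = len;
  return buf;
}

/* Release both the buffer and its header.  */
static void
indirect_buf_free (struct indirect_buf *buf)
{
  if (buf == NULL)
    return;
  free (buf->data);
  free (buf);
}

/* Copy LEN bytes of SRC into an inlined buffer (one allocation).
   Returns NULL on allocation failure; release with a single free ().  */
static struct inline_buf *
inline_buf_new (const char *src, size_t len)
{
  struct inline_buf *buf = malloc (sizeof *buf + len + 1);
  if (buf == NULL)
    return NULL;
  memcpy (buf->data, src, len);
  buf->data[len] = '\0';
  buf->len = len;
  return buf;
}
```

In the inlined form there is no `data` pointer for a caller-provided buffer to be stored in, which is why a "take" operation can no longer adopt the caller's storage.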
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Mike Gran @ 2009-09-01 0:48 UTC
To: Ludovic Courtès; +Cc: guile-devel

On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:
> Hello!
>
> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).

Neat!

>
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.
>
> Should these functions be deprecated or discouraged?
>

codesearch.google.com says that scm_take_ isn't often used by other
projects, but, it is used by lilypond. I think that's reason enough to
leave it in. I'd vote for keeping them and adjusting the docs to say
something like

    Like `scm_from_locale_string' and `scm_from_locale_stringn',
    respectively, but also immediately frees STR after creating
    the Guile string.

Or something like that.

-Mike
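The documentation wording suggested above amounts to "copy, then free the caller's buffer". A minimal sketch of that contract follows, in plain C rather than libguile: `managed_str`, `managed_from_stringn` and `managed_take_stringn` are hypothetical names, `malloc` stands in for the collector's allocator, and leaving ownership with the caller on failure is an assumption of this sketch, not something the thread specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string object with inlined storage.  */
struct managed_str
{
  size_t len;
  char chars[];                 /* bytes live right after the header */
};

/* The "from" path: copy LEN bytes of STR into a fresh object.
   STR is left untouched.  Returns NULL on allocation failure.  */
static struct managed_str *
managed_from_stringn (const char *str, size_t len)
{
  struct managed_str *s = malloc (sizeof *s + len + 1);
  if (s == NULL)
    return NULL;
  memcpy (s->chars, str, len);
  s->chars[len] = '\0';
  s->len = len;
  return s;
}

/* The "take" path under inlined storage: the same copy, followed by
   freeing the caller's malloc'd buffer.  After a successful call STR
   must not be used again; on failure the caller still owns STR.  */
static struct managed_str *
managed_take_stringn (char *str, size_t len)
{
  struct managed_str *s = managed_from_stringn (str, len);
  if (s != NULL)
    free (str);
  return s;
}
```

Existing callers that hand over a `malloc`'d buffer keep working unchanged; they just no longer save the copy.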
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Ludovic Courtès @ 2009-09-01 8:20 UTC
To: guile-devel

Hi,

Mike Gran <spk121@yahoo.com> writes:

> On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:

[...]

>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>>
>> Should these functions be deprecated or discouraged?
>>
>
> codesearch.google.com says that scm_take_ isn't often used by other
> projects, but, it is used by lilypond. I think that's reason enough to
> leave it in. I'd vote for keeping them and adjusting the docs to say
> something like
>
> Like `scm_from_locale_string' and `scm_from_locale_stringn',
> respectively, but also immediately frees STR after creating
> the Guile string.
>
> Or something like that.

Of course, I meant "keep them but possibly moved into
{discouraged,deprecated}.c".

Your doc suggestion looks good to me also.

Thanks,
Ludo'.
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Neil Jerram @ 2009-09-08 23:54 UTC
To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello!

Hi!

> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).
>
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.

That seems a bit of a shame. (i.e. that we can't advantageously keep
the caller's string or vector data)

Did you consider the option of

- always having an indirection from the stringbuf/bytevector object to
  the underlying data

- optimizing the scm_from_... case by doing a single
  scm_gc_malloc_pointerless (), and making the "underlying data
  pointer" point into the same malloc'd block.

The first point should allow a similar simplification of the code as
you have in your commits - by not having to handle both the inline and
indirected cases everywhere - but the indirection would allow us to
keep meaningful scm_take_... functions.

Neil
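The two-part option described above can be sketched in plain C as follows. This is an illustration under stated assumptions, not libguile code: `bytevec`, `bytevec_from`, `bytevec_take`, `bytevec_free` and the `owns_external` flag are hypothetical, and `malloc` stands in for `scm_gc_malloc_pointerless ()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header that ALWAYS reaches its bytes through DATA, so readers have a
   single code path whether the bytes are in-block or external.  */
struct bytevec
{
  size_t len;
  unsigned char *data;
  int owns_external;            /* nonzero: DATA is a separate malloc block */
};

/* "from" path: one allocation; DATA points just past the header,
   into the same block.  Returns NULL on allocation failure.  */
static struct bytevec *
bytevec_from (const unsigned char *src, size_t len)
{
  struct bytevec *bv = malloc (sizeof *bv + len);
  if (bv == NULL)
    return NULL;
  bv->len = len;
  bv->data = (unsigned char *) (bv + 1);
  bv->owns_external = 0;
  if (len > 0)
    memcpy (bv->data, src, len);
  return bv;
}

/* "take" path: allocate only the header and adopt the caller's
   malloc'd buffer without copying.  On failure the caller keeps BUF.  */
static struct bytevec *
bytevec_take (unsigned char *buf, size_t len)
{
  struct bytevec *bv = malloc (sizeof *bv);
  if (bv == NULL)
    return NULL;
  bv->len = len;
  bv->data = buf;
  bv->owns_external = 1;
  return bv;
}

/* Release the header and, if it was adopted, the external buffer.  */
static void
bytevec_free (struct bytevec *bv)
{
  if (bv == NULL)
    return;
  if (bv->owns_external)
    free (bv->data);
  free (bv);
}
```

Code that reads the bytes only ever dereferences `data`; the in-block versus external distinction matters only when the object is released.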
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Ludovic Courtès @ 2009-09-09 8:03 UTC
To: guile-devel

Hi Neil!

Neil Jerram <neil@ossau.uklinux.net> writes:

> ludo@gnu.org (Ludovic Courtès) writes:

>> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
>> branch [0, 1], which means that there's no cell->buffer indirection,
>> which greatly simplifies code (it also takes less room and may slightly
>> improve performance).
>>
>> The `scm_take_' functions for strings/symbols/bytevectors are now
>> essentially aliases to the corresponding `scm_from_' because we cannot
>> advantageously reuse the provided storage.
>
> That seems a bit of a shame. (i.e. that we can't advantageously keep
> the caller's string or vector data)

It’s not such a shame IMO because:

  * You have to allocate anyway, to store the (double) cell, and
    allocating the whole thing may be just as costly as allocating the
    cell, at least for small stringbufs/bytevectors.

  * For stringbufs, the user-provided buffer can be reused only if it’s
    either Latin-1 or UCS-4, anyway.

  * Removing the indirection and using only GC-managed memory is
    beneficial for Scheme code (which doesn’t use ‘scm_take’).

  * Reusing the malloc(3)-allocated buffer means that we have to
    register a finalizer to later free(3) that buffer (see, e.g., commit
    d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see, e.g.,
    http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).

That said...

> Did you consider the option of
>
> - always having an indirection from the stringbuf/bytevector object to
> the underlying data

... this may be valuable (Andy pointed it out as well), at least for
bytevectors. The indirection is a requirement for Andy’s
SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
be supported; it’s also required if we want to provide mmap(3) bindings,
for instance, that return a bytevector.

For stringbufs, though, I’m happy if we can leave the code as it is.

Thanks,
Ludo’.
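The finalizer point can be illustrated with a small plain-C sketch. Everything here is hypothetical (`ext_bytevec`, `ext_bytevec_wrap`, `ext_bytevec_collect`, `release_malloced`, and the `release_calls` counter), and `ext_bytevec_collect` merely simulates what a collector would do when the object becomes unreachable; it is not how BDW-GC registers or runs finalizers.

```c
#include <assert.h>
#include <stdlib.h>

/* Bytevector header that wraps memory it did not allocate.  Because
   the bytes live outside the collector's heap, the object has to carry
   a release hook (a finalizer) for them.  */
struct ext_bytevec
{
  size_t len;
  unsigned char *data;
  void (*release) (unsigned char *data, size_t len);  /* may be NULL */
};

/* Counts finalizer runs; for demonstration only.  */
static int release_calls;

/* Finalizer for buffers that came from malloc.  */
static void
release_malloced (unsigned char *data, size_t len)
{
  (void) len;                   /* unused for malloc'd memory */
  free (data);
  release_calls++;
}

/* Wrap DATA without copying.  RELEASE is what a collector would have
   to call once the header is unreachable.  Returns NULL on failure,
   in which case the caller still owns DATA.  */
static struct ext_bytevec *
ext_bytevec_wrap (unsigned char *data, size_t len,
                  void (*release) (unsigned char *, size_t))
{
  struct ext_bytevec *bv = malloc (sizeof *bv);
  if (bv == NULL)
    return NULL;
  bv->len = len;
  bv->data = data;
  bv->release = release;
  return bv;
}

/* Simulates the collector reclaiming BV: run the finalizer first,
   then drop the header itself.  */
static void
ext_bytevec_collect (struct ext_bytevec *bv)
{
  if (bv == NULL)
    return;
  if (bv->release != NULL)
    bv->release (bv->data, bv->len);
  free (bv);
}
```

For a region obtained from mmap(3), the release hook would presumably call munmap(2) with the stored length instead of free(3); an object with fully inlined storage needs no hook at all, which is the saving the finalizer bullet refers to.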
* Re: [BDW-GC] "Inlined" storage; `scm_take_' functions
From: Neil Jerram @ 2009-09-09 21:38 UTC
To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> It’s not such a shame IMO because:
>
> * You have to allocate anyway, to store the (double) cell, and
> allocating the whole thing may be just as costly as allocating the
> cell, at least for small stringbufs/bytevectors.
>
> * For stringbufs, the user-provided buffer can be reused only if it’s
> either Latin-1 or UCS-4, anyway.
>
> * Removing the indirection and using only GC-managed memory is
> beneficial for Scheme code (which doesn’t use ‘scm_take’).
>
> * Reusing the malloc(3)-allocated buffer means that we have to
> register a finalizer to later free(3) that buffer (see, e.g., commit
> d7e7a02a6251c8ed4f76933d9d30baeee3f599c0), which is costly (see, e.g.,
> http://www.hpl.hp.com/personal/Hans_Boehm/popl03/web/html/slide_7.html).

All good points.

> That said...
>
>> Did you consider the option of
>>
>> - always having an indirection from the stringbuf/bytevector object to
>> the underlying data
>
> ... this may be valuable (Andy pointed it out as well), at least for
> bytevectors. The indirection is a requirement for Andy’s
> SRFI-4-on-bytevector patch set, so that ‘scm_take_u8vector ()’ can still
> be supported; it’s also required if we want to provide mmap(3) bindings,
> for instance, that return a bytevector.

OK, cool. It was actually large bytevectors that I was mostly thinking
about, and IIUC it sounds quite likely that we will end up keeping
meaningful scm_take_... functions there.

> For stringbufs, though, I’m happy if we can leave the code as it is.

Yes, fine. For stringbufs reallocating feels less painful, especially
given the encoding restriction.

Thanks!

Neil