From: Andy Wingo <wingo@pobox.com>
To: Mark H Weaver <mhw@netris.org>
Cc: Ken Raeburn <raeburn@raeburn.org>, guile-devel@gnu.org
Subject: Re: [PATCH] Efficient Gensym Hack (v2)
Date: Wed, 07 Mar 2012 18:25:49 +0100 [thread overview]
Message-ID: <87vcmgz5k2.fsf@pobox.com> (raw)
In-Reply-To: <878vjcwedw.fsf@netris.org> (Mark H. Weaver's message of "Wed, 07 Mar 2012 11:43:23 -0500")
Hi Mark!
Thanks for the response. I have various minor thoughts here and one
serious note on GC.
On Wed 07 Mar 2012 17:43, Mark H Weaver <mhw@netris.org> writes:
>>> + if (!STRINGBUF_SHARED (buf))
>>> + {
>>> + scm_i_pthread_mutex_lock (&stringbuf_write_mutex);
>>> + SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED);
>>> + scm_i_pthread_mutex_unlock (&stringbuf_write_mutex);
>>> + }
>>
>> Does this work, with C's memory model?
>
> That's true. However, in that case, the shared flag is already being
> set by another thread, so it doesn't matter, because the only
> requirement of this function is to make sure the flag gets set.
I think it will be fine. Thanks for walking through it with me.
>>> + /* Attempt to intern the symbol */
>>> + handle = scm_hash_fn_create_handle_x (symbols, sym, SCM_UNDEFINED,
>>> + symbol_lookup_hash_fn,
>>> + symbol_lookup_assoc_fn,
>>> + NULL);
>>> + } while (SCM_UNLIKELY (!scm_is_eq (sym, SCM_CAR (handle))));
>>
>> Note that this is racy: this is a weak key hash table, so it's not safe
>> to access the car of the handle outside the alloc lock.
>
> It's not an issue here, because the only thing I'm doing with the 'car'
> is checking that it's 'eq?' to the lazy gensym that's being forced. If
> it _is_ 'eq?' to our gensym, then there's no possibility that it will be
> nulled out, because we hold a reference to it. If it's _not_ 'eq?' to
> our gensym, then we don't care whether it's null or not; in either case
> we have failed to intern this name and we try again with the next
> counter value.
>
> However, note that 'intern_symbol', 'lookup_interned_symbol', and
> 'lookup_interned_latin1_symbol' all access the 'car' of a handle of the
> symbol table outside of the alloc lock, and those might indeed be
> problems. Those issues are not from this patch though. The relevant
> code was last changed by you in 2011.
Yes, it was part of an attempt to correct this situation, and part of a
learning process as well. I'm more satisfied with master's
correctness in this regard.
>>> + /* We must not clear the lazy gensym flag until our symbol has
>>> + been interned. The lock does not save us here, because another
>>> + thread could retrieve our gensym's name or hash outside of any
>>> + lock. */
>>> + SCM_SET_CELL_WORD_0 (sym, (SCM_CELL_WORD_0 (sym)
>>> + & ~SCM_I_F_SYMBOL_LAZY_GENSYM));
>>> + }
>>> + scm_i_pthread_mutex_unlock (&symbols_lock);
>>> +}
>>
>> Doing all this work within a mutex is prone to deadlock, if allocation
>> causes a finalizer to run that forces another lazy symbol.
>
> Ugh. Well, we already have a problem then, because 'intern_symbol' also
> does allocation while holding this lock, via 'symbol_lookup_assoc_fn',
> which calls 'scm_symbol_to_string', which must allocate the string
> object (symbols hold only stringbufs). Therefore, with Guile 2.0.5, if
> anyone calls 'scm_from_*_symbol' within a finalizer, there is already
> the possibility for deadlock.
Yuck.
> Have I mentioned lately how much I hate locks? :/
:)
Locks aren't really the problem here though -- it's the
finalizer-introduced concurrency, specifically in the case in which your
program is in a critical section. If we ran finalizers in a separate
thread, we would not have these issues.
> The good news is that it should be possible to fix this (pre-existing)
> class of problems for 'symbols_lock' in stable-2.0 by changing
> 'symbol_lookup_assoc_fn' to avoid allocation.
That's not enough. Adding spine segments, ribs, and associating a
disappearing link all allocate memory, and those are internal to the
hash table implementation.
^ The serious note :)
Maybe you'll never hit it. I don't know. It depends very much on your
allocation pattern. What's the likelihood that a finalizer accesses a
symbol's characters? Who knows.
Maybe it's good enough to document this defect in 2.0. "Don't try to
string->symbol or symbol->string in a finalizer".
Andy
--
http://wingolog.org/
next prev parent reply other threads:[~2012-03-07 17:25 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-05 17:17 [PATCH] Efficient Gensym Hack Mark H Weaver
2012-03-05 21:52 ` Andy Wingo
2012-03-06 3:16 ` Mark H Weaver
2012-03-06 8:56 ` Andy Wingo
2012-03-06 9:55 ` [PATCH] Efficient Gensym Hack (v2) Mark H Weaver
2012-03-07 10:40 ` Andy Wingo
2012-03-07 16:43 ` Mark H Weaver
2012-03-07 17:25 ` Andy Wingo [this message]
2012-03-07 19:28 ` Mark H Weaver
2012-03-07 20:04 ` Andy Wingo
2013-01-16 17:25 ` Andy Wingo
2012-03-10 22:55 ` [PATCH] Efficient Gensym Hack Ludovic Courtès
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87vcmgz5k2.fsf@pobox.com \
--to=wingo@pobox.com \
--cc=guile-devel@gnu.org \
--cc=mhw@netris.org \
--cc=raeburn@raeburn.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).