From: ludo@gnu.org (Ludovic Courtès)
To: guile-devel@gnu.org
Subject: Re: Locks and threads
Date: Fri, 27 Mar 2009 00:12:52 +0100 [thread overview]
Message-ID: <87tz5fho97.fsf@gnu.org> (raw)
In-Reply-To: 87myb8q6yx.fsf@arudy.ossau.uklinux.net
Hi Neil,
Neil Jerram <neil@ossau.uklinux.net> writes:
> That's an interesting idea, but to be honest I'd prefer to finish this
> work and move onto something else. Would it be OK just to reduce the
> default runtime to 2 seconds? (It wouldn't make any practical
> difference, because 10 seconds isn't normally enough to see a define
> race anyway. It'll just be enough to catch any obvious bitrotting,
> and people who want to run a real test can use
> GUILE_TEST_DEFINE_RACE_DURATION.)
Yes, that'd be OK.
> The key points about #2 are that it uses a straight pthreads mutex to
> protect the symbol hash, and that the symbol hash is a weak hash
> table. Using a pthreads mutex means that we have to take care not to
> do any allocations (or anything else that could block) while the mutex
> is locked. The weakness means that as well as the obvious lookup and
> intern operations, we have to allow for the hash table being accessed
> and resized from an after-GC hook.
OK, got it. (A limitation that vanishes with BDW-GC.)
> #3, on the other hand, uses a fat mutex, and can be applied to any
> non-weak hash table. Using a fat mutex means that it's fine to
> allocate or block while the mutex is locked, and means that the code
> can use scm_dynwind_lock_mutex, which is quite neat. On the other
> hand, it would cause a problem if the hash table concerned could be
> accessed during GC, because
>
> - if the GC didn't try to lock the fat mutex itself, it would be
> modifying the hash table under the feet of the thread that has the
> mutex locked
>
> - if the GC does try to lock the fat mutex, it won't be able to and so
> will block... deadlock!
>
> Hence the restriction of #3 to non-weak hash tables.
OK.
> If #3 was needed, I wouldn't see any problem with its API/ABI changes.
> The API change is addition of one new primitive, and the ABI change is
> to a structure that isn't intended for user code to access. Am I
> missing something?
I was concerned about the `scm_t_hashtable' change, but it doesn't
appear to be accessed by public inlines/macros, so that should be OK.
> On the other hand, if it isn't needed - as appears to be true -
> there's no case for adding it to 1.8.
Agreed.
>> It needs to be conditionalized on `--with-threads'.
>
> It's already inside an "if BUILD_PTHREAD_SUPPORT". Do I need to do
> more than that?
No, I had overlooked this.
>>> scm_i_rehash (SCM table,
>>> unsigned long (*hash_fn)(),
>>> void *closure,
>>> - const char* func_name)
>>> + const char* func_name,
>>> + scm_i_pthread_mutex_t *mutex)
>>
>> This function assumes that MUTEX is locked. I would either write it
>> prominently in a comment above or change the semantics so that MUTEX is
>> acquired in `scm_i_rehash ()', not in the caller.
>
> I think I prefer the prominent comment. My first thought was for
> scm_i_rehash () to acquire the mutex. I changed to the current code
> because then all of the lock..unlock pairs are in symbols.c - which
> seemed more consistent to me.
Fair enough.
>> The latter might be achieved by integrating tests about whether TABLE
>> needs to be rehashed into `scm_i_rehash ()' itself. I.e., this:
>>
>> if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)
>> || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
>> scm_i_rehash (...);
>>
>> would become:
>>
>> scm_i_rehash (...);
>>
>> I haven't reviewed all uses to see whether it's actually possible and
>> whether it would lead to race conditions, though.
>
> I think that could work, but it would add another lock/unlock to the
> mainline. (We need to hold the lock even just to evaluate the
> rehashing condition.) The change doesn't feel compelling enough to me
> to justify that.
Contention-free locks are "cheap" in that there's no syscall involved
(when using futexes); OTOH there's at least one function call, which
we'd rather avoid.
>>> -static SCM symbols;
>>> +SCM scm_i_symbols;
>>
>> I don't think this is needed, is it?
>
> Yes, so that rehash_after_gc () (in hashtab.c) can identify the symbol
> hash and call scm_i_rehash_symbols_after_gc () for it.
Oops, right.
>>> +static SCM
>>> +intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)
>>
>> Since all users of `intern_symbol ()' perform a
>> `lookup_interned_symbol ()' right before, I'd rather change users so
>> that they lock before `lookup' and unlock after `intern':
>>
>> lock (&symbols_mutex);
>>
>> sym = lookup_interned_symbol (name);
>> if (scm_is_false (sym))
>> {
>> sym = scm_i_c_make_symbol (...);
>> intern_symbol (sym);
>> }
>>
>> unlock (&symbols_mutex);
>>
>> Is it doable?
>
> The problem with this is allocation (scm_i_c_make_symbol) while
> holding the mutex. The overall arrangement of the symbol code is
>
> lookup - allocate - intern
>
> and I assume that is so as to avoid allocation in the case where the
> symbol is already interned; otherwise the arrangement could be
>
> allocate - lookup/intern
>
> with the lookup/intern being implemented as a single operation.
>
> I've made my changes so as to leave this overall pattern the same.
> The code certainly could be simpler if we always allocated upfront,
> but I think it is better (for performance) not to do that.
Yes, you're right. I hadn't grasped all the implications here.
One nitpick: `intern_symbol ()' shouldn't need NAME, LEN and RAW_HASH as
it can get them "for free" from SYMBOL.
> On the other hand, I suspect the code could still be simplified to
> remove the duplication between lookup_interned_symbol and
> intern_symbol. I'll have a look at that.
Possible.
Thanks for looking into this!
Ludo'.
next prev parent reply other threads:[~2009-03-26 23:12 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-11 22:31 Locks and threads Neil Jerram
2009-02-11 23:05 ` Neil Jerram
2009-02-11 23:32 ` Ludovic Courtès
2009-02-11 23:30 ` Linas Vepstas
2009-02-11 23:53 ` Neil Jerram
2009-02-12 0:18 ` Linas Vepstas
2009-02-12 20:51 ` Ludovic Courtès
2009-02-11 23:30 ` Ludovic Courtès
2009-02-12 12:55 ` Greg Troxel
2009-02-12 18:00 ` Ken Raeburn
2009-02-12 21:14 ` Ludovic Courtès
2009-02-14 1:25 ` Ken Raeburn
2009-02-14 16:09 ` Ludovic Courtès
2009-03-05 20:41 ` Neil Jerram
2009-03-04 23:49 ` Neil Jerram
2009-03-05 3:54 ` Linas Vepstas
2009-03-05 19:46 ` Neil Jerram
2009-03-05 20:05 ` Neil Jerram
2009-03-05 20:40 ` Linas Vepstas
2009-03-05 20:49 ` Neil Jerram
2009-03-05 20:57 ` Linas Vepstas
2009-03-05 21:25 ` Neil Jerram
2009-03-05 21:56 ` Linas Vepstas
2009-03-06 11:01 ` Andy Wingo
2009-03-06 12:36 ` Linas Vepstas
2009-03-06 22:05 ` Ludovic Courtès
2009-03-08 22:04 ` Neil Jerram
2009-03-25 19:00 ` Neil Jerram
2009-03-25 22:08 ` Ludovic Courtès
2009-03-05 21:35 ` Ludovic Courtès
2009-03-10 23:57 ` Neil Jerram
2009-03-12 0:07 ` Neil Jerram
2009-03-12 0:53 ` Neil Jerram
2009-03-12 1:29 ` Linas Vepstas
2009-03-12 3:09 ` Clinton Ebadi
2009-03-25 22:13 ` Neil Jerram
2009-03-25 22:34 ` Linas Vepstas
2009-03-12 22:13 ` Andy Wingo
2009-03-13 19:13 ` Neil Jerram
2009-03-25 23:19 ` Neil Jerram
2009-03-26 3:40 ` Linas Vepstas
2009-03-26 8:02 ` Neil Jerram
2009-03-26 18:39 ` Linas Vepstas
2009-03-26 9:10 ` Ludovic Courtès
2009-03-26 22:01 ` Neil Jerram
2009-03-26 23:12 ` Ludovic Courtès [this message]
2009-03-26 22:51 ` Neil Jerram
2009-03-27 3:15 ` Linas Vepstas
2009-03-14 14:23 ` Ludovic Courtès
2009-03-16 22:57 ` Andy Wingo
2009-03-25 18:57 ` Neil Jerram
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87tz5fho97.fsf@gnu.org \
--to=ludo@gnu.org \
--cc=guile-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).