Re: Locks and threads - Ludovic Courtès

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

From: ludo@gnu.org (Ludovic Courtès)
To: guile-devel@gnu.org
Subject: Re: Locks and threads
Date: Fri, 27 Mar 2009 00:12:52 +0100	[thread overview]
Message-ID: <87tz5fho97.fsf@gnu.org> (raw)
In-Reply-To: 87myb8q6yx.fsf@arudy.ossau.uklinux.net

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> That's an interesting idea, but to be honest I'd prefer to finish this
> work and move onto something else.  Would it be OK just to reduce the
> default runtime to 2 seconds?  (It wouldn't make any practical
> difference, because 10 seconds isn't normally enough to see a define
> race anyway.  It'll just be enough to catch any obvious bitrotting,
> and people who want to run a real test can use
> GUILE_TEST_DEFINE_RACE_DURATION.)

Yes, that'd be OK.

> The key points about #2 are that it uses a straight pthreads mutex to
> protect the symbol hash, and that the symbol hash is a weak hash
> table.  Using a pthreads mutex means that we have to take care not to
> do any allocations (or anything else that could block) while the mutex
> is locked.  The weakness means that as well as the obvious lookup and
> intern operations, we have to allow for the hash table being accessed
> and resized from an after-GC hook.

OK, got it. (A limitation that vanishes with BDW-GC.)

> #3, on the other hand, uses a fat mutex, and can be applied to any
> non-weak hash table.  Using a fat mutex means that it's fine to
> allocate or block while the mutex is locked, and means that the code
> can use scm_dynwind_lock_mutex, which is quite neat.  On the other
> hand, it would cause a problem if the hash table concerned could be
> accessed during GC, because
>
> - if the GC didn't try to lock the fat mutex itself, it would be
>   modifying the hash table under the feet of the thread that has the
>   mutex locked
>
> - if the GC does try to lock the fat mutex, it won't be able to and so
>   will block... deadlock!
>
> Hence the restriction of #3 to non-weak hash tables.

OK.

> If #3 was needed, I wouldn't see any problem with its API/ABI changes.
> The API change is addition of one new primitive, and the ABI change is
> to a structure that isn't intended for user code to access.  Am I
> missing something?

I was concerned about the `scm_t_hashtable' change, but it doesn't
appear to be accessed by public inlines/macros, so that should be OK.

> On the other hand, if it isn't needed - as appears to be true -
> there's no case for adding it to 1.8.

Agreed.

>> It needs to be conditionalized on `--with-threads'.
>
> It's already inside an "if BUILD_PTHREAD_SUPPORT".  Do I need to do
> more than that?

No, I had overlooked this.

>>>  scm_i_rehash (SCM table,
>>>  	      unsigned long (*hash_fn)(),
>>>  	      void *closure,
>>> -	      const char* func_name)
>>> +	      const char* func_name,
>>> +	      scm_i_pthread_mutex_t *mutex)
>>
>> This function assumes that MUTEX is locked.  I would either write it
>> prominently in a comment above or change the semantics so that MUTEX is
>> acquired in `scm_i_rehash ()', not in the caller.
>
> I think I prefer the prominent comment.  My first thought was for
> scm_i_rehash () to acquire the mutex.  I changed to the current code
> because then all of the lock..unlock pairs are in symbols.c - which
> seemed more consistent to me.

Fair enough.

>> The latter might be achieved by integrating tests about whether TABLE
>> needs to be rehashed into `scm_i_rehash ()' itself.  I.e., this:
>>
>>   if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)
>>       || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
>>     scm_i_rehash (...);
>>
>> would become:
>>
>>   scm_i_rehash (...);
>>
>> I haven't reviewed all uses to see whether it's actually possible and
>> whether it would lead to race conditions, though.
>
> I think that could work, but it would add another lock/unlock to the
> mainline.  (We need to hold the lock even just to evaluate the
> rehashing condition.)  The change doesn't feel compelling enough to me
> to justify that.

Contention-free locks are "cheap" in that there's no syscall involved
(when using futexes); OTOH there's at least one function call, which
we'd rather avoid.

>>> -static SCM symbols;
>>> +SCM scm_i_symbols;
>>
>> I don't think this is needed, is it?
>
> Yes, so that rehash_after_gc () (in hashtab.c) can identify the symbol
> hash and call scm_i_rehash_symbols_after_gc () for it.

Oops, right.

>>> +static SCM
>>> +intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)
>>
>> Since all users of `intern_symbol ()' perform a
>> `lookup_interned_symbol ()' right before, I'd rather change users so
>> that they lock before `lookup' and unlock after `intern':
>>
>>   lock (&symbols_mutex);
>>
>>   sym = lookup_interned_symbol (name);
>>   if (scm_is_false (sym))
>>     {
>>       sym = scm_i_c_make_symbol (...);
>>       intern_symbol (sym);
>>     }
>>
>>   unlock (&symbols_mutex);
>>
>> Is it doable?
>
> The problem with this is allocation (scm_i_c_make_symbol) while
> holding the mutex.  The overall arrangement of the symbol code is
>
>   lookup - allocate - intern
>
> and I assume that is so as to avoid allocation in the case where the
> symbol is already interned; otherwise the arrangement could be
>
>   allocate - lookup/intern
>
> with the lookup/intern being implemented as a single operation.
>
> I've made my changes so as to leave this overall pattern the same.
> The code certainly could be simpler if we always allocated upfront,
> but I think it is better (for performance) not to do that.

Yes, you're right.  I hadn't grasped all the implications here.

One nitpick: `intern_symbol ()' shouldn't need NAME, LEN and RAW_HASH as
it can get them "for free" from SYMBOL.

> On the other hand, I suspect the code could still be simplified to
> remove the duplication between lookup_interned_symbol and
> intern_symbol.  I'll have a look at that.

Possible.

Thanks for looking into this!

Ludo'.

next prev parent reply	other threads:[~2009-03-26 23:12 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-11 22:31 Locks and threads Neil Jerram
2009-02-11 23:05 ` Neil Jerram
2009-02-11 23:32   ` Ludovic Courtès
2009-02-11 23:30 ` Linas Vepstas
2009-02-11 23:53   ` Neil Jerram
2009-02-12  0:18     ` Linas Vepstas
2009-02-12 20:51     ` Ludovic Courtès
2009-02-11 23:30 ` Ludovic Courtès
2009-02-12 12:55 ` Greg Troxel
2009-02-12 18:00   ` Ken Raeburn
2009-02-12 21:14     ` Ludovic Courtès
2009-02-14  1:25       ` Ken Raeburn
2009-02-14 16:09         ` Ludovic Courtès
2009-03-05 20:41   ` Neil Jerram
2009-03-04 23:49 ` Neil Jerram
2009-03-05  3:54   ` Linas Vepstas
2009-03-05 19:46     ` Neil Jerram
2009-03-05 20:05       ` Neil Jerram
2009-03-05 20:40         ` Linas Vepstas
2009-03-05 20:49           ` Neil Jerram
2009-03-05 20:57         ` Linas Vepstas
2009-03-05 21:25           ` Neil Jerram
2009-03-05 21:56             ` Linas Vepstas
2009-03-06 11:01               ` Andy Wingo
2009-03-06 12:36                 ` Linas Vepstas
2009-03-06 22:05                   ` Ludovic Courtès
2009-03-08 22:04               ` Neil Jerram
2009-03-25 19:00         ` Neil Jerram
2009-03-25 22:08           ` Ludovic Courtès
2009-03-05 21:35   ` Ludovic Courtès
2009-03-10 23:57   ` Neil Jerram
2009-03-12  0:07     ` Neil Jerram
2009-03-12  0:53       ` Neil Jerram
2009-03-12  1:29         ` Linas Vepstas
2009-03-12  3:09           ` Clinton Ebadi
2009-03-25 22:13           ` Neil Jerram
2009-03-25 22:34             ` Linas Vepstas
2009-03-12 22:13         ` Andy Wingo
2009-03-13 19:13           ` Neil Jerram
2009-03-25 23:19             ` Neil Jerram
2009-03-26  3:40               ` Linas Vepstas
2009-03-26  8:02                 ` Neil Jerram
2009-03-26 18:39                   ` Linas Vepstas
2009-03-26  9:10               ` Ludovic Courtès
2009-03-26 22:01                 ` Neil Jerram
2009-03-26 23:12                   ` Ludovic Courtès [this message]
2009-03-26 22:51               ` Neil Jerram
2009-03-27  3:15                 ` Linas Vepstas
2009-03-14 14:23           ` Ludovic Courtès
2009-03-16 22:57             ` Andy Wingo
2009-03-25 18:57     ` Neil Jerram

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tz5fho97.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=guile-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).