unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: Andy Wingo <wingo@pobox.com>
To: Mark H Weaver <mhw@netris.org>
Cc: Ken Raeburn <raeburn@raeburn.org>, guile-devel@gnu.org
Subject: Re: [PATCH] Efficient Gensym Hack (v2)
Date: Wed, 07 Mar 2012 18:25:49 +0100	[thread overview]
Message-ID: <87vcmgz5k2.fsf@pobox.com> (raw)
In-Reply-To: <878vjcwedw.fsf@netris.org> (Mark H. Weaver's message of "Wed, 07 Mar 2012 11:43:23 -0500")

Hi Mark!

Thanks for the response.  I have various minor thoughts here and one
serious note on GC.

On Wed 07 Mar 2012 17:43, Mark H Weaver <mhw@netris.org> writes:

>>> +  if (!STRINGBUF_SHARED (buf))
>>> +    {
>>> +      scm_i_pthread_mutex_lock (&stringbuf_write_mutex);
>>> +      SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED);
>>> +      scm_i_pthread_mutex_unlock (&stringbuf_write_mutex);
>>> +    }
>>
>> Does this work, with C's memory model?
>
> That's true.  However, in that case, the shared flag is already being
> set by another thread, so it doesn't matter, because the only
> requirement of this function is to make sure the flag gets set.

I think it will be fine.  Thanks for walking through it with me.

>>> +          /* Attempt to intern the symbol */
>>> +          handle = scm_hash_fn_create_handle_x (symbols, sym, SCM_UNDEFINED,
>>> +                                                symbol_lookup_hash_fn,
>>> +                                                symbol_lookup_assoc_fn,
>>> +                                                NULL);
>>> +        } while (SCM_UNLIKELY (!scm_is_eq (sym, SCM_CAR (handle))));
>>
>> Note that this is racy: this is a weak key hash table, so it's not safe
>> to access the car of the handle outside the alloc lock.
>
> It's not an issue here, because the only thing I'm doing with the 'car'
> is checking that it's 'eq?' to the lazy gensym that's being forced.  If
> it _is_ 'eq?' to our gensym, then there's no possibility that it will be
> nulled out, because we hold a reference to it.  If it's _not_ 'eq?' to
> our gensym, then we don't care whether it's null or not; in either case
> we have failed to intern this name and we try again with the next
> counter value.
>
> However, note that 'intern_symbol', 'lookup_interned_symbol', and
> 'lookup_interned_latin1_symbol' all access the 'car' of a handle of the
> symbol table outside of the alloc lock, and those might indeed be
> problems.  Those issues are not from this patch though.  The relevant
> code was last changed by you in 2011.

Yes, it was part of an attempt to correct this situation, and part of a
learning process as well.  I'm more satisfied with master's
correctness in this regard.

>>> +      /* We must not clear the lazy gensym flag until our symbol has
>>> +         been interned.  The lock does not save us here, because another
>>> +         thread could retrieve our gensym's name or hash outside of any
>>> +         lock. */
>>> +      SCM_SET_CELL_WORD_0 (sym, (SCM_CELL_WORD_0 (sym)
>>> +                                 & ~SCM_I_F_SYMBOL_LAZY_GENSYM));
>>> +    }
>>> +  scm_i_pthread_mutex_unlock (&symbols_lock);
>>> +}
>>
>> Doing all this work within a mutex is prone to deadlock, if allocation
>> causes a finalizer to run that forces another lazy symbol.
>
> Ugh.  Well, we already have a problem then, because 'intern_symbol' also
> does allocation while holding this lock, via 'symbol_lookup_assoc_fn',
> which calls 'scm_symbol_to_string', which must allocate the string
> object (symbols hold only stringbufs).  Therefore, with Guile 2.0.5, if
> anyone calls 'scm_from_*_symbol' within a finalizer, there is already
> the possibility for deadlock.

Yuck.

> Have I mentioned lately how much I hate locks? :/

:)

Locks aren't really the problem here though -- it's the
finalizer-introduced concurrency, specifically in the case in which your
program is in a critical section.  If we ran finalizers in a separate
thread, we would not have these issues.

> The good news is that it should be possible to fix this (pre-existing)
> class of problems for 'symbols_lock' in stable-2.0 by changing
> 'symbol_lookup_assoc_fn' to avoid allocation.

That's not enough.  Adding spine segments, ribs, and associating a
disappearing link all allocate memory, and those are internal to the
hash table implementation.

^ The serious note :)

Maybe you'll never hit it.  I don't know.  It depends very much on your
allocation pattern.  What's the likelihood that a finalizer accesses a
symbol's characters?  Who knows.

Maybe it's good enough to document this defect in 2.0.  "Don't try to
string->symbol or symbol->string in a finalizer".

Andy
-- 
http://wingolog.org/



  reply	other threads:[~2012-03-07 17:25 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-05 17:17 [PATCH] Efficient Gensym Hack Mark H Weaver
2012-03-05 21:52 ` Andy Wingo
2012-03-06  3:16   ` Mark H Weaver
2012-03-06  8:56     ` Andy Wingo
2012-03-06  9:55 ` [PATCH] Efficient Gensym Hack (v2) Mark H Weaver
2012-03-07 10:40   ` Andy Wingo
2012-03-07 16:43     ` Mark H Weaver
2012-03-07 17:25       ` Andy Wingo [this message]
2012-03-07 19:28         ` Mark H Weaver
2012-03-07 20:04           ` Andy Wingo
2013-01-16 17:25   ` Andy Wingo
2012-03-10 22:55 ` [PATCH] Efficient Gensym Hack Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87vcmgz5k2.fsf@pobox.com \
    --to=wingo@pobox.com \
    --cc=guile-devel@gnu.org \
    --cc=mhw@netris.org \
    --cc=raeburn@raeburn.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).