unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: Andy Wingo <wingo@pobox.com>
To: Mark H Weaver <mhw@netris.org>
Cc: Ken Raeburn <raeburn@raeburn.org>, guile-devel@gnu.org
Subject: Re: [PATCH] Efficient Gensym Hack (v2)
Date: Wed, 07 Mar 2012 11:40:11 +0100	[thread overview]
Message-ID: <8762eg1ypg.fsf@pobox.com> (raw)
In-Reply-To: <87pqcqvysj.fsf@netris.org> (Mark H. Weaver's message of "Tue, 06 Mar 2012 04:55:40 -0500")

Hi Mark,

On Tue 06 Mar 2012 10:55, Mark H Weaver <mhw@netris.org> writes:

> +static void
> +set_stringbuf_shared (SCM buf)
> +{
> +  /* Don't modify BUF if it's already marked as shared since it
> +     might be a read-only, statically allocated stringbuf.  */
> +  if (!STRINGBUF_SHARED (buf))
> +    {
> +      scm_i_pthread_mutex_lock (&stringbuf_write_mutex);
> +      SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED);
> +      scm_i_pthread_mutex_unlock (&stringbuf_write_mutex);
> +    }
> +}
> +

Does this work, with C's memory model?  It seems that if thread A sets
the shared flag on stringbuf S, a concurrent call to
set_stringbuf_shared(S) from thread B has no guarantee as to what value
to see, as the initial flag check is outside the lock (a synchronization
point).

Perhaps it doesn't matter.  This is the only place that the SHARED flag
is accessed outside of a mutex, yes?

Adding Ken for thoughts on threadsafety.  If you are convinced it's
right, please add a comment to the source code.

> From 6c644645ecd2b1e84754b4759789edab2fdf9260 Mon Sep 17 00:00:00 2001
> From: Mark H Weaver <mhw@netris.org>
> Date: Mon, 5 Mar 2012 10:06:34 -0500
> Subject: [PATCH 2/3] Move prototype for scm_i_try_narrow_string where it
>  belongs

LGTM

> +  return scm_double_cell (scm_tc7_symbol | SCM_I_F_SYMBOL_LAZY_GENSYM,
> +                          SCM_UNPACK (prefix_stringbuf), (scm_t_bits) 0,
> +                          SCM_UNPACK (scm_cons (SCM_BOOL_F, SCM_EOL)));

Would be nice to avoid the plist cons if possible, but that's another
issue.

> +          /* Attempt to intern the symbol */
> +          handle = scm_hash_fn_create_handle_x (symbols, sym, SCM_UNDEFINED,
> +                                                symbol_lookup_hash_fn,
> +                                                symbol_lookup_assoc_fn,
> +                                                NULL);
> +        } while (SCM_UNLIKELY (!scm_is_eq (sym, SCM_CAR (handle))));

Note that this is racy: this is a weak key hash table, so it's not safe
to access the car of the handle outside the alloc lock.  I suppose
though that given that you have a strong reference to the value you're
comparing to, and you're using pointer comparison, this will work.  But
note that sometimes scm_is_eq will get a stale value or a 0 as its
second argument.

> +      /* We must not clear the lazy gensym flag until our symbol has
> +         been interned.  The lock does not save us here, because another
> +         thread could retrieve our gensym's name or hash outside of any
> +         lock. */
> +      SCM_SET_CELL_WORD_0 (sym, (SCM_CELL_WORD_0 (sym)
> +                                 & ~SCM_I_F_SYMBOL_LAZY_GENSYM));
> +    }
> +  scm_i_pthread_mutex_unlock (&symbols_lock);
> +}

Doing all this work within a mutex is prone to deadlock, if allocation
causes a finalizer to run that forces another lazy symbol.

This is not an issue on `master'.

If we can get around this potential problem, then we should indeed apply
this to stable-2.0 as well.

>  #define scm_is_symbol(x)            (!SCM_IMP (x) \
>                                       && (SCM_TYP7 (x) == scm_tc7_symbol))
> -#define scm_i_symbol_hash(x)        ((unsigned long) SCM_CELL_WORD_2 (x))
>  #define scm_i_symbol_is_interned(x) \
>    (!(SCM_CELL_WORD_0 (x) & SCM_I_F_SYMBOL_UNINTERNED))
> +#define scm_i_symbol_is_lazy_gensym(x) \
> +  (SCM_CELL_WORD_0 (x) & SCM_I_F_SYMBOL_LAZY_GENSYM)
>  
>  #define SCM_I_F_SYMBOL_UNINTERNED   0x100
> +#define SCM_I_F_SYMBOL_LAZY_GENSYM  0x200

Can we make this change in stable-2.0?  It is an ABI change of sorts.

If you are convinced that we can, please surround the scm_i_* with
#ifdef BUILDING_LIBGUILE.

OK, I think that's it.  I'm very much looking forward to this going in:
on master, meta/guile examples/web/hello.scm spends 13% of its
instructions in gensym and resulting GC foo.

Regards,

Andy
-- 
http://wingolog.org/



  reply	other threads:[~2012-03-07 10:40 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-05 17:17 [PATCH] Efficient Gensym Hack Mark H Weaver
2012-03-05 21:52 ` Andy Wingo
2012-03-06  3:16   ` Mark H Weaver
2012-03-06  8:56     ` Andy Wingo
2012-03-06  9:55 ` [PATCH] Efficient Gensym Hack (v2) Mark H Weaver
2012-03-07 10:40   ` Andy Wingo [this message]
2012-03-07 16:43     ` Mark H Weaver
2012-03-07 17:25       ` Andy Wingo
2012-03-07 19:28         ` Mark H Weaver
2012-03-07 20:04           ` Andy Wingo
2013-01-16 17:25   ` Andy Wingo
2012-03-10 22:55 ` [PATCH] Efficient Gensym Hack Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8762eg1ypg.fsf@pobox.com \
    --to=wingo@pobox.com \
    --cc=guile-devel@gnu.org \
    --cc=mhw@netris.org \
    --cc=raeburn@raeburn.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).