From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Andy Wingo Newsgroups: gmane.lisp.guile.devel Subject: Re: [PATCH] Efficient Gensym Hack (v2) Date: Wed, 07 Mar 2012 11:40:11 +0100 Message-ID: <8762eg1ypg.fsf@pobox.com> References: <87mx7vx8zg.fsf@netris.org> <87pqcqvysj.fsf@netris.org> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: dough.gmane.org 1331116860 19300 80.91.229.3 (7 Mar 2012 10:41:00 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Wed, 7 Mar 2012 10:41:00 +0000 (UTC) Cc: Ken Raeburn , guile-devel@gnu.org To: Mark H Weaver Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Wed Mar 07 11:40:58 2012 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1S5EIg-0004jb-Hv for guile-devel@m.gmane.org; Wed, 07 Mar 2012 11:40:54 +0100 Original-Received: from localhost ([::1]:57046 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S5EIf-0003Uq-Sw for guile-devel@m.gmane.org; Wed, 07 Mar 2012 05:40:53 -0500 Original-Received: from eggs.gnu.org ([208.118.235.92]:36695) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S5EIE-0002vl-TZ for guile-devel@gnu.org; Wed, 07 Mar 2012 05:40:49 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S5EI8-0000Sf-6Z for guile-devel@gnu.org; Wed, 07 Mar 2012 05:40:26 -0500 Original-Received: from a-pb-sasl-sd.pobox.com ([74.115.168.62]:62596 helo=sasl.smtp.pobox.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S5EI7-0000SR-T6 for guile-devel@gnu.org; Wed, 07 Mar 2012 05:40:20 -0500 Original-Received: from sasl.smtp.pobox.com (unknown [127.0.0.1]) by a-pb-sasl-sd.pobox.com (Postfix) with ESMTP id 3538D8A7B; Wed, 7 Mar 2012 05:40:17 -0500 (EST) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=ovMa4khsHx64BatE09c++6zedZc=; b=mimcUe ljFz34CWzcbCvjttJqlYxm6VDT/N8lMi3PvAZ9JoLPZntxLOUzvpt6i1m96ufccN yVqbtUhc7RE12PvcavrRVr3b322bIErEbMpmGVfVUzTKU71qKEc7ILoTM20mJTlo UTTArrQ5u7afB/uKPP5YoiTm4znH5NHBtTk18= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=pbQltl7ohE1aZ+djAszHw8WgiSRdQ7aX SkoZTlMbEVS96/OCG2RFnwiX5Xb6H5Ez9d7nPvEBeJTVe1oNZuQI/2kN2e5DTimM vo/ZrYY3ugBhc5CrkTg+UxqxbQ7YyjghXKlW7/ECjP0myfMN/ezWTXupBzLWC3Oz Vawv9knoj78= Original-Received: from a-pb-sasl-sd.pobox.com (unknown [127.0.0.1]) by a-pb-sasl-sd.pobox.com (Postfix) with ESMTP id 2E8188A7A; Wed, 7 Mar 2012 05:40:17 -0500 (EST) Original-Received: from badger (unknown [90.163.36.96]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by a-pb-sasl-sd.pobox.com (Postfix) with ESMTPSA id 4E9E78A79; Wed, 7 Mar 2012 05:40:15 -0500 (EST) In-Reply-To: <87pqcqvysj.fsf@netris.org> (Mark H. Weaver's message of "Tue, 06 Mar 2012 04:55:40 -0500") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux) X-Pobox-Relay-ID: ECE87B90-6841-11E1-9744-65B1DE995924-02397024!a-pb-sasl-sd.pobox.com X-detected-operating-system: by eggs.gnu.org: Solaris 10 (beta) X-Received-From: 74.115.168.62 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:14037 Archived-At: Hi Mark, On Tue 06 Mar 2012 10:55, Mark H Weaver writes: > +static void > +set_stringbuf_shared (SCM buf) > +{ > + /* Don't modify BUF if it's already marked as shared since it > + might be a read-only, statically allocated stringbuf. */ > + if (!STRINGBUF_SHARED (buf)) > + { > + scm_i_pthread_mutex_lock (&stringbuf_write_mutex); > + SCM_SET_CELL_WORD_0 (buf, SCM_CELL_WORD_0 (buf) | STRINGBUF_F_SHARED); > + scm_i_pthread_mutex_unlock (&stringbuf_write_mutex); > + } > +} > + Does this work, with C's memory model? It seems that if thread A sets the shared flag on stringbuf S, a concurrent call to set_stringbuf_shared(S) from thread B has no guarantee as to what value to see, as the initial flag check is outside the lock (a synchronization point). Perhaps it doesn't matter. This is the only place that the SHARED flag is accessed outside of a mutex, yes? Adding Ken for thoughts on threadsafety. If you are convinced it's right, please add a comment to the source code. > From 6c644645ecd2b1e84754b4759789edab2fdf9260 Mon Sep 17 00:00:00 2001 > From: Mark H Weaver > Date: Mon, 5 Mar 2012 10:06:34 -0500 > Subject: [PATCH 2/3] Move prototype for scm_i_try_narrow_string where it > belongs LGTM > + return scm_double_cell (scm_tc7_symbol | SCM_I_F_SYMBOL_LAZY_GENSYM, > + SCM_UNPACK (prefix_stringbuf), (scm_t_bits) 0, > + SCM_UNPACK (scm_cons (SCM_BOOL_F, SCM_EOL))); Would be nice to avoid the plist cons if possible, but that's another issue. > + /* Attempt to intern the symbol */ > + handle = scm_hash_fn_create_handle_x (symbols, sym, SCM_UNDEFINED, > + symbol_lookup_hash_fn, > + symbol_lookup_assoc_fn, > + NULL); > + } while (SCM_UNLIKELY (!scm_is_eq (sym, SCM_CAR (handle)))); Note that this is racy: this is a weak key hash table, so it's not safe to access the car of the handle outside the alloc lock. I suppose though that given that you have a strong reference to the value you're comparing to, and you're using pointer comparison, this will work. But note that sometimes scm_is_eq will get a stale value or a 0 as its second argument. > + /* We must not clear the lazy gensym flag until our symbol has > + been interned. The lock does not save us here, because another > + thread could retrieve our gensym's name or hash outside of any > + lock. */ > + SCM_SET_CELL_WORD_0 (sym, (SCM_CELL_WORD_0 (sym) > + & ~SCM_I_F_SYMBOL_LAZY_GENSYM)); > + } > + scm_i_pthread_mutex_unlock (&symbols_lock); > +} Doing all this work within a mutex is prone to deadlock, if allocation causes a finalizer to run that forces another lazy symbol. This is not an issue on `master'. If we can get around this potential problem, then we should indeed apply this to stable-2.0 as well. > #define scm_is_symbol(x) (!SCM_IMP (x) \ > && (SCM_TYP7 (x) == scm_tc7_symbol)) > -#define scm_i_symbol_hash(x) ((unsigned long) SCM_CELL_WORD_2 (x)) > #define scm_i_symbol_is_interned(x) \ > (!(SCM_CELL_WORD_0 (x) & SCM_I_F_SYMBOL_UNINTERNED)) > +#define scm_i_symbol_is_lazy_gensym(x) \ > + (SCM_CELL_WORD_0 (x) & SCM_I_F_SYMBOL_LAZY_GENSYM) > > #define SCM_I_F_SYMBOL_UNINTERNED 0x100 > +#define SCM_I_F_SYMBOL_LAZY_GENSYM 0x200 Can we make this change in stable-2.0? It is an ABI change of sorts. If you are convinced that we can, please surround the scm_i_* with #ifdef BUILDING_LIBGUILE. OK, I think that's it. I'm very much looking forward to this going in: on master, meta/guile examples/web/hello.scm spends 13% of its instructions in gensym and resulting GC foo. Regards, Andy -- http://wingolog.org/