From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Mark H Weaver Newsgroups: gmane.lisp.guile.devel Subject: Re: [PATCH] Efficient Gensym Hack Date: Mon, 05 Mar 2012 22:16:17 -0500 Message-ID: <87y5rewha6.fsf@netris.org> References: <87mx7vx8zg.fsf@netris.org> <87sjhmsol9.fsf@pobox.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: dough.gmane.org 1331003920 20203 80.91.229.3 (6 Mar 2012 03:18:40 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Tue, 6 Mar 2012 03:18:40 +0000 (UTC) Cc: guile-devel@gnu.org To: Andy Wingo Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Tue Mar 06 04:18:39 2012 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1S4kv9-0003Su-0Z for guile-devel@m.gmane.org; Tue, 06 Mar 2012 04:18:39 +0100 Original-Received: from localhost ([::1]:55688 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S4kv8-0005VK-Ac for guile-devel@m.gmane.org; Mon, 05 Mar 2012 22:18:38 -0500 Original-Received: from eggs.gnu.org ([208.118.235.92]:59804) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S4kv5-0005VC-TL for guile-devel@gnu.org; Mon, 05 Mar 2012 22:18:37 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S4kv3-00038p-Jm for guile-devel@gnu.org; Mon, 05 Mar 2012 22:18:35 -0500 Original-Received: from world.peace.net ([96.39.62.75]:46997) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S4kv3-00038i-8i for guile-devel@gnu.org; Mon, 05 Mar 2012 22:18:33 -0500 Original-Received: from 209-6-91-212.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com ([209.6.91.212] helo=yeeloong) by world.peace.net with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1S4kut-0002GI-2X; Mon, 05 Mar 2012 22:18:23 -0500 In-Reply-To: <87sjhmsol9.fsf@pobox.com> (Andy Wingo's message of "Mon, 05 Mar 2012 22:52:02 +0100") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 96.39.62.75 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:14017 Archived-At: Hi Andy, thanks for the quick response! :) Andy Wingo writes: > On Mon 05 Mar 2012 18:17, Mark H Weaver writes: > >> It makes 'gensym' about 4.7 times faster on my Yeeloong. Gensyms are >> not given names or even numbers until they are asked for their names or >> hash values (for 'equal?' hash tables only). > > Ah, interesting :) I had always thought that you would need to number > them first, but I see that you found a way to avoid that. I got the idea from http://icfp06.cs.uchicago.edu/dybvig-talk.pdf Anyway, in retrospect, I don't even see how I could make it work otherwise. The problem is that with lazy gensyms, the name you ultimately assign to the gensym _must_ not already be interned. Think about it. Suppose you assign a gensym with prefix "foo" the number 6, and that there's another symbol already interned with the name "foo6". Now you have two distinct symbols (in the sense of 'eq?'), both semantically interned, with the same name. There's no way to recover from this. The only solution I could find is to give the gensym a name that has not already been interned. In my implementation, I don't increment the counter at all until the lazy gensym is "forced". If that name is already interned, and I just keep incrementing the counter until I find a unique symbol. Only after it has been successfully interned do I _commit_ to the new name by clearing the "lazy gensym flag". >> 1. The implementation of symbols is split between symbols.c and >> strings.c, and the gensym hack needs the internals of both. I had to >> add some new internal functions, including one to make a stringbuf from >> a string and one to make a string from a stringbuf. > > Yeah, this is not good. With my dynstack work I found that functions > that are internal but not static can prevent some important inlining. That's definitely true, but fortunately these new internal functions are used only by the gensym code. Anyway, if you're concerned about this, one option would be to combine string.c and symbol.c into a single file. > Your thoughts on that weak set mechanism would be appreciated. Everything I know about weak storage mechanisms I learned from Bruno Haible. Highly recommended reading: http://www.haible.de/bruno/papers/cs/weak/WeakDatastructures-writeup.html I'll explore issues regarding the symbol table in another email. Thanks! Mark