From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel
Subject: Re: [PATCH] Efficient Gensym Hack
Date: Mon, 05 Mar 2012 22:52:02 +0100
Message-ID: <87sjhmsol9.fsf@pobox.com>
In-Reply-To: <87mx7vx8zg.fsf@netris.org> (Mark H. Weaver's message of "Mon, 05 Mar 2012 12:17:55 -0500")
To: Mark H Weaver
Cc: guile-devel@gnu.org

Hi Mark,

A quick reaction to your summary; I'll look at the patches shortly.

On Mon 05 Mar 2012 18:17, Mark H Weaver writes:

> Here's an implementation of the efficient gensym hack for stable-2.0.

Excellent!

> It makes 'gensym' about 4.7 times faster on my Yeeloong.  Gensyms are
> not given names or even numbers until they are asked for their names or
> hash values (for 'equal?' hash tables only).

Ah, interesting :)  I had always thought that you would need to number
them first, but I see that you found a way to avoid that.

> 1. The implementation of symbols is split between symbols.c and
>    strings.c, and the gensym hack needs the internals of both.  I had to
>    add some new internal functions, including one to make a stringbuf from
>    a string and one to make a string from a stringbuf.

Yeah, this is not good.  With my dynstack work I found that functions
that are internal but not static can prevent some important inlining.
(I found the performance impact using "perf record" and
valgrind --tool=callgrind.)  It's good that we have internal functions
to avoid bloating our public API, but they do seem to prevent
optimization.  I wonder if LTO could help here.

> 2. The symbol table uses the symbols themselves as the keys.  This was
>    already hairy and inefficient: take a look at symbol_lookup_assoc_fn,
>    which has to convert symbols to strings (which involves allocation) to
>    implement the hash lookup!

It uses the symbols as keys, but it uses the string hash value (not the
symbol hashq value) as the hash.  There are some important cases in
which no string need be allocated: scm_from_utf8_symbol and
scm_from_latin1_symbol.  But yes, it's hairy.

Note also that this has changed significantly in master.  Your thoughts
on that weak set mechanism would be appreciated.

>    IMHO, it would be much better to use a weak-value hash table, with
>    strings as the keys and symbols as the values.  Maybe we can do that
>    for 2.2.

Interesting idea.  It's not clear to me how this would solve this
problem though; but perhaps that will be clear when I read the patches.

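As an aside on the lazy-naming point quoted above (gensyms get no name or number until someone asks), here is a minimal standalone C sketch of that general idea. It is not Mark's patch and uses none of Guile's internals: the lazy_gensym struct, the function names, the "g" prefix and the global counter are all hypothetical, chosen only for illustration, and the hash-value case is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lazy gensym: the name is materialized only on demand. */
typedef struct lazy_gensym {
    char *name;                 /* NULL until the name is first requested */
} lazy_gensym;

/* Hypothetical counter, advanced only when a name is actually forced. */
static unsigned long gensym_counter = 0;

/* Creation does no string formatting and does not touch the counter. */
lazy_gensym *make_lazy_gensym(void)
{
    lazy_gensym *g = malloc(sizeof *g);
    if (g != NULL)
        g->name = NULL;
    return g;
}

/* Number and format the name on first request, then return the cached copy.
   Returns NULL if formatting or allocation fails. */
const char *lazy_gensym_name(lazy_gensym *g)
{
    assert(g != NULL);
    if (g->name == NULL) {
        char buf[32];
        int len = snprintf(buf, sizeof buf, "g%lu", gensym_counter);
        if (len < 0 || (size_t) len >= sizeof buf)
            return NULL;
        g->name = malloc((size_t) len + 1);
        if (g->name == NULL)
            return NULL;
        memcpy(g->name, buf, (size_t) len + 1);   /* copy including the NUL */
        gensym_counter++;
    }
    return g->name;
}

/* Release a gensym and its name, if one was ever forced. */
void free_lazy_gensym(lazy_gensym *g)
{
    if (g != NULL) {
        free(g->name);
        free(g);
    }
}
```

In this sketch the numbers follow the order in which names are requested, not the order of creation, and a gensym whose name is never requested costs one small allocation and no formatting.
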
Anyway, to keep this short, I'll look at the patches in another mail.

Cheers!

Andy

ps. An interesting benchmark (before and after) would be to time the
execution of (compile-file "module/ice-9/psyntax.scm").
-- 
http://wingolog.org/