From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Andy Wingo Newsgroups: gmane.lisp.guile.devel Subject: data-crunching in guile Date: Wed, 24 Jun 2009 14:03:10 +0200 Message-ID: NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1245845005 20567 80.91.229.12 (24 Jun 2009 12:03:25 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 24 Jun 2009 12:03:25 +0000 (UTC) To: guile-devel Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Wed Jun 24 14:03:18 2009 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1MJRC6-0008DV-68 for guile-devel@m.gmane.org; Wed, 24 Jun 2009 14:03:17 +0200 Original-Received: from localhost ([127.0.0.1]:43956 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MJRC5-0007Vr-HD for guile-devel@m.gmane.org; Wed, 24 Jun 2009 08:03:13 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1MJRBw-0007St-Rf for guile-devel@gnu.org; Wed, 24 Jun 2009 08:03:04 -0400 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1MJRBs-0007Oq-VZ for guile-devel@gnu.org; Wed, 24 Jun 2009 08:03:04 -0400 Original-Received: from [199.232.76.173] (port=60926 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MJRBs-0007OS-L8 for guile-devel@gnu.org; Wed, 24 Jun 2009 08:03:00 -0400 Original-Received: from a-sasl-quonix.sasl.smtp.pobox.com ([208.72.237.25]:53212 helo=sasl.smtp.pobox.com) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1MJRBm-0007zs-Bf for guile-devel@gnu.org; Wed, 24 Jun 2009 08:02:54 -0400 Original-Received: from localhost.localdomain (unknown [127.0.0.1]) by a-sasl-quonix.sasl.smtp.pobox.com (Postfix) with ESMTP id 05ED12261C for ; Wed, 24 Jun 2009 08:02:53 -0400 (EDT) Original-Received: from unquote (unknown [83.44.189.188]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by a-sasl-quonix.sasl.smtp.pobox.com (Postfix) with ESMTPSA id 36E602261B for ; Wed, 24 Jun 2009 08:02:51 -0400 (EDT) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux) X-Pobox-Relay-ID: F21F1744-60B6-11DE-BE71-DC021A496417-02397024!a-sasl-quonix.pobox.com X-detected-operating-system: by monty-python.gnu.org: Solaris 10 (beta) X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:8755 Archived-At: Hey all, I'm pretty displeased about the way that the brainfuck compilation works. Check this: brainfuck@(guile-user)> ,c ++] (The trailing ] is to note the end of the program. It's kindof a hack, but it works fine.) Disassembly of #: 0 (make-int8:0) ;; 0 1 (load-symbol "make-vector") ;; make-vector 16 (link-now) 17 (variable-ref) 18 (make-int16 117 48) ;; 30000 21 (make-int8:0) ;; 0 22 (call 2) 24 (local-set 1) 26 (local-set 0) OK, up to here we're good, it's a generic prolog to make the tape and pointer. 28 (load-symbol "vector-set!") ;; vector-set! 43 (link-now) 44 (variable-ref) 45 (local-ref 1) 47 (local-ref 0) 49 (load-symbol "vector-ref") ;; vector-ref 63 (link-now) 64 (variable-ref) 65 (local-ref 1) 67 (local-ref 0) 69 (call 2) 71 (make-int8:1) ;; 1 72 (add) 73 (mv-call 3 :L32) ;; MV -> 81 77 (drop) 78 (br :L33) ;; -> 84 81 (truncate-values 0 0) 84 (load-symbol "vector-set!") ;; vector-set! 99 (link-now) 100 (variable-ref) 101 (local-ref 1) 103 (local-ref 0) 105 (load-symbol "vector-ref") ;; vector-ref 119 (link-now) 120 (variable-ref) 121 (local-ref 1) 123 (local-ref 0) 125 (call 2) 127 (make-int8:1) ;; 1 128 (add) 129 (goto/args 3) Then all of this to add 2 to a cell of a vector!!! The fact that we have to resolve vector-set! and vector-ref, indeed at every `+', is due to the fact that the code is not wrapped in a lambda, and thus has no place to cache the looked up var locations. See toplevel-ref in vm.texi, for more info. But then at every call to vector-set! (except the last one of course), we have to set up a multiple-values context, in the possibility that vector-ref returns multiple values. In brainfuck, which is an imperative language, most calls are for effect, not tail calls or calls for value, meaning that mv calls proportionally take up many instructions. And all this for vector-set!, which only returns a value because it has to (in the C api). But I was thinking, the brainfuck pattern (random-access reads and writes to a vector) is really the pattern of data crunching. The compiled output of brainfuck programs is the same as the compiled output of a hashing algorithm or an image synthesizer. And these data-crunching operations are precisely those that are closest to the underlying hardware, for better or for worse. What I'm getting at is that I think we should have VM ops for working on vectors -- both generic vectors, and specific ops for bytevectors, and probably an op for string-ref as well, and possibly string-set!. Then a native code backend could be effectively implemented to operate on the GLIL or assembly level, relying on the Tree-IL compiler's previous resolution of high-level operations (i.e., vector-set!) to low-level instructions. I think we have the space in the VM, and if we group all of the vector instructions at the end, we shouldn't affect the instruction cache too much. So that's my plan. Thoughts? Cheers, Andy -- http://wingolog.org/