From: Andy Wingo <wingo@pobox.com>
To: guile-devel <guile-devel@gnu.org>
Subject: data-crunching in guile
Date: Wed, 24 Jun 2009 14:03:10 +0200
Message-ID: <m3ljnhon41.fsf@pobox.com>

Hey all,

I'm pretty displeased about the way that the brainfuck compilation
works. Check this:

    brainfuck@(guile-user)> ,c ++]

(The trailing ] marks the end of the program. It's kind of a hack, but
it works fine.)

    Disassembly of #<objcode b719c8e0>:

       0    (make-int8:0)                   ;; 0
       1    (load-symbol "make-vector")     ;; make-vector
      16    (link-now)                      
      17    (variable-ref)                  
      18    (make-int16 117 48)             ;; 30000
      21    (make-int8:0)                   ;; 0
      22    (call 2)                        
      24    (local-set 1)                   
      26    (local-set 0)                   

OK, up to here we're good: it's a generic prologue that makes the tape
and the pointer.
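
In Scheme terms that prologue is roughly the following -- just a
sketch; the names `tape' and `pointer' are mine, corresponding to
locals 1 and 0 in the disassembly:

    (let ((tape (make-vector 30000 0))    ; local 1
          (pointer 0))                    ; local 0
      ;; the compiled program body runs in this scope
      pointer)

The disassembly then continues: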

      28    (load-symbol "vector-set!")     ;; vector-set!
      43    (link-now)                      
      44    (variable-ref)                  
      45    (local-ref 1)                   
      47    (local-ref 0)                   
      49    (load-symbol "vector-ref")      ;; vector-ref
      63    (link-now)                      
      64    (variable-ref)                  
      65    (local-ref 1)                   
      67    (local-ref 0)                   
      69    (call 2)                        
      71    (make-int8:1)                   ;; 1
      72    (add)                           
      73    (mv-call 3 :L32)                ;; MV -> 81
      77    (drop)                          
      78    (br :L33)                       ;; -> 84
      81    (truncate-values 0 0)           
      84    (load-symbol "vector-set!")     ;; vector-set!
      99    (link-now)                      
     100    (variable-ref)                  
     101    (local-ref 1)                   
     103    (local-ref 0)                   
     105    (load-symbol "vector-ref")      ;; vector-ref
     119    (link-now)                      
     120    (variable-ref)                  
     121    (local-ref 1)                   
     123    (local-ref 0)                   
     125    (call 2)                        
     127    (make-int8:1)                   ;; 1
     128    (add)                           
     129    (goto/args 3)      

Then all of this to add 2 to a cell of a vector!!!
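
For reference, each `+' is semantically nothing more than this (a
sketch; `tape' and `pointer' stand for locals 1 and 0):

    (vector-set! tape pointer
                 (+ (vector-ref tape pointer) 1))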

We have to resolve vector-set! and vector-ref anew at every `+',
because the code is not wrapped in a lambda and thus has no place to
cache the looked-up variable locations. See toplevel-ref in vm.texi
for more information.
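
As a sketch of what wrapping would buy us: if the program were
compiled as the body of a thunk, there would be a per-program object
table in which toplevel-ref could cache those lookups. Roughly:

    ;; Sketch only -- not what the compiler emits today.
    (lambda ()
      (let ((tape (make-vector 30000 0))
            (pointer 0))
        ;; the compiled `+' ops go in the body, where the vector-ref
        ;; and vector-set! lookups can be memoized after first use
        (vector-set! tape pointer (+ (vector-ref tape pointer) 1))))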

But then at every call to vector-set! (except the last one, of
course), we have to set up a multiple-values context, to handle the
possibility that vector-set! returns multiple values. In brainfuck,
which is an imperative language, most calls are for effect rather than
tail calls or calls for value, so these mv calls account for a
disproportionate share of the instructions.
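
In source terms the two `+'s form a two-element sequence; only the
final call can go out as a tail call (the goto/args above), so the
first one pays for the mv-call, drop, and truncate-values (sketch):

    (begin
      ;; effect position: mv-call + drop + truncate-values
      (vector-set! tape pointer (+ (vector-ref tape pointer) 1))
      ;; tail position: goto/args
      (vector-set! tape pointer (+ (vector-ref tape pointer) 1)))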

And all this for vector-set!, which only returns a value because it has
to (in the C API).

But I was thinking, the brainfuck pattern (random-access reads and
writes to a vector) is really the pattern of data crunching. The
compiled output of brainfuck programs is the same as the compiled output
of a hashing algorithm or an image synthesizer. And these data-crunching
operations are precisely those that are closest to the underlying
hardware, for better or for worse.

What I'm getting at is that I think we should have VM ops for working
on vectors -- ops for generic vectors, specific ops for bytevectors,
probably an op for string-ref as well, and possibly one for
string-set!. Then a native-code backend could effectively be
implemented at the GLIL or assembly level, relying on the Tree-IL
compiler's earlier resolution of high-level operations (e.g.,
vector-set!) to low-level instructions. I think we have the space in
the VM, and if we group all of the vector instructions at the end, we
shouldn't affect the instruction cache too much.
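
Just to illustrate the payoff, with dedicated ops a single `+' could
boil down to something like this -- a hypothetical listing, offsets
omitted; these opcodes and their stack discipline aren't settled:

       (local-ref 1)                   ;; tape
       (local-ref 0)                   ;; pointer
       (local-ref 1)                   ;; tape
       (local-ref 0)                   ;; pointer
       (vector-ref)
       (make-int8:1)
       (add)
       (vector-set)

No link-now, no variable-ref, and no mv-call dance for the
call-for-effect case.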

So that's my plan. Thoughts?

Cheers,

Andy
-- 
http://wingolog.org/



