Subject: data-crunching in guile
From: Andy Wingo @ 2009-06-24 12:03 UTC
  To: guile-devel

Hey all,

I'm pretty displeased with the way that the brainfuck compilation
currently works. Check this:

    brainfuck@(guile-user)> ,c ++]

(The trailing ] marks the end of the program. It's kind of a hack,
but it works fine.)

    Disassembly of #<objcode b719c8e0>:

       0    (make-int8:0)                   ;; 0
       1    (load-symbol "make-vector")     ;; make-vector
      16    (link-now)                      
      17    (variable-ref)                  
      18    (make-int16 117 48)             ;; 30000
      21    (make-int8:0)                   ;; 0
      22    (call 2)                        
      24    (local-set 1)                   
      26    (local-set 0)                   

OK, up to here we're good: this is a generic prologue that makes the
tape and the pointer.

      28    (load-symbol "vector-set!")     ;; vector-set!
      43    (link-now)                      
      44    (variable-ref)                  
      45    (local-ref 1)                   
      47    (local-ref 0)                   
      49    (load-symbol "vector-ref")      ;; vector-ref
      63    (link-now)                      
      64    (variable-ref)                  
      65    (local-ref 1)                   
      67    (local-ref 0)                   
      69    (call 2)                        
      71    (make-int8:1)                   ;; 1
      72    (add)                           
      73    (mv-call 3 :L32)                ;; MV -> 81
      77    (drop)                          
      78    (br :L33)                       ;; -> 84
      81    (truncate-values 0 0)           
      84    (load-symbol "vector-set!")     ;; vector-set!
      99    (link-now)                      
     100    (variable-ref)                  
     101    (local-ref 1)                   
     103    (local-ref 0)                   
     105    (load-symbol "vector-ref")      ;; vector-ref
     119    (link-now)                      
     120    (variable-ref)                  
     121    (local-ref 1)                   
     123    (local-ref 0)                   
     125    (call 2)                        
     127    (make-int8:1)                   ;; 1
     128    (add)                           
     129    (goto/args 3)      

Then all of this, just to add 2 to a cell of a vector!!!
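
In Scheme terms, the whole program boils down to something like this
(my sketch of what the listing computes; which local holds the tape
and which holds the pointer is my reading of the prologue):

    ;; sketch: what ",c ++]" computes, written as plain Scheme
    (let ((pointer 0)                      ; local 0
          (tape (make-vector 30000 0)))    ; local 1
      ;; first `+'
      (vector-set! tape pointer (+ (vector-ref tape pointer) 1))
      ;; second `+', compiled as the tail call at the end
      (vector-set! tape pointer (+ (vector-ref tape pointer) 1)))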

We have to resolve vector-set! and vector-ref afresh at every `+'
because the code is not wrapped in a lambda, and thus has no place to
cache the looked-up variable locations. See toplevel-ref in vm.texi
for more info.
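
For comparison, the same cell-bumping operation written inside a
procedure (bump-cell! is a made-up name, just to illustrate the
caching mechanism described above):

    ;; sketch: inside a procedure, the compiler has an object table in
    ;; which to cache the resolved vector-ref and vector-set! variables
    ;; (the toplevel-ref mechanism), so the load-symbol / link-now dance
    ;; happens at most once instead of at every `+'
    (define (bump-cell! tape pointer)
      (vector-set! tape pointer (+ (vector-ref tape pointer) 1)))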

But then at every call to vector-set! (except the last one, of
course), we have to set up a multiple-values context, to cope with the
possibility that the call returns multiple values. In brainfuck, which
is an imperative language, most calls are for effect, not tail calls
or calls for value, so these mv-call sequences account for a
disproportionate share of the instructions.
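
Spelled out in Scheme, the mv-call dance amounts to this for each
for-effect call (tape and pointer as in the earlier sketch; this is
just to make the plumbing explicit, not how the compiler spells it):

    ;; sketch: call for effect, then accept and discard however many
    ;; values come back -- the job of mv-call / drop / truncate-values
    (call-with-values
        (lambda ()
          (vector-set! tape pointer (+ (vector-ref tape pointer) 1)))
      (lambda ignored-values #f))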

And all this for vector-set!, which only returns a value because it
has to (in the C API).

But I was thinking: the brainfuck pattern (random-access reads and
writes to a vector) is really the pattern of data crunching. The
compiled output of a brainfuck program looks much the same as the
compiled output of a hashing algorithm or an image synthesizer. And
these data-crunching operations are precisely the ones that sit
closest to the underlying hardware, for better or for worse.
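
To make the analogy concrete, here is the sort of inner loop I mean --
a made-up example, a djb2-style hash over a bytevector, not code from
anywhere in Guile:

    (use-modules (rnrs bytevectors))

    ;; illustrative only: the hot path is indexed reads in a tight
    ;; loop -- the same shape of code that brainfuck compiles to
    (define (hash-bytevector bv)
      (let ((len (bytevector-length bv)))
        (let lp ((i 0) (h 5381))
          (if (< i len)
              (lp (+ i 1)
                  (logand #xffffffff
                          (+ (* h 33) (bytevector-u8-ref bv i))))
              h))))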

What I'm getting at is that I think we should have VM ops for working
on vectors -- ops for generic vectors, specific ops for bytevectors,
probably an op for string-ref as well, and possibly one for
string-set!. A native-code backend could then be implemented
effectively at the GLIL or assembly level, relying on the Tree-IL
compiler having already resolved high-level operations (e.g.,
vector-set!) to low-level instructions. I think we have the space in
the VM, and if we group all of the vector instructions at the end, we
shouldn't affect the instruction cache too much.
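
With ops like those, the sequence for a single `+' could come down to
something like this -- the instruction names and stack discipline are
made up for the sake of illustration:

    (local-ref 1)                   ;; the tape
    (local-ref 0)                   ;; the pointer
    (local-ref 1)
    (local-ref 0)
    (vector-ref)                    ;; fetch the cell
    (make-int8:1)                   ;; 1
    (add)
    (vector-set)                    ;; store it back

Eight instructions, with no link-now, no variable-ref, and no
multiple-values bookkeeping.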

So that's my plan. Thoughts?

Cheers,

Andy
-- 
http://wingolog.org/



