unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* guile 3 update: more number unboxing improvements
@ 2017-11-29  8:14 Andy Wingo
  2017-11-30  3:03 ` Stefan Monnier
  0 siblings, 1 reply; 2+ messages in thread
From: Andy Wingo @ 2017-11-29  8:14 UTC (permalink / raw)
  To: guile-devel

Hi,

In the last update
(https://lists.gnu.org/archive/html/guile-devel/2017-11/msg00010.html) I
talked a lot about fixnums.  This last couple weeks' work is along the
same lines.

Firstly I added support for specialized instructions that compare (in
the < or = sense) s64 and u64 values against 8-bit immediates.  Then I
started to do some "exploding" of the comparison operators: for example,
if a number is a 64-bit integer, maybe first see if it's a fixnum and
take a fast path there, otherwise call out to a scm->u64 primitive.

I was thinking maybe we would be able to inline even scm->u64 entirely,
but this turned out to be a bad idea, and indeed even the speculative
inlining of the fixnum case of e.g. < wasn't that great.  The problem is
that if you replace a simple primcall (define x FOO) with a diamond
control structure like (define x (if FIXNUM? FOO-1 FOO-2)) early in the
compiler, that makes it harder to do code motion like CSE or LICM on X,
as X now has two definitions.  It could make sense to reify a diamond
control flow more towards the compiler back-end, but not in the
beginning.  Instruction explosion only makes sense if some of the
exploded instructions can be eliminated or are themselves subject to
CSE.  If it's just exploding instructions that can never be eliminated,
probably it's better to call out to the runtime (as in the case of
scm->u64, unless it can be reduced to untag-fixnum).

By this time I had backed myself into a little corner with number
specialization, so I had to back out and fix my mess; took a week or so,
but then I was back on track, with a new idea.  

The problem I was looking to solve was to hoist a fixnum? check above
code like (= x (logand x #xff)).  This predicate will only succeed if X
is a fixnum.  I then realized it would be tricky, as it would involve
hoisting the assertion made inside logand that X is an exact integer
"above" (but to where?), and it could be that the assertion doesn't
commute with other effects so it can't be hoisted.

So I modified the problem :)  Instead I wanted to optimize this kind of
code:

(define (t x)
  (unless (exact-integer? x) (error "expected an integer" x))
  (unless (= x (logand x #xff) (error "out of range" x))
  (* x 2))

or

(define (t x)
  (unless (exact-integer? x) (error "expected an integer" x))
  (unless (<= -10 x 100) (error "out of range" x))
  (* x 2))

Here exact-integer? is effectively (or (fixnum? x) (bignum? x)), so we
have a more precise type of X flowing into the predicate.  However it's
not magical; the "=" in the first example still sees that X could be a
bignum, and likewise for the <= operations.

What you want to do here is to "devirtualize" part of this function.
This is a compiler pass that effectively inlines the concrete
implementations of a virtual operation like =, logand, or <=.  But you
don't want to devirtualize just one operation; you want to devirtualize
a trace of operations.  Like this:

  (define (out-of-range x) (error "out of range" x))
  (define (not-int x) (error "expected an integer" x))
  (cond
   ((fixnum? x)
    (if (<= -10 x 100)
        (* x 2)
        (out-of-range x)))
   ((bignum? x)
    (if (<= -10 x 100)
        (* x 2)
        (out-of-range x)))
   (else
    (not-int x)))

This causes code growth initially, but probably causes shrinkage later;
concretely this code optimizes to:

  (define (out-of-range x) (error "out of range" x))
  (define (not-int x) (error "expected an integer" x))
  (cond
   ((fixnum? x)
    (let ((x* (untag-fixnum x)))
      (if (s64<= -10 x*)
          (if (s64<= x* 100)
              (tag-fixnum (s64+ x* x*))
              (out-of-range x))
          (out-of-range x))))
   ((bignum? x)
    (out-of-range x)
   (else
    (error "expected an integer" x)))

which is indeed what we get in code:

  Disassembly of #<procedure t (x)> at #xb437a0:

     0    (assert-nargs-ee/locals 2 0)    ;; 2 slots (1 arg)    at (unknown file):10:0
     1    (immediate-tag=? 0 3 2)         ;; fixnum?            at (unknown file):11:10
     3    (jne 11)                        ;; -> L1
     4    (untag-fixnum 1 0)                                    at (unknown file):12:10
     5    (s64-imm<? 1 -10)               
     6    (jl 14)                         ;; -> L2
     7    (imm-s64<? 1 100)               
     8    (jl 12)                         ;; -> L2
     9    (mov 0 1)                                             at (unknown file):13:2
    10    (uadd 1 0 1)                    
    11    (tag-fixnum 0 1)                
    12    (handle-interrupts)             
    13    (return-values 2)               ;; 1 value
  L1:
    14    (immediate-tag=? 0 7 0)         ;; heap-object?       at (unknown file):11:10
    16    (jne 8)                         ;; -> L3
    17    (heap-tag=? 0 4095 279)         ;; bignum?
    19    (jne 5)                         ;; -> L3
  L2:
    20    (throw/value 0 138)             ;; #(misc-error #f "out of range ~S") at (unknown file):12:25
    22    (handle-interrupts)             
    23    (return-values 1)               ;; 0 values
  L3:
    24    (throw/value 0 150)             ;; #(misc-error #f "expected an integer ~S") at (unknown file):11:29
    26    (handle-interrupts)             
    27    (return-values 1)               ;; 0 values

Pretty good code I think, though I should probably improve the
disassembler to make more readable disassemblies.  I would be interested
if anyone knows if other compilers do this.  It would appear that GCC
and LLVM do not:

  https://www.godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAKxAEZSBnVAV2OUxAHIBSAJgGY8AO2QAbZlgDU3fgGEGBfEIIA6BDOzcADAEFtOhcWbICkgEZ5gQ5gFsZAIX37mQvKiGThBTMEzFpAOyOupKhYcxeAA4ExAD6pgBmeAAe1nb8wTph2ZKGxqYWVraSAFSFaQ5OAQAilbrOru6eyj5%2B5pZpsQCG6Oi0khAubh5ercQAlHU6Q02jvv5RMfHxqLHlxRCLcQSTGU66Cl0EeMjNosKYkhHKABzxngyxSakbMyMt856T9UH6OZLETAEVgeTYqZ5paS8ABsklo42k/FqSLhU24NX2BgIRxOZwuV0a728ny2yyITxSkM2ymi20kyW%2Bel%2BIRyb2axLagNMMmqgXsknBlOKPIGUGuBFp8QRDMRshksjhCPRCv66ORmX%2BoS5IIBQLRNVCmMOx1OwnOQkubLmbVJBBWsWt/mpEqWpgZVQ1OTwCVFYtt0qV/E0Qck0N4Sphir%2Bmq1QJ1tvt6xsA3dexZMdCmFEDEw0YzOW1xCJkrt5IhG1TmTVRuxJrxFvZJck5Zs9ttAytHxtjPRnuyhdBzpLCLwgpeNkDwewqLTTNq9QOtdxZvxnY5/hb3V6/UGhPZYy%2BHrzYQA9CfJAB1S4AayEqAA7pIekpgHDJPf3GBOKZUAA3PwJKID6SAQCCXLaKjHv2cZFo2rptl2TqbuS7abBG/Lwvq856IuOKmkI5qWnujpPtuHbEYhh4/H2YQDg8FLjgMeCBgAYs2Qqts%2BO7MZIIDtEUnFkWhWGYmuB7PgALORwz7iSPbMlkBYwR4XEDKpEDqepzHjDpWGcOMpCiFwACsnCkEIXBaGZqBcHKvCOPZuQsGwlx8PwtBmQQln6QZ14gBJACcKgBSFoVhWFhlcBJZkWZwVmkDZnBmQwIBaKQXlxfppBwLASBoDYkR4KIfjkJQ%2BWFcVxAoKIXRCMAxlaGlSSiN4xApRAZjeaQFhCF0xAAJ5cB5pD5TYmDKAA8gRg2ZaQWA2LVwDFV1%2BCAiYeD/ils2YMkmDIMw3hDWZoxGbNMR4DY3kGecZgpZABmoNEwxbQAtBN/DJc57B0NdJkxV1iXJDc0IvdCUliItkjGSoWgwwMuCECQULuaQkiyKgBVFW0bnwp5V2%2Bf5QXhcToWRZw0XmQDXDJal6X42TvD/bNiV45lOmkP%2BbXDP5QA

I.e. each individual fixnum op is inlined, but GCC and LLVM don't peel
out a trace for the fixnum case.  I call this pass "integer
devirtualization" but also use the term "peeling a trace".  Words!

So that's my status.  Next up, I need to fix <= and >= on NaN values, as
I didn't get to that from last time; then I will move on to instruction
explosion of vector-ref and related instructions.  But first, I promised
Ludovic I would fix the compiler speed issue in slot allocation, both
for 2.2 and 3.0, so I will do that :)

Cheers,

Andy



^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: guile 3 update: more number unboxing improvements
  2017-11-29  8:14 guile 3 update: more number unboxing improvements Andy Wingo
@ 2017-11-30  3:03 ` Stefan Monnier
  0 siblings, 0 replies; 2+ messages in thread
From: Stefan Monnier @ 2017-11-30  3:03 UTC (permalink / raw)
  To: guile-devel

>   (define (out-of-range x) (error "out of range" x))
>   (define (not-int x) (error "expected an integer" x))
>   (cond
>    ((fixnum? x)
>     (if (<= -10 x 100)
>         (* x 2)
>         (out-of-range x)))
>    ((bignum? x)
>     (if (<= -10 x 100)
>         (* x 2)
>         (out-of-range x)))
>    (else
>     (not-int x)))

Looks a bit like the result of "splitting tails", in this case,
tho selectively.


        Stefan




^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2017-11-30  3:03 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-29  8:14 guile 3 update: more number unboxing improvements Andy Wingo
2017-11-30  3:03 ` Stefan Monnier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).