* tracing and profiling at the repl
@ 2009-12-21 22:18 Andy Wingo
2009-12-22 21:45 ` Neil Jerram
0 siblings, 1 reply; 5+ messages in thread
From: Andy Wingo @ 2009-12-21 22:18 UTC (permalink / raw)
To: guile-devel
Hello,
wingo@unquote:~/src/guile$ meta/guile
Guile Scheme interpreter 0.5 on Guile 1.9.6
Copyright (C) 2001-2008 Free Software Foundation, Inc.
Enter `,help' for help.
scheme@(guile-user)> (define (fibo x)
... (if (or (= x 1) (= x 2))
... 1
... (+ (fibo (- x 1))
... (fibo (- x 2)))))
scheme@(guile-user)> ,tr (fibo 5)
#<program 8c74200 at system/vm/trace.scm:30:6 ()>
* #<program 8c74240 at <unknown port>:0:1 ()>
* fibo 5
** fibo 4
*** fibo 3
**** fibo 2
**** fibo 1
*** fibo 2
** fibo 3
*** fibo 2
*** fibo 1
#<program 8c741e0 at system/vm/trace.scm:31:6 ()>
vm-trace-off! #<vm 8904680>
Sweet, no? We don't have return values here unfortunately, but that
could be arranged.
Now this too:
scheme@(guile-user)> ,pr (fibo 30)
% cumulative self
time seconds seconds name
100.00 0.29 0.29 fibo
0.00 0.29 0.00 top-repl
0.00 0.29 0.00 call-with-backtrace
0.00 0.29 0.00 #<program 8caeb40 ()>
0.00 0.29 0.00 #<program 8ca10e0 opts>
0.00 0.29 0.00 start-repl
0.00 0.29 0.00 #<program 89f3420 at ice-9/boot-9.scm:3394:10 ()>
0.00 0.29 0.00 #<program 8ca13d0 at system/repl/repl.scm:87:9 ()>
---
Sample count: 23
Total time: 0.29 seconds (0 seconds in GC)
Unfortunately (make-stack ...) doesn't actually do anything right now,
so we see the whole continuation. Also unfortunately we don't see subrs
on the stack, like dynamic-wind. But hopefully these problems will be
fixed within a couple weeks.
Andy
--
http://wingolog.org/
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: tracing and profiling at the repl
2009-12-21 22:18 tracing and profiling at the repl Andy Wingo
@ 2009-12-22 21:45 ` Neil Jerram
2009-12-23 8:26 ` Andy Wingo
0 siblings, 1 reply; 5+ messages in thread
From: Neil Jerram @ 2009-12-22 21:45 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Andy Wingo <wingo@pobox.com> writes:
> scheme@(guile-user)> ,tr (fibo 5)
> #<program 8c74200 at system/vm/trace.scm:30:6 ()>
> * #<program 8c74240 at <unknown port>:0:1 ()>
> * fibo 5
> ** fibo 4
> *** fibo 3
> **** fibo 2
> **** fibo 1
> *** fibo 2
> ** fibo 3
> *** fibo 2
> *** fibo 1
> #<program 8c741e0 at system/vm/trace.scm:31:6 ()>
> vm-trace-off! #<vm 8904680>
>
> Sweet, no?
Hi Andy,
Interesting stuff!
But I'm not sure. To me it seems less nice than the version that I
recently posted.
- Why are those #<program> and vm-trace-off! lines there?
- What if the function that you want to trace is not the one that you
want to invoke at top level?
- What if you want to trace several functions within a given
computation?
- Alternatively, if ,tr is supposed to mean "trace everything within
this computation", why aren't there trace lines for the applications
of `or', `=' and so on?
> We don't have return values here unfortunately, but that
> could be arranged.
Yes, I believe they are useful too.
> Now this too:
>
> scheme@(guile-user)> ,pr (fibo 30)
What would someone typing that actually be looking for? I would guess:
something about the timings of the internal parts of fibo ...
> % cumulative self
> time seconds seconds name
> 100.00 0.29 0.29 fibo
> 0.00 0.29 0.00 top-repl
> 0.00 0.29 0.00 call-with-backtrace
> 0.00 0.29 0.00 #<program 8caeb40 ()>
> 0.00 0.29 0.00 #<program 8ca10e0 opts>
> 0.00 0.29 0.00 start-repl
> 0.00 0.29 0.00 #<program 89f3420 at ice-9/boot-9.scm:3394:10 ()>
> 0.00 0.29 0.00 #<program 8ca13d0 at system/repl/repl.scm:87:9 ()>
> ---
> Sample count: 23
> Total time: 0.29 seconds (0 seconds in GC)
... but there is nothing here at a finer-grained level than fibo
itself. So I'm afraid I don't see the usefulness, yet.
> Unfortunately (make-stack ...) doesn't actually do anything right now,
> so we see the whole continuation. Also unfortunately we don't see subrs
> on the stack, like dynamic-wind. But hopefully these problems will be
> fixed within a couple weeks.
Cool. But I think the key thing is generating the internal information
that the user would really want. Perhaps that requires a more complex
example.
Regards,
Neil
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: tracing and profiling at the repl
2009-12-22 21:45 ` Neil Jerram
@ 2009-12-23 8:26 ` Andy Wingo
2009-12-23 16:03 ` Neil Jerram
0 siblings, 1 reply; 5+ messages in thread
From: Andy Wingo @ 2009-12-23 8:26 UTC (permalink / raw)
To: Neil Jerram; +Cc: guile-devel
Hi Neil,
I really must be miscommunicating here :) Let's see if I can do a better
job.
On Tue 22 Dec 2009 22:45, Neil Jerram <neil@ossau.uklinux.net> writes:
> Andy Wingo <wingo@pobox.com> writes:
>
>> scheme@(guile-user)> ,tr (fibo 5)
>> #<program 8c74200 at system/vm/trace.scm:30:6 ()>
>> * #<program 8c74240 at <unknown port>:0:1 ()>
>> * fibo 5
>> ** fibo 4
>> *** fibo 3
>> **** fibo 2
>> **** fibo 1
>> *** fibo 2
>> ** fibo 3
>> *** fibo 2
>> *** fibo 1
>> #<program 8c741e0 at system/vm/trace.scm:31:6 ()>
>> vm-trace-off! #<vm 8904680>
>
> - Why are those #<program> and vm-trace-off! lines there?
,tr (short for ,trace) traces *all* function applications that happen
when executing a form. It prints the procedure name and arguments, and
the stack level at that point. If the procedure doesn't have a name, it
just prints the procedure as-is -- thus the #<program ...> lines there.
In fact the #<program> lines there are all the anonymous functions
passed to dynamic-wind in trace.scm. Those lines are there because
tracing is still on until vm-trace-off! gets called.
> - What if the function that you want to trace is not the one that you
> want to invoke at top level?
>
> - What if you want to trace several functions within a given
> computation?
This works. You get a lot of output, as things are currently; but we can
restrict tracing to work only on certain procedures, etc.
> - Alternatively, if ,tr is supposed to mean "trace everything within
> this computation", why aren't there trace lines for the applications
> of `or', `=' and so on?
Perhaps it would help to disassemble fibo itself:
scheme@(guile-user)> ,x fibo
Disassembly of #<program fibo (x)>:
0 (assert-nargs-ee 0 1)
3 (reserve-locals 0 2)
Prolog
6 (local-ref 0) ;; `x'
8 (make-int8:1) ;; 1
9 (ee?) at (unknown file):1:10
10 (local-set 1) ;; `t'
12 (local-ref 1) ;; `t'
14 (br-if-not :L71) ;; -> 24
18 (local-ref 1) ;; `t'
20 (br :L72) ;; -> 29
24 (local-ref 0) ;; `x'
26 (make-int8 2) ;; 2
28 (ee?)
29 (br-if-not :L73) ;; -> 35 at (unknown file):1:2
Is x 1 or 2?
33 (make-int8:1) ;; 1
34 (return)
If so return 1
35 (new-frame) at (unknown file):3:9
36 (toplevel-ref 1) ;; `fibo'
38 (local-ref 0) ;; `x'
40 (sub1) at (unknown file):3:15
41 (call 1) at (unknown file):3:9
Otherwise call (fibo (1- x))
43 (new-frame) at (unknown file):4:1
44 (toplevel-ref 1) ;; `fibo'
46 (local-ref 0) ;; `x'
48 (make-int8 2) ;; 2
50 (sub) at (unknown file):4:7
51 (call 1) at (unknown file):4:1
and (fibo (- x 2)
53 (add) at (unknown file):3:6
54 (return)
and add and return.
The VM can fire hooks at function application, leaving a function,
entering a new frame, leaving a frame, on "next instruction", and a
couple other places. The tracer hooks into the first four. As you can
see from the disassembly, there is no "or" -- it gets expanded away by
psyntax anyway. = is on lines 9 and 28, inlined into the body. So you
see that the only function that is called is... fibo itself. That's why
it's the only one in the trace.
>> We don't have return values here unfortunately, but that
>> could be arranged.
>
> Yes, I believe they are useful too.
Yep.
>> Now this too:
>>
>> scheme@(guile-user)> ,pr (fibo 30)
>
> What would someone typing that actually be looking for? I would guess:
> something about the timings of the internal parts of fibo ...
>
>> % cumulative self
>> time seconds seconds name
>> 100.00 0.29 0.29 fibo
>> 0.00 0.29 0.00 top-repl
>> 0.00 0.29 0.00 call-with-backtrace
>> 0.00 0.29 0.00 #<program 8caeb40 ()>
>> 0.00 0.29 0.00 #<program 8ca10e0 opts>
>> 0.00 0.29 0.00 start-repl
>> 0.00 0.29 0.00 #<program 89f3420 at ice-9/boot-9.scm:3394:10 ()>
>> 0.00 0.29 0.00 #<program 8ca13d0 at system/repl/repl.scm:87:9 ()>
>> ---
>> Sample count: 23
>> Total time: 0.29 seconds (0 seconds in GC)
>
> ... but there is nothing here at a finer-grained level than fibo
> itself. So I'm afraid I don't see the usefulness, yet.
As I mentioned making new stacks doesn't work. When that does you will
see:
>> % cumulative self
>> time seconds seconds name
>> 100.00 0.29 0.29 fibo
>> ---
>> Sample count: 23
>> Total time: 0.29 seconds (0 seconds in GC)
You can't get any more information on the function call level in this
example, because it's not complicated enough. But try it on anything
else -- (resolve-module '(gnome gtk)) for example:
% cumulative self
time seconds seconds name
14.18 0.51 0.26 write-bytecode
9.22 0.17 0.17 #<program 9f1c1e0 at language/scheme/spec.scm:40:16 (port env)>
7.09 0.13 0.13 byte-length
4.26 0.31 0.08 glil->assembly
2.13 0.17 0.04 dump-object
2.13 0.15 0.04 #{\get-global-definition-hook\\\ 90}#
2.13 0.04 0.04 write-byte
2.13 0.04 0.04 module-ref
1.42 1.80 0.03 #<program 9e4ed20 at ice-9/boot-9.scm:2039:4 (name . args)>
1.42 0.31 0.03 #{\syntax-type\\\ 162}#
1.42 0.15 0.03 #<program a10ef50 at language/assembly.scm:63:8 (x len)>
1.42 0.06 0.03 #{\id-var-name\\\ 150}#
1.42 0.03 0.03 #<program a00a870 at language/tree-il/fix-letrec.scm:57:18 (x unref ref set simple lambda* complex)>
1.42 0.03 0.03 object->assembly
0.71 1.33 0.01 eval
This is when it had to compile a bunch of things, so really it's a test
of the compiler. Totally useful information though -- we really need to
speed up the bytecode writer!
Regards,
Andy
--
http://wingolog.org/
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: tracing and profiling at the repl
2009-12-23 8:26 ` Andy Wingo
@ 2009-12-23 16:03 ` Neil Jerram
2009-12-24 16:30 ` Andy Wingo
0 siblings, 1 reply; 5+ messages in thread
From: Neil Jerram @ 2009-12-23 16:03 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Andy Wingo <wingo@pobox.com> writes:
> Hi Neil,
>
> I really must be miscommunicating here :)
I think you just left out a couple of details. Plus I think you're
assuming that there is no requirement (or usefulness) for tracing or
profiling below the program level - i.e. within a sequence of VM
operations.
Is that assumption correct? Doesn't it mean that we can profile/trace
less as the compiler becomes cleverer?
> Let's see if I can do a better job.
Thank you! :-)
> ,tr (short for ,trace) traces *all* function applications that happen
> when executing a form.
OK.
>> - What if the function that you want to trace is not the one that you
>> want to invoke at top level?
>>
>> - What if you want to trace several functions within a given
>> computation?
>
> This works. You get a lot of output, as things are currently; but we can
> restrict tracing to work only on certain procedures, etc.
That sounds good. I would suggest that we allow for any combination of
- some procedures for which only call+args and return value are
traced
- some procedures for which all applications within the dynamic context
of those procedures are traced
- some top-level computation (presumably including some of the previous
procedures), for which the tracing function is enabled.
>> - Alternatively, if ,tr is supposed to mean "trace everything within
>> this computation", why aren't there trace lines for the applications
>> of `or', `=' and so on?
>
> Perhaps it would help to disassemble fibo itself:
[...]
> The VM can fire hooks at function application, leaving a function,
> entering a new frame, leaving a frame, on "next instruction", and a
> couple other places. The tracer hooks into the first four. As you can
> see from the disassembly, there is no "or" -- it gets expanded away by
> psyntax anyway. = is on lines 9 and 28, inlined into the body. So you
> see that the only function that is called is... fibo itself. That's why
> it's the only one in the trace.
Thanks for the detailed explanation. I understand, but I wonder (as
also above) if this will be useful enough in practice.
> You can't get any more information on the function call level in this
> example, because it's not complicated enough. But try it on anything
> else -- (resolve-module '(gnome gtk)) for example:
>
> % cumulative self
> time seconds seconds name
> 14.18 0.51 0.26 write-bytecode
> 9.22 0.17 0.17 #<program 9f1c1e0 at language/scheme/spec.scm:40:16 (port env)>
> 7.09 0.13 0.13 byte-length
> 4.26 0.31 0.08 glil->assembly
> 2.13 0.17 0.04 dump-object
> 2.13 0.15 0.04 #{\get-global-definition-hook\\\ 90}#
> 2.13 0.04 0.04 write-byte
> 2.13 0.04 0.04 module-ref
> 1.42 1.80 0.03 #<program 9e4ed20 at ice-9/boot-9.scm:2039:4 (name . args)>
> 1.42 0.31 0.03 #{\syntax-type\\\ 162}#
> 1.42 0.15 0.03 #<program a10ef50 at language/assembly.scm:63:8 (x len)>
> 1.42 0.06 0.03 #{\id-var-name\\\ 150}#
> 1.42 0.03 0.03 #<program a00a870 at language/tree-il/fix-letrec.scm:57:18 (x unref ref set simple lambda* complex)>
> 1.42 0.03 0.03 object->assembly
> 0.71 1.33 0.01 eval
>
> This is when it had to compile a bunch of things, so really it's a test
> of the compiler. Totally useful information though -- we really need to
> speed up the bytecode writer!
Yes, that's pretty nice.
I'm just not sure how things work as the generated code becomes more
optimised. For example, when a `map' procedure is inlined. But maybe
that's a fundamental problem, and it's well known (except to me!) that
you can't unambiguously trace or profile optimised code.
> Regards,
>
> Andy
Regards,
Neil
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: tracing and profiling at the repl
2009-12-23 16:03 ` Neil Jerram
@ 2009-12-24 16:30 ` Andy Wingo
0 siblings, 0 replies; 5+ messages in thread
From: Andy Wingo @ 2009-12-24 16:30 UTC (permalink / raw)
To: Neil Jerram; +Cc: guile-devel
Hi,
On Wed 23 Dec 2009 17:03, Neil Jerram <neil@ossau.uklinux.net> writes:
> I think you're assuming that there is no requirement (or usefulness)
> for tracing or profiling below the program level - i.e. within a
> sequence of VM operations.
>
> Is that assumption correct? Doesn't it mean that we can profile/trace
> less as the compiler becomes cleverer?
Well, you're right that statprof only works on the function level, and
as the compiler gets better, the number of allocated closures goes down,
and so we see less detail. But function-level profiling is still a very
useful first step.
Profiling on the instruction level is possible, though. You could use
the next-instruction hook, but that might affect the results too much;
or you could handle interrupts if they are present in the NEXT handler.
I have the feeling that this would be too expensive to leave on
normally, but we could build another VM engine that has this enabled.
This would allow you to see in what instructions your hot function is
spending its time, and you can either use that to annotate a
disassembly, or map directly back to source information.
>>> - What if the function that you want to trace is not the one that you
>>> want to invoke at top level?
>>>
>>> - What if you want to trace several functions within a given
>>> computation?
>>
>> This works. You get a lot of output, as things are currently; but we can
>> restrict tracing to work only on certain procedures, etc.
>
> That sounds good. I would suggest that we allow for any combination of
>
> - some procedures for which only call+args and return value are
> traced
>
> - some procedures for which all applications within the dynamic context
> of those procedures are traced
>
> - some top-level computation (presumably including some of the previous
> procedures), for which the tracing function is enabled.
Sounds like good ideas. I do think that specifically tracing certain
functions or forms using a macro and set! is still useful. To me, the
VM-level tracing support is mostly for when you don't want to recompile
your functions, and just want to get a picture of what's happening.
See http://www.scheme.com/csug/debug.html#./debug:h0 for more takes on
this question.
> I'm just not sure how things work as the generated code becomes more
> optimised. For example, when a `map' procedure is inlined. But maybe
> that's a fundamental problem, and it's well known (except to me!) that
> you can't unambiguously trace or profile optimised code.
I think the thing is that `lambda' doesn't create a function -- it's
just an abstraction, that might instantiate itself in a number of ways.
A compiler has the time to analyze things in more detail, widening the
number of ways that `lambda' might instantiate itself.
My ideas were recently clarified by
http://www.brics.dk/~hosc/local/HOSC-11-4-pp399-404.pdf, page 4, last
paragraph.
Anyway the thing that changes in optimized code is that a lambda often
isn't a closure, expressions get jumbled, variables that exist in the
source might never be bound -- and this is all good! Code gets faster
this way, and the semantics stay the same. It does make optimizing a but
more confusing, though, and that is unfortunate, as you note.
Happy holidays :)
Andy
--
http://wingolog.org/
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2009-12-24 16:30 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-21 22:18 tracing and profiling at the repl Andy Wingo
2009-12-22 21:45 ` Neil Jerram
2009-12-23 8:26 ` Andy Wingo
2009-12-23 16:03 ` Neil Jerram
2009-12-24 16:30 ` Andy Wingo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).