unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Elisp native profiler
@ 2012-09-25  2:41 Stefan Monnier
  2012-09-25 11:40 ` Juanma Barranquero
  2012-09-25 13:18 ` Eli Zaretskii
  0 siblings, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-25  2:41 UTC (permalink / raw)
  To: emacs-devel

The native profiler is pretty much ready for inclusion in the trunk.
But I'm not sure how to update src/makefile-w32 so as to add the new
profiler.o file.
Could someone check out the `profiler' branch and fix it so it compiles
under w32 as well?


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25  2:41 Elisp native profiler Stefan Monnier
@ 2012-09-25 11:40 ` Juanma Barranquero
  2012-09-25 13:18 ` Eli Zaretskii
  1 sibling, 0 replies; 113+ messages in thread
From: Juanma Barranquero @ 2012-09-25 11:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Tue, Sep 25, 2012 at 4:41 AM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:

> Could someone check out the `profiler' branch and fix it so it compiles
> under w32 as well?

Checking...

    Juanma



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25  2:41 Elisp native profiler Stefan Monnier
  2012-09-25 11:40 ` Juanma Barranquero
@ 2012-09-25 13:18 ` Eli Zaretskii
  2012-09-25 13:26   ` Eli Zaretskii
  2012-09-25 13:40   ` Juanma Barranquero
  1 sibling, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-25 13:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 24 Sep 2012 22:41:37 -0400
> 
> The native profiler is pretty much ready for inclusion in the trunk.
> But I'm not sure how to update src/makefile-w32 so as to add the new
> profiler.o file.
> Could someone check out the `profiler' branch and fix it so it compiles
> under w32 as well?

How do you mean "compiles"?  If you want the profiler to actually work
in the w32 build, additional code needs to be written, to replace
setitimer and the SIGPROF signal, which aren't supported on Windows.

If you only mean to exclude the HAVE_SETITIMER parts, then that's a
simpler job, but then sampling of the call stack will not work.  If
you want this part of Emacs 24.3, it would make sense to work on
implementing the alternative to setitimer and SIGPROF now.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 13:18 ` Eli Zaretskii
@ 2012-09-25 13:26   ` Eli Zaretskii
  2012-09-25 17:28     ` Stefan Monnier
  2012-09-26  1:00     ` Stefan Monnier
  2012-09-25 13:40   ` Juanma Barranquero
  1 sibling, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-25 13:26 UTC (permalink / raw)
  To: monnier; +Cc: emacs-devel

Btw, doesn't sigprof_handler in profile.c violate the SYNC_INPUT
paradigm, by running non-trivial code which can allocate memory from a
signal handler?  Shouldn't profiling use the same framework as atimers
and the other async events?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 13:18 ` Eli Zaretskii
  2012-09-25 13:26   ` Eli Zaretskii
@ 2012-09-25 13:40   ` Juanma Barranquero
  1 sibling, 0 replies; 113+ messages in thread
From: Juanma Barranquero @ 2012-09-25 13:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

On Tue, Sep 25, 2012 at 3:18 PM, Eli Zaretskii <eliz@gnu.org> wrote:

> How do you mean "compiles"?

I think Stefan meant just that, compiles, i.e., that merging the
branch does not break Windows. Implementing the profiling is another
thing altogether.

    Juanma



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 13:26   ` Eli Zaretskii
@ 2012-09-25 17:28     ` Stefan Monnier
  2012-09-25 19:21       ` Paul Eggert
                         ` (2 more replies)
  2012-09-26  1:00     ` Stefan Monnier
  1 sibling, 3 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-25 17:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[ Sorry, I hit C-c C-c by mistake in my previous email. ]

> Btw, doesn't sigprof_handler in profile.c violate the SYNC_INPUT
> paradigm, by running non-trivial code which can allocate memory from a
> signal handler?

sigprof_handler is written in such a way that it doesn't allocate any
memory, so while it runs non-trivial code, it doesn't make any
system-call or anything like that, it doesn't modify any data reachable
from the rest of Emacs code, and neither does the rest of Emacs code
modify any of the data it uses (except for a few global vars like
cpu_log, and I believe these don't suffer from race-conditions, but I'm
not very good at detecting those problems).

> Shouldn't profiling use the same framework as atimers
> and the other async events?

We could do that, but to the extent that it is possible to do all the
work in the signal_handler while still being safe, I think it's
preferable.  Most of the work done in the other signal handlers are
inherently "unsafe in the handler" because they really need to
make syscalls.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 17:28     ` Stefan Monnier
@ 2012-09-25 19:21       ` Paul Eggert
  2012-09-25 20:12         ` Eli Zaretskii
                           ` (2 more replies)
  2012-09-25 20:10       ` Eli Zaretskii
  2012-09-26  0:30       ` Stephen J. Turnbull
  2 siblings, 3 replies; 113+ messages in thread
From: Paul Eggert @ 2012-09-25 19:21 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 09/25/2012 10:28 AM, Stefan Monnier wrote:
> while it runs non-trivial code, it doesn't make any
> system-call or anything like that

But there are races if the SIGPROF handler runs in
some other thread, no?  That bug could be worked around
by using thread-specific timers (which are available
in recent GNU/Linux but not older versions or in other
POSIXish platforms).  More portably, albeit more slowly,
the SIGPROF handler could use deliver_process_signal,
like the other handlers of process signals.

A lesser question: why does the profiler use SA_SIGINFO?
It doesn't seem to use the extra info that SA_SIGINFO delivers.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 17:28     ` Stefan Monnier
  2012-09-25 19:21       ` Paul Eggert
@ 2012-09-25 20:10       ` Eli Zaretskii
  2012-09-26  0:53         ` Stefan Monnier
  2012-09-26  0:30       ` Stephen J. Turnbull
  2 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-25 20:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Tue, 25 Sep 2012 13:28:37 -0400
> 
> > Shouldn't profiling use the same framework as atimers
> > and the other async events?
> 
> We could do that, but to the extent that it is possible to do all the
> work in the signal_handler while still being safe, I think it's
> preferable.

I bet the "read input in a signal handler" started that way, too.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 19:21       ` Paul Eggert
@ 2012-09-25 20:12         ` Eli Zaretskii
  2012-09-26  0:55         ` Stefan Monnier
  2012-09-26  7:39         ` Tomohiro Matsuyama
  2 siblings, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-25 20:12 UTC (permalink / raw)
  To: Paul Eggert; +Cc: monnier, emacs-devel

> Date: Tue, 25 Sep 2012 12:21:39 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: emacs-devel@gnu.org
> 
> On 09/25/2012 10:28 AM, Stefan Monnier wrote:
> > while it runs non-trivial code, it doesn't make any
> > system-call or anything like that
> 
> But there are races if the SIGPROF handler runs in
> some other thread, no?

E.g., if Emacs was configure with --enable-profiling.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 17:28     ` Stefan Monnier
  2012-09-25 19:21       ` Paul Eggert
  2012-09-25 20:10       ` Eli Zaretskii
@ 2012-09-26  0:30       ` Stephen J. Turnbull
  2 siblings, 0 replies; 113+ messages in thread
From: Stephen J. Turnbull @ 2012-09-26  0:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Stefan Monnier writes:

 > I believe these don't suffer from race-conditions, but I'm not very
 > good at detecting those problems.
[...]
 > We could do that, but to the extent that it is possible to do all the
 > work in the signal_handler while still being safe, I think it's
 > preferable.

You're obviously in an adventurous mood today, diving in where even
Stefan Monnier fears to tread. ;-)




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 20:10       ` Eli Zaretskii
@ 2012-09-26  0:53         ` Stefan Monnier
  0 siblings, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26  0:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> > Shouldn't profiling use the same framework as atimers
>> > and the other async events?
>> We could do that, but to the extent that it is possible to do all the
>> work in the signal_handler while still being safe, I think it's
>> preferable.
> I bet the "read input in a signal handler" started that way, too.

AFAIK, the "read input in a signal handler" has been wrong/racy from
the beginning.  Tho, I guess at the beginning either the author didn't
know it was wrong, or he figured he could avoid the known problems with
it by playing with the malloc hooks.

But, yes, maybe at some we'll have to do it differently.  But doing it
differently means:
- doing potentially more work on QUIT (thus slowing down Emacs even
  when the profiler is not in use).
- suffer from a sampling bias (only the stack traces that lead to
  a QUIT can be sampled).
- miss some samples (because no QUIT occurred between the two signals).
- running QUIT more often (to reduce the bias and try and avoid missing
  some samples), again slowing down Emacs even when the profiler is not
  in use.
So I think for now I'd rather try and stick to "do it in the handler".

  
        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 19:21       ` Paul Eggert
  2012-09-25 20:12         ` Eli Zaretskii
@ 2012-09-26  0:55         ` Stefan Monnier
  2012-09-26  6:16           ` Tomohiro Matsuyama
  2012-09-26  7:39         ` Tomohiro Matsuyama
  2 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26  0:55 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> But there are races if the SIGPROF handler runs in
> some other thread, no?

I guess there are races if the main thread is not suspended while the
handler runs, indeed (mostly because we try and read backtrace_list from
the handler which is allocated on the main thread's stack).

> That bug could be worked around by using thread-specific timers (which
> are available in recent GNU/Linux but not older versions or in other
> POSIXish platforms).

That sounds good, yes.  I don't think it's terribly important for this
profiler to work in "unusual" systems.

> A lesser question: why does the profiler use SA_SIGINFO?

No idea.  Better ask the author (Tomohiro Matsuyama <tomo@cx4a.org>).


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 13:26   ` Eli Zaretskii
  2012-09-25 17:28     ` Stefan Monnier
@ 2012-09-26  1:00     ` Stefan Monnier
  2012-09-26  7:41       ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26  1:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:
> If you want the profiler to actually work in the w32 build, additional
> code needs to be written, to replace setitimer and the SIGPROF signal,
> which aren't supported on Windows.

Actually, even without the timer you can still use the memory profiler
where the samples are taken when malloc is called (more or less).
It would be nice to add support for the cpu-profiler, but for now I'm
focusing on getting the branch in a good enough state to be merged into
trunk, so "fully working on Windows" is not nearly as important as
"doesn't break the Window build".


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  0:55         ` Stefan Monnier
@ 2012-09-26  6:16           ` Tomohiro Matsuyama
  2012-09-26  7:35             ` Eli Zaretskii
  2012-09-26 17:42             ` Stefan Monnier
  0 siblings, 2 replies; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-26  6:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Paul Eggert, emacs-devel

> > A lesser question: why does the profiler use SA_SIGINFO?
> 
> No idea.  Better ask the author (Tomohiro Matsuyama <tomo@cx4a.org>).
The information given to the signal handler is not used for now, so I will get rid of SA_SIGINFO from the flags.  In the future, however, we will obtain C-level backtrace from the information so that we can profile C-level function calls as well.  It might help us to profile byte-compiled code interpretation too, but I'm not sure.

Let me explain here three methods for cpu profiling and their pros and cons.

1. setitimer
Pros: Easy to implement, Accuracy, C-level backtrace
Cons: Non-portable

2. threads
Pros: Almost portable
Cons: Multithread programming, Lack of C-level backtrace

3. atimers
Pros: Fully portable, Easy to implement
Cons: Stop the world, Lack of C-level backtrace, Inaccuracy

A reason why I've chose to use setitimer is just I have no ideas except setitimer at that time.

Ohter ideas are welcome.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  6:16           ` Tomohiro Matsuyama
@ 2012-09-26  7:35             ` Eli Zaretskii
  2012-09-26  7:59               ` Tomohiro Matsuyama
  2012-09-26 17:42             ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-26  7:35 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: eggert, monnier, emacs-devel

> Date: Wed, 26 Sep 2012 15:16:25 +0900
> From: Tomohiro Matsuyama <tomo@cx4a.org>
> Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
> 
> Let me explain here three methods for cpu profiling and their pros and cons.
> 
> 1. setitimer
> Pros: Easy to implement, Accuracy, C-level backtrace
> Cons: Non-portable
> 
> 2. threads
> Pros: Almost portable
> Cons: Multithread programming, Lack of C-level backtrace
> 
> 3. atimers
> Pros: Fully portable, Easy to implement
> Cons: Stop the world, Lack of C-level backtrace, Inaccuracy

I don't understand why do you say that threads mean no C-level
backtrace, please explain.  Also, atimers are not fully portable
because they are based on SIGALRM, which is not universally
available.  Perhaps you thought only about a subset of the systems
supported by Emacs.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-25 19:21       ` Paul Eggert
  2012-09-25 20:12         ` Eli Zaretskii
  2012-09-26  0:55         ` Stefan Monnier
@ 2012-09-26  7:39         ` Tomohiro Matsuyama
  2012-09-26 14:33           ` Paul Eggert
  2 siblings, 1 reply; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-26  7:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Stefan Monnier, emacs-devel

> On 09/25/2012 10:28 AM, Stefan Monnier wrote:
> > while it runs non-trivial code, it doesn't make any
> > system-call or anything like that
> 
> But there are races if the SIGPROF handler runs in
> some other thread, no?  That bug could be worked around
> by using thread-specific timers (which are available
> in recent GNU/Linux but not older versions or in other
> POSIXish platforms). More portably, albeit more slowly,
> the SIGPROF handler could use deliver_process_signal,
> like the other handlers of process signals.
OK, I'll use thread-specific timers in recent GNU/Linux systems and use deliver_process_signal for older systems.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  1:00     ` Stefan Monnier
@ 2012-09-26  7:41       ` Eli Zaretskii
  2012-09-26 12:55         ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-26  7:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Tue, 25 Sep 2012 21:00:59 -0400
> 
> >>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:
> > If you want the profiler to actually work in the w32 build, additional
> > code needs to be written, to replace setitimer and the SIGPROF signal,
> > which aren't supported on Windows.
> 
> Actually, even without the timer you can still use the memory profiler
> where the samples are taken when malloc is called (more or less).

Yes, I was talking about the "cpu" profiler.

> It would be nice to add support for the cpu-profiler, but for now I'm
> focusing on getting the branch in a good enough state to be merged into
> trunk, so "fully working on Windows" is not nearly as important as
> "doesn't break the Window build".

Will the merge happen before the feature freeze?  If not, the
adaptation of itimers to MS-Windows can wait.  Otherwise, it doesn't
make sense to me to have an interim release with an important new
feature that is not supported on MS-Windows.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  7:35             ` Eli Zaretskii
@ 2012-09-26  7:59               ` Tomohiro Matsuyama
  2012-09-26  9:32                 ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-26  7:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

> > Date: Wed, 26 Sep 2012 15:16:25 +0900
> > From: Tomohiro Matsuyama <tomo@cx4a.org>
> > Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
> > 
> > Let me explain here three methods for cpu profiling and their pros and cons.
> > 
> > 1. setitimer
> > Pros: Easy to implement, Accuracy, C-level backtrace
> > Cons: Non-portable
> > 
> > 2. threads
> > Pros: Almost portable
> > Cons: Multithread programming, Lack of C-level backtrace
> > 
> > 3. atimers
> > Pros: Fully portable, Easy to implement
> > Cons: Stop the world, Lack of C-level backtrace, Inaccuracy
> 
> I don't understand why do you say that threads mean no C-level
> backtrace, please explain.
I've explored and finally didn't find out how to obtain the information like siginfo_t and ucontext_t of running main thread from profiler thread.  I want to know if possible.

> Also, atimers are not fully portable
> because they are based on SIGALRM, which is not universally
> available.  Perhaps you thought only about a subset of the systems
> supported by Emacs.
I'm sorry, I was confused with normal timers.  SIGALRM is fired in real time interval, so we can't say that elisp really consumed CPU and system time.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  7:59               ` Tomohiro Matsuyama
@ 2012-09-26  9:32                 ` Eli Zaretskii
  2012-09-26 22:46                   ` Tomohiro Matsuyama
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-26  9:32 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: eggert, monnier, emacs-devel

> Date: Wed, 26 Sep 2012 16:59:46 +0900
> From: Tomohiro Matsuyama <tomo@cx4a.org>
> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> > I don't understand why do you say that threads mean no C-level
> > backtrace, please explain.
> I've explored and finally didn't find out how to obtain the information like siginfo_t and ucontext_t of running main thread from profiler thread.  I want to know if possible.

What kind of information do you need about the C-level backtrace?  We
are talking about profiling the primitives written in C, right?  So
what do you need, in addition to the name of the primitive, that is in
siginfo_t and ucontext_t?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  7:41       ` Eli Zaretskii
@ 2012-09-26 12:55         ` Stefan Monnier
  2012-09-26 13:19           ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26 12:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> Will the merge happen before the feature freeze?

Yes (probably later today).

> If not, the adaptation of itimers to MS-Windows can wait.  Otherwise,
> it doesn't make sense to me to have an interim release with an
> important new feature that is not supported on MS-Windows.

It's mostly useful for Elisp package authors.  Most of those packages
are platform agnostic, so it's not that serious if you need to find
a GNU/Linux system in order to do the profiling.
You can even ask some other GNU/Linux user to do the profiling for you
and send you the result.
So, I don't see any reason why we should prevent GNU/Linux users from
enjoying this feature in 24.3.  If someone an implement something
equivalent for Windows quickly enough, that's perfectly fine for me, of
course, but to allow the merge I allow needed to unbreak the
Windows build.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 12:55         ` Stefan Monnier
@ 2012-09-26 13:19           ` Eli Zaretskii
  2012-09-26 13:26             ` Eli Zaretskii
  2012-09-30 15:54             ` Eli Zaretskii
  0 siblings, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-26 13:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Wed, 26 Sep 2012 08:55:48 -0400
> 
> > If not, the adaptation of itimers to MS-Windows can wait.  Otherwise,
> > it doesn't make sense to me to have an interim release with an
> > important new feature that is not supported on MS-Windows.
> 
> It's mostly useful for Elisp package authors.  Most of those packages
> are platform agnostic, so it's not that serious if you need to find
> a GNU/Linux system in order to do the profiling.
> You can even ask some other GNU/Linux user to do the profiling for you
> and send you the result.
> So, I don't see any reason why we should prevent GNU/Linux users from
> enjoying this feature in 24.3.

I didn't say we should prevent users of other platforms from enjoying
this feature.  But if this will be in 24.3, someone, probably myself,
will need to add the missing pieces to get it working on Windows.

IOW, I was asking for information necessary to plan my time.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 13:19           ` Eli Zaretskii
@ 2012-09-26 13:26             ` Eli Zaretskii
  2012-09-26 17:49               ` Stefan Monnier
  2012-09-30 15:54             ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-26 13:26 UTC (permalink / raw)
  To: monnier; +Cc: emacs-devel

> Date: Wed, 26 Sep 2012 15:19:49 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> I didn't say we should prevent users of other platforms from enjoying
> this feature.  But if this will be in 24.3, someone, probably myself,
> will need to add the missing pieces to get it working on Windows.

Btw, is it possible to have a simple test case, starting from "emacs -Q",
to check that the profiling (both cpu and memory) works?  That would
simplify testing of this on MS-Windows.

TIA



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  7:39         ` Tomohiro Matsuyama
@ 2012-09-26 14:33           ` Paul Eggert
  2012-09-26 18:58             ` Ivan Andrus
  0 siblings, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-09-26 14:33 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: emacs-devel

On 09/26/2012 12:39 AM, Tomohiro Matsuyama wrote:
> OK, I'll use thread-specific timers in recent GNU/Linux systems
> and use deliver_process_signal for older systems.

While you're working on that, could you please have the code
prefer timer_settime to setitimer if SIGEV_SIGNAL is
defined?  That's how it's currently done in the Emacs trunk
starting with bzr 110197.  We need to use timer_settime for
thread-specific timers anyway, and it's better to use
timer_settime for process timers too, on platforms where
it's available.




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  6:16           ` Tomohiro Matsuyama
  2012-09-26  7:35             ` Eli Zaretskii
@ 2012-09-26 17:42             ` Stefan Monnier
  2012-09-26 22:52               ` Tomohiro Matsuyama
                                 ` (2 more replies)
  1 sibling, 3 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26 17:42 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: emacs-devel

I've just installed the profiler branch into `trunk' where we can do the
rest of the development (such as fix the race-condition in the presence
of multiple threads).


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 13:26             ` Eli Zaretskii
@ 2012-09-26 17:49               ` Stefan Monnier
  2012-09-26 19:31                 ` Eli Zaretskii
  2012-09-29 15:24                 ` Eli Zaretskii
  0 siblings, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26 17:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> I didn't say we should prevent users of other platforms from enjoying
>> this feature.  But if this will be in 24.3, someone, probably myself,
>> will need to add the missing pieces to get it working on Windows.

> Btw, is it possible to have a simple test case, starting from "emacs -Q",
> to check that the profiling (both cpu and memory) works?  That would
> simplify testing of this on MS-Windows.

I typically do:

   emacs -Q
   M-x profiler-start RET
   C-x C-f src/regexp.c RET
   page-down until I get to the end of the buffer, then page-up to the beginning
   M-x profiler-report RET

Verifying that the resulting profile is accurate is rather difficult,
of course, so as long as it looks credible, I assume it's more or
less correct.  Note that the sampler truncates the backtraces it records
if they're too deep and profiler.el doesn't know yet how to make up for it
(it assumes the backtraces aren't truncated), so "deep" backtraces lead
to profiles where some internal functions appear to be called at
top-level.  It would be nice to fix this, but it's probably going to
take more time, so the workaround is to increase
profiler-max-stack-depth if it turns out to be a problem.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 14:33           ` Paul Eggert
@ 2012-09-26 18:58             ` Ivan Andrus
  2012-09-27  1:11               ` Paul Eggert
  0 siblings, 1 reply; 113+ messages in thread
From: Ivan Andrus @ 2012-09-26 18:58 UTC (permalink / raw)
  To: Paul Eggert, Emacs development discussions

On Sep 26, 2012, at 4:33 PM, Paul Eggert wrote:
> On 09/26/2012 12:39 AM, Tomohiro Matsuyama wrote:
>> OK, I'll use thread-specific timers in recent GNU/Linux systems
>> and use deliver_process_signal for older systems.
> 
> While you're working on that, could you please have the code
> prefer timer_settime to setitimer if SIGEV_SIGNAL is
> defined?  That's how it's currently done in the Emacs trunk
> starting with bzr 110197.  We need to use timer_settime for
> thread-specific timers anyway, and it's better to use
> timer_settime for process timers too, on platforms where
> it's available.

This actually caused the build on OS X to break.  As near as I can tell after searching online, OS X defines SIGEV_SIGNAL, but not timer_t.  I simply added an #undef SIGEV_SIGNAL on my machine and the build succeeded.  This is obviously not the correct solution, but I'm not sure what the correct solution is since I'm not familiar with timer_t and friends.

-Ivan




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 17:49               ` Stefan Monnier
@ 2012-09-26 19:31                 ` Eli Zaretskii
  2012-09-26 20:20                   ` Stefan Monnier
  2012-09-29 15:24                 ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-26 19:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org
> Date: Wed, 26 Sep 2012 13:49:57 -0400
> 
> >> I didn't say we should prevent users of other platforms from enjoying
> >> this feature.  But if this will be in 24.3, someone, probably myself,
> >> will need to add the missing pieces to get it working on Windows.
> 
> > Btw, is it possible to have a simple test case, starting from "emacs -Q",
> > to check that the profiling (both cpu and memory) works?  That would
> > simplify testing of this on MS-Windows.
> 
> I typically do:
> 
>    emacs -Q
>    M-x profiler-start RET
>    C-x C-f src/regexp.c RET
>    page-down until I get to the end of the buffer, then page-up to the beginning
>    M-x profiler-report RET

(I presume that was regex.c, not regexp.c.)

Thanks for the recipe.

> Verifying that the resulting profile is accurate is rather difficult,
> of course, so as long as it looks credible, I assume it's more or
> less correct.

Can you show your results, for the reference?

The first lines in what I got (still only the memory profiler, the
itimer code is in the works and not ready yet) look suspicious to me
in the last column:

  + call-interactively                                       46,352,691   2%
  + Automatic Redisplay                                      30,389,586  -1%
  + scroll-up-command                                        24,834,342   2%
  + jit-lock-function                                        20,082,147   0%
  + scroll-up                                                15,687,351   3%
  + jit-lock-fontify-now                                      2,196,032   1%
  + byte-code                                                 1,426,989   1%
  + read-from-minibuffer                                        312,804   0%
  + run-hook-with-args                                          237,005   0%

(Overflow?)

Here's the same with the + signs expanded:

  - call-interactively                                       46,352,691   2%
    - scroll-up-command                                      43,373,729   0%
      - scroll-up                                            43,373,729   0%
	- jit-lock-function                                  41,648,995   0%
	  - jit-lock-fontify-now                             41,648,995   0%
	    - byte-code                                      41,619,059   0%
	      - run-hook-with-args                           41,619,059   0%
		- font-lock-fontify-region                   41,619,059   0%
		  - c-font-lock-fontify-region               41,619,059   0%
		    - font-lock-default-fontify-region       40,842,125  -1%
		      - font-lock-fontify-keywords-region    39,963,822  -2%
			- c-font-lock-declarations           34,602,780   1%
			  - c-find-decl-spots                34,602,780   1%
			    + #<compiled 0x40567bf>          28,867,024  -2%
			      looking-at                      2,817,588   1%
			      re-search-forward               1,791,738   1%
			    - c-forward-sws                     413,348   0%
				looking-at                      355,304   0%
				put-text-property                26,844   0%
			    - c-beginning-of-current-toke       150,676   0%
				looking-at                      150,676   0%
			    - c-beginning-of-macro               66,408   0%
				looking-at                       66,408   0%
			    - c-backward-sws                     39,468   0%
				c-beginning-of-macro             12,480   0%




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 19:31                 ` Eli Zaretskii
@ 2012-09-26 20:20                   ` Stefan Monnier
  2012-09-26 22:50                     ` Tomohiro Matsuyama
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-26 20:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> The first lines in what I got (still only the memory profiler, the
> itimer code is in the works and not ready yet) look suspicious to me
> in the last column:

Yes, the profiler.el code that handles the percents is victim of
overflows, especially with the memory profiler where the numbers are
typically much higher.  There's a FIXME for it in profiler.el.

> Here's the same with the + signs expanded:

>   - call-interactively                                       46,352,691   2%
>     - scroll-up-command                                      43,373,729   0%
>       - scroll-up                                            43,373,729   0%
> 	- jit-lock-function                                  41,648,995   0%
> 	  - jit-lock-fontify-now                             41,648,995   0%
> 	    - byte-code                                      41,619,059   0%
> 	      - run-hook-with-args                           41,619,059   0%
> 		- font-lock-fontify-region                   41,619,059   0%
> 		  - c-font-lock-fontify-region               41,619,059   0%
> 		    - font-lock-default-fontify-region       40,842,125  -1%
> 		      - font-lock-fontify-keywords-region    39,963,822  -2%
> 			- c-font-lock-declarations           34,602,780   1%
> 			  - c-find-decl-spots                34,602,780   1%
> 			    + #<compiled 0x40567bf>          28,867,024  -2%
> 			      looking-at                      2,817,588   1%
> 			      re-search-forward               1,791,738   1%
> 			    - c-forward-sws                     413,348   0%
> 				looking-at                      355,304   0%
> 				put-text-property                26,844   0%
> 			    - c-beginning-of-current-toke       150,676   0%
> 				looking-at                      150,676   0%
> 			    - c-beginning-of-macro               66,408   0%
> 				looking-at                       66,408   0%
> 			    - c-backward-sws                     39,468   0%
> 				c-beginning-of-macro             12,480   0%

That looks fine (modulo the known bug).


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26  9:32                 ` Eli Zaretskii
@ 2012-09-26 22:46                   ` Tomohiro Matsuyama
  2012-09-27  5:55                     ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-26 22:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

> > Date: Wed, 26 Sep 2012 16:59:46 +0900
> > From: Tomohiro Matsuyama <tomo@cx4a.org>
> > Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
> > 
> > > I don't understand why do you say that threads mean no C-level
> > > backtrace, please explain.
> > I've explored and finally didn't find out how to obtain the information like siginfo_t and ucontext_t of running main thread from profiler thread.  I want to know if possible.
> 
> What kind of information do you need about the C-level backtrace?  We
> are talking about profiling the primitives written in C, right?
Yes.

> So what do you need, in addition to the name of the primitive, that is in
> siginfo_t and ucontext_t?
What I wanted to say here is that we need to get a stack pointer of the 
running main thread to obtain C-level backtrace.  In my understand, the
stack pointer is not available to other threads without firing signals.
Of course, the profile thread can fire a signal (e.g. SIGALRM), but it
also makes the profiler non-portable and complicated.  The profiler using
threads looks like this:

    void profiler_thread() {
      while (1) {
        /* Record the elisp backtrace (backtrace_list) here.  */
        record_backtrace(backtrace_list);

        /* Obtain C-level backtrace here if possible.  */
        record_c_backtrace();

        /* Sleep for sampling interval.  */
        profiler_sleep(sampling_interval);
      }
    }

By the way, is it easy to obtain C-level backtrace?  gdb and addr2line might
help me, but I want a small and portable library for that.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 20:20                   ` Stefan Monnier
@ 2012-09-26 22:50                     ` Tomohiro Matsuyama
  2012-09-27  4:02                       ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-26 22:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

> > The first lines in what I got (still only the memory profiler, the
> > itimer code is in the works and not ready yet) look suspicious to me
> > in the last column:
> 
> Yes, the profiler.el code that handles the percents is victim of
> overflows, especially with the memory profiler where the numbers are
> typically much higher.  There's a FIXME for it in profiler.el.
That's could be fixed by using floats.  I'll do it.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 17:42             ` Stefan Monnier
@ 2012-09-26 22:52               ` Tomohiro Matsuyama
  2012-09-28  9:14               ` Tomohiro Matsuyama
  2012-09-30 16:42               ` Tomohiro Matsuyama
  2 siblings, 0 replies; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-26 22:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> I've just installed the profiler branch into `trunk' where we can do the
> rest of the development (such as fix the race-condition in the presence
> of multiple threads).
Thanks!

I found out profiler-report-write-log is broken.  I'll fix it.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 18:58             ` Ivan Andrus
@ 2012-09-27  1:11               ` Paul Eggert
  0 siblings, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-09-27  1:11 UTC (permalink / raw)
  To: Ivan Andrus; +Cc: Emacs development discussions

On 09/26/2012 11:58 AM, Ivan Andrus wrote:
> As near as I can tell after searching online, OS X defines SIGEV_SIGNAL, but not timer_t.

Thanks for reporting that.  I checked in what I hope fixes that,
by testing directly for timer_settime rather than guessing
indirectly by looking at SIGEV_SIGNAL.  Please try trunk bzr 110212
or later.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 22:50                     ` Tomohiro Matsuyama
@ 2012-09-27  4:02                       ` Stefan Monnier
  2012-09-30  7:32                         ` Paul Eggert
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-27  4:02 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: Eli Zaretskii, emacs-devel

> That's could be fixed by using floats.  I'll do it.

That might work for the percent calculation, but we can also have
wrap-around for the actual counters.  I'm thinking that we might want to
scale them a little (e.g. instead of measuring bytes, we could measure
bytes/N where N would be at least 16, and more if possible) to at least
push the limit by a factor of N.
I guess we could also use floats for them, but that would take a bit
more work and hack.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 22:46                   ` Tomohiro Matsuyama
@ 2012-09-27  5:55                     ` Eli Zaretskii
  2012-09-27  7:01                       ` Eli Zaretskii
  2012-09-28 10:09                       ` Tomohiro Matsuyama
  0 siblings, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-27  5:55 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: eggert, monnier, emacs-devel

> Date: Thu, 27 Sep 2012 07:46:55 +0900
> From: Tomohiro Matsuyama <tomo@cx4a.org>
> Cc: eggert@cs.ucla.edu, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> > So what do you need, in addition to the name of the primitive, that is in
> > siginfo_t and ucontext_t?
> What I wanted to say here is that we need to get a stack pointer of the 
> running main thread to obtain C-level backtrace.

Sorry, I don't understand: if all you need is to profile primitives,
why do you need a full C-level backtrace?  The primitives called are
already available in the backtrace_list structure, aren't they?

> In my understand, the
> stack pointer is not available to other threads without firing signals.

It's possible on Windows.  If it isn't possible with pthreads, that's
just too bad.  Anyway, getting at the stack pointer does not yet give
you the backtrace.

> By the way, is it easy to obtain C-level backtrace?  gdb and addr2line might
> help me, but I want a small and portable library for that.

See emacs_backtrace on sysdep.c.  Not sure about the portability of
the underlying APIs.

AFAIK, any backtrace facility is specific to the object file format
and the debug info used by the platform.  Compiler optimizations and
various non-local jumps complicate things a lot.  My suggestion would
be not to go there, unless absolutely needed and desirable, because it
most probably be in a constant need of maintenance to keep up with
development of the underlying platform, compilers, etc.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27  5:55                     ` Eli Zaretskii
@ 2012-09-27  7:01                       ` Eli Zaretskii
  2012-09-27 12:37                         ` Stefan Monnier
  2012-09-30  7:39                         ` Paul Eggert
  2012-09-28 10:09                       ` Tomohiro Matsuyama
  1 sibling, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-27  7:01 UTC (permalink / raw)
  To: tomo; +Cc: eggert, monnier, emacs-devel

I think profiler-cpu-stop should revert the SIGPROF handler to its
original value.  Currently, it only disables the interval timer, but
leaves the SIGPROF handler pointed at its handler in profiler.c, which
ain't right AFAICS.

Btw, the names profiler-cpu-* are not good names, IMO: the "cpu" part
is not correct.  How about profiler-funcall-* or profiler-execution-*
instead?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27  7:01                       ` Eli Zaretskii
@ 2012-09-27 12:37                         ` Stefan Monnier
  2012-09-27 17:49                           ` Eli Zaretskii
  2012-09-27 17:52                           ` Eli Zaretskii
  2012-09-30  7:39                         ` Paul Eggert
  1 sibling, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-27 12:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, eggert, emacs-devel

> AFAIK, any backtrace facility is specific to the object file format
> and the debug info used by the platform.  Compiler optimizations and
> various non-local jumps complicate things a lot.  My suggestion would
> be not to go there, unless absolutely needed and desirable, because it
> most probably be in a constant need of maintenance to keep up with
> development of the underlying platform, compilers, etc.

Agreed.  Instead, I think we should just add some backtrace_list
registration to a few more functions (I've already added
Fgarbage_collect and redisplay_internal which covers the main points) if
we want more precision.  But since those don't come for free, we can
only do it for functions that take long enough that the impact can
be ignored.

> I think profiler-cpu-stop should revert the SIGPROF handler to its
> original value.  Currently, it only disables the interval timer, but
> leaves the SIGPROF handler pointed at its handler in profiler.c, which
> ain't right AFAICS.

Yes, that would be cleaner.

> Btw, the names profiler-cpu-* are not good names,

I'm not wedded to them, so if you have good alternatives, we can
change them.

> IMO: the "cpu" part is not correct.  How about profiler-funcall-* or
> profiler-execution-* instead?

Why?  The distinction between profiler-cpu-* and profiler-memory-* is
the "time basis", where one uses "cpu time" and the other "memory
allocation", hence the names.  "funcall" and "execution" both seem
completely unrelated.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 12:37                         ` Stefan Monnier
@ 2012-09-27 17:49                           ` Eli Zaretskii
  2012-09-27 17:52                           ` Eli Zaretskii
  1 sibling, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-27 17:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: tomo@cx4a.org,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
> Date: Thu, 27 Sep 2012 08:37:48 -0400
> 
> > IMO: the "cpu" part is not correct.  How about profiler-funcall-* or
> > profiler-execution-* instead?
> 
> Why?  The distinction between profiler-cpu-* and profiler-memory-* is
> the "time basis", where one uses "cpu time" and the other "memory
> allocation", hence the names.  "funcall" and "execution" both seem
> completely unrelated.

Well, I'm at a disadvantage here, since I didn't yet see the result of
this.  I assumed that it is similar to what gprof produces, whose
product is known as "execution profile" or "call graph profile".
That's why I suggested "execution" or "funcall".



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 12:37                         ` Stefan Monnier
  2012-09-27 17:49                           ` Eli Zaretskii
@ 2012-09-27 17:52                           ` Eli Zaretskii
  2012-09-27 20:35                             ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-27 17:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, emacs-devel

Since the merge of the profiler branch, I see this at the end of the
output of the GDB command "bt", when there normally would be no Lisp
backtrace at all:

  Lisp Backtrace:
  "Automatic Redisplay" (0x1649638)

What is this "Automatic Redisplay" about?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 17:52                           ` Eli Zaretskii
@ 2012-09-27 20:35                             ` Stefan Monnier
  2012-09-27 20:46                               ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-27 20:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

> Since the merge of the profiler branch, I see this at the end of the
> output of the GDB command "bt", when there normally would be no Lisp
> backtrace at all:

>   Lisp Backtrace:
>   "Automatic Redisplay" (0x1649638)

> What is this "Automatic Redisplay" about?

It's an entry for the function redisplay_internal.
This way, the time spent in redisplay is labelled properly in
the profile.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 20:35                             ` Stefan Monnier
@ 2012-09-27 20:46                               ` Eli Zaretskii
  2012-09-27 22:12                                 ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-27 20:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: tomo@cx4a.org, emacs-devel@gnu.org
> Date: Thu, 27 Sep 2012 16:35:16 -0400
> 
> >   Lisp Backtrace:
> >   "Automatic Redisplay" (0x1649638)
> 
> > What is this "Automatic Redisplay" about?
> 
> It's an entry for the function redisplay_internal.

Why not write redisplay_internal, then?

And, btw, redisplay_internal is not always called "automatically", it
is sometimes triggered by very specific events.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 20:46                               ` Eli Zaretskii
@ 2012-09-27 22:12                                 ` Stefan Monnier
  2012-09-28  7:09                                   ` Eli Zaretskii
  2012-09-28  8:03                                   ` Stephen Berman
  0 siblings, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-27 22:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

>> >   Lisp Backtrace:
>> >   "Automatic Redisplay" (0x1649638)
>> > What is this "Automatic Redisplay" about?
>> It's an entry for the function redisplay_internal.
> Why not write redisplay_internal, then?

Because the names that appear there are supposed to be Elisp names, not
C names.  Maybe we could use a convention like "C redisplay_internal"
or "redisplay_internal (C)"?

> And, btw, redisplay_internal is not always called "automatically", it
> is sometimes triggered by very specific events.

To the Elisp programmer, it's largely automatic in general.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 22:12                                 ` Stefan Monnier
@ 2012-09-28  7:09                                   ` Eli Zaretskii
  2012-09-28  8:24                                     ` Stefan Monnier
  2012-09-28  8:03                                   ` Stephen Berman
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-28  7:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: tomo@cx4a.org,  emacs-devel@gnu.org
> Date: Thu, 27 Sep 2012 18:12:50 -0400
> 
> > Why not write redisplay_internal, then?
> 
> Because the names that appear there are supposed to be Elisp names, not
> C names.  Maybe we could use a convention like "C redisplay_internal"
> or "redisplay_internal (C)"?

That would be good, yes.

> > And, btw, redisplay_internal is not always called "automatically", it
> > is sometimes triggered by very specific events.
> 
> To the Elisp programmer, it's largely automatic in general.

Well, as long as "automatic" doesn't appear in the backtrace, this is
a moot point.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27 22:12                                 ` Stefan Monnier
  2012-09-28  7:09                                   ` Eli Zaretskii
@ 2012-09-28  8:03                                   ` Stephen Berman
  2012-09-28  8:26                                     ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Stephen Berman @ 2012-09-28  8:03 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, Eli Zaretskii, emacs-devel

On Thu, 27 Sep 2012 18:12:50 -0400 Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>>> >   Lisp Backtrace:
>>> >   "Automatic Redisplay" (0x1649638)
>>> > What is this "Automatic Redisplay" about?
>>> It's an entry for the function redisplay_internal.
>> Why not write redisplay_internal, then?
>
> Because the names that appear there are supposed to be Elisp names, not
> C names.  Maybe we could use a convention like "C redisplay_internal"
> or "redisplay_internal (C)"?

Doesn't the underscore already conventionally identify it as a C name?
It seems at least as odd to me to see "C redisplay_internal" or
"redisplay_internal (C)" as to see simply "redisplay_internal" in a list
identified as "Lisp Backtrace".

Steve Berman



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28  7:09                                   ` Eli Zaretskii
@ 2012-09-28  8:24                                     ` Stefan Monnier
  2012-09-28 14:11                                       ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-28  8:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

>> Because the names that appear there are supposed to be Elisp names, not
>> C names.  Maybe we could use a convention like "C redisplay_internal"
>> or "redisplay_internal (C)"?
> That would be good, yes.

I'll let you pick the one you prefer and make the change ;-)


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28  8:03                                   ` Stephen Berman
@ 2012-09-28  8:26                                     ` Stefan Monnier
  0 siblings, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-28  8:26 UTC (permalink / raw)
  To: Stephen Berman; +Cc: tomo, Eli Zaretskii, emacs-devel

> Doesn't the underscore already conventionally identify it as a C name?
> It seems at least as odd to me to see "C redisplay_internal" or
> "redisplay_internal (C)" as to see simply "redisplay_internal" in a list
> identified as "Lisp Backtrace".

I don't want the name to sound "odd name for an Elisp function".
I want it to sound "Oh, that must be some special thingy" or "Oh, that's an
internal C function".


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 17:42             ` Stefan Monnier
  2012-09-26 22:52               ` Tomohiro Matsuyama
@ 2012-09-28  9:14               ` Tomohiro Matsuyama
  2012-09-28 11:00                 ` Eli Zaretskii
  2012-09-28 12:37                 ` Stefan Monnier
  2012-09-30 16:42               ` Tomohiro Matsuyama
  2 siblings, 2 replies; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-28  9:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

The default value of profiler-sample-interval is now 1, but
I think it is too high frequent.  I've profiled the following
code, and found there is a huge mismatch between 1 and 10 of 
profiler-sample-interval value (1000ms vs 4000ms).

(defun tak (x y z)
  (if (<= x y)
      z
    (tak (tak (1- x) y z)
         (tak (1- y) z x)
         (tak (1- z) x y))))
(byte-compile 'tak)
(benchmark 1 '(tak 20 12 2))

How about setting the value to 9, for example?

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27  5:55                     ` Eli Zaretskii
  2012-09-27  7:01                       ` Eli Zaretskii
@ 2012-09-28 10:09                       ` Tomohiro Matsuyama
  2012-09-28 11:05                         ` Eli Zaretskii
  2012-09-28 12:40                         ` Stefan Monnier
  1 sibling, 2 replies; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-28 10:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On Thu, 27 Sep 2012 07:55:55 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Thu, 27 Sep 2012 07:46:55 +0900
> > From: Tomohiro Matsuyama <tomo@cx4a.org>
> > Cc: eggert@cs.ucla.edu, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > 
> > > So what do you need, in addition to the name of the primitive, that is in
> > > siginfo_t and ucontext_t?
> > What I wanted to say here is that we need to get a stack pointer of the 
> > running main thread to obtain C-level backtrace.
> 
> Sorry, I don't understand: if all you need is to profile primitives,
> why do you need a full C-level backtrace?  The primitives called are
> already available in the backtrace_list structure, aren't they?
No, because C-level function calls like Fvertical_motion could never been
listed in backtrace_list.  That's why byte-code often appears at top in a
profiling result.

> > In my understand, the
> > stack pointer is not available to other threads without firing signals.
> 
> It's possible on Windows.  If it isn't possible with pthreads, that's
> just too bad.  Anyway, getting at the stack pointer does not yet give
> you the backtrace.
I explored pthread manual, but I didn't finally find anything like that.

> > By the way, is it easy to obtain C-level backtrace?  gdb and addr2line might
> > help me, but I want a small and portable library for that.
> 
> See emacs_backtrace on sysdep.c.  Not sure about the portability of
> the underlying APIs.
Great!

> AFAIK, any backtrace facility is specific to the object file format
> and the debug info used by the platform.  Compiler optimizations and
> various non-local jumps complicate things a lot.  My suggestion would
> be not to go there, unless absolutely needed and desirable, because it
> most probably be in a constant need of maintenance to keep up with
> development of the underlying platform, compilers, etc.
How do you think about inserting a backtrace building code (like Ffuncall does)
into DEFUN macro expansion if specified so in configure script, for example?
This could be fully portable, but involves some overhead.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28  9:14               ` Tomohiro Matsuyama
@ 2012-09-28 11:00                 ` Eli Zaretskii
  2012-09-30  7:37                   ` Paul Eggert
  2012-09-28 12:37                 ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-28 11:00 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: monnier, emacs-devel

> Date: Fri, 28 Sep 2012 18:14:43 +0900
> From: Tomohiro Matsuyama <tomo@cx4a.org>
> Cc: emacs-devel@gnu.org
> 
> The default value of profiler-sample-interval is now 1, but
> I think it is too high frequent.  I've profiled the following
> code, and found there is a huge mismatch between 1 and 10 of 
> profiler-sample-interval value (1000ms vs 4000ms).
> 
> (defun tak (x y z)
>   (if (<= x y)
>       z
>     (tak (tak (1- x) y z)
>          (tak (1- y) z x)
>          (tak (1- z) x y))))
> (byte-compile 'tak)
> (benchmark 1 '(tak 20 12 2))
> 
> How about setting the value to 9, for example?

What interval is used by the profiling code in a C program compiled
with -pg?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 10:09                       ` Tomohiro Matsuyama
@ 2012-09-28 11:05                         ` Eli Zaretskii
  2012-09-28 12:43                           ` Stefan Monnier
  2012-09-28 12:40                         ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-28 11:05 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: monnier, emacs-devel

> Date: Fri, 28 Sep 2012 19:09:56 +0900
> From: Tomohiro Matsuyama <tomo@cx4a.org>
> Cc: eggert@cs.ucla.edu, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> > AFAIK, any backtrace facility is specific to the object file format
> > and the debug info used by the platform.  Compiler optimizations and
> > various non-local jumps complicate things a lot.  My suggestion would
> > be not to go there, unless absolutely needed and desirable, because it
> > most probably be in a constant need of maintenance to keep up with
> > development of the underlying platform, compilers, etc.
> How do you think about inserting a backtrace building code (like Ffuncall does)
> into DEFUN macro expansion

I believe this is what Stefan was thinking of as well, yes.

> if specified so in configure script, for example?

Why doe we need a configure-time decision?  It is best to have this
decided at run time, because then you can ask any user to turn on
profiling when some performance problem is reported.

> This could be fully portable, but involves some overhead.

If the overhead can be avoided when not profiling (e.g., depending on
some global variable), that would be acceptable, I think.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28  9:14               ` Tomohiro Matsuyama
  2012-09-28 11:00                 ` Eli Zaretskii
@ 2012-09-28 12:37                 ` Stefan Monnier
  1 sibling, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-28 12:37 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: emacs-devel

> The default value of profiler-sample-interval is now 1, but
> I think it is too high frequent.  I've profiled the following
> code, and found there is a huge mismatch between 1 and 10 of 
> profiler-sample-interval value (1000ms vs 4000ms).

I'm not sure what you mean by mismatch.  You mean that the profiler
with the current default slows down execution by a factor of 4?

In my tests, the slowdown was of the order of 1%.

> How about setting the value to 9, for example?

I'm willing to change the default, but I'd first like to understand
what's the root cause.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 10:09                       ` Tomohiro Matsuyama
  2012-09-28 11:05                         ` Eli Zaretskii
@ 2012-09-28 12:40                         ` Stefan Monnier
  1 sibling, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-28 12:40 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: Eli Zaretskii, eggert, emacs-devel

> No, because C-level function calls like Fvertical_motion could never been
> listed in backtrace_list.  That's why byte-code often appears at top in a
> profiling result.

If Fvertical_motion is often called directly from C, we can make it so
that this function registers itself explicitly on backtrace_list, like
we do with redisplay_internal.

> How do you think about inserting a backtrace building code (like
> Ffuncall does) into DEFUN macro expansion if specified so in configure
> script, for example?  This could be fully portable, but involves
> some overhead.

I'd rather not force the user to choose at compile-time, so instead I'd
like to always add this overhead, but only at those places where it's
cheap and useful, because the function does a fair bit of work.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 11:05                         ` Eli Zaretskii
@ 2012-09-28 12:43                           ` Stefan Monnier
  2012-09-28 14:16                             ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-28 12:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tomohiro Matsuyama, emacs-devel

> If the overhead can be avoided when not profiling (e.g., depending on
> some global variable), that would be acceptable, I think.

No, I think some of the overhead can't be chosen at runtime (e.g. the
`struct backtrace' will always use up stack space), and adding a test on
a global variable to functions like `car' doesn't sound like
a good idea.
IOW I don't want to add it flat out to all DEFUN functions.  To some of
them, yes, th, and to some non-DEFUN functions as well.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28  8:24                                     ` Stefan Monnier
@ 2012-09-28 14:11                                       ` Eli Zaretskii
  0 siblings, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-28 14:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: tomo@cx4a.org,  emacs-devel@gnu.org
> Date: Fri, 28 Sep 2012 04:24:18 -0400
> 
> >> Because the names that appear there are supposed to be Elisp names, not
> >> C names.  Maybe we could use a convention like "C redisplay_internal"
> >> or "redisplay_internal (C)"?
> > That would be good, yes.
> 
> I'll let you pick the one you prefer and make the change ;-)

Done.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 12:43                           ` Stefan Monnier
@ 2012-09-28 14:16                             ` Eli Zaretskii
  2012-09-28 20:11                               ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-28 14:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Tomohiro Matsuyama <tomo@cx4a.org>,  emacs-devel@gnu.org
> Date: Fri, 28 Sep 2012 08:43:00 -0400
> 
> > If the overhead can be avoided when not profiling (e.g., depending on
> > some global variable), that would be acceptable, I think.
> 
> No, I think some of the overhead can't be chosen at runtime (e.g. the
> `struct backtrace' will always use up stack space)

I thought about CPU overhead, not stack size overhead.

> and adding a test on a global variable to functions like `car'
> doesn't sound like a good idea.

Why not?  Profiling is a global state, affecting all of Emacs, isn't
it?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 14:16                             ` Eli Zaretskii
@ 2012-09-28 20:11                               ` Stefan Monnier
  2012-09-28 20:14                                 ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-28 20:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

>> and adding a test on a global variable to functions like `car'
>> doesn't sound like a good idea.
> Why not?

Because it would slow down `car' even when the profiler is not running.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 20:11                               ` Stefan Monnier
@ 2012-09-28 20:14                                 ` Eli Zaretskii
  2012-09-29  2:13                                   ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-28 20:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: tomo@cx4a.org, emacs-devel@gnu.org
> Date: Fri, 28 Sep 2012 16:11:40 -0400
> 
> >> and adding a test on a global variable to functions like `car'
> >> doesn't sound like a good idea.
> > Why not?
> 
> Because it would slow down `car' even when the profiler is not running.

OK, let me rephrase: why not add such a test to any primitive which is
large enough to make an addition of a test negligible for its run time?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 20:14                                 ` Eli Zaretskii
@ 2012-09-29  2:13                                   ` Stefan Monnier
  0 siblings, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-29  2:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

> OK, let me rephrase: why not add such a test to any primitive which is
> large enough to make an addition of a test negligible for its run time?

That'd be fine, yes.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 17:49               ` Stefan Monnier
  2012-09-26 19:31                 ` Eli Zaretskii
@ 2012-09-29 15:24                 ` Eli Zaretskii
  2012-09-29 16:06                   ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-29 15:24 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org
> Date: Wed, 26 Sep 2012 13:49:57 -0400
> 
> I typically do:
> 
>    emacs -Q
>    M-x profiler-start RET
>    C-x C-f src/regexp.c RET
>    page-down until I get to the end of the buffer, then page-up to the beginning
>    M-x profiler-report RET
> 
> Verifying that the resulting profile is accurate is rather difficult,
> of course, so as long as it looks credible, I assume it's more or
> less correct.

Is the cpu profile below (obtained with the above recipe) reasonable?

Why do functions like scroll-up-command and jit-lock-function appear
twice or three times, once under call-interactively, the other time by
themselves or under scroll-up?

(Should we rename "Automatic GC" like we did with redisplay?)

 - call-interactively                                               83  47%
   - scroll-up-command                                              43  24%
     - scroll-up                                                    43  24%
       - jit-lock-function                                          40  22%
	 - jit-lock-fontify-now                                     40  22%
	   - byte-code                                              40  22%
	     - run-hook-with-args                                   40  22%
	       - font-lock-fontify-region                           40  22%
		 - c-font-lock-fontify-region                       40  22%
		   - font-lock-default-fontify-region               38  21%
		     - font-lock-fontify-keywords-region            37  21%
		       - c-font-lock-declarations                   28  16%
			 - c-find-decl-spots                        28  16%
			   - #<compiled 0xd66b09>                   25  14%
			     + c-determine-limit                    16   9%
			       looking-at                            2   1%
			     + c-parse-state                         2   1%
			     + c-forward-decl-or-cast-1              1   0%
			     + c-font-lock-declarators               1   0%
			     + c-cheap-inside-bracelist-p            1   0%
			     + c-backward-sws                        1   0%
			     looking-at                              2   1%
			   + c-forward-sws                           1   0%
			 re-search-forward                           3   1%
		       + c-font-lock-enclosing-decls                 2   1%
		       + c-font-lock-enum-tail                       1   0%
		       + #<compiled 0xd6756d>                        1   0%
		       + c-font-lock-complex-decl-prepare            1   0%
		       + #<compiled 0xd67523>                        1   0%
		     + font-lock-fontify-syntactically-region        1   0%
		   - mapc                                            2   1%
		     + #<compiled 0xd36357>                          2   1%
   - list                                                           40  22%
     - read-extended-command                                        40  22%
       - completing-read                                            40  22%
	 - completing-read-default                                  40  22%
	     read-from-minibuffer                                   40  22%
 - scroll-up                                                        20  11%
   - jit-lock-function                                              20  11%
     - jit-lock-fontify-now                                         20  11%
       - byte-code                                                  20  11%
	 + run-hook-with-args                                       20  11%
   Automatic GC                                                     17   9%
 + redisplay_internal (C function)                                  16   9%
 - scroll-up-command                                                14   8%
   + scroll-up                                                      14   8%
 + jit-lock-function                                                14   8%
 + byte-code                                                         6   3%
 + jit-lock-fontify-now                                              4   2%
 + font-lock-fontify-region                                          1   0%



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-29 15:24                 ` Eli Zaretskii
@ 2012-09-29 16:06                   ` Stefan Monnier
  2012-09-29 16:18                     ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-09-29 16:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> Is the cpu profile below (obtained with the above recipe) reasonable?

Yes.

> Why do functions like scroll-up-command and jit-lock-function appear
> twice or three times, once under call-interactively, the other time by
> themselves or under scroll-up?

I'd guess it's because of the FIXME in profiler-call-tree-build-1.
I.e. you should be able to work around this limitation by increasing
profiler-max-stack-depth.


        Stefan


> (Should we rename "Automatic GC" like we did with redisplay?)

As you may see, it's handled very specially, and is presented always at
top-level.  I basically treat it as a separate (cooperative) thread.
So "Automatic GC" is meant as the name of the "thread" more than the
name of the function.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-29 16:06                   ` Stefan Monnier
@ 2012-09-29 16:18                     ` Eli Zaretskii
  2012-09-29 20:17                       ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-29 16:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sat, 29 Sep 2012 12:06:21 -0400
> 
> > Why do functions like scroll-up-command and jit-lock-function appear
> > twice or three times, once under call-interactively, the other time by
> > themselves or under scroll-up?
> 
> I'd guess it's because of the FIXME in profiler-call-tree-build-1.
> I.e. you should be able to work around this limitation by increasing
> profiler-max-stack-depth.

Yes, looks more reasonable with 64 (vs default 16), see below.  Should
the default be increased?

 - call-interactively                                              160  87%
   - scroll-up-command                                             112  61%
     - scroll-up                                                   112  61%
       - jit-lock-function                                         106  58%
	 - jit-lock-fontify-now                                    106  58%
	   - byte-code                                             106  58%
	     - run-hook-with-args                                  106  58%
	       - font-lock-fontify-region                          106  58%
		 - c-font-lock-fontify-region                      106  58%
		   - font-lock-default-fontify-region               94  51%
		     - font-lock-fontify-keywords-region            94  51%
		       - c-font-lock-declarations                   63  34%
			 - c-find-decl-spots                        63  34%
			   - #<compiled 0xd4164d>                   62  34%
			     + c-parse-state                        12   6%
			     + c-determine-limit                    12   6%
			     + c-beginning-of-decl-1                11   6%
			     + c-forward-decl-or-cast-1              7   3%
			     + c-cheap-inside-bracelist-p            7   3%
			     + c-backward-sws                        5   2%
			     + c-forward-label                       3   1%
			     + c-backward-token-2                    3   1%
			       looking-at                            1   0%
			     looking-at                              1   0%
		       - c-font-lock-enclosing-decls                22  12%
			 - c-beginning-of-decl-1                    19  10%
			   - byte-code                              19  10%
			     - c-beginning-of-statement-1           12   6%
			       + byte-code                           9   4%
			       + c-backward-sws                      3   1%
			     + c-in-knr-argdecl                      7   3%
			 + c-parse-state                             3   1%
		       + c-font-lock-complex-decl-prepare            3   1%
		       + #<compiled 0xd50bf9>                        2   1%
			 re-search-forward                           2   1%
		       + #<compiled 0xd50bdb>                        1   0%
		       + #<compiled 0xd53453>                        1   0%
		   - mapc                                           12   6%
		     - #<compiled 0x4015749>                        12   6%
		       - c-context-set-fl-decl-start                12   6%
			 - c-set-fl-decl-start                      12   6%
			   + c-beginning-of-decl-1                   8   4%
			   + c-determine-limit                       3   1%
			   + c-literal-limits                        1   0%
   - list                                                           48  26%
     - read-extended-command                                        48  26%
       - completing-read                                            48  26%
	 - completing-read-default                                  48  26%
	   - read-from-minibuffer                                   48  26%
	       redisplay_internal (C function)                       1   0%
 - redisplay_internal (C function)                                  11   6%
   - jit-lock-function                                               8   4%
     - jit-lock-fontify-now                                          8   4%
       - byte-code                                                   8   4%
	 - run-hook-with-args                                        8   4%
	   - font-lock-fontify-region                                8   4%
	     - c-font-lock-fontify-region                            8   4%
	       - font-lock-default-fontify-region                    8   4%
		 - font-lock-fontify-keywords-region                 8   4%
		   - c-font-lock-declarations                        4   2%
		     + c-find-decl-spots                             4   2%
		     re-search-forward                               2   1%
		   - c-font-lock-enclosing-decls                     2   1%
		     + c-parse-state                                 2   1%
   Automatic GC                                                     11   6%



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-29 16:18                     ` Eli Zaretskii
@ 2012-09-29 20:17                       ` Stefan Monnier
  0 siblings, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-29 20:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> I'd guess it's because of the FIXME in profiler-call-tree-build-1.
>> I.e. you should be able to work around this limitation by increasing
>> profiler-max-stack-depth.
> Yes, looks more reasonable with 64 (vs default 16), see below.  Should
> the default be increased?

The problem is that it increases substantially the memory (and CPU) used
by the profiler.  I'd rather fix the FIXME, at which point we should be
able to use a much smaller value.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27  4:02                       ` Stefan Monnier
@ 2012-09-30  7:32                         ` Paul Eggert
  0 siblings, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-09-30  7:32 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Tomohiro Matsuyama, Eli Zaretskii, emacs-devel

On 09/26/2012 09:02 PM, Stefan Monnier wrote:
> we can also have wrap-around for the actual counters

I noticed this independently, and just installed a patch
that uses saturation arithmetic for this, which is not
ideal but is better than wrapping around.




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-28 11:00                 ` Eli Zaretskii
@ 2012-09-30  7:37                   ` Paul Eggert
  2012-09-30  7:58                     ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-09-30  7:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tomohiro Matsuyama, emacs-devel

On 09/28/2012 04:00 AM, Eli Zaretskii wrote:
> What interval is used by the profiling code in a C program compiled
> with -pg?

It depends on the platform.  Historically it was 1/60th of
a second, because that's how often the clock interrupt was.
Nowadays 10 ms is common -- that's what GNU/Linux
uses, at least my copy does.





^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-27  7:01                       ` Eli Zaretskii
  2012-09-27 12:37                         ` Stefan Monnier
@ 2012-09-30  7:39                         ` Paul Eggert
  2012-09-30  7:58                           ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-09-30  7:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

On 09/27/2012 12:01 AM, Eli Zaretskii wrote:
> I think profiler-cpu-stop should revert the SIGPROF handler to its
> original value.

I did that in the patch I just installed.  "Original value"
in this case means "ignored", as SIGPROF is now ignored
when the profiler isn't running.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30  7:37                   ` Paul Eggert
@ 2012-09-30  7:58                     ` Eli Zaretskii
  2012-09-30  8:16                       ` Paul Eggert
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-30  7:58 UTC (permalink / raw)
  To: Paul Eggert; +Cc: tomo, emacs-devel

> Date: Sun, 30 Sep 2012 00:37:22 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: Tomohiro Matsuyama <tomo@cx4a.org>, emacs-devel@gnu.org
> 
> On 09/28/2012 04:00 AM, Eli Zaretskii wrote:
> > What interval is used by the profiling code in a C program compiled
> > with -pg?
> 
> It depends on the platform.  Historically it was 1/60th of
> a second, because that's how often the clock interrupt was.
> Nowadays 10 ms is common -- that's what GNU/Linux
> uses, at least my copy does.

So perhaps we should use 10 ms as the default sampling period.

Btw, does profiler.c live in peace with Emacs configured with
"--enable-profiling"?  They both use SIGPROF, AFAIK.  Perhaps
profiler.c features should be disabled when Emacs is profiling itself.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30  7:39                         ` Paul Eggert
@ 2012-09-30  7:58                           ` Eli Zaretskii
  0 siblings, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-30  7:58 UTC (permalink / raw)
  To: Paul Eggert; +Cc: tomo, emacs-devel

> Date: Sun, 30 Sep 2012 00:39:58 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: tomo@cx4a.org, emacs-devel@gnu.org
> 
> On 09/27/2012 12:01 AM, Eli Zaretskii wrote:
> > I think profiler-cpu-stop should revert the SIGPROF handler to its
> > original value.
> 
> I did that in the patch I just installed.

Thanks.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30  7:58                     ` Eli Zaretskii
@ 2012-09-30  8:16                       ` Paul Eggert
  0 siblings, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-09-30  8:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, emacs-devel

On 09/30/2012 12:58 AM, Eli Zaretskii wrote:
> Perhaps profiler.c features should be disabled when Emacs is profiling itself.

Good catch.  The two sets of features do collide, in that
profiler.c messes up Emacs profiling.  I followed your
suggestion in trunk bzr 110272.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 13:19           ` Eli Zaretskii
  2012-09-26 13:26             ` Eli Zaretskii
@ 2012-09-30 15:54             ` Eli Zaretskii
  1 sibling, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-09-30 15:54 UTC (permalink / raw)
  To: monnier; +Cc: emacs-devel

> Date: Wed, 26 Sep 2012 15:19:49 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> I didn't say we should prevent users of other platforms from enjoying
> this feature.  But if this will be in 24.3, someone, probably myself,
> will need to add the missing pieces to get it working on Windows.

Done as trunk revision 110287.

As a nice side effect, atimers now work on MS-Windows, so I could get
rid of Windows-specific code that displays the hourglass cursor, and
use the regular code in xdisp.c.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-26 17:42             ` Stefan Monnier
  2012-09-26 22:52               ` Tomohiro Matsuyama
  2012-09-28  9:14               ` Tomohiro Matsuyama
@ 2012-09-30 16:42               ` Tomohiro Matsuyama
  2012-09-30 19:23                 ` Paul Eggert
  2 siblings, 1 reply; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-30 16:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 499 bytes --]

Could someone please reviews this patch?
This patch includes the following changes.

1. Correct ambiguous words
  e.g. sample -> sampling
       sample -> cpu (in some context)
2. Introduce `profile' data structure (that was `log' data structure)
  `log' data structure is now pushed down to lower level.
3. Fix broken serialization/deserialization of profiles.
4. Change profiler-sampling-interval to 10 by default.
5. Introduce some new commands.
6. Rename some badly named commands

    Tomohiro

[-- Attachment #2: profiler.patch --]
[-- Type: text/x-diff, Size: 22976 bytes --]

=== modified file 'lisp/profiler.el'
--- lisp/profiler.el	2012-09-26 15:19:10 +0000
+++ lisp/profiler.el	2012-09-30 16:17:12 +0000
@@ -24,19 +24,21 @@
 
 ;;; Code:
 
-(eval-when-compile
-  (require 'cl-lib))
+(require 'cl-lib)
 
 (defgroup profiler nil
   "Emacs profiler."
   :group 'lisp
   :prefix "profiler-")
 
-(defcustom profiler-sample-interval 1
-  "Default sample interval in millisecond."
+(defconst profiler-version "24.3")
+
+(defcustom profiler-sampling-interval 10
+  "Default sampling interval in millisecond."
   :type 'integer
   :group 'profiler)
 
+\f
 ;;; Utilities
 
 (defun profiler-ensure-string (object)
@@ -49,6 +51,23 @@
 	(t
 	 (format "%s" object))))
 
+(defun profiler-format-percent (number divisor)
+  (concat (number-to-string (/ (* number 100) divisor)) "%"))
+
+(defun profiler-format-number (number)
+  "Format NUMBER in humarn readable string."
+  (if (and (integerp number) (> number 0))
+      (cl-loop with i = (% (1+ (floor (log10 number))) 3)
+	       for c in (append (number-to-string number) nil)
+	       if (= i 0)
+	       collect ?, into s
+	       and do (setq i 3)
+	       collect c into s
+	       do (cl-decf i)
+	       finally return
+	       (apply 'string (if (eq (car s) ?,) (cdr s) s)))
+    (profiler-ensure-string number)))
+
 (defun profiler-format (fmt &rest args)
   (cl-loop for (width align subfmt) in fmt
 	   for arg in args
@@ -74,27 +93,10 @@
 	   into frags
 	   finally return (apply #'concat frags)))
 
-(defun profiler-format-percent (number divisor)
-  (concat (number-to-string (/ (* number 100) divisor)) "%"))
-
-(defun profiler-format-nbytes (nbytes)
-  "Format NBYTES in humarn readable string."
-  (if (and (integerp nbytes) (> nbytes 0))
-      (cl-loop with i = (% (1+ (floor (log10 nbytes))) 3)
-	       for c in (append (number-to-string nbytes) nil)
-	       if (= i 0)
-	       collect ?, into s
-	       and do (setq i 3)
-	       collect c into s
-	       do (cl-decf i)
-	       finally return
-	       (apply 'string (if (eq (car s) ?,) (cdr s) s)))
-    (profiler-ensure-string nbytes)))
-
 \f
 ;;; Entries
 
-(defun profiler-entry-format (entry)
+(defun profiler-format-entry (entry)
   "Format ENTRY in human readable string.  ENTRY would be a
 function name of a function itself."
   (cond ((memq (car-safe entry) '(closure lambda))
@@ -106,76 +108,117 @@
 	(t
 	 (format "#<unknown 0x%x>" (sxhash entry)))))
 
-;;; Log data structure
+(defun profiler-fixup-entry (entry)
+  (if (symbolp entry)
+      entry
+    (profiler-format-entry entry)))
+
+\f
+;;; Backtraces
+
+(defun profiler-fixup-backtrace (backtrace)
+  (apply 'vector (mapcar 'profiler-fixup-entry backtrace)))
+
+\f
+;;; Logs
 
 ;; The C code returns the log in the form of a hash-table where the keys are
 ;; vectors (of size profiler-max-stack-depth, holding truncated
 ;; backtraces, where the first element is the top of the stack) and
 ;; the values are integers (which count how many times this backtrace
 ;; has been seen, multiplied by a "weight factor" which is either the
-;; sample-interval or the memory being allocated).
-;; We extend it by adding a few other entries to the hash-table, most notably:
-;; - Key `type' has a value indicating the kind of log (`memory' or `cpu').
-;; - Key `timestamp' has a value giving the time when the log was obtained.
-;; - Key `diff-p' indicates if this log represents a diff between two logs.
-
-(defun profiler-log-timestamp (log) (gethash 'timestamp log))
-(defun profiler-log-type (log) (gethash 'type log))
-(defun profiler-log-diff-p (log) (gethash 'diff-p log))
-
-(defun profiler-log-diff (log1 log2)
-  "Compare LOG1 with LOG2 and return a diff log.  Both logs must
-be same type."
-  (unless (eq (profiler-log-type log1)
-	      (profiler-log-type log2))
-    (error "Can't compare different type of logs"))
+;; sampling-interval or the memory being allocated).
+
+(defun profiler-compare-logs (log1 log2)
+  "Compare LOG1 with LOG2 and return diff."
   (let ((newlog (make-hash-table :test 'equal)))
     ;; Make a copy of `log1' into `newlog'.
     (maphash (lambda (backtrace count) (puthash backtrace count newlog))
              log1)
-    (puthash 'diff-p t newlog)
     (maphash (lambda (backtrace count)
-               (when (vectorp backtrace)
-                 (puthash backtrace (- (gethash backtrace log1 0) count)
-                          newlog)))
+               (puthash backtrace (- (gethash backtrace log1 0) count)
+                        newlog))
              log2)
     newlog))
 
-(defun profiler-log-fixup-entry (entry)
-  (if (symbolp entry)
-      entry
-    (profiler-entry-format entry)))
-
-(defun profiler-log-fixup-backtrace (backtrace)
-  (mapcar 'profiler-log-fixup-entry backtrace))
-
-(defun profiler-log-fixup (log)
-  "Fixup LOG so that the log could be serialized into file."
+(defun profiler-fixup-log (log)
   (let ((newlog (make-hash-table :test 'equal)))
     (maphash (lambda (backtrace count)
-               (puthash (if (not (vectorp backtrace))
-                            backtrace
-                          (profiler-log-fixup-backtrace backtrace))
+               (puthash (profiler-fixup-backtrace backtrace)
                         count newlog))
              log)
     newlog))
 
-(defun profiler-log-write-file (log filename &optional confirm)
-  "Write LOG into FILENAME."
+\f
+;;; Profiles
+
+(cl-defstruct (profiler-profile (:type vector)
+                                (:constructor profiler-make-profile))
+  (tag 'profiler-profile)
+  (version profiler-version)
+  ;; - `type' has a value indicating the kind of profile (`memory' or `cpu').
+  ;; - `log' indicates the profile log.
+  ;; - `timestamp' has a value giving the time when the profile was obtained.
+  ;; - `diff-p' indicates if this profile represents a diff between two profiles.
+  type log timestamp diff-p)
+
+(defun profiler-compare-profiles (profile1 profile2)
+  "Compare PROFILE1 with PROFILE2 and return diff."
+  (unless (eq (profiler-profile-type profile1)
+	      (profiler-profile-type profile2))
+    (error "Can't compare different type of profiles"))
+  (profiler-make-profile
+   :type (profiler-profile-type profile1)
+   :timestamp (current-time)
+   :diff-p t
+   :log (profiler-compare-logs
+         (profiler-profile-log profile1)
+         (profiler-profile-log profile2))))
+
+(defun profiler-fixup-profile (profile)
+  "Fixup PROFILE so that the profile could be serialized into file."
+  (profiler-make-profile
+   :type (profiler-profile-type profile)
+   :timestamp (profiler-profile-timestamp profile)
+   :diff-p (profiler-profile-diff-p profile)
+   :log (profiler-fixup-log (profiler-profile-log profile))))
+
+(defun profiler-write-profile (profile filename &optional confirm)
+  "Write PROFILE into file FILENAME."
   (with-temp-buffer
     (let (print-level print-length)
-      (print (profiler-log-fixup log) (current-buffer)))
+      (print (profiler-fixup-profile profile)
+             (current-buffer)))
     (write-file filename confirm)))
 
-(defun profiler-log-read-file (filename)
-  "Read log from FILENAME."
+(defun profiler-read-profile (filename)
+  "Read profile from file FILENAME."
+  ;; FIXME: tag and version check
   (with-temp-buffer
     (insert-file-contents filename)
     (goto-char (point-min))
     (read (current-buffer))))
 
+(defun profiler-cpu-profile ()
+  "Return CPU profile."
+  (when (and (fboundp 'profiler-cpu-running-p)
+             (fboundp 'profiler-cpu-log)
+             (profiler-cpu-running-p))
+    (profiler-make-profile
+     :type 'cpu
+     :timestamp (current-time)
+     :log (profiler-cpu-log))))
+
+(defun profiler-memory-profile ()
+  "Return memory profile."
+  (when (profiler-memory-running-p)
+    (profiler-make-profile
+     :type 'memory
+     :timestamp (current-time)
+     :log (profiler-memory-log))))
+
 \f
-;;; Calltree data structure
+;;; Calltrees
 
 (cl-defstruct (profiler-calltree (:constructor profiler-make-calltree))
   entry
@@ -202,7 +245,6 @@
 
 (defun profiler-calltree-find (tree entry)
   "Return a child tree of ENTRY under TREE."
-  ;; OPTIMIZED
   (let (result (children (profiler-calltree-children tree)))
     ;; FIXME: Use `assoc'.
     (while (and children (null result))
@@ -224,19 +266,18 @@
   ;; get a meaningful call-tree.
   (maphash
    (lambda (backtrace count)
-     (when (vectorp backtrace)
-       (let ((node tree)
-	     (max (length backtrace)))
-	 (dotimes (i max)
-	   (let ((entry (aref backtrace (if reverse i (- max i 1)))))
-	     (when entry
-	       (let ((child (profiler-calltree-find node entry)))
-		 (unless child
-		   (setq child (profiler-make-calltree
-				:entry entry :parent node))
-		   (push child (profiler-calltree-children node)))
-		 (cl-incf (profiler-calltree-count child) count)
-		 (setq node child))))))))
+     (let ((node tree)
+           (max (length backtrace)))
+       (dotimes (i max)
+         (let ((entry (aref backtrace (if reverse i (- max i 1)))))
+           (when entry
+             (let ((child (profiler-calltree-find node entry)))
+               (unless child
+                 (setq child (profiler-make-calltree
+                              :entry entry :parent node))
+                 (push child (profiler-calltree-children node)))
+               (cl-incf (profiler-calltree-count child) count)
+               (setq node child)))))))
    log))
 
 (defun profiler-calltree-compute-percentages (tree)
@@ -281,18 +322,18 @@
   :type 'string
   :group 'profiler)
 
-(defvar profiler-report-sample-line-format
+(defvar profiler-report-cpu-line-format
   '((60 left)
     (14 right ((9 right)
 	       (5 right)))))
 
 (defvar profiler-report-memory-line-format
   '((55 left)
-    (19 right ((14 right profiler-format-nbytes)
+    (19 right ((14 right profiler-format-number)
 	       (5 right)))))
 
-(defvar-local profiler-report-log nil
-  "The current profiler log.")
+(defvar-local profiler-report-profile nil
+  "The current profile.")
 
 (defvar-local profiler-report-reversed nil
   "True if calltree is rendered in bottom-up.  Do not touch this
@@ -313,7 +354,7 @@
 			      'mouse-face 'highlight
 			      'help-echo "mouse-2 or RET jumps to definition"))
 		 (t
-		  (profiler-entry-format entry)))))
+		  (profiler-format-entry entry)))))
     (propertize string 'profiler-entry entry)))
 
 (defun profiler-report-make-name-part (tree)
@@ -332,12 +373,12 @@
     (concat " " escaped)))
 
 (defun profiler-report-line-format (tree)
-  (let ((diff-p (profiler-log-diff-p profiler-report-log))
+  (let ((diff-p (profiler-profile-diff-p profiler-report-profile))
 	(name-part (profiler-report-make-name-part tree))
 	(count (profiler-calltree-count tree))
 	(count-percent (profiler-calltree-count-percent tree)))
-    (profiler-format (cl-ecase (profiler-log-type profiler-report-log)
-		       (cpu profiler-report-sample-line-format)
+    (profiler-format (cl-ecase (profiler-profile-type profiler-report-profile)
+		       (cpu profiler-report-cpu-line-format)
 		       (memory profiler-report-memory-line-format))
 		     name-part
 		     (if diff-p
@@ -378,27 +419,35 @@
     (define-key map "B"	    'profiler-report-render-reversed-calltree)
     (define-key map "A"	    'profiler-report-ascending-sort)
     (define-key map "D"	    'profiler-report-descending-sort)
-    (define-key map "="	    'profiler-report-compare-log)
-    (define-key map (kbd "C-x C-w") 'profiler-report-write-log)
+    (define-key map "="	    'profiler-report-compare-profile)
+    (define-key map (kbd "C-x C-w") 'profiler-report-write-profile)
     (define-key map "q"     'quit-window)
     map))
 
-(defun profiler-report-make-buffer-name (log)
+(defun profiler-report-make-buffer-name (profile)
   (format "*%s-Profiler-Report %s*"
-          (cl-ecase (profiler-log-type log) (cpu 'CPU) (memory 'Memory))
-          (format-time-string "%Y-%m-%d %T" (profiler-log-timestamp log))))
+          (cl-ecase (profiler-profile-type profile) (cpu 'CPU) (memory 'Memory))
+          (format-time-string "%Y-%m-%d %T" (profiler-profile-timestamp profile))))
 
-(defun profiler-report-setup-buffer (log)
-  "Make a buffer for LOG and return it."
-  (let* ((buf-name (profiler-report-make-buffer-name log))
+(defun profiler-report-setup-buffer-1 (profile)
+  "Make a buffer for PROFILE and return it."
+  (let* ((buf-name (profiler-report-make-buffer-name profile))
 	 (buffer (get-buffer-create buf-name)))
     (with-current-buffer buffer
       (profiler-report-mode)
-      (setq profiler-report-log log
+      (setq profiler-report-profile profile
 	    profiler-report-reversed nil
 	    profiler-report-order 'descending))
     buffer))
 
+(defun profiler-report-setup-buffer (profile)
+  "Make a buffer for PROFILE with rendering the profile and
+return it."
+  (let ((buffer (profiler-report-setup-buffer-1 profile)))
+    (with-current-buffer buffer
+      (profiler-report-render-calltree))
+    buffer))
+
 (define-derived-mode profiler-report-mode special-mode "Profiler-Report"
   "Profiler Report Mode."
   (setq buffer-read-only t
@@ -408,12 +457,12 @@
 \f
 ;;; Report commands
 
-(defun profiler-report-calltree-at-point ()
-  (get-text-property (point) 'calltree))
+(defun profiler-report-calltree-at-point (&optional point)
+  (get-text-property (or point (point)) 'calltree))
 
 (defun profiler-report-move-to-entry ()
-  (let ((point (next-single-property-change (line-beginning-position)
-                                            'profiler-entry)))
+  (let ((point (next-single-property-change
+                (line-beginning-position) 'profiler-entry)))
     (if point
 	(goto-char point)
       (back-to-indentation))))
@@ -493,14 +542,15 @@
 	(describe-function entry)))))
 
 (cl-defun profiler-report-render-calltree-1
-    (log &key reverse (order 'descending))
-  (let ((calltree (profiler-calltree-build profiler-report-log
-					   :reverse reverse)))
+    (profile &key reverse (order 'descending))
+  (let ((calltree (profiler-calltree-build
+                   (profiler-profile-log profile)
+                   :reverse reverse)))
     (setq header-line-format
-	  (cl-ecase (profiler-log-type log)
+	  (cl-ecase (profiler-profile-type profile)
 	    (cpu
 	     (profiler-report-header-line-format
-	      profiler-report-sample-line-format
+	      profiler-report-cpu-line-format
 	      "Function" (list "Time (ms)" "%")))
 	    (memory
 	     (profiler-report-header-line-format
@@ -517,7 +567,7 @@
       (profiler-report-move-to-entry))))
 
 (defun profiler-report-rerender-calltree ()
-  (profiler-report-render-calltree-1 profiler-report-log
+  (profiler-report-render-calltree-1 profiler-report-profile
 				     :reverse profiler-report-reversed
 				     :order profiler-report-order))
 
@@ -545,28 +595,31 @@
   (setq profiler-report-order 'descending)
   (profiler-report-rerender-calltree))
 
-(defun profiler-report-log (log)
-  (let ((buffer (profiler-report-setup-buffer log)))
-    (with-current-buffer buffer
-      (profiler-report-render-calltree))
-    (pop-to-buffer buffer)))
-
-(defun profiler-report-compare-log (buffer)
-  "Compare the current profiler log with another."
+(defun profiler-report-profile (profile)
+  (switch-to-buffer (profiler-report-setup-buffer profile)))
+
+(defun profiler-report-profile-other-window (profile)
+  (switch-to-buffer-other-window (profiler-report-setup-buffer profile)))
+
+(defun profiler-report-profile-other-frame (profile)
+  (switch-to-buffer-other-frame (profiler-report-setup-buffer profile)))
+
+(defun profiler-report-compare-profile (buffer)
+  "Compare the current profile with another."
   (interactive (list (read-buffer "Compare to: ")))
-  (let* ((log1 (with-current-buffer buffer profiler-report-log))
-	 (log2 profiler-report-log)
-	 (diff-log (profiler-log-diff log1 log2)))
-    (profiler-report-log diff-log)))
+  (let* ((profile1 (with-current-buffer buffer profiler-report-profile))
+	 (profile2 profiler-report-profile)
+	 (diff-profile (profiler-compare-profiles profile1 profile2)))
+    (profiler-report-profile diff-profile)))
 
-(defun profiler-report-write-log (filename &optional confirm)
-  "Write the current profiler log into FILENAME."
+(defun profiler-report-write-profile (filename &optional confirm)
+  "Write the current profile into file FILENAME."
   (interactive
-   (list (read-file-name "Write log: " default-directory)
+   (list (read-file-name "Write profile: " default-directory)
 	 (not current-prefix-arg)))
-  (profiler-log-write-file profiler-report-log
-			   filename
-			   confirm))
+  (profiler-write-profile profiler-report-profile
+                          filename
+                          confirm))
 
 \f
 ;;; Profiler commands
@@ -584,13 +637,13 @@
                                     nil t nil nil "cpu")))))
   (cl-ecase mode
     (cpu
-     (profiler-cpu-start profiler-sample-interval)
+     (profiler-cpu-start profiler-sampling-interval)
      (message "CPU profiler started"))
     (mem
      (profiler-memory-start)
      (message "Memory profiler started"))
     (cpu+mem
-     (profiler-cpu-start profiler-sample-interval)
+     (profiler-cpu-start profiler-sampling-interval)
      (profiler-memory-start)
      (message "CPU and memory profiler started"))))
 
@@ -606,48 +659,58 @@
                    (t "No")))))
 
 (defun profiler-reset ()
-  "Reset profiler log."
+  "Reset profiler logs."
   (interactive)
   (when (fboundp 'profiler-cpu-log)
     (ignore (profiler-cpu-log)))
   (ignore (profiler-memory-log))
   t)
 
-(defun profiler--report-cpu ()
-  (let ((log (if (fboundp 'profiler-cpu-log) (profiler-cpu-log))))
-    (when log
-      (puthash 'type 'cpu log)
-      (puthash 'timestamp (current-time) log)
-      (profiler-report-log log))))
+(defun profiler-report-cpu ()
+  (let ((profile (profiler-cpu-profile)))
+    (when profile
+      (profiler-report-profile-other-window profile))))
 
-(defun profiler--report-memory ()
-  (let ((log (profiler-memory-log)))
-    (when log
-      (puthash 'type 'memory log)
-      (puthash 'timestamp (current-time) log)
-      (profiler-report-log log))))
+(defun profiler-report-memory ()
+  (let ((profile (profiler-memory-profile)))
+    (when profile
+      (profiler-report-profile-other-window profile))))
 
 (defun profiler-report ()
   "Report profiling results."
   (interactive)
-  (profiler--report-cpu)
-  (profiler--report-memory))
-
-;;;###autoload
-(defun profiler-find-log (filename)
-  "Read a profiler log from FILENAME and report it."
-  (interactive
-   (list (read-file-name "Find log: " default-directory)))
-  (profiler-report-log (profiler-log-read-file filename)))
+  (profiler-report-cpu)
+  (profiler-report-memory))
+
+;;;###autoload
+(defun profiler-find-profile (filename)
+  "Open profile FILENAME."
+  (interactive
+   (list (read-file-name "Find profile: " default-directory)))
+  (profiler-report-profile (profiler-read-profile filename)))
+
+;;;###autoload
+(defun profiler-find-profile-other-window (filename)
+  "Open profile FILENAME."
+  (interactive
+   (list (read-file-name "Find profile: " default-directory)))
+  (profiler-report-profile-other-window (profiler-read-profile filename)))
+
+;;;###autoload
+(defun profiler-find-profile-other-frame (filename)
+  "Open profile FILENAME."
+  (interactive
+   (list (read-file-name "Find profile: " default-directory)))
+  (profiler-report-profile-other-frame(profiler-read-profile filename)))
 
 \f
 ;;; Profiling helpers
 
-;; (cl-defmacro with-sample-profiling ((&key interval) &rest body)
+;; (cl-defmacro with-cpu-profiling ((&key sampling-interval) &rest body)
 ;;   `(unwind-protect
 ;;        (progn
 ;;          (ignore (profiler-cpu-log))
-;;          (profiler-cpu-start ,interval)
+;;          (profiler-cpu-start ,sampling-interval)
 ;;          ,@body)
 ;;      (profiler-cpu-stop)
 ;;      (profiler--report-cpu)))

=== modified file 'src/profiler.c'
--- src/profiler.c	2012-09-30 15:49:05 +0000
+++ src/profiler.c	2012-09-30 16:17:12 +0000
@@ -198,7 +198,7 @@
   }
 }
 \f
-/* Sample profiler.  */
+/* Sampling profiler.  */
 
 #ifdef PROFILER_CPU_SUPPORT
 
@@ -220,10 +220,10 @@
 /* Separate counter for the time spent in the GC.  */
 static EMACS_INT cpu_gc_count;
 
-/* The current sample interval in milliseconds.  */
-static EMACS_INT current_sample_interval;
+/* The current sampling interval in milliseconds.  */
+static EMACS_INT current_sampling_interval;
 
-/* Signal handler for sample profiler.  */
+/* Signal handler for sampling profiler.  */
 
 static void
 handle_profiler_signal (int signal)
@@ -235,11 +235,11 @@
        not expect the ARRAY_MARK_FLAG to be set.  We could try and
        harden the hash-table code, but it doesn't seem worth the
        effort.  */
-    cpu_gc_count = saturated_add (cpu_gc_count, current_sample_interval);
+    cpu_gc_count = saturated_add (cpu_gc_count, current_sampling_interval);
   else
     {
       eassert (HASH_TABLE_P (cpu_log));
-      record_backtrace (XHASH_TABLE (cpu_log), current_sample_interval);
+      record_backtrace (XHASH_TABLE (cpu_log), current_sampling_interval);
     }
 }
 
@@ -250,21 +250,21 @@
 }
 
 static enum profiler_cpu_running
-setup_cpu_timer (Lisp_Object sample_interval)
+setup_cpu_timer (Lisp_Object sampling_interval)
 {
   struct sigaction action;
   struct itimerval timer;
   struct timespec interval;
 
-  if (! RANGED_INTEGERP (1, sample_interval,
+  if (! RANGED_INTEGERP (1, sampling_interval,
 			 (TYPE_MAXIMUM (time_t) < EMACS_INT_MAX / 1000
 			  ? (EMACS_INT) TYPE_MAXIMUM (time_t) * 1000 + 999
 			  : EMACS_INT_MAX)))
     return NOT_RUNNING;
 
-  current_sample_interval = XINT (sample_interval);
-  interval = make_emacs_time (current_sample_interval / 1000,
-			      current_sample_interval % 1000 * 1000000);
+  current_sampling_interval = XINT (sampling_interval);
+  interval = make_emacs_time (current_sampling_interval / 1000,
+			      current_sampling_interval % 1000 * 1000000);
   emacs_sigaction_init (&action, deliver_profiler_signal);
   sigaction (SIGPROF, &action, 0);
 
@@ -315,12 +315,12 @@
 DEFUN ("profiler-cpu-start", Fprofiler_cpu_start, Sprofiler_cpu_start,
        1, 1, 0,
        doc: /* Start or restart the cpu profiler.
-It takes call-stack samples each SAMPLE-INTERVAL milliseconds.
+It takes call-stack samples each SAMPLING-INTERVAL milliseconds.
 See also `profiler-log-size' and `profiler-max-stack-depth'.  */)
-  (Lisp_Object sample_interval)
+  (Lisp_Object sampling_interval)
 {
   if (profiler_cpu_running)
-    error ("Sample profiler is already running");
+    error ("CPU profiler is already running");
 
   if (NILP (cpu_log))
     {
@@ -329,9 +329,9 @@
 			  profiler_max_stack_depth);
     }
 
-  profiler_cpu_running = setup_cpu_timer (sample_interval);
+  profiler_cpu_running = setup_cpu_timer (sampling_interval);
   if (! profiler_cpu_running)
-    error ("Invalid sample interval");
+    error ("Invalid sampling interval");
 
   return Qt;
 }


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 16:42               ` Tomohiro Matsuyama
@ 2012-09-30 19:23                 ` Paul Eggert
  2012-09-30 20:13                   ` Stefan Monnier
  2012-09-30 22:26                   ` Tomohiro Matsuyama
  0 siblings, 2 replies; 113+ messages in thread
From: Paul Eggert @ 2012-09-30 19:23 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: emacs-devel

Thanks, the C part of that change looks OK.
I also like increasing the default sample interval to 10 ms.

I didn't look at the Lisp change carefully,
but I did notice that "human" was misspelled as "humarn"
in at least two places.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 19:23                 ` Paul Eggert
@ 2012-09-30 20:13                   ` Stefan Monnier
  2012-09-30 20:43                     ` Paul Eggert
  2012-09-30 22:28                     ` Tomohiro Matsuyama
  2012-09-30 22:26                   ` Tomohiro Matsuyama
  1 sibling, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-09-30 20:13 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Tomohiro Matsuyama, emacs-devel

> I also like increasing the default sample interval to 10 ms.

Any particular reason?


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 20:13                   ` Stefan Monnier
@ 2012-09-30 20:43                     ` Paul Eggert
  2012-10-01  0:41                       ` Stefan Monnier
  2012-10-01 10:37                       ` Eli Zaretskii
  2012-09-30 22:28                     ` Tomohiro Matsuyama
  1 sibling, 2 replies; 113+ messages in thread
From: Paul Eggert @ 2012-09-30 20:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Tomohiro Matsuyama, emacs-devel

On 09/30/2012 01:13 PM, Stefan Monnier wrote:
>> I also like increasing the default sample interval to 10 ms.
> 
> Any particular reason?

Conservativism.  If the sample interval is too small,
bad things happen: Emacs goes catatonic or dumps
core or whatever.  (I found this out by using a
sample interval of 1 microsecond -- silly, I know,
but I wanted to see what would happen....)

Whether an interval is "too small" depends on the
platform.  The platforms most at risk are the older,
slower ones.

If 1 ms was ten times better than 10 ms, in terms of the profile
quality that it gave you, it'd be worth going with 1 ms
despite these potential (and maybe nonexistent) problems.
But it doesn't seem to be *that* much better.

All that being said, it's no big deal.
1 ms is fine too, I expect.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 19:23                 ` Paul Eggert
  2012-09-30 20:13                   ` Stefan Monnier
@ 2012-09-30 22:26                   ` Tomohiro Matsuyama
  1 sibling, 0 replies; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-30 22:26 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Thanks, the C part of that change looks OK.
> I also like increasing the default sample interval to 10 ms.
>
> I didn't look at the Lisp change carefully,
> but I did notice that "human" was misspelled as "humarn"
> in at least two places.
Thanks.  I've installed the change into trunk.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 20:13                   ` Stefan Monnier
  2012-09-30 20:43                     ` Paul Eggert
@ 2012-09-30 22:28                     ` Tomohiro Matsuyama
  1 sibling, 0 replies; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-09-30 22:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Paul Eggert, emacs-devel

> > I also like increasing the default sample interval to 10 ms.
> 
> Any particular reason?
I'd like to know the reason.  I've just changed the default for now
to make sure the profiler can work fine.  Need to investigate.

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 20:43                     ` Paul Eggert
@ 2012-10-01  0:41                       ` Stefan Monnier
  2012-10-01  1:49                         ` Paul Eggert
  2012-10-01 10:37                       ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-10-01  0:41 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Tomohiro Matsuyama, emacs-devel

> Whether an interval is "too small" depends on the platform.
> The platforms most at risk are the older, slower ones.

My main machines are nettops, and I always run with
enable-checking, -O0, and whatnot, so I generally consider that if it's
fast enough for me, it'll be fast enough for the large majority
of users.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01  0:41                       ` Stefan Monnier
@ 2012-10-01  1:49                         ` Paul Eggert
  2012-10-01 14:25                           ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-01  1:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Tomohiro Matsuyama, emacs-devel

On 09/30/2012 05:41 PM, Stefan Monnier wrote:
> I generally consider that if it's
> fast enough for me, it'll be fast enough for the large majority
> of users.

OK, I changed the default back to 1 ms.

How about if we also change the sampling interval units
from millseconds to nanoseconds?  It's odd to have a knob
where the default is turned all the way to the left.
And the underlying primitives nominally support nanoseconds.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-09-30 20:43                     ` Paul Eggert
  2012-10-01  0:41                       ` Stefan Monnier
@ 2012-10-01 10:37                       ` Eli Zaretskii
  2012-10-01 11:25                         ` Tomohiro Matsuyama
  2012-10-01 14:54                         ` Paul Eggert
  1 sibling, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 10:37 UTC (permalink / raw)
  To: Paul Eggert; +Cc: tomo, monnier, emacs-devel

> Date: Sun, 30 Sep 2012 13:43:14 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: Tomohiro Matsuyama <tomo@cx4a.org>, emacs-devel@gnu.org
> 
> If the sample interval is too small,
> bad things happen: Emacs goes catatonic or dumps
> core or whatever.  (I found this out by using a
> sample interval of 1 microsecond -- silly, I know,
> but I wanted to see what would happen....)

That's a bug somewhere, isn't it?  We already arrange for SIGPROF to
be blocked while we are in the SIGPROF handler, so why would too
frequent signals be a problem?

Hmm... what happens when the SIGPROF handler exits, and SIGPROF is
unblocked?  Will the signals caught during the period it was blocked
be reissued, and if so, how many times?

> Whether an interval is "too small" depends on the
> platform.  The platforms most at risk are the older,
> slower ones.

Slower platforms should have their minimal supported interval in
itimers be higher, no?  And 'setitimer' should round up the interval
to the minimal supported value, shouldn't it?

> All that being said, it's no big deal.
> 1 ms is fine too, I expect.

Default value is just that: the default.  If there are bad
consequences possible when using very small sampling intervals, we
should warn users about that in the doc strings, and perhaps even test
the value inside profiler-start and ask for confirmation.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 10:37                       ` Eli Zaretskii
@ 2012-10-01 11:25                         ` Tomohiro Matsuyama
  2012-10-01 14:59                           ` Paul Eggert
  2012-10-01 14:54                         ` Paul Eggert
  1 sibling, 1 reply; 113+ messages in thread
From: Tomohiro Matsuyama @ 2012-10-01 11:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, monnier, emacs-devel

> > Whether an interval is "too small" depends on the
> > platform.  The platforms most at risk are the older,
> > slower ones.
> 
> Slower platforms should have their minimal supported interval in
> itimers be higher, no?  And 'setitimer' should round up the interval
> to the minimal supported value, shouldn't it?
Is it possble to use the value of "sysconf(_SC_CLK_TCK)" as the minimal
supported value?

    Tomohiro



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01  1:49                         ` Paul Eggert
@ 2012-10-01 14:25                           ` Stefan Monnier
  2012-10-01 14:48                             ` Eli Zaretskii
  2012-10-01 15:10                             ` Paul Eggert
  0 siblings, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-10-01 14:25 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Tomohiro Matsuyama, emacs-devel

>> I generally consider that if it's fast enough for me, it'll be fast
>> enough for the large majority of users.
> OK, I changed the default back to 1 ms.

I'd much rather that people actually answer my questions about *why*
they want to change the default, rather than have them change it
back&forth as the wind blows.

> How about if we also change the sampling interval units
> from millseconds to nanoseconds?  It's odd to have a knob
> where the default is turned all the way to the left.

Whether 1ms is a good default (and leads to good/valid profiles) is
still an open question.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 14:25                           ` Stefan Monnier
@ 2012-10-01 14:48                             ` Eli Zaretskii
  2012-10-01 16:31                               ` Stefan Monnier
  2012-10-01 16:41                               ` Paul Eggert
  2012-10-01 15:10                             ` Paul Eggert
  1 sibling, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 14:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 01 Oct 2012 10:25:28 -0400
> Cc: Tomohiro Matsuyama <tomo@cx4a.org>, emacs-devel@gnu.org
> 
> >> I generally consider that if it's fast enough for me, it'll be fast
> >> enough for the large majority of users.
> > OK, I changed the default back to 1 ms.
> 
> I'd much rather that people actually answer my questions about *why*
> they want to change the default, rather than have them change it
> back&forth as the wind blows.
> 
> > How about if we also change the sampling interval units
> > from millseconds to nanoseconds?  It's odd to have a knob
> > where the default is turned all the way to the left.
> 
> Whether 1ms is a good default (and leads to good/valid profiles) is
> still an open question.

Here's one data point: on a Windows XP system based on Core i7 CPU at
3.4GHz, running the profiler in an optimized build makes one of the 8
cores fully (90%) busy.  Emacs is fully usable during profiling, and I
see in the CPU utilization graph that another core does whatever Emacs
needs in the foreground thread.  This is with the default profiling
interval set by profiler.el, which causes the profiling timer be run
at 15 msec intervals, the minimal one supported by the emulated
itimer.

On fencepost.gnu.org, which appears to be a 7-node machine with 8
cores on each node, running the CPU profiler with the default sampling
interval barely takes 1% of the system CPU resources, i.e. again only
one core (at most) is busy.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 10:37                       ` Eli Zaretskii
  2012-10-01 11:25                         ` Tomohiro Matsuyama
@ 2012-10-01 14:54                         ` Paul Eggert
  2012-10-01 15:28                           ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 14:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, monnier, emacs-devel

On 10/01/2012 03:37 AM, Eli Zaretskii wrote:

> That's a bug somewhere, isn't it?

It shouldn't dump core, true.  Someone with some free time should
probably look into that.  (I'm guessing it'll be me....)

> Will the signals caught during the period it was blocked
> be reissued, and if so, how many times?

On modern POSIXish systems, the pending SIGPROF signal is delivered
when the SIGPROF handler returns.  It's delivered just once.  There is
an overrun counter that will tell you how many instances of the signal
were discarded.

From the user's point of view, if the sampling interval is too
short, Emacs spends all its time profiling and refuses to do
anything else.  I.e., it goes catatonic.

We currently don't let the user set the sampling interval to be less
than 1 ms, because our interface (perhaps unwisely) has millisecond
resolution.  As far as we know, 1 ms works on all POSIXish platforms,
and if so we don't have to worry about Emacs going catatonic, or about
overrun counts, or the like.  If we changed the API to let users
sample more often than 1 ms we would have to worry about this.

> Slower platforms should have their minimal supported interval in
> itimers be higher, no?

Typically the minimal supported interval is small enough so that an
Emacs process could easily go catatonic if we set the sampling
interval small enough (less than 1 ms).

> And 'setitimer' should round up the interval
> to the minimal supported value, shouldn't it?

Yes, but the value is pretty small.

> Default value is just that: the default.  If there are bad
> consequences possible when using very small sampling intervals,

Right now 1 ms is the default, and Stefan says that it works even on
his slow platform, and we're guessing that means that it works
everywhere of practical interest.




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 11:25                         ` Tomohiro Matsuyama
@ 2012-10-01 14:59                           ` Paul Eggert
  0 siblings, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 14:59 UTC (permalink / raw)
  To: Tomohiro Matsuyama; +Cc: Eli Zaretskii, monnier, emacs-devel

On 10/01/2012 04:25 AM, Tomohiro Matsuyama wrote:
> Is it possble to use the value of "sysconf(_SC_CLK_TCK)" as the minimal
> supported value?

We could change Emacs to do that, but why would
we want to?  On my host, the clock tick is 100,
meaning a 10 ms interval, but it's perfectly reasonable to
use a 1 ms sampling interval.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 14:25                           ` Stefan Monnier
  2012-10-01 14:48                             ` Eli Zaretskii
@ 2012-10-01 15:10                             ` Paul Eggert
  1 sibling, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 15:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Tomohiro Matsuyama, emacs-devel

On 10/01/2012 07:25 AM, Stefan Monnier wrote:
> I'd much rather that people actually answer my questions about *why*
> they want to change the default, rather than have them change it
> back&forth as the wind blows.

I thought that's what we were doing -- you asked why change it to 10 ms,
I responded that I was worried that 1 ms wouldn't work for older/slower
hosts, you said 1 ms works fine on your very-slow host, and that was
good enough for me so I changed it back.

> Whether 1ms is a good default (and leads to good/valid profiles) is
> still an open question.

Absolutely.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 14:54                         ` Paul Eggert
@ 2012-10-01 15:28                           ` Eli Zaretskii
  2012-10-01 16:34                             ` Stefan Monnier
  2012-10-01 16:39                             ` Paul Eggert
  0 siblings, 2 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 15:28 UTC (permalink / raw)
  To: Paul Eggert; +Cc: tomo, monnier, emacs-devel

> Date: Mon, 01 Oct 2012 07:54:51 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: monnier@iro.umontreal.ca, tomo@cx4a.org, emacs-devel@gnu.org
> 
> > Will the signals caught during the period it was blocked
> > be reissued, and if so, how many times?
> 
> On modern POSIXish systems, the pending SIGPROF signal is delivered
> when the SIGPROF handler returns.  It's delivered just once.

That's what I thought.

> >From the user's point of view, if the sampling interval is too
> short, Emacs spends all its time profiling and refuses to do
> anything else.  I.e., it goes catatonic.

But on a modern multi-core machines, this shouldn't happen, I think.
Or should it?

Or how about setting SIGPROF to SIG_IGN while the handler runs?  Would
this make things better?

> We currently don't let the user set the sampling interval to be less
> than 1 ms, because our interface (perhaps unwisely) has millisecond
> resolution.  As far as we know, 1 ms works on all POSIXish platforms,
> and if so we don't have to worry about Emacs going catatonic, or about
> overrun counts, or the like.  If we changed the API to let users
> sample more often than 1 ms we would have to worry about this.

Well, you've just asked for such a change, AFAIR ;-)

> > Default value is just that: the default.  If there are bad
> > consequences possible when using very small sampling intervals,
> 
> Right now 1 ms is the default, and Stefan says that it works even on
> his slow platform, and we're guessing that means that it works
> everywhere of practical interest.

But if we do allow smaller values, I think we should display a
warning.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 14:48                             ` Eli Zaretskii
@ 2012-10-01 16:31                               ` Stefan Monnier
  2012-10-01 17:19                                 ` Eli Zaretskii
  2012-10-01 16:41                               ` Paul Eggert
  1 sibling, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-10-01 16:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, eggert, emacs-devel

> needs in the foreground thread.  This is with the default profiling
> interval set by profiler.el, which causes the profiling timer be run
> at 15 msec intervals, the minimal one supported by the emulated
> itimer.

One potential problem I see is when the interval is smaller than what
the OS really does.  The resulting percentages are (probably) OK,
but the resulting times are not.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 15:28                           ` Eli Zaretskii
@ 2012-10-01 16:34                             ` Stefan Monnier
  2012-10-01 16:51                               ` Paul Eggert
  2012-10-01 16:39                             ` Paul Eggert
  1 sibling, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-10-01 16:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, Paul Eggert, emacs-devel

>> Right now 1 ms is the default, and Stefan says that it works even on
>> his slow platform, and we're guessing that means that it works
>> everywhere of practical interest.

That's assuming that my 1ms sampling timer is really run 1000 times per
(CPU) second.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 15:28                           ` Eli Zaretskii
  2012-10-01 16:34                             ` Stefan Monnier
@ 2012-10-01 16:39                             ` Paul Eggert
  1 sibling, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 16:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, monnier, emacs-devel

>> From the user's point of view, if the sampling interval is too
>> short, Emacs spends all its time profiling and refuses to do
>> anything else.  I.e., it goes catatonic.
> 
> But on a modern multi-core machines, this shouldn't happen, I think.
> Or should it?

It shouldn't happen on any reasonably modern machine, multicore
or not.  The amount of work done per sampling interval should be well
below how much power the CPU has, if the sampling interval is 1 ms
or more.  (I had some qualms about slower machines, but Stefan has
eased them.)

> Or how about setting SIGPROF to SIG_IGN while the handler runs?  Would
> this make things better?

It's not a path I would head down, no.  Emacs would still appear
to be catatonic if the sampling interval were small enough.  And
fiddling with SIG_IGN would slow down the signal handler and introduce
potential further races.

> Well, you've just asked for such a change, AFAIR ;-)

Yes, I'm still inclined to give users the extra rope.

> if we do allow smaller values, I think we should display a
> warning.

Yes, part of the change would be to detect timer overruns and
to do the right thing when they happen.  One possibility would
be to increase the sampling interval if it appears to be too small.




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 14:48                             ` Eli Zaretskii
  2012-10-01 16:31                               ` Stefan Monnier
@ 2012-10-01 16:41                               ` Paul Eggert
  2012-10-01 17:52                                 ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 16:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, Stefan Monnier, emacs-devel

On 10/01/2012 07:48 AM, Eli Zaretskii wrote:
> Here's one data point: on a Windows XP system based on Core i7 CPU at
> 3.4GHz, running the profiler in an optimized build makes one of the 8
> cores fully (90%) busy.

That doesn't sound good.  Is this the main thread, the one that
runs the Emacs interpreter?  Then presumably it's running at roughly
10% of effectiveness.  And if it's *not* the main thread, there
are some race condition gotchas; the profiler signal handler
should not run in parallel with the garbage collector.

It's quite worrisome that a simple counting operation, done every 15 ms,
should nearly peg a core.  Something seems amiss.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 16:34                             ` Stefan Monnier
@ 2012-10-01 16:51                               ` Paul Eggert
       [not found]                                 ` <1349123390.31932.1@faina>
  0 siblings, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 16:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, Eli Zaretskii, emacs-devel

On 10/01/2012 09:34 AM, Stefan Monnier wrote:
> That's assuming that my 1ms sampling timer is really run 1000 times per
> (CPU) second.

It's true that the actual samples might come in at some slower rate,
since timers are not guaranteed to have exactly the interval
asked for.  But as long as the rate is uniform pretty much any interval
will do, as not as it's not outrageously different from what's
requested.  My guess is that it's too much trouble to add code to test for the
actual sampling rate and to try to account for the sampling-rate
error more precisely.  At any rate (no pun intended) I don't know of
any other CPU-time profiler that tries to get that fancy,
not that I'm an expert....



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 16:31                               ` Stefan Monnier
@ 2012-10-01 17:19                                 ` Eli Zaretskii
  0 siblings, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 17:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: eggert@cs.ucla.edu,  tomo@cx4a.org,  emacs-devel@gnu.org
> Date: Mon, 01 Oct 2012 12:31:55 -0400
> 
> > needs in the foreground thread.  This is with the default profiling
> > interval set by profiler.el, which causes the profiling timer be run
> > at 15 msec intervals, the minimal one supported by the emulated
> > itimer.
> 
> One potential problem I see is when the interval is smaller than what
> the OS really does.

I don't think this can happen, because the timers normally live in the
OS kernel, and their timeout is updated by the OS using the same
machinery that drives the scheduler.

But I'm not up to date with the latest kernels, maybe this is no
longer true.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 16:41                               ` Paul Eggert
@ 2012-10-01 17:52                                 ` Eli Zaretskii
  2012-10-01 17:59                                   ` Eli Zaretskii
                                                     ` (3 more replies)
  0 siblings, 4 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 17:52 UTC (permalink / raw)
  To: Paul Eggert; +Cc: tomo, monnier, emacs-devel

> Date: Mon, 01 Oct 2012 09:41:37 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: Stefan Monnier <monnier@iro.umontreal.ca>, tomo@cx4a.org, 
>  emacs-devel@gnu.org
> 
> On 10/01/2012 07:48 AM, Eli Zaretskii wrote:
> > Here's one data point: on a Windows XP system based on Core i7 CPU at
> > 3.4GHz, running the profiler in an optimized build makes one of the 8
> > cores fully (90%) busy.
> 
> That doesn't sound good.  Is this the main thread, the one that
> runs the Emacs interpreter?

No, it isn't.  It's the additional thread that implements the
emulation of the Posix interval timers.

In a nutshell, that thread sleeps until the timer expiration, then
suspends the main thread, calls the SIGPROF handler, resumes the main
thread, and goes back to sleep.  The deviation from the Posix
semantics is that the handler runs in the context of a thread other
than the main thread.  But I don't think this is a big deal in this
case (the handler doesn't run Lisp), and even if it were, I don't know
how to run the handler in the context of the main thread.

> Then presumably it's running at roughly 10% of effectiveness.

No, it runs at 100%, because there are 7 more cores to do that.  Even
on a single-core multithreaded machine, which is quite a rarity today,
there will be another half-core to be used by the main thread.

> And if it's *not* the main thread, there are some race condition
> gotchas; the profiler signal handler should not run in parallel with
> the garbage collector.

The timer thread suspends the main thread (thus simulating a signal),
so the main thread cannot run at that time.

> It's quite worrisome that a simple counting operation, done every 15 ms,
> should nearly peg a core.  Something seems amiss.

I think this is because with the minimal sampling period I allow, the
timer thread almost never sleeps, and I think the signal handler does
quite a lot.  If I enlarge the sampling period to 30 ms, the thread
eats up only half a core, and with 50 ms it takes 10% of a core.

Or maybe I don't know how to write an efficient thread of this kind ;-)



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 17:52                                 ` Eli Zaretskii
@ 2012-10-01 17:59                                   ` Eli Zaretskii
  2012-10-01 21:12                                     ` Eli Zaretskii
  2012-10-01 22:21                                   ` Paul Eggert
                                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 17:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, eggert, monnier, emacs-devel

> Date: Mon, 01 Oct 2012 19:52:39 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: tomo@cx4a.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> > It's quite worrisome that a simple counting operation, done every 15 ms,
> > should nearly peg a core.  Something seems amiss.
> 
> I think this is because with the minimal sampling period I allow, the
> timer thread almost never sleeps, and I think the signal handler does
> quite a lot.

Actually, I take this back: I'm quite sure the problem is that the
handler is quite expensive.  The proof: when the user types 
"M-x profiler-stop RET", the timer thread does not exit, it continues
to run, sleeping 30 ms at a time, then awakens, sees that the timer is
disabled, and goes back to sleep.  So almost the whole thread loop is
executed, except suspending/resuming the main thread and calling the
handler.  The difference in terms of CPU usage is tremendous, though:
zero CPU usage.

So either suspending/resuming a thread takes a lot of time, or the
handler, or both.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 17:59                                   ` Eli Zaretskii
@ 2012-10-01 21:12                                     ` Eli Zaretskii
  0 siblings, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-01 21:12 UTC (permalink / raw)
  To: eggert; +Cc: tomo, monnier, emacs-devel

> Date: Mon, 01 Oct 2012 19:59:46 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: eggert@cs.ucla.edu, tomo@cx4a.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> Actually, I take this back: I'm quite sure the problem is that the
> handler is quite expensive.  The proof: when the user types 
> "M-x profiler-stop RET", the timer thread does not exit, it continues
> to run, sleeping 30 ms at a time, then awakens, sees that the timer is
> disabled, and goes back to sleep.  So almost the whole thread loop is
> executed, except suspending/resuming the main thread and calling the
> handler.  The difference in terms of CPU usage is tremendous, though:
> zero CPU usage.
> 
> So either suspending/resuming a thread takes a lot of time, or the
> handler, or both.

The truth is neither.  I timed the code with a high-resolution timer
and found a bug.  After fixing it, the CPU usage while profiling is
zero (not visible on the CPU graph).  Also, it turns out that calling
the SIGPROF handler takes about 10 usec, half of it to suspend and
resume the main thread.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 17:52                                 ` Eli Zaretskii
  2012-10-01 17:59                                   ` Eli Zaretskii
@ 2012-10-01 22:21                                   ` Paul Eggert
  2012-10-02  2:06                                   ` Stefan Monnier
  2012-10-02 13:23                                   ` Jason Rumney
  3 siblings, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-10-01 22:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, monnier, emacs-devel

On 10/01/2012 10:52 AM, Eli Zaretskii wrote:
> The timer thread suspends the main thread (thus simulating a signal),
> so the main thread cannot run at that time.

Ah, OK, thanks for explaining.  Sounds good,
especially now that the CPU usage while profiling
is essentially zero.

In trunk bzr 110328 I fixed a bug that caused a crash in
some of my profiler tests -- it was a race condition
involving malloc.  I am still looking into allowing
finer-grained profiling when supported.  Simplest may
be to change the nominal resolution of profiler-sampling-interval
from milliseconds to nanoseconds (and the default value from
1 to 1000000).  Of course no machine really does nanoseconds
(yet :-) but the low-level interface is there so why not?



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 17:52                                 ` Eli Zaretskii
  2012-10-01 17:59                                   ` Eli Zaretskii
  2012-10-01 22:21                                   ` Paul Eggert
@ 2012-10-02  2:06                                   ` Stefan Monnier
  2012-10-02  4:00                                     ` Eli Zaretskii
  2012-10-02  5:28                                     ` Paul Eggert
  2012-10-02 13:23                                   ` Jason Rumney
  3 siblings, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-10-02  2:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, Paul Eggert, emacs-devel

> In a nutshell, that thread sleeps until the timer expiration, then
> suspends the main thread, calls the SIGPROF handler, resumes the main
> thread, and goes back to sleep.  The deviation from the Posix
> semantics is that the handler runs in the context of a thread other
> than the main thread.

Isn't there another difference in that your code samples at a rate
determined by the wall-clock time rather than the cpu-time of the process?


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02  2:06                                   ` Stefan Monnier
@ 2012-10-02  4:00                                     ` Eli Zaretskii
  2012-10-02 12:58                                       ` Stefan Monnier
  2012-10-02  5:28                                     ` Paul Eggert
  1 sibling, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-02  4:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  tomo@cx4a.org,  emacs-devel@gnu.org
> Date: Mon, 01 Oct 2012 22:06:05 -0400
> 
> > In a nutshell, that thread sleeps until the timer expiration, then
> > suspends the main thread, calls the SIGPROF handler, resumes the main
> > thread, and goes back to sleep.  The deviation from the Posix
> > semantics is that the handler runs in the context of a thread other
> > than the main thread.
> 
> Isn't there another difference in that your code samples at a rate
> determined by the wall-clock time rather than the cpu-time of the process?

The timer thread is part of the process, so whenever it runs, the
process runs.

Or maybe I don't understand what you mean, exactly.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02  2:06                                   ` Stefan Monnier
  2012-10-02  4:00                                     ` Eli Zaretskii
@ 2012-10-02  5:28                                     ` Paul Eggert
  2012-10-02 13:00                                       ` Stefan Monnier
  1 sibling, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-02  5:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, Eli Zaretskii, emacs-devel

On 10/01/2012 07:06 PM, Stefan Monnier wrote:
> Isn't there another difference in that your code samples at a rate
> determined by the wall-clock time rather than the cpu-time of the process?

I'm afraid many POSIXish systems get this wrong too.  Some don't
even claim to support CPU-time timers.  Others claim to, but
silently substitute wall-clock time.  One really can't take sample
intervals as if they were gospel; they're not.  It's better to
think of them as just samples of the CPU.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
       [not found]                                 ` <1349123390.31932.1@faina>
@ 2012-10-02  6:41                                   ` Paul Eggert
  2012-10-02 13:03                                     ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-02  6:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs Development

On 10/01/2012 01:29 PM, Stefan Monnier wrote:
> If it's common for 1ms to suffer from significant delays,
> then it's a good argument for setting the default higher.

I did a bit more investigation on this, on my host
(x86-64, AMD Phenom II X4 910e, Fedora 17).  I instrumented
trunk bzr 110342 to count total timer overruns, and did a
CPU benchmark that took about 15.6 seconds when not profiling.

With a 10 ms profiling interval, there were zero overruns and the
benchmark took 15.62 seconds, essentially full speed.

With a 1 ms interval, there were 65 overruns and the
benchmark took 51.52 seconds, a worse than 3x slowdown.

With an 0.1 ms interval, there were 466,644 overruns
and the benchmark took 52.12 seconds.  It appears that
the effective interval was closer to 1 ms than to 0.1 ms, as
most of the samples were overrun.

These numbers suggest that we should go with a 10 ms interval,
at least for this kind of host.




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02  4:00                                     ` Eli Zaretskii
@ 2012-10-02 12:58                                       ` Stefan Monnier
  2012-10-02 17:14                                         ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-10-02 12:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, eggert, emacs-devel

>> > In a nutshell, that thread sleeps until the timer expiration, then
>> > suspends the main thread, calls the SIGPROF handler, resumes the main
>> > thread, and goes back to sleep.  The deviation from the Posix
>> > semantics is that the handler runs in the context of a thread other
>> > than the main thread.
>> Isn't there another difference in that your code samples at a rate
>> determined by the wall-clock time rather than the cpu-time of the process?
> The timer thread is part of the process, so whenever it runs, the
> process runs.
> Or maybe I don't understand what you mean, exactly.

If your profiler is set to sample every 10ms and your Emacs process uses
50% of the CPU, how many samples will you get after 1s: 100 or 50?
And if Emacs is idle waiting for user input, after 1s will you get 100
samples or 0?
Under GNU/Linux (at least with the "old" setitimer code) you'd have
gotten 50 and 0 respectively.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02  5:28                                     ` Paul Eggert
@ 2012-10-02 13:00                                       ` Stefan Monnier
  2012-10-02 16:17                                         ` Paul Eggert
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-10-02 13:00 UTC (permalink / raw)
  To: Paul Eggert; +Cc: tomo, Eli Zaretskii, emacs-devel

>> Isn't there another difference in that your code samples at a rate
>> determined by the wall-clock time rather than the cpu-time of the process?
> I'm afraid many POSIXish systems get this wrong too.  Some don't
> even claim to support CPU-time timers.  Others claim to, but
> silently substitute wall-clock time.  One really can't take sample
> intervals as if they were gospel; they're not.  It's better to
> think of them as just samples of the CPU.

At least the original setitimer code did sample based on cpu-time, based
on my test under GNU/Linux.  I hope your new code does too.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02  6:41                                   ` Paul Eggert
@ 2012-10-02 13:03                                     ` Stefan Monnier
  2012-10-03  5:11                                       ` Paul Eggert
  0 siblings, 1 reply; 113+ messages in thread
From: Stefan Monnier @ 2012-10-02 13:03 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs Development

>> If it's common for 1ms to suffer from significant delays,
>> then it's a good argument for setting the default higher.
> I did a bit more investigation on this, on my host
> (x86-64, AMD Phenom II X4 910e, Fedora 17).  I instrumented
> trunk bzr 110342 to count total timer overruns, and did a
> CPU benchmark that took about 15.6 seconds when not profiling.

What is an "overrun" exactly?  Is it when the signal gets blocked
because we're still in the signal handler?  Or is it when the signal
gets sent "significantly" later than requested, e.g. because of internal
OS limits on timer granularity or something like that?


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-01 17:52                                 ` Eli Zaretskii
                                                     ` (2 preceding siblings ...)
  2012-10-02  2:06                                   ` Stefan Monnier
@ 2012-10-02 13:23                                   ` Jason Rumney
  2012-10-02 17:20                                     ` Eli Zaretskii
  3 siblings, 1 reply; 113+ messages in thread
From: Jason Rumney @ 2012-10-02 13:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, Paul Eggert, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> I think this is because with the minimal sampling period I allow, the
> timer thread almost never sleeps

If it works as you say, it should be spending most  of the time asleep. I
think the problem is more likely to be the sampling rate of the CPU
monitor - Windows built in CPU monitor is known to vastly overestimate
the CPU use of a thread that is frequently waking at regular intervals
like that.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02 13:00                                       ` Stefan Monnier
@ 2012-10-02 16:17                                         ` Paul Eggert
  0 siblings, 0 replies; 113+ messages in thread
From: Paul Eggert @ 2012-10-02 16:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, Eli Zaretskii, emacs-devel

On 10/02/2012 06:00 AM, Stefan Monnier wrote:
> At least the original setitimer code did sample based on cpu-time, based
> on my test under GNU/Linux.  I hope your new code does too.

It does, on platforms that support CPU timers.  It works on mine.

My guess (though I am by no means an expert) is that platforms
where the new code has problems are platforms where setitimer
has similar problems.  I'd be surprised if it were otherwise,
since kernels typically use the same implementation
strategy for both setitimer and timer_settime.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02 12:58                                       ` Stefan Monnier
@ 2012-10-02 17:14                                         ` Eli Zaretskii
  2012-10-03  1:27                                           ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-02 17:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: tomo, eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: eggert@cs.ucla.edu,  tomo@cx4a.org,  emacs-devel@gnu.org
> Date: Tue, 02 Oct 2012 08:58:47 -0400
> 
> >> > In a nutshell, that thread sleeps until the timer expiration, then
> >> > suspends the main thread, calls the SIGPROF handler, resumes the main
> >> > thread, and goes back to sleep.  The deviation from the Posix
> >> > semantics is that the handler runs in the context of a thread other
> >> > than the main thread.
> >> Isn't there another difference in that your code samples at a rate
> >> determined by the wall-clock time rather than the cpu-time of the process?
> > The timer thread is part of the process, so whenever it runs, the
> > process runs.
> > Or maybe I don't understand what you mean, exactly.
> 
> If your profiler is set to sample every 10ms and your Emacs process uses
> 50% of the CPU, how many samples will you get after 1s: 100 or 50?

The answer is "it depends".  You should talk about threads, not
processes, since the emulated timer is a thread within the Emacs
process.

But yes, if the main (Lisp) thread does not get CPU time at all, while
the timer thread does, then the current code will get 100 samples
after 1 sec (if such a situation can happen on modern multi-core
machines).  However, it never bothered me until now, because, as Paul
says, you cannot trust the timer to work on CPU clock of your process
anyway.  And on top of that, AFAIK you cannot even trust to get the
interval you asked for without checking with getitimer, see my other
message.

If the decision is not to collect the actual CPU times in profiler.c's
SIGPROF handler, like I suggested, then I can adjust for the above
discrepancy by looking at the CPU time used by the main thread,
instead of looking at the wall-clock time, and simulating the timer
tick based on that.  (Assuming the relevant API is not too expensive,
that is.)

> And if Emacs is idle waiting for user input, after 1s will you get 100
> samples or 0?

Ah, but Emacs's idle loop is anything but idleness, as you well know.
Quite a few things get run while Emacs is "idle".

> Under GNU/Linux (at least with the "old" setitimer code) you'd have
> gotten 50 and 0 respectively.

I think this is sheer luck at best, even on GNU/Linux, and even then
when the system is not too loaded.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02 13:23                                   ` Jason Rumney
@ 2012-10-02 17:20                                     ` Eli Zaretskii
  2012-10-03 14:25                                       ` Jason Rumney
  0 siblings, 1 reply; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-02 17:20 UTC (permalink / raw)
  To: Jason Rumney; +Cc: tomo, eggert, monnier, emacs-devel

> From: Jason Rumney <jasonr@gnu.org>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  tomo@cx4a.org,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Tue, 02 Oct 2012 21:23:15 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I think this is because with the minimal sampling period I allow, the
> > timer thread almost never sleeps
> 
> If it works as you say, it should be spending most  of the time asleep. I
> think the problem is more likely to be the sampling rate of the CPU
> monitor - Windows built in CPU monitor is known to vastly overestimate
> the CPU use of a thread that is frequently waking at regular intervals
> like that.

Well, as you saw by now, you should never attribute to OS misfeatures
what can be easily attributed to bugs in the application ;-)

The problem was my incorrect interpretation of what 'Sleep (0);' does
in terms of timing.  The documentation says it causes the thread to
relinquish the remainder of its time slice, but I apparently didn't
realize how short that slice is.  Therefore the loop

  while (clock () < expiration_time)
    Sleep (0);

which was meant to make sure we never fire SIGPROF before the
expiration time, pegged one core because it called 'clock' too
frequently.  Changing 0 to 5 made the trick.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02 17:14                                         ` Eli Zaretskii
@ 2012-10-03  1:27                                           ` Stefan Monnier
  0 siblings, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-10-03  1:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, eggert, emacs-devel

>> Under GNU/Linux (at least with the "old" setitimer code) you'd have
>> gotten 50 and 0 respectively.
> I think this is sheer luck at best, even on GNU/Linux, and even then
> when the system is not too loaded.

Obviously, it wouldn't have been *exactly* 50 and 0, but close enough.


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02 13:03                                     ` Stefan Monnier
@ 2012-10-03  5:11                                       ` Paul Eggert
  2012-10-03 12:50                                         ` Stefan Monnier
  0 siblings, 1 reply; 113+ messages in thread
From: Paul Eggert @ 2012-10-03  5:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Emacs Development

On 10/02/2012 06:03 AM, Stefan Monnier wrote:
> What is an "overrun" exactly?  Is it when the signal gets blocked
> because we're still in the signal handler?

Close.  It's when a signal gets discarded because the
same signal is already pending; this can occur when
the signal is being blocked, and the signal can be
blocked because we're in the signal handler.

It's the performance figures, not the overruns, that
were the real eye-opener.  Going with a 1 ms sampling
interval made Emacs 3x slower.  Going with 10 ms made
Emacs run nearly at full speed.  That's a strong argument
for going with 10 ms.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-03  5:11                                       ` Paul Eggert
@ 2012-10-03 12:50                                         ` Stefan Monnier
  2012-10-03 14:06                                           ` Andreas Schwab
  2012-10-03 17:39                                           ` Eli Zaretskii
  0 siblings, 2 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-10-03 12:50 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs Development

> Going with a 1 ms sampling interval made Emacs 3x slower.

That's weird, tho.  On my lowly fit-pc2, the overhead was just 1% in
my test.  Of course, maybe the setitimer ended up firing at a much
slower rate?


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-03 12:50                                         ` Stefan Monnier
@ 2012-10-03 14:06                                           ` Andreas Schwab
  2012-10-03 17:22                                             ` Stefan Monnier
  2012-10-03 17:39                                           ` Eli Zaretskii
  1 sibling, 1 reply; 113+ messages in thread
From: Andreas Schwab @ 2012-10-03 14:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Paul Eggert, Emacs Development

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> That's weird, tho.  On my lowly fit-pc2, the overhead was just 1% in
> my test.  Of course, maybe the setitimer ended up firing at a much
> slower rate?

Perhaps your system doesn't have a high precision timer, so the kernel
falls back to the jiffies timer (which typically runs at 100Hz,
depending on kernel config).

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-02 17:20                                     ` Eli Zaretskii
@ 2012-10-03 14:25                                       ` Jason Rumney
  2012-10-03 17:43                                         ` Eli Zaretskii
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Rumney @ 2012-10-03 14:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: tomo, eggert, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> The documentation says it causes the thread to
> relinquish the remainder of its time slice, but I apparently didn't
> realize how short that slice is.

Timeslices are probably much shorter now than the 10ms they used to be
in Windows 95, but I guarantee they are still much longer than the time
it takes to execute:

>   while (clock () < expiration_time)
>     Sleep (0);

This is exactly the condition that will cause a false reading on a CPU
monitor, as the thread is runnable in every single timeslice,
even if it is immediately relinquishing it to another thread.  The CPU
monitor is only checking on the granularity of a timeslice so it sees
100%, where the actual usage for this process might be a fraction of 1%.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-03 14:06                                           ` Andreas Schwab
@ 2012-10-03 17:22                                             ` Stefan Monnier
  0 siblings, 0 replies; 113+ messages in thread
From: Stefan Monnier @ 2012-10-03 17:22 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Paul Eggert, Emacs Development

>> That's weird, tho.  On my lowly fit-pc2, the overhead was just 1% in
>> my test.  Of course, maybe the setitimer ended up firing at a much
>> slower rate?
> Perhaps your system doesn't have a high precision timer, so the kernel
> falls back to the jiffies timer (which typically runs at 100Hz,
> depending on kernel config).

Could be.  Tho a Fit-PC2 seems to be sufficiently recent to have
a high-precision timer (it's basically like the early Atom netbooks with
the dreaded GPU (gma500) that's sold by Intel but isn't supported by the
Xorg intel driver).  At least dmesg includes "HPET: 3 timers in total,
0 timers will be used for per-cpu timer".


        Stefan



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-03 12:50                                         ` Stefan Monnier
  2012-10-03 14:06                                           ` Andreas Schwab
@ 2012-10-03 17:39                                           ` Eli Zaretskii
  1 sibling, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-03 17:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eggert, Emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 03 Oct 2012 08:50:46 -0400
> Cc: Emacs Development <Emacs-devel@gnu.org>
> 
> > Going with a 1 ms sampling interval made Emacs 3x slower.
> 
> That's weird, tho.  On my lowly fit-pc2, the overhead was just 1% in
> my test.  Of course, maybe the setitimer ended up firing at a much
> slower rate?

Add a call to gettimeofday to the signal handler and time it.  Then
you'd now.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: Elisp native profiler
  2012-10-03 14:25                                       ` Jason Rumney
@ 2012-10-03 17:43                                         ` Eli Zaretskii
  0 siblings, 0 replies; 113+ messages in thread
From: Eli Zaretskii @ 2012-10-03 17:43 UTC (permalink / raw)
  To: Jason Rumney; +Cc: tomo, eggert, monnier, emacs-devel

> From: Jason Rumney <jasonr@gnu.org>
> Cc: eggert@cs.ucla.edu,  tomo@cx4a.org,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Wed, 03 Oct 2012 22:25:36 +0800
> 
> Timeslices are probably much shorter now than the 10ms they used to be
> in Windows 95, but I guarantee they are still much longer than the time
> it takes to execute:
> 
> >   while (clock () < expiration_time)
> >     Sleep (0);
> 
> This is exactly the condition that will cause a false reading on a CPU
> monitor, as the thread is runnable in every single timeslice,
> even if it is immediately relinquishing it to another thread.  The CPU
> monitor is only checking on the granularity of a timeslice so it sees
> 100%, where the actual usage for this process might be a fraction of 1%.

Thanks.  There's something to learn every day.



^ permalink raw reply	[flat|nested] 113+ messages in thread

end of thread, other threads:[~2012-10-03 17:43 UTC | newest]

Thread overview: 113+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-25  2:41 Elisp native profiler Stefan Monnier
2012-09-25 11:40 ` Juanma Barranquero
2012-09-25 13:18 ` Eli Zaretskii
2012-09-25 13:26   ` Eli Zaretskii
2012-09-25 17:28     ` Stefan Monnier
2012-09-25 19:21       ` Paul Eggert
2012-09-25 20:12         ` Eli Zaretskii
2012-09-26  0:55         ` Stefan Monnier
2012-09-26  6:16           ` Tomohiro Matsuyama
2012-09-26  7:35             ` Eli Zaretskii
2012-09-26  7:59               ` Tomohiro Matsuyama
2012-09-26  9:32                 ` Eli Zaretskii
2012-09-26 22:46                   ` Tomohiro Matsuyama
2012-09-27  5:55                     ` Eli Zaretskii
2012-09-27  7:01                       ` Eli Zaretskii
2012-09-27 12:37                         ` Stefan Monnier
2012-09-27 17:49                           ` Eli Zaretskii
2012-09-27 17:52                           ` Eli Zaretskii
2012-09-27 20:35                             ` Stefan Monnier
2012-09-27 20:46                               ` Eli Zaretskii
2012-09-27 22:12                                 ` Stefan Monnier
2012-09-28  7:09                                   ` Eli Zaretskii
2012-09-28  8:24                                     ` Stefan Monnier
2012-09-28 14:11                                       ` Eli Zaretskii
2012-09-28  8:03                                   ` Stephen Berman
2012-09-28  8:26                                     ` Stefan Monnier
2012-09-30  7:39                         ` Paul Eggert
2012-09-30  7:58                           ` Eli Zaretskii
2012-09-28 10:09                       ` Tomohiro Matsuyama
2012-09-28 11:05                         ` Eli Zaretskii
2012-09-28 12:43                           ` Stefan Monnier
2012-09-28 14:16                             ` Eli Zaretskii
2012-09-28 20:11                               ` Stefan Monnier
2012-09-28 20:14                                 ` Eli Zaretskii
2012-09-29  2:13                                   ` Stefan Monnier
2012-09-28 12:40                         ` Stefan Monnier
2012-09-26 17:42             ` Stefan Monnier
2012-09-26 22:52               ` Tomohiro Matsuyama
2012-09-28  9:14               ` Tomohiro Matsuyama
2012-09-28 11:00                 ` Eli Zaretskii
2012-09-30  7:37                   ` Paul Eggert
2012-09-30  7:58                     ` Eli Zaretskii
2012-09-30  8:16                       ` Paul Eggert
2012-09-28 12:37                 ` Stefan Monnier
2012-09-30 16:42               ` Tomohiro Matsuyama
2012-09-30 19:23                 ` Paul Eggert
2012-09-30 20:13                   ` Stefan Monnier
2012-09-30 20:43                     ` Paul Eggert
2012-10-01  0:41                       ` Stefan Monnier
2012-10-01  1:49                         ` Paul Eggert
2012-10-01 14:25                           ` Stefan Monnier
2012-10-01 14:48                             ` Eli Zaretskii
2012-10-01 16:31                               ` Stefan Monnier
2012-10-01 17:19                                 ` Eli Zaretskii
2012-10-01 16:41                               ` Paul Eggert
2012-10-01 17:52                                 ` Eli Zaretskii
2012-10-01 17:59                                   ` Eli Zaretskii
2012-10-01 21:12                                     ` Eli Zaretskii
2012-10-01 22:21                                   ` Paul Eggert
2012-10-02  2:06                                   ` Stefan Monnier
2012-10-02  4:00                                     ` Eli Zaretskii
2012-10-02 12:58                                       ` Stefan Monnier
2012-10-02 17:14                                         ` Eli Zaretskii
2012-10-03  1:27                                           ` Stefan Monnier
2012-10-02  5:28                                     ` Paul Eggert
2012-10-02 13:00                                       ` Stefan Monnier
2012-10-02 16:17                                         ` Paul Eggert
2012-10-02 13:23                                   ` Jason Rumney
2012-10-02 17:20                                     ` Eli Zaretskii
2012-10-03 14:25                                       ` Jason Rumney
2012-10-03 17:43                                         ` Eli Zaretskii
2012-10-01 15:10                             ` Paul Eggert
2012-10-01 10:37                       ` Eli Zaretskii
2012-10-01 11:25                         ` Tomohiro Matsuyama
2012-10-01 14:59                           ` Paul Eggert
2012-10-01 14:54                         ` Paul Eggert
2012-10-01 15:28                           ` Eli Zaretskii
2012-10-01 16:34                             ` Stefan Monnier
2012-10-01 16:51                               ` Paul Eggert
     [not found]                                 ` <1349123390.31932.1@faina>
2012-10-02  6:41                                   ` Paul Eggert
2012-10-02 13:03                                     ` Stefan Monnier
2012-10-03  5:11                                       ` Paul Eggert
2012-10-03 12:50                                         ` Stefan Monnier
2012-10-03 14:06                                           ` Andreas Schwab
2012-10-03 17:22                                             ` Stefan Monnier
2012-10-03 17:39                                           ` Eli Zaretskii
2012-10-01 16:39                             ` Paul Eggert
2012-09-30 22:28                     ` Tomohiro Matsuyama
2012-09-30 22:26                   ` Tomohiro Matsuyama
2012-09-26  7:39         ` Tomohiro Matsuyama
2012-09-26 14:33           ` Paul Eggert
2012-09-26 18:58             ` Ivan Andrus
2012-09-27  1:11               ` Paul Eggert
2012-09-25 20:10       ` Eli Zaretskii
2012-09-26  0:53         ` Stefan Monnier
2012-09-26  0:30       ` Stephen J. Turnbull
2012-09-26  1:00     ` Stefan Monnier
2012-09-26  7:41       ` Eli Zaretskii
2012-09-26 12:55         ` Stefan Monnier
2012-09-26 13:19           ` Eli Zaretskii
2012-09-26 13:26             ` Eli Zaretskii
2012-09-26 17:49               ` Stefan Monnier
2012-09-26 19:31                 ` Eli Zaretskii
2012-09-26 20:20                   ` Stefan Monnier
2012-09-26 22:50                     ` Tomohiro Matsuyama
2012-09-27  4:02                       ` Stefan Monnier
2012-09-30  7:32                         ` Paul Eggert
2012-09-29 15:24                 ` Eli Zaretskii
2012-09-29 16:06                   ` Stefan Monnier
2012-09-29 16:18                     ` Eli Zaretskii
2012-09-29 20:17                       ` Stefan Monnier
2012-09-30 15:54             ` Eli Zaretskii
2012-09-25 13:40   ` Juanma Barranquero

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).