tl;dr:

I'm working on adding one level of context to function calls
in ELP reports, looking for implementation and usability
feedback.

----

As a heavy TRAMP user I have been frustrated the past few
years with slowness, and painstakingly looked for ways to
remedy this.  Along the way I started to explore Emacs' core
file API and with it the magic filename system, specifically
tramp-sh.el.  I quickly came to realize its complexity and
the inadequacy of the Emacs Lisp Profiler in a highly
recursive and interconnected codebase.

This comment from elp.el in particular resonated with me:
;; TBD:
;; Make this act like a real profiler, so that it records time spent
;; in all branches of execution.

I decided to take one step in this direction and change ELP
to have one additional level of context for each result.
For example,

(defun g (n)
  (f n))

(defun f (n)
  (when (not (= n 0))
    (f (1- n))))
 
(g 3)

currently results in this report:

Function Name   Call Count  
g               1          
f               4

My changes produce a report like this:

Function Name   Call Count
g               1  ;; calls from functions not instrumented
`-f             1

f               0  ;; calls from functions not instrumented
`-f             3


I implemented this by changing the elp-timer-info-property
from a vector to a plist of vectors indexed by function
symbol, a.k.a. the "caller".  The advice wrapper sets a
dynamically scoped variable to be the current function's
symbol (the caller), which is used in callee functions to
lookup which timer info vector to update.  At reporting
time, the plists are inverted from callee->caller to
caller->callee.

Before I go any further with this I thought I'd reach out to
the community for some early-middle feedback in case there's
something important I'm missing.  Some topics I'm open to
discuss:

1) Is there a better way to programmatically access the call
   stack than a dynamic binding hack?
2) How can I make the report more intuitive? Specifically
   I'm worried about confusing people with the 0 call count
   for f in the above example.
3) Maybe a hashmap would be better than a plist when the
   number of instrumented functions is large.

Next on my plate:

- Check for unexpected behavior with async functions (mainly
  the aio library)
- Implement sorting by call count, total time & avg time
  (each section is sorted normally, and the entire report is
  sorted by the max value of each section)
- Keybinding to toggle un-inverting the report
- Keybinding to jump to section
- Clean up documentation

Looking forward to any feedback folks want to share.