unofficial mirror of emacs-devel@gnu.org 
From: Tom Tromey <tromey@redhat.com>
To: esr@thyrsus.com
Cc: emacs-devel@gnu.org
Subject: Re: Hideously slow VC status queries fixed
Date: Sun, 30 Dec 2007 14:54:21 -0700
Message-ID: <m3k5mvetma.fsf@fleche.redhat.com>
In-Reply-To: <20071229214956.GB24787@thyrsus.com> (Eric S. Raymond's message of "Sat, 29 Dec 2007 16:49:56 -0500")

>>>>> "Eric" == Eric S Raymond <esr@thyrsus.com> writes:

Eric> I'm going to do another rewrite of vc-dired-hook shortly, and I'd
Eric> appreciate it if you'd profile the results and compare those numbers
Eric> against the baseline you've established.

No problem.

>> I wonder if there is a way to do a lot less work.  For instance, could
>> we have VC look only at files that are not 'up-to-date?  In my tree
>> this would mean processing 24 files -- 3 orders of magnitude fewer.  I
>> think this would be a pretty common result for large trees, since it
>> is rare to have a patch that touches a substantial fraction of gcc.

Eric> I think that you are not quite understanding the problem here -- or at
Eric> least my assumptions about what constitutes "less work" (while
Eric> possibly incorrect) are quite different from yours.

Yeah, I'm sure I don't understand :-).  More on this below.

Eric> By contrast, I think the size of the data returned by the VC status
Eric> command, and the parsing time required for VC mode to pull that data
Eric> into Lisp-space, is much less significant.  Or, to put it a different
Eric> way, I'm assuming that the startup latency of the report generator(s)
Eric> dominates the total time from the C-x v d keystroke to the display
Eric> refresh.

Eric> If you can find a way to crunch your profiling data that
Eric> separates the latency cost of getting the report from the rest
Eric> of the interpretation time, it would be useful to compare the
Eric> two.

I think there is something to this, but it is not the whole picture.

I timed 'svn status -v' on gcc:

opsy. /usr/bin/time svn status -v > /dev/null
0.95user 1.13system 0:16.79elapsed 12%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (37major+6120minor)pagefaults 0swaps

However, this is with a cold cache.  With a warm cache it is
dramatically different:

opsy. /usr/bin/time svn status -v > /dev/null
0.90user 0.33system 0:01.28elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6157minor)pagefaults 0swaps


I also tried it on a much smaller tree I have sitting around:

opsy. /usr/bin/time svn status -v > /dev/null
0.02user 0.04system 0:01.61elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+740minor)pagefaults 0swaps

So we can clearly see here that svn is just a lot slower on a big
tree, at least when cold.  This confirms part of your theory.


However, I also tried this:

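    ;; Time one full vc-directory run on the gcc tree; the result is
    ;; an Emacs time value, (HIGH LOW USEC).
    (let ((zz-s-time (current-time)))
      (vc-directory "~/gnu/Trunk/trunk/gcc/" nil)
      (time-subtract (current-time) zz-s-time))
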
This yielded (0 320 260248), i.e. about 320 seconds, with a warm
cache -- so I think most of the time must be in Emacs processing, not
in svn.  Even with a cold cache this number would be very bad.
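
To make the comparison concrete, here is a small sketch that converts
the time value above to seconds and works out what share of it the
warm-cache svn run accounts for.  It assumes the classic three-element
Emacs time format, where seconds = HIGH * 65536 + LOW; the function
names are made up for this illustration.

```shell
# Convert an Emacs (HIGH LOW USEC) time value to seconds.
# Assumption: classic three-element format, seconds = HIGH * 65536 + LOW.
emacs_time_to_seconds() {
    printf '%d.%06d\n' "$(( $1 * 65536 + $2 ))" "$3"
}

# Percentage of TOTAL that PART accounts for, to one decimal place.
percent_of() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

total=$(emacs_time_to_seconds 0 320 260248)
echo "vc-directory total: ${total}s"                        # 320.260248s
echo "warm svn share:     $(percent_of 1.28 "$total")%"     # 0.4%
```

On these numbers, the 1.28 seconds of warm-cache svn time is well
under one percent of the total; the rest is spent on the Emacs side.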


I think what I don't understand is why we run 'svn status -v'.  This
will print information about every file.  But, why do we need
information about every file?  Would it be possible to only deal with
files that aren't "up-to-date", i.e., omit the '-v'?  This would seem
to be much more efficient.
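
One rough way to estimate how much smaller the report would be is to
filter the verbose output down to entries whose status columns are not
all blank.  This is only a sketch, not what VC does: it assumes the
status flags sit in the first seven columns of each line (the exact
layout varies between svn versions), and both the helper name and the
sample lines are made up for illustration.

```shell
# Keep only entries with at least one non-blank status flag.
# Assumption: the status flags occupy the first seven columns.
interesting_entries() {
    awk 'substr($0, 1, 7) ~ /[^ ]/'
}

# Made-up sample in the general shape of 'svn status -v' output:
# one modified file, one up-to-date file, one unversioned file.
printf '%s\n' \
    'M               44        23    sally     README' \
    '                44        30    sally     INSTALL' \
    '?                                         scratch.txt' \
    | interesting_entries
```

In a real working copy, comparing 'svn status -v | wc -l' with
'svn status -v | interesting_entries | wc -l' would show how many
lines Emacs has to parse in each case.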

That's just an idea though.  I also don't understand everything that
is going on in the elp results, like where all those calls to
vc-call-backend come from.

Tom
