From: Tom Tromey
Newsgroups: gmane.emacs.devel
Subject: Re: Hideously slow VC status queries fixed
Date: Sun, 30 Dec 2007 14:54:21 -0700
To: esr@thyrsus.com
Cc: emacs-devel@gnu.org
References: <20071227001113.3EDFE830B6E@snark.thyrsus.com> <20071227021940.GA17434@thyrsus.com> <20071229214956.GB24787@thyrsus.com>
In-Reply-To: <20071229214956.GB24787@thyrsus.com> (Eric S. Raymond's message of "Sat, 29 Dec 2007 16:49:56 -0500")

>>>>> "Eric" == Eric S Raymond writes:

Eric> I'm going to do another rewrite of vc-dired-hook shortly, and I'd
Eric> appreciate it if you'd profile the results and compare those numbers
Eric> against the baseline you've established.

No problem.

>> I wonder if there is a way to do a lot less work.  For instance, could
>> we have VC look only at files that are not 'up-to-date?  In my tree
>> this would mean processing 24 files -- 3 orders of magnitude fewer.  I
>> think this would be a pretty common result for large trees, since it
>> is rare to have a patch that touches a substantial fraction of gcc.

Eric> I think that you are not quite understanding the problem here -- or at
Eric> least my assumptions about what constitutes "less work" (while
Eric> possibly incorrect) are quite different from yours.

Yeah, I'm sure I don't understand :-).  More on this below.

Eric> By contrast, I think the size of the data returned by the VC status
Eric> command, and the parsing time required for VC mode to pull that data
Eric> into Lisp-space, is much less significant.  Or, to put it a different
Eric> way, I'm assuming that the startup latency of the report generator(s)
Eric> dominates the total time from the C-x v d keystroke to the display
Eric> refresh.

Eric> If you can find a way to crunch your profiling data that
Eric> separates the latency cost of getting the report from the rest
Eric> of the interpretation time, it would be useful to compare the
Eric> two.

I think there is something to this, but it is not the whole picture.

I timed 'svn status -v' on gcc:

opsy. /usr/bin/time svn status -v > /dev/null
0.95user 1.13system 0:16.79elapsed 12%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (37major+6120minor)pagefaults 0swaps

However, this is with a cold cache.  With a warm cache it is
dramatically different:

opsy. /usr/bin/time svn status -v > /dev/null
0.90user 0.33system 0:01.28elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6157minor)pagefaults 0swaps

I also tried it on a much smaller tree I have sitting around:

opsy. /usr/bin/time svn status -v > /dev/null
0.02user 0.04system 0:01.61elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+740minor)pagefaults 0swaps

So we can clearly see here that svn is just a lot slower on a big
tree, at least when cold.  This confirms part of your theory.
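To make the cold/warm comparison above easier to read, here is a small Python sketch that splits each run into CPU time (user + system) and the remainder of the wall-clock time. The summary lines are copied from the three runs quoted above; the names RUNS, PATTERN and split_time are made up for this illustration. Treating the remainder as time spent waiting (most likely on disk I/O in the cold-cache case) is an assumption, not something the time output states directly.

```python
import re

# Summary lines copied from the /usr/bin/time runs quoted above.
RUNS = {
    "gcc, cold cache": "0.95user 1.13system 0:16.79elapsed 12%CPU",
    "gcc, warm cache": "0.90user 0.33system 0:01.28elapsed 96%CPU",
    "small tree": "0.02user 0.04system 0:01.61elapsed 4%CPU",
}

# GNU time prints the elapsed field as [hours:]minutes:seconds.
PATTERN = re.compile(
    r"(?P<user>[\d.]+)user (?P<sys>[\d.]+)system "
    r"(?:(?P<h>\d+):)?(?P<m>\d+):(?P<s>[\d.]+)elapsed"
)


def split_time(line):
    """Return (cpu_seconds, elapsed_seconds, off_cpu_seconds) for one line."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized time output: {line!r}")
    cpu = float(match["user"]) + float(match["sys"])
    elapsed = (
        int(match["h"] or 0) * 3600
        + int(match["m"]) * 60
        + float(match["s"])
    )
    return cpu, elapsed, elapsed - cpu


for label, line in RUNS.items():
    cpu, elapsed, off_cpu = split_time(line)
    print(f"{label:16s} cpu={cpu:.2f}s elapsed={elapsed:.2f}s "
          f"off-cpu={off_cpu:.2f}s")
```

On the gcc tree the CPU time barely changes between the two runs (about 2.1s cold, 1.2s warm), while the elapsed time drops from 16.79s to 1.28s, so nearly all of the cold-cache cost is time during which svn was not running on the CPU.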

However, I also tried this:

    (let ((zz-s-time (current-time)))
      (vc-directory "~/gnu/Trunk/trunk/gcc/" nil)
      (time-subtract (current-time) zz-s-time))

This yielded (0 320 260248), i.e. about 320 seconds, with a warm
cache -- so I think most of the time must be in Emacs processing, not
in svn.  Even with a cold cache this number would be very bad.

I think what I don't understand is why we run 'svn status -v'.  This
will print information about every file.  But, why do we need
information about every file?  Would it be possible to only deal with
files that aren't "up-to-date", i.e., omit the '-v'?  This would seem
to be much more efficient.

That's just an idea though.  I also don't understand everything that
is going on in the elp results, like where all those calls to
vc-call-backend come from.

Tom
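For reference, the arithmetic behind comparing the vc-directory timing with the warm-cache svn run can be sketched in Python as below. It assumes the Emacs time value is the (HIGH LOW USEC) list that current-time returned in Emacs 22, and that a single 'svn status -v' run inside Emacs costs about what it did from the shell (1.28s elapsed); the second assumption is untested, since VC may invoke svn more than once. The function name emacs_time_to_seconds is made up for this illustration.

```python
def emacs_time_to_seconds(high, low, usec=0):
    """Convert an Emacs (HIGH LOW USEC) time list to seconds.

    HIGH counts units of 2**16 seconds, LOW the remaining seconds,
    and USEC the microseconds.
    """
    return high * 65536 + low + usec / 1_000_000


# Value reported for the vc-directory call, warm cache.
total = emacs_time_to_seconds(0, 320, 260248)

# Elapsed time of one warm-cache 'svn status -v' run from the shell.
svn_warm = 1.28

print(f"vc-directory total: {total:.2f}s")
print(f"one svn run is {svn_warm / total:.1%} of that")
```

Under those assumptions a single svn run accounts for well under one percent of the total, which is why the remaining time looks like Emacs-side processing.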