From mboxrd@z Thu Jan 1 00:00:00 1970
From: James Westby
Newsgroups: gmane.comp.version-control.bazaar-ng.general,gmane.emacs.devel
Subject: Re: Emacs repository benchmark: bzr and git
Date: Tue, 18 Mar 2008 20:22:48 +0000
Message-ID: <1205871768.3304.51.camel@flash>
References: <20080318154316.GA6242@mithlond.arda.local>
In-Reply-To: <20080318154316.GA6242@mithlond.arda.local>
To: Teemu Likonen
Cc: bazaar@lists.canonical.com, emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-khoCAOCkf9mJH8GS1HQv"
X-Mailer: Evolution 2.22.0
List-Id: bazaar discussion

--=-khoCAOCkf9mJH8GS1HQv
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

On Tue, 2008-03-18 at 17:43 +0200, Teemu Likonen wrote:
> I did some benchmarking in git and bzr repositories of Emacs. Some
> numbers: 89711 revisions (by "git log --pretty=oneline | wc -l"), 2825
> files. Both repositories seem to have just linear history converted from
> CVS repo. Both have the same head revision which is
> 481c2a1e31f32c8aa0fb6d504575b75a18537788 (git) and
> revid:cvs-1:tsdh-20080318180244-lxbzttdnh6ecqbka (bzr).
>

Hi,

Thanks for the concrete numbers that we can work from.

So, I hacked up a quick plugin to output logs in a flat style like git.
I've attached it to this mail. You can install it by dropping it into
~/.bazaar/plugins.

It is totally unintegrated with bzr's log framework. I plan on
rectifying that and proposing it for inclusion. It also has some nasty
UI warts, like using --end revid to limit the revisions that you want
to see to ones that are neither that revid nor its parents. This means
you can't do things like

  bzr flatlog -r-100..
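
To make the --end cut-off concrete, here is a minimal sketch of the same
kind of walk on a toy history. It is an illustration only: `walk_until`
and the plain-dict `parent_map` are hypothetical stand-ins and do not use
bzrlib at all. The sketch only stops the walk at `end`; it makes no claim
about how the attached plugin behaves on histories with merges.

```python
from collections import deque


def walk_until(parent_map, start, end=None, limit=None):
    """Yield revision ids from `start` backwards, children before parents.

    `parent_map` is a hypothetical dict mapping a revision id to a list of
    its parent ids. The walk never crosses into `end`.
    """
    if start is None or start == end:
        return
    # First pass: count how many reachable children point at each revision.
    indegree = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parent_map.get(node, ()):
            if parent == end:
                continue
            first_visit = parent not in indegree
            indegree[parent] = indegree.get(parent, 0) + 1
            if first_visit:
                queue.append(parent)
    # Second pass: emit a revision once all of its children were emitted.
    queue = deque([start])
    emitted = 0
    while queue:
        node = queue.popleft()
        yield node
        emitted += 1
        if limit is not None and emitted >= limit:
            return
        for parent in parent_map.get(node, ()):
            if parent == end:
                continue
            indegree[parent] -= 1
            if indegree[parent] == 0:
                queue.append(parent)


# Toy linear history: r5 is the tip, r1 the first revision.
linear = {"r5": ["r4"], "r4": ["r3"], "r3": ["r2"], "r2": ["r1"], "r1": []}
print(list(walk_until(linear, "r5", end="r3")))
```

On a linear history like the Emacs conversion, stopping at `end` hides that
revision and everything older, which is why a fixed --end value works as a
cheap "recent history only" filter.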

I have included the plugin's numbers inline with yours.

>
> Viewing history
> ---------------
>
>
> The complete history:
>
> $ time git log >/dev/null
> real 0m5.741s
>
> $ time bzr log >/dev/null
> real 3m15.708s
>

  bzr flatlog > /dev/null
  45.17s user 0.75s system 97% cpu 46.883 total

>
> Last 100 revisions:
>
> $ time git log -100 >/dev/null
> real 0m0.011s
>
> $ time bzr log -l100 >/dev/null
> real 2m10.270s
>

  bzr flatlog -l100 > /dev/null
  12.11s user 0.32s system 97% cpu 12.714 total

>
> Last 10 revisions:
>
> $ time git log -10 >/dev/null
> real 0m0.007s
>
> $ time bzr log -l10 >/dev/null
> real 2m9.163s
>

  bzr flatlog -l10 > /dev/null
  12.18s user 0.25s system 98% cpu 12.582 total

I can also suggest a workaround for anyone who is using bzr and wants
to speed up log further while we work on it. If you edit
~/.bazaar/bazaar.conf and add

  [ALIASES]
  flatlog = flatlog --end cvs-1:bastien1-20080217010841-op363t09ccs7pais

you can get

  bzr flatlog --end cvs-1:bastien1-20080217010841-op363t09ccs7pais > /dev/null
  0.77s user 0.08s system 96% cpu 0.876 total

and still have 1000 revisions to look at. I'll work on getting -r
properly integrated, which means you could have this alias set at all
times, and then if you needed full history you could do

  bzr flatlog -r1..

>
> Creating a branch
> -----------------
>
> With git I chose "git checkout -b" instead of "git branch" because the
> former also checks out the files as does "bzr branch". The bzr branch is
> created inside the same shared repository so that the common objects are
> shared.

bzr creates a second working tree; git replaces your first.

>
>
> Create new topic branch based on the head revision of the main
> development branch:
>
> $ time git checkout -b topic master >/dev/null
> real 0m0.062s
>
> $ time bzr branch trunk topic >/dev/null
> real 0m7.249s

To compare the cost of getting a second working tree, you can use

  git clone emacs temp2
  1.70s user 1.29s system 31% cpu 9.395 total

(it hardlinks the objects rather than just re-using them as bzr does,
so it's still not the same)

However this is a bit silly, as you are just comparing the usual ways
to get a new branch in each particular VCS. It is possible to emulate
the git way using a couple of bzr commands if you prefer to work in one
directory with one working tree. I'm afraid I don't currently have
benchmark numbers for that, though.

>
>
> Create new topic branch based on earlier revision of main development
> branch:
>
> $ time git checkout -b topic master~4 >/dev/null
> real 0m0.085s
>
> $ time bzr branch -r -5 trunk topic >/dev/null
> real 2m51.551s
>

A patch was proposed a couple of days ago to tackle an inefficiency in
exactly this command.

>
>
> Compare branches' commit histories
> ----------------------------------
>
> In above benchmark I created branch 'topic' which is based on earlier
> revision of main development branch. In this test I compared commands
> which display commits that are missing from 'topic' branch compared to
> the main development branch (four commits in total).
>
>
> $ time git log topic..master >/dev/null
> real 0m0.006s
>
> $ time bzr missing --theirs-only ../trunk >/dev/null
> real 18m25.173s
>

An inefficiency was also highlighted here a couple of days ago. I think
something was proposed that we could switch this and other commands to,
which would speed them up greatly.

If you want to talk about bzr's performance or features I will be happy
to do so, but please drop the emacs list from the discussion.

Thanks,

James

--=-khoCAOCkf9mJH8GS1HQv
Content-Disposition: attachment; filename=flatlog.py
Content-Type: text/x-python; name=flatlog.py; charset=utf-8
Content-Transfer-Encoding: 7bit

from bzrlib import (
    branch,
    builtins,
    commands,
    errors,
    revision as _mod_revision,
    option,
    osutils,
    )


class cmd_flatlog(commands.Command):
    """Output the log in a flat format."""

    takes_options = [
        option.Option('limit', short_name='l',
                      help='Limit the output to the first N revisions.',
                      argname='N',
                      type=builtins._parse_limit,
                      ),
        option.Option('start', type=str,
                      help="Revision id to start from (default: branch tip).",
                      ),
        option.Option('end', type=str,
                      help="Revision id to stop at; it is not shown.",
                      ),
        option.Option('timezone', type=str,
                      help="Display timezone as local, original or utc",
                      ),
        option.Option('forward',
                      help="Display the revisions from oldest to newest",
                      ),
        ]

    def iter_all_history_topo_sorted(self, repo, revision_id, limit=None,
                                     end=None, no_merges=False):
        """Yield revision ids newest first, children before their parents."""
        if revision_id in (None, _mod_revision.NULL_REVISION):
            return
        if revision_id == end:
            return
        if limit is not None:
            if limit <= 0:
                raise errors.BzrCommandError("limit option must be positive")
            if limit == 1:
                yield revision_id
                return
        graph = repo.get_graph()
        # First pass: record each revision's parents and count how many
        # children refer to it (its in-degree).
        to_process = [revision_id]
        indegree = {revision_id: 0}
        parents_map = {}
        while to_process:
            node = to_process.pop(0)
            parent_map = graph.get_parent_map([node])
            if node not in parent_map:  # ghost
                continue
            parents = parent_map[node]
            parents_map[node] = parents
            for parent in parents:
                if parent == end:
                    continue
                done = indegree.setdefault(parent, 0)
                indegree[parent] += 1
                if done == 0:
                    to_process.append(parent)
        # Second pass: emit a revision once all of its children were emitted.
        to_process = [revision_id]
        while to_process:
            node = to_process.pop(0)
            parents = parents_map[node]
            if not no_merges or len(parents) < 2:
                yield node
                if limit is not None:
                    limit -= 1
                    if limit == 0:
                        return
            for parent in parents:
                if parent == end or parent == _mod_revision.NULL_REVISION:
                    continue
                if parent in parents_map:  # not ghost
                    indegree[parent] -= 1
                    if indegree[parent] == 0:
                        to_process.append(parent)

    def iter_all_history_topo_sorted_forward(self, repo, revision_id,
                                             end=None, limit=None,
                                             no_merges=False):
        """Yield revision ids oldest first (--end is not supported here)."""
        if revision_id in (None, _mod_revision.NULL_REVISION):
            return
        if revision_id == end:
            return
        if limit is not None:
            if limit <= 0:
                raise errors.BzrCommandError("limit option must be positive")
        # Collect every non-ghost ancestor, then let the graph sort them.
        to_process = [revision_id]
        graph = repo.get_graph()
        revs = set()
        while to_process:
            node = to_process.pop()
            parent_map = graph.get_parent_map([node])
            if node not in parent_map:  # ghost
                continue
            revs.add(node)
            parents = parent_map[node]
            for parent in parents:
                if (parent != _mod_revision.NULL_REVISION
                        and parent not in revs):
                    to_process.append(parent)
        for rev_id in graph.iter_topo_order(revs):
            if no_merges:
                # Check the revision being emitted, not the last node
                # visited by the collection loop above.
                parent_map = graph.get_parent_map([rev_id])
                if len(parent_map[rev_id]) > 1:
                    continue
            yield rev_id
            if limit is not None:
                limit -= 1
                if limit == 0:
                    return

    def iter_revs(self, repo, start, end, limit, forward, no_merges):
        """Yield Revision objects, fetching them in growing batches."""
        num = 9
        i = 0
        rev_ids = []
        if forward:
            rev_generator = self.iter_all_history_topo_sorted_forward(
                repo, start, limit=limit, end=end, no_merges=no_merges)
        else:
            rev_generator = self.iter_all_history_topo_sorted(
                repo, start, limit=limit, end=end, no_merges=no_merges)
        for rev_id in rev_generator:
            rev_ids.append(rev_id)
            i += 1
            if i == num:
                revs = repo.get_revisions(rev_ids)
                for rev in revs:
                    yield rev
                i = 0
                rev_ids = []
                # Start small so the first output appears quickly, then
                # grow the batch size up to a cap of 200.
                num = min(int(num * 1.5), 200)
        # Flush whatever is left in the final, partial batch.
        revs = repo.get_revisions(rev_ids)
        for rev in revs:
            yield rev

    def run(self, limit=None, start=None, end=None, timezone=None,
            forward=False, no_merges=False):
        b = branch.Branch.open_containing('.')[0]
        if timezone is None:
            timezone = "original"
        if (start is not None or end is not None) and forward:
            raise errors.BzrCommandError(
                "--forward cannot be combined with --start or --end yet")
        repo = b.repository
        repo.lock_read()
        try:
            if start is None:
                start = b.last_revision()
            for rev in self.iter_revs(repo, start, end, limit, forward,
                                      no_merges):
                self.outf.write("commit %s\nAuthor: %s\nDate: %s\n\n"
                                % (rev.revision_id,
                                   rev.get_apparent_author(),
                                   osutils.format_date(rev.timestamp,
                                                       rev.timezone or 0,
                                                       timezone)))
                if not rev.message:
                    self.outf.write(" (no message)\n")
                else:
                    for l in rev.message.split('\n'):
                        self.outf.write(" %s\n" % l)
                self.outf.write("\n")
        finally:
            repo.unlock()


commands.register_command(cmd_flatlog)

--=-khoCAOCkf9mJH8GS1HQv--
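
The attached plugin's iter_revs fetches revisions in batches that start
at 9 and grow by a factor of 1.5 up to a cap of 200. The following is a
rough sketch of that schedule only, so the batch sizes can be inspected
without a repository. `batch_sizes` is a hypothetical helper that is not
part of the plugin, and its default arguments simply copy the constants
used in iter_revs.

```python
def batch_sizes(total, first=9, growth=1.5, cap=200):
    """Return the sizes of successive fetches needed for `total` revisions."""
    sizes = []
    size = first
    remaining = total
    while remaining > 0:
        # The last batch may be smaller than the scheduled size.
        take = min(size, remaining)
        sizes.append(take)
        remaining -= take
        # Same growth rule as the plugin: truncate, then cap.
        size = min(int(size * growth), cap)
    return sizes


print(batch_sizes(100))
```

A small first batch means the first few log entries can be written out
early, while the growing batch size keeps the number of separate fetches
low for long histories.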