changeset 654:f221c7b5bdfb
git_handler: use convert_list to cache git objects
getnewgitcommits() traverses the DAG in a way that visits a particular commit SHA
once for each parent that still has to be processed and then once more,
effectively doubling object reads in the standard case with one parent. This
patch makes convert_list a cache for objects, so that a particular Git object is
read just once.
On a mostly linear repository with over 50,000 commits, this brings a no-op hg
pull down from 70 seconds to 38, which is close to half the time, as expected.
Note that even a no-op hg pull currently does a full DAG traversal -- an
upcoming patch will fix this.
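The effect described in the message above can be reproduced with a small, self-contained sketch. This is not the hg-git code itself: `CountingStore`, `new_commits`, `linear_history`, and the `use_cache` flag are hypothetical names introduced for illustration, and the `Commit` tuple stands in for the real Git commit class. Only the loop shape (a LIFO `todo` stack, a `done` set, and `convert_list` doubling as a cache) follows the diff in this changeset.

```python
from collections import namedtuple

# Hypothetical stand-in for a Git commit object: only its parents matter here.
Commit = namedtuple("Commit", ["parents"])


class CountingStore:
    """Hypothetical object store that counts how many reads it serves."""

    def __init__(self, commits):
        self._commits = commits
        self.reads = 0

    def get_object(self, sha):
        self.reads += 1
        return self._commits[sha]


def new_commits(store, heads, use_cache=True):
    """Return SHAs in parents-first order, plus the SHA -> object map."""
    convert_list = {}
    commits = []
    done = set()
    todo = list(heads)
    while todo:
        sha = todo[-1]
        if sha in done:
            todo.pop()
            continue
        if use_cache and sha in convert_list:
            # Revisit of a SHA whose parent was processed in between.
            obj = convert_list[sha]
        else:
            obj = store.get_object(sha)
            convert_list[sha] = obj
        for p in obj.parents:
            if p not in done:
                # Process the parent first; this SHA is revisited afterwards.
                todo.append(p)
                break
        else:
            # All parents are done, so the commit itself can be emitted.
            commits.append(sha)
            done.add(sha)
            todo.pop()
    return commits, convert_list


def linear_history(n):
    """Build a linear chain c0 <- c1 <- ... <- c(n-1)."""
    return {
        "c%d" % i: Commit(parents=["c%d" % (i - 1)] if i else [])
        for i in range(n)
    }
```

On a 100-commit linear chain built with `linear_history(100)`, the uncached variant (`use_cache=False`) performs 199 reads and the cached one performs 100, which matches the roughly halved pull time reported above. The returned order is the same in both cases.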
author | Siddharth Agarwal <sid0@fb.com> |
---|---|
date | Tue, 18 Feb 2014 20:22:13 -0800 |
parents | 4ab616864329 |
children | baba2cf03d41 |
files | hggit/git_handler.py |
diffstat | 1 files changed, 5 insertions(+), 2 deletions(-) |
--- a/hggit/git_handler.py
+++ b/hggit/git_handler.py
@@ -620,7 +620,11 @@
                 todo.pop()
                 continue
             assert isinstance(sha, str)
-            obj = self.git.get_object(sha)
+            if sha in convert_list:
+                obj = convert_list[sha]
+            else:
+                obj = self.git.get_object(sha)
+                convert_list[sha] = obj
             assert isinstance(obj, Commit)
             for p in obj.parents:
                 if p not in done:
@@ -630,7 +634,6 @@
                     break
             else:
                 commits.append(sha)
-                convert_list[sha] = obj
                 done.add(sha)
                 todo.pop()