* Random idea about speeding up guix pull
@ 2017-09-03 14:27 Hartmut Goebel
2017-09-03 14:38 ` ng0
2017-09-04 15:01 ` Ludovic Courtès
0 siblings, 2 replies; 9+ messages in thread
From: Hartmut Goebel @ 2017-09-03 14:27 UTC (permalink / raw)
To: guix-devel
Hi,
I've seen in Ludo's slides that speeding up guix pull is a topic. Here is
a random idea on that:
"git pull" can probably be sped up by using something like
git init .
git remote add …
git fetch --depth=1 origin master
git checkout FETCH_HEAD
This will only download the top-most commit, or rather the tree state at that commit.
From my mostly up-to-date clone, this method downloads only 1559 objects
and 'du -s .git' reports 13M – compared to "git pull" downloading 133284
objects and taking 49M.
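The effect of --depth=1 can be reproduced offline with a minimal sketch like the one below. It does not touch the Guix repository: the throwaway upstream, its five commits and all directory names are assumptions made up for illustration, and it compares commit counts rather than the object counts and sizes quoted above.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# Build a small throwaway upstream with five commits on master.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.invalid
git config commit.gpgsign false
for i in 1 2 3 4 5; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# A file:// URL goes through git's regular pack transport,
# so --depth is honoured.
url="file://$work/upstream"

# Shallow fetch, following the recipe from the mail.
git init -q "$work/shallow"
cd "$work/shallow"
git remote add origin "$url"
git fetch -q --depth=1 origin master
git checkout -q FETCH_HEAD
shallow_commits=$(git rev-list --count HEAD)

# Same steps without --depth, for comparison.
git init -q "$work/full"
cd "$work/full"
git remote add origin "$url"
git fetch -q origin master
git checkout -q FETCH_HEAD
full_commits=$(git rev-list --count HEAD)

echo "shallow: $shallow_commits commit(s), full: $full_commits commit(s)"
```

On a real repository the saving depends on how much history sits behind the fetched commit.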
We could use this for downloading source code via git (git-download).
--
Regards
Hartmut Goebel
| Hartmut Goebel | h.goebel@crazy-compilers.com |
| www.crazy-compilers.com | compilers which you thought are impossible |
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-03 14:27 Random idea about speeding up guix pull Hartmut Goebel
@ 2017-09-03 14:38 ` ng0
2017-09-04 15:01 ` Ludovic Courtès
1 sibling, 0 replies; 9+ messages in thread
From: ng0 @ 2017-09-03 14:38 UTC (permalink / raw)
To: Hartmut Goebel; +Cc: guix-devel
Hartmut Goebel transcribed 15K bytes:
> Hi,
>
> I've seen in Ludo's slides that speeding up guix pull is a topic. Here is
> a random idea on that:
>
> "git pull" can probably be sped up by using something like
>
> git init .
> git remote add …
> git fetch --depth=1 origin master
> git checkout FETCH_HEAD
>
> This will only download the top-most commit, or rather the tree state at that commit.
>
> From my mostly up-to-date clone, this method downloads only 1559 objects
> and 'du -s .git' reports 13M – compared to "git pull" downloading 133284
> objects and taking 49M.
Yes, that would make many git clones take less space.
> We could use this for downloading source code via git (git-download).
Andy Wingo proposed this in the past and had a patch that worked back
in 2015. If you are motivated enough to update it, it's still on the
list, but git-download and the other file it touches have changed a lot
since 2015.
> --
> Regards
> Hartmut Goebel
>
> | Hartmut Goebel | h.goebel@crazy-compilers.com |
> | www.crazy-compilers.com | compilers which you thought are impossible |
>
--
ng0
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://n0is.noblogs.org/my-keys
https://www.infotropique.org https://krosos.org
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-03 14:27 Random idea about speeding up guix pull Hartmut Goebel
2017-09-03 14:38 ` ng0
@ 2017-09-04 15:01 ` Ludovic Courtès
2017-09-04 15:39 ` Hartmut Goebel
1 sibling, 1 reply; 9+ messages in thread
From: Ludovic Courtès @ 2017-09-04 15:01 UTC (permalink / raw)
To: Hartmut Goebel; +Cc: guix-devel
Heya,
Hartmut Goebel <h.goebel@crazy-compilers.com> skribis:
> I've seen in Ludo's slides that speeding up guix pull is a topic. Here is
> a random idea on that:
>
> "git pull" can probably be sped up by using something like
>
> git init .
> git remote add …
> git fetch --depth=1 origin master
> git checkout FETCH_HEAD
>
> This will only download the top-most commit, or rather the tree state at that commit.
That’s roughly what ‘guix pull’ does nowadays, now that it uses
Guile-Git.
The problem is elsewhere: it’s compiling Guix’s Scheme code that takes
ages, in particular since we switched to Guile 2.2. (Guile 2.2’s fancy
compiler gives us significant speedups at run time on core Guix, but
it’s also slower when compiling simple code like package definitions.)
Ludo’.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-04 15:01 ` Ludovic Courtès
@ 2017-09-04 15:39 ` Hartmut Goebel
2017-09-04 21:56 ` Ludovic Courtès
0 siblings, 1 reply; 9+ messages in thread
From: Hartmut Goebel @ 2017-09-04 15:39 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Am 04.09.2017 um 17:01 schrieb Ludovic Courtès:
> That’s roughly what ‘guix pull’ does nowadays, now that it uses
> Guile-Git.
Does it? I only found the call to `remote-fetch` in guix/git.scm, which
is not passed any options.
The trick is to use `--depth=1` and fetch only the one expected commit, tag
or branch head.
--
Regards
Hartmut Goebel
| Hartmut Goebel | h.goebel@crazy-compilers.com |
| www.crazy-compilers.com | compilers which you thought are impossible |
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-04 15:39 ` Hartmut Goebel
@ 2017-09-04 21:56 ` Ludovic Courtès
2017-09-05 12:23 ` Hartmut Goebel
0 siblings, 1 reply; 9+ messages in thread
From: Ludovic Courtès @ 2017-09-04 21:56 UTC (permalink / raw)
To: Hartmut Goebel; +Cc: guix-devel
Hartmut Goebel <h.goebel@crazy-compilers.com> skribis:
> Am 04.09.2017 um 17:01 schrieb Ludovic Courtès:
>> That’s roughly what ‘guix pull’ does nowadays, now that it uses
>> Guile-Git.
>
> Does it? I only found the call to `remote-fetch` in guix/git.scm, which
> is not passed any options.
>
> The trick is to use `--depth=1` and fetch only the one expected commit, tag
> or branch head.
Oh right, it doesn’t do that.
What it does do is maintain a cached checkout in ~/.cache/guix/pull,
which makes subsequent pulls much faster.
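The cached-checkout idea can be sketched with plain git as follows. This is only an illustration under stated assumptions: it is not the Guile-Git code that "guix pull" runs, the update_cache helper is hypothetical, and a temporary directory stands in for ~/.cache/guix/pull.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
up="$work/upstream"

# Throwaway upstream repository (hypothetical, for illustration only).
git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
git -C "$up" config user.name demo
git -C "$up" config user.email demo@example.invalid
git -C "$up" config commit.gpgsign false

add_commit() {
    # $1 = commit message, also used as the file content
    echo "$1" > "$up/file.txt"
    git -C "$up" add file.txt
    git -C "$up" commit -q -m "$1"
}
for i in 1 2 3; do add_commit "commit $i"; done

url="file://$up"
cache="$work/cache"   # stand-in for ~/.cache/guix/pull

update_cache() {
    if [ -d "$cache/.git" ]; then
        # Cache exists: only objects missing locally are transferred.
        git -C "$cache" fetch -q origin
    else
        # First run: a full clone populates the cache.
        git clone -q "$url" "$cache"
    fi
    git -C "$cache" checkout -q --detach origin/master
}

update_cache
first_count=$(git -C "$cache" rev-list --count HEAD)

# Upstream moves on; the second update reuses the cached objects.
for i in 4 5; do add_commit "commit $i"; done
update_cache
second_count=$(git -C "$cache" rev-list --count HEAD)

echo "after first pull: $first_count commits, after second: $second_count"
```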
Does that make sense?
Ludo’.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-04 21:56 ` Ludovic Courtès
@ 2017-09-05 12:23 ` Hartmut Goebel
2017-09-05 14:33 ` Ludovic Courtès
0 siblings, 1 reply; 9+ messages in thread
From: Hartmut Goebel @ 2017-09-05 12:23 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Am 04.09.2017 um 23:56 schrieb Ludovic Courtès:
> What it does do is maintain a cached checkout in ~/.cache/guix/pull,
> which makes subsequent pulls much faster.
Summary (TL;DR):
* "guix pull" should use "git fetch master"
* for "guix download" we can keep the current behaviour
I did a series of tests:
* "fetch" without any argument will fetch *all* data from *all*
branches.
* "fetch master" only fetches data living on "master"; other
branches are ignored
I compared the data fetched for a repo with status of 6bd1c41e8
(yesterday 05:29):
* "fetch" fetches 1000K
* "fetch master" fetches 755K
* "fetch --depth=1 master" fetches 588K (but see below)
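The difference between fetching everything and fetching a single branch can be reproduced on a toy repository. The sketch below is an assumption-laden illustration: the upstream, its "wip" side branch and the count_objects helper are invented here, and "fetch master" is spelled out as "git fetch origin master", with the remote named origin as in the recipe that started this thread.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
up="$work/upstream"

# Throwaway upstream: three commits on master, four more on a side branch.
git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
git -C "$up" config user.name demo
git -C "$up" config user.email demo@example.invalid
git -C "$up" config commit.gpgsign false

commit_file() {
    # $1 = file name, $2 = content (also used as commit message)
    echo "$2" > "$up/$1"
    git -C "$up" add "$1"
    git -C "$up" commit -q -m "$2"
}
for i in 1 2 3; do commit_file file.txt "master $i"; done
git -C "$up" checkout -q -b wip
for i in 1 2 3 4; do commit_file other.txt "wip $i"; done
git -C "$up" checkout -q master
url="file://$up"

count_objects() {
    # Sum of loose and packed objects in the repository at $1.
    git -C "$1" count-objects -v |
        awk '$1 == "count:" || $1 == "in-pack:" { n += $2 } END { print n + 0 }'
}

# "fetch" without a refspec: every branch is downloaded.
git init -q "$work/all"
git -C "$work/all" remote add origin "$url"
git -C "$work/all" fetch -q origin
all_objects=$(count_objects "$work/all")

# "fetch master": only objects reachable from master are downloaded.
git init -q "$work/single"
git -C "$work/single" remote add origin "$url"
git -C "$work/single" fetch -q origin master
single_objects=$(count_objects "$work/single")

echo "all branches: $all_objects objects, master only: $single_objects objects"
```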
I did some more tests (see results below and attached script) and had
the following insights:
* if FETCH_HEAD is not checked out after a fetch, the next fetch will
download all data again (compare "fetch by ref" with "fetch by ref +
checkout")
* --depth=1 will download the *whole* state (at the given ref), no
matter how much of the data is already present (compare "fetch by ref +
checkout" with "fetch --depth=1 by ref + checkout")
* I was not able to create a test case where "fetch --depth=1 master"
would only fetch parts of the data, so this contradicts the results
when updating from 6bd1c41e8.
I suggest making "guix pull" fetch only from "master", since this
already reduces the size of the downloaded data.
For guix download we don't (need to) cache former downloads, thus
"--depth=1 <commit>" would suffice. Unfortunately this only works for
branches and tags, not for commit-ids (see "man git-fetch-pack" for
exceptions). But most current package definitions are based on
commit-ids. Thus it is not worth trying "--depth=1 <commit>" first.
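For illustration, the "try --depth=1 <commit> first, then fall back" strategy weighed here could look like the sketch below. The fetch_commit helper and the throwaway upstream are hypothetical, and this is not how git-download is implemented. Whether the shallow attempt succeeds for a commit that is not a branch or tag tip depends on the git version and on server configuration, so the script makes no claim about which path is taken.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
up="$work/upstream"

# Throwaway upstream with three commits (hypothetical, for illustration).
git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
git -C "$up" config user.name demo
git -C "$up" config user.email demo@example.invalid
git -C "$up" config commit.gpgsign false
for i in 1 2 3; do
    echo "revision $i" > "$up/file.txt"
    git -C "$up" add file.txt
    git -C "$up" commit -q -m "commit $i"
done

# A commit that is neither a branch head nor a tag.
commit=$(git -C "$up" rev-parse master~1)

fetch_commit() {
    # $1 = repository URL, $2 = full commit id, $3 = target directory
    git init -q "$3"
    git -C "$3" remote add origin "$1"
    if git -C "$3" fetch -q --depth=1 origin "$2" 2>/dev/null; then
        method=shallow
    else
        # The server refused the bare commit id: fetch all branches instead.
        git -C "$3" fetch -q origin
        method=full
    fi
    git -C "$3" checkout -q "$2"
}

fetch_commit "file://$up" "$commit" "$work/checkout"
echo "checked out $commit via $method fetch"
```

The fallback costs one extra round trip whenever the shallow attempt is refused, which is the overhead the paragraph above argues against.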
cloned repo ---------------
size 45M
fetch all ------------------
size 45M
fetch by ref ------------------
size v0.11.0 26M
size v0.12.0 32M
size v0.13.0 40M
size marker-1 45M
size marker-2 45M
size marker-3 45M
size marker-4 45M
size marker-5 45M
size master 45M
fetch by ref + checkout ------------------
size v0.11.0 26M
size v0.12.0 11M
size v0.13.0 12M
size marker-1 8,9M
size marker-2 1,1M
size marker-3 856K
size marker-4 856K
size marker-5 1,1M
size master 1,1M
fetch --depth=1 by ref ------------------
size v0.11.0 9,8M
size v0.12.0 11M
size v0.13.0 13M
size marker-1 13M
size marker-2 13M
size marker-3 13M
size marker-4 13M
size marker-5 13M
size master 13M
fetch --depth=1 by ref + checkout ------------------
size v0.11.0 9,8M
size v0.12.0 3,8M
size v0.13.0 5,6M
size marker-1 4,1M
size marker-2 4,1M
size marker-3 4,1M
size marker-4 4,1M
size marker-5 4,1M
size master 4,1M
fetch older all and master with --depth=1 by ref + checkout
------------------
size master 45M
size master 45M
--
Regards
Hartmut Goebel
| Hartmut Goebel | h.goebel@crazy-compilers.com |
| www.crazy-compilers.com | compilers which you thought are impossible |
[-- Attachment #2: test-fetch.sh --]
[-- Type: application/x-shellscript, Size: 2590 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-05 12:23 ` Hartmut Goebel
@ 2017-09-05 14:33 ` Ludovic Courtès
2017-09-05 14:51 ` Hartmut Goebel
0 siblings, 1 reply; 9+ messages in thread
From: Ludovic Courtès @ 2017-09-05 14:33 UTC (permalink / raw)
To: Hartmut Goebel; +Cc: guix-devel
Hartmut Goebel <h.goebel@crazy-compilers.com> skribis:
> Am 04.09.2017 um 23:56 schrieb Ludovic Courtès:
>> What it does do is maintain a cached checkout in ~/.cache/guix/pull,
>> which makes subsequent pulls much faster.
>
> Summary (TL;DR):
>
> * "guix pull" should use "git fetch master"
> * for "guix download" we can keep the current behaviour
>
> I did a series of tests:
>
> * "fetch" without any argument will fetch *all* data from *all*
> branches.
> * "fetch master" only fetches data living on "master"; other
> branches are ignored
>
> I compared the data fetched for a repo with status of 6bd1c41e8
> (yesterday 05:29):
>
> * "fetch" fetches 1000K
> * "fetch master" fetches 755K
> * "fetch --depth=1 master" fetches 588K (but see below)
Thanks for the detailed analysis!
The problem is that libgit2 doesn’t support shallow clones, and it’s
unclear whether it will support them in the future:
https://github.com/libgit2/libgit2/issues/3058
:-/
Ludo’.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-05 14:33 ` Ludovic Courtès
@ 2017-09-05 14:51 ` Hartmut Goebel
2017-09-07 8:28 ` Ludovic Courtès
0 siblings, 1 reply; 9+ messages in thread
From: Hartmut Goebel @ 2017-09-05 14:51 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel
Am 05.09.2017 um 16:33 schrieb Ludovic Courtès:
> The problem is that libgit2 doesn’t support shallow clones, and it’s
> unclear whether it will support them in the future:
Maybe I'm wrong, but to my understanding fetching a single branch/tag is
not a "shallow clone", is it?
--
Regards
Hartmut Goebel
| Hartmut Goebel | h.goebel@crazy-compilers.com |
| www.crazy-compilers.com | compilers which you thought are impossible |
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Random idea about speeding up guix pull
2017-09-05 14:51 ` Hartmut Goebel
@ 2017-09-07 8:28 ` Ludovic Courtès
0 siblings, 0 replies; 9+ messages in thread
From: Ludovic Courtès @ 2017-09-07 8:28 UTC (permalink / raw)
To: Hartmut Goebel; +Cc: guix-devel
Hartmut Goebel <h.goebel@crazy-compilers.com> skribis:
> Am 05.09.2017 um 16:33 schrieb Ludovic Courtès:
>> The problem is that libgit2 doesn’t support shallow clones, and it’s
>> unclear whether it will support them in the future:
>
> Maybe I'm wrong, but to my understanding fetching a single branch/tag is
> not a "shallow clone", is it?
I think it is, in the sense that just a subset of the Git object graph
is fetched, but I’m not 100% sure about the terminology.
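On the terminology: in git's own documentation, "shallow" refers to a repository whose history has been truncated (for example with --depth), and git records the cut-off commits in the file .git/shallow. A single-branch fetch without --depth keeps the full history of that branch and creates no such file. The sketch below checks this on a throwaway repository; the upstream and the is_shallow helper are assumptions for illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
up="$work/upstream"

# Throwaway upstream with three commits on master.
git init -q "$up"
git -C "$up" symbolic-ref HEAD refs/heads/master
git -C "$up" config user.name demo
git -C "$up" config user.email demo@example.invalid
git -C "$up" config commit.gpgsign false
for i in 1 2 3; do
    echo "revision $i" > "$up/file.txt"
    git -C "$up" add file.txt
    git -C "$up" commit -q -m "commit $i"
done
url="file://$up"

is_shallow() {
    # git records truncated history in .git/shallow
    if [ -f "$1/.git/shallow" ]; then echo yes; else echo no; fi
}

# Single-branch fetch: full history of master, nothing else.
git init -q "$work/single"
git -C "$work/single" remote add origin "$url"
git -C "$work/single" fetch -q origin master
single_shallow=$(is_shallow "$work/single")
single_commits=$(git -C "$work/single" rev-list --count FETCH_HEAD)

# Depth-limited fetch: history is cut after one commit.
git init -q "$work/depth"
git -C "$work/depth" remote add origin "$url"
git -C "$work/depth" fetch -q --depth=1 origin master
depth_shallow=$(is_shallow "$work/depth")
depth_commits=$(git -C "$work/depth" rev-list --count FETCH_HEAD)

echo "single branch: shallow=$single_shallow commits=$single_commits"
echo "depth=1:       shallow=$depth_shallow commits=$depth_commits"
```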
Ludo’.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-09-07 8:28 UTC | newest]