* Random idea about speeding up guix pull
From: Hartmut Goebel @ 2017-09-03 14:27 UTC
To: guix-devel

Hi,

I've seen in Ludo's slides that speeding up guix pull is a topic. Here is
a random idea on that:

"git pull" can probably be sped up by using something like

    git init .
    git remote add …
    git fetch --depth=1 origin master
    git checkout FETCH_HEAD

This will only download the top-most commit, or rather the state of the
tree at that commit.

From my mostly up-to-date clone, this method downloads only 1559 objects
and 'du -s .git' reports 13M, compared to "git pull" downloading 133284
objects and taking 49M.

We could use this for downloading source code via git (git-download).

-- 
Regards
Hartmut Goebel

| Hartmut Goebel          | h.goebel@crazy-compilers.com               |
| www.crazy-compilers.com | compilers which you thought are impossible |

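The four commands in the message above can be tried without network access. The following is only a minimal sketch: the throwaway "upstream" repository, its three commits and the demo identity are assumptions made up for illustration, standing in for the real Guix remote. It shows that a --depth=1 fetch leaves a single commit in the local history while the checked-out tree is the newest one.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for the real remote: a local repository with 3 commits.
tmp=$(mktemp -d)
upstream="$tmp/upstream"
git init -q "$upstream"
git -C "$upstream" config user.email "demo@example.invalid"
git -C "$upstream" config user.name "Demo"
for i in 1 2 3; do
  echo "revision $i" > "$upstream/file.txt"
  git -C "$upstream" add file.txt
  git -C "$upstream" commit -q -m "commit $i"
done
# Use whatever default branch name this git installation picked.
branch=$(git -C "$upstream" symbolic-ref --short HEAD)

# The recipe from the message: init, add remote, shallow fetch, checkout.
# A file:// URL is used so the fetch goes through the normal transport.
shallow="$tmp/shallow"
git init -q "$shallow"
git -C "$shallow" remote add origin "file://$upstream"
git -C "$shallow" fetch -q --depth=1 origin "$branch"
git -C "$shallow" checkout -q FETCH_HEAD

# Compare how much history each side knows about.
full_count=$(git -C "$upstream" rev-list --count HEAD)
shallow_count=$(git -C "$shallow" rev-list --count HEAD)
echo "upstream commits: $full_count, shallow commits: $shallow_count"
```

The object and size figures quoted in the message come from the real Guix repository and are not reproduced by this toy setup; only the mechanism is.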
* Re: Random idea about speeding up guix pull
From: ng0 @ 2017-09-03 14:38 UTC
To: Hartmut Goebel; +Cc: guix-devel

Hartmut Goebel transcribed 15K bytes:
> Hi,
>
> I've seen in Ludo's slides that speeding up guix pull is a topic. Here is
> a random idea on that:
>
> "git pull" can probably be sped up by using something like
>
>     git init .
>     git remote add …
>     git fetch --depth=1 origin master
>     git checkout FETCH_HEAD
>
> This will only download the top-most commit, or rather the state of the
> tree at that commit.
>
> From my mostly up-to-date clone, this method downloads only 1559 objects
> and 'du -s .git' reports 13M, compared to "git pull" downloading 133284
> objects and taking 49M.

Yes, that would make many git clones take less space.

> We could use this for downloading source code via git (git-download).

Andy Wingo proposed this in the past and had a patch which worked back
in 2015. If you are motivated enough to adjust it, it's still on the
list, but git-download and the other file it touches have changed a lot
since 2015.

-- 
ng0
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://n0is.noblogs.org/my-keys
https://www.infotropique.org https://krosos.org

* Re: Random idea about speeding up guix pull
From: Ludovic Courtès @ 2017-09-04 15:01 UTC
To: Hartmut Goebel; +Cc: guix-devel

Heya,

Hartmut Goebel <h.goebel@crazy-compilers.com> writes:

> I've seen in Ludo's slides that speeding up guix pull is a topic. Here is
> a random idea on that:
>
> "git pull" can probably be sped up by using something like
>
>     git init .
>     git remote add …
>     git fetch --depth=1 origin master
>     git checkout FETCH_HEAD
>
> This will only download the top-most commit, or rather the state of the
> tree at that commit.

That’s roughly what ‘guix pull’ does nowadays, now that it uses
Guile-Git.

The problem is elsewhere: it’s compiling Guix’s Scheme code that takes
ages, in particular since we switched to Guile 2.2. (Guile 2.2’s fancy
compiler gives us significant speedups at run time on core Guix, but
it’s also slower when compiling simple code like package definitions.)

Ludo’.

* Re: Random idea about speeding up guix pull
From: Hartmut Goebel @ 2017-09-04 15:39 UTC
To: Ludovic Courtès; +Cc: guix-devel

On 04.09.2017 at 17:01, Ludovic Courtès wrote:
> That’s roughly what ‘guix pull’ does nowadays, now that it uses
> Guile-Git.

Does it? I only found the call to `remote-fetch` in guix/git.scm, and no
options are passed to it.

The trick is to use `--depth=1` and fetch only the one expected commit,
tag or branch head.

-- 
Regards
Hartmut Goebel

* Re: Random idea about speeding up guix pull
From: Ludovic Courtès @ 2017-09-04 21:56 UTC
To: Hartmut Goebel; +Cc: guix-devel

Hartmut Goebel <h.goebel@crazy-compilers.com> writes:

> On 04.09.2017 at 17:01, Ludovic Courtès wrote:
>> That’s roughly what ‘guix pull’ does nowadays, now that it uses
>> Guile-Git.
>
> Does it? I only found the call to `remote-fetch` in guix/git.scm, and no
> options are passed to it.
>
> The trick is to use `--depth=1` and fetch only the one expected commit,
> tag or branch head.

Oh right, it doesn’t do that.

What it does do is maintain a cached checkout in ~/.cache/guix/pull,
which makes subsequent pulls much faster.

Does that make sense?

Ludo’.

* Re: Random idea about speeding up guix pull
From: Hartmut Goebel @ 2017-09-05 12:23 UTC
To: Ludovic Courtès; +Cc: guix-devel

On 04.09.2017 at 23:56, Ludovic Courtès wrote:
> What it does do is maintain a cached checkout in ~/.cache/guix/pull,
> which makes subsequent pulls much faster.

Summary (TL;DR):

  * "guix pull" should use "git fetch master"
  * for "guix download" we can keep the current behaviour

I did a series of tests:

  * "fetch" without any argument will fetch *all* data from *all*
    branches.
  * "fetch master" only fetches data living on "master"; other
    branches are ignored.

I compared the data fetched for a repo whose state was at 6bd1c41e8
(yesterday 05:29):

  * "fetch" fetches 1000K
  * "fetch master" fetches 755K
  * "fetch --depth=1 master" fetches 588K (but see below)

I did some more tests (see results below and attached script) and had
the following insights:

  * If FETCH_HEAD is not checked out after the fetch, the next fetch
    will download all data again (compare "fetch by ref" with "fetch by
    ref + checkout").
  * --depth=1 will download the *whole* state (at the given ref), no
    matter how much of the data is already present (compare "fetch by
    ref + checkout" with "fetch --depth=1 by ref + checkout").
  * I was not able to create a test case where "fetch --depth=1 master"
    would only fetch parts of the data, so this contradicts the results
    when updating from 6bd1c41e8.

I suggest making "guix pull" fetch only from "master", since this
already reduces the size of the downloaded data.

For guix download we don't (need to) cache former downloads, thus
"--depth=1 <commit>" would suffice. Unfortunately this only works for
branches and tags, not for commit-ids (see "man git-fetch-pack" for
exceptions). But most current package definitions are based on
commit-ids. Thus it is not worth trying "--depth=1 <commit>" first.

cloned repo
---------------
size 45M

fetch all
------------------
size 45M

fetch by ref
------------------
size v0.11.0   26M
size v0.12.0   32M
size v0.13.0   40M
size marker-1  45M
size marker-2  45M
size marker-3  45M
size marker-4  45M
size marker-5  45M
size master    45M

fetch by ref + checkout
------------------
size v0.11.0   26M
size v0.12.0   11M
size v0.13.0   12M
size marker-1  8,9M
size marker-2  1,1M
size marker-3  856K
size marker-4  856K
size marker-5  1,1M
size master    1,1M

fetch --depth=1 by ref
------------------
size v0.11.0   9,8M
size v0.12.0   11M
size v0.13.0   13M
size marker-1  13M
size marker-2  13M
size marker-3  13M
size marker-4  13M
size marker-5  13M
size master    13M

fetch --depth=1 by ref + checkout
------------------
size v0.11.0   9,8M
size v0.12.0   3,8M
size v0.13.0   5,6M
size marker-1  4,1M
size marker-2  4,1M
size marker-3  4,1M
size marker-4  4,1M
size marker-5  4,1M
size master    4,1M

fetch older all and master with --depth=1 by ref + checkout
------------------
size master 45M
size master 45M

-- 
Regards
Hartmut Goebel

[Attachment: test-fetch.sh, application/x-shellscript, 2590 bytes]

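The attached test-fetch.sh is not reproduced in this archive, so the following is not that script. It is a small sketch of the same kind of comparison, under stated assumptions: the local "upstream" repository, the "topic" side branch and the demo identity are invented for illustration, and the stored object count is used as the measure instead of 'du', because on a tiny repository 'du' is dominated by files that have nothing to do with fetched data.

```shell
#!/bin/sh
set -eu

# Hypothetical upstream: two commits on the default branch plus one
# extra commit on a side branch called "topic".
tmp=$(mktemp -d)
upstream="$tmp/upstream"
git init -q "$upstream"
git -C "$upstream" config user.email "demo@example.invalid"
git -C "$upstream" config user.name "Demo"
for i in 1 2; do
  echo "revision $i" > "$upstream/file.txt"
  git -C "$upstream" add file.txt
  git -C "$upstream" commit -q -m "commit $i"
done
branch=$(git -C "$upstream" symbolic-ref --short HEAD)
git -C "$upstream" checkout -q -b topic
echo "topic work" > "$upstream/topic.txt"
git -C "$upstream" add topic.txt
git -C "$upstream" commit -q -m "topic commit"
git -C "$upstream" checkout -q "$branch"

# Create an empty repository that knows the upstream as "origin".
new_repo() {
  git init -q "$1"
  git -C "$1" remote add origin "file://$upstream"
}

# Number of objects stored in a repository (loose plus packed).
count_objects() {
  git -C "$1" count-objects -v |
    awk '/^count:|^in-pack:/ { n += $2 } END { print n }'
}

new_repo "$tmp/all";     git -C "$tmp/all" fetch -q origin
new_repo "$tmp/single";  git -C "$tmp/single" fetch -q origin "$branch"
new_repo "$tmp/shallow"; git -C "$tmp/shallow" fetch -q --depth=1 origin "$branch"

all_n=$(count_objects "$tmp/all")
single_n=$(count_objects "$tmp/single")
shallow_n=$(count_objects "$tmp/shallow")
echo "fetch: $all_n  fetch <branch>: $single_n  fetch --depth=1 <branch>: $shallow_n"
```

The ordering it prints mirrors the 1000K / 755K / 588K ordering reported above; the absolute numbers are not comparable.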
* Re: Random idea about speeding up guix pull
From: Ludovic Courtès @ 2017-09-05 14:33 UTC
To: Hartmut Goebel; +Cc: guix-devel

Hartmut Goebel <h.goebel@crazy-compilers.com> writes:

> On 04.09.2017 at 23:56, Ludovic Courtès wrote:
>> What it does do is maintain a cached checkout in ~/.cache/guix/pull,
>> which makes subsequent pulls much faster.
>
> Summary (TL;DR):
>
>   * "guix pull" should use "git fetch master"
>   * for "guix download" we can keep the current behaviour
>
> I did a series of tests:
>
>   * "fetch" without any argument will fetch *all* data from *all*
>     branches.
>   * "fetch master" only fetches data living on "master"; other
>     branches are ignored.
>
> I compared the data fetched for a repo whose state was at 6bd1c41e8
> (yesterday 05:29):
>
>   * "fetch" fetches 1000K
>   * "fetch master" fetches 755K
>   * "fetch --depth=1 master" fetches 588K (but see below)

Thanks for the detailed analysis!

The problem is that libgit2 doesn’t support shallow clones, and it’s
unclear whether it will support them in the future:

  https://github.com/libgit2/libgit2/issues/3058

:-/

Ludo’.

* Re: Random idea about speeding up guix pull
From: Hartmut Goebel @ 2017-09-05 14:51 UTC
To: Ludovic Courtès; +Cc: guix-devel

On 05.09.2017 at 16:33, Ludovic Courtès wrote:
> The problem is that libgit2 doesn’t support shallow clones, and it’s
> unclear whether it will support them in the future:

Maybe I'm wrong, but to my understanding fetching a single branch/tag is
not a "shallow clone", is it?

-- 
Regards
Hartmut Goebel

* Re: Random idea about speeding up guix pull
From: Ludovic Courtès @ 2017-09-07 8:28 UTC
To: Hartmut Goebel; +Cc: guix-devel

Hartmut Goebel <h.goebel@crazy-compilers.com> writes:

> On 05.09.2017 at 16:33, Ludovic Courtès wrote:
>> The problem is that libgit2 doesn’t support shallow clones, and it’s
>> unclear whether it will support them in the future:
>
> Maybe I'm wrong, but to my understanding fetching a single branch/tag is
> not a "shallow clone", is it?

I think it is, in the sense that just a subset of the Git object graph
is fetched, but I’m not 100% sure about the terminology.

Ludo’.

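On the terminology question that closes the thread: in git's own vocabulary, "shallow" refers to history truncated with --depth, whereas fetching a single branch still brings that branch's complete history. The sketch below lets git report this itself. It assumes a git recent enough to have 'git rev-parse --is-shallow-repository', and the local "upstream" repository with its three commits is again a made-up stand-in for a real remote.

```shell
#!/bin/sh
set -eu

# Hypothetical upstream with three commits on its default branch.
tmp=$(mktemp -d)
upstream="$tmp/upstream"
git init -q "$upstream"
git -C "$upstream" config user.email "demo@example.invalid"
git -C "$upstream" config user.name "Demo"
for i in 1 2 3; do
  echo "revision $i" > "$upstream/file.txt"
  git -C "$upstream" add file.txt
  git -C "$upstream" commit -q -m "commit $i"
done
branch=$(git -C "$upstream" symbolic-ref --short HEAD)

# Single-branch fetch: only one ref is requested, with full history.
git init -q "$tmp/single"
git -C "$tmp/single" remote add origin "file://$upstream"
git -C "$tmp/single" fetch -q origin "$branch"

# Shallow fetch: same ref, history cut off after one commit.
git init -q "$tmp/shallow"
git -C "$tmp/shallow" remote add origin "file://$upstream"
git -C "$tmp/shallow" fetch -q --depth=1 origin "$branch"

# Ask git whether it considers each repository shallow, and how many
# commits are reachable from what was fetched.
single_is_shallow=$(git -C "$tmp/single" rev-parse --is-shallow-repository)
shallow_is_shallow=$(git -C "$tmp/shallow" rev-parse --is-shallow-repository)
single_commits=$(git -C "$tmp/single" rev-list --count FETCH_HEAD)
shallow_commits=$(git -C "$tmp/shallow" rev-list --count FETCH_HEAD)
echo "single-branch: shallow=$single_is_shallow commits=$single_commits"
echo "depth=1:       shallow=$shallow_is_shallow commits=$shallow_commits"
```

This only illustrates the behaviour of the git command line; whether a single-ref fetch could be requested through Guile-Git and libgit2 is not something the thread settles.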