* Suggestion: disable offloading for texlive builds on hydra?
From: Mark H Weaver @ 2014-10-26 7:36 UTC
To: guix-devel

When texlive is built on hydra, the build slave that built it is tied up
for 12 hours or more waiting for the build outputs (over 3 gigabytes!)
to be transferred back to hydra.

By design, only one transfer can happen at a time from a given build
slave, so during those 12 hours, the build slave's CPU is left idle, and
typically another 3 built-but-not-yet-transferred packages must wait
until the texlive transfer finishes.

I suggest that we arrange for hydra.gnu.org to build texlive locally for
x86_64 and i686, to avoid this problem.

What do you think?

     Mark

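The figures in the message above imply a very low effective transfer rate. A minimal Guile sketch of that arithmetic follows; it assumes exactly 3 GiB moved in exactly 12 hours, which are round-number assumptions (the message only says "over 3 gigabytes" and "12 hours or more"), not measurements.

```scheme
;; Back-of-the-envelope rate implied by the message above.
;; Assumptions (round numbers, not measurements): 3 GiB in 12 hours.
(define size-kib (* 3 1024 1024))     ; 3 GiB expressed in KiB
(define duration-s (* 12 60 60))      ; 12 hours expressed in seconds

;; Divide exactly first, then convert to a float for display.
(define rate-kib/s (exact->inexact (/ size-kib duration-s)))

(display rate-kib/s)                  ; roughly 72.8 KiB/s
(newline)
```

At a rate of this order the link, not the CPU, is the bottleneck, which is consistent with the build slave sitting idle for the duration of the transfer.
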
* Re: Suggestion: disable offloading for texlive builds on hydra?
From: John Darrington @ 2014-10-26 7:49 UTC
To: Mark H Weaver; +Cc: guix-devel

On Sun, Oct 26, 2014 at 03:36:03AM -0400, Mark H Weaver wrote:
     When texlive is built on hydra, the build slave that built it is tied up
     for 12 hours or more waiting for the build outputs (over 3 gigabytes!)
     to be transferred back to hydra.

     By design, only one transfer can happen at a time from a given build
     slave, so during those 12 hours, the build slave's CPU is left idle, and
     typically another 3 built-but-not-yet-transferred packages must wait
     until the texlive transfer finishes.

Why is it designed like that? It seems like a poor design to me.

     I suggest that we arrange for hydra.gnu.org to build texlive locally for
     x86_64 and i686, to avoid this problem.

Would it help if texlive was split into more outputs? For example, the docs
take up a lot of space, and not everyone needs them.

J'

--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.

* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Ludovic Courtès @ 2014-10-26 14:12 UTC
To: John Darrington; +Cc: guix-devel

John Darrington <john@darrington.wattle.id.au> skribis:

> On Sun, Oct 26, 2014 at 03:36:03AM -0400, Mark H Weaver wrote:
>      When texlive is built on hydra, the build slave that built it is tied up
>      for 12 hours or more waiting for the build outputs (over 3 gigabytes!)
>      to be transferred back to hydra.
>
>      By design, only one transfer can happen at a time from a given build
>      slave, so during those 12 hours, the build slave's CPU is left idle, and
>      typically another 3 built-but-not-yet-transferred packages must wait
>      until the texlive transfer finishes.
>
> Why is it designed like that? It seems like a poor design to me.

The rationale was that, in general, you just slow everything down by
sending several things at once. TeX Live is a pathological case in that
respect.

As for disabling offloading, see my reply to Federico: we could expose
#:local-build? to gnu-build-system, and use that here, but at the moment
that also disables substitutes. WDYT?

>      I suggest that we arrange for hydra.gnu.org to build texlive locally for
>      x86_64 and i686, to avoid this problem.
>
> Would it help if texlive was split into more outputs? For example, the docs
> take up a lot of space, and not everyone needs them.

I think it may help a bit, at least by leaving a small window during
which other builds could get started. And it would also be more
convenient for users, who could choose whether to install the whole
thing or not.

When you mentioned it some time ago on IRC, I gave it a try, but then
failed to actually test the patch due to... ENOSPC. :-)

Anyway, here’s the patch I had. I’d be happy if you or someone else
could just confirm it works as expected:

diff --git a/gnu/packages/texlive.scm b/gnu/packages/texlive.scm
index e562b02..bc0ece7 100644
--- a/gnu/packages/texlive.scm
+++ b/gnu/packages/texlive.scm
@@ -88,7 +88,7 @@
        ("pkg-config" ,pkg-config)
        ("python" ,python-2) ; incompatible with Python 3 (print syntax)
        ("tcsh" ,tcsh)))
-    (outputs '("out" "data"))
+    (outputs '("out" "data" "doc"))
     (arguments
      `(#:out-of-source? #t
        #:configure-flags

Data point: there’s 1.6 GiB in texmf-dist/doc (which the patch above
splits out), and 1.4 GiB in texmf-dist/fonts.

Another option Andreas and I discussed a while back would be to use a
fixed-output derivation for the data, since it’s really what it is.
That’s a bit hacky though: we’d have to install it, compute the hash of
the installed files, and then use that as the derivation’s output hash.

Thanks,
Ludo’.

* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Mark H Weaver @ 2014-10-26 16:07 UTC
To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> John Darrington <john@darrington.wattle.id.au> skribis:
>
>> On Sun, Oct 26, 2014 at 03:36:03AM -0400, Mark H Weaver wrote:
>>      When texlive is built on hydra, the build slave that built it is tied up
>>      for 12 hours or more waiting for the build outputs (over 3 gigabytes!)
>>      to be transferred back to hydra.
>>
>>      By design, only one transfer can happen at a time from a given build
>>      slave, so during those 12 hours, the build slave's CPU is left idle, and
>>      typically another 3 built-but-not-yet-transferred packages must wait
>>      until the texlive transfer finishes.
>>
>> Why is it designed like that? It seems like a poor design to me.
>
> The rationale was that, in general, you just slow everything down by
> sending several things at once.

I have my doubts that it would slow things down very much, if at all.
The number of parallel transfers would still be limited to a small
number, typically 4 per build slave. The expense associated with
running multiple processes on a CPU is mainly due to cache effects, but
I wouldn't expect that to be an issue with network connections,
especially when those connections are between the same two hosts. The
practice of using multiple connections is well established in web
browsers and imap clients, as long as the number is not too large.

We're losing a huge amount of available CPU capacity in our build farm
(probably over 30 machine-hours per texinfo rebuild) in exchange for a
dubious increase in network efficiency.

The more I think about it, the more I agree with John that we've chosen
the wrong tradeoff here. I think we should remove those mutexes.

> diff --git a/gnu/packages/texlive.scm b/gnu/packages/texlive.scm
> index e562b02..bc0ece7 100644
> --- a/gnu/packages/texlive.scm
> +++ b/gnu/packages/texlive.scm
> @@ -88,7 +88,7 @@
>         ("pkg-config" ,pkg-config)
>         ("python" ,python-2) ; incompatible with Python 3 (print syntax)
>         ("tcsh" ,tcsh)))
> -    (outputs '("out" "data"))
> +    (outputs '("out" "data" "doc"))
>      (arguments
>       `(#:out-of-source? #t
>         #:configure-flags
>
>
> Data point: there’s 1.6 GiB in texmf-dist/doc (which the patch above
> splits out), and 1.4 GiB in texmf-dist/fonts.

I'd definitely be in favor of splitting out the docs.

> Another option Andreas and I discussed a while back would be to use a
> fixed-output derivation for the data, since it’s really what it is.
> That’s a bit hacky though: we’d have to install it, compute the hash of
> the installed files, and then use that as the derivation’s output hash.

Hmm. It is indeed a hack, but maybe worth considering. When I think
about Guix users downloading over 3 GiB from our humble hydra quite
often just to have TeX, it makes me worry about our bandwidth
requirements.

     Thanks,
       Mark

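The trade-off argued over in the message above can be made concrete with a toy model. The Guile sketch below is not the offload hook's code: it only compares finish times when queued outputs are sent one at a time against finish times when they all share the link equally. The queue contents (three small outputs of 50, 20 and 10 MiB behind texlive) and the link rate of 256 MiB per hour (3072 MiB in 12 hours) are hypothetical numbers chosen for illustration, and the model ignores any per-connection overhead.

```scheme
;; Toy model of serialized versus shared transfers; not the real hook.
;; Sizes are in MiB, RATE is in MiB per hour, results are in hours.

;; Transfers run one at a time, in queue order (the mutex behaviour).
(define (serialized-finish-times sizes rate)
  (let loop ((sizes sizes) (elapsed 0) (result '()))
    (if (null? sizes)
        (reverse result)
        (let ((done (+ elapsed (/ (car sizes) rate))))
          (loop (cdr sizes) done (cons done result))))))

;; All transfers start together and split the link equally.
;; Results are listed from the smallest output to the largest.
(define (shared-finish-times sizes rate)
  (let loop ((sizes (sort sizes <)) (previous 0) (elapsed 0) (result '()))
    (if (null? sizes)
        (reverse result)
        (let* ((active (length sizes))              ; transfers still running
               (extra  (- (car sizes) previous))    ; MiB left for the next one
               (done   (+ elapsed (/ (* extra active) rate))))
          (loop (cdr sizes) (car sizes) done (cons done result))))))

;; Hypothetical queue: texlive (3072 MiB) ahead of three small outputs.
(define queue '(3072 50 20 10))
(define rate 256)                                   ; assumed MiB per hour

(display (map exact->inexact (serialized-finish-times queue rate)))
(newline)
(display (map exact->inexact (shared-finish-times queue rate)))
(newline)
```

Under these assumed numbers both schemes deliver the last byte at the same moment (a little over 12 hours), which matches the rationale that sending several things at once does not make the total faster. With sharing, however, the three small outputs are done in a little over half an hour instead of waiting more than 12 hours behind texlive.
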
* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Ludovic Courtès @ 2014-10-27 12:58 UTC
To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> The rationale was that, in general, you just slow everything down by
>> sending several things at once.
>
> I have my doubts that it would slow things down very much, if at all.
> The number of parallel transfers would still be limited to a small
> number, typically 4 per build slave. The expense associated with
> running multiple processes on a CPU is mainly due to cache effects, but
> I wouldn't expect that to be an issue with network connections,
> especially when those connections are between the same two hosts. The
> practice of using multiple connections is well established in web
> browsers and imap clients, as long as the number is not too large.
>
> We're losing a huge amount of available CPU capacity in our build farm
> (probably over 30 machine-hours per texinfo rebuild) in exchange for a
> dubious increase in network efficiency.
>
> The more I think about it, the more I agree with John that we've chosen
> the wrong tradeoff here. I think we should remove those mutexes.

Hmm OK. I’m happy to try that (it’s a two-line change plus deployment.)

I can do it one of the next few days, but I’m happy if you do it. :-)

>> diff --git a/gnu/packages/texlive.scm b/gnu/packages/texlive.scm
>> index e562b02..bc0ece7 100644
>> --- a/gnu/packages/texlive.scm
>> +++ b/gnu/packages/texlive.scm
>> @@ -88,7 +88,7 @@
>>         ("pkg-config" ,pkg-config)
>>         ("python" ,python-2) ; incompatible with Python 3 (print syntax)
>>         ("tcsh" ,tcsh)))
>> -    (outputs '("out" "data"))
>> +    (outputs '("out" "data" "doc"))
>>      (arguments
>>       `(#:out-of-source? #t
>>         #:configure-flags
>>
>>
>> Data point: there’s 1.6 GiB in texmf-dist/doc (which the patch above
>> splits out), and 1.4 GiB in texmf-dist/fonts.
>
> I'd definitely be in favor of splitting out the docs.

OK, I’ll test it locally and commit if nothing breaks.

>> Another option Andreas and I discussed a while back would be to use a
>> fixed-output derivation for the data, since it’s really what it is.
>> That’s a bit hacky though: we’d have to install it, compute the hash of
>> the installed files, and then use that as the derivation’s output hash.
>
> Hmm. It is indeed a hack, but maybe worth considering. When I think
> about Guix users downloading over 3 GiB from our humble hydra quite
> often just to have TeX, it makes me worry about our bandwidth
> requirements.

Agreed.

Ludo’.

* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Ludovic Courtès @ 2014-10-28 23:55 UTC
To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> The more I think about it, the more I agree with John that we've chosen
> the wrong tradeoff here. I think we should remove those mutexes.

Done in commit 940a8c5, which I’ve just deployed on hydra.gnu.org.

Ludo’.

* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Andreas Enge @ 2014-10-29 21:50 UTC
To: Mark H Weaver; +Cc: guix-devel

On Sun, Oct 26, 2014 at 12:07:13PM -0400, Mark H Weaver wrote:
> Hmm. It is indeed a hack, but maybe worth considering. When I think
> about Guix users downloading over 3 GiB from our humble hydra quite
> often just to have TeX, it makes me worry about our bandwidth
> requirements.

What do you mean by "just to have TeX"? This is definitely one of the most
important pieces of software I am using. And having all of it including its
documentation with one installation is a big gain over Debian, where one
must always be afraid of being on the train and missing this one crucial
LaTeX style file, or not being able to look up all the obscure options of
algorithm2e.sty ;-)

One option would be to have something like "texlive-small", containing only
the binaries and a smallish subset of texmf-dist, excluding the
documentation and most of the fonts. Users could then choose between the
small and the full package.

Andreas

* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Andreas Enge @ 2014-10-29 12:29 UTC
To: Ludovic Courtès; +Cc: guix-devel

On Sun, Oct 26, 2014 at 03:12:40PM +0100, Ludovic Courtès wrote:
> -    (outputs '("out" "data"))
> +    (outputs '("out" "data" "doc"))

I just tried this, and it does not work:
builder for `/gnu/store/r39sf9gzfdlxb6q2c4zaz18z63mmc8fz-texlive-2014.drv' failed to produce output path `/gnu/store/s756nm0dcw57h64vimq6bz3hzmf4s40p-texlive-2014-doc'

So I think we need to shuffle things around ourselves. I will try to have a
look. A problem is that I need a working texlive, and compiling an additional
one may take too much space (in a leap of confidence, I just deleted my
texlive before trying the splitting of the docs, but I will not do this again;
well, using a usb stick for /tmp helps a bit...).

Apparently, texlive does not honour --docdir=..., although it does not
complain about the option. I asked about it on the texlive mailing list
and will wait for a suggestion.

In any case, it should be easy to just move the documentation directory;
but I will first need to recompile a working texlive to have a closer look
at the directory...

Andreas

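Moving the documentation by hand, as suggested in the message above, could look roughly like the following build phase. This is only a sketch under stated assumptions: the phase name `move-doc` is made up, the location of the documentation under the "data" output (share/texmf-dist/doc) is a guess based on the texmf-dist/doc path mentioned in the thread, and the phase would still have to be hooked in after the install phase in the package's #:phases. The helpers `mkdir-p`, `copy-recursively` and `delete-file-recursively` come from (guix build utils), so the fragment only runs inside a Guix build environment.

```scheme
;; Hypothetical build-side phase; not taken from gnu/packages/texlive.scm.
(use-modules (guix build utils))

(define* (move-doc #:key outputs #:allow-other-keys)
  "Move the TeX Live documentation from the \"data\" output to \"doc\"."
  (let* ((data   (assoc-ref outputs "data"))
         (doc    (assoc-ref outputs "doc"))
         ;; Assumed layout; check the real install tree before relying on it.
         (source (string-append data "/share/texmf-dist/doc"))
         (target (string-append doc "/share/texmf-dist/doc")))
    (mkdir-p (dirname target))
    ;; Copy, then delete, so the phase does not depend on both outputs
    ;; living on the same file system.
    (copy-recursively source target)
    (delete-file-recursively source)
    #t))
```

Populating the "doc" output this way would address the "failed to produce output path" error quoted above, since that error is raised when a declared output is never created by the builder.
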
* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Andreas Enge @ 2014-10-29 16:20 UTC
To: Ludovic Courtès; +Cc: guix-devel

On Wed, Oct 29, 2014 at 01:29:01PM +0100, Andreas Enge wrote:
> A problem is that I need a working texlive, and compiling an additional
> one may take too much space.

Actually, now that deduplication works, having several texlive installations
with the same data is not a problem any more!

Andreas

* Re: Suggestion: disable offloading for texlive builds on hydra?
From: Ludovic Courtès @ 2014-10-29 22:17 UTC
To: Andreas Enge; +Cc: guix-devel

Andreas Enge <andreas@enge.fr> skribis:

> On Sun, Oct 26, 2014 at 03:12:40PM +0100, Ludovic Courtès wrote:
>> -    (outputs '("out" "data"))
>> +    (outputs '("out" "data" "doc"))
>
> I just tried this, and it does not work:
> builder for `/gnu/store/r39sf9gzfdlxb6q2c4zaz18z63mmc8fz-texlive-2014.drv' failed to produce output path `/gnu/store/s756nm0dcw57h64vimq6bz3hzmf4s40p-texlive-2014-doc'
>
> So I think we need to shuffle things around ourselves. I will try to have a
> look. A problem is that I need a working texlive, and compiling an additional
> one may take too much space (in a leap of confidence, I just deleted my
> texlive before trying the splitting of the docs, but I will not do this again;
> well, using a usb stick for /tmp helps a bit...).

Heh. :-)

> Apparently, texlive does not honour --docdir=..., although it does not
> complain about the option. I asked about it on the texlive mailing list
> and will wait for a suggestion.

It’s not uncommon to find configure scripts that ignore --docdir et
al. and instead provide their own option.

Thanks for looking into it.

Ludo’.
