* bug#49439: grafts cause “guix environment” to get killed with OOM @ 2021-07-06 14:38 Ricardo Wurmus 2021-07-23 4:59 ` Sarah Morgensen 0 siblings, 1 reply; 10+ messages in thread From: Ricardo Wurmus @ 2021-07-06 14:38 UTC (permalink / raw) To: 49439 With a recent version of Guix, “guix environment” will not terminate on its own, keeps the CPU busy, and gets killed when the system eventually runs out of memory. $ guix describe -f channels --8<---------------cut here---------------start------------->8--- (list (channel (name 'guix) (url "/home/rekado/dev/gx/branches/master") (commit "685cfdec94e5e48c4ad28de53466a28dfc258edb"))) --8<---------------cut here---------------end--------------->8--- $ guix environment pigx-scrnaseq [wait until it gets killed] The problem disappears when grafts are disabled: $ guix environment --no-grafts pigx-scrnaseq $ [env] yay! -- Ricardo ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-06 14:38 bug#49439: grafts cause “guix environment” to get killed with OOM Ricardo Wurmus @ 2021-07-23 4:59 ` Sarah Morgensen 2021-07-27 14:52 ` Julien Lepiller 2021-07-27 16:28 ` Ludovic Courtès 0 siblings, 2 replies; 10+ messages in thread From: Sarah Morgensen @ 2021-07-23 4:59 UTC (permalink / raw) To: Ricardo Wurmus; +Cc: 49439 Hello, Ricardo Wurmus <rekado@elephly.net> writes: > With a recent version of Guix, “guix environment” will not > terminate on its own, keeps the CPU busy, and gets killed when the > system eventually runs out of memory. > > $ guix describe -f channels > > (list (channel > (name 'guix) > (url "/home/rekado/dev/gx/branches/master") > (commit > "685cfdec94e5e48c4ad28de53466a28dfc258edb"))) > > > $ guix environment pigx-scrnaseq > [wait until it gets killed] I can reproduce this with pigx-scrnaseq as well a number of other packages (listed below). $ ./pre-inst-env guix describe -f channels (list (channel (name 'guix) (url "/home/sarah/guix") (commit "3217a04b0352c2dd13323257b369604eeabfccc3"))) Does not complete within 5 minutes: package # inputs # transitive inputs (from package-transitive-inputs) pigx-chipseq 48 338 pigx-scrnaseq 41 321 r-cellchat 34 110 pigx-rnaseq 34 343 pigx-bsseq 32 358 pigx-sars-cov2-ww 25 261 r-circus 16 134 Does complete: r-chipseq 6 37 completes in >2m r-shortread 17 36 completes in >1m python-scanpy 25 113 completes in <15s I suspect it has something to do with the number of transitive inputs, because it is so prevalent with these R packages, which all use propagated inputs. However... python-scanpy succeeds in under 15 seconds, and it has more transitive inputs than r-chipseq. Can we reproduce this with a large number of low-transitivity packages directly on the command line? > > The problem disappears when grafts are disabled: > > $ guix environment --no-grafts pigx-scrnaseq > $ [env] yay! -- Sarah ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-23 4:59 ` Sarah Morgensen @ 2021-07-27 14:52 ` Julien Lepiller 2021-07-27 16:28 ` Ludovic Courtès 1 sibling, 0 replies; 10+ messages in thread From: Julien Lepiller @ 2021-07-27 14:52 UTC (permalink / raw) To: 49439, iskarian, rekado [-- Attachment #1: Type: text/plain, Size: 3171 bytes --] I have a similar issue with an ocaml package I use at work. It's not free software, but all its dependencies are. The dependencies are not all yeet in guix, so to reproduce you might have to import them first with "guix import opam -r foo" for ocaml-foo. The package depends on ocaml-ounit, ocaml-lp, ocaml-apronext, menhir, ocaml-async, ocaml-core, ocaml-graph, ocaml-libsvm, ocaml-minisat, ocaml-ppx-deriving-yojson, ocaml-yojson, ocaml-z3, ocaml-zarith and z3. In total, that's 118 transitive inputs. Building the profile takes 30 minutes for me, on an SSD. The builder takes 1.5g resident. Other than that, I measured time and memory for creating the environment when the profile was already built (no more derivation to build): `which time` -v ~/guix/pre-inst-env guix environment ocaml-dummy-package -- echo done User time: 121.43s System time: 2.28s Maximum resident: 1803028kB (1.8 GB) With a warning from GC: Repeated allocation of very large block (approx. size 35606528) Note that I get the same numbers with --no-grafts, so it might be a different issue. "guix build" terminates quickly. Le 23 juillet 2021 00:59:21 GMT-04:00, Sarah Morgensen <iskarian@mgsn.dev> a écrit : >Hello, > >Ricardo Wurmus <rekado@elephly.net> writes: > >> With a recent version of Guix, “guix environment” will not >> terminate on its own, keeps the CPU busy, and gets killed when the >> system eventually runs out of memory. >> >> $ guix describe -f channels >> >> (list (channel >> (name 'guix) >> (url "/home/rekado/dev/gx/branches/master") >> (commit >> "685cfdec94e5e48c4ad28de53466a28dfc258edb"))) >> >> >> $ guix environment pigx-scrnaseq >> [wait until it gets killed] > >I can reproduce this with pigx-scrnaseq as well a number of other >packages (listed below). > >$ ./pre-inst-env guix describe -f channels >(list (channel > (name 'guix) > (url "/home/sarah/guix") > (commit > "3217a04b0352c2dd13323257b369604eeabfccc3"))) > >Does not complete within 5 minutes: >package # inputs # transitive inputs > (from package-transitive-inputs) >pigx-chipseq 48 338 >pigx-scrnaseq 41 321 >r-cellchat 34 110 >pigx-rnaseq 34 343 >pigx-bsseq 32 358 >pigx-sars-cov2-ww 25 261 >r-circus 16 134 > >Does complete: >r-chipseq 6 37 completes in >2m >r-shortread 17 36 completes in >1m >python-scanpy 25 113 completes in <15s > >I suspect it has something to do with the number of transitive inputs, >because it is so prevalent with these R packages, which all use >propagated inputs. However... python-scanpy succeeds in under 15 >seconds, and it has more transitive inputs than r-chipseq. > >Can we reproduce this with a large number of low-transitivity packages >directly on the command line? > >> >> The problem disappears when grafts are disabled: >> >> $ guix environment --no-grafts pigx-scrnaseq >> $ [env] yay! > >-- >Sarah [-- Attachment #2: Type: text/html, Size: 3790 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-23 4:59 ` Sarah Morgensen 2021-07-27 14:52 ` Julien Lepiller @ 2021-07-27 16:28 ` Ludovic Courtès 2021-07-27 23:35 ` Sarah Morgensen 1 sibling, 1 reply; 10+ messages in thread From: Ludovic Courtès @ 2021-07-27 16:28 UTC (permalink / raw) To: Sarah Morgensen; +Cc: Ricardo Wurmus, 49439 Hi, Sarah Morgensen <iskarian@mgsn.dev> skribis: > Ricardo Wurmus <rekado@elephly.net> writes: > >> With a recent version of Guix, “guix environment” will not >> terminate on its own, keeps the CPU busy, and gets killed when the >> system eventually runs out of memory. >> >> $ guix describe -f channels >> >> (list (channel >> (name 'guix) >> (url "/home/rekado/dev/gx/branches/master") >> (commit >> "685cfdec94e5e48c4ad28de53466a28dfc258edb"))) >> >> >> $ guix environment pigx-scrnaseq >> [wait until it gets killed] > > I can reproduce this with pigx-scrnaseq as well a number of other > packages (listed below). > > $ ./pre-inst-env guix describe -f channels > (list (channel > (name 'guix) > (url "/home/sarah/guix") > (commit > "3217a04b0352c2dd13323257b369604eeabfccc3"))) > > Does not complete within 5 minutes: What hardware are you using? Here’s what I observe (with everything already in store and on a hot cache, with an i7 CPU): --8<---------------cut here---------------start------------->8--- $ guix describe Generacio 188 Jul 25 2021 12:47:29 (nuna) guix a92dfbc repository URL: https://git.savannah.gnu.org/git/guix.git branch: master commit: a92dfbce30777de6ca05031e275410cf9f56c84c $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null Garbage collection statistics: heap size: 160.31 MiB allocated: 1440.70 MiB GC times: 39 time spent in GC: 4.51 seconds (46% of user time) real 0m7.534s user 0m9.747s sys 0m0.167s $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths >/dev/null Garbage collection statistics: heap size: 168.31 MiB allocated: 2111.71 MiB GC times: 53 time spent in GC: 6.92 seconds (45% of user time) real 0m12.100s user 0m15.517s sys 0m0.198s --8<---------------cut here---------------end--------------->8--- Commit 78daf9e02e5bc51f91488d8237cab2050cc060cf optimizes ‘coalesce-duplicate-inputs’, which was quadratic in the number of inputs (!). After that change, I get: --8<---------------cut here---------------start------------->8--- $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null Garbage collection statistics: heap size: 168.31 MiB allocated: 716.58 MiB GC times: 24 time spent in GC: 2.65 seconds (40% of user time) real 0m5.720s user 0m6.708s sys 0m0.149s $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths >/dev/null Garbage collection statistics: heap size: 160.31 MiB allocated: 1387.96 MiB GC times: 42 time spent in GC: 5.87 seconds (43% of user time) real 0m10.821s user 0m13.513s sys 0m0.198s --8<---------------cut here---------------end--------------->8--- Could you tell me if it improves the situation for you? It’s not the end of the road, but it should be an improvement. Thanks, Ludo’. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-27 16:28 ` Ludovic Courtès @ 2021-07-27 23:35 ` Sarah Morgensen 2021-07-28 10:00 ` Ludovic Courtès 0 siblings, 1 reply; 10+ messages in thread From: Sarah Morgensen @ 2021-07-27 23:35 UTC (permalink / raw) To: Ludovic Courtès; +Cc: Ricardo Wurmus, 49439 Hi, Ludovic Courtès <ludo@gnu.org> writes: [...] >> I can reproduce this with pigx-scrnaseq as well a number of other >> packages (listed below). >> >> $ ./pre-inst-env guix describe -f channels >> (list (channel >> (name 'guix) >> (url "/home/sarah/guix") >> (commit >> "3217a04b0352c2dd13323257b369604eeabfccc3"))) >> >> Does not complete within 5 minutes: Okay, so all of a sudden I can't reproduce this; even with the same commit as above, it completes in ~20s. guix time-machine --commit=3217a04 -- environment pigx-scrnaseq --search-paths >/dev/null > What hardware are you using? Virtualbox VM with VT-x etc. on a host i7-6700. The VM has 6GB of memory. > > Here’s what I observe (with everything already in store and on a hot > cache, with an i7 CPU): > > $ guix describe > Generacio 188 Jul 25 2021 12:47:29 (nuna) > guix a92dfbc > repository URL: https://git.savannah.gnu.org/git/guix.git > branch: master > commit: a92dfbce30777de6ca05031e275410cf9f56c84c > $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null > Garbage collection statistics: > heap size: 160.31 MiB > allocated: 1440.70 MiB > GC times: 39 > time spent in GC: 4.51 seconds (46% of user time) > > real 0m7.534s > user 0m9.747s > sys 0m0.167s > $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths >/dev/null > Garbage collection statistics: > heap size: 168.31 MiB > allocated: 2111.71 MiB > GC times: 53 > time spent in GC: 6.92 seconds (45% of user time) > > real 0m12.100s > user 0m15.517s > sys 0m0.198s > > > Commit 78daf9e02e5bc51f91488d8237cab2050cc060cf optimizes > ‘coalesce-duplicate-inputs’, which was quadratic in the number of inputs (!). > After that change, I get: > > $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null > Garbage collection statistics: > heap size: 168.31 MiB > allocated: 716.58 MiB > GC times: 24 > time spent in GC: 2.65 seconds (40% of user time) > > real 0m5.720s > user 0m6.708s > sys 0m0.149s > $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths >/dev/null > Garbage collection statistics: > heap size: 160.31 MiB > allocated: 1387.96 MiB > GC times: 42 > time spent in GC: 5.87 seconds (43% of user time) > > real 0m10.821s > user 0m13.513s > sys 0m0.198s > > Could you tell me if it improves the situation for you? *Now* my experience is like yours: --8<---------------cut here---------------start------------->8--- $ guix describe Generation 9 Jul 27 2021 12:35:05 (current) guix d0ec739 repository URL: https://git.savannah.gnu.org/git/guix.git branch: master commit: d0ec73907c2995034a34339f4a7c2c72c2e21fea time GUIX_PROFILING=gc guix time-machine --commit=3217a04 -- environment pigx-scrnaseq --search-paths >/dev/null Garbage collection statistics: heap size: 176.31 MiB allocated: 2107.82 MiB GC times: 52 time spent in GC: 5.26 seconds (23% of user time) real 0m20.471s user 0m22.605s sys 0m0.372s $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths >/dev/null Garbage collection statistics: heap size: 152.31 MiB allocated: 1367.16 MiB GC times: 40 time spent in GC: 3.25 seconds (21% of user time) real 0m14.701s user 0m15.698s sys 0m0.361s --8<---------------cut here---------------end--------------->8--- But why was it occurring before? The only I thing I can think of is that I didn't have everything in the store first. Is there a way I can prune just the relevant items from the store to test this? > > It’s not the end of the road, but it should be an improvement. > > Thanks, > Ludo’. -- Sarah ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-27 23:35 ` Sarah Morgensen @ 2021-07-28 10:00 ` Ludovic Courtès 2021-07-28 22:03 ` Ludovic Courtès 0 siblings, 1 reply; 10+ messages in thread From: Ludovic Courtès @ 2021-07-28 10:00 UTC (permalink / raw) To: Sarah Morgensen; +Cc: Ricardo Wurmus, 49439 Hi, Sarah Morgensen <iskarian@mgsn.dev> skribis: > But why was it occurring before? The only I thing I can think of is that > I didn't have everything in the store first. Is there a way I can prune > just the relevant items from the store to test this? You could try something like: guix gc -D $(guix gc --references $(guix build pigx-scrnaseq)) \ $(guix gc --references $(guix build pigx-scrnaseq --no-grafts)) Thinking about it, the grafts code depends on what’s in the store: when nothing is in the store, it bounces to the “build handler”, which accumulates the list of missing store items, until it starts building them. Let’s see if I can reproduce that behavior, too… Thanks, Ludo’. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-28 10:00 ` Ludovic Courtès @ 2021-07-28 22:03 ` Ludovic Courtès 2021-07-29 3:20 ` Sarah Morgensen 0 siblings, 1 reply; 10+ messages in thread From: Ludovic Courtès @ 2021-07-28 22:03 UTC (permalink / raw) To: Sarah Morgensen; +Cc: Ricardo Wurmus, 49439 Hi, Ludovic Courtès <ludo@gnu.org> skribis: > Thinking about it, the grafts code depends on what’s in the store: when > nothing is in the store, it bounces to the “build handler”, which > accumulates the list of missing store items, until it starts building > them. So I can reproduce the problem Ricardo and you initially reported by running: ./pre-inst-env guix environment pigx-scrnaseq --search-paths after removing some of the ungrafted store items with: guix gc -D $(guix build r-rlang --no-grafts) \ $(guix gc --references $(guix build pigx-scrnaseq --no-grafts)) The seemingly endless CPU usage and unbound memory use comes from the ‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’ has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when ‘r-rlang’ is not in the store, the grafting code keeps asking for it by calling ‘build-derivations’, which aborts to the build handler; the build handler saves the .drv file name and the continuation and keeps going. But since the next package also depends on ‘r-langr’, we abort again to the build handler, and so on. The end result is a very long list of <unresolved> nodes, probably of this order in this case: $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l 594 Presumably, the captured continuations occupy quite a bit of memory, hence the quick growth. I suppose one solution is to fire suspended builds when the build handler sees a build request for a given derivation for the second time. It needs more thought and testing… Ludo’. PS: Did you know ‘pigx-scrnaseq’ has twice as many nodes as ‘libreoffice’? $ guix graph -t bag pigx-scrnaseq |grep 'label = "'|wc -l 1359 $ guix graph -t bag libreoffice |grep 'label = "'|wc -l 699 That makes it a great example to study and fix scalability issues! ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-28 22:03 ` Ludovic Courtès @ 2021-07-29 3:20 ` Sarah Morgensen 2021-08-09 21:26 ` Ludovic Courtès 0 siblings, 1 reply; 10+ messages in thread From: Sarah Morgensen @ 2021-07-29 3:20 UTC (permalink / raw) To: Ludovic Courtès; +Cc: Ricardo Wurmus, 49439 Hi, Ludovic Courtès <ludo@gnu.org> writes: > Hi, > > Ludovic Courtès <ludo@gnu.org> skribis: > >> Thinking about it, the grafts code depends on what’s in the store: when >> nothing is in the store, it bounces to the “build handler”, which >> accumulates the list of missing store items, until it starts building >> them. > > So I can reproduce the problem Ricardo and you initially reported by > running: > > ./pre-inst-env guix environment pigx-scrnaseq --search-paths > > after removing some of the ungrafted store items with: > > guix gc -D $(guix build r-rlang --no-grafts) \ > $(guix gc --references $(guix build pigx-scrnaseq --no-grafts)) Same here. I'm glad we were able to pinpoint this! > > The seemingly endless CPU usage and unbound memory use comes from the > ‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’ > has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when > ‘r-rlang’ is not in the store, the grafting code keeps asking for it by > calling ‘build-derivations’, which aborts to the build handler; the > build handler saves the .drv file name and the continuation and keeps > going. But since the next package also depends on ‘r-langr’, we abort > again to the build handler, and so on. > > The end result is a very long list of <unresolved> nodes, probably of > this order in this case: > > $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l > 594 > > Presumably, the captured continuations occupy quite a bit of memory, > hence the quick growth. > > I suppose one solution is to fire suspended builds when the build > handler sees a build request for a given derivation for the second time. > It needs more thought and testing… I wonder if instead it's possible to reduce the memory taken by the continuations? As someone who has absolutely no experience with the build/derivation system, it seems like all we *should* need to save is the derivation file name. > > Ludo’. > > PS: Did you know ‘pigx-scrnaseq’ has twice as many nodes as > ‘libreoffice’? > > $ guix graph -t bag pigx-scrnaseq |grep 'label = "'|wc -l > 1359 > $ guix graph -t bag libreoffice |grep 'label = "'|wc -l > 699 > > That makes it a great example to study and fix scalability issues! For those interested, I've compiled a list of the top 60 highest-dependency packages, as reported by package-closure: pigx 1630 nextcloud-client 1539 akregator 1486 kmail 1484 korganizer 1481 kdepim-runtime 1480 kincidenceeditor 1478 keventviews 1477 kmailcommon 1476 kcalendarsupport 1475 kmessagelib 1474 knotes 1472 kaddressbook 1469 libksieve 1467 kdepim-apps-libs 1465 libgravatar 1463 kpimcommon 1462 akonadi-calendar 1453 pigx-bsseq 1448 elisa 1446 kaffeine 1432 kdenlive 1431 kmailtransport 1431 dolphin-plugins 1426 k3b 1424 libkgapi 1422 dolphin 1421 kopete 1403 pigx-sars-cov2-ww 1401 krdc 1398 baloo-widgets 1397 baloo 1396 pigx-chipseq 1396 krfb 1389 ffmpegthumbs 1388 kget 1382 kmplayer 1381 kdelibs4support 1375 pigx-scrnaseq 1342 kdevelop 1340 kmailimporter 1326 libkdepim 1325 pigx-rnaseq 1324 labplot 1316 smb4k 1315 kleopatra 1311 kalarmcal 1311 choqok 1311 okular 1310 ktnef 1310 ktorrent 1310 kate 1308 akonadi-search 1308 kcalutils 1307 yakuake 1306 khelpcenter 1305 libksysguard 1305 kdeconnect 1304 konsole 1304 libkleo 1304 There seem to be a lot of kde packages in there, so perhaps this issue isn't as rare as we might otherwise expect? -- Sarah ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-07-29 3:20 ` Sarah Morgensen @ 2021-08-09 21:26 ` Ludovic Courtès 2021-08-10 15:36 ` Ludovic Courtès 0 siblings, 1 reply; 10+ messages in thread From: Ludovic Courtès @ 2021-08-09 21:26 UTC (permalink / raw) To: Sarah Morgensen; +Cc: Ricardo Wurmus, 49439 [-- Attachment #1: Type: text/plain, Size: 5328 bytes --] Hi! Sarah Morgensen <iskarian@mgsn.dev> skribis: > Ludovic Courtès <ludo@gnu.org> writes: [...] >> The seemingly endless CPU usage and unbound memory use comes from the >> ‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’ >> has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when >> ‘r-rlang’ is not in the store, the grafting code keeps asking for it by >> calling ‘build-derivations’, which aborts to the build handler; the >> build handler saves the .drv file name and the continuation and keeps >> going. But since the next package also depends on ‘r-langr’, we abort >> again to the build handler, and so on. >> >> The end result is a very long list of <unresolved> nodes, probably of >> this order in this case: >> >> $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l >> 594 >> >> Presumably, the captured continuations occupy quite a bit of memory, >> hence the quick growth. >> >> I suppose one solution is to fire suspended builds when the build >> handler sees a build request for a given derivation for the second time. >> It needs more thought and testing… > > I wonder if instead it's possible to reduce the memory taken by the > continuations? As someone who has absolutely no experience with the > build/derivation system, it seems like all we *should* need to save is > the derivation file name. Continuations may take a bit of room but the main problem is that we’re storing too many of them. :-) Roughly, I think we keep one <unresolved> node per incoming edge into a package not in the store (‘r-rlang’ in the example above). In a case like ‘pigx-scrnaseq’, that’s a lot of items. I added counters and ‘pk’ calls in ‘map/accumulate-builds’ & co. to see what happens. AFAICS, a cutoff as in the attached patch does the job. It’s a basic heuristic to avoid unbounded growth, at the expense of possibly reduced accumulation. For example, here’s the effect on my user profile with 300+ packages: --8<---------------cut here---------------start------------->8--- $ # with cutoff $ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n [...] 2,926.7 MB would be downloaded Garbage collection statistics: heap size: 119.37 MiB allocated: 849.56 MiB GC times: 30 time spent in GC: 2.88 seconds (30% of user time) real 0m8.987s user 0m9.482s sys 0m0.186s $ $ # without cutoff $ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n [...] 3,402.5 MB would be downloaded Garbage collection statistics: heap size: 127.37 MiB allocated: 1504.59 MiB GC times: 46 time spent in GC: 5.31 seconds (29% of user time) real 0m21.616s user 0m18.082s sys 0m0.255s --8<---------------cut here---------------end--------------->8--- You can tell that, without the cutoff (2nd run), the command more accurately finds out what it’s going to have to download since the “would be downloaded” figure is higher; with the cutoff (1st run), it “sees less” of what it’s going to end up downloading. This could be annoying from a UI viewpoint in “extreme” cases like this one, but most likely the difference would be invisible in many cases. More importantly, having this cutoff halves the execution time, which is nice. The command initially given in this bug report now behaves like this: --8<---------------cut here---------------start------------->8--- $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths -n -v2 41.8 MB would be downloaded: /gnu/store/difgj9gf8l7y5bsilcl42x2vzgh493c6-r-utf8-1.2.2 /gnu/store/z4rpk1sn3jhy8brsr30zccln8mira3z5-r-purrr-0.3.4 /gnu/store/nkiga9rfd8a9gdfsp2rjcqz39s8p2hg3-r-magrittr-2.0.1 /gnu/store/c5y0xlw1fbcwa5p5qnk4xji286hd7cqk-r-vctrs-0.3.8 /gnu/store/86vz257dqy4vm6s7axq7zl2jqxacgngf-r-ellipsis-0.3.2 /gnu/store/m2m7h497g5n6vdrp5pxsnr2v8hsriq5f-r-glue-1.4.2 /gnu/store/0zv1sl91fz3ddq8namdfvf6yr0j1p2vx-texlive-bin-20190410 /gnu/store/7ks38sv5cpg7x76g2fdp9cra3x7dpbyq-texlive-union-51265 /gnu/store/vkdjhkl5s025jsjy7jrr9q7kxlsi2sc4-r-minimal-4.1.0 /gnu/store/pysc6ixb5fj1m2n50dac6aq5lgd5bnv4-r-rlang-0.4.11 Garbage collection statistics: heap size: 127.31 MiB allocated: 621.97 MiB GC times: 24 time spent in GC: 2.82 seconds (37% of user time) real 0m6.493s user 0m7.584s sys 0m0.117s --8<---------------cut here---------------end--------------->8--- This time it completes, which is already an improvement ;-). The 41.8 MB download reported is underestimated, but again that’s the tradeoff we’re making. Attached is the patch. I’ll go ahead with that if there are no objections. > For those interested, I've compiled a list of the top 60 > highest-dependency packages, as reported by package-closure: > > pigx 1630 > nextcloud-client 1539 > akregator 1486 > kmail 1484 > korganizer 1481 > kdepim-runtime 1480 > kincidenceeditor 1478 > keventviews 1477 > kmailcommon 1476 > kcalendarsupport 1475 Nice! I really need to stop taking ‘libreoffice’ as an example. Thanks, Ludo’. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-patch, Size: 2026 bytes --] diff --git a/guix/store.scm b/guix/store.scm index 1ab2b08b47..340ad8bc10 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -1358,11 +1358,27 @@ on the build output of a previous derivation." (define (map/accumulate-builds store proc lst) "Apply PROC over each element of LST, accumulating 'build-things' calls and coalescing them into a single call." - (define result - (map (lambda (obj) - (with-build-handler build-accumulator - (proc obj))) - lst)) + (define accumulation-cutoff + ;; Threshold above which we stop accumulating unresolved nodes to avoid + ;; pessimal behavior. See <https://bugs.gnu.org/49439>. + 30) + + (define-values (result rest) + (let loop ((lst lst) + (result '()) + (unresolved 0)) + (match lst + ((head . tail) + (match (with-build-handler build-accumulator + (proc head)) + ((? unresolved? obj) + (if (> unresolved accumulation-cutoff) + (values (reverse (cons obj result)) tail) + (loop tail (cons obj result) (+ 1 unresolved)))) + (obj + (loop tail (cons obj result) unresolved)))) + (() + (values (reverse result) lst))))) (match (append-map (lambda (obj) (if (unresolved? obj) @@ -1370,6 +1386,7 @@ coalescing them into a single call." '())) result) (() + ;; REST is necessarily empty. result) (to-build ;; We've accumulated things TO-BUILD. Actually build them and resume the @@ -1382,7 +1399,7 @@ coalescing them into a single call." ;; unnecessary. ((unresolved-continuation obj) #f) obj)) - result)))) + (append result rest))))) (define build-things (let ((build (operation (build-things (string-list things) ^ permalink raw reply related [flat|nested] 10+ messages in thread
* bug#49439: grafts cause “guix environment” to get killed with OOM 2021-08-09 21:26 ` Ludovic Courtès @ 2021-08-10 15:36 ` Ludovic Courtès 0 siblings, 0 replies; 10+ messages in thread From: Ludovic Courtès @ 2021-08-10 15:36 UTC (permalink / raw) To: Sarah Morgensen; +Cc: Ricardo Wurmus, 49439-done Hello! Ludovic Courtès <ludo@gnu.org> skribis: > I added counters and ‘pk’ calls in ‘map/accumulate-builds’ & co. to see > what happens. AFAICS, a cutoff as in the attached patch does the job. > It’s a basic heuristic to avoid unbounded growth, at the expense of > possibly reduced accumulation. For example, here’s the effect on my > user profile with 300+ packages: Pushed as fa81971cbae85b39183ccf8f51e8d96ac88fb4ac. I saw your message on IRC, Sarah; note that because grafts are “dynamic dependencies” (they depend on previous build results), we cannot know in advance which grafts we’re going to build, so there’s necessarily at least a second phase during which grafts get built. See <https://guix.gnu.org/en/blog/2020/grafts-continued/> for background. Thanks! Ludo’. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-08-10 15:37 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-07-06 14:38 bug#49439: grafts cause “guix environment” to get killed with OOM Ricardo Wurmus 2021-07-23 4:59 ` Sarah Morgensen 2021-07-27 14:52 ` Julien Lepiller 2021-07-27 16:28 ` Ludovic Courtès 2021-07-27 23:35 ` Sarah Morgensen 2021-07-28 10:00 ` Ludovic Courtès 2021-07-28 22:03 ` Ludovic Courtès 2021-07-29 3:20 ` Sarah Morgensen 2021-08-09 21:26 ` Ludovic Courtès 2021-08-10 15:36 ` Ludovic Courtès
Code repositories for project(s) associated with this external index https://git.savannah.gnu.org/cgit/guix.git This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.