* bug#44254: Performance of package input rewriting
From: Lars-Dominik Braun @ 2020-10-27 13:26 UTC
To: 44254

Hi,

this issue is similar to https://issues.guix.gnu.org/41702, but I’m not
sure it’s exactly the same. For guix-science I’m trying to provide some
packages like python-jupyterlab, which depend on a mix of packages from
guix proper and newer versions of packages already included in guix
proper. Thus I need to rewrite inputs of the former to the latter. (This
is necessary because Python packages only propagate their dependencies,
so collisions would otherwise occur.)

Previously I have been doing this using package-input-rewriting, but
starting an environment containing python-jupyterlab alone took about
20s (warm caches, all derivations in the store). Manually rewriting
inputs by inheriting and alist-delete’ing brings this down to 3s, which
is pretty significant. --no-grafts does not have much of an impact here
(15s vs. 2s). See
https://github.com/guix-science/guix-science/commit/972795a23cc9eb5a0bb1a2ffb5681d151fc4d4b0
for the exact changes.

My expectation would be that package-input-rewriting is the preferred
solution to this problem, because it is the easier one, and thus should
have minimal impact on performance.

Cheers,
Lars

* bug#44254: Performance of package input rewriting
From: zimoun @ 2020-10-27 14:14 UTC
To: Lars-Dominik Braun, 44254

Hi Lars,

On Tue, 27 Oct 2020 at 14:26, Lars-Dominik Braun <ldb@leibniz-psychology.org> wrote:

> Previously I have been doing this using package-input-rewriting, but starting
> an environment containing python-jupyterlab alone took about 20s (warm caches,
> all derivations in the store). Manually rewriting inputs by inheriting and
> alist-delete’ing brings this down to 3s, which is pretty significant.
> --no-grafts has not much of an impact (15s vs 2s) here. See
> https://github.com/guix-science/guix-science/commit/972795a23cc9eb5a0bb1a2ffb5681d151fc4d4b0
> for the exact changes.

Is it not related to “#:deep? #t” being the default? The default used to
be #f.

Well, using ‘inherit’ only rewrites the direct explicit dependencies.
However, ‘package-input-rewriting’ traverses the whole dependency graph
and replaces packages accordingly.

Maybe I misunderstand.

All the best,
simon

* bug#44254: Performance of package input rewriting
From: Ludovic Courtès @ 2020-10-28 14:19 UTC
To: zimoun
Cc: 44254, Lars-Dominik Braun

Hi,

zimoun <zimon.toutoune@gmail.com> writes:

> On Tue, 27 Oct 2020 at 14:26, Lars-Dominik Braun <ldb@leibniz-psychology.org> wrote:
>
>> Previously I have been doing this using package-input-rewriting, but starting
>> an environment containing python-jupyterlab alone took about 20s (warm caches,
>> all derivations in the store). Manually rewriting inputs by inheriting and
>> alist-delete’ing brings this down to 3s, which is pretty significant.
>> --no-grafts has not much of an impact (15s vs 2s) here. See
>> https://github.com/guix-science/guix-science/commit/972795a23cc9eb5a0bb1a2ffb5681d151fc4d4b0
>> for the exact changes.
>
> Is it not related to “#:deep? #t” by default? The default was #f.

Yes, that’s a possible culprit. Try passing #:deep? #f if it works for
your use case.

Another thing to look at is the <package> object graph (as shown by
‘guix graph’). Input rewriting can duplicate parts of the graph, which
in turn defeats package->derivation memoization. Just looking at the
number of nodes in the graph can give hints.

Like Ricardo wrote, it’d be great if you could share a short reproducer.

Thanks,
Ludo’.
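
The `#:deep? #f` suggestion above could look roughly like the following.
This is a minimal sketch, not the Guix-Science code: `shallow-rewriting`
is a hypothetical helper, and `old` and `new` stand for whatever
<package> objects are being swapped.

```scheme
(use-modules (guix packages))

;; Hypothetical helper: OLD and NEW are <package> objects chosen by the
;; caller.
(define (shallow-rewriting old new)
  ;; With #:deep? #f, only explicit inputs are rewritten; implicit inputs
  ;; (those contributed by the build system) are left untouched.
  (package-input-rewriting `((,old . ,new)) #:deep? #f))

;; Usage (python-foo and python-foo-next are placeholder names): build the
;; procedure once and apply that same procedure to every package that
;; needs the replacement.
;; (define transform (shallow-rewriting python-foo python-foo-next))
;; (define a (transform p1))
;; (define b (transform p2))
```

Whether skipping implicit inputs is acceptable depends on the use case,
as noted above.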

* bug#44254: Performance of package input rewriting
From: Ricardo Wurmus @ 2020-10-27 19:58 UTC
To: Lars-Dominik Braun
Cc: 44254

Lars-Dominik Braun <ldb@leibniz-psychology.org> writes:

> this issue is similar to https://issues.guix.gnu.org/41702, but I’m not sure
> it’s exactly the same. For guix-science I’m trying to provide some packages
> like python-jupyterlab, which depend on a mix of packages from guix proper and
> newer versions of packages already included in guix proper. Thus I need to
> rewrite inputs of the former to the latter. (Because Python only propagates
> dependencies and thus collisions would occur.)
>
> Previously I have been doing this using package-input-rewriting, but starting
> an environment containing python-jupyterlab alone took about 20s (warm caches,
> all derivations in the store). Manually rewriting inputs by inheriting and
> alist-delete’ing brings this down to 3s, which is pretty significant.

Could you show us a concrete example?

Input rewriting is recursive and will traverse the whole package graph
by default, even if you *know* that, say, GCC doesn’t need to be
rewritten.

For the more generic “package-mapping” you can provide a “cut?”
procedure to determine when to stop recursion. Perhaps this would make
things faster in your case?

-- 
Ricardo
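
A rough sketch of the “package-mapping” plus “cut?” idea might look like
this. The helper name `rewrite-python-inputs` is hypothetical, and the
cut-off rule (stop at anything not built with python-build-system) is
only an assumption chosen for illustration; it would need checking
against the actual graph.

```scheme
(use-modules (guix packages)
             (guix build-system python))

;; Hypothetical helper: return a procedure that replaces OLD with NEW in
;; the dependency graph of the package it is applied to.
(define (rewrite-python-inputs old new)
  (package-mapping
   ;; PROC: applied to each package visited during the traversal.
   (lambda (p)
     (if (eq? p old) new p))
   ;; CUT?: when this returns true, the package is returned unchanged and
   ;; recursion stops there.  Assumption: packages not built with
   ;; python-build-system never need rewriting.
   (lambda (p)
     (not (eq? (package-build-system p) python-build-system)))))
```

With this particular predicate, a Python package that is reachable only
through a non-Python package would not be rewritten, so the cut-off has
to match what is known about the graph.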

* bug#44254: Performance of package input rewriting
From: Lars-Dominik Braun @ 2020-10-30 8:42 UTC
To: Ricardo Wurmus, Ludovic Courtès
Cc: 44254

Hi,

> Yes, that’s a possible culprit. Try passing #:deep? #f if it works for
> your use case.
Yeah, that brings it down to ~8s, which is still a lot.

> Another thing to look at is the <package> object graph (as shown by ‘guix
> graph’). Input rewriting can duplicate parts of the graph, which in
> turn defeats package->derivation memoization. Just looking at the
> number of nodes in the graph can give hints.
Aha, it’s 913 nodes without rewriting, 13916 with rewriting (#:deep? #t)
and 4286 with rewriting (#:deep? #f), as determined by a rather ad-hoc

  guix graph -L . -t package python-jupyterlab | grep 'shape = box' | wc -l

That seems way too much. Does that mean I’m using package rewriting in
the wrong way or is that a bug?

Unfortunately I don’t have a short reproducer right now. I’ll look at
the graph more closely to figure out which parts are actually
duplicated. Maybe I can create a reproducing test case with more
information.

Cheers,
Lars
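
The node count above can also be approximated from Scheme. The following
is a sketch under assumptions: `count-package-nodes` is a hypothetical
helper, it follows explicit inputs only, and it is not claimed to give
exactly the same number as the `guix graph` pipeline. Two <package>
objects that are equal but not `eq?` count as two nodes, which is what
makes duplication introduced by rewriting visible.

```scheme
(use-modules (guix packages)
             (ice-9 match)
             (srfi srfi-1))

;; Hypothetical helper: count the distinct (in the sense of eq?) <package>
;; objects reachable from ROOT through explicit inputs.
(define (count-package-nodes root)
  (define seen (make-hash-table))
  (let loop ((p root))
    (unless (hashq-ref seen p)
      (hashq-set! seen p #t)
      (for-each loop
                ;; Inputs are (LABEL OBJECT [OUTPUT]) entries; keep only
                ;; those whose object is a package (not an origin, etc.).
                (filter-map (match-lambda
                              ((_ (? package? dep) . _) dep)
                              (_ #f))
                            (package-direct-inputs p)))))
  (hash-count (const #t) seen))
```

Comparing the result for a package before and after rewriting gives a
quick indication of how much of the graph was duplicated.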

* bug#44254: Performance of package input rewriting
From: Ludovic Courtès @ 2020-10-31 10:27 UTC
To: Lars-Dominik Braun
Cc: 44254

Hi Lars,

Lars-Dominik Braun <ldb@leibniz-psychology.org> writes:

>> Another thing to look at is the <package> object graph (as shown by ‘guix
>> graph’). Input rewriting can duplicate parts of the graph, which in
>> turn defeats package->derivation memoization. Just looking at the
>> number of nodes in the graph can give hints.
> Aha, it’s 913 nodes without rewriting, 13916 with rewriting (#:deep? #t) and
> 4286 with rewriting (#:deep? #f) as determined by a rather ad-hoc `guix graph
> -L . -t package python-jupyterlab | grep 'shape = box' | wc -l`. That seems way
> too much. Does that mean I’m using package rewriting in the wrong way or is
> that a bug?

It could be a mixture thereof. :-)

I guess it’s easy to end up creating huge object graphs. Here’s an
example of an anti-pattern:

  (define a
    ((package-input-rewriting x) ((package-input-rewriting y) p1)))

  (define b
    ((package-input-rewriting x) ((package-input-rewriting y) p2)))

The correct use is:

  (define transform
    (package-input-rewriting (append x y)))

  (define a (transform p1))
  (define b (transform p2))

That guarantees that ‘a’ and ‘b’ share most of the nodes of their object
graph.

From a quick look, the code in Guix-Science seemed to be following the
anti-pattern above. For example, there’s:

--8<---------------cut here---------------start------------->8---
(define python-ipykernel-5.3-bootstrap
  (let ((rewritten ((package-input-rewriting
                     `((,python-jupyter-client . ,python-jupyter-client-6.1-bootstrap)
                       ;; Indirect through IPython.
                       (,python-testpath . ,python-testpath-0.4)
                       (,python-nbformat . ,python-nbformat-5.0)))
                    python-ipykernel-5.3-proper)))
    (package
      (inherit rewritten)
      (name "python-ipykernel-bootstrap"))))

(define-public python-jupyter-client-6.1
  ((package-input-rewriting
    `((,python-ipykernel . ,python-ipykernel-5.3-bootstrap)
      (,python-jupyter-core . ,python-jupyter-core-4.6)
      ;; Indirect through IPython.
      (,python-testpath . ,python-testpath-0.4)
      (,python-nbformat . ,python-nbformat-5.0)))
   python-jupyter-client-6.1-proper))

(define-public python-ipykernel-5.3
  ((package-input-rewriting
    `((,python-jupyter-client . ,python-jupyter-client-6.1)
      ;; Indirect through IPython.
      (,python-testpath . ,python-testpath-0.4)
      (,python-nbformat . ,python-nbformat-5.0)))
   python-ipykernel-5.3-proper))
--8<---------------cut here---------------end--------------->8---

It seems to me that you’re redefining a dependency graph, node by node.
Thus, you probably don’t need ‘package-input-rewriting’ here. What you
did in Guix-Science commit 972795a23cc9eb5a0bb1a2ffb5681d151fc4d4b0
looks more appropriate to me, in terms of style and semantics.

Does that make sense?

Thanks,
Ludo’.
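
For comparison, the node-by-node approach (inheriting and
alist-delete’ing) could be sketched as follows. This is not the code
from the Guix-Science commit: `replace-propagated-input` is a
hypothetical helper, and it assumes label-style inputs, i.e. a list of
("label" package) entries.

```scheme
(use-modules (guix packages)
             (srfi srfi-1))

;; Hypothetical helper: return a variant of BASE whose propagated input
;; labelled LABEL is replaced by NEW.  Only this one node changes; the
;; other inputs of BASE are reused as-is, so no part of the graph is
;; copied.
(define (replace-propagated-input base label new)
  (package
    (inherit base)
    (propagated-inputs
     `((,label ,new)
       ,@(alist-delete label (package-propagated-inputs base))))))
```

The trade-off is that indirect dependencies still referring to the old
package have to be handled explicitly, one variant at a time.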

* bug#44254: Performance of package input rewriting
From: Lars-Dominik Braun @ 2020-11-03 8:23 UTC
To: Ludovic Courtès
Cc: 44254

Hi Ludo,

> I guess it’s easy to end up creating huge object graphs. Here’s an
> example of an anti-pattern:
>
>   (define a
>     ((package-input-rewriting x) ((package-input-rewriting y) p1)))
>
>   (define b
>     ((package-input-rewriting x) ((package-input-rewriting y) p2)))
>
> The correct use is:
>
>   (define transform
>     (package-input-rewriting (append x y)))
>
>   (define a (transform p1))
>   (define b (transform p2))
That sounds like a section for the cookbook :)

> It seems to me that you’re redefining a dependency graph, node by node.
> Thus, you probably don’t need ‘package-input-rewriting’ here. What you
> did in Guix-Science commit 972795a23cc9eb5a0bb1a2ffb5681d151fc4d4b0
> looks more appropriate to me, in terms of style and semantics.
Okay, got it. My initial concern was that rewriting the graph “by hand”
(i.e. alist-delete) would be tedious and error-prone.

Thank you very much,
Lars

-- 
Lars-Dominik Braun
Wissenschaftlicher Mitarbeiter/Research Associate

www.leibniz-psychology.org
ZPID - Leibniz-Institut für Psychologie /
ZPID - Leibniz Institute for Psychology
Universitätsring 15
D-54296 Trier - Germany
Tel.: +49–651–201-4964

* bug#44254: Performance of package input rewriting
From: Ludovic Courtès @ 2020-11-03 9:32 UTC
To: Lars-Dominik Braun
Cc: 44254

Hi,

Lars-Dominik Braun <ldb@leibniz-psychology.org> writes:

>> I guess it’s easy to end up creating huge object graphs. Here’s an
>> example of an anti-pattern:
>>
>>   (define a
>>     ((package-input-rewriting x) ((package-input-rewriting y) p1)))
>>
>>   (define b
>>     ((package-input-rewriting x) ((package-input-rewriting y) p2)))
>>
>> The correct use is:
>>
>>   (define transform
>>     (package-input-rewriting (append x y)))
>>
>>   (define a (transform p1))
>>   (define b (transform p2))
> that sounds like a section for the cookbook :)

Note that there’s a new section in the manual on this topic:

  https://guix.gnu.org/manual/devel/en/html_node/Defining-Package-Variants.html

>> It seems to me that you’re redefining a dependency graph, node by node.
>> Thus, you probably don’t need ‘package-input-rewriting’ here. What you
>> did in Guix-Science commit 972795a23cc9eb5a0bb1a2ffb5681d151fc4d4b0
>> looks more appropriate to me, in terms of style and semantics.
> Okay, got it. My initial concern was that rewriting the graph “by hand” (i.e.
> alist-delete) would be tedious and error-prone.

I haven’t looked closely enough. If you can define a single procedure
that rewrites the graph, that’s of course better than rewriting nodes
one by one. Maybe that’s possible, but you need to be careful about
factorizing the transformation procedure as I showed above.

Thanks,
Ludo’.