Efraim Flashner writes:

> [[PGP Signed Part:Signature made by expired key 41AAE7DCCA3D8351 Efraim Flashner ]]
> On Fri, Jan 05, 2024 at 04:41:14PM +0000, Christopher Baines wrote:
>>
>> Ludovic Courtès writes:
>>
>> > Hi,
>> >
>> > Christopher Baines skribis:
>> >
>> >> When asked by the data service, it seems to take Guix around 3 minutes
>> >> to compute cross derivations for all packages (to a single
>> >> target). Here's a simple script that replicates this:
>>
>> ...
>>
>> > One idiom that defeats caching is:
>> >
>> >   (define (make-me-a-package x y z)
>> >     (package
>> >       …))
>> >
>> > Such a procedure returns a fresh package every time it’s called,
>> > preventing caching from happening (because cache entries are compared
>> > with ‘eq?’). That typically leads to lower hit rates.
>> >
>> > Anyway, lots of words to say that I don’t see anything immediately
>> > obvious with cross-compilation, yet I wouldn’t be surprised if some of
>> > these cache-defeating idioms were used because we’ve payed less
>> > attention to this.
>>
>> I've got a feeling that performance has got worse since I looked at this
>> originally, I've finally got around to having a further look.
>>
>> I spent some time looking at various metrics, but it was most useful to
>> just write the cache keys of various types to files and have a read.
>>
>> The cross-base module was causing many issues, as all but one of the
>> procedures there produced new package records each time. There is also
>> make-rust-sysroot which showed up.
>>
>> I've sent some patches as #68266 to add memoization to avoid this, and
>> that seems to speed things up.
>>
>> Looking at other things in the cache, I think there are some issues with
>> file-append and local-file. The use of file-append in svn-fetch and
>> local-file in the lower procedure in the python build system both bloat
>> the cache for example, although I'm less sure about how to address these
>> cases.
>>
>> One thing I am sure about though, is that these problems will come
>> back. Maybe we could add some reporting in to Guix to look through the
>> cache at the keys, lower them all and check for equivalence. That way it
>> should be possible to automate saying that having [1] in the cache
>> several thousand times is unhelpful. The data service could then run
>> this reporting and store it.
>>
>> 1: # "/bin/svn">
>
> I grabbed the patch for make-rust-sysroot to try it out:
> Native builds:
> time GUIX_PROFILING="object-cache" ./pre-inst-env guix build --no-grafts $(./pre-inst-env ~/list-all-cargo-build-system-packages | grep rust- | head -n 100) -d

...

> That's a massive drop in the size of the cache and a big decrease in the
> amount of time it took to calculate those 100 items.

I think you're right. While I sent some other changes in #68266, I think
it's this change around make-rust-sysroot that accounts for pretty much
all of the effect on performance. I think the tens of thousands of
duplicated packages from cross-base that I was looking at are almost
entirely coming from make-rust-sysroot.

As Ludo mentions in [1], maybe this has something to do with the use of
cross- procedures in native-inputs, although I'm not sure that moving
those calls out of native-inputs is the correct thing to do.

I don't know what the correct approach is, but I think something needs
doing to address the performance regression.

1: https://lists.gnu.org/archive/html/guix-patches/2024-01/msg00733.html
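
For anyone following along, the cache-defeating idiom quoted above can be reproduced outside of Guix. What follows is only a minimal sketch in plain Guile, not the code from #68266: the <pkg> record, make-cross-pkg, make-cross-pkg/memoized, object-cache, lower! and cache-size are all hypothetical names standing in for package records, a cross- procedure and the eq?-keyed object cache.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical stand-in for a Guix package record.
(define-record-type <pkg>
  (make-pkg name target)
  pkg?
  (name pkg-name)
  (target pkg-target))

;; Returns a fresh record on every call, so two calls with the same
;; arguments yield objects that are not eq? to each other.
(define (make-cross-pkg name target)
  (make-pkg name target))

;; Memoized variant: arguments are compared with equal?, so the same
;; arguments always return the very same record.
(define make-cross-pkg/memoized
  (let ((cache (make-hash-table)))
    (lambda (name target)
      (let ((key (list name target)))
        (or (hash-ref cache key)
            (let ((pkg (make-pkg name target)))
              (hash-set! cache key pkg)
              pkg))))))

;; Hypothetical identity-keyed cache, compared with eq? (hashq).
(define object-cache (make-hash-table))

(define (lower! pkg)
  ;; Pretend to lower PKG, caching the result by object identity.
  (or (hashq-ref object-cache pkg)
      (begin
        (hashq-set! object-cache pkg 'derivation)
        'derivation)))

(define (cache-size)
  ;; Count the entries currently in the identity-keyed cache.
  (hash-fold (lambda (key value count) (+ count 1)) 0 object-cache))

;; Fresh records: each call adds a new entry for the "same" package.
(lower! (make-cross-pkg "gcc" "aarch64-linux-gnu"))
(lower! (make-cross-pkg "gcc" "aarch64-linux-gnu"))

;; Memoized records: the second call hits the existing entry.
(lower! (make-cross-pkg/memoized "gcc" "aarch64-linux-gnu"))
(lower! (make-cross-pkg/memoized "gcc" "aarch64-linux-gnu"))

(display (cache-size))
(newline)
```

The two unmemoized calls leave two entries in the cache, while the two memoized calls share one. In Guix itself the helpers in (guix memoization), such as memoize and mlambda, would presumably be the natural tool rather than a hand-rolled table like this one.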