Ludovic Courtès writes:

> Hi Christopher,
>
> Christopher Baines skribis:
>
>> I've attached a script that when run should reproduce the issue. I
>> extracted the code relating to lint warnings from the Guix Data
>> Service. The script attached runs this code twice against the inferior,
>> once will often be enough to cause it to crash, but twice should
>> reproduce it more reliably.
>
> Thanks a lot.
>
> Here’s a backtrace from the core dumped by the inferior:

...

> It could be an unbounded growth of libgc’s finalizer table or our weak
> tables as we experienced in .
>
> We should be able to reproduce it with something like:
>
>   guix time-machine --commit=d523eb5c9c2659cbbaf4eeef3691234ae527ee6a -- \
>     lint -c inputs-should-be-native,license,mirror-url,source-file-name,source-unstable-tarball,derivation,patch-file-names,formatting,synopsis
>
> In top one can see that heap usage keeps growing, which may well be a
> bug in Guix proper rather than in Guile… but it doesn’t crash.
>
> I would propose three actions here:
>
>   1. Run linters under ‘gcprof’ to see what’s eating memory and hopefully
>      find and address the leak. As a start, maybe just start reducing
>      the list of checkers to see if there’s one of them that’s causing
>      it.
>
>      The ‘derivation’ checker is definitely responsible for a lot of the
>      heap consumption because of the various caches in (guix packages) &
>      co. Perhaps add calls to ‘invalidate-derivation-caches!’ as in
>      (gnu ci).
>
>   2. Work around the problem in Guix Data Service by running, say, one
>      inferior per checker instead of one inferior for all checkers for
>      all packages.
>
>   3. If #1 didn’t help, let’s see if we can isolate a Guile weak-table
>      bug or something like that.
>
> Thoughts?

Thanks, that's useful to know.

I think I've now managed to find a way of reproducing this without the
inferior getting in the way. I was testing whether triggering garbage
collection in Guile would help avoid the problem, but it actually seems
to cause it.
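On the first suggested action, narrowing down the list of checkers: a rough sketch of how that could be scripted is below. The commit and checker names are copied from the command quoted above; the `lint_commands` helper is only an illustrative name, and the script just prints one command per checker rather than running anything.

```shell
#!/bin/sh
# Sketch only: print one "guix lint" invocation per checker so that each
# checker can be run (and watched in top) on its own.
# The commit and the checker names come from the command quoted above.
commit=d523eb5c9c2659cbbaf4eeef3691234ae527ee6a
checkers="inputs-should-be-native license mirror-url source-file-name
source-unstable-tarball derivation patch-file-names formatting synopsis"

# lint_commands is a hypothetical helper name, not part of Guix.
lint_commands() {
  for checker in $checkers; do
    printf 'guix time-machine --commit=%s -- lint -c %s\n' \
      "$commit" "$checker"
  done
}

# Dry run: show the commands instead of executing them.
lint_commands
```

Each printed line can then be run by hand while watching heap usage, which should show whether a single checker (for example ‘derivation’) accounts for most of the growth.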
Given the mentions of GC in the above stacktrace, and the major version
change of libgc, some GC-related bug seems quite likely here.

I've been testing with a checkout of Guix built with Guix from the
core-updates branch. I think that provides the same broken Guile that
the guix repl is using. When trying to just use a checkout of the
core-updates branch, and guile built from that branch, I get the
following odd error:

  → ./pre-inst-env /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
  guile: warning: failed to install locale
  warning: failed to load '(gnu packages abiword)': Function not implemented
  error: git-fetch: unbound variable
  hint: Did you forget `(use-modules (guix git-download))'?
  error: git-version: unbound variable

No idea what's happening there, but when I ./configure and make with
packages from core-updates, I seem to end up with a setup that works.

This is the guile I'm using:

  /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile

If you just run the script, you should see:

  → ./pre-inst-env guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
  ;;; ("%package-table-setup" #)
  mmap(PROT_NONE) failed
  Aborted

For more information, you can pipe the script to the REPL. What you
should see is that it's slow to compute the lint warnings the first
time, but the subsequent times are quick, and it crashes in one of the
(gc) calls.

I'm going to keep looking into this. At least it'll be easier to delve
into guile now that I can directly control which guile is used.

Thanks,

Chris