Hey hey, comrades!

I have a fix for some (most?) of the crashes we were seeing while running multi-threaded code such as (guix build compile), and, presumably, the grafting code mentioned at the beginning of this bug report, although I haven’t checked yet.

So, ‘scm_i_vm_mark_stack’ marks the stack precisely, but contrary to what I suspected, precise marking is not at fault. Instead, the problem has to do with the fact that some VM instructions change the frame pointer (vp->fp) before they have set up the dynamic link for that new frame.

As a consequence, if a stop-the-world GC is triggered after vp->fp has been changed and before its dynamic link has been set, the stack-walking loop in ‘scm_i_vm_mark_stack’ could stop very early, leaving a lot of objects unmarked.

The patch below fixes the problem for me. \o/

I’m thinking we could perhaps add a compiler barrier before ‘vp->fp = new_fp’ statements, but in practice it’s not necessary here (x86_64, gcc 7).

Thoughts?

I’d like to push this real soon. I’ll also do more testing on real workloads from Guix, and then I’d like to release 2.2.4, hopefully within a few days.

Thank you and thanks Andy for the discussions on IRC!

Ludo’, who’s going to party all night long. :-)
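
For illustration only (this is not the patch itself): a toy model in C of the ordering problem described above. Everything in it is made up for the example and is not Guile's actual VM code: ‘struct toy_vm’, ‘END_OF_CHAIN’, the index-based dynamic link, and ‘simulated_gc’ are all hypothetical. It also assumes that the not-yet-initialized link slot happens to look like the end of the frame chain, which is one way the stack walk could stop early.

```c
#include <assert.h>

#define STACK_WORDS   64
#define FRAME_WORDS   4      /* slot 0 of each frame holds the dynamic link */
#define END_OF_CHAIN  (-1)   /* link value of the bottom-most frame */

/* Hypothetical, heavily simplified VM: frames are linked by stack index.  */
struct toy_vm
{
  int stack[STACK_WORDS];
  int fp;                    /* index of the current frame */
};

static int frames_seen_by_gc;

/* Walk the frame chain the way a marker would, counting visited frames.  */
static int
walk_frames (const struct toy_vm *vm)
{
  int count = 0;
  for (int fp = vm->fp; fp != END_OF_CHAIN; fp = vm->stack[fp])
    {
      assert (fp >= 0 && fp < STACK_WORDS);
      count++;
    }
  return count;
}

/* Stand-in for a stop-the-world GC that marks the stack right now.  */
static void
simulated_gc (const struct toy_vm *vm)
{
  frames_seen_by_gc = walk_frames (vm);
}

static void
toy_vm_init (struct toy_vm *vm)
{
  /* Assumption: unused slots look like "end of chain" to the walker.  */
  for (int i = 0; i < STACK_WORDS; i++)
    vm->stack[i] = END_OF_CHAIN;
  vm->fp = 0;                /* bottom frame; its link is END_OF_CHAIN */
}

/* Problematic order: fp is published before the dynamic link is set.  */
static void
push_frame_racy (struct toy_vm *vm)
{
  int old_fp = vm->fp;
  int new_fp = old_fp + FRAME_WORDS;
  assert (new_fp < STACK_WORDS);
  vm->fp = new_fp;
  simulated_gc (vm);         /* GC strikes inside the window */
  vm->stack[new_fp] = old_fp;
}

/* Safe order: set the dynamic link first, then publish the new fp.
   Real code might also want a compiler barrier between the two stores.  */
static void
push_frame_fixed (struct toy_vm *vm)
{
  int old_fp = vm->fp;
  int new_fp = old_fp + FRAME_WORDS;
  assert (new_fp < STACK_WORDS);
  vm->stack[new_fp] = old_fp;
  vm->fp = new_fp;
  simulated_gc (vm);         /* GC at the same point now sees every frame */
}

/* Build a 4-frame stack; return how many frames the GC saw during the
   last push, done in the racy or in the fixed order.  */
static int
frames_seen_after (int racy)
{
  struct toy_vm vm;
  toy_vm_init (&vm);
  push_frame_fixed (&vm);
  push_frame_fixed (&vm);
  if (racy)
    push_frame_racy (&vm);
  else
    push_frame_fixed (&vm);
  return frames_seen_by_gc;
}
```

Calling ‘frames_seen_after (1)’ from a ‘main’ shows the walker visiting only the newest frame, while ‘frames_seen_after (0)’ visits all four; in a real collector, the frames that were skipped are the ones whose objects would be left unmarked.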