I figured out the race condition in the tests. The previous test was still running when the failing test started: the joys of using a shared Emacs for running all of the tests in one file. The attached diff is split into the commits that introduce the tests in question in my working series, but you should be able to just apply it on top of the posted series if you want.