From 2030f198892f972ee6bc2fb8529cbd6afb6bc9ad Mon Sep 17 00:00:00 2001 From: Reepca Russelstein Date: Tue, 24 Dec 2024 05:40:58 -0600 Subject: [PATCH] nix: build.cc: explicitly unlock in the has-become-valid case. (hopefully) fixes #31785. Similar to Nix issue #178, fixed in https://github.com/NixOS/nix/commit/29cde917fe6b8f2e669c8bf10b38f640045c83b8. We can't rely on Goal deletion to release our locks in a timely manner. In the case in which multiple guix-daemon processes simultaneously try producing an output path path1, the one that gets there first (P1) will get the lock, and the second one (P2) will continue trying to acquire the lock until it is released. Once it has acquired the lock, it checks to see whether the path has already become valid in the meantime, and if so it reports success to those Goals waiting on its completion and finishes. Unfortunately, it fails to release the locks it holds first, so those stay held until that Goal gets deleted. Suppose we have the following store path dependency graph: path4 / | \ path1 path2 path3 P2 is now sitting on path1's lock, and will continue to do so until path4 is completed. Suppose there is also a P3, and it has been blocked while P1 builds path2. Now P3 is sitting on path2's lock, and can't acquire path1's lock to determine that it has been completed. Likewise P2 is sitting on path1's lock, and now can't acquire path2's lock to determine that it has been completed. Finally, P3 completes path3 while P2 is blocked. Now: - P1 knows that path1 and path2 are complete, and holds no locks, but can't determine that path3 is complete - P2 knows that path1 and path3 are complete, and holds locks on path1 and path3, but can't determine that path2 is complete - P3 knows that path2 and path3 are complete, and holds a lock on path2, but can't determine that path1 is complete And none of these locks will be released until path4 is complete. Thus, we have a deadlock. To resolve this, we should explicitly release these locks as soon as they should be released. * nix/libstore/build.cc (DerivationGoal::tryToBuild, SubstitutionGoal::tryToRun): Explicitly release locks in the has-become-valid case. Change-Id: Ie510f84828892315fe6776c830db33d0f70bcef8 --- nix/libstore/build.cc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc index 43a8a371846..edd01bab34d 100644 --- a/nix/libstore/build.cc +++ b/nix/libstore/build.cc @@ -1200,6 +1200,7 @@ void DerivationGoal::tryToBuild() if (buildMode != bmCheck && validPaths.size() == drv.outputs.size()) { debug(format("skipping build of derivation `%1%', someone beat us to it") % drvPath); outputLocks.setDeletion(true); + outputLocks.unlock(); done(BuildResult::AlreadyValid); return; } @@ -3070,6 +3071,7 @@ void SubstitutionGoal::tryToRun() if (!repair && worker.store.isValidPath(storePath)) { debug(format("store path `%1%' has become valid") % storePath); outputLock->setDeletion(true); + outputLock.reset(); amDone(ecSuccess); return; } -- 2.45.2