From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp1 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id ted/KtaGnF8XBAAA0tVLHw (envelope-from ) for ; Fri, 30 Oct 2020 21:34:14 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp1 with LMTPS id YM7xJdaGnF90LQAAbx9fmQ (envelope-from ) for ; Fri, 30 Oct 2020 21:34:14 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id DEB97940237 for ; Fri, 30 Oct 2020 21:34:13 +0000 (UTC) Received: from localhost ([::1]:43496 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYc2G-0007MU-Oh for larch@yhetil.org; Fri, 30 Oct 2020 17:34:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:37598) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYc26-0007MJ-US for bug-guix@gnu.org; Fri, 30 Oct 2020 17:34:02 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:48459) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYc26-0007yi-LQ for bug-guix@gnu.org; Fri, 30 Oct 2020 17:34:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1kYc26-0002CC-IR for bug-guix@gnu.org; Fri, 30 Oct 2020 17:34:02 -0400 X-Loop: help-debbugs@gnu.org Subject: bug#44261: running a daemon with userns in relocateble pack breaks Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Fri, 30 Oct 2020 21:34:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 44261 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: Jan Nieuwenhuizen Received: via spool by 44261-submit@debbugs.gnu.org id=B44261.16040936348418 (code B ref 44261); Fri, 30 Oct 2020 21:34:02 +0000 Received: (at 44261) by debbugs.gnu.org; 30 Oct 2020 21:33:54 +0000 Received: from localhost ([127.0.0.1]:60003 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kYc1x-0002Bh-Sl for submit@debbugs.gnu.org; Fri, 30 Oct 2020 17:33:54 -0400 Received: from eggs.gnu.org ([209.51.188.92]:37496) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kYc1v-0002BU-Ns for 44261@debbugs.gnu.org; Fri, 30 Oct 2020 17:33:52 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:54943) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYc1p-0007xF-3i; Fri, 30 Oct 2020 17:33:45 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=55656 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kYc1o-0000gr-J1; Fri, 30 Oct 2020 17:33:44 -0400 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= References: <87blgn30w0.fsf@gnu.org> <875z6v2zz5.fsf@gnu.org> Date: Fri, 30 Oct 2020 22:33:42 +0100 In-Reply-To: <875z6v2zz5.fsf@gnu.org> (Jan Nieuwenhuizen's message of "Tue, 27 Oct 2020 21:09:02 +0100") Message-ID: <871rhfxutl.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-Spam-Score: -3.3 (---) X-BeenThere: bug-guix@gnu.org List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 44261@debbugs.gnu.org Errors-To: bug-guix-bounces+larch=yhetil.org@gnu.org Sender: "bug-Guix" X-Scanner: ns3122888.ip-94-23-21.eu Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of bug-guix-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=bug-guix-bounces@gnu.org X-Spam-Score: -1.51 X-TUID: qxdDXlbYoH+m --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hello! As discussed on IRC, my initial advice about MS_PRIVATE was misguided. The real issue is the =E2=80=9Crm_rf (new_root);=E2=80=9D call, which remov= es the root directory and thus leaves child processes (the daemon) with nothing. The attached patch adds a test loosely based on yours and a fix for that. The fix (for the =E2=80=9Cuserns=E2=80=9D engine) is to make NEW_ROO= T a tmpfs, such that upon completion, all we need to do is to unmount it and remove it; it lives on as the root file system of child processes. In the =E2=80=9Cfakechroot=E2=80=9D case, we have to leave NEW_ROOT behind,= which is not great but acceptable (it=E2=80=99s user-owned, #o700, and it=E2=80=99s unde= r /tmp). The test only checks the =E2=80=9Cuserns=E2=80=9D engine. If you confirm that it works for you and looks reasonable, we can apply it. Thanks, Ludo=E2=80=99. --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/gnu/packages/aux-files/run-in-namespace.c b/gnu/packages/aux-files/run-in-namespace.c index 52a16a5362..1d64ef9f44 100644 --- a/gnu/packages/aux-files/run-in-namespace.c +++ b/gnu/packages/aux-files/run-in-namespace.c @@ -41,6 +41,7 @@ #include #include #include +#include /* Whether we're building the ld.so/libfakechroot wrapper. */ #define HAVE_EXEC_WITH_LOADER \ @@ -258,11 +259,20 @@ exec_in_user_namespace (const char *store, int argc, char *argv[]) { /* Spawn @WRAPPED_PROGRAM@ in a separate namespace where STORE is bind-mounted in the right place. */ - int err; + int err, is_tmpfs; char *new_root = mkdtemp (strdup ("/tmp/guix-exec-XXXXXX")); char *new_store = concat (new_root, original_store); char *cwd = get_current_dir_name (); + /* Become the new parent of grand-children when their parent dies. */ + prctl (PR_SET_CHILD_SUBREAPER, 1); + + /* Optionally, make NEW_ROOT a tmpfs. That way, if we have to leave it + behind because there are sub-processes still running when this wrapper + exits, it's OK. */ + err = mount ("none", new_root, "tmpfs", 0, NULL); + is_tmpfs = (err == 0); + /* Create a child with separate namespaces and set up bind-mounts from there. That way, bind-mounts automatically disappear when the child exits, which simplifies cleanup for the parent. Note: clone is more @@ -300,6 +310,7 @@ exec_in_user_namespace (const char *store, int argc, char *argv[]) /* Failure: user namespaces not supported. */ fprintf (stderr, "%s: error: 'clone' failed: %m\n", argv[0]); rm_rf (new_root); + free (new_root); break; default: @@ -312,10 +323,27 @@ exec_in_user_namespace (const char *store, int argc, char *argv[]) write_id_map (child, "uid_map", getuid ()); write_id_map (child, "gid_map", getgid ()); - int status; + int status, status_other; waitpid (child, &status, 0); - chdir ("/"); /* avoid EBUSY */ - rm_rf (new_root); + + if (is_tmpfs) + { + /* NEW_ROOT lives on in child processes and we no longer need it + to exist as an empty directory in the global namespace. */ + umount (new_root); + rmdir (new_root); + } + /* Check whether there are child processes left. If there are none, + we can remove NEW_ROOT just fine. Conversely, if there are + processes left (for example because this wrapper's child forked), + we have to leave NEW_ROOT behind so that those processes can still + access their root file system (XXX). */ + else if (waitpid (-1 , &status_other, WNOHANG) == -1) + { + chdir ("/"); /* avoid EBUSY */ + rm_rf (new_root); + } + free (new_root); if (WIFEXITED (status)) @@ -490,6 +518,9 @@ exec_with_loader (const char *store, int argc, char *argv[]) setenv ("FAKECHROOT_BASE", new_root, 1); + /* Become the new parent of grand-children when their parent dies. */ + prctl (PR_SET_CHILD_SUBREAPER, 1); + pid_t child = fork (); switch (child) { @@ -507,11 +538,18 @@ exec_with_loader (const char *store, int argc, char *argv[]) default: { - int status; + int status, status_other; waitpid (child, &status, 0); - chdir ("/"); /* avoid EBUSY */ - rm_rf (new_root); - free (new_root); + + /* If there are child processes still running, leave NEW_ROOT around + so they can still access it. XXX: In that case NEW_ROOT is left + behind. */ + if (waitpid (-1 , &status_other, WNOHANG) == -1) + { + chdir ("/"); /* avoid EBUSY */ + rm_rf (new_root); + free (new_root); + } close (2); /* flushing stderr should be silent */ diff --git a/tests/guix-pack-relocatable.sh b/tests/guix-pack-relocatable.sh index a960ecd209..88cbe63b59 100644 --- a/tests/guix-pack-relocatable.sh +++ b/tests/guix-pack-relocatable.sh @@ -58,6 +58,19 @@ run_without_store () fi } +# Wait for the given file to show up. Error out if it doesn't show up in a +# timely fashion. +wait_for_file () +{ + i=0 + while ! test -f "$1" && test $i -lt 20 + do + sleep 0.3 + i=`expr $i + 1` + done + test -f "$1" +} + test_directory="`mktemp -d`" export test_directory trap 'chmod -Rf +w "$test_directory"; rm -rf "$test_directory"' EXIT @@ -129,6 +142,65 @@ case "`uname -m`" in ;; esac +if unshare -r true +then + # Check what happens if the wrapped binary forks and leaves child + # processes behind, like a daemon. The root file system should remain + # available to those child processes. See . + cat > "$test_directory/manifest.scm" <