From: "Thompson, David"
Subject: Re: [PATCH 0/15] Add preliminary support for Linux containers
Date: Wed, 8 Jul 2015 09:00:33 -0400
To: Ludovic Courtès
Cc: guix-devel

On Wed, Jul 8, 2015 at 8:46 AM, Ludovic Courtès wrote:
> "Thompson, David" skribis:
>
>> On Tue, Jul 7, 2015 at 6:28 AM, Ludovic Courtès wrote:
>
> [...]
>
>>>> (lambda ()
>>>>   (sethostname "guix-0.8.3"))
>>>
>>> Surprisingly, calling ‘getpid’ in the thunk returns the PID of the
>>> parent (I was expecting it to return 1.)  Not sure why that is the
>>> case.  I’m still amazed that this works as non-root, BTW.
>>
>> The first process created inside the PID namespace gets the honor of
>> being PID 1, not the process created with the 'clone' call.
>>
>> For more information, see: https://lwn.net/Articles/532748/
>
> To me, the thunk above is just like ‘childFunc’ in
> –i.e., it’s the procedure that ‘clone’
> calls in the first child process of the new PID name space.
>
> What am I missing?

It's non-intuitive because PID namespaces are given special treatment.
The cloned process is like PID 1 in the sense that if you fork, the
new process is PID 2.  However, if you call 'getpid' in the cloned
process, it returns the PID in the context of the parent PID
namespace, and you are expecting PID 1.

In that example from LWN, 'childFunc' calls 'execvp', and *that* new
process becomes PID 1 (and 'getpid' agrees).  This is the usual
pattern I see in all container implementations: The process that
calls clone sets up the environment and then execs the real init
system.

Is it more clear now?

>>> There’s an issue when the parent’s Guile is not mapped into the
>>> container’s file system: ‘use-modules’ forms and auto-loading will fail.
>>> For instance, I did (use-modules (ice-9 ftw)) in the parent and called
>>> ‘scandir’ in the child, but that failed because of an attempt to
>>> auto-load (ice-9 i18n), which is unavailable in the container.
>>
>> Hmm, I don't know of a way to deal with that other than the user being
>> careful to bind-mount in the Guile modules they need.
>
> Right.  Maybe the best we can do is to add a word of caution in the
> docstring or something.

Okay, I will do that.

>> Hmm, there's various reasons that EINVAL would be thrown.  Could you
>> readlink "those" files, that is /proc//ns/user
>> and /proc//ns/user, and tell me if the contents
>> are the same?  They shouldn't be, but this will eliminate one of the
>> possible causes of EINVAL.
>
> It turns out I was targeting the wrong PID.

Glad it's not totally broken on machines other than mine. :)

>>> Also, I think we should add --expose and --share as for ‘guix system’,
>>> though that can come later.
>>
>> Yes, I also really want that, but it's a task for another time.
>
> Sure.
>
>>>> Here's how you build it:
>>>>
>>>>     guix system container container.scm
>>>
>>> Very neat.  I wonder if that should automatically override the
>>> ‘file-systems’ field to be ‘%container-file-systems’, so that one can
>>> reuse existing OS declarations unmodified.  WDYT?
>>
>> This would be a better user experience, for sure.  I thought about
>> this, but I don't know how to do it in a way that isn't surprising or
>> just broken.  Ideas?
>
> IMO it’d be fine to simply override the subset of ‘file-systems’ that
> clashes with ‘%container-file-systems’, similar to what
> ‘virtualized-operating-system’ does in (gnu system vm).

I will implement that.  Thanks!

- Dave
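
For illustration, here is a minimal sketch of the override discussed
above: entries from the user's ‘file-systems’ that clash with the
container's list are dropped, and everything else is kept.  It is only a
sketch under assumptions.  The <fs> record, ‘override-file-systems’ and
‘%example-container-file-systems’ are hypothetical stand-ins rather than
the real Guix definitions, and treating "clashes" as "has the same mount
point" is one possible reading of the proposal.

```scheme
(use-modules (srfi srfi-1)    ; any, remove
             (srfi srfi-9))   ; define-record-type

;; Hypothetical stand-in for a file system declaration; the real Guix
;; record has more fields.
(define-record-type <fs>
  (make-fs mount-point device type)
  fs?
  (mount-point fs-mount-point)
  (device fs-device)
  (type fs-type))

;; Hypothetical stand-in for %container-file-systems.
(define %example-container-file-systems
  (list (make-fs "/proc" "none" "proc")
        (make-fs "/sys" "none" "sysfs")))

(define (override-file-systems user-fs container-fs)
  "Return CONTAINER-FS followed by the entries of USER-FS whose mount
point does not clash with any entry of CONTAINER-FS."
  (define (clashes? fs)
    ;; Assumption: two entries clash when they share a mount point.
    (any (lambda (c)
           (string=? (fs-mount-point c) (fs-mount-point fs)))
         container-fs))
  (append container-fs (remove clashes? user-fs)))

;; Example: the user's /proc entry is overridden, /home is kept.
(define merged
  (override-file-systems
   (list (make-fs "/proc" "proc" "proc")
         (make-fs "/home" "/dev/sda2" "ext4"))
   %example-container-file-systems))

(map fs-mount-point merged)
;; => ("/proc" "/sys" "/home")
```

With this approach an existing OS declaration could be passed in
unmodified, and only the clashing mount points would be replaced.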