From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Philipp Stephani <p.stephani2@gmail.com>
Newsgroups: gmane.emacs.bugs
Subject: bug#45198: 28.0.50; Sandbox mode
Date: Mon, 14 Dec 2020 16:37:42 +0100
Message-ID: <CAArVCkSAzA81Y0j_zcne+jMm+_ox-vDEhHwBNQqOtg5vtVwuvg@mail.gmail.com>
References: <jwvpn3ehpjz.fsf@iro.umontreal.ca>
 <CAArVCkSd9XjF1_18ovR3tBLyJejNsyhc4dSMx0XEWJwPMy85MA@mail.gmail.com>
 <jwvo8ixlhv9.fsf-monnier+emacs@gnu.org>
 <CAArVCkSJmVAsrKwLMVtVhamW=Py5NhgSoc+1pZ+Cn=cq5PjLUg@mail.gmail.com>
 <jwvczzdlfws.fsf-monnier+emacs@gnu.org>
 <CAArVCkRahKpNVNQXsA_bYMoso-eQwy6b=LnaNTG9BtrJ0cMi1g@mail.gmail.com>
 <jwv4kkoiim5.fsf-monnier+emacs@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="11615"; mail-complaints-to="usenet@ciao.gmane.io"
Cc: Bastien <bzg@gnu.org>, 45198@debbugs.gnu.org,
 =?UTF-8?Q?Jo=C3=A3o_?= =?UTF-8?Q?T=C3=A1vora?= <joaotavora@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Mon Dec 14 16:39:23 2020
Return-path: <bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org>
Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org>)
	id 1kopwY-0002vu-Uc
	for geb-bug-gnu-emacs@m.gmane-mx.org; Mon, 14 Dec 2020 16:39:23 +0100
Original-Received: from localhost ([::1]:33680 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org>)
	id 1kopwY-000167-0K
	for geb-bug-gnu-emacs@m.gmane-mx.org; Mon, 14 Dec 2020 10:39:22 -0500
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:57688)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <Debian-debbugs@debbugs.gnu.org>)
 id 1kopwD-0000lS-WF
 for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2020 10:39:02 -0500
Original-Received: from debbugs.gnu.org ([209.51.188.43]:42303)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <Debian-debbugs@debbugs.gnu.org>)
 id 1kopwD-00078f-N6
 for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2020 10:39:01 -0500
Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2)
 (envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1kopwD-0003Fe-KZ
 for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2020 10:39:01 -0500
X-Loop: help-debbugs@gnu.org
Resent-From: Philipp Stephani <p.stephani2@gmail.com>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Mon, 14 Dec 2020 15:39:01 +0000
Resent-Message-ID: <handler.45198.B45198.160796028212407@debbugs.gnu.org>
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 45198
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
Original-Received: via spool by 45198-submit@debbugs.gnu.org id=B45198.160796028212407
 (code B ref 45198); Mon, 14 Dec 2020 15:39:01 +0000
Original-Received: (at 45198) by debbugs.gnu.org; 14 Dec 2020 15:38:02 +0000
Original-Received: from localhost ([127.0.0.1]:53849 helo=debbugs.gnu.org)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
 id 1kopvF-0003Dm-PI
 for submit@debbugs.gnu.org; Mon, 14 Dec 2020 10:38:02 -0500
Original-Received: from mail-oo1-f42.google.com ([209.85.161.42]:40812)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <p.stephani2@gmail.com>) id 1kopvD-0003DU-U5
 for 45198@debbugs.gnu.org; Mon, 14 Dec 2020 10:38:00 -0500
Original-Received: by mail-oo1-f42.google.com with SMTP id 9so4035338ooy.7
 for <45198@debbugs.gnu.org>; Mon, 14 Dec 2020 07:37:59 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=E0iEc0rosFKPXqrWtEPjgp5aPAzhaWtsrs1+mwtwc8M=;
 b=Bbx3s8idf8Xlp9Z3X22cHHSmZ73OImzUw70Zg7pWuqIikH1ydMhFjasl97Xp6woFla
 zmsRRpcwrGwlz1Bf2h4Dva4Qiprtuk5ZfmgVILMR1QSgo4t/Eg7hdS/1gHazblY9ynB9
 7mZyNjGvEepD1BPoOjg6awv1Mp7eBvZLb9WwLEJMVMG0RCcTsdzAzoChomVcQYcSL015
 eDxpsPiS+Me7URL/yVmAjWAxGI9DQ6hXDAkm1/b/wQQUjMRqSiFNAi7ELFn5LFFpw34i
 0n8rPdtvxJJOs8TSe6/9vFg5/fT9uHCA3SgzKxiV7wiQR8yd9rw1Mm2g1j+pNDq9iJAg
 VAUA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=E0iEc0rosFKPXqrWtEPjgp5aPAzhaWtsrs1+mwtwc8M=;
 b=Vox2b056zRp0ujWWPAbuIZKrDm3Q4b3Z61oU85viE1m2ErKDnnpkkEvO06k4aWP1f2
 60HWwmnatNaDxTYe3n3gaA9wPTb++V7RDglech+2u9Ex5qg5C0jJnLOKxq9SXu3T8Qu6
 aNvRjIxCG5SxoHWIrSpMbMzyx0wqf1fC89vNspCIKym0b3p37WEv+8mFay9RnSlYfFIL
 ONzgz2Xx/cn0OQlJ+fh66Yl+jyS0Yu99O4iKOXG8TGizuIFRsAYJmTzbMtjRyJf4v9pR
 pqyQWLUsF2j4J+9zobTq1oAZW1IOQwmFGZA4PZpC061+NjmPWoBbysFW7dJWfyR7ut6r
 Q1KQ==
X-Gm-Message-State: AOAM531p9R5ANf4kkF5dXrxWQQhWsygYj8vC8tWmRr2Dz0nm5YClmrzJ
 a6Py1iozJ/XENcY7hdFqz0UoE9GEQxaEN8H+Oew=
X-Google-Smtp-Source: ABdhPJy4fK82LyyIhQ9lvvde6iEsjhhsCkpu4FgeQSIi3MaeThbNrshTgpqAJ4uWuzJMdDlU3ugFTh+bhc1/dSzDHDg=
X-Received: by 2002:a4a:e8d1:: with SMTP id h17mr2805453ooe.71.1607960273852; 
 Mon, 14 Dec 2020 07:37:53 -0800 (PST)
In-Reply-To: <jwv4kkoiim5.fsf-monnier+emacs@gnu.org>
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs,
 the Swiss army knife of text editors" <bug-gnu-emacs.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/bug-gnu-emacs>,
 <mailto:bug-gnu-emacs-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/bug-gnu-emacs>
List-Post: <mailto:bug-gnu-emacs@gnu.org>
List-Help: <mailto:bug-gnu-emacs-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/bug-gnu-emacs>,
 <mailto:bug-gnu-emacs-request@gnu.org?subject=subscribe>
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org
Original-Sender: "bug-gnu-emacs"
 <bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org>
Xref: news.gmane.io gmane.emacs.bugs:196054
Archived-At: <http://permalink.gmane.org/gmane.emacs.bugs/196054>

Am Mo., 14. Dez. 2020 um 15:44 Uhr schrieb Stefan Monnier
<monnier@iro.umontreal.ca>:
>
> > In some way certainly, but it's not necessarily through stdout. I tend
> > to write potential output into a file whose filename is passed on the
> > command line. That's more robust than stdout (which often contains
> > spurious messages about loading files etc).
>
> Hmm... but that requires write access to some part of the file system,
> so it requires a finer granularity of control.

When using mount namespaces, that comes at a low cost: you mount
exactly those files that you want to write as writable filesystems,
and read-only files as read-only.

> I'd much rather limit
> the output to something equivalent to a pipe (the implementation of the
> sandboxing doesn't have to use stdout for that if that's a problem).

It's definitely true that only accessing open file descriptors is more
secure and simpler ("capability-based security"). However, I assume
that generic Elisp code doesn't work too well if it can't do things
like create temporary files. I've found 211 calls to make-temp-file in
the Emacs codebase alone, and I'd be surprised if none of them were
applicable to sandboxed processes.

>
> >> Also, I think the async option is the most important one.  How 'bout:
> >>     (sandbox-start FUNCTION)
> >>     Lunch a sandboxed Emacs subprocess running FUNCTION.
> > Passing a function here might be confusing because e.g. lexical
> > closures won't work.
>
> What makes you think they don't?

Hmm, you mean printing and reading closure objects? It might work,
though I don't know how portable/robust it is.

>
> > It might be preferable to pass a form and state
> > that both dynamic and lexical bindings are ignored.
>
> If closures turn out to be a problem I'd rather use FUNCTION + VALUES
> than FORM (using FORM implies the use of `eval`, and you have to think
> of all those kitten that'll suffer if we do that).

It is always eval, because that's how Elisp works. How else would you
deserialize and execute code than with read + eval?

>
> >>     Returns a process object.
> > Depending how much we care about forward compatibility, it might be
> > better to return an opaque sandbox object (which will initially wrap a
> > process object).
>
> We always use process objects to represent file-descriptors, so I can't
> find any good reason why this one should be different or why an
> implementation might find it difficult to expose a process object.

That's a fair point, but when thinking about forwards-compatibility we
might want to anticipate reasons that we currently can't think of.

>
> >>     FUNCTION is called with no arguments and it can use `sandbox-read`
> >>     to read the data sent to the process object via `process-send-string`,
> >>     and `sandbox-reply` to send back a reply to the parent process
> >>     (which will receive it via its `process-filter`).
> > That is, sandbox-read and sandbox-reply just read/write stdin/stdout?
>
> While it may use stdin/stdout internally, I can imagine good reasons why
> we'd want to use some other file descriptors.

If the process can write to stdout, then users will do that. Users
would probably just try whether 'print' works, and use it if it does.
So if you want a specialized function, then we'd need to make sure
that 'print' doesn't work (e.g. by having the parent process only read
the new pipe).

>
> > That would certainly work, but (a) it doesn't really have anything to
> > do with sandboxing, so these functions should rather be called
> > stdin-read and stdout-write or similar,
>
> I think "the right thing" would be to represent the parent as a process
> object inside the child.  I proposed dedicated functions only because
> but when it uses stdin/stdout, providing a process object seems awkward
> to implement.

It would be a file descriptor in all cases, so we might as well use a
pipe process (and introduce a function to write to a pipe). Stdout
isn't really different from other open file descriptors.
We'd need some special support for other file descriptors though, as
we'd need to make sure to not open them with O_CLOEXEC.

>
> >>     The sandboxed process has read access to all the local files
> >>     but no write access to them, nor any access to the network or
> >>     the display.
> > This might be a bit too specific. I'd imagine we'd want to restrict
> > reading files to the absolute minimum (files that Emacs itself needs
> > plus a fixed set of input files/directories known in advance), but
> > often allow writing some output files.
>
> I'm trying to design an API which can be made to work in as many
> circumstances as possible without imposing too high a maintenance
> burden.  So while I agree that it'd be better to limit the set of files
> that can be read and to allow writing to some files, I think I'd rather
> start with something more crude.
>
> We can refine it later if/when we have more experience with how it's
> used, and how it's implemented in the various OSes.

Even without specific experience, we can definitely say that
restricting something later is much harder than lifting restrictions.
So if we'd start with a policy that allows full read access, we'd have
a very hard time restricting that later. IOW, I'd be fine with not
providing write access to files (with the exception of maybe a
process-local tmpfs), but I'd try to restrict reading right from the
start to the installation directory and other known sources like the
load path.

>
> >> >> - I suspect we'll still want to use the extra "manual" checks I put in
> >> >>   my code (so as to get clean ELisp errors when bumping against the
> >> >>   walls of the sandbox, and because of the added in-depth security).
> >> > That's reasonable, though I'm worried that it will give users a false
> >> > sense of security.
> >> That would only be the case if we don't additionally use process-level
> >> isolation, right?
> > My worry is that people see a function like enter-sandbox and then
> > assume that Emacs will be secure after calling it, without actually
> > verifying the security implications.
>
> This seems universally true and hence suggests we should just forget
> about this idea of providing a sandbox functionality.  IOW I'm not sure
> what this has to do with the `ensure_no_sandbox` calls I'm suggesting
> we keep.

Seccomp and namespaces are battle-tested kernel-level security
features. If we use them, we'll have a much better chance at providing
an actually secure sandbox.

>
> > I've looked into this, and what I'd suggest for now is:
> > 1. Add a --seccomp=FILE command-line option that loads seccomp filters
> > from FILE and applies them directly after startup (first thing in
> > main). Why do this in Emacs? Because that's the easiest way to prevent
> > execve. When installing a seccomp filter in a separate process, execve
> > needs to be allowed because otherwise there'd be no way to execute the
> > Emacs binary. While there are workarounds (ptrace, LD_PRELOAD), it's
> > easiest to install the seccomp filter directly in the Emacs process.
> > 2. Generate appropriate seccomp filters using libseccomp or similar.
> > 3. In the sandboxing functions, start Emacs with bwrap to set up
> > namespaces and invoke Emacs with the new --seccomp flag.
>
> Sounds OK, tho I must say I don't understand why we care particularly
> about disallowing execve inside the bwrap jail.  AFAIK anything that an
> external process can do can also be done directly by Emacs since ELisp
> is a fairly fully-featured language (since there's nothing like setuid
> inside a bwrap jail).  I mean, I agree that we want to disallow running
> subprocesses, but can't think of a good reason why we would need this to
> be 100%, so we could rely on `ensure_no_sandbox` for that.

It's basically just another way to make the attack surface smaller and
provide defense in depth. If we allow execve, then suddenly the attack
surface increases to include all possible binaries that Emacs might
execute.
I've implemented the --seccomp flag, and it's really benign - mapping
the input file plus one or two syscalls. It's also strictly optional
and doesn't preclude using other technologies.