* RFC: (ice-9 sandbox)
@ 2017-03-31 9:27 Andy Wingo
2017-03-31 11:33 ` Ludovic Courtès
` (5 more replies)
0 siblings, 6 replies; 17+ messages in thread
From: Andy Wingo @ 2017-03-31 9:27 UTC
To: guile-devel
[-- Attachment #1: Type: text/plain, Size: 500 bytes --]
Hi,
Attached is a module that can evaluate an expression within a sandbox.
If the evaluation takes too long or allocates too much, it will be
cancelled. The evaluation will take place with respect to a module with
a "safe" set of imports. Those imports include most of the bindings
available in a default Guile environment. See the file below for full
details and a number of caveats.
Any thoughts? I would like something like this for a web service that
has to evaluate untrusted code.
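For illustration, here is a minimal sketch of how the entry point might be called. It assumes the attached file is installed on the load path as (ice-9 sandbox). The 0.05-second limit and the names `ok' and `runaway' are arbitrary choices for the example, not part of the module.

```scheme
;; Sketch only: assumes the attached module is loadable as (ice-9 sandbox).
(use-modules (ice-9 sandbox))

;; A well-behaved expression simply returns its value.
(define ok (eval-in-sandbox '(+ 1 2)))

;; A runaway loop is cancelled once the time limit passes; per the
;; docstring, this surfaces as a throw to the `limit-exceeded' key.
(define runaway
  (catch 'limit-exceeded
    (lambda ()
      (eval-in-sandbox '(let lp () (lp)) #:time-limit 0.05))
    (lambda (key . args)
      'cancelled)))
```

A web service would presumably wrap each request in a similar `catch', treating `limit-exceeded' separately from ordinary errors raised by the guest expression.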
Andy
[-- Attachment #2: sandbox.scm --]
[-- Type: text/plain, Size: 36124 bytes --]
;;; Sandboxed evaluation of Scheme code
;;; Copyright (C) 2017 Free Software Foundation, Inc.
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
;;;; License as published by the Free Software Foundation; either
;;;; version 3 of the License, or (at your option) any later version.
;;;;
;;;; This library is distributed in the hope that it will be useful,
;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;;;; Lesser General Public License for more details.
;;;;
;;;; You should have received a copy of the GNU Lesser General Public
;;;; License along with this library; if not, write to the Free Software
;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
;;; Commentary:
;;;
;;; Code:
(define-module (ice-9 sandbox)
#:use-module (ice-9 control)
#:use-module (ice-9 match)
#:use-module (system vm vm)
#:export (call-with-time-limit
call-with-allocation-limit
call-with-time-and-allocation-limits
eval-in-sandbox
make-sandbox-module
*alist-bindings*
*array-bindings*
*bit-bindings*
*bitvector-bindings*
*char-bindings*
*char-set-bindings*
*clock-bindings*
*core-bindings*
*error-bindings*
*fluid-bindings*
*hash-bindings*
*iteration-bindings*
*keyword-bindings*
*list-bindings*
*macro-bindings*
*nil-bindings*
*number-bindings*
*pair-bindings*
*predicate-bindings*
*procedure-bindings*
*promise-bindings*
*prompt-bindings*
*regexp-bindings*
*sort-bindings*
*srfi-4-bindings*
*string-bindings*
*symbol-bindings*
*unspecified-bindings*
*variable-bindings*
*vector-bindings*
*version-bindings*
*mutating-alist-bindings*
*mutating-array-bindings*
*mutating-bitvector-bindings*
*mutating-fluid-bindings*
*mutating-hash-bindings*
*mutating-list-bindings*
*mutating-pair-bindings*
*mutating-sort-bindings*
*mutating-srfi-4-bindings*
*mutating-string-bindings*
*mutating-variable-bindings*
*mutating-vector-bindings*
*all-pure-bindings*
*all-pure-and-impure-bindings*))
(define (call-with-time-limit limit thunk limit-reached)
"Call @var{thunk}, but cancel it if @var{limit} seconds of wall-clock
time have elapsed. If the computation is cancelled, call
@var{limit-reached} in tail position. @var{thunk} must not disable
interrupts or prevent an abort via a @code{dynamic-wind} unwind
handler."
;; FIXME: use separate thread instead of sigalrm.
(let ((limit-usecs (inexact->exact (round (* limit 1e6))))
(prev-sigalarm-handler #f)
(tag (make-prompt-tag)))
(call-with-prompt tag
(lambda ()
(dynamic-wind
(lambda ()
(set! prev-sigalarm-handler
(sigaction SIGALRM (lambda (sig) (abort-to-prompt tag))))
(setitimer ITIMER_REAL 0 0 0 limit-usecs))
thunk
(lambda ()
(setitimer ITIMER_REAL 0 0 0 0)
(match prev-sigalarm-handler
((handler . flags)
(sigaction SIGALRM handler flags))))))
(lambda (k)
(limit-reached)))))
(define (call-with-allocation-limit limit thunk limit-reached)
"Call @var{thunk}, but cancel it if @var{limit} bytes have been
allocated. If the computation is cancelled, call @var{limit-reached} in
tail position. @var{thunk} must not disable interrupts or prevent an
abort via a @code{dynamic-wind} unwind handler.
This limit applies to both stack and heap allocation. The computation
will not be aborted before @var{limit} bytes have been allocated, but
for the heap allocation limit, the check may be postponed until the next garbage collection."
(define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
(let ((zero (bytes-allocated))
(tag (make-prompt-tag)))
(define (check-allocation)
(when (< limit (- (bytes-allocated) zero))
(abort-to-prompt tag)))
(call-with-prompt tag
(lambda ()
(dynamic-wind
(lambda ()
(add-hook! after-gc-hook check-allocation))
(lambda ()
(call-with-stack-overflow-handler
;; The limit is in "words", which used to be 4 or 8 but now
;; is always 8 bytes.
(floor/ limit 8)
thunk
(lambda () (abort-to-prompt tag))))
(lambda ()
(remove-hook! after-gc-hook check-allocation))))
(lambda (k)
(limit-reached)))))
(define (call-with-time-and-allocation-limits time-limit allocation-limit
thunk)
"Invoke @var{thunk} in a dynamic extent in which its execution is
limited to @var{time-limit} seconds of wall-clock time, and its
allocation to @var{allocation-limit} bytes. @var{thunk} must not
disable interrupts or prevent an abort via a @code{dynamic-wind} unwind
handler.
If successful, return all values produced by invoking @var{thunk}. Any
uncaught exception thrown by the thunk will propagate out. If the time
or allocation limit is exceeded, an exception will be thrown to the
@code{limit-exceeded} key."
(call-with-time-limit
time-limit
(lambda ()
(call-with-allocation-limit
allocation-limit
thunk
(lambda ()
(scm-error 'limit-exceeded "with-resource-limits"
"Allocation limit exceeded" '() #f))))
(lambda ()
(scm-error 'limit-exceeded "with-resource-limits"
"Time limit exceeded" '() #f))))
(define (sever-module! m)
"Remove @var{m} from its container module."
(match (module-name m)
((head ... tail)
(let ((parent (resolve-module head #f)))
(unless (eq? m (module-ref-submodule parent tail))
(error "can't sever module?"))
(hashq-remove! (module-submodules parent) tail)))))
;; bindings := module-binding-list ...
;; module-binding-list := interface-name import ...
;; import := name | (exported-name . imported-name)
;; name := symbol
(define (make-sandbox-module bindings)
"Return a fresh module that only contains @var{bindings}.
The @var{bindings} should be given as a list of import sets. One import
set is a list whose car names an interface, like @code{(ice-9 q)}, and
whose cdr is a list of imports. An import is either a bare symbol or a
pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
both symbols and denote the name under which a binding is exported from
the module, and the name under which to make the binding available,
respectively."
(let ((m (make-fresh-user-module)))
(purify-module! m)
;; FIXME: We want to have a module that will be collectable by GC.
;; Currently in Guile all modules are part of a single tree, and
;; once a module is part of that tree it will never be collected.
;; So we want to sever the module off from that tree. However the
;; psyntax syntax expander currently needs to be able to look up
;; modules by name; being severed from the name tree prevents that
;; from happening. So for now, each evaluation leaks memory :/
;;
;; (sever-module! m)
(module-use-interfaces! m
(map (match-lambda
((mod-name . bindings)
(resolve-interface mod-name
#:select bindings)))
bindings))
m))
(define* (eval-in-sandbox exp #:key
(time-limit 0.1)
(allocation-limit #e10e6)
(bindings *all-pure-bindings*)
(module (make-sandbox-module bindings)))
"Evaluate the Scheme expression @var{exp} within an isolated
\"sandbox\". Limit its execution to @var{time-limit} seconds of
wall-clock time, and limit its allocation to @var{allocation-limit}
bytes.
The evaluation will occur in @var{module}, which defaults to the result
of calling @code{make-sandbox-module} on @var{bindings}, which itself
defaults to @code{*all-pure-bindings*}. This is the core of the
sandbox: creating a scope for the expression that is @dfn{safe}.
A safe sandbox module has two characteristics. Firstly, it will not
allow the expression being evaluated to avoid being cancelled due to
time or allocation limits. This ensures that the expression terminates
in a timely fashion.
Secondly, a safe sandbox module will prevent the evaluation from
receiving information from previous evaluations, or from affecting
future evaluations. All combinations of binding sets exported by
@code{(ice-9 sandbox)} form safe sandbox modules.
The @var{bindings} should be given as a list of import sets. One import
set is a list whose car names an interface, like @code{(ice-9 q)}, and
whose cdr is a list of imports. An import is either a bare symbol or a
pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
both symbols and denote the name under which a binding is exported from
the module, and the name under which to make the binding available,
respectively. Note that @var{bindings} is only used as an input to the
default initializer for the @var{module} argument; if you pass
@code{#:module}, @var{bindings} is unused.
If successful, return all values produced by @var{exp}. Any uncaught
exception thrown by the expression will propagate out. If the time or
allocation limit is exceeded, an exception will be thrown to the
@code{limit-exceeded} key."
(call-with-time-and-allocation-limits
time-limit allocation-limit
(lambda ()
;; Prevent the expression from forging syntax objects. See "Syntax
;; Transformer Helpers" in the manual.
(parameterize ((allow-legacy-syntax-objects? #f))
(eval exp module)))))
;; An evaluation-sandboxing facility is safe if:
;;
;; (1) every evaluation will terminate in a timely manner
;;
;; (2) no evaluation can affect future evaluations
;;
;; For (1), we impose a user-controllable time limit on the evaluation,
;; in wall-clock time. When that limit is reached, Guile schedules an
;; asynchronous interrupt in the sandbox that aborts the computation.
;; For this to work, the sandboxed evaluation must not disable
;; interrupts, and it must not prevent timely aborts via malicious "out"
;; guards in dynamic-wind thunks.
;;
;; The sandbox also has an allocation limit that uses a similar cancel
;; mechanism, but this limit is less precise as it only runs at
;; garbage-collection time.
;;
;; The sandbox sets the allocation limit as the stack limit as well.
;;
;; For (2), the only way an evaluation can affect future evaluations is
;; if it causes a side-effect outside its sandbox. That side effect
;; could change the way the host or future sandboxed evaluations
;; operate, or it could leak information to future evaluations.
;;
;; One means of information leakage would be the file system. Although
;; one can imagine "safe" ways to access a file system, in practice we
;; just prevent all access to this and other operating system facilities
;; by not exposing the Guile primitives that access the file system,
;; connect to networking hosts, etc. If we chose our set of bindings
;; correctly and it is impossible to access host values other than those
;; given to the evaluation, then we have succeeded in granting only a
;; limited set of capabilities to the guest.
;;
;; To prevent information leakage we also limit other information about
;; the host, like its hostname or the Guile build information.
;;
;; The guest must also not have the capability to mutate a location used
;; by the host or by future sandboxed evaluations. Either you expose no
;; primitives to the evaluation that can mutate locations, or you expose
;; no mutable locations. In this sandbox we opt for a combination of
;; the two, though the selection of bindings is up to the user. "set!"
;; is always excluded, as Guile doesn't have a nice way to prevent set!
;; on imported bindings. But variable-set! is included, as no set of
;; bindings from this module includes a variable or a capability to a
;; variable. It's possible though to build sandbox modules with no
;; mutating primitives. As far as we know, all possible combinations of
;; the binding sets listed below are safe.
;;
(define *core-bindings*
'(((guile)
and
begin
apply
call-with-values
values
case
case-lambda
case-lambda*
cond
define
define*
define-values
do
if
lambda
lambda*
let
let*
letrec
letrec*
or
quasiquote
quote
;; Can't allow mutation to globals.
;; set!
unless
unquote
unquote-splicing
when
while
λ)))
(define *macro-bindings*
'(((guile)
bound-identifier=?
;; Although these have "current" in their name, they are lexically
;; scoped, not dynamically scoped.
current-filename
current-source-location
datum->syntax
define-macro
define-syntax
define-syntax-parameter
define-syntax-rule
defmacro
free-identifier=?
generate-temporaries
gensym
identifier-syntax
identifier?
let-syntax
letrec-syntax
macroexpand
macroexpanded?
quasisyntax
start-stack
syntax
syntax->datum
syntax-case
syntax-error
syntax-parameterize
syntax-rules
syntax-source
syntax-violation
unsyntax
unsyntax-splicing
with-ellipsis
with-syntax
make-variable-transformer)))
(define *iteration-bindings*
'(((guile)
compose
for-each
identity
iota
map
map-in-order
const
noop)))
(define *clock-bindings*
'(((guile)
get-internal-real-time
internal-time-units-per-second
sleep
usleep)))
(define *procedure-bindings*
'(((guile)
procedure-documentation
procedure-minimum-arity
procedure-name
procedure?
thunk?)))
(define *version-bindings*
'(((guile)
effective-version
major-version
micro-version
minor-version
version
version-matches?)))
(define *nil-bindings*
'(((guile)
nil?)))
(define *unspecified-bindings*
'(((guile)
unspecified?
*unspecified*)))
(define *predicate-bindings*
'(((guile)
->bool
and-map
and=>
boolean?
eq?
equal?
eqv?
negate
not
or-map)))
;; The current ports (current-input-port et al) are dynamically scoped,
;; which is a footgun from a sandboxing perspective. It's too easy for
;; a procedure that is the result of a sandboxed evaluation to be later
;; invoked in a different context and thereby be implicitly granted
;; capabilities to whatever port is then current. This is compounded by
;; the fact that most Scheme i/o primitives allow the port to be omitted
;; and thereby default to whatever's current. For now, sadly, we avoid
;; exposing any i/o primitive to the sandbox.
#;
(define *i/o-bindings*
'(((guile)
display
eof-object?
force-output
format
make-soft-port
newline
read
simple-format
write
write-char)
((ice-9 ports)
%make-void-port
char-ready?
;; Note that these are mutable parameters.
current-error-port
current-input-port
current-output-port
current-warning-port
drain-input
eof-object?
file-position
force-output
ftell
input-port?
output-port?
peek-char
port-closed?
port-column
port-conversion-strategy
port-encoding
port-filename
port-line
port-mode
port?
read-char
the-eof-object
;; We don't provide open-output-string because it needs
;; get-output-string, and get-output-string provides a generic
;; capability on any output string port. For consistency then we
;; don't provide open-input-string either; call-with-input-string
;; is sufficient.
call-with-input-string
call-with-output-string
with-error-to-port
with-error-to-string
with-input-from-port
with-input-from-string
with-output-to-port
with-output-to-string)))
;; If two evaluations are called with the same input port, unread-char
;; and unread-string can use a port as a mutable channel to pass
;; information from one to the other.
#;
(define *mutating-i/o-bindings*
'(((guile)
set-port-encoding!)
((ice-9 ports)
close-input-port
close-output-port
close-port
file-set-position
seek
set-port-column!
set-port-conversion-strategy!
set-port-encoding!
set-port-filename!
set-port-line!
setvbuf
unread-char
unread-string)))
(define *error-bindings*
'(((guile)
error
throw
with-throw-handler
catch
;; false-if-exception can cause i/o if the #:warning arg is passed.
;; false-if-exception
;; See notes on *i/o-bindings*.
;; peek
;; pk
;; print-exception
;; warn
strerror
scm-error
)))
;; FIXME: Currently we can't expose anything that works on the current
;; module to the sandbox. It could be that the sandboxed evaluation
;; returns a procedure, and that procedure may later be invoked in a
;; different context with a different current-module and it is unlikely
;; that the later caller will consider themselves as granting a
;; capability on whatever module is then current. Likewise export (and
;; by extension, define-public and the like) also operate on the current
;; module.
;;
;; It could be that we could expose a statically scoped eval to the
;; sandbox.
#;
(define *eval-bindings*
'(((guile)
current-module
module-name
module?
define-once
define-private
define-public
defined?
export
defmacro-public
;; FIXME: single-arg eval?
eval
primitive-eval
eval-string
self-evaluating?
;; Can we?
set-current-module)))
(define *sort-bindings*
'(((guile)
sort
sorted?
stable-sort
sort-list)))
;; These can only form part of a safe binding set if no mutable pair or
;; vector is exposed to the sandbox.
(define *mutating-sort-bindings*
'(((guile)
sort!
stable-sort!
sort-list!
restricted-vector-sort!)))
(define *regexp-bindings*
'(((guile)
make-regexp
regexp-exec
regexp/basic
regexp/extended
regexp/icase
regexp/newline
regexp/notbol
regexp/noteol
regexp?)))
(define *alist-bindings*
'(((guile)
acons
assoc
assoc-ref
assq
assq-ref
assv
assv-ref
sloppy-assoc
sloppy-assq
sloppy-assv)))
;; These can only form part of a safe binding set if no mutable pair
;; is exposed to the sandbox.
(define *mutating-alist-bindings*
'(((guile)
assoc-remove!
assoc-set!
assq-remove!
assq-set!
assv-remove!
assv-set!)))
(define *number-bindings*
'(((guile)
*
+
-
/
1+
1-
<
<=
=
>
>=
abs
acos
acosh
angle
asin
asinh
atan
atanh
ceiling
ceiling-quotient
ceiling-remainder
ceiling/
centered-quotient
centered-remainder
centered/
complex?
cos
cosh
denominator
euclidean-quotient
euclidean-remainder
euclidean/
even?
exact->inexact
exact-integer-sqrt
exact-integer?
exact?
exp
expt
finite?
floor
floor-quotient
floor-remainder
floor/
gcd
imag-part
inf
inf?
integer-expt
integer-length
integer?
lcm
log
log10
magnitude
make-polar
make-rectangular
max
min
modulo
modulo-expt
most-negative-fixnum
most-positive-fixnum
nan
nan?
negative?
numerator
odd?
positive?
quotient
rational?
rationalize
real-part
real?
remainder
round
round-quotient
round-remainder
round/
sin
sinh
sqrt
tan
tanh
truncate
truncate-quotient
truncate-remainder
truncate/
zero?
number?
number->string
string->number)))
(define *char-set-bindings*
'(((guile)
->char-set
char-set
char-set->list
char-set->string
char-set-adjoin
char-set-any
char-set-complement
char-set-contains?
char-set-copy
char-set-count
char-set-cursor
char-set-cursor-next
char-set-delete
char-set-diff+intersection
char-set-difference
char-set-every
char-set-filter
char-set-fold
char-set-for-each
char-set-hash
char-set-intersection
char-set-map
char-set-ref
char-set-size
char-set-unfold
char-set-union
char-set-xor
char-set:ascii
char-set:blank
char-set:designated
char-set:digit
char-set:empty
char-set:full
char-set:graphic
char-set:hex-digit
char-set:iso-control
char-set:letter
char-set:letter+digit
char-set:lower-case
char-set:printing
char-set:punctuation
char-set:symbol
char-set:title-case
char-set:upper-case
char-set:whitespace
char-set<=
char-set=
char-set?
end-of-char-set?
list->char-set
string->char-set
ucs-range->char-set)))
;; These can only form part of a safe binding set if no mutable char-set
;; is exposed to the sandbox. Unfortunately all charsets in Guile are
;; mutable, currently, including the built-in charsets, so we can't
;; expose these primitives.
#;
(define *mutating-char-set-bindings*
'(((guile)
char-set-adjoin!
char-set-complement!
char-set-delete!
char-set-diff+intersection!
char-set-difference!
char-set-filter!
char-set-intersection!
char-set-unfold!
char-set-union!
char-set-xor!
list->char-set!
string->char-set!
ucs-range->char-set!)))
(define *array-bindings*
'(((guile)
array->list
array-cell-ref
array-contents
array-dimensions
array-equal?
array-for-each
array-in-bounds?
array-length
array-rank
array-ref
array-shape
array-slice
array-slice-for-each
array-slice-for-each-in-order
array-type
array-type-code
array?
list->array
list->typed-array
make-array
make-shared-array
make-typed-array
shared-array-increments
shared-array-offset
shared-array-root
transpose-array
typed-array?)))
;; These can only form part of a safe binding set if no mutable vector,
;; bitvector, bytevector, srfi-4 vector, or array is exposed to the
;; sandbox.
(define *mutating-array-bindings*
'(((guile)
array-cell-set!
array-copy!
array-copy-in-order!
array-fill!
array-index-map!
array-map!
array-map-in-order!
array-set!)))
(define *hash-bindings*
'(((guile)
doubly-weak-hash-table?
hash
hash-count
hash-fold
hash-for-each
hash-for-each-handle
hash-get-handle
hash-map->list
hash-ref
hash-table?
hashq
hashq-get-handle
hashq-ref
hashv
hashv-get-handle
hashv-ref
hashx-get-handle
hashx-ref
make-doubly-weak-hash-table
make-hash-table
make-weak-key-hash-table
make-weak-value-hash-table
weak-key-hash-table?
weak-value-hash-table?)))
;; These can only form part of a safe binding set if no hash table is
;; exposed to the sandbox.
(define *mutating-hash-bindings*
'(((guile)
hash-clear!
hash-create-handle!
hash-remove!
hash-set!
hashq-create-handle!
hashq-remove!
hashq-set!
hashv-create-handle!
hashv-remove!
hashv-set!
hashx-create-handle!
hashx-remove!
hashx-set!)))
(define *variable-bindings*
'(((guile)
make-undefined-variable
make-variable
variable-bound?
variable-ref
variable?)))
;; These can only form part of a safe binding set if no mutable variable
;; is exposed to the sandbox; this applies particularly to variables
;; that are module bindings.
(define *mutating-variable-bindings*
'(((guile)
variable-set!
variable-unset!)))
(define *string-bindings*
'(((guile)
absolute-file-name?
file-name-separator-string
file-name-separator?
in-vicinity
basename
dirname
list->string
make-string
object->string
reverse-list->string
string
string->list
string-any
string-any-c-code
string-append
string-append/shared
string-capitalize
string-ci<
string-ci<=
string-ci<=?
string-ci<>
string-ci<?
string-ci=
string-ci=?
string-ci>
string-ci>=
string-ci>=?
string-ci>?
string-compare
string-compare-ci
string-concatenate
string-concatenate-reverse
string-concatenate-reverse/shared
string-concatenate/shared
string-contains
string-contains-ci
string-copy
string-count
string-delete
string-downcase
string-drop
string-drop-right
string-every
string-every-c-code
string-filter
string-fold
string-fold-right
string-for-each
string-for-each-index
string-hash
string-hash-ci
string-index
string-index-right
string-join
string-length
string-map
string-normalize-nfc
string-normalize-nfd
string-normalize-nfkc
string-normalize-nfkd
string-null?
string-pad
string-pad-right
string-prefix-ci?
string-prefix-length
string-prefix-length-ci
string-prefix?
string-ref
string-replace
string-reverse
string-rindex
string-skip
string-skip-right
string-split
string-suffix-ci?
string-suffix-length
string-suffix-length-ci
string-suffix?
string-tabulate
string-take
string-take-right
string-titlecase
string-tokenize
string-trim
string-trim-both
string-trim-right
string-unfold
string-unfold-right
string-upcase
string-utf8-length
string<
string<=
string<=?
string<>
string<?
string=
string=?
string>
string>=
string>=?
string>?
string?
substring
substring/copy
substring/read-only
substring/shared
xsubstring)))
;; These can only form part of a safe binding set if no mutable string
;; is exposed to the sandbox.
(define *mutating-string-bindings*
'(((guile)
string-capitalize!
string-copy!
string-downcase!
string-fill!
string-map!
string-reverse!
string-set!
string-titlecase!
string-upcase!
string-xcopy!
substring-fill!
substring-move!)))
(define *symbol-bindings*
'(((guile)
string->symbol
string-ci->symbol
symbol->string
list->symbol
make-symbol
symbol
symbol-append
symbol-hash
symbol-interned?
symbol?)))
(define *keyword-bindings*
'(((guile)
keyword?
keyword->symbol
symbol->keyword)))
;; These can only form part of a safe binding set if no valid prompt tag
;; is ever exposed to the sandbox, or can be constructed by the sandbox.
(define *prompt-bindings*
'(((guile)
abort-to-prompt
abort-to-prompt*
call-with-prompt
make-prompt-tag)))
(define *bit-bindings*
'(((guile)
ash
round-ash
logand
logcount
logior
lognot
logtest
logxor
logbit?)))
(define *bitvector-bindings*
'(((guile)
bit-count
bit-count*
bit-extract
bit-position
bitvector
bitvector->list
bitvector-length
bitvector-ref
bitvector?
list->bitvector
make-bitvector)))
;; These can only form part of a safe binding set if no mutable
;; bitvector is exposed to the sandbox.
(define *mutating-bitvector-bindings*
'(((guile)
bit-invert!
bit-set*!
bitvector-fill!
bitvector-set!)))
(define *fluid-bindings*
'(((guile)
fluid-bound?
fluid-ref
;; fluid-ref* could escape the sandbox and is not allowed.
fluid-thread-local?
fluid?
make-fluid
make-thread-local-fluid
make-unbound-fluid
with-fluid*
with-fluids
with-fluids*
make-parameter
parameter?
parameterize)))
;; These can only form part of a safe binding set if no fluid is
;; directly exposed to the sandbox.
(define *mutating-fluid-bindings*
'(((guile)
fluid-set!
fluid-unset!
fluid->parameter)))
(define *char-bindings*
'(((guile)
char-alphabetic?
char-ci<=?
char-ci<?
char-ci=?
char-ci>=?
char-ci>?
char-downcase
char-general-category
char-is-both?
char-lower-case?
char-numeric?
char-titlecase
char-upcase
char-upper-case?
char-whitespace?
char<=?
char<?
char=?
char>=?
char>?
char?
char->integer
integer->char)))
(define *list-bindings*
'(((guile)
list
list-cdr-ref
list-copy
list-head
list-index
list-ref
list-tail
list?
null?
make-list
append
delete
delq
delv
filter
length
member
memq
memv
merge
reverse)))
;; These can only form part of a safe binding set if no mutable
;; pair is exposed to the sandbox.
(define *mutating-list-bindings*
'(((guile)
list-cdr-set!
list-set!
append!
delete!
delete1!
delq!
delq1!
delv!
delv1!
filter!
merge!
reverse!)))
(define *pair-bindings*
'(((guile)
last-pair
pair?
caaaar
caaadr
caaar
caadar
caaddr
caadr
caar
cadaar
cadadr
cadar
caddar
cadddr
caddr
cadr
car
cdaaar
cdaadr
cdaar
cdadar
cdaddr
cdadr
cdar
cddaar
cddadr
cddar
cdddar
cddddr
cdddr
cddr
cdr
cons
cons*)))
;; These can only form part of a safe binding set if no mutable
;; pair is exposed to the sandbox.
(define *mutating-pair-bindings*
'(((guile)
set-car!
set-cdr!)))
(define *vector-bindings*
'(((guile)
list->vector
make-vector
vector
vector->list
vector-copy
vector-length
vector-ref
vector?)))
;; These can only form part of a safe binding set if no mutable
;; vector is exposed to the sandbox.
(define *mutating-vector-bindings*
'(((guile)
vector-fill!
vector-move-left!
vector-move-right!
vector-set!)))
(define *promise-bindings*
'(((guile)
force
delay
make-promise
promise?)))
(define *srfi-4-bindings*
'(((srfi srfi-4)
f32vector
f32vector->list
f32vector-length
f32vector-ref
f32vector?
f64vector
f64vector->list
f64vector-length
f64vector-ref
f64vector?
list->f32vector
list->f64vector
list->s16vector
list->s32vector
list->s64vector
list->s8vector
list->u16vector
list->u32vector
list->u64vector
list->u8vector
make-f32vector
make-f64vector
make-s16vector
make-s32vector
make-s64vector
make-s8vector
make-u16vector
make-u32vector
make-u64vector
make-u8vector
s16vector
s16vector->list
s16vector-length
s16vector-ref
s16vector?
s32vector
s32vector->list
s32vector-length
s32vector-ref
s32vector?
s64vector
s64vector->list
s64vector-length
s64vector-ref
s64vector?
s8vector
s8vector->list
s8vector-length
s8vector-ref
s8vector?
u16vector
u16vector->list
u16vector-length
u16vector-ref
u16vector?
u32vector
u32vector->list
u32vector-length
u32vector-ref
u32vector?
u64vector
u64vector->list
u64vector-length
u64vector-ref
u64vector?
u8vector
u8vector->list
u8vector-length
u8vector-ref
u8vector?)))
;; These can only form part of a safe binding set if no mutable
;; bytevector is exposed to the sandbox.
(define *mutating-srfi-4-bindings*
'(((srfi srfi-4)
f32vector-set!
f64vector-set!
s16vector-set!
s32vector-set!
s64vector-set!
s8vector-set!
u16vector-set!
u32vector-set!
u64vector-set!
u8vector-set!)))
(define *all-pure-bindings*
(append *alist-bindings*
*array-bindings*
*bit-bindings*
*bitvector-bindings*
*char-bindings*
*char-set-bindings*
*clock-bindings*
*core-bindings*
*error-bindings*
*fluid-bindings*
*hash-bindings*
*iteration-bindings*
*keyword-bindings*
*list-bindings*
*macro-bindings*
*nil-bindings*
*number-bindings*
*pair-bindings*
*predicate-bindings*
*procedure-bindings*
*promise-bindings*
*prompt-bindings*
*regexp-bindings*
*sort-bindings*
*srfi-4-bindings*
*string-bindings*
*symbol-bindings*
*unspecified-bindings*
*variable-bindings*
*vector-bindings*
*version-bindings*))
(define *all-pure-and-impure-bindings*
(append *all-pure-bindings*
*mutating-alist-bindings*
*mutating-array-bindings*
*mutating-bitvector-bindings*
*mutating-fluid-bindings*
*mutating-hash-bindings*
*mutating-list-bindings*
*mutating-pair-bindings*
*mutating-sort-bindings*
*mutating-srfi-4-bindings*
*mutating-string-bindings*
*mutating-variable-bindings*
*mutating-vector-bindings*))
* Re: RFC: (ice-9 sandbox)
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
@ 2017-03-31 11:33 ` Ludovic Courtès
2017-03-31 16:26 ` Andy Wingo
2017-03-31 14:41 ` Mike Gran
` (4 subsequent siblings)
5 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2017-03-31 11:33 UTC
To: guile-devel
Hello!
Andy Wingo <wingo@pobox.com> writes:
> Any thoughts? I would like something like this for a web service that
> has to evaluate untrusted code.
Would be nice!
> (define (call-with-allocation-limit limit thunk limit-reached)
> "Call @var{thunk}, but cancel it if @var{limit} bytes have been
> allocated. If the computation is cancelled, call @var{limit-reached} in
> tail position. @var{thunk} must not disable interrupts or prevent an
> abort via a @code{dynamic-wind} unwind handler.
>
> This limit applies to both stack and heap allocation. The computation
> will not be aborted before @var{limit} bytes have been allocated, but
> for the heap allocation limit, the check may be postponed until the next garbage collection."
> (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
> (let ((zero (bytes-allocated))
> (tag (make-prompt-tag)))
> (define (check-allocation)
> (when (< limit (- (bytes-allocated) zero))
> (abort-to-prompt tag)))
> (call-with-prompt tag
> (lambda ()
> (dynamic-wind
> (lambda ()
> (add-hook! after-gc-hook check-allocation))
> (lambda ()
> (call-with-stack-overflow-handler
> ;; The limit is in "words", which used to be 4 or 8 but now
> ;; is always 8 bytes.
> (floor/ limit 8)
> thunk
> (lambda () (abort-to-prompt tag))))
> (lambda ()
> (remove-hook! after-gc-hook check-allocation))))
> (lambda (k)
> (limit-reached)))))
The allocations that trigger ‘after-gc-hook’ could be caused by a
separate thread, right? That’s probably an acceptable limitation, but
one to be aware of.
Also, if the code does:
(make-bytevector (expt 2 32))
then ‘after-gc-hook’ runs too late, as the comment notes.
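The granularity of the check can be seen with a sketch like the following. It assumes the attached module is loadable as (ice-9 sandbox) and that only one thread is allocating. `allocate-a-lot' and the 1 MB limit are hypothetical, chosen so that many small allocations force several collections. The loop is bounded so that it terminates even if the hook never fires.

```scheme
;; Sketch only: assumes the attached module is loadable as (ice-9 sandbox).
(use-modules (ice-9 sandbox))

;; Hypothetical workload: builds a long list in a tail-recursive loop,
;; so the heap grows steadily while the stack stays flat.
(define (allocate-a-lot)
  (let lp ((n 0) (acc '()))
    (if (< n 5000000)
        (lp (+ n 1) (cons n acc))
        (length acc))))

;; The limit is only compared against the allocation counter from
;; `after-gc-hook', so the abort happens at some collection after the
;; limit has been passed, not at the exact byte.
(define result
  (call-with-allocation-limit
   (* 1000 1000)
   allocate-a-lot
   (lambda () 'allocation-limit-reached)))
```

With many small allocations the hook gets frequent chances to run. A single huge allocation gives it none until the memory has already been requested.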
> (define (make-sandbox-module bindings)
> "Return a fresh module that only contains @var{bindings}.
>
> The @var{bindings} should be given as a list of import sets. One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports. An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively."
> (let ((m (make-fresh-user-module)))
> (purify-module! m)
> ;; FIXME: We want to have a module that will be collectable by GC.
> ;; Currently in Guile all modules are part of a single tree, and
> ;; once a module is part of that tree it will never be collected.
> ;; So we want to sever the module off from that tree. However the
> ;; psyntax syntax expander currently needs to be able to look up
> ;; modules by name; being severed from the name tree prevents that
> ;; from happening. So for now, each evaluation leaks memory :/
> ;;
> ;; (sever-module! m)
> (module-use-interfaces! m
> (map (match-lambda
> ((mod-name . bindings)
> (resolve-interface mod-name
> #:select bindings)))
> bindings))
> m))
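A small sketch of the import-set format described in the docstring above, assuming the posted file is installed so that (ice-9 sandbox) can be loaded. The names demo-bindings, demo-module and result are illustrative only.

```scheme
(use-modules (ice-9 sandbox))

;; One import set: the interface name (guile), followed by imports.
(define demo-bindings
  '(((guile)
     list              ; bare symbol: imported under its own name
     (car . first))))  ; pair: exported as car, visible as first

(define demo-module (make-sandbox-module demo-bindings))

;; Only list and first are visible inside demo-module; car itself is not.
(define result (eval '(first (list 1 2 3)) demo-module))
;; result => 1
```

Because the module is purified, anything not named in an import set (including car under its original name) should be unbound inside it.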
IIUC ‘@@’ is unavailable in the returned module, right?
--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (eval '(@@ (guile) resolve-interface)
(let ((m (make-fresh-user-module)))
(purify-module! m)
m))
ERROR: In procedure %resolve-variable:
ERROR: Unbound variable: @@
--8<---------------cut here---------------end--------------->8---
Isn’t make-fresh-user-module + purify-module! equivalent to just
(make-module)?
> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-pair-bindings*
> '(((guile)
> set-car!
> set-cdr!)))
When used on a literal pair (mapped read-only), these can cause a
segfault. Now since the code is ‘eval’d, the only literal pairs it can
see are those passed by the caller I suppose, so this may be safe?
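One way a caller could sidestep the read-only literal problem, sketched here as an assumption rather than something the posted module does: copy literal data before handing it to code that may call set-car! or set-cdr!. The names literal and fresh are made up for the example.

```scheme
;; A quoted list may end up in read-only memory once compiled.
(define literal '(1 2 3))

;; list-copy allocates fresh pairs for the spine of the list, so
;; mutating the copy never touches the literal's storage.
(define fresh (list-copy literal))
(set-car! fresh 'changed)

;; fresh   => (changed 2 3)
;; literal => (1 2 3)
```

Note that list-copy only copies the top-level pairs; nested literal pairs would need a deep copy.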
> (define *all-pure-and-impure-bindings*
> (append *all-pure-bindings*
Last but not least: why all the stars? :-)
I’m used to ‘%something’.
Thank you!
Ludo’.
* Re: RFC: (ice-9 sandbox)
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
2017-03-31 11:33 ` Ludovic Courtès
@ 2017-03-31 14:41 ` Mike Gran
2017-04-01 14:33 ` Christopher Allan Webber
` (3 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Mike Gran @ 2017-03-31 14:41 UTC (permalink / raw)
To: Andy Wingo, guile-devel@gnu.org
> On Friday, March 31, 2017 2:28 AM, Andy Wingo <wingo@pobox.com> wrote:
> Any thoughts? I would like something like this for a web service that
> has to evaluate untrusted code.
Neat! Here are some random, tangential ideas.
While this might be a good route toward a pragmatic definition of
"safe," a route to a stronger version of safety might be trying
to compile a Guile against the CloudABI C library -- which prevents
OS interaction altogether -- and then use something like inetd to
communicate with your safe guile.
As a middle ground, there are the --disable-posix,
--disable-networking, and --disable-regex options to consider.
-Mike Gran
* Re: RFC: (ice-9 sandbox)
2017-03-31 11:33 ` Ludovic Courtès
@ 2017-03-31 16:26 ` Andy Wingo
2017-03-31 21:41 ` Ludovic Courtès
0 siblings, 1 reply; 17+ messages in thread
From: Andy Wingo @ 2017-03-31 16:26 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:
> Andy Wingo <wingo@pobox.com> skribis:
>
> The allocations that trigger ‘after-gc-hook’ could be caused by a
> separate thread, right? That’s probably an acceptable limitation, but
> one to be aware of.
Ah yes, we should document this. Sadly we just don't have very good
metrics here.
> Also, if the code does:
>
> (make-bytevector (expt 2 32))
>
> then ‘after-gc-hook’ runs too late, as the comment notes.
Yep.
> IIUC ‘@@’ is unavailable in the returned module, right?
Correct. You could put it there but that's a bad idea.
> Isn’t make-fresh-user-module + purify-module! equivalent to just
> (make-module)?
No, beautify-user-module! does a few more things too. I was thinking
that we would want to be able to work on the public interface of the
module so I wanted to make sure it was there but in retrospect we don't
need it and can probably simplify things I guess.
>> ;; These can only form part of a safe binding set if no mutable
>> ;; pair is exposed to the sandbox.
>> (define *mutating-pair-bindings*
>> '(((guile)
>> set-car!
>> set-cdr!)))
>
> When used on a literal pair (mapped read-only), these can cause a
> segfault. Now since the code is ‘eval’d, the only literal pairs it can
> see are those passed by the caller I suppose, so this may be safe?
Who knows. I mean vector-set! can also cause segfaults. I think we
should fix that situation to throw an exception.
>> (define *all-pure-and-impure-bindings*
>> (append *all-pure-bindings*
>
> Last but not least: why all the stars? :-)
> I’m used to ‘%something’.
For me I read % as being pronounced "sys" and indicating internal
bindings. Why do you use it for globals? Is it your proposal that we
use it for globals?
Andy
* Re: RFC: (ice-9 sandbox)
2017-03-31 16:26 ` Andy Wingo
@ 2017-03-31 21:41 ` Ludovic Courtès
2017-04-02 10:18 ` Andy Wingo
0 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2017-03-31 21:41 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Andy Wingo <wingo@pobox.com> skribis:
> On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:
[...]
>>> ;; These can only form part of a safe binding set if no mutable
>>> ;; pair is exposed to the sandbox.
>>> (define *mutating-pair-bindings*
>>> '(((guile)
>>> set-car!
>>> set-cdr!)))
>>
>> When used on a literal pair (mapped read-only), these can cause a
>> segfault. Now since the code is ‘eval’d, the only literal pairs it can
>> see are those passed by the caller I suppose, so this may be safe?
>
> Who knows. I mean vector-set! can also cause segfaults. I think we
> should fix that situation to throw an exception.
Yes, that would be nice, though I suppose it’s currently tricky to
achieve no? Maybe that newfangled ‘userfaultfd’ will save us all.
>>> (define *all-pure-and-impure-bindings*
>>> (append *all-pure-bindings*
>>
>> Last but not least: why all the stars? :-)
>> I’m used to ‘%something’.
>
> For me I read % as being pronounced "sys" and indicating internal
> bindings. Why do you use it for globals? Is it your proposal that we
> use it for globals?
I tend to do that but I realize I must be a minority here. Let it be
stars then. :-)
Thanks for working on this!
Ludo’.
* Re: RFC: (ice-9 sandbox)
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
2017-03-31 11:33 ` Ludovic Courtès
2017-03-31 14:41 ` Mike Gran
@ 2017-04-01 14:33 ` Christopher Allan Webber
2017-04-06 21:41 ` Freja Nordsiek
` (2 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Christopher Allan Webber @ 2017-04-01 14:33 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Wow! With this I suppose we could implement something like
http://mumble.net/~jar/pubs/secureos/secureos.html
?
* Re: RFC: (ice-9 sandbox)
2017-03-31 21:41 ` Ludovic Courtès
@ 2017-04-02 10:18 ` Andy Wingo
2017-04-03 15:35 ` Ludovic Courtès
0 siblings, 1 reply; 17+ messages in thread
From: Andy Wingo @ 2017-04-02 10:18 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
On Fri 31 Mar 2017 23:41, ludo@gnu.org (Ludovic Courtès) writes:
> Andy Wingo <wingo@pobox.com> skribis:
>
>> On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>>> ;; These can only form part of a safe binding set if no mutable
>>>> ;; pair is exposed to the sandbox.
>>>> (define *mutating-pair-bindings*
>>>> '(((guile)
>>>> set-car!
>>>> set-cdr!)))
>>>
>>> When used on a literal pair (mapped read-only), these can cause a
>>> segfault. Now since the code is ‘eval’d, the only literal pairs it can
>>> see are those passed by the caller I suppose, so this may be safe?
>>
>> Who knows. I mean vector-set! can also cause segfaults. I think we
>> should fix that situation to throw an exception.
>
> Yes, that would be nice, though I suppose it’s currently tricky to
> achieve no? Maybe that newfangled ‘userfaultfd’ will save us all.
Maybe :) I mean it's possible now to catch SIGSEGV. I just sent a
patch to guile-devel; wdyt? Needs docs & tests of course.
>>>> (define *all-pure-and-impure-bindings*
>>>> (append *all-pure-bindings*
>>>
>>> Last but not least: why all the stars? :-)
>>> I’m used to ‘%something’.
>>
>> For me I read % as being pronounced "sys" and indicating internal
>> bindings. Why do you use it for globals? Is it your proposal that we
>> use it for globals?
>
> I tend to do that but I realize I must be a minority here. Let it be
> stars then. :-)
I think that like you, I learned Scheme conventions in an ad-hoc way,
aping conventions from many sources (Guile's own code, Common Lisp,
random Scheme). I would be happy if we could be a bit more purposeful
about our conventions and I would be happy to change mine :) %
can work fine :)
Andy
* Re: RFC: (ice-9 sandbox)
2017-04-02 10:18 ` Andy Wingo
@ 2017-04-03 15:35 ` Ludovic Courtès
2017-04-14 10:52 ` Andy Wingo
0 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2017-04-03 15:35 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Andy Wingo <wingo@pobox.com> skribis:
> On Fri 31 Mar 2017 23:41, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Andy Wingo <wingo@pobox.com> skribis:
>>
>>> On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:
>>
>> [...]
>>
>>>>> ;; These can only form part of a safe binding set if no mutable
>>>>> ;; pair is exposed to the sandbox.
>>>>> (define *mutating-pair-bindings*
>>>>> '(((guile)
>>>>> set-car!
>>>>> set-cdr!)))
>>>>
>>>> When used on a literal pair (mapped read-only), these can cause a
>>>> segfault. Now since the code is ‘eval’d, the only literal pairs it can
>>>> see are those passed by the caller I suppose, so this may be safe?
>>>
>>> Who knows. I mean vector-set! can also cause segfaults. I think we
>>> should fix that situation to throw an exception.
>>
>> Yes, that would be nice, though I suppose it’s currently tricky to
>> achieve no? Maybe that newfangled ‘userfaultfd’ will save us all.
>
> Maybe :) I mean it's possible now to catch SIGSEGV. I just sent a
> patch to guile-devel; wdyt? Needs docs & tests of course.
Neat! I’ll look into it.
>>>>> (define *all-pure-and-impure-bindings*
>>>>> (append *all-pure-bindings*
>>>>
>>>> Last but not least: why all the stars? :-)
>>>> I’m used to ‘%something’.
>>>
>>> For me I read % as being pronounced "sys" and indicating internal
>>> bindings. Why do you use it for globals? Is it your proposal that we
>>> use it for globals?
>>
>> I tend to do that but I realize I must be a minority here. Let it be
>> stars then. :-)
>
> I think that like you, I learned Scheme conventions in an ad-hoc way,
> aping conventions from many sources (Guile's own code, Common Lisp,
> random Scheme). I would be happy if we could be a bit more purposeful
> about our conventions and I would be happy to change mine :) %
> can work fine :)
I grepped Guile and it seems that stars are actually more common for
globals than % (I thought it was the opposite but as you say, I kind of
discovered/invented the conventions.)
Riastradh’s document at <http://mumble.net/~campbell/scheme/style.txt>
has this:
Affix asterisks to the beginning and end of a globally mutable
variable. This allows the reader of the program to recognize very
easily that it is badly written!
… but it doesn’t say anything about constants nor about %.
It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
‘%all-pure-bindings’. So, dunno, as you see fit!
Ludo’.
* Re: RFC: (ice-9 sandbox)
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
` (2 preceding siblings ...)
2017-04-01 14:33 ` Christopher Allan Webber
@ 2017-04-06 21:41 ` Freja Nordsiek
2017-04-14 10:58 ` Andy Wingo
2017-04-15 17:23 ` Nala Ginrut
2017-04-18 19:48 ` Andy Wingo
5 siblings, 1 reply; 17+ messages in thread
From: Freja Nordsiek @ 2017-04-06 21:41 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
I took a look at the specific bindings the sandbox makes available and
have a few thoughts.
I didn't see any problems with any of the pure bindings made
available, but I am only very familiar with basic R5RS, R6RS, and R7RS
bindings, not Guile extensions (yet, at least), so I can't comment on
many of them.
On the subject of ports and i/o, I have a few ideas. R6RS i/o in the
(rnrs io ports) module generally requires the port to be explicitly
given, rather than assuming current in or out if not given (though
(rnrs io simple) does make those assumptions). For many of them,
omitting the port would be impossible because the port is the first
argument and a required second argument follows it. Looking at
module/rnrs/io/ports.scm in Guile 2.2.x, it looks like the reading and
writing procedures there
should be safe. Obviously, nothing that opens a file should be used,
nor the procedures to get current input, output, and error; but the
rest can be used. And this includes string and bytevector ports, which
could be very useful in the sandbox (I don't know about anyone else,
but I use string ports all the time).
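As a rough illustration of the point about explicit ports, here is a sketch using string ports from (rnrs io ports). Every read and write names its port, so nothing falls back to the current input or output port. The names shout and first-line are hypothetical, and whether these bindings are actually safe to expose in the sandbox is exactly the open question above.

```scheme
(use-modules (rnrs io ports))

;; Write to a string port that exists only for the duration of the
;; call; the port is always passed explicitly.
(define (shout text)
  (call-with-string-output-port
   (lambda (port)
     (put-string port (string-upcase text))
     (put-char port #\!))))

;; Read the first line from a string input port, again with the port
;; given explicitly.
(define first-line
  (get-line (open-string-input-port "alpha\nbeta\n")))

;; (shout "hi") => "HI!"
;; first-line   => "alpha"
```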
One question: is there a particular reason that guard is not exported?
It doesn't seem as nasty as dynamic-wind when it comes to blocking
termination, though maybe I am just not seeing how it could be used to
prevent the sandbox from terminating the evaluation. Having at least one
exception handling binding might be very helpful in a sandbox.
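For reference, a minimal sketch of what guard looks like in use, taking it from (rnrs exceptions). This only shows the form itself; it says nothing about whether exposing it to sandboxed code is safe. The name try-risky is made up for the example.

```scheme
(use-modules (rnrs exceptions))

;; Run THUNK; if it raises a symbol, return a list describing it
;; instead of letting the exception propagate.
(define (try-risky thunk)
  (guard (e ((symbol? e) (list 'caught e)))
    (thunk)))

;; (try-risky (lambda () (raise 'boom))) => (caught boom)
;; (try-risky (lambda () 42))            => 42
```

Unlike a dynamic-wind out guard, the handler clauses here run only when an exception is raised in the body, not on every exit from it.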
Freja Nordsiek
On Fri, Mar 31, 2017 at 11:27 AM, Andy Wingo <wingo@pobox.com> wrote:
> Hi,
>
> Attached is a module that can evaluate an expression within a sandbox.
> If the evaluation takes too long or allocates too much, it will be
> cancelled. The evaluation will take place with respect to a module with
> a "safe" set of imports. Those imports include most of the bindings
> available in a default Guile environment. See the file below for full
> details and a number of caveats.
>
> Any thoughts? I would like something like this for a web service that
> has to evaluate untrusted code.
>
> Andy
>
>
> ;;; Sandboxed evaluation of Scheme code
>
> ;;; Copyright (C) 2017 Free Software Foundation, Inc.
>
> ;;;; This library is free software; you can redistribute it and/or
> ;;;; modify it under the terms of the GNU Lesser General Public
> ;;;; License as published by the Free Software Foundation; either
> ;;;; version 3 of the License, or (at your option) any later version.
> ;;;;
> ;;;; This library is distributed in the hope that it will be useful,
> ;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> ;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> ;;;; Lesser General Public License for more details.
> ;;;;
> ;;;; You should have received a copy of the GNU Lesser General Public
> ;;;; License along with this library; if not, write to the Free Software
> ;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>
> ;;; Commentary:
> ;;;
> ;;; Code:
>
> (define-module (ice-9 sandbox)
> #:use-module (ice-9 control)
> #:use-module (ice-9 match)
> #:use-module (system vm vm)
> #:export (call-with-time-limit
> call-with-allocation-limit
> call-with-time-and-allocation-limits
>
> eval-in-sandbox
> make-sandbox-module
>
> *alist-bindings*
> *array-bindings*
> *bit-bindings*
> *bitvector-bindings*
> *char-bindings*
> *char-set-bindings*
> *clock-bindings*
> *core-bindings*
> *error-bindings*
> *fluid-bindings*
> *hash-bindings*
> *iteration-bindings*
> *keyword-bindings*
> *list-bindings*
> *macro-bindings*
> *nil-bindings*
> *number-bindings*
> *pair-bindings*
> *predicate-bindings*
> *procedure-bindings*
> *promise-bindings*
> *prompt-bindings*
> *regexp-bindings*
> *sort-bindings*
> *srfi-4-bindings*
> *string-bindings*
> *symbol-bindings*
> *unspecified-bindings*
> *variable-bindings*
> *vector-bindings*
> *version-bindings*
>
> *mutating-alist-bindings*
> *mutating-array-bindings*
> *mutating-bitvector-bindings*
> *mutating-fluid-bindings*
> *mutating-hash-bindings*
> *mutating-list-bindings*
> *mutating-pair-bindings*
> *mutating-sort-bindings*
> *mutating-srfi-4-bindings*
> *mutating-string-bindings*
> *mutating-variable-bindings*
> *mutating-vector-bindings*
>
> *all-pure-bindings*
> *all-pure-and-impure-bindings*))
>
>
> (define (call-with-time-limit limit thunk limit-reached)
> "Call @var{thunk}, but cancel it if @var{limit} seconds of wall-clock
> time have elapsed. If the computation is cancelled, call
> @var{limit-reached} in tail position. @var{thunk} must not disable
> interrupts or prevent an abort via a @code{dynamic-wind} unwind
> handler."
> ;; FIXME: use separate thread instead of sigalrm.
> (let ((limit-usecs (inexact->exact (round (* limit 1e6))))
> (prev-sigalarm-handler #f)
> (tag (make-prompt-tag)))
> (call-with-prompt tag
> (lambda ()
> (dynamic-wind
> (lambda ()
> (set! prev-sigalarm-handler
> (sigaction SIGALRM (lambda (sig) (abort-to-prompt tag))))
> (setitimer ITIMER_REAL 0 0 0 limit-usecs))
> thunk
> (lambda ()
> (setitimer ITIMER_REAL 0 0 0 0)
> (match prev-sigalarm-handler
> ((handler . flags)
> (sigaction SIGALRM handler flags))))))
> (lambda (k)
> (limit-reached)))))
>
> (define (call-with-allocation-limit limit thunk limit-reached)
> "Call @var{thunk}, but cancel it if @var{limit} bytes have been
> allocated. If the computation is cancelled, call @var{limit-reached} in
> tail position. @var{thunk} must not disable interrupts or prevent an
> abort via a @code{dynamic-wind} unwind handler.
>
> This limit applies to both stack and heap allocation. The computation
> will not be aborted before @var{limit} bytes have been allocated, but
> for the heap allocation limit, the check may be postponed until the next garbage collection."
> (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
> (let ((zero (bytes-allocated))
> (tag (make-prompt-tag)))
> (define (check-allocation)
> (when (< limit (- (bytes-allocated) zero))
> (abort-to-prompt tag)))
> (call-with-prompt tag
> (lambda ()
> (dynamic-wind
> (lambda ()
> (add-hook! after-gc-hook check-allocation))
> (lambda ()
> (call-with-stack-overflow-handler
> ;; The limit is in "words", which used to be 4 or 8 but now
> ;; is always 8 bytes.
> (floor/ limit 8)
> thunk
> (lambda () (abort-to-prompt tag))))
> (lambda ()
> (remove-hook! after-gc-hook check-allocation))))
> (lambda (k)
> (limit-reached)))))
>
> (define (call-with-time-and-allocation-limits time-limit allocation-limit
> thunk)
> "Invoke @var{thunk} in a dynamic extent in which its execution is
> limited to @var{time-limit} seconds of wall-clock time, and its
> allocation to @var{allocation-limit} bytes. @var{thunk} must not
> disable interrupts or prevent an abort via a @code{dynamic-wind} unwind
> handler.
>
> If successful, return all values produced by invoking @var{thunk}. Any
> uncaught exception thrown by the thunk will propagate out. If the time
> or allocation limit is exceeded, an exception will be thrown to the
> @code{limit-exceeded} key."
>
> (call-with-time-limit
> time-limit
> (lambda ()
> (call-with-allocation-limit
> allocation-limit
> thunk
> (lambda ()
> (scm-error 'limit-exceeded "with-resource-limits"
> "Allocation limit exceeded" '() #f))))
> (lambda ()
> (scm-error 'limit-exceeded "with-resource-limits"
> "Time limit exceeded" '() #f))))
>
> (define (sever-module! m)
> "Remove @var{m} from its container module."
> (match (module-name m)
> ((head ... tail)
> (let ((parent (resolve-module head #f)))
> (unless (eq? m (module-ref-submodule parent tail))
> (error "can't sever module?"))
> (hashq-remove! (module-submodules parent) tail)))))
>
> ;; bindings := module-binding-list ...
> ;; module-binding-list := interface-name import ...
> ;; import := name | (exported-name . imported-name)
> ;; name := symbol
> (define (make-sandbox-module bindings)
> "Return a fresh module that only contains @var{bindings}.
>
> The @var{bindings} should be given as a list of import sets. One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports. An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively."
> (let ((m (make-fresh-user-module)))
> (purify-module! m)
> ;; FIXME: We want to have a module that will be collectable by GC.
> ;; Currently in Guile all modules are part of a single tree, and
> ;; once a module is part of that tree it will never be collected.
> ;; So we want to sever the module off from that tree. However the
> ;; psyntax syntax expander currently needs to be able to look up
> ;; modules by name; being severed from the name tree prevents that
> ;; from happening. So for now, each evaluation leaks memory :/
> ;;
> ;; (sever-module! m)
> (module-use-interfaces! m
> (map (match-lambda
> ((mod-name . bindings)
> (resolve-interface mod-name
> #:select bindings)))
> bindings))
> m))
>
> (define* (eval-in-sandbox exp #:key
> (time-limit 0.1)
> (allocation-limit #e10e6)
> (bindings *all-pure-bindings*)
> (module (make-sandbox-module bindings)))
> "Evaluate the Scheme expression @var{exp} within an isolated
> \"sandbox\". Limit its execution to @var{time-limit} seconds of
> wall-clock time, and limit its allocation to @var{allocation-limit}
> bytes.
>
> The evaluation will occur in @var{module}, which defaults to the result
> of calling @code{make-sandbox-module} on @var{bindings}, which itself
> defaults to @code{*all-pure-bindings*}. This is the core of the
> sandbox: creating a scope for the expression that is @dfn{safe}.
>
> A safe sandbox module has two characteristics. Firstly, it will not
> allow the expression being evaluated to avoid being cancelled due to
> time or allocation limits. This ensures that the expression terminates
> in a timely fashion.
>
> Secondly, a safe sandbox module will prevent the evaluation from
> receiving information from previous evaluations, or from affecting
> future evaluations. All combinations of binding sets exported by
> @code{(ice-9 sandbox)} form safe sandbox modules.
>
> The @var{bindings} should be given as a list of import sets. One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports. An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively. Note that @var{bindings} is only used as an input to the
> default initializer for the @var{module} argument; if you pass
> @code{#:module}, @var{bindings} is unused.
>
> If successful, return all values produced by @var{exp}. Any uncaught
> exception thrown by the expression will propagate out. If the time or
> allocation limit is exceeded, an exception will be thrown to the
> @code{limit-exceeded} key."
> (call-with-time-and-allocation-limits
> time-limit allocation-limit
> (lambda ()
> ;; Prevent the expression from forging syntax objects. See "Syntax
> ;; Transformer Helpers" in the manual.
> (parameterize ((allow-legacy-syntax-objects? #f))
> (eval exp module)))))
>
>
> ;; An evaluation-sandboxing facility is safe if:
> ;;
> ;; (1) every evaluation will terminate in a timely manner
> ;;
> ;; (2) no evaluation can affect future evaluations
> ;;
> ;; For (1), we impose a user-controllable time limit on the evaluation,
> ;; in wall-clock time. When that limit is reached, Guile schedules an
> ;; asynchronous interrupt in the sandbox that aborts the computation.
> ;; For this to work, the sandboxed evaluation must not disable
> ;; interrupts, and it must not prevent timely aborts via malicious "out"
> ;; guards in dynamic-wind thunks.
> ;;
> ;; The sandbox also has an allocation limit that uses a similar cancel
> ;; mechanism, but this limit is less precise as it only runs at
> ;; garbage-collection time.
> ;;
> ;; The sandbox sets the allocation limit as the stack limit as well.
> ;;
> ;; For (2), the only way an evaluation can affect future evaluations is
> ;; if it causes a side-effect outside its sandbox. That side effect
> ;; could change the way the host or future sandboxed evaluations
> ;; operate, or it could leak information to future evaluations.
> ;;
> ;; One means of information leakage would be the file system. Although
> ;; one can imagine "safe" ways to access a file system, in practice we
> ;; just prevent all access to this and other operating system facilities
> ;; by not exposing the Guile primitives that access the file system,
> ;; connect to networking hosts, etc. If we chose our set of bindings
> ;; correctly and it is impossible to access host values other than those
> ;; given to the evaluation, then we have succeeded in granting only a
> ;; limited set of capabilities to the guest.
> ;;
> ;; To prevent information leakage we also limit other information about
> ;; the host, like its hostname or the Guile build information.
> ;;
> ;; The guest must also not have the capability to mutate a location used
> ;; by the host or by future sandboxed evaluations. Either you expose no
> ;; primitives to the evaluation that can mutate locations, or you expose
> ;; no mutable locations. In this sandbox we opt for a combination of
> ;; the two, though the selection of bindings is up to the user. "set!"
> ;; is always excluded, as Guile doesn't have a nice way to prevent set!
> ;; on imported bindings. But variable-set! is included, as no set of
> ;; bindings from this module includes a variable or a capability to a
> ;; variable. It's possible though to build sandbox modules with no
> ;; mutating primitives. As far as we know, all possible combinations of
> ;; the binding sets listed below are safe.
> ;;
> (define *core-bindings*
> '(((guile)
> and
> begin
> apply
> call-with-values
> values
> case
> case-lambda
> case-lambda*
> cond
> define
> define*
> define-values
> do
> if
> lambda
> lambda*
> let
> let*
> letrec
> letrec*
> or
> quasiquote
> quote
> ;; Can't allow mutation to globals.
> ;; set!
> unless
> unquote
> unquote-splicing
> when
> while
> λ)))
>
> (define *macro-bindings*
> '(((guile)
> bound-identifier=?
> ;; Although these have "current" in their name, they are lexically
> ;; scoped, not dynamically scoped.
> current-filename
> current-source-location
> datum->syntax
> define-macro
> define-syntax
> define-syntax-parameter
> define-syntax-rule
> defmacro
> free-identifier=?
> generate-temporaries
> gensym
> identifier-syntax
> identifier?
> let-syntax
> letrec-syntax
> macroexpand
> macroexpanded?
> quasisyntax
> start-stack
> syntax
> syntax->datum
> syntax-case
> syntax-error
> syntax-parameterize
> syntax-rules
> syntax-source
> syntax-violation
> unsyntax
> unsyntax-splicing
> with-ellipsis
> with-syntax
> make-variable-transformer)))
>
> (define *iteration-bindings*
> '(((guile)
> compose
> for-each
> identity
> iota
> map
> map-in-order
> const
> noop)))
>
> (define *clock-bindings*
> '(((guile)
> get-internal-real-time
> internal-time-units-per-second
> sleep
> usleep)))
>
> (define *procedure-bindings*
> '(((guile)
> procedure-documentation
> procedure-minimum-arity
> procedure-name
> procedure?
> thunk?)))
>
> (define *version-bindings*
> '(((guile)
> effective-version
> major-version
> micro-version
> minor-version
> version
> version-matches?)))
>
> (define *nil-bindings*
> '(((guile)
> nil?)))
>
> (define *unspecified-bindings*
> '(((guile)
> unspecified?
> *unspecified*)))
>
> (define *predicate-bindings*
> '(((guile)
> ->bool
> and-map
> and=>
> boolean?
> eq?
> equal?
> eqv?
> negate
> not
> or-map)))
>
> ;; The current ports (current-input-port et al) are dynamically scoped,
> ;; which is a footgun from a sandboxing perspective. It's too easy for
> ;; a procedure that is the result of a sandboxed evaluation to be later
> ;; invoked in a different context and thereby be implicitly granted
> ;; capabilities to whatever port is then current. This is compounded by
> ;; the fact that most Scheme i/o primitives allow the port to be omitted
> ;; and thereby default to whatever's current. For now, sadly, we avoid
> ;; exposing any i/o primitive to the sandbox.
> #;
> (define *i/o-bindings*
> '(((guile)
> display
> eof-object?
> force-output
> format
> make-soft-port
> newline
> read
> simple-format
> write
> write-char)
> ((ice-9 ports)
> %make-void-port
> char-ready?
> ;; Note that these are mutable parameters.
> current-error-port
> current-input-port
> current-output-port
> current-warning-port
> drain-input
> eof-object?
> file-position
> force-output
> ftell
> input-port?
> output-port?
> peek-char
> port-closed?
> port-column
> port-conversion-strategy
> port-encoding
> port-filename
> port-line
> port-mode
> port?
> read-char
> the-eof-object
> ;; We don't provide open-output-string because it needs
> ;; get-output-string, and get-output-string provides a generic
> ;; capability on any output string port. For consistency then we
> ;; don't provide open-input-string either; call-with-input-string
> ;; is sufficient.
> call-with-input-string
> call-with-output-string
> with-error-to-port
> with-error-to-string
> with-input-from-port
> with-input-from-string
> with-output-to-port
> with-output-to-string)))
>
> ;; If two evaluations are called with the same input port, unread-char
> ;; and unread-string can use a port as a mutable channel to pass
> ;; information from one to the other.
> #;
> (define *mutating-i/o-bindings*
> '(((guile)
> set-port-encoding!)
> ((ice-9 ports)
> close-input-port
> close-output-port
> close-port
> file-set-position
> seek
> set-port-column!
> set-port-conversion-strategy!
> set-port-encoding!
> set-port-filename!
> set-port-line!
> setvbuf
> unread-char
> unread-string)))
>
> (define *error-bindings*
> '(((guile)
> error
> throw
> with-throw-handler
> catch
> ;; false-if-exception can cause i/o if the #:warning arg is passed.
> ;; false-if-exception
>
> ;; See notes on *i/o-bindings*.
> ;; peek
> ;; pk
> ;; print-exception
> ;; warn
> strerror
> scm-error
> )))
>
> ;; FIXME: Currently we can't expose anything that works on the current
> ;; module to the sandbox. It could be that the sandboxed evaluation
> ;; returns a procedure, and that procedure may later be invoked in a
> ;; different context with a different current-module and it is unlikely
> ;; that the later caller will consider themselves as granting a
> ;; capability on whatever module is then current. Likewise export (and
> ;; by extension, define-public and the like) also operate on the current
> ;; module.
> ;;
> ;; It could be that we could expose a statically scoped eval to the
> ;; sandbox.
> #;
> (define *eval-bindings*
> '(((guile)
> current-module
> module-name
> module?
> define-once
> define-private
> define-public
> defined?
> export
> defmacro-public
> ;; FIXME: single-arg eval?
> eval
> primitive-eval
> eval-string
> self-evaluating?
> ;; Can we?
> set-current-module)))
>
> (define *sort-bindings*
> '(((guile)
> sort
> sorted?
> stable-sort
> sort-list)))
>
> ;; These can only form part of a safe binding set if no mutable pair or
> ;; vector is exposed to the sandbox.
> (define *mutating-sort-bindings*
> '(((guile)
> sort!
> stable-sort!
> sort-list!
> restricted-vector-sort!)))
>
> (define *regexp-bindings*
> '(((guile)
> make-regexp
> regexp-exec
> regexp/basic
> regexp/extended
> regexp/icase
> regexp/newline
> regexp/notbol
> regexp/noteol
> regexp?)))
>
> (define *alist-bindings*
> '(((guile)
> acons
> assoc
> assoc-ref
> assq
> assq-ref
> assv
> assv-ref
> sloppy-assoc
> sloppy-assq
> sloppy-assv)))
>
> ;; These can only form part of a safe binding set if no mutable pair
> ;; is exposed to the sandbox. Unfortunately all charsets in Guile are
> ;; mutable, currently, including the built-in charsets, so we can't
> ;; expose these primitives.
> (define *mutating-alist-bindings*
> '(((guile)
> assoc-remove!
> assoc-set!
> assq-remove!
> assq-set!
> assv-remove!
> assv-set!)))
>
> (define *number-bindings*
> '(((guile)
> *
> +
> -
> /
> 1+
> 1-
> <
> <=
> =
> >
> >=
> abs
> acos
> acosh
> angle
> asin
> asinh
> atan
> atanh
> ceiling
> ceiling-quotient
> ceiling-remainder
> ceiling/
> centered-quotient
> centered-remainder
> centered/
> complex?
> cos
> cosh
> denominator
> euclidean-quotient
> euclidean-remainder
> euclidean/
> even?
> exact->inexact
> exact-integer-sqrt
> exact-integer?
> exact?
> exp
> expt
> finite?
> floor
> floor-quotient
> floor-remainder
> floor/
> gcd
> imag-part
> inf
> inf?
> integer-expt
> integer-length
> integer?
> lcm
> log
> log10
> magnitude
> make-polar
> make-rectangular
> max
> min
> modulo
> modulo-expt
> most-negative-fixnum
> most-positive-fixnum
> nan
> nan?
> negative?
> numerator
> odd?
> positive?
> quotient
> rational?
> rationalize
> real-part
> real?
> remainder
> round
> round-quotient
> round-remainder
> round/
> sin
> sinh
> sqrt
> tan
> tanh
> truncate
> truncate-quotient
> truncate-remainder
> truncate/
> zero?
> number?
> number->string
> string->number)))
>
> (define *char-set-bindings*
> '(((guile)
> ->char-set
> char-set
> char-set->list
> char-set->string
> char-set-adjoin
> char-set-any
> char-set-complement
> char-set-contains?
> char-set-copy
> char-set-count
> char-set-cursor
> char-set-cursor-next
> char-set-delete
> char-set-diff+intersection
> char-set-difference
> char-set-every
> char-set-filter
> char-set-fold
> char-set-for-each
> char-set-hash
> char-set-intersection
> char-set-map
> char-set-ref
> char-set-size
> char-set-unfold
> char-set-union
> char-set-xor
> char-set:ascii
> char-set:blank
> char-set:designated
> char-set:digit
> char-set:empty
> char-set:full
> char-set:graphic
> char-set:hex-digit
> char-set:iso-control
> char-set:letter
> char-set:letter+digit
> char-set:lower-case
> char-set:printing
> char-set:punctuation
> char-set:symbol
> char-set:title-case
> char-set:upper-case
> char-set:whitespace
> char-set<=
> char-set=
> char-set?
> end-of-char-set?
> list->char-set
> string->char-set
> ucs-range->char-set)))
>
> ;; These can only form part of a safe binding set if no mutable char-set
> ;; is exposed to the sandbox. Unfortunately all charsets in Guile are
> ;; mutable, currently, including the built-in charsets, so we can't
> ;; expose these primitives.
> #;
> (define *mutating-char-set-bindings*
> '(((guile)
> char-set-adjoin!
> char-set-complement!
> char-set-delete!
> char-set-diff+intersection!
> char-set-difference!
> char-set-filter!
> char-set-intersection!
> char-set-unfold!
> char-set-union!
> char-set-xor!
> list->char-set!
> string->char-set!
> ucs-range->char-set!)))
>
> (define *array-bindings*
> '(((guile)
> array->list
> array-cell-ref
> array-contents
> array-dimensions
> array-equal?
> array-for-each
> array-in-bounds?
> array-length
> array-rank
> array-ref
> array-shape
> array-slice
> array-slice-for-each
> array-slice-for-each-in-order
> array-type
> array-type-code
> array?
> list->array
> list->typed-array
> make-array
> make-shared-array
> make-typed-array
> shared-array-increments
> shared-array-offset
> shared-array-root
> transpose-array
> typed-array?)))
>
> ;; These can only form part of a safe binding set if no mutable vector,
> ;; bitvector, bytevector, srfi-4 vector, or array is exposed to the
> ;; sandbox.
> (define *mutating-array-bindings*
> '(((guile)
> array-cell-set!
> array-copy!
> array-copy-in-order!
> array-fill!
> array-index-map!
> array-map!
> array-map-in-order!
> array-set!)))
>
> (define *hash-bindings*
> '(((guile)
> doubly-weak-hash-table?
> hash
> hash-count
> hash-fold
> hash-for-each
> hash-for-each-handle
> hash-get-handle
> hash-map->list
> hash-ref
> hash-table?
> hashq
> hashq-get-handle
> hashq-ref
> hashv
> hashv-get-handle
> hashv-ref
> hashx-get-handle
> hashx-ref
> make-doubly-weak-hash-table
> make-hash-table
> make-weak-key-hash-table
> make-weak-value-hash-table
> weak-key-hash-table?
> weak-value-hash-table?)))
>
> ;; These can only form part of a safe binding set if no hash table is
> ;; exposed to the sandbox.
> (define *mutating-hash-bindings*
> '(((guile)
> hash-clear!
> hash-create-handle!
> hash-remove!
> hash-set!
> hashq-create-handle!
> hashq-remove!
> hashq-set!
> hashv-create-handle!
> hashv-remove!
> hashv-set!
> hashx-create-handle!
> hashx-remove!
> hashx-set!)))
>
> (define *variable-bindings*
> '(((guile)
> make-undefined-variable
> make-variable
> variable-bound?
> variable-ref
> variable?)))
>
> ;; These can only form part of a safe binding set if no mutable variable
> ;; is exposed to the sandbox; this applies particularly to variables
> ;; that are module bindings.
> (define *mutating-variable-bindings*
> '(((guile)
> variable-set!
> variable-unset!)))
>
> (define *string-bindings*
> '(((guile)
> absolute-file-name?
> file-name-separator-string
> file-name-separator?
> in-vicinity
> basename
> dirname
>
> list->string
> make-string
> object->string
> reverse-list->string
> string
> string->list
> string-any
> string-any-c-code
> string-append
> string-append/shared
> string-capitalize
> string-ci<
> string-ci<=
> string-ci<=?
> string-ci<>
> string-ci<?
> string-ci=
> string-ci=?
> string-ci>
> string-ci>=
> string-ci>=?
> string-ci>?
> string-compare
> string-compare-ci
> string-concatenate
> string-concatenate-reverse
> string-concatenate-reverse/shared
> string-concatenate/shared
> string-contains
> string-contains-ci
> string-copy
> string-count
> string-delete
> string-downcase
> string-drop
> string-drop-right
> string-every
> string-every-c-code
> string-filter
> string-fold
> string-fold-right
> string-for-each
> string-for-each-index
> string-hash
> string-hash-ci
> string-index
> string-index-right
> string-join
> string-length
> string-map
> string-normalize-nfc
> string-normalize-nfd
> string-normalize-nfkc
> string-normalize-nfkd
> string-null?
> string-pad
> string-pad-right
> string-prefix-ci?
> string-prefix-length
> string-prefix-length-ci
> string-prefix?
> string-ref
> string-replace
> string-reverse
> string-rindex
> string-skip
> string-skip-right
> string-split
> string-suffix-ci?
> string-suffix-length
> string-suffix-length-ci
> string-suffix?
> string-tabulate
> string-take
> string-take-right
> string-titlecase
> string-tokenize
> string-trim
> string-trim-both
> string-trim-right
> string-unfold
> string-unfold-right
> string-upcase
> string-utf8-length
> string<
> string<=
> string<=?
> string<>
> string<?
> string=
> string=?
> string>
> string>=
> string>=?
> string>?
> string?
> substring
> substring/copy
> substring/read-only
> substring/shared
> xsubstring)))
>
> ;; These can only form part of a safe binding set if no mutable string
> ;; is exposed to the sandbox.
> (define *mutating-string-bindings*
> '(((guile)
> string-capitalize!
> string-copy!
> string-downcase!
> string-fill!
> string-map!
> string-reverse!
> string-set!
> string-titlecase!
> string-upcase!
> string-xcopy!
> substring-fill!
> substring-move!)))
>
> (define *symbol-bindings*
> '(((guile)
> string->symbol
> string-ci->symbol
> symbol->string
> list->symbol
> make-symbol
> symbol
> symbol-append
> symbol-hash
> symbol-interned?
> symbol?)))
>
> (define *keyword-bindings*
> '(((guile)
> keyword?
> keyword->symbol
> symbol->keyword)))
>
> ;; These can only form part of a safe binding set if no valid prompt tag
> ;; is ever exposed to the sandbox, or can be constructed by the sandbox.
> (define *prompt-bindings*
> '(((guile)
> abort-to-prompt
> abort-to-prompt*
> call-with-prompt
> make-prompt-tag)))
>
> (define *bit-bindings*
> '(((guile)
> ash
> round-ash
> logand
> logcount
> logior
> lognot
> logtest
> logxor
> logbit?)))
>
> (define *bitvector-bindings*
> '(((guile)
> bit-count
> bit-count*
> bit-extract
> bit-position
> bitvector
> bitvector->list
> bitvector-length
> bitvector-ref
> bitvector?
> list->bitvector
> make-bitvector)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; bitvector is exposed to the sandbox.
> (define *mutating-bitvector-bindings*
> '(((guile)
> bit-invert!
> bit-set*!
> bitvector-fill!
> bitvector-set!)))
>
> (define *fluid-bindings*
> '(((guile)
> fluid-bound?
> fluid-ref
> ;; fluid-ref* could escape the sandbox and is not allowed.
> fluid-thread-local?
> fluid?
> make-fluid
> make-thread-local-fluid
> make-unbound-fluid
> with-fluid*
> with-fluids
> with-fluids*
> make-parameter
> parameter?
> parameterize)))
>
> ;; These can only form part of a safe binding set if no fluid is
> ;; directly exposed to the sandbox.
> (define *mutating-fluid-bindings*
> '(((guile)
> fluid-set!
> fluid-unset!
> fluid->parameter)))
>
> (define *char-bindings*
> '(((guile)
> char-alphabetic?
> char-ci<=?
> char-ci<?
> char-ci=?
> char-ci>=?
> char-ci>?
> char-downcase
> char-general-category
> char-is-both?
> char-lower-case?
> char-numeric?
> char-titlecase
> char-upcase
> char-upper-case?
> char-whitespace?
> char<=?
> char<?
> char=?
> char>=?
> char>?
> char?
> char->integer
> integer->char)))
>
> (define *list-bindings*
> '(((guile)
> list
> list-cdr-ref
> list-copy
> list-head
> list-index
> list-ref
> list-tail
> list?
> null?
> make-list
> append
> delete
> delq
> delv
> filter
> length
> member
> memq
> memv
> merge
> reverse)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-list-bindings*
> '(((guile)
> list-cdr-set!
> list-set!
> append!
> delete!
> delete1!
> delq!
> delq1!
> delv!
> delv1!
> filter!
> merge!
> reverse!)))
>
> (define *pair-bindings*
> '(((guile)
> last-pair
> pair?
> caaaar
> caaadr
> caaar
> caadar
> caaddr
> caadr
> caar
> cadaar
> cadadr
> cadar
> caddar
> cadddr
> caddr
> cadr
> car
> cdaaar
> cdaadr
> cdaar
> cdadar
> cdaddr
> cdadr
> cdar
> cddaar
> cddadr
> cddar
> cdddar
> cddddr
> cdddr
> cddr
> cdr
> cons
> cons*)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-pair-bindings*
> '(((guile)
> set-car!
> set-cdr!)))
>
> (define *vector-bindings*
> '(((guile)
> list->vector
> make-vector
> vector
> vector->list
> vector-copy
> vector-length
> vector-ref
> vector?)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; vector is exposed to the sandbox.
> (define *mutating-vector-bindings*
> '(((guile)
> vector-fill!
> vector-move-left!
> vector-move-right!
> vector-set!)))
>
> (define *promise-bindings*
> '(((guile)
> force
> delay
> make-promise
> promise?)))
>
> (define *srfi-4-bindings*
> '(((srfi srfi-4)
> f32vector
> f32vector->list
> f32vector-length
> f32vector-ref
> f32vector?
> f64vector
> f64vector->list
> f64vector-length
> f64vector-ref
> f64vector?
> list->f32vector
> list->f64vector
> list->s16vector
> list->s32vector
> list->s64vector
> list->s8vector
> list->u16vector
> list->u32vector
> list->u64vector
> list->u8vector
> make-f32vector
> make-f64vector
> make-s16vector
> make-s32vector
> make-s64vector
> make-s8vector
> make-u16vector
> make-u32vector
> make-u64vector
> make-u8vector
> s16vector
> s16vector->list
> s16vector-length
> s16vector-ref
> s16vector?
> s32vector
> s32vector->list
> s32vector-length
> s32vector-ref
> s32vector?
> s64vector
> s64vector->list
> s64vector-length
> s64vector-ref
> s64vector?
> s8vector
> s8vector->list
> s8vector-length
> s8vector-ref
> s8vector?
> u16vector
> u16vector->list
> u16vector-length
> u16vector-ref
> u16vector?
> u32vector
> u32vector->list
> u32vector-length
> u32vector-ref
> u32vector?
> u64vector
> u64vector->list
> u64vector-length
> u64vector-ref
> u64vector?
> u8vector
> u8vector->list
> u8vector-length
> u8vector-ref
> u8vector?)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; bytevector is exposed to the sandbox.
> (define *mutating-srfi-4-bindings*
> '(((srfi srfi-4)
> f32vector-set!
> f64vector-set!
> s16vector-set!
> s32vector-set!
> s64vector-set!
> s8vector-set!
> u16vector-set!
> u32vector-set!
> u64vector-set!
> u8vector-set!)))
>
> (define *all-pure-bindings*
> (append *alist-bindings*
> *array-bindings*
> *bit-bindings*
> *bitvector-bindings*
> *char-bindings*
> *char-set-bindings*
> *clock-bindings*
> *core-bindings*
> *error-bindings*
> *fluid-bindings*
> *hash-bindings*
> *iteration-bindings*
> *keyword-bindings*
> *list-bindings*
> *macro-bindings*
> *nil-bindings*
> *number-bindings*
> *pair-bindings*
> *predicate-bindings*
> *procedure-bindings*
> *promise-bindings*
> *prompt-bindings*
> *regexp-bindings*
> *sort-bindings*
> *srfi-4-bindings*
> *string-bindings*
> *symbol-bindings*
> *unspecified-bindings*
> *variable-bindings*
> *vector-bindings*
> *version-bindings*))
>
>
> (define *all-pure-and-impure-bindings*
> (append *all-pure-bindings*
> *mutating-alist-bindings*
> *mutating-array-bindings*
> *mutating-bitvector-bindings*
> *mutating-fluid-bindings*
> *mutating-hash-bindings*
> *mutating-list-bindings*
> *mutating-pair-bindings*
> *mutating-sort-bindings*
> *mutating-srfi-4-bindings*
> *mutating-string-bindings*
> *mutating-variable-bindings*
> *mutating-vector-bindings*))
>
* Re: RFC: (ice-9 sandbox)
2017-04-03 15:35 ` Ludovic Courtès
@ 2017-04-14 10:52 ` Andy Wingo
2017-04-14 12:17 ` tomas
2017-04-14 12:32 ` Ludovic Courtès
0 siblings, 2 replies; 17+ messages in thread
From: Andy Wingo @ 2017-04-14 10:52 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
On Mon 03 Apr 2017 17:35, ludo@gnu.org (Ludovic Courtès) writes:
> Riastradh’s document at <http://mumble.net/~campbell/scheme/style.txt>
> has this:
>
> Affix asterisks to the beginning and end of a globally mutable
> variable. This allows the reader of the program to recognize very
> easily that it is badly written!
>
> … but it doesn’t say anything about constants nor about %.
>
> It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
> ‘%all-pure-bindings’. So, dunno, as you see fit!
I feel like I would have less of a need for name sigils like *earmuffs*
or %preficentiles if we had more reliably immutable data.
Right now one of the functions of these sigils is to tell the reader,
"Don't use append! on this data structure or you will cause spooky
action-at-a-distance!"
It sure would be nice to be able to use these values without worries of
this kind. We don't have this immutability problem with strings because
our compiled string literals are marked as immutable, and string
mutators assert that the strings are mutable. We should do the same for
all literal constants.
We currently can't add an immutable bit to pairs due to our tagging
scheme -- pairs are just two words. But we can do this easily with
other data types: vectors, arrays, bytevectors, etc. (If we want to do
this, anyway.)
However it is possible to do a more expensive check to see if a pair
is embedded in an ELF image (or the converse, that it is allocated on
the GC heap). I just looked in Guile and there are only a few dozen
instances of set-car! in Guile's source and a bit more of set-cdr!, so
it's conceivable to think of this being a check that we can make.
If we are able to do this, we can avoid the whole discussion about
SIGSEGV handlers.
It would be nice of course to be able to cons an immutable pair on the
heap -- so a simple GC_is_heap_ptr(x) check wouldn't suffice to prove
immutability. Not sure quite what the right solution would be there.
FWIW, Racket uses four words for pairs: the type tag, the hash code, and
the two fields. Four words is I think the logical progression after 2
given GC's object size granularity. It would be nice to avoid having
the extra words, but if we ever switched to a moving GC we would need
space for a hash code I think.
Thoughts on the plan for immutable literals?
Concretely for this use case, assuming that we can solve the immutable
literal problem, I propose to remove sigils entirely. Thoughts welcome
here.
Andy
* Re: RFC: (ice-9 sandbox)
2017-04-06 21:41 ` Freja Nordsiek
@ 2017-04-14 10:58 ` Andy Wingo
0 siblings, 0 replies; 17+ messages in thread
From: Andy Wingo @ 2017-04-14 10:58 UTC (permalink / raw)
To: Freja Nordsiek; +Cc: guile-devel
On Thu 06 Apr 2017 23:41, Freja Nordsiek <fnordsie@gmail.com> writes:
> On the subject of ports and i/o, I have a few ideas. R6RS i/o in the
> (rnrs io ports) module generally requires the port to be explicitly
> given, rather than assuming current in or out if not given (though
> rnrs io simple does make those assumptions). For many, it would be
> impossible because they put the port as the first argument and a
> required second argument afterwards. Looking at module/io/ports.scm in
> Guile 2.2.x, it looks like the reading and writing procedures there
> should be safe. Obviously, nothing that opens a file should be used,
> nor the procedures to get current input, output, and error; but the
> rest can be used. And this includes string and bytevector ports, which
> could be very useful in the sandbox (I don't know about anyone else,
> but I use string ports all the time).
>
> One question, is there a particular reason that guard is not exported?
> It doesn't seem like it is as nasty as dynamic-wind with trying to
> terminate, though maybe I am just not seeing how it could be used to
> prevent the sandbox terminating the process. Having at least one
> exception handling binding might be very helpful in a sandbox.
These questions are related. There is nothing unsafe about "guard"
specifically. Indeed the sandbox environment has "catch" and similar
things. "guard" isn't in this default set because currently the set of
bindings that (ice-9 sandbox) offers in *all-pure-and-impure-bindings*
is a subset of the bindings that are available by default. "guard" has to
be imported via srfi-34. Likewise for r6rs port procedures. I think
it's reasonable to have this limitation -- otherwise there's no point at
which to stop. Other binding sets are of course possible.
I would of course like I/O in the sandbox :) We could have versions of
"display" et al that require their port argument; that would be
consistent with the strict-subset criterion.
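As a minimal sketch of what such a port-requiring variant might look like
(the name display-to-port is hypothetical and not part of (ice-9 sandbox);
the string port here is created by the caller, outside any sandbox):

```scheme
;; Hypothetical helper: like `display', but the port argument is
;; required, so it never falls back to the current output port.
(define (display-to-port obj port)
  (display obj port))

;; Usage with a string port supplied by the caller.
(define greeting
  (call-with-output-string
    (lambda (port)
      (display-to-port "hello" port)
      (display-to-port 42 port))))
```

Here greeting collects everything written to the string port, and no
ambient port is consulted at any point.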
Andy
* Re: RFC: (ice-9 sandbox)
2017-04-14 10:52 ` Andy Wingo
@ 2017-04-14 12:17 ` tomas
2017-04-14 12:32 ` Ludovic Courtès
1 sibling, 0 replies; 17+ messages in thread
From: tomas @ 2017-04-14 12:17 UTC (permalink / raw)
To: guile-devel
On Fri, Apr 14, 2017 at 12:52:19PM +0200, Andy Wingo wrote:
[...]
> Concretely for this use case, assuming that we can solve the immutable
> literal problem, I propose to remove sigils entirely. Thoughts welcome
> here.
There's still the "cultural value" of such sigils, which eases the
communication between humans. That'll depend on what other Schemes
do, and how current pedagogical literature is set up. Readability
and all that. Cultures are bound to change, though.
Of course, really marking things as immutable (the "technical" bit)
is still very cool.
regards
-- tomás
* Re: RFC: (ice-9 sandbox)
2017-04-14 10:52 ` Andy Wingo
2017-04-14 12:17 ` tomas
@ 2017-04-14 12:32 ` Ludovic Courtès
1 sibling, 0 replies; 17+ messages in thread
From: Ludovic Courtès @ 2017-04-14 12:32 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Hi!
Andy Wingo <wingo@pobox.com> writes:
> On Mon 03 Apr 2017 17:35, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Riastradh’s document at <http://mumble.net/~campbell/scheme/style.txt>
>> has this:
>>
>> Affix asterisks to the beginning and end of a globally mutable
>> variable. This allows the reader of the program to recognize very
>> easily that it is badly written!
>>
>> … but it doesn’t say anything about constants nor about %.
>>
>> It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
>> ‘%all-pure-bindings’. So, dunno, as you see fit!
>
> I feel like I would have less of a need for name sigils like *earmuffs*
> or %preficentiles if we had more reliably immutable data.
[...]
> However it is possible to do a more expensive check to see if a pair
> is embedded in an ELF image (or the converse, that it is allocated on
> the GC heap). I just looked in Guile and there are only a few dozen
> instances of set-car! in Guile's source and a bit more of set-cdr!, so
> it's conceivable to think of this being a check that we can make.
>
> If we are able to do this, we can avoid the whole discussion about
> SIGSEGV handlers.
>
> It would be nice of course to be able to cons an immutable pair on the
> heap -- so a simple GC_is_heap_ptr(x) check wouldn't suffice to prove
> immutability. Not sure quite what the right solution would be there.
>
> FWIW, Racket uses four words for pairs: the type tag, the hash code, and
> the two fields. Four words is I think the logical progression after 2
> given GC's object size granularity. It would be nice to avoid having
> the extra words, but if we ever switched to a moving GC we would need
> space for a hash code I think.
>
> Thoughts on the plan for immutable literals?
My feeling is that using GC_is_heap_ptr or similar would be nicer than
adding bits to the type tags, because we’d need to add this read-only
bit for every type, and we could have bugs where we forget to check them
in some cases.
GC_is_heap_ptr is probably enough until we support immutable objects
allocated on the heap.
> Concretely for this use case, assuming that we can solve the immutable
> literal problem, I propose to remove sigils entirely. Thoughts welcome
> here.
In practice I guess the funny characters will stay for a while. :-)
But I agree that it’d be nice to have a generic way to represent
immutable objects.
Ludo’.
* Re: RFC: (ice-9 sandbox)
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
` (3 preceding siblings ...)
2017-04-06 21:41 ` Freja Nordsiek
@ 2017-04-15 17:23 ` Nala Ginrut
2017-04-17 8:07 ` Andy Wingo
2017-04-18 19:48 ` Andy Wingo
5 siblings, 1 reply; 17+ messages in thread
From: Nala Ginrut @ 2017-04-15 17:23 UTC (permalink / raw)
To: Andy Wingo, guile-devel
Hi Andy!
It's pretty cool!
Could you please add a #:from keyword to eval-in-sandbox to indicate the
language front-end? Don't forget there's a multi-lang plan. :-)
Best regards.
On Fri, Mar 31, 2017 at 17:28, Andy Wingo <wingo@pobox.com> wrote:
> Hi,
>
> Attached is a module that can evaluate an expression within a sandbox.
> If the evaluation takes too long or allocates too much, it will be
> cancelled. The evaluation will take place with respect to a module with
> a "safe" set of imports. Those imports include most of the bindings
> available in a default Guile environment. See the file below for full
> details and a number of caveats.
>
> Any thoughts? I would like something like this for a web service that
> has to evaluate untrusted code.
>
> Andy
>
>
* Re: RFC: (ice-9 sandbox)
2017-04-15 17:23 ` Nala Ginrut
@ 2017-04-17 8:07 ` Andy Wingo
2017-04-17 9:12 ` Nala Ginrut
0 siblings, 1 reply; 17+ messages in thread
From: Andy Wingo @ 2017-04-17 8:07 UTC (permalink / raw)
To: Nala Ginrut; +Cc: guile-devel
On Sat 15 Apr 2017 19:23, Nala Ginrut <nalaginrut@gmail.com> writes:
> Could you please add #:from keyword to eval-in-sandbox to indicate
> the language front-end? Don't forget there's multi-lang plan. :-)
In theory yes, but I don't know how to make safe sandboxes in other
languages. ice-9 sandbox relies on the Scheme characteristic that the
only capabilities granted to a program are those that are in scope.
Other languages often have ambient capabilities -- like Bash for example
where there's no way to not provide the pipe ("|") operator. I think
adding other languages should be an exercise for the reader :)
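To illustrate the "only what is in scope" property, here is a small
sketch, assuming the eval-in-sandbox interface and #:bindings format
described in this thread (the file name is just an arbitrary example):

```scheme
(use-modules (ice-9 sandbox))

;; Only `+' from (guile) is in scope inside this sandbox.
(define sum
  (eval-in-sandbox '(+ 1 2)
                   #:bindings '(((guile) +))))

;; `open-input-file' was never imported, so the sandboxed code has no
;; way to name it; evaluation should fail instead of touching the file.
(define outcome
  (catch #t
    (lambda ()
      (eval-in-sandbox '(open-input-file "/etc/passwd")
                       #:bindings '(((guile) +))))
    (lambda (key . args) key)))
```

The second evaluation is expected to signal an unbound-variable error,
which the catch handler turns into the exception key. A language with
ambient capabilities has no equivalent way to withhold them.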
Andy
* Re: RFC: (ice-9 sandbox)
2017-04-17 8:07 ` Andy Wingo
@ 2017-04-17 9:12 ` Nala Ginrut
0 siblings, 0 replies; 17+ messages in thread
From: Nala Ginrut @ 2017-04-17 9:12 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Hmm... I hadn't thought about this security issue. Even if we did some
verification at the IR level (say, CPS or lower), it would be insufficient
to avoid security issues, since a front-end implementation may use
cross-module functions to mimic primitives for other languages.
Now I think a front-end writer may have to write their own sandbox with
(ice-9 sandbox) if necessary. :-)
Best regards.
On Apr 17, 2017 at 16:07, "Andy Wingo" <wingo@pobox.com> wrote:
> On Sat 15 Apr 2017 19:23, Nala Ginrut <nalaginrut@gmail.com> writes:
>
> > Could you please add #:from keyword to eval-in-sandbox to indicate
> > the language front-end? Don't forget there's multi-lang plan. :-)
>
> In theory yes, but I don't know how to make safe sandboxes in other
> languages. ice-9 sandbox relies on the Scheme characteristic that the
> only capabilities granted to a program are those that are in scope.
> Other languages often have ambient capabilities -- like Bash for example
> where there's no way to not provide the pipe ("|") operator. I think
> adding other languages should be an exercise for the reader :)
>
> Andy
>
* Re: RFC: (ice-9 sandbox)
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
` (4 preceding siblings ...)
2017-04-15 17:23 ` Nala Ginrut
@ 2017-04-18 19:48 ` Andy Wingo
5 siblings, 0 replies; 17+ messages in thread
From: Andy Wingo @ 2017-04-18 19:48 UTC (permalink / raw)
To: guile-devel
On Fri 31 Mar 2017 11:27, Andy Wingo <wingo@pobox.com> writes:
> Attached is a module that can evaluate an expression within a sandbox.
Pushed to master. See NEWS here, where I include a couple more entries
of note:
* Notable changes
** New sandboxed evaluation facility
Guile now has a way to execute untrusted code in a safe way. See
"Sandboxed Evaluation" in the manual for full details, including some
important notes on limitations on the sandbox's ability to prevent
resource exhaustion.
** All literal constants are read-only
According to the Scheme language definition, it is an error to attempt
to mutate a "constant literal". A constant literal is data that is a
literal quoted part of a program. For example, all of these are errors:
(set-car! '(1 . 2) 42)
(append! '(1 2 3) '(4 5 6))
(vector-set! '#(a b c) 1 'B)
Guile takes advantage of this provision of Scheme to deduplicate shared
structure in constant literals within a compilation unit, and to
allocate constant data directly in the compiled object file. If the
data needs no relocation at run-time, as is the case for pairs or
vectors that only contain immediate values, then the data can actually
be shared between different Guile processes, using the operating
system's virtual memory facilities.
However, in Guile 2.2.0, constants that needed relocation were actually
mutable -- though (vector-set! '#(a b c) 1 'B) was an error, Guile
wouldn't actually cause an exception to be raised, silently allowing the
mutation. This could affect future users of this constant, or indeed of
any constant in the compilation unit that shared structure with the
original vector.
Additionally, attempting to mutate constant literals mapped in the
read-only section of files would actually cause a segmentation fault, as
the operating system prohibits writes to read-only memory. "Don't do
that" isn't a very nice solution :)
Both of these problems have been fixed. Any attempt to mutate a
constant literal will now raise an exception, whether the constant needs
relocation or not.
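Code that relied on mutating a literal can usually be adapted by
building a fresh object, or copying the literal, before mutating it. A
minimal sketch (the variable names are illustrative only):

```scheme
;; Copy the quoted list first; the copy's pairs are freshly allocated
;; and may be mutated.
(define xs (list-copy '(1 2 3)))
(set-car! xs 42)

;; Likewise, construct vectors with `vector' rather than mutating a
;; '#(...) literal.
(define v (vector 'a 'b 'c))
(vector-set! v 1 'B)
```

Note that list-copy only copies the spine of the list; nested literal
structure would need a deeper copy before being mutated.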
** Syntax objects are now a distinct type
It used to be that syntax objects were represented as a tagged vector.
These values could be forged by users to break scoping abstractions,
preventing the implementation of sandboxing facilities in Guile. We are
as embarrassed about the previous situation as we are pleased about the
fact that we've fixed it.
Unfortunately, during the 2.2 stable series (or at least during part of
it), we need to support files compiled with Guile 2.2.0. These files
may contain macros that contain legacy syntax object constants. See the
discussion of "allow-legacy-syntax-objects?" in "Syntax Transformer
Helpers" in the manual for full details.
And the documentation formatted as text is below. I guess a 2.2.1 is
coming soon. Thanks all for the review!
Andy
1.12 Sandboxed Evaluation
-------------------------
Sometimes you would like to evaluate code that comes from an untrusted
party. The safest way to do this is to buy a new computer, evaluate the
code on that computer, then throw the machine away. However if you are
unwilling to take this simple approach, Guile does include a limited
"sandbox" facility that can allow untrusted code to be evaluated with
some confidence.
To use the sandboxed evaluator, load its module:
(use-modules (ice-9 sandbox))
Guile's sandboxing facility starts with the ability to restrict the
time and space used by a piece of code.
-- Scheme Procedure: call-with-time-limit limit thunk limit-reached
Call THUNK, but cancel it if LIMIT seconds of wall-clock time have
elapsed. If the computation is cancelled, call LIMIT-REACHED in
tail position. THUNK must not disable interrupts or prevent an
abort via a 'dynamic-wind' unwind handler.
-- Scheme Procedure: call-with-allocation-limit limit thunk
limit-reached
Call THUNK, but cancel it if LIMIT bytes have been allocated. If
the computation is cancelled, call LIMIT-REACHED in tail position.
THUNK must not disable interrupts or prevent an abort via a
'dynamic-wind' unwind handler.
This limit applies to both stack and heap allocation. The
computation will not be aborted before LIMIT bytes have been
allocated, but for the heap allocation limit, the check may be
postponed until the next garbage collection.
Note that as a current shortcoming, the heap size limit applies to
all threads; concurrent allocation by other unrelated threads
counts towards the allocation limit.
-- Scheme Procedure: call-with-time-and-allocation-limits time-limit
allocation-limit thunk
Invoke THUNK in a dynamic extent in which its execution is limited
to TIME-LIMIT seconds of wall-clock time, and its allocation to
ALLOCATION-LIMIT bytes. THUNK must not disable interrupts or
prevent an abort via a 'dynamic-wind' unwind handler.
If successful, return all values produced by invoking THUNK. Any
uncaught exception thrown by the thunk will propagate out. If the
time or allocation limit is exceeded, an exception will be thrown
to the 'limit-exceeded' key.
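As a rough usage sketch of the procedures above (the limits chosen and
the symbols returned by the handlers are arbitrary illustrations):

```scheme
(use-modules (ice-9 sandbox))

;; An infinite loop, cancelled after about 0.1 seconds of wall-clock
;; time; the LIMIT-REACHED thunk supplies the return value.
(define timed
  (call-with-time-limit 0.1
    (lambda () (let loop () (loop)))
    (lambda () 'time-limit-reached)))

;; A loop that conses forever.  Whichever limit trips first, the
;; combined procedure throws to the 'limit-exceeded key.
(define limited
  (catch 'limit-exceeded
    (lambda ()
      (call-with-time-and-allocation-limits 0.1 #e10e6
        (lambda ()
          (let loop ((acc '()))
            (loop (cons 1 acc))))))
    (lambda (key . args) 'limit-exceeded)))
```

Both thunks loop via tail calls, so they neither disable interrupts nor
install dynamic-wind handlers, as the procedures require.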
The time limit and stack limit are both very precise, but the heap
limit only gets checked asynchronously, after a garbage collection. In
particular, if the heap is already very large, the number of allocated
bytes between garbage collections will be large, and therefore the
precision of the check is reduced.
Additionally, due to the mechanism used by the allocation limit (the
'after-gc-hook'), large single allocations like '(make-vector #e1e7)'
are only detected after the allocation completes, even if the allocation
itself causes garbage collection. It's possible therefore for user code
to not only exceed the allocation limit set, but also to exhaust all
available memory, causing out-of-memory conditions at any allocation
site. Failure to allocate memory in Guile itself should be safe and
cause an exception to be thrown, but most systems are not designed to
handle 'malloc' failures. An allocation failure may therefore exercise
unexpected code paths in your system, so it is a weakness of the sandbox
(and therefore an interesting point of attack).
The main sandbox interface is 'eval-in-sandbox'.
-- Scheme Procedure: eval-in-sandbox exp [#:time-limit 0.1]
[#:allocation-limit #e10e6] [#:bindings all-pure-bindings]
[#:module (make-sandbox-module bindings)] [#:sever-module? #t]
Evaluate the Scheme expression EXP within an isolated "sandbox".
Limit its execution to TIME-LIMIT seconds of wall-clock time, and
limit its allocation to ALLOCATION-LIMIT bytes.
The evaluation will occur in MODULE, which defaults to the result
of calling 'make-sandbox-module' on BINDINGS, which itself defaults
to 'all-pure-bindings'. This is the core of the sandbox: creating
a scope for the expression that is "safe".
A safe sandbox module has two characteristics. Firstly, it will
not allow the expression being evaluated to avoid being cancelled
due to time or allocation limits. This ensures that the expression
terminates in a timely fashion.
Secondly, a safe sandbox module will prevent the evaluation from
receiving information from previous evaluations, or from affecting
future evaluations. All combinations of binding sets exported by
'(ice-9 sandbox)' form safe sandbox modules.
The BINDINGS should be given as a list of import sets. One import
set is a list whose car names an interface, like '(ice-9 q)', and
whose cdr is a list of imports. An import is either a bare symbol
or a pair of '(OUT . IN)', where OUT and IN are both symbols and
denote the name under which a binding is exported from the module,
and the name under which to make the binding available,
respectively. Note that BINDINGS is only used as an input to the
default initializer for the MODULE argument; if you pass
'#:module', BINDINGS is unused. If SEVER-MODULE? is true (the
default), the module will be unlinked from the global module tree
after the evaluation returns, to allow MODULE to be garbage-collected.
If successful, return all values produced by EXP. Any uncaught
exception thrown by the expression will propagate out. If the time
or allocation limit is exceeded, an exception will be thrown to the
'limit-exceeded' key.
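
As a short usage sketch, assuming the keyword arguments and defaults listed above, a sandboxed evaluation and a cancelled one might look like this:

```scheme
(use-modules (ice-9 sandbox))

;; Evaluate with the default limits and the default pure bindings.
(define sum (eval-in-sandbox '(+ 1 2)))

;; A runaway loop under a tighter time limit is cancelled and reported
;; through a throw to 'limit-exceeded, which the caller can catch.
(define runaway
  (catch 'limit-exceeded
    (lambda ()
      (eval-in-sandbox '(let lp () (lp)) #:time-limit 0.05))
    (lambda (key . args) key)))
```

Note that the expression is passed as quoted data; it is evaluated inside the sandbox module, not in the caller's module.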
Constructing a safe sandbox module is tricky in general. Guile
defines an easy way to construct safe modules from predefined sets of
bindings. Before getting to that interface, here are some general notes
on safety.
1. The time and allocation limits rely on the ability to interrupt and
cancel a computation. For this reason, no binding included in a
sandbox module should be able to indefinitely postpone interrupt
handling, nor should a binding be able to prevent an abort. In
practice this second consideration means that 'dynamic-wind' should
not be included in any binding set.
2. The time and allocation limits apply only to the 'eval-in-sandbox'
call. If the call returns a procedure which is later called, no
limit is "automatically" in place. Users of 'eval-in-sandbox' have
to be very careful to reimpose limits when calling procedures that
escape from sandboxes.
3. Similarly, the dynamic environment of the 'eval-in-sandbox' call is
not necessarily in place when any procedure that escapes from the
sandbox is later called.
This detail prevents us from exposing 'primitive-eval' to the
sandbox, for two reasons. The first is that it's possible for
legacy code to forge references to any binding, if the
'allow-legacy-syntax-objects?' parameter is true. The default for
this parameter is true; *note Syntax Transformer Helpers:: for the
details. The parameter is bound to '#f' for the duration of the
'eval-in-sandbox' call itself, but that will not be in place during
calls to escaped procedures.
The second reason we don't expose 'primitive-eval' is that
'primitive-eval' implicitly works in the current module, which for
an escaped procedure will probably be different from the module
that is current for the 'eval-in-sandbox' call itself.
The common denominator here is that if an interface exposed to the
sandbox relies on dynamic environments, it is easy to mistakenly
grant the sandboxed procedure additional capabilities in the form
of bindings that it should not have access to. For this reason,
the default sets of predefined bindings do not depend on any
dynamically scoped value.
4. Mutation may allow a sandboxed evaluation to break some invariant
in users of data supplied to it. A lot of code culturally doesn't
expect mutation, but if you hand mutable data to a sandboxed
evaluation and you also grant mutating capabilities to that
evaluation, then the sandboxed code may indeed mutate that data.
The default set of bindings for the sandbox does not include any
mutating primitives.
Relatedly, 'set!' may allow a sandbox to mutate a primitive,
invalidating many system-wide invariants. Guile is currently quite
permissive when it comes to imported bindings and mutability.
Although 'set!' to a module-local or lexically bound variable would
be fine, we don't currently have an easy way to disallow 'set!' to
an imported binding, so currently no binding set includes 'set!'.
5. Mutation may allow a sandboxed evaluation to keep state, or make a
communication mechanism with other code. On the one hand this
sounds cool, but on the other hand maybe this is part of your
threat model. Again, the default set of bindings doesn't include
mutating primitives, preventing sandboxed evaluations from keeping
state.
6. The sandbox should probably not be able to open a network
connection, or write to a file, or open a file from disk. The
default binding set includes no interaction with the operating
system.
If you, dear reader, find the above discussion interesting, you will
enjoy Jonathan Rees' dissertation, "A Security Kernel Based on the
Lambda Calculus".
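
To illustrate note 2 above, here is one way to reimpose limits on a procedure that has escaped from the sandbox. It is a sketch only: 'call-escaped' is a hypothetical helper name, and the limits chosen simply mirror the 'eval-in-sandbox' defaults.

```scheme
(use-modules (ice-9 sandbox))

;; The sandboxed expression evaluates to a procedure, which escapes.
(define square (eval-in-sandbox '(lambda (x) (* x x))))

;; Calling SQUARE directly would run it with no limits in place, so
;; wrap each call in fresh time and allocation limits.
(define (call-escaped proc . args)
  (call-with-time-and-allocation-limits
   0.1 #e10e6
   (lambda () (apply proc args))))

(define result (call-escaped square 12))
```

This only restores the limits. As note 3 explains, the rest of the dynamic environment of the original 'eval-in-sandbox' call is not restored by such a wrapper.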
-- Scheme Variable: all-pure-bindings
All "pure" bindings that together form a safe subset of those
bindings available by default to Guile user code.
-- Scheme Variable: all-pure-and-impure-bindings
Like 'all-pure-bindings', but additionally including mutating
primitives like 'vector-set!'. This set is still safe in the sense
mentioned above, with the caveats about mutation.
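
The difference between the two composite sets can be seen with a mutating primitive. The sketch below assumes that 'vector' belongs to the pure set and 'vector-set!' only to the impure one, as the component names suggest.

```scheme
(use-modules (ice-9 sandbox))

(define expr
  '(let ((v (vector 1 2 3)))
     (vector-set! v 0 'changed)
     v))

;; With the impure set, 'vector-set!' is bound and the mutation works.
(define mutated
  (eval-in-sandbox expr #:bindings all-pure-and-impure-bindings))

;; With the default pure set, 'vector-set!' is not bound, so the
;; evaluation signals an error; here any error is mapped to #f.
(define rejected
  (catch #t
    (lambda () (eval-in-sandbox expr))
    (lambda (key . args) #f)))
```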
The components of these composite sets are as follows:
-- Scheme Variable: alist-bindings
-- Scheme Variable: array-bindings
-- Scheme Variable: bit-bindings
-- Scheme Variable: bitvector-bindings
-- Scheme Variable: char-bindings
-- Scheme Variable: char-set-bindings
-- Scheme Variable: clock-bindings
-- Scheme Variable: core-bindings
-- Scheme Variable: error-bindings
-- Scheme Variable: fluid-bindings
-- Scheme Variable: hash-bindings
-- Scheme Variable: iteration-bindings
-- Scheme Variable: keyword-bindings
-- Scheme Variable: list-bindings
-- Scheme Variable: macro-bindings
-- Scheme Variable: nil-bindings
-- Scheme Variable: number-bindings
-- Scheme Variable: pair-bindings
-- Scheme Variable: predicate-bindings
-- Scheme Variable: procedure-bindings
-- Scheme Variable: promise-bindings
-- Scheme Variable: prompt-bindings
-- Scheme Variable: regexp-bindings
-- Scheme Variable: sort-bindings
-- Scheme Variable: srfi-4-bindings
-- Scheme Variable: string-bindings
-- Scheme Variable: symbol-bindings
-- Scheme Variable: unspecified-bindings
-- Scheme Variable: variable-bindings
-- Scheme Variable: vector-bindings
-- Scheme Variable: version-bindings
The components of 'all-pure-bindings'.
-- Scheme Variable: mutating-alist-bindings
-- Scheme Variable: mutating-array-bindings
-- Scheme Variable: mutating-bitvector-bindings
-- Scheme Variable: mutating-fluid-bindings
-- Scheme Variable: mutating-hash-bindings
-- Scheme Variable: mutating-list-bindings
-- Scheme Variable: mutating-pair-bindings
-- Scheme Variable: mutating-sort-bindings
-- Scheme Variable: mutating-srfi-4-bindings
-- Scheme Variable: mutating-string-bindings
-- Scheme Variable: mutating-variable-bindings
-- Scheme Variable: mutating-vector-bindings
The additional components of 'all-pure-and-impure-bindings'.
Finally, what do you do with a binding set? What is a binding set
anyway? 'make-sandbox-module' is here for you.
-- Scheme Procedure: make-sandbox-module bindings
Return a fresh module that only contains BINDINGS.
The BINDINGS should be given as a list of import sets. One import
set is a list whose car names an interface, like '(ice-9 q)', and
whose cdr is a list of imports. An import is either a bare symbol
or a pair of '(OUT . IN)', where OUT and IN are both symbols and
denote the name under which a binding is exported from the module,
and the name under which to make the binding available,
respectively.
So you see that binding sets are just lists, and
'all-pure-and-impure-bindings' is really just the result of appending
all of the component binding sets.
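
Since binding sets are plain lists, a custom sandbox can be put together with 'append'. The following sketch assumes the component names documented above; 'my-bindings', 'my-module' and the alias 'add' are hypothetical names chosen for illustration.

```scheme
(use-modules (ice-9 sandbox))

;; A small binding set: core syntax, numbers, and '+' from (guile)
;; additionally made available under the name 'add'.
(define my-bindings
  (append core-bindings
          number-bindings
          '(((guile) (+ . add)))))

(define my-module (make-sandbox-module my-bindings))

;; Pass the module explicitly and keep it linked, so that it could be
;; reused for later evaluations.
(define three
  (eval-in-sandbox '(add 1 2)
                   #:module my-module
                   #:sever-module? #f))
```

Reusing one module across evaluations is a design choice: it avoids rebuilding the module each time, but if the binding set grants any way to keep state, later evaluations can observe earlier ones.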
Thread overview: 17+ messages
2017-03-31 9:27 RFC: (ice-9 sandbox) Andy Wingo
2017-03-31 11:33 ` Ludovic Courtès
2017-03-31 16:26 ` Andy Wingo
2017-03-31 21:41 ` Ludovic Courtès
2017-04-02 10:18 ` Andy Wingo
2017-04-03 15:35 ` Ludovic Courtès
2017-04-14 10:52 ` Andy Wingo
2017-04-14 12:17 ` tomas
2017-04-14 12:32 ` Ludovic Courtès
2017-03-31 14:41 ` Mike Gran
2017-04-01 14:33 ` Christopher Allan Webber
2017-04-06 21:41 ` Freja Nordsiek
2017-04-14 10:58 ` Andy Wingo
2017-04-15 17:23 ` Nala Ginrut
2017-04-17 8:07 ` Andy Wingo
2017-04-17 9:12 ` Nala Ginrut
2017-04-18 19:48 ` Andy Wingo