* RFC: (ice-9 sandbox)
@ 2017-03-31  9:27 Andy Wingo
From: Andy Wingo @ 2017-03-31  9:27 UTC
  To: guile-devel

Hi,

Attached is a module that can evaluate an expression within a sandbox.
If the evaluation takes too long or allocates too much, it will be
cancelled.  The evaluation will take place with respect to a module with
a "safe" set of imports.  Those imports include most of the bindings
available in a default Guile environment.  See the file below for full
details and a number of caveats.

Any thoughts?  I would like something like this for a web service that
has to evaluate untrusted code.

Andy


[-- Attachment #2: sandbox.scm --]
[-- Type: text/plain, Size: 36124 bytes --]

;;; Sandboxed evaluation of Scheme code

;;; Copyright (C) 2017 Free Software Foundation, Inc.

;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
;;;; License as published by the Free Software Foundation; either
;;;; version 3 of the License, or (at your option) any later version.
;;;; 
;;;; This library is distributed in the hope that it will be useful,
;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
;;;; Lesser General Public License for more details.
;;;; 
;;;; You should have received a copy of the GNU Lesser General Public
;;;; License along with this library; if not, write to the Free Software
;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

;;; Commentary:
;;; 
;;; Code:

(define-module (ice-9 sandbox)
  #:use-module (ice-9 control)
  #:use-module (ice-9 match)
  #:use-module (system vm vm)
  #:export (call-with-time-limit
            call-with-allocation-limit
            call-with-time-and-allocation-limits

            eval-in-sandbox
            make-sandbox-module

            *alist-bindings*
            *array-bindings*
            *bit-bindings*
            *bitvector-bindings*
            *char-bindings*
            *char-set-bindings*
            *clock-bindings*
            *core-bindings*
            *error-bindings*
            *fluid-bindings*
            *hash-bindings*
            *iteration-bindings*
            *keyword-bindings*
            *list-bindings*
            *macro-bindings*
            *nil-bindings*
            *number-bindings*
            *pair-bindings*
            *predicate-bindings*
            *procedure-bindings*
            *promise-bindings*
            *prompt-bindings*
            *regexp-bindings*
            *sort-bindings*
            *srfi-4-bindings*
            *string-bindings*
            *symbol-bindings*
            *unspecified-bindings*
            *variable-bindings*
            *vector-bindings*
            *version-bindings*

            *mutating-alist-bindings*
            *mutating-array-bindings*
            *mutating-bitvector-bindings*
            *mutating-fluid-bindings*
            *mutating-hash-bindings*
            *mutating-list-bindings*
            *mutating-pair-bindings*
            *mutating-sort-bindings*
            *mutating-srfi-4-bindings*
            *mutating-string-bindings*
            *mutating-variable-bindings*
            *mutating-vector-bindings*

            *all-pure-bindings*
            *all-pure-and-impure-bindings*))


(define (call-with-time-limit limit thunk limit-reached)
  "Call @var{thunk}, but cancel it if @var{limit} seconds of wall-clock
time have elapsed.  If the computation is cancelled, call
@var{limit-reached} in tail position.  @var{thunk} must not disable
interrupts or prevent an abort via a @code{dynamic-wind} unwind
handler."
  ;; FIXME: use separate thread instead of sigalrm.
  (let ((limit-usecs (inexact->exact (round (* limit 1e6))))
        (prev-sigalarm-handler #f)
        (tag (make-prompt-tag)))
    (call-with-prompt tag
      (lambda ()
        (dynamic-wind
          (lambda ()
            (set! prev-sigalarm-handler
              (sigaction SIGALRM (lambda (sig) (abort-to-prompt tag))))
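            ;; Split the limit into whole seconds and leftover
            ;; microseconds: POSIX requires the microseconds field of
            ;; a timer value to be below 1000000.
            (setitimer ITIMER_REAL 0 0
                       (quotient limit-usecs 1000000)
                       (remainder limit-usecs 1000000)))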
          thunk
          (lambda ()
            (setitimer ITIMER_REAL 0 0 0 0)
            (match prev-sigalarm-handler
              ((handler . flags)
               (sigaction SIGALRM handler flags))))))
      (lambda (k)
        (limit-reached)))))
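
;; Usage sketch for call-with-time-limit (an illustration, not part of
;; the module as posted; wrapped in #; so it is not run at load time).
;; The thunk loops forever, so the 0.05-second limit should fire and
;; the call should return the value of the limit-reached thunk, the
;; symbol timed-out.
#;
(call-with-time-limit 0.05
                      (lambda () (let lp () (lp)))
                      (lambda () 'timed-out))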

(define (call-with-allocation-limit limit thunk limit-reached)
  "Call @var{thunk}, but cancel it if @var{limit} bytes have been
allocated.  If the computation is cancelled, call @var{limit-reached} in
tail position.  @var{thunk} must not disable interrupts or prevent an
abort via a @code{dynamic-wind} unwind handler.

This limit applies to both stack and heap allocation.  The computation
will not be aborted before @var{limit} bytes have been allocated, but
for the heap allocation limit, the check may be postponed until the next
garbage collection."
  (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
  (let ((zero (bytes-allocated))
        (tag (make-prompt-tag)))
    (define (check-allocation)
      (when (< limit (- (bytes-allocated) zero))
        (abort-to-prompt tag)))
    (call-with-prompt tag
      (lambda ()
        (dynamic-wind
          (lambda ()
            (add-hook! after-gc-hook check-allocation))
          (lambda ()
            (call-with-stack-overflow-handler
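             ;; The limit is in "words", which used to be 4 or 8 bytes
             ;; but now are always 8 bytes.  Use floor-quotient rather
             ;; than floor/, which returns two values (quotient and
             ;; remainder); only the quotient is wanted here.
             (floor-quotient limit 8)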
             thunk
             (lambda () (abort-to-prompt tag))))
          (lambda ()
            (remove-hook! after-gc-hook check-allocation))))
      (lambda (k)
        (limit-reached)))))
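
;; Usage sketch for call-with-allocation-limit (an illustration, not
;; part of the module as posted; wrapped in #; so it is not run at load
;; time).  The thunk keeps consing vectors onto a list.  Once more than
;; about one million bytes have been allocated and a garbage collection
;; has run, the after-gc-hook check should abort the loop, and the call
;; should return the symbol too-much-allocation.
#;
(call-with-allocation-limit #e1e6
                            (lambda ()
                              (let lp ((acc '()))
                                (lp (cons (make-vector 100 0) acc))))
                            (lambda () 'too-much-allocation))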

(define (call-with-time-and-allocation-limits time-limit allocation-limit
                                              thunk)
  "Invoke @var{thunk} in a dynamic extent in which its execution is
limited to @var{time-limit} seconds of wall-clock time, and its
allocation to @var{allocation-limit} bytes.  @var{thunk} must not
disable interrupts or prevent an abort via a @code{dynamic-wind} unwind
handler.

If successful, return all values produced by invoking @var{thunk}.  Any
uncaught exception thrown by the thunk will propagate out.  If the time
or allocation limit is exceeded, an exception will be thrown to the
@code{limit-exceeded} key."
  
  (call-with-time-limit
   time-limit
   (lambda ()
     (call-with-allocation-limit
      allocation-limit
      thunk
      (lambda ()
        (scm-error 'limit-exceeded "call-with-time-and-allocation-limits"
                   "Allocation limit exceeded" '() #f))))
   (lambda ()
     (scm-error 'limit-exceeded "call-with-time-and-allocation-limits"
                "Time limit exceeded" '() #f))))

(define (sever-module! m)
  "Remove @var{m} from its container module."
  (match (module-name m)
    ((head ... tail)
     (let ((parent (resolve-module head #f)))
       (unless (eq? m (module-ref-submodule parent tail))
         (error "can't sever module?"))
       (hashq-remove! (module-submodules parent) tail)))))

;; bindings := module-binding-list ...
;; module-binding-list := interface-name import ...
;; import := name | (exported-name . imported-name)
;; name := symbol
(define (make-sandbox-module bindings)
  "Return a fresh module that only contains @var{bindings}.

The @var{bindings} should be given as a list of import sets.  One import
set is a list whose car names an interface, like @code{(ice-9 q)}, and
whose cdr is a list of imports.  An import is either a bare symbol or a
pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
both symbols and denote the name under which a binding is exported from
the module, and the name under which to make the binding available,
respectively."
  (let ((m (make-fresh-user-module)))
    (purify-module! m)
    ;; FIXME: We want to have a module that will be collectable by GC.
    ;; Currently in Guile all modules are part of a single tree, and
    ;; once a module is part of that tree it will never be collected.
    ;; So we want to sever the module off from that tree.  However the
    ;; psyntax syntax expander currently needs to be able to look up
    ;; modules by name; being severed from the name tree prevents that
    ;; from happening.  So for now, each evaluation leaks memory :/
    ;; 
    ;; (sever-module! m)
    (module-use-interfaces! m
                            (map (match-lambda
                                   ((mod-name . bindings)
                                    (resolve-interface mod-name
                                                       #:select bindings)))
                                 bindings))
    m))
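
;; Usage sketch for make-sandbox-module (an illustration, not part of
;; the module as posted; wrapped in #; so it is not run at load time).
;; The choice of (srfi srfi-1) and the name srfi-1-fold are assumptions
;; made for the example.  The first import set selects + and list from
;; (guile); the second imports fold under the name srfi-1-fold.  The
;; expression should evaluate to 6.
#;
(let ((m (make-sandbox-module
          '(((guile) + list)
            ((srfi srfi-1) (fold . srfi-1-fold))))))
  (eval '(srfi-1-fold + 0 (list 1 2 3)) m))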

(define* (eval-in-sandbox exp #:key
                          (time-limit 0.1)
                          (allocation-limit #e10e6)
                          (bindings *all-pure-bindings*)
                          (module (make-sandbox-module bindings)))
  "Evaluate the Scheme expression @var{exp} within an isolated
\"sandbox\".  Limit its execution to @var{time-limit} seconds of
wall-clock time, and limit its allocation to @var{allocation-limit}
bytes.

The evaluation will occur in @var{module}, which defaults to the result
of calling @code{make-sandbox-module} on @var{bindings}, which itself
defaults to @code{*all-pure-bindings*}.  This is the core of the
sandbox: creating a scope for the expression that is @dfn{safe}.

A safe sandbox module has two characteristics.  Firstly, it will not
allow the expression being evaluated to avoid being cancelled due to
time or allocation limits.  This ensures that the expression terminates
in a timely fashion.

Secondly, a safe sandbox module will prevent the evaluation from
receiving information from previous evaluations, or from affecting
future evaluations.  All combinations of binding sets exported by
@code{(ice-9 sandbox)} form safe sandbox modules.

The @var{bindings} should be given as a list of import sets.  One import
set is a list whose car names an interface, like @code{(ice-9 q)}, and
whose cdr is a list of imports.  An import is either a bare symbol or a
pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
both symbols and denote the name under which a binding is exported from
the module, and the name under which to make the binding available,
respectively.  Note that @var{bindings} is only used as an input to the
default initializer for the @var{module} argument; if you pass
@code{#:module}, @var{bindings} is unused.

If successful, return all values produced by @var{exp}.  Any uncaught
exception thrown by the expression will propagate out.  If the time or
allocation limit is exceeded, an exception will be thrown to the
@code{limit-exceeded} key."
  (call-with-time-and-allocation-limits
   time-limit allocation-limit
   (lambda ()
     ;; Prevent the expression from forging syntax objects.  See "Syntax
     ;; Transformer Helpers" in the manual.
     (parameterize ((allow-legacy-syntax-objects? #f))
       (eval exp module)))))
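
;; Usage sketch for eval-in-sandbox (an illustration, not part of the
;; module as posted; wrapped in #; so it is not run at load time).  It
;; assumes that *all-pure-bindings*, the default, includes + and let.
;; The first call should return 3.  The second never terminates on its
;; own, so the time limit should cancel it and the handler, which
;; receives the limit-exceeded key, returns the symbol cancelled.
#;
(begin
  (eval-in-sandbox '(+ 1 2))
  (catch 'limit-exceeded
    (lambda ()
      (eval-in-sandbox '(let lp () (lp)) #:time-limit 0.05))
    (lambda (key . args)
      'cancelled)))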


;; An evaluation-sandboxing facility is safe if:
;;
;;  (1) every evaluation will terminate in a timely manner
;;
;;  (2) no evaluation can affect future evaluations
;;
;; For (1), we impose a user-controllable time limit on the evaluation,
;; in wall-clock time.  When that limit is reached, Guile schedules an
;; asynchronous interrupt in the sandbox that aborts the computation.
;; For this to work, the sandboxed evaluation must not disable
;; interrupts, and it must not prevent timely aborts via malicious "out"
;; guards in dynamic-wind thunks.
;;
;; The sandbox also has an allocation limit that uses a similar cancel
;; mechanism, but this limit is less precise as the check only runs at
;; garbage-collection time.
;;
;; The sandbox sets the allocation limit as the stack limit as well.
;;
;; For (2), the only way an evaluation can affect future evaluations is
;; if it causes a side-effect outside its sandbox.  That side effect
;; could change the way the host or future sandboxed evaluations
;; operate, or it could leak information to future evaluations.
;;
;; One means of information leakage would be the file system.  Although
;; one can imagine "safe" ways to access a file system, in practice we
;; just prevent all access to this and other operating system facilities
;; by not exposing the Guile primitives that access the file system,
;; connect to networking hosts, etc.  If we chose our set of bindings
;; correctly and it is impossible to access host values other than those
;; given to the evaluation, then we have succeeded in granting only a
;; limited set of capabilities to the guest.
;; 
;; To prevent information leakage we also limit other information about
;; the host, like its hostname or the Guile build information.
;; 
;; The guest must also not have the capability to mutate a location used
;; by the host or by future sandboxed evaluations.  Either you expose no
;; primitives to the evaluation that can mutate locations, or you expose
;; no mutable locations.  In this sandbox we opt for a combination of
;; the two, though the selection of bindings is up to the user.  "set!"
;; is always excluded, as Guile doesn't have a nice way to prevent set!
;; on imported bindings.  But variable-set! is included, as no set of
;; bindings from this module includes a variable or a capability to a
;; variable.  It's possible though to build sandbox modules with no
;; mutating primitives.  As far as we know, all possible combinations of
;; the binding sets listed below are safe.
;;
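;; Usage sketch of combining binding sets (an illustration, not part of
;; the module as posted; wrapped in #; so it is not run at load time).
;; The pair is created inside the sandbox, so exposing set-car! gives
;; the guest nothing of the host's to mutate.  The call should
;; return 10.
#;
(eval-in-sandbox '(let ((p (cons 1 2)))
                    (set-car! p 10)
                    (car p))
                 #:bindings (append *core-bindings*
                                    *pair-bindings*
                                    *mutating-pair-bindings*))
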
(define *core-bindings*
  '(((guile)
     and
     begin
     apply
     call-with-values
     values
     case
     case-lambda
     case-lambda*
     cond
     define
     define*
     define-values
     do
     if
     lambda
     lambda*
     let
     let*
     letrec
     letrec*
     or
     quasiquote
     quote
     ;; Can't allow mutation to globals.
     ;; set!
     unless
     unquote
     unquote-splicing
     when
     while
     λ)))

(define *macro-bindings*
  '(((guile)
     bound-identifier=?
     ;; Although these have "current" in their name, they are lexically
     ;; scoped, not dynamically scoped.
     current-filename
     current-source-location
     datum->syntax
     define-macro
     define-syntax
     define-syntax-parameter
     define-syntax-rule
     defmacro
     free-identifier=?
     generate-temporaries
     gensym
     identifier-syntax
     identifier?
     let-syntax
     letrec-syntax
     macroexpand
     macroexpanded?
     quasisyntax
     start-stack
     syntax
     syntax->datum
     syntax-case
     syntax-error
     syntax-parameterize
     syntax-rules
     syntax-source
     syntax-violation
     unsyntax
     unsyntax-splicing
     with-ellipsis
     with-syntax
     make-variable-transformer)))

(define *iteration-bindings*
  '(((guile)
     compose
     for-each
     identity
     iota
     map
     map-in-order
     const
     noop)))

(define *clock-bindings*
  '(((guile)
     get-internal-real-time
     internal-time-units-per-second
     sleep
     usleep)))

(define *procedure-bindings*
  '(((guile)
     procedure-documentation
     procedure-minimum-arity
     procedure-name
     procedure?
     thunk?)))

(define *version-bindings*
  '(((guile)
     effective-version
     major-version
     micro-version
     minor-version
     version
     version-matches?)))

(define *nil-bindings*
  '(((guile)
     nil?)))

(define *unspecified-bindings*
  '(((guile)
     unspecified?
     *unspecified*)))

(define *predicate-bindings*
  '(((guile)
     ->bool
     and-map
     and=>
     boolean?
     eq?
     equal?
     eqv?
     negate
     not
     or-map)))

;; The current ports (current-input-port et al) are dynamically scoped,
;; which is a footgun from a sandboxing perspective.  It's too easy for
;; a procedure that is the result of a sandboxed evaluation to be later
;; invoked in a different context and thereby be implicitly granted
;; capabilities to whatever port is then current.  This is compounded by
;; the fact that most Scheme i/o primitives allow the port to be omitted
;; and thereby default to whatever's current.  For now, sadly, we avoid
;; exposing any i/o primitive to the sandbox.
#;
(define *i/o-bindings*
  '(((guile)
     display
     eof-object?
     force-output
     format
     make-soft-port
     newline
     read
     simple-format
     write
     write-char)
    ((ice-9 ports)
     %make-void-port
     char-ready?
     ;; Note that these are mutable parameters.
     current-error-port
     current-input-port
     current-output-port
     current-warning-port
     drain-input
     eof-object?
     file-position
     force-output
     ftell
     input-port?
     output-port?
     peek-char
     port-closed?
     port-column
     port-conversion-strategy
     port-encoding
     port-filename
     port-line
     port-mode
     port?
     read-char
     the-eof-object
     ;; We don't provide open-output-string because it needs
     ;; get-output-string, and get-output-string provides a generic
     ;; capability on any output string port.  For consistency then we
     ;; don't provide open-input-string either; call-with-input-string
     ;; is sufficient.
     call-with-input-string
     call-with-output-string
     with-error-to-port
     with-error-to-string
     with-input-from-port
     with-input-from-string
     with-output-to-port
     with-output-to-string)))

;; If two evaluations are called with the same input port, unread-char
;; and unread-string can use a port as a mutable channel to pass
;; information from one to the other.
#;
(define *mutating-i/o-bindings*
  '(((guile)
     set-port-encoding!)
    ((ice-9 ports)
     close-input-port
     close-output-port
     close-port
     file-set-position
     seek
     set-port-column!
     set-port-conversion-strategy!
     set-port-encoding!
     set-port-filename!
     set-port-line!
     setvbuf
     unread-char
     unread-string)))

(define *error-bindings*
  '(((guile)
     error
     throw
     with-throw-handler
     catch
     ;; false-if-exception can cause i/o if the #:warning arg is passed.
     ;; false-if-exception

     ;; See notes on *i/o-bindings*.
     ;; peek
     ;; pk
     ;; print-exception
     ;; warn
     strerror
     scm-error
     )))

;; FIXME: Currently we can't expose anything that works on the current
;; module to the sandbox.  It could be that the sandboxed evaluation
;; returns a procedure, and that procedure may later be invoked in a
;; different context with a different current-module and it is unlikely
;; that the later caller will consider themselves as granting a
;; capability on whatever module is then current.  Likewise export (and
;; by extension, define-public and the like) also operate on the current
;; module.
;;
;; It could be that we could expose a statically scoped eval to the
;; sandbox.
#;
(define *eval-bindings*
  '(((guile)
     current-module
     module-name
     module?
     define-once
     define-private
     define-public
     defined?
     export
     defmacro-public
     ;; FIXME: single-arg eval?
     eval
     primitive-eval
     eval-string
     self-evaluating?
     ;; Can we?
     set-current-module)))

(define *sort-bindings*
  '(((guile)
     sort
     sorted?
     stable-sort
     sort-list)))

;; These can only form part of a safe binding set if no mutable pair or
;; vector is exposed to the sandbox.
(define *mutating-sort-bindings*
  '(((guile)
     sort!
     stable-sort!
     sort-list!
     restricted-vector-sort!)))

(define *regexp-bindings*
  '(((guile)
     make-regexp
     regexp-exec
     regexp/basic
     regexp/extended
     regexp/icase
     regexp/newline
     regexp/notbol
     regexp/noteol
     regexp?)))

(define *alist-bindings*
  '(((guile)
     acons
     assoc
     assoc-ref
     assq
     assq-ref
     assv
     assv-ref
     sloppy-assoc
     sloppy-assq
     sloppy-assv)))

;; These can only form part of a safe binding set if no mutable pair
;; is exposed to the sandbox.
(define *mutating-alist-bindings*
  '(((guile)
     assoc-remove!
     assoc-set!
     assq-remove!
     assq-set!
     assv-remove!
     assv-set!)))

(define *number-bindings*
  '(((guile)
     *
     +
     -
     /
     1+
     1-
     <
     <=
     =
     >
     >=
     abs
     acos
     acosh
     angle
     asin
     asinh
     atan
     atanh
     ceiling
     ceiling-quotient
     ceiling-remainder
     ceiling/
     centered-quotient
     centered-remainder
     centered/
     complex?
     cos
     cosh
     denominator
     euclidean-quotient
     euclidean-remainder
     euclidean/
     even?
     exact->inexact
     exact-integer-sqrt
     exact-integer?
     exact?
     exp
     expt
     finite?
     floor
     floor-quotient
     floor-remainder
     floor/
     gcd
     imag-part
     inf
     inf?
     integer-expt
     integer-length
     integer?
     lcm
     log
     log10
     magnitude
     make-polar
     make-rectangular
     max
     min
     modulo
     modulo-expt
     most-negative-fixnum
     most-positive-fixnum
     nan
     nan?
     negative?
     numerator
     odd?
     positive?
     quotient
     rational?
     rationalize
     real-part
     real?
     remainder
     round
     round-quotient
     round-remainder
     round/
     sin
     sinh
     sqrt
     tan
     tanh
     truncate
     truncate-quotient
     truncate-remainder
     truncate/
     zero?
     number?
     number->string
     string->number)))

(define *char-set-bindings*
  '(((guile)
     ->char-set
     char-set
     char-set->list
     char-set->string
     char-set-adjoin
     char-set-any
     char-set-complement
     char-set-contains?
     char-set-copy
     char-set-count
     char-set-cursor
     char-set-cursor-next
     char-set-delete
     char-set-diff+intersection
     char-set-difference
     char-set-every
     char-set-filter
     char-set-fold
     char-set-for-each
     char-set-hash
     char-set-intersection
     char-set-map
     char-set-ref
     char-set-size
     char-set-unfold
     char-set-union
     char-set-xor
     char-set:ascii
     char-set:blank
     char-set:designated
     char-set:digit
     char-set:empty
     char-set:full
     char-set:graphic
     char-set:hex-digit
     char-set:iso-control
     char-set:letter
     char-set:letter+digit
     char-set:lower-case
     char-set:printing
     char-set:punctuation
     char-set:symbol
     char-set:title-case
     char-set:upper-case
     char-set:whitespace
     char-set<=
     char-set=
     char-set?
     end-of-char-set?
     list->char-set
     string->char-set
     ucs-range->char-set)))

;; These can only form part of a safe binding set if no mutable char-set
;; is exposed to the sandbox.  Unfortunately all charsets in Guile are
;; mutable, currently, including the built-in charsets, so we can't
;; expose these primitives.
#;
(define *mutating-char-set-bindings*
  '(((guile)
     char-set-adjoin!
     char-set-complement!
     char-set-delete!
     char-set-diff+intersection!
     char-set-difference!
     char-set-filter!
     char-set-intersection!
     char-set-unfold!
     char-set-union!
     char-set-xor!
     list->char-set!
     string->char-set!
     ucs-range->char-set!)))

(define *array-bindings*
  '(((guile)
     array->list
     array-cell-ref
     array-contents
     array-dimensions
     array-equal?
     array-for-each
     array-in-bounds?
     array-length
     array-rank
     array-ref
     array-shape
     array-slice
     array-slice-for-each
     array-slice-for-each-in-order
     array-type
     array-type-code
     array?
     list->array
     list->typed-array
     make-array
     make-shared-array
     make-typed-array
     shared-array-increments
     shared-array-offset
     shared-array-root
     transpose-array
     typed-array?)))

;; These can only form part of a safe binding set if no mutable vector,
;; bitvector, bytevector, srfi-4 vector, or array is exposed to the
;; sandbox.
(define *mutating-array-bindings*
  '(((guile)
     array-cell-set!
     array-copy!
     array-copy-in-order!
     array-fill!
     array-index-map!
     array-map!
     array-map-in-order!
     array-set!)))

(define *hash-bindings*
  '(((guile)
     doubly-weak-hash-table?
     hash
     hash-count
     hash-fold
     hash-for-each
     hash-for-each-handle
     hash-get-handle
     hash-map->list
     hash-ref
     hash-table?
     hashq
     hashq-get-handle
     hashq-ref
     hashv
     hashv-get-handle
     hashv-ref
     hashx-get-handle
     hashx-ref
     make-doubly-weak-hash-table
     make-hash-table
     make-weak-key-hash-table
     make-weak-value-hash-table
     weak-key-hash-table?
     weak-value-hash-table?)))

;; These can only form part of a safe binding set if no hash table is
;; exposed to the sandbox.
(define *mutating-hash-bindings*
  '(((guile)
     hash-clear!
     hash-create-handle!
     hash-remove!
     hash-set!
     hashq-create-handle!
     hashq-remove!
     hashq-set!
     hashv-create-handle!
     hashv-remove!
     hashv-set!
     hashx-create-handle!
     hashx-remove!
     hashx-set!)))

(define *variable-bindings*
  '(((guile)
     make-undefined-variable
     make-variable
     variable-bound?
     variable-ref
     variable?)))

;; These can only form part of a safe binding set if no mutable variable
;; is exposed to the sandbox; this applies particularly to variables
;; that are module bindings.
(define *mutating-variable-bindings*
  '(((guile)
     variable-set!
     variable-unset!)))

(define *string-bindings*
  '(((guile)
     absolute-file-name?
     file-name-separator-string
     file-name-separator?
     in-vicinity
     basename
     dirname

     list->string
     make-string
     object->string
     reverse-list->string
     string
     string->list
     string-any
     string-any-c-code
     string-append
     string-append/shared
     string-capitalize
     string-ci<
     string-ci<=
     string-ci<=?
     string-ci<>
     string-ci<?
     string-ci=
     string-ci=?
     string-ci>
     string-ci>=
     string-ci>=?
     string-ci>?
     string-compare
     string-compare-ci
     string-concatenate
     string-concatenate-reverse
     string-concatenate-reverse/shared
     string-concatenate/shared
     string-contains
     string-contains-ci
     string-copy
     string-count
     string-delete
     string-downcase
     string-drop
     string-drop-right
     string-every
     string-every-c-code
     string-filter
     string-fold
     string-fold-right
     string-for-each
     string-for-each-index
     string-hash
     string-hash-ci
     string-index
     string-index-right
     string-join
     string-length
     string-map
     string-normalize-nfc
     string-normalize-nfd
     string-normalize-nfkc
     string-normalize-nfkd
     string-null?
     string-pad
     string-pad-right
     string-prefix-ci?
     string-prefix-length
     string-prefix-length-ci
     string-prefix?
     string-ref
     string-replace
     string-reverse
     string-rindex
     string-skip
     string-skip-right
     string-split
     string-suffix-ci?
     string-suffix-length
     string-suffix-length-ci
     string-suffix?
     string-tabulate
     string-take
     string-take-right
     string-titlecase
     string-tokenize
     string-trim
     string-trim-both
     string-trim-right
     string-unfold
     string-unfold-right
     string-upcase
     string-utf8-length
     string<
     string<=
     string<=?
     string<>
     string<?
     string=
     string=?
     string>
     string>=
     string>=?
     string>?
     string?
     substring
     substring/copy
     substring/read-only
     substring/shared
     xsubstring)))

;; These can only form part of a safe binding set if no mutable string
;; is exposed to the sandbox.
(define *mutating-string-bindings*
  '(((guile)
     string-capitalize!
     string-copy!
     string-downcase!
     string-fill!
     string-map!
     string-reverse!
     string-set!
     string-titlecase!
     string-upcase!
     string-xcopy!
     substring-fill!
     substring-move!)))

(define *symbol-bindings*
  '(((guile)
     string->symbol
     string-ci->symbol
     symbol->string
     list->symbol
     make-symbol
     symbol
     symbol-append
     symbol-hash
     symbol-interned?
     symbol?)))

(define *keyword-bindings*
  '(((guile)
     keyword?
     keyword->symbol
     symbol->keyword)))

;; These can only form part of a safe binding set if no valid prompt tag
;; is ever exposed to the sandbox, or can be constructed by the sandbox.
(define *prompt-bindings*
  '(((guile)
     abort-to-prompt
     abort-to-prompt*
     call-with-prompt
     make-prompt-tag)))

(define *bit-bindings*
  '(((guile)
     ash
     round-ash
     logand
     logcount
     logior
     lognot
     logtest
     logxor
     logbit?)))

(define *bitvector-bindings*
  '(((guile)
     bit-count
     bit-count*
     bit-extract
     bit-position
     bitvector
     bitvector->list
     bitvector-length
     bitvector-ref
     bitvector?
     list->bitvector
     make-bitvector)))

;; These can only form part of a safe binding set if no mutable
;; bitvector is exposed to the sandbox.
(define *mutating-bitvector-bindings*
  '(((guile)
     bit-invert!
     bit-set*!
     bitvector-fill!
     bitvector-set!)))

(define *fluid-bindings*
  '(((guile)
     fluid-bound?
     fluid-ref
     ;; fluid-ref* could escape the sandbox and is not allowed.
     fluid-thread-local?
     fluid?
     make-fluid
     make-thread-local-fluid
     make-unbound-fluid
     with-fluid*
     with-fluids
     with-fluids*
     make-parameter
     parameter?
     parameterize)))

;; These can only form part of a safe binding set if no fluid is
;; directly exposed to the sandbox.
(define *mutating-fluid-bindings*
  '(((guile)
     fluid-set!
     fluid-unset!
     fluid->parameter)))

(define *char-bindings*
  '(((guile)
     char-alphabetic?
     char-ci<=?
     char-ci<?
     char-ci=?
     char-ci>=?
     char-ci>?
     char-downcase
     char-general-category
     char-is-both?
     char-lower-case?
     char-numeric?
     char-titlecase
     char-upcase
     char-upper-case?
     char-whitespace?
     char<=?
     char<?
     char=?
     char>=?
     char>?
     char?
     char->integer
     integer->char)))

(define *list-bindings*
  '(((guile)
     list
     list-cdr-ref
     list-copy
     list-head
     list-index
     list-ref
     list-tail
     list?
     null?
     make-list
     append
     delete
     delq
     delv
     filter
     length
     member
     memq
     memv
     merge
     reverse)))

;; These can only form part of a safe binding set if no mutable
;; pair is exposed to the sandbox.
(define *mutating-list-bindings*
  '(((guile)
     list-cdr-set!
     list-set!
     append!
     delete!
     delete1!
     delq!
     delq1!
     delv!
     delv1!
     filter!
     merge!
     reverse!)))

(define *pair-bindings*
  '(((guile)
     last-pair
     pair?
     caaaar
     caaadr
     caaar
     caadar
     caaddr
     caadr
     caar
     cadaar
     cadadr
     cadar
     caddar
     cadddr
     caddr
     cadr
     car
     cdaaar
     cdaadr
     cdaar
     cdadar
     cdaddr
     cdadr
     cdar
     cddaar
     cddadr
     cddar
     cdddar
     cddddr
     cdddr
     cddr
     cdr
     cons
     cons*)))

;; These can only form part of a safe binding set if no mutable
;; pair is exposed to the sandbox.
(define *mutating-pair-bindings*
  '(((guile)
     set-car!
     set-cdr!)))

(define *vector-bindings*
  '(((guile)
     list->vector
     make-vector
     vector
     vector->list
     vector-copy
     vector-length
     vector-ref
     vector?)))

;; These can only form part of a safe binding set if no mutable
;; vector is exposed to the sandbox.
(define *mutating-vector-bindings*
  '(((guile)
     vector-fill!
     vector-move-left!
     vector-move-right!
     vector-set!)))

(define *promise-bindings*
  '(((guile)
     force
     delay
     make-promise
     promise?)))

(define *srfi-4-bindings*
  '(((srfi srfi-4)
     f32vector
     f32vector->list
     f32vector-length
     f32vector-ref
     f32vector?
     f64vector
     f64vector->list
     f64vector-length
     f64vector-ref
     f64vector?
     list->f32vector
     list->f64vector
     list->s16vector
     list->s32vector
     list->s64vector
     list->s8vector
     list->u16vector
     list->u32vector
     list->u64vector
     list->u8vector
     make-f32vector
     make-f64vector
     make-s16vector
     make-s32vector
     make-s64vector
     make-s8vector
     make-u16vector
     make-u32vector
     make-u64vector
     make-u8vector
     s16vector
     s16vector->list
     s16vector-length
     s16vector-ref
     s16vector?
     s32vector
     s32vector->list
     s32vector-length
     s32vector-ref
     s32vector?
     s64vector
     s64vector->list
     s64vector-length
     s64vector-ref
     s64vector?
     s8vector
     s8vector->list
     s8vector-length
     s8vector-ref
     s8vector?
     u16vector
     u16vector->list
     u16vector-length
     u16vector-ref
     u16vector?
     u32vector
     u32vector->list
     u32vector-length
     u32vector-ref
     u32vector?
     u64vector
     u64vector->list
     u64vector-length
     u64vector-ref
     u64vector?
     u8vector
     u8vector->list
     u8vector-length
     u8vector-ref
     u8vector?)))

;; These can only form part of a safe binding set if no mutable
;; bytevector is exposed to the sandbox.
(define *mutating-srfi-4-bindings*
  '(((srfi srfi-4)
     f32vector-set!
     f64vector-set!
     s16vector-set!
     s32vector-set!
     s64vector-set!
     s8vector-set!
     u16vector-set!
     u32vector-set!
     u64vector-set!
     u8vector-set!)))

(define *all-pure-bindings*
  (append *alist-bindings*
          *array-bindings*
          *bit-bindings*
          *bitvector-bindings*
          *char-bindings*
          *char-set-bindings*
          *clock-bindings*
          *core-bindings*
          *error-bindings*
          *fluid-bindings*
          *hash-bindings*
          *iteration-bindings*
          *keyword-bindings*
          *list-bindings*
          *macro-bindings*
          *nil-bindings*
          *number-bindings*
          *pair-bindings*
          *predicate-bindings*
          *procedure-bindings*
          *promise-bindings*
          *prompt-bindings*
          *regexp-bindings*
          *sort-bindings*
          *srfi-4-bindings*
          *string-bindings*
          *symbol-bindings*
          *unspecified-bindings*
          *variable-bindings*
          *vector-bindings*
          *version-bindings*))


(define *all-pure-and-impure-bindings*
  (append *all-pure-bindings*
          *mutating-alist-bindings*
          *mutating-array-bindings*
          *mutating-bitvector-bindings*
          *mutating-fluid-bindings*
          *mutating-hash-bindings*
          *mutating-list-bindings*
          *mutating-pair-bindings*
          *mutating-sort-bindings*
          *mutating-srfi-4-bindings*
          *mutating-string-bindings*
          *mutating-variable-bindings*
          *mutating-vector-bindings*))

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: RFC: (ice-9 sandbox)
  2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
@ 2017-03-31 11:33 ` Ludovic Courtès
  2017-03-31 16:26   ` Andy Wingo
  2017-03-31 14:41 ` Mike Gran
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2017-03-31 11:33 UTC (permalink / raw)
  To: guile-devel

Hello!

Andy Wingo <wingo@pobox.com> skribis:

> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.

Would be nice!

> (define (call-with-allocation-limit limit thunk limit-reached)
>   "Call @var{thunk}, but cancel it if @var{limit} bytes have been
> allocated.  If the computation is cancelled, call @var{limit-reached} in
> tail position.  @var{thunk} must not disable interrupts or prevent an
> abort via a @code{dynamic-wind} unwind handler.
>
> This limit applies to both stack and heap allocation.  The computation
> will not be aborted before @var{limit} bytes have been allocated, but
> for the heap allocation limit, the check may be postponed until the next garbage collection."
>   (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
>   (let ((zero (bytes-allocated))
>         (tag (make-prompt-tag)))
>     (define (check-allocation)
>       (when (< limit (- (bytes-allocated) zero))
>         (abort-to-prompt tag)))
>     (call-with-prompt tag
>       (lambda ()
>         (dynamic-wind
>           (lambda ()
>             (add-hook! after-gc-hook check-allocation))
>           (lambda ()
>             (call-with-stack-overflow-handler
>              ;; The limit is in "words", which used to be 4 or 8 but now
>              ;; is always 8 bytes.
>              (floor/ limit 8)
>              thunk
>              (lambda () (abort-to-prompt tag))))
>           (lambda ()
>             (remove-hook! after-gc-hook check-allocation))))
>       (lambda (k)
>         (limit-reached)))))

The allocations that trigger ‘after-gc-hook’ could be caused by a
separate thread, right?  That’s probably an acceptable limitation, but
one to be aware of.

Also, if the code does:

  (make-bytevector (expt 2 32))

then ‘after-gc-hook’ runs too late, as the comment notes.

> (define (make-sandbox-module bindings)
>   "Return a fresh module that only contains @var{bindings}.
>
> The @var{bindings} should be given as a list of import sets.  One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports.  An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively."
>   (let ((m (make-fresh-user-module)))
>     (purify-module! m)
>     ;; FIXME: We want to have a module that will be collectable by GC.
>     ;; Currently in Guile all modules are part of a single tree, and
>     ;; once a module is part of that tree it will never be collected.
>     ;; So we want to sever the module off from that tree.  However the
>     ;; psyntax syntax expander currently needs to be able to look up
>     ;; modules by name; being severed from the name tree prevents that
>     ;; from happening.  So for now, each evaluation leaks memory :/
>     ;; 
>     ;; (sever-module! m)
>     (module-use-interfaces! m
>                             (map (match-lambda
>                                    ((mod-name . bindings)
>                                     (resolve-interface mod-name
>                                                        #:select bindings)))
>                                  bindings))
>     m))

IIUC ‘@@’ is unavailable in the returned module, right?

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (eval '(@@ (guile) resolve-interface)
			   (let ((m (make-fresh-user-module)))
			     (purify-module! m)
			     m))
ERROR: In procedure %resolve-variable:
ERROR: Unbound variable: @@
--8<---------------cut here---------------end--------------->8---

Isn’t make-fresh-user-module + purify-module! equivalent to just
(make-module)?
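
For readers following along, the import-set format described in the
docstring above can be exercised as below.  This is only a sketch: it
assumes the attached file is loadable as (ice-9 sandbox), and the names
sandbox-module, result, and the alias pair are made up for illustration.

```scheme
;; Sketch only: assumes the attached file is loadable as (ice-9 sandbox).
(use-modules (ice-9 sandbox))

;; One import set: the interface name (guile), followed by its imports.
;; `car' is imported under its own name; `cons' is imported as `pair'
;; (a hypothetical alias chosen for this example).
(define sandbox-module
  (make-sandbox-module '(((guile) car (cons . pair)))))

;; The expression can refer to `car' and `pair' because they were
;; imported into the fresh module.
(define result
  (eval '(car (pair 1 2)) sandbox-module))
```

Here result is 1; anything that was not listed in the import set, such
as ‘@@’ in the transcript above, is unbound inside the module.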


> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-pair-bindings*
>   '(((guile)
>      set-car!
>      set-cdr!)))

When used on a literal pair (mapped read-only), these can cause a
segfault.  Now since the code is ‘eval’d, the only literal pairs it can
see are those passed by the caller I suppose, so this may be safe?
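
To illustrate the case where the expression contains no read-only
literals, here is a minimal sketch that runs outside the sandbox.  The
expression is built at run time with ‘read’, so its pairs are ordinary
heap pairs; the names expression and result are illustrative only.

```scheme
;; Build the expression at run time so that none of its pairs are
;; literals of a compiled file (which may be mapped read-only).
(define expression
  (call-with-input-string "(let ((p '(1 2))) (set-car! p 3) p)" read))

;; Evaluate it in the current module; `set-car!' mutates the quoted
;; list, which lives on the heap because it came from `read'.
(define result
  (eval expression (interaction-environment)))
```

Here result is (3 2).  Had the caller passed in a pair that is a literal
of compiled code, the same ‘set-car!’ would hit the read-only case
described above.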

> (define *all-pure-and-impure-bindings*
>   (append *all-pure-bindings*

Last but not least: why all the stars?  :-)
I’m used to ‘%something’.

Thank you!

Ludo’.





* Re: RFC: (ice-9 sandbox)
  2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
  2017-03-31 11:33 ` Ludovic Courtès
@ 2017-03-31 14:41 ` Mike Gran
  2017-04-01 14:33 ` Christopher Allan Webber
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Mike Gran @ 2017-03-31 14:41 UTC (permalink / raw)
  To: Andy Wingo, guile-devel@gnu.org




> On Friday, March 31, 2017 2:28 AM, Andy Wingo <wingo@pobox.com> wrote:

> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.

Neat!  Here are some random, tangential ideas.

While this might be a good route toward a pragmatic definition of
"safe," a route to a stronger version of safety might be trying
to compile a Guile against the CloudABI C library -- which prevents
OS interaction altogether -- and then use something like inetd to
communicate with your safe guile.

As a middle ground, there are the --disable-posix,
--disable-networking, and --disable-regex options to consider.

-Mike Gran




* Re: RFC: (ice-9 sandbox)
  2017-03-31 11:33 ` Ludovic Courtès
@ 2017-03-31 16:26   ` Andy Wingo
  2017-03-31 21:41     ` Ludovic Courtès
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Wingo @ 2017-03-31 16:26 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo@pobox.com> skribis:
>
> The allocations that trigger ‘after-gc-hook’ could be caused by a
> separate thread, right?  That’s probably an acceptable limitation, but
> one to be aware of.

Ah yes, we should document this.  Sadly we just don't have very good
metrics here.

> Also, if the code does:
>
>   (make-bytevector (expt 2 32))
>
> then ‘after-gc-hook’ runs too late, as the comment notes.

Yep.

> IIUC ‘@@’ is unavailable in the returned module, right?

Correct.  You could put it there but that's a bad idea.

> Isn’t make-fresh-user-module + purify-module! equivalent to just
> (make-module)?

No, beautify-user-module! does a few more things too.  I was thinking
that we would want to be able to work on the public interface of the
module, so I wanted to make sure it was there, but in retrospect we
don't need it and can probably simplify things.

>> ;; These can only form part of a safe binding set if no mutable
>> ;; pair is exposed to the sandbox.
>> (define *mutating-pair-bindings*
>>   '(((guile)
>>      set-car!
>>      set-cdr!)))
>
> When used on a literal pair (mapped read-only), these can cause a
> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
> see are those passed by the caller I suppose, so this may be safe?

Who knows.  I mean vector-set! can also cause segfaults.  I think we
should fix that situation to throw an exception.

>> (define *all-pure-and-impure-bindings*
>>   (append *all-pure-bindings*
>
> Last but not least: why all the stars?  :-)
> I’m used to ‘%something’.

For me I read % as being pronounced "sys" and indicating internal
bindings.  Why do you use it for globals?  Is it your proposal that we
use it for globals?

Andy




* Re: RFC: (ice-9 sandbox)
  2017-03-31 16:26   ` Andy Wingo
@ 2017-03-31 21:41     ` Ludovic Courtès
  2017-04-02 10:18       ` Andy Wingo
  0 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2017-03-31 21:41 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> skribis:

> On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:

[...]

>>> ;; These can only form part of a safe binding set if no mutable
>>> ;; pair is exposed to the sandbox.
>>> (define *mutating-pair-bindings*
>>>   '(((guile)
>>>      set-car!
>>>      set-cdr!)))
>>
>> When used on a literal pair (mapped read-only), these can cause a
>> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
>> see are those passed by the caller I suppose, so this may be safe?
>
> Who knows.  I mean vector-set! can also cause segfaults.  I think we
> should fix that situation to throw an exception.

Yes, that would be nice, though I suppose it’s currently tricky to
achieve, no?  Maybe that newfangled ‘userfaultfd’ will save us all.

>>> (define *all-pure-and-impure-bindings*
>>>   (append *all-pure-bindings*
>>
>> Last but not least: why all the stars?  :-)
>> I’m used to ‘%something’.
>
> For me I read % as being pronounced "sys" and indicating internal
> bindings.  Why do you use it for globals?  Is it your proposal that we
> use it for globals?

I tend to do that but I realize I must be a minority here.  Let it be
stars then.  :-)

Thanks for working on this!

Ludo’.




* Re: RFC: (ice-9 sandbox)
  2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
  2017-03-31 11:33 ` Ludovic Courtès
  2017-03-31 14:41 ` Mike Gran
@ 2017-04-01 14:33 ` Christopher Allan Webber
  2017-04-06 21:41 ` Freja Nordsiek
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Christopher Allan Webber @ 2017-04-01 14:33 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Wow!  With this I suppose we could implement something like 
  http://mumble.net/~jar/pubs/secureos/secureos.html
?




* Re: RFC: (ice-9 sandbox)
  2017-03-31 21:41     ` Ludovic Courtès
@ 2017-04-02 10:18       ` Andy Wingo
  2017-04-03 15:35         ` Ludovic Courtès
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Wingo @ 2017-04-02 10:18 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Fri 31 Mar 2017 23:41, ludo@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wingo@pobox.com> skribis:
>
>> On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>>> ;; These can only form part of a safe binding set if no mutable
>>>> ;; pair is exposed to the sandbox.
>>>> (define *mutating-pair-bindings*
>>>>   '(((guile)
>>>>      set-car!
>>>>      set-cdr!)))
>>>
>>> When used on a literal pair (mapped read-only), these can cause a
>>> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
>>> see are those passed by the caller I suppose, so this may be safe?
>>
>> Who knows.  I mean vector-set! can also cause segfaults.  I think we
>> should fix that situation to throw an exception.
>
> Yes, that would be nice, though I suppose it’s currently tricky to
> achieve, no?  Maybe that newfangled ‘userfaultfd’ will save us all.

Maybe :)  I mean it's possible now to catch SIGSEGV.  I just sent a
patch to guile-devel; wdyt?  Needs docs & tests of course.

>>>> (define *all-pure-and-impure-bindings*
>>>>   (append *all-pure-bindings*
>>>
>>> Last but not least: why all the stars?  :-)
>>> I’m used to ‘%something’.
>>
>> For me I read % as being pronounced "sys" and indicating internal
>> bindings.  Why do you use it for globals?  Is it your proposal that we
>> use it for globals?
>
> I tend to do that but I realize I must be a minority here.  Let it be
> stars then.  :-)

I think that like you, I learned Scheme conventions in an ad-hoc way,
aping conventions from many sources (Guile's own code, Common Lisp,
random Scheme).  I would be happy if we could be a bit more purposeful
about our conventions and I would be happy to change mine :)  %
can work fine :)

Andy




* Re: RFC: (ice-9 sandbox)
  2017-04-02 10:18       ` Andy Wingo
@ 2017-04-03 15:35         ` Ludovic Courtès
  2017-04-14 10:52           ` Andy Wingo
  0 siblings, 1 reply; 17+ messages in thread
From: Ludovic Courtès @ 2017-04-03 15:35 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> skribis:

> On Fri 31 Mar 2017 23:41, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Andy Wingo <wingo@pobox.com> skribis:
>>
>>> On Fri 31 Mar 2017 13:33, ludo@gnu.org (Ludovic Courtès) writes:
>>
>> [...]
>>
>>>>> ;; These can only form part of a safe binding set if no mutable
>>>>> ;; pair is exposed to the sandbox.
>>>>> (define *mutating-pair-bindings*
>>>>>   '(((guile)
>>>>>      set-car!
>>>>>      set-cdr!)))
>>>>
>>>> When used on a literal pair (mapped read-only), these can cause a
>>>> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
>>>> see are those passed by the caller I suppose, so this may be safe?
>>>
>>> Who knows.  I mean vector-set! can also cause segfaults.  I think we
>>> should fix that situation to throw an exception.
>>
>> Yes, that would be nice, though I suppose it’s currently tricky to
>> achieve, no?  Maybe that newfangled ‘userfaultfd’ will save us all.
>
> Maybe :)  I mean it's possible now to catch SIGSEGV.  I just sent a
> patch to guile-devel; wdyt?  Needs docs & tests of course.

Neat! I’ll look into it.

>>>>> (define *all-pure-and-impure-bindings*
>>>>>   (append *all-pure-bindings*
>>>>
>>>> Last but not least: why all the stars?  :-)
>>>> I’m used to ‘%something’.
>>>
>>> For me I read % as being pronounced "sys" and indicating internal
>>> bindings.  Why do you use it for globals?  Is it your proposal that we
>>> use it for globals?
>>
>> I tend to do that but I realize I must be a minority here.  Let it be
>> stars then.  :-)
>
> I think that like you, I learned Scheme conventions in an ad-hoc way,
> aping conventions from many sources (Guile's own code, Common Lisp,
> random Scheme).  I would be happy if we could be a bit more purposeful
> about our conventions and I would be happy to change mine :)  %
> can work fine :)

I grepped Guile and it seems that stars are actually more common for
globals than % (I thought it was the opposite but as you say, I kind of
discovered/invented the conventions.)

Riastradh’s document at <http://mumble.net/~campbell/scheme/style.txt>
has this:

  Affix asterisks to the beginning and end of a globally mutable
  variable.  This allows the reader of the program to recognize very
  easily that it is badly written!

… but it doesn’t say anything about constants nor about %.

It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
‘%all-pure-bindings’.  So, dunno, as you see fit!

Ludo’.




* Re: RFC: (ice-9 sandbox)
  2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
                   ` (2 preceding siblings ...)
  2017-04-01 14:33 ` Christopher Allan Webber
@ 2017-04-06 21:41 ` Freja Nordsiek
  2017-04-14 10:58   ` Andy Wingo
  2017-04-15 17:23 ` Nala Ginrut
  2017-04-18 19:48 ` Andy Wingo
  5 siblings, 1 reply; 17+ messages in thread
From: Freja Nordsiek @ 2017-04-06 21:41 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

I took a look at the specific bindings the sandbox makes available and
have a few thoughts.

I didn't see any problems with any of the pure bindings made
available, but I am only very familiar with basic R5RS, R6RS, and R7RS
bindings, not Guile extensions (yet, at least), so I can't comment on
many of them.

On the subject of ports and i/o, I have a few ideas. R6RS i/o in the
(rnrs io ports) module generally requires the port to be explicitly
given, rather than assuming the current input or output port if not given
(though (rnrs io simple) does make those assumptions). For many, it would be
impossible because they put the port as the first argument and a
required second argument afterwards. Looking at module/io/ports.scm in
Guile 2.2.x, it looks like the reading and writing procedures there
should be safe. Obviously, nothing that opens a file should be used,
nor the procedures to get current input, output, and error; but the
rest can be used. And this includes string and bytevector ports, which
could be very useful in the sandbox (I don't know about anyone else,
but I use string ports all the time).
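
As a concrete illustration of the kind of in-memory port use meant here,
the following sketch uses only (rnrs io ports) procedures that take the
port explicitly.  The names written and read-back are illustrative, and
whether these bindings belong in a safe binding set is exactly the open
question, not something this example settles.

```scheme
(use-modules (rnrs io ports))

;; Writing: the port accumulates text in memory, and `extract' returns
;; what has been written so far.
(define written
  (call-with-values open-string-output-port
    (lambda (port extract)
      (put-string port "hello, ")
      (put-string port "sandbox")
      (extract))))

;; Reading: the input comes from a string rather than from a file.
(define read-back
  (get-string-all (open-string-input-port "no file system involved")))
```

Here written is "hello, sandbox" and read-back is the original string;
neither step opens a file or touches the current input, output, or error
ports.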

One question: is there a particular reason that guard is not exported?
It doesn't seem like it is as nasty as dynamic-wind with trying to
terminate, though maybe I am just not seeing how it could be used to
prevent the sandbox terminating the process. Having at least one
exception handling binding might be very helpful in a sandbox.
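
For reference, this is the sort of use of guard I have in mind, as a
small sketch with (rnrs exceptions).  The helper try and the raised
symbol oops are made up for the example, and it says nothing about how
guard would interact with the sandbox's own cancellation.

```scheme
(use-modules (rnrs exceptions))

;; Run THUNK; if it raises a symbol, return (caught <symbol>) instead of
;; letting the exception propagate.  Other objects are re-raised.
(define (try thunk)
  (guard (e ((symbol? e) (list 'caught e)))
    (thunk)))

(define ok (try (lambda () (+ 1 2))))
(define failed (try (lambda () (raise 'oops))))
```

Here ok is 3 and failed is (caught oops).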



Freja Nordsiek

On Fri, Mar 31, 2017 at 11:27 AM, Andy Wingo <wingo@pobox.com> wrote:
> Hi,
>
> Attached is a module that can evaluate an expression within a sandbox.
> If the evaluation takes too long or allocates too much, it will be
> cancelled.  The evaluation will take place with respect to a module with
> a "safe" set of imports.  Those imports include most of the bindings
> available in a default Guile environment.  See the file below for full
> details and a number of caveats.
>
> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.
>
> Andy
>
>
> ;;; Sandboxed evaluation of Scheme code
>
> ;;; Copyright (C) 2017 Free Software Foundation, Inc.
>
> ;;;; This library is free software; you can redistribute it and/or
> ;;;; modify it under the terms of the GNU Lesser General Public
> ;;;; License as published by the Free Software Foundation; either
> ;;;; version 3 of the License, or (at your option) any later version.
> ;;;;
> ;;;; This library is distributed in the hope that it will be useful,
> ;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> ;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> ;;;; Lesser General Public License for more details.
> ;;;;
> ;;;; You should have received a copy of the GNU Lesser General Public
> ;;;; License along with this library; if not, write to the Free Software
> ;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>
> ;;; Commentary:
> ;;;
> ;;; Code:
>
> (define-module (ice-9 sandbox)
>   #:use-module (ice-9 control)
>   #:use-module (ice-9 match)
>   #:use-module (system vm vm)
>   #:export (call-with-time-limit
>             call-with-allocation-limit
>             call-with-time-and-allocation-limits
>
>             eval-in-sandbox
>             make-sandbox-module
>
>             *alist-bindings*
>             *array-bindings*
>             *bit-bindings*
>             *bitvector-bindings*
>             *char-bindings*
>             *char-set-bindings*
>             *clock-bindings*
>             *core-bindings*
>             *error-bindings*
>             *fluid-bindings*
>             *hash-bindings*
>             *iteration-bindings*
>             *keyword-bindings*
>             *list-bindings*
>             *macro-bindings*
>             *nil-bindings*
>             *number-bindings*
>             *pair-bindings*
>             *predicate-bindings*
>             *procedure-bindings*
>             *promise-bindings*
>             *prompt-bindings*
>             *regexp-bindings*
>             *sort-bindings*
>             *srfi-4-bindings*
>             *string-bindings*
>             *symbol-bindings*
>             *unspecified-bindings*
>             *variable-bindings*
>             *vector-bindings*
>             *version-bindings*
>
>             *mutating-alist-bindings*
>             *mutating-array-bindings*
>             *mutating-bitvector-bindings*
>             *mutating-fluid-bindings*
>             *mutating-hash-bindings*
>             *mutating-list-bindings*
>             *mutating-pair-bindings*
>             *mutating-sort-bindings*
>             *mutating-srfi-4-bindings*
>             *mutating-string-bindings*
>             *mutating-variable-bindings*
>             *mutating-vector-bindings*
>
>             *all-pure-bindings*
>             *all-pure-and-impure-bindings*))
>
>
> (define (call-with-time-limit limit thunk limit-reached)
>   "Call @var{thunk}, but cancel it if @var{limit} seconds of wall-clock
> time have elapsed.  If the computation is cancelled, call
> @var{limit-reached} in tail position.  @var{thunk} must not disable
> interrupts or prevent an abort via a @code{dynamic-wind} unwind
> handler."
>   ;; FIXME: use separate thread instead of sigalrm.
>   (let ((limit-usecs (inexact->exact (round (* limit 1e6))))
>         (prev-sigalarm-handler #f)
>         (tag (make-prompt-tag)))
>     (call-with-prompt tag
>       (lambda ()
>         (dynamic-wind
>           (lambda ()
>             (set! prev-sigalarm-handler
>               (sigaction SIGALRM (lambda (sig) (abort-to-prompt tag))))
>             (setitimer ITIMER_REAL 0 0 0 limit-usecs))
>           thunk
>           (lambda ()
>             (setitimer ITIMER_REAL 0 0 0 0)
>             (match prev-sigalarm-handler
>               ((handler . flags)
>                (sigaction SIGALRM handler flags))))))
>       (lambda (k)
>         (limit-reached)))))
>
> (define (call-with-allocation-limit limit thunk limit-reached)
>   "Call @var{thunk}, but cancel it if @var{limit} bytes have been
> allocated.  If the computation is cancelled, call @var{limit-reached} in
> tail position.  @var{thunk} must not disable interrupts or prevent an
> abort via a @code{dynamic-wind} unwind handler.
>
> This limit applies to both stack and heap allocation.  The computation
> will not be aborted before @var{limit} bytes have been allocated, but
> for the heap allocation limit, the check may be postponed until the next garbage collection."
>   (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
>   (let ((zero (bytes-allocated))
>         (tag (make-prompt-tag)))
>     (define (check-allocation)
>       (when (< limit (- (bytes-allocated) zero))
>         (abort-to-prompt tag)))
>     (call-with-prompt tag
>       (lambda ()
>         (dynamic-wind
>           (lambda ()
>             (add-hook! after-gc-hook check-allocation))
>           (lambda ()
>             (call-with-stack-overflow-handler
>              ;; The limit is in "words", which used to be 4 or 8 but now
>              ;; is always 8 bytes.
>              (floor/ limit 8)
>              thunk
>              (lambda () (abort-to-prompt tag))))
>           (lambda ()
>             (remove-hook! after-gc-hook check-allocation))))
>       (lambda (k)
>         (limit-reached)))))
>
> (define (call-with-time-and-allocation-limits time-limit allocation-limit
>                                               thunk)
>   "Invoke @var{thunk} in a dynamic extent in which its execution is
> limited to @var{time-limit} seconds of wall-clock time, and its
> allocation to @var{allocation-limit} bytes.  @var{thunk} must not
> disable interrupts or prevent an abort via a @code{dynamic-wind} unwind
> handler.
>
> If successful, return all values produced by invoking @var{thunk}.  Any
> uncaught exception thrown by the thunk will propagate out.  If the time
> or allocation limit is exceeded, an exception will be thrown to the
> @code{limit-exceeded} key."
>
>   (call-with-time-limit
>    time-limit
>    (lambda ()
>      (call-with-allocation-limit
>       allocation-limit
>       thunk
>       (lambda ()
>         (scm-error 'limit-exceeded "with-resource-limits"
>                    "Allocation limit exceeded" '() #f))))
>    (lambda ()
>      (scm-error 'limit-exceeded "with-resource-limits"
>                 "Time limit exceeded" '() #f))))
>
> (define (sever-module! m)
>   "Remove @var{m} from its container module."
>   (match (module-name m)
>     ((head ... tail)
>      (let ((parent (resolve-module head #f)))
>        (unless (eq? m (module-ref-submodule parent tail))
>          (error "can't sever module?"))
>        (hashq-remove! (module-submodules parent) tail)))))
>
> ;; bindings := module-binding-list ...
> ;; module-binding-list := interface-name import ...
> ;; import := name | (exported-name . imported-name)
> ;; name := symbol
> (define (make-sandbox-module bindings)
>   "Return a fresh module that only contains @var{bindings}.
>
> The @var{bindings} should be given as a list of import sets.  One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports.  An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively."
>   (let ((m (make-fresh-user-module)))
>     (purify-module! m)
>     ;; FIXME: We want to have a module that will be collectable by GC.
>     ;; Currently in Guile all modules are part of a single tree, and
>     ;; once a module is part of that tree it will never be collected.
>     ;; So we want to sever the module off from that tree.  However the
>     ;; psyntax syntax expander currently needs to be able to look up
>     ;; modules by name; being severed from the name tree prevents that
>     ;; from happening.  So for now, each evaluation leaks memory :/
>     ;;
>     ;; (sever-module! m)
>     (module-use-interfaces! m
>                             (map (match-lambda
>                                    ((mod-name . bindings)
>                                     (resolve-interface mod-name
>                                                        #:select bindings)))
>                                  bindings))
>     m))
>
> (define* (eval-in-sandbox exp #:key
>                           (time-limit 0.1)
>                           (allocation-limit #e10e6)
>                           (bindings *all-pure-bindings*)
>                           (module (make-sandbox-module bindings)))
>   "Evaluate the Scheme expression @var{exp} within an isolated
> \"sandbox\".  Limit its execution to @var{time-limit} seconds of
> wall-clock time, and limit its allocation to @var{allocation-limit}
> bytes.
>
> The evaluation will occur in @var{module}, which defaults to the result
> of calling @code{make-sandbox-module} on @var{bindings}, which itself
> defaults to @code{*all-pure-bindings*}.  This is the core of the
> sandbox: creating a scope for the expression that is @dfn{safe}.
>
> A safe sandbox module has two characteristics.  Firstly, it will not
> allow the expression being evaluated to avoid being cancelled due to
> time or allocation limits.  This ensures that the expression terminates
> in a timely fashion.
>
> Secondly, a safe sandbox module will prevent the evaluation from
> receiving information from previous evaluations, or from affecting
> future evaluations.  All combinations of binding sets exported by
> @code{(ice-9 sandbox)} form safe sandbox modules.
>
> The @var{bindings} should be given as a list of import sets.  One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports.  An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively.  Note that @var{bindings} is only used as an input to the
> default initializer for the @var{module} argument; if you pass
> @code{#:module}, @var{bindings} is unused.
>
> If successful, return all values produced by @var{exp}.  Any uncaught
> exception thrown by the expression will propagate out.  If the time or
> allocation limit is exceeded, an exception will be thrown to the
> @code{limit-exceeded} key."
>   (call-with-time-and-allocation-limits
>    time-limit allocation-limit
>    (lambda ()
>      ;; Prevent the expression from forging syntax objects.  See "Syntax
>      ;; Transformer Helpers" in the manual.
>      (parameterize ((allow-legacy-syntax-objects? #f))
>        (eval exp module)))))
>
>
> ;; An evaluation-sandboxing facility is safe if:
> ;;
> ;;  (1) every evaluation will terminate in a timely manner
> ;;
> ;;  (2) no evaluation can affect future evaluations
> ;;
> ;; For (1), we impose a user-controllable time limit on the evaluation,
> ;; in wall-clock time.  When that limit is reached, Guile schedules an
> ;; asynchronous interrupt in the sandbox that aborts the computation.
> ;; For this to work, the sandboxed evaluation must not disable
> ;; interrupts, and it must not prevent timely aborts via malicious "out"
> ;; guards in dynamic-wind thunks.
> ;;
> ;; The sandbox also has an allocation limit that uses a similar cancel
> ;; mechanism, but this limit is less precise as it only runs at
> ;; garbage-collection time.
> ;;
> ;; The sandbox sets the allocation limit as the stack limit as well.
> ;;
> ;; For (2), the only way an evaluation can affect future evaluations is
> ;; if it causes a side-effect outside its sandbox.  That side effect
> ;; could change the way the host or future sandboxed evaluations
> ;; operate, or it could leak information to future evaluations.
> ;;
> ;; One means of information leakage would be the file system.  Although
> ;; one can imagine "safe" ways to access a file system, in practice we
> ;; just prevent all access to this and other operating system facilities
> ;; by not exposing the Guile primitives that access the file system,
> ;; connect to networking hosts, etc.  If we chose our set of bindings
> ;; correctly and it is impossible to access host values other than those
> ;; given to the evaluation, then we have succeeded in granting only a
> ;; limited set of capabilities to the guest.
> ;;
> ;; To prevent information leakage we also limit other information about
> ;; the host, like its hostname or the Guile build information.
> ;;
> ;; The guest must also not have the capability to mutate a location used
> ;; by the host or by future sandboxed evaluations.  Either you expose no
> ;; primitives to the evaluation that can mutate locations, or you expose
> ;; no mutable locations.  In this sandbox we opt for a combination of
> ;; the two, though the selection of bindings is up to the user.  "set!"
> ;; is always excluded, as Guile doesn't have a nice way to prevent set!
> ;; on imported bindings.  But variable-set! is included, as no set of
> ;; bindings from this module includes a variable or a capability to a
> ;; variable.  It's possible though to build sandbox modules with no
> ;; mutating primitives.  As far as we know, all possible combinations of
> ;; the binding sets listed below are safe.
> ;;
> (define *core-bindings*
>   '(((guile)
>      and
>      begin
>      apply
>      call-with-values
>      values
>      case
>      case-lambda
>      case-lambda*
>      cond
>      define
>      define*
>      define-values
>      do
>      if
>      lambda
>      lambda*
>      let
>      let*
>      letrec
>      letrec*
>      or
>      quasiquote
>      quote
>      ;; Can't allow mutation to globals.
>      ;; set!
>      unless
>      unquote
>      unquote-splicing
>      when
>      while
>      λ)))
>
> (define *macro-bindings*
>   '(((guile)
>      bound-identifier=?
>      ;; Although these have "current" in their name, they are lexically
>      ;; scoped, not dynamically scoped.
>      current-filename
>      current-source-location
>      datum->syntax
>      define-macro
>      define-syntax
>      define-syntax-parameter
>      define-syntax-rule
>      defmacro
>      free-identifier=?
>      generate-temporaries
>      gensym
>      identifier-syntax
>      identifier?
>      let-syntax
>      letrec-syntax
>      macroexpand
>      macroexpanded?
>      quasisyntax
>      start-stack
>      syntax
>      syntax->datum
>      syntax-case
>      syntax-error
>      syntax-parameterize
>      syntax-rules
>      syntax-source
>      syntax-violation
>      unsyntax
>      unsyntax-splicing
>      with-ellipsis
>      with-syntax
>      make-variable-transformer)))
>
> (define *iteration-bindings*
>   '(((guile)
>      compose
>      for-each
>      identity
>      iota
>      map
>      map-in-order
>      const
>      noop)))
>
> (define *clock-bindings*
>   '(((guile)
>      get-internal-real-time
>      internal-time-units-per-second
>      sleep
>      usleep)))
>
> (define *procedure-bindings*
>   '(((guile)
>      procedure-documentation
>      procedure-minimum-arity
>      procedure-name
>      procedure?
>      thunk?)))
>
> (define *version-bindings*
>   '(((guile)
>      effective-version
>      major-version
>      micro-version
>      minor-version
>      version
>      version-matches?)))
>
> (define *nil-bindings*
>   '(((guile)
>      nil?)))
>
> (define *unspecified-bindings*
>   '(((guile)
>      unspecified?
>      *unspecified*)))
>
> (define *predicate-bindings*
>   '(((guile)
>      ->bool
>      and-map
>      and=>
>      boolean?
>      eq?
>      equal?
>      eqv?
>      negate
>      not
>      or-map)))
>
> ;; The current ports (current-input-port et al) are dynamically scoped,
> ;; which is a footgun from a sandboxing perspective.  It's too easy for
> ;; a procedure that is the result of a sandboxed evaluation to be later
> ;; invoked in a different context and thereby be implicitly granted
> ;; capabilities to whatever port is then current.  This is compounded by
> ;; the fact that most Scheme i/o primitives allow the port to be omitted
> ;; and thereby default to whatever's current.  For now, sadly, we avoid
> ;; exposing any i/o primitive to the sandbox.
> #;
> (define *i/o-bindings*
>   '(((guile)
>      display
>      eof-object?
>      force-output
>      format
>      make-soft-port
>      newline
>      read
>      simple-format
>      write
>      write-char)
>     ((ice-9 ports)
>      %make-void-port
>      char-ready?
>      ;; Note that these are mutable parameters.
>      current-error-port
>      current-input-port
>      current-output-port
>      current-warning-port
>      drain-input
>      eof-object?
>      file-position
>      force-output
>      ftell
>      input-port?
>      output-port?
>      peek-char
>      port-closed?
>      port-column
>      port-conversion-strategy
>      port-encoding
>      port-filename
>      port-line
>      port-mode
>      port?
>      read-char
>      the-eof-object
>      ;; We don't provide open-output-string because it needs
>      ;; get-output-string, and get-output-string provides a generic
>      ;; capability on any output string port.  For consistency then we
>      ;; don't provide open-input-string either; call-with-input-string
>      ;; is sufficient.
>      call-with-input-string
>      call-with-output-string
>      with-error-to-port
>      with-error-to-string
>      with-input-from-port
>      with-input-from-string
>      with-output-to-port
>      with-output-to-string)))
>
> ;; If two evaluations are called with the same input port, unread-char
> ;; and unread-string can use a port as a mutable channel to pass
> ;; information from one to the other.
> #;
> (define *mutating-i/o-bindings*
>   '(((guile)
>      set-port-encoding!)
>     ((ice-9 ports)
>      close-input-port
>      close-output-port
>      close-port
>      file-set-position
>      seek
>      set-port-column!
>      set-port-conversion-strategy!
>      set-port-encoding!
>      set-port-filename!
>      set-port-line!
>      setvbuf
>      unread-char
>      unread-string)))
>
> (define *error-bindings*
>   '(((guile)
>      error
>      throw
>      with-throw-handler
>      catch
>      ;; false-if-exception can cause i/o if the #:warning arg is passed.
>      ;; false-if-exception
>
>      ;; See notes on *i/o-bindings*.
>      ;; peek
>      ;; pk
>      ;; print-exception
>      ;; warn
>      strerror
>      scm-error
>      )))
>
> ;; FIXME: Currently we can't expose anything that works on the current
> ;; module to the sandbox.  It could be that the sandboxed evaluation
> ;; returns a procedure, and that procedure may later be invoked in a
> ;; different context with a different current-module and it is unlikely
> ;; that the later caller will consider themselves as granting a
> ;; capability on whatever module is then current.  Likewise export (and
> ;; by extension, define-public and the like) also operate on the current
> ;; module.
> ;;
> ;; It could be that we could expose a statically scoped eval to the
> ;; sandbox.
> #;
> (define *eval-bindings*
>   '(((guile)
>      current-module
>      module-name
>      module?
>      define-once
>      define-private
>      define-public
>      defined?
>      export
>      defmacro-public
>      ;; FIXME: single-arg eval?
>      eval
>      primitive-eval
>      eval-string
>      self-evaluating?
>      ;; Can we?
>      set-current-module)))
>
> (define *sort-bindings*
>   '(((guile)
>      sort
>      sorted?
>      stable-sort
>      sort-list)))
>
> ;; These can only form part of a safe binding set if no mutable pair or
> ;; vector is exposed to the sandbox.
> (define *mutating-sort-bindings*
>   '(((guile)
>      sort!
>      stable-sort!
>      sort-list!
>      restricted-vector-sort!)))
>
> (define *regexp-bindings*
>   '(((guile)
>      make-regexp
>      regexp-exec
>      regexp/basic
>      regexp/extended
>      regexp/icase
>      regexp/newline
>      regexp/notbol
>      regexp/noteol
>      regexp?)))
>
> (define *alist-bindings*
>   '(((guile)
>      acons
>      assoc
>      assoc-ref
>      assq
>      assq-ref
>      assv
>      assv-ref
>      sloppy-assoc
>      sloppy-assq
>      sloppy-assv)))
>
> ;; These can only form part of a safe binding set if no mutable pair
> ;; is exposed to the sandbox.
> (define *mutating-alist-bindings*
>   '(((guile)
>      assoc-remove!
>      assoc-set!
>      assq-remove!
>      assq-set!
>      assv-remove!
>      assv-set!)))
>
> (define *number-bindings*
>   '(((guile)
>      *
>      +
>      -
>      /
>      1+
>      1-
>      <
>      <=
>      =
>      >
>      >=
>      abs
>      acos
>      acosh
>      angle
>      asin
>      asinh
>      atan
>      atanh
>      ceiling
>      ceiling-quotient
>      ceiling-remainder
>      ceiling/
>      centered-quotient
>      centered-remainder
>      centered/
>      complex?
>      cos
>      cosh
>      denominator
>      euclidean-quotient
>      euclidean-remainder
>      euclidean/
>      even?
>      exact->inexact
>      exact-integer-sqrt
>      exact-integer?
>      exact?
>      exp
>      expt
>      finite?
>      floor
>      floor-quotient
>      floor-remainder
>      floor/
>      gcd
>      imag-part
>      inf
>      inf?
>      integer-expt
>      integer-length
>      integer?
>      lcm
>      log
>      log10
>      magnitude
>      make-polar
>      make-rectangular
>      max
>      min
>      modulo
>      modulo-expt
>      most-negative-fixnum
>      most-positive-fixnum
>      nan
>      nan?
>      negative?
>      numerator
>      odd?
>      positive?
>      quotient
>      rational?
>      rationalize
>      real-part
>      real?
>      remainder
>      round
>      round-quotient
>      round-remainder
>      round/
>      sin
>      sinh
>      sqrt
>      tan
>      tanh
>      truncate
>      truncate-quotient
>      truncate-remainder
>      truncate/
>      zero?
>      number?
>      number->string
>      string->number)))
>
> (define *char-set-bindings*
>   '(((guile)
>      ->char-set
>      char-set
>      char-set->list
>      char-set->string
>      char-set-adjoin
>      char-set-any
>      char-set-complement
>      char-set-contains?
>      char-set-copy
>      char-set-count
>      char-set-cursor
>      char-set-cursor-next
>      char-set-delete
>      char-set-diff+intersection
>      char-set-difference
>      char-set-every
>      char-set-filter
>      char-set-fold
>      char-set-for-each
>      char-set-hash
>      char-set-intersection
>      char-set-map
>      char-set-ref
>      char-set-size
>      char-set-unfold
>      char-set-union
>      char-set-xor
>      char-set:ascii
>      char-set:blank
>      char-set:designated
>      char-set:digit
>      char-set:empty
>      char-set:full
>      char-set:graphic
>      char-set:hex-digit
>      char-set:iso-control
>      char-set:letter
>      char-set:letter+digit
>      char-set:lower-case
>      char-set:printing
>      char-set:punctuation
>      char-set:symbol
>      char-set:title-case
>      char-set:upper-case
>      char-set:whitespace
>      char-set<=
>      char-set=
>      char-set?
>      end-of-char-set?
>      list->char-set
>      string->char-set
>      ucs-range->char-set)))
>
> ;; These can only form part of a safe binding set if no mutable char-set
> ;; is exposed to the sandbox.  Unfortunately all charsets in Guile are
> ;; mutable, currently, including the built-in charsets, so we can't
> ;; expose these primitives.
> #;
> (define *mutating-char-set-bindings*
>   '(((guile)
>      char-set-adjoin!
>      char-set-complement!
>      char-set-delete!
>      char-set-diff+intersection!
>      char-set-difference!
>      char-set-filter!
>      char-set-intersection!
>      char-set-unfold!
>      char-set-union!
>      char-set-xor!
>      list->char-set!
>      string->char-set!
>      ucs-range->char-set!)))
>
> (define *array-bindings*
>   '(((guile)
>      array->list
>      array-cell-ref
>      array-contents
>      array-dimensions
>      array-equal?
>      array-for-each
>      array-in-bounds?
>      array-length
>      array-rank
>      array-ref
>      array-shape
>      array-slice
>      array-slice-for-each
>      array-slice-for-each-in-order
>      array-type
>      array-type-code
>      array?
>      list->array
>      list->typed-array
>      make-array
>      make-shared-array
>      make-typed-array
>      shared-array-increments
>      shared-array-offset
>      shared-array-root
>      transpose-array
>      typed-array?)))
>
> ;; These can only form part of a safe binding set if no mutable vector,
> ;; bitvector, bytevector, srfi-4 vector, or array is exposed to the
> ;; sandbox.
> (define *mutating-array-bindings*
>   '(((guile)
>      array-cell-set!
>      array-copy!
>      array-copy-in-order!
>      array-fill!
>      array-index-map!
>      array-map!
>      array-map-in-order!
>      array-set!)))
>
> (define *hash-bindings*
>   '(((guile)
>      doubly-weak-hash-table?
>      hash
>      hash-count
>      hash-fold
>      hash-for-each
>      hash-for-each-handle
>      hash-get-handle
>      hash-map->list
>      hash-ref
>      hash-table?
>      hashq
>      hashq-get-handle
>      hashq-ref
>      hashv
>      hashv-get-handle
>      hashv-ref
>      hashx-get-handle
>      hashx-ref
>      make-doubly-weak-hash-table
>      make-hash-table
>      make-weak-key-hash-table
>      make-weak-value-hash-table
>      weak-key-hash-table?
>      weak-value-hash-table?)))
>
> ;; These can only form part of a safe binding set if no hash table is
> ;; exposed to the sandbox.
> (define *mutating-hash-bindings*
>   '(((guile)
>      hash-clear!
>      hash-create-handle!
>      hash-remove!
>      hash-set!
>      hashq-create-handle!
>      hashq-remove!
>      hashq-set!
>      hashv-create-handle!
>      hashv-remove!
>      hashv-set!
>      hashx-create-handle!
>      hashx-remove!
>      hashx-set!)))
>
> (define *variable-bindings*
>   '(((guile)
>      make-undefined-variable
>      make-variable
>      variable-bound?
>      variable-ref
>      variable?)))
>
> ;; These can only form part of a safe binding set if no mutable variable
> ;; is exposed to the sandbox; this applies particularly to variables
> ;; that are module bindings.
> (define *mutating-variable-bindings*
>   '(((guile)
>      variable-set!
>      variable-unset!)))
>
> (define *string-bindings*
>   '(((guile)
>      absolute-file-name?
>      file-name-separator-string
>      file-name-separator?
>      in-vicinity
>      basename
>      dirname
>
>      list->string
>      make-string
>      object->string
>      reverse-list->string
>      string
>      string->list
>      string-any
>      string-any-c-code
>      string-append
>      string-append/shared
>      string-capitalize
>      string-ci<
>      string-ci<=
>      string-ci<=?
>      string-ci<>
>      string-ci<?
>      string-ci=
>      string-ci=?
>      string-ci>
>      string-ci>=
>      string-ci>=?
>      string-ci>?
>      string-compare
>      string-compare-ci
>      string-concatenate
>      string-concatenate-reverse
>      string-concatenate-reverse/shared
>      string-concatenate/shared
>      string-contains
>      string-contains-ci
>      string-copy
>      string-count
>      string-delete
>      string-downcase
>      string-drop
>      string-drop-right
>      string-every
>      string-every-c-code
>      string-filter
>      string-fold
>      string-fold-right
>      string-for-each
>      string-for-each-index
>      string-hash
>      string-hash-ci
>      string-index
>      string-index-right
>      string-join
>      string-length
>      string-map
>      string-normalize-nfc
>      string-normalize-nfd
>      string-normalize-nfkc
>      string-normalize-nfkd
>      string-null?
>      string-pad
>      string-pad-right
>      string-prefix-ci?
>      string-prefix-length
>      string-prefix-length-ci
>      string-prefix?
>      string-ref
>      string-replace
>      string-reverse
>      string-rindex
>      string-skip
>      string-skip-right
>      string-split
>      string-suffix-ci?
>      string-suffix-length
>      string-suffix-length-ci
>      string-suffix?
>      string-tabulate
>      string-take
>      string-take-right
>      string-titlecase
>      string-tokenize
>      string-trim
>      string-trim-both
>      string-trim-right
>      string-unfold
>      string-unfold-right
>      string-upcase
>      string-utf8-length
>      string<
>      string<=
>      string<=?
>      string<>
>      string<?
>      string=
>      string=?
>      string>
>      string>=
>      string>=?
>      string>?
>      string?
>      substring
>      substring/copy
>      substring/read-only
>      substring/shared
>      xsubstring)))
>
> ;; These can only form part of a safe binding set if no mutable string
> ;; is exposed to the sandbox.
> (define *mutating-string-bindings*
>   '(((guile)
>      string-capitalize!
>      string-copy!
>      string-downcase!
>      string-fill!
>      string-map!
>      string-reverse!
>      string-set!
>      string-titlecase!
>      string-upcase!
>      string-xcopy!
>      substring-fill!
>      substring-move!)))
>
> (define *symbol-bindings*
>   '(((guile)
>      string->symbol
>      string-ci->symbol
>      symbol->string
>      list->symbol
>      make-symbol
>      symbol
>      symbol-append
>      symbol-hash
>      symbol-interned?
>      symbol?)))
>
> (define *keyword-bindings*
>   '(((guile)
>      keyword?
>      keyword->symbol
>      symbol->keyword)))
>
> ;; These can only form part of a safe binding set if no valid prompt tag
> ;; is ever exposed to the sandbox, or can be constructed by the sandbox.
> (define *prompt-bindings*
>   '(((guile)
>      abort-to-prompt
>      abort-to-prompt*
>      call-with-prompt
>      make-prompt-tag)))
>
> (define *bit-bindings*
>   '(((guile)
>      ash
>      round-ash
>      logand
>      logcount
>      logior
>      lognot
>      logtest
>      logxor
>      logbit?)))
>
> (define *bitvector-bindings*
>   '(((guile)
>      bit-count
>      bit-count*
>      bit-extract
>      bit-position
>      bitvector
>      bitvector->list
>      bitvector-length
>      bitvector-ref
>      bitvector?
>      list->bitvector
>      make-bitvector)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; bitvector is exposed to the sandbox.
> (define *mutating-bitvector-bindings*
>   '(((guile)
>      bit-invert!
>      bit-set*!
>      bitvector-fill!
>      bitvector-set!)))
>
> (define *fluid-bindings*
>   '(((guile)
>      fluid-bound?
>      fluid-ref
>      ;; fluid-ref* could escape the sandbox and is not allowed.
>      fluid-thread-local?
>      fluid?
>      make-fluid
>      make-thread-local-fluid
>      make-unbound-fluid
>      with-fluid*
>      with-fluids
>      with-fluids*
>      make-parameter
>      parameter?
>      parameterize)))
>
> ;; These can only form part of a safe binding set if no fluid is
> ;; directly exposed to the sandbox.
> (define *mutating-fluid-bindings*
>   '(((guile)
>      fluid-set!
>      fluid-unset!
>      fluid->parameter)))
>
> (define *char-bindings*
>   '(((guile)
>      char-alphabetic?
>      char-ci<=?
>      char-ci<?
>      char-ci=?
>      char-ci>=?
>      char-ci>?
>      char-downcase
>      char-general-category
>      char-is-both?
>      char-lower-case?
>      char-numeric?
>      char-titlecase
>      char-upcase
>      char-upper-case?
>      char-whitespace?
>      char<=?
>      char<?
>      char=?
>      char>=?
>      char>?
>      char?
>      char->integer
>      integer->char)))
>
> (define *list-bindings*
>   '(((guile)
>      list
>      list-cdr-ref
>      list-copy
>      list-head
>      list-index
>      list-ref
>      list-tail
>      list?
>      null?
>      make-list
>      append
>      delete
>      delq
>      delv
>      filter
>      length
>      member
>      memq
>      memv
>      merge
>      reverse)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-list-bindings*
>   '(((guile)
>      list-cdr-set!
>      list-set!
>      append!
>      delete!
>      delete1!
>      delq!
>      delq1!
>      delv!
>      delv1!
>      filter!
>      merge!
>      reverse!)))
>
> (define *pair-bindings*
>   '(((guile)
>      last-pair
>      pair?
>      caaaar
>      caaadr
>      caaar
>      caadar
>      caaddr
>      caadr
>      caar
>      cadaar
>      cadadr
>      cadar
>      caddar
>      cadddr
>      caddr
>      cadr
>      car
>      cdaaar
>      cdaadr
>      cdaar
>      cdadar
>      cdaddr
>      cdadr
>      cdar
>      cddaar
>      cddadr
>      cddar
>      cdddar
>      cddddr
>      cdddr
>      cddr
>      cdr
>      cons
>      cons*)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-pair-bindings*
>   '(((guile)
>      set-car!
>      set-cdr!)))
>
> (define *vector-bindings*
>   '(((guile)
>      list->vector
>      make-vector
>      vector
>      vector->list
>      vector-copy
>      vector-length
>      vector-ref
>      vector?)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; vector is exposed to the sandbox.
> (define *mutating-vector-bindings*
>   '(((guile)
>      vector-fill!
>      vector-move-left!
>      vector-move-right!
>      vector-set!)))
>
> (define *promise-bindings*
>   '(((guile)
>      force
>      delay
>      make-promise
>      promise?)))
>
> (define *srfi-4-bindings*
>   '(((srfi srfi-4)
>      f32vector
>      f32vector->list
>      f32vector-length
>      f32vector-ref
>      f32vector?
>      f64vector
>      f64vector->list
>      f64vector-length
>      f64vector-ref
>      f64vector?
>      list->f32vector
>      list->f64vector
>      list->s16vector
>      list->s32vector
>      list->s64vector
>      list->s8vector
>      list->u16vector
>      list->u32vector
>      list->u64vector
>      list->u8vector
>      make-f32vector
>      make-f64vector
>      make-s16vector
>      make-s32vector
>      make-s64vector
>      make-s8vector
>      make-u16vector
>      make-u32vector
>      make-u64vector
>      make-u8vector
>      s16vector
>      s16vector->list
>      s16vector-length
>      s16vector-ref
>      s16vector?
>      s32vector
>      s32vector->list
>      s32vector-length
>      s32vector-ref
>      s32vector?
>      s64vector
>      s64vector->list
>      s64vector-length
>      s64vector-ref
>      s64vector?
>      s8vector
>      s8vector->list
>      s8vector-length
>      s8vector-ref
>      s8vector?
>      u16vector
>      u16vector->list
>      u16vector-length
>      u16vector-ref
>      u16vector?
>      u32vector
>      u32vector->list
>      u32vector-length
>      u32vector-ref
>      u32vector?
>      u64vector
>      u64vector->list
>      u64vector-length
>      u64vector-ref
>      u64vector?
>      u8vector
>      u8vector->list
>      u8vector-length
>      u8vector-ref
>      u8vector?)))
>
> ;; These can only form part of a safe binding set if no mutable
> ;; bytevector is exposed to the sandbox.
> (define *mutating-srfi-4-bindings*
>   '(((srfi srfi-4)
>      f32vector-set!
>      f64vector-set!
>      s16vector-set!
>      s32vector-set!
>      s64vector-set!
>      s8vector-set!
>      u16vector-set!
>      u32vector-set!
>      u64vector-set!
>      u8vector-set!)))
>
> (define *all-pure-bindings*
>   (append *alist-bindings*
>           *array-bindings*
>           *bit-bindings*
>           *bitvector-bindings*
>           *char-bindings*
>           *char-set-bindings*
>           *clock-bindings*
>           *core-bindings*
>           *error-bindings*
>           *fluid-bindings*
>           *hash-bindings*
>           *iteration-bindings*
>           *keyword-bindings*
>           *list-bindings*
>           *macro-bindings*
>           *nil-bindings*
>           *number-bindings*
>           *pair-bindings*
>           *predicate-bindings*
>           *procedure-bindings*
>           *promise-bindings*
>           *prompt-bindings*
>           *regexp-bindings*
>           *sort-bindings*
>           *srfi-4-bindings*
>           *string-bindings*
>           *symbol-bindings*
>           *unspecified-bindings*
>           *variable-bindings*
>           *vector-bindings*
>           *version-bindings*))
>
>
> (define *all-pure-and-impure-bindings*
>   (append *all-pure-bindings*
>           *mutating-alist-bindings*
>           *mutating-array-bindings*
>           *mutating-bitvector-bindings*
>           *mutating-fluid-bindings*
>           *mutating-hash-bindings*
>           *mutating-list-bindings*
>           *mutating-pair-bindings*
>           *mutating-sort-bindings*
>           *mutating-srfi-4-bindings*
>           *mutating-string-bindings*
>           *mutating-variable-bindings*
>           *mutating-vector-bindings*))
>
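
To make the import-set format from the docstring concrete, here is a
minimal sketch.  It assumes an (ice-9 sandbox) module is on the load
path and that eval-in-sandbox accepts the #:bindings keyword described
in the docstring; the name arithmetic-only and the renamed binding
minus are hypothetical, chosen only for illustration.

```scheme
(use-modules (ice-9 sandbox))

;; One import set: the car names the interface, the cdr lists imports.
;; A bare symbol is imported under its own name; a pair (out . in)
;; imports the binding exported as OUT under the name IN.
(define arithmetic-only
  '(((guile) + * (- . minus))))

;; Evaluate an expression that can see only those three bindings.
(define result
  (eval-in-sandbox '(minus (* 6 7) (+ 1 1))
                   #:bindings arithmetic-only))
```

Because #:bindings only feeds the default value of #:module, passing
an explicit #:module would make this list unused, as the docstring
notes.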




* Re: RFC: (ice-9 sandbox)
  2017-04-03 15:35         ` Ludovic Courtès
@ 2017-04-14 10:52           ` Andy Wingo
  2017-04-14 12:17             ` tomas
  2017-04-14 12:32             ` Ludovic Courtès
  0 siblings, 2 replies; 17+ messages in thread
From: Andy Wingo @ 2017-04-14 10:52 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Mon 03 Apr 2017 17:35, ludo@gnu.org (Ludovic Courtès) writes:

> Riastradh’s document at <http://mumble.net/~campbell/scheme/style.txt>
> has this:
>
>   Affix asterisks to the beginning and end of a globally mutable
>   variable.  This allows the reader of the program to recognize very
>   easily that it is badly written!
>
> … but it doesn’t say anything about constants nor about %.
>
> It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
> ‘%all-pure-bindings’.  So, dunno, as you see fit!

I feel like I would have less of a need for name sigils like *earmuffs*
or %preficentiles if we had more reliably immutable data.

Right now one of the functions of these sigils is to tell the reader,
"Don't use append! on this data structure or you will cause spooky
action-at-a-distance!"
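
A small sketch of the hazard described here, using only core Guile
procedures.  The variable names are made up for the example, and the
lists are built with `list' rather than quoted so that no literal is
mutated.

```scheme
;; Two names for the same list structure.
(define shared (list 'a 'b))
(define other-view shared)

;; `append' copies its first argument, so `shared' is left alone.
(define copied (append shared (list 'x)))

;; `append!' splices the new tail onto the last pair of `shared',
;; so the change is also visible through `other-view'.
(define extended (append! shared (list 'c)))
```

After this runs, other-view also ends in c even though nothing in the
code assigned to it, which is the kind of surprise the sigils are
meant to warn about.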

It sure would be nice to be able to use these values without worries of
this kind.  We don't have this immutability problem with strings because
our compiled string literals are marked as immutable, and string
mutators assert that the strings are mutable.  We should do the same for
all literal constants.
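
As a rough illustration of the string case, the sketch below mutates a
fresh copy and then attempts the same mutation on the literal.  What
the second attempt does depends on whether the literal was marked
read-only (the compiled case described above), so the code records
the outcome rather than assuming it; the names are illustrative only.

```scheme
(define greeting "hello")              ; a string literal

;; `string-copy' returns a fresh, mutable string.
(define copy (string-copy greeting))
(string-set! copy 0 #\j)

;; Try the same mutation on the literal and record what happened.
(define outcome
  (catch #t
    (lambda ()
      (string-set! greeting 0 #\j)
      'mutated)
    (lambda (key . args)
      'rejected)))
```

The same pattern, a mutator that checks a read-only mark before
writing, is what the paragraph above suggests extending to other
literal constants.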

We currently can't add an immutable bit to pairs due to our tagging
scheme -- pairs are just two words.  But we can do this easily with
other data types: vectors, arrays, bytevectors, etc.  (If we want to do
this, anyway.)

However, it is possible to do a more expensive check to see if a pair
is embedded in an ELF image (or the converse, that it is allocated on
the GC heap).  I just looked in Guile and there are only a few dozen
instances of set-car! in Guile's source and a few more of set-cdr!, so
it's conceivable that this is a check we can afford to make.

If we are able to do this, we can avoid the whole discussion about
SIGSEGV handlers.

It would be nice of course to be able to cons an immutable pair on the
heap -- so a simple GC_is_heap_ptr(x) check wouldn't suffice to prove
immutability.  Not sure quite what the right solution would be there.

FWIW, Racket uses four words for pairs: the type tag, the hash code, and
the two fields.  Four words is I think the logical progression after 2
given GC's object size granularity.  It would be nice to avoid having
the extra words, but if we ever switched to a moving GC we would need
space for a hash code I think.

Thoughts on the plan for immutable literals?

Concretely for this use case, assuming that we can solve the immutable
literal problem, I propose to remove sigils entirely.  Thoughts welcome
here.

Andy




* Re: RFC: (ice-9 sandbox)
  2017-04-06 21:41 ` Freja Nordsiek
@ 2017-04-14 10:58   ` Andy Wingo
  0 siblings, 0 replies; 17+ messages in thread
From: Andy Wingo @ 2017-04-14 10:58 UTC (permalink / raw)
  To: Freja Nordsiek; +Cc: guile-devel

On Thu 06 Apr 2017 23:41, Freja Nordsiek <fnordsie@gmail.com> writes:

> On the subject of ports and i/o, I have a few ideas. R6RS i/o in the
> (rnrs io ports) module generally requires the port to be explicitly
> given, rather than assuming current in or out if not given (though
> rnrs io simple does make those assumptions). For many, it would be
> impossible because they put the port as the first argument and a
> required second argument afterwards. Looking at module/io/ports.scm in
> Guile 2.2.x, it looks like the reading and writing procedures there
> should be safe. Obviously, nothing that opens a file should be used,
> nor the procedures to get current input, output, and error; but the
> rest can be used. And this includes string and bytevector ports, which
> could be very useful in the sandbox (I don't know about anyone else,
> but I use string ports all the time).
>
> One question, is there a particular reason that guard is not exported?
> It doesn't seem like it is as nasty as dynamic-wind with trying to
> terminate, though maybe I am just not seeing how it could be used to
> prevent the sandbox terminating the process. Having at least one
> exception handling binding might be very helpful in a sandbox.

These questions are related.  There is nothing unsafe about "guard"
specifically.  Indeed the sandbox environment has "catch" and similar
things.  "guard" isn't in this default set because currently the set of
bindings that (ice-9 sandbox) offers in *all-pure-and-impure-bindings*
is a subset of the bindings that are available by default.  "guard" has to
be imported via srfi-34.  Likewise for r6rs port procedures.  I think
it's reasonable to have this limitation -- otherwise there's no point at
which to stop.  Other binding sets are of course possible.
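
For comparison, a minimal sketch of the two handlers side by side,
outside any sandbox.  "catch" is available by default, while "guard"
needs the explicit (srfi srfi-34) import; the key sandbox-demo and the
raised symbol boom are arbitrary values picked for the example.

```scheme
(use-modules (srfi srfi-34))

;; `catch' comes with the default environment.
(define caught-with-catch
  (catch 'sandbox-demo
    (lambda () (throw 'sandbox-demo 1 2))
    (lambda (key . args) (cons key args))))

;; `guard' (and the matching `raise') come from SRFI-34.
(define caught-with-guard
  (guard (e ((symbol? e) (list 'caught e)))
    (raise 'boom)))
```

A sandbox that wanted "guard" would presumably add an import set
naming (srfi srfi-34) to its bindings; that is a different binding
set from the ones in this module.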

I would of course like I/O in the sandbox :) We could have versions of
"display" et al that require their port argument; that would be
consistent with the strict-subset criterion.
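
One way such a wrapper might look, as a sketch only: display-to is a
hypothetical name, not something (ice-9 sandbox) provides.  The port
is a required argument, so the procedure never falls back to whatever
port happens to be current when it is called.

```scheme
;; Hypothetical wrapper: like `display', but the port is mandatory.
(define (display-to obj port)
  (unless (output-port? port)
    (error "display-to: expected an output port" port))
  (display obj port))

;; Usage with a string port supplied by the caller.
(define rendered
  (call-with-output-string
    (lambda (port)
      (display-to "answer: " port)
      (display-to 42 port))))
```

Here the only port the code can write to is the one it was handed,
which keeps the capability explicit.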

Andy




* Re: RFC: (ice-9 sandbox)
  2017-04-14 10:52           ` Andy Wingo
@ 2017-04-14 12:17             ` tomas
  2017-04-14 12:32             ` Ludovic Courtès
  1 sibling, 0 replies; 17+ messages in thread
From: tomas @ 2017-04-14 12:17 UTC (permalink / raw)
  To: guile-devel


On Fri, Apr 14, 2017 at 12:52:19PM +0200, Andy Wingo wrote:

[...]

> Concretely for this use case, assuming that we can solve the immutable
> literal problem, I propose to remove sigils entirely.  Thoughts welcome
> here.

There's still the "cultural value" of such sigils, which eases the
communication between humans. That'll depend on what other Schemes
do, and how current pedagogical literature is set up. Readability
and all that. Cultures are bound to change, though.

Of course, really marking things as immutable (the "technical" bit)
is still very cool.

regards
-- tomás




* Re: RFC: (ice-9 sandbox)
  2017-04-14 10:52           ` Andy Wingo
  2017-04-14 12:17             ` tomas
@ 2017-04-14 12:32             ` Ludovic Courtès
  1 sibling, 0 replies; 17+ messages in thread
From: Ludovic Courtès @ 2017-04-14 12:32 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Hi!

Andy Wingo <wingo@pobox.com> writes:

> On Mon 03 Apr 2017 17:35, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Riastradh’s document at <http://mumble.net/~campbell/scheme/style.txt>
>> has this:
>>
>>   Affix asterisks to the beginning and end of a globally mutable
>>   variable.  This allows the reader of the program to recognize very
>>   easily that it is badly written!
>>
>> … but it doesn’t say anything about constants nor about %.
>>
>> It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
>> ‘%all-pure-bindings’.  So, dunno, as you see fit!
>
> I feel like I would have less of a need for name sigils like *earmuffs*
> or %preficentiles if we had more reliably immutable data.

[...]

> However it is possible to do a more expensive check to see if a pair
> is embedded in an ELF image (or the converse, that it is allocated on
> the GC heap).  I just looked in Guile and there are only a few dozen
> instances of set-car! in Guile's source and a bit more of set-cdr!, so
> it's conceivable to think of this being a check that we can make.
>
> If we are able to do this, we can avoid the whole discussion about
> SIGSEGV handlers.
>
> It would be nice of course to be able to cons an immutable pair on the
> heap -- so a simple GC_is_heap_ptr(x) check wouldn't suffice to prove
> immutability.  Not sure quite what the right solution would be there.
>
> FWIW, Racket uses four words for pairs: the type tag, the hash code, and
> the two fields.  Four words is I think the logical progression after 2
> given GC's object size granularity.  It would be nice to avoid having
> the extra words, but if we ever switched to a moving GC we would need
> space for a hash code I think.
>
> Thoughts on the plan for immutable literals?

My feeling is that using GC_is_heap_ptr or similar would be nicer than
adding bits to the type tags, because we’d need to add this read-only
bit for every type, and we could have bugs where we forget to check them
in some cases.

GC_is_heap_ptr is probably enough until we support immutable objects
allocated on the heap.

> Concretely for this use case, assuming that we can solve the immutable
> literal problem, I propose to remove sigils entirely.  Thoughts welcome
> here.

In practice I guess the funny characters will stay for a while.  :-)

But I agree that it’d be nice to have a generic way to represent
immutable objects.

Ludo’.




* Re: RFC: (ice-9 sandbox)
  2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
                   ` (3 preceding siblings ...)
  2017-04-06 21:41 ` Freja Nordsiek
@ 2017-04-15 17:23 ` Nala Ginrut
  2017-04-17  8:07   ` Andy Wingo
  2017-04-18 19:48 ` Andy Wingo
  5 siblings, 1 reply; 17+ messages in thread
From: Nala Ginrut @ 2017-04-15 17:23 UTC (permalink / raw)
  To: Andy Wingo, guile-devel

[-- Attachment #1: Type: text/plain, Size: 797 bytes --]

Hi Andy!
It's pretty cool!
Could you please add a #:from keyword to eval-in-sandbox to indicate the
language front-end? Don't forget there's a multi-language plan. :-)

Best regards.

Andy Wingo <wingo@pobox.com> wrote on Fri, 31 Mar 2017 at 17:28:

> Hi,
>
> Attached is a module that can evaluate an expression within a sandbox.
> If the evaluation takes too long or allocates too much, it will be
> cancelled.  The evaluation will take place with respect to a module with
> a "safe" set of imports.  Those imports include most of the bindings
> available in a default Guile environment.  See the file below for full
> details and a number of caveats.
>
> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.
>
> Andy
>
>

[-- Attachment #2: Type: text/html, Size: 1109 bytes --]


* Re: RFC: (ice-9 sandbox)
  2017-04-15 17:23 ` Nala Ginrut
@ 2017-04-17  8:07   ` Andy Wingo
  2017-04-17  9:12     ` Nala Ginrut
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Wingo @ 2017-04-17  8:07 UTC (permalink / raw)
  To: Nala Ginrut; +Cc: guile-devel

On Sat 15 Apr 2017 19:23, Nala Ginrut <nalaginrut@gmail.com> writes:

> Could you please add a #:from keyword to eval-in-sandbox to indicate
> the language front-end? Don't forget there's a multi-language plan. :-)

In theory yes, but I don't know how to make safe sandboxes in other
languages.  ice-9 sandbox relies on the Scheme characteristic that the
only capabilities granted to a program are those that are in scope.
Other languages often have ambient capabilities -- like Bash for example
where there's no way to not provide the pipe ("|") operator.  I think
adding other languages should be an exercise for the reader :)

Andy




* Re: RFC: (ice-9 sandbox)
  2017-04-17  8:07   ` Andy Wingo
@ 2017-04-17  9:12     ` Nala Ginrut
  0 siblings, 0 replies; 17+ messages in thread
From: Nala Ginrut @ 2017-04-17  9:12 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1136 bytes --]

Hmm... I hadn't thought about this security issue. Even if we did some
verification in the IR (say, CPS or a lower level), that would be
insufficient to avoid security issues, since a front-end implementation
may use cross-module functions to mimic primitives for other languages.
Now I think front-end writers may have to write their own sandbox with
(ice-9 sandbox) if necessary. :-)

Best regards.


On 17 Apr 2017 at 16:07, "Andy Wingo" <wingo@pobox.com> wrote:

> On Sat 15 Apr 2017 19:23, Nala Ginrut <nalaginrut@gmail.com> writes:
>
> > Could you please add a #:from keyword to eval-in-sandbox to indicate
> > the language front-end? Don't forget there's a multi-language plan. :-)
>
> In theory yes, but I don't know how to make safe sandboxes in other
> languages.  ice-9 sandbox relies on the Scheme characteristic that the
> only capabilities granted to a program are those that are in scope.
> Other languages often have ambient capabilities -- like Bash for example
> where there's no way to not provide the pipe ("|") operator.  I think
> adding other languages should be an exercise for the reader :)
>
> Andy
>

[-- Attachment #2: Type: text/html, Size: 1609 bytes --]


* Re: RFC: (ice-9 sandbox)
  2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
                   ` (4 preceding siblings ...)
  2017-04-15 17:23 ` Nala Ginrut
@ 2017-04-18 19:48 ` Andy Wingo
  5 siblings, 0 replies; 17+ messages in thread
From: Andy Wingo @ 2017-04-18 19:48 UTC (permalink / raw)
  To: guile-devel

On Fri 31 Mar 2017 11:27, Andy Wingo <wingo@pobox.com> writes:

> Attached is a module that can evaluate an expression within a sandbox.

Pushed to master.  See NEWS here, where I include a couple more entries
of note:

    * Notable changes

    ** New sandboxed evaluation facility

    Guile now has a way to execute untrusted code in a safe way.  See
    "Sandboxed Evaluation" in the manual for full details, including some
    important notes on limitations on the sandbox's ability to prevent
    resource exhaustion.

    ** All literal constants are read-only

    According to the Scheme language definition, it is an error to attempt
    to mutate a "constant literal".  A constant literal is data that is a
    literal quoted part of a program.  For example, all of these are errors:

      (set-car! '(1 . 2) 42)
      (append! '(1 2 3) '(4 5 6))
      (vector-set! '#(a b c) 1 'B)

    Guile takes advantage of this provision of Scheme to deduplicate shared
    structure in constant literals within a compilation unit, and to
    allocate constant data directly in the compiled object file.  If the
    data needs no relocation at run-time, as is the case for pairs or
    vectors that only contain immediate values, then the data can actually
    be shared between different Guile processes, using the operating
    system's virtual memory facilities.

    However, in Guile 2.2.0, constants that needed relocation were actually
    mutable -- though (vector-set! '#(a b c) 1 'B) was an error, Guile
    wouldn't actually cause an exception to be raised, silently allowing the
    mutation.  This could affect future users of this constant, or indeed of
    any constant in the compilation unit that shared structure with the
    original vector.

    Additionally, attempting to mutate constant literals mapped in the
    read-only section of files would actually cause a segmentation fault, as
    the operating system prohibits writes to read-only memory.  "Don't do
    that" isn't a very nice solution :)

    Both of these problems have been fixed.  Any attempt to mutate a
    constant literal will now raise an exception, whether the constant needs
    relocation or not.

    ** Syntax objects are now a distinct type

    It used to be that syntax objects were represented as a tagged vector.
    These values could be forged by users to break scoping abstractions,
    preventing the implementation of sandboxing facilities in Guile.  We are
    as embarrassed about the previous situation as we are pleased about the
    fact that we've fixed it.

    Unfortunately, during the 2.2 stable series (or at least during part of
    it), we need to support files compiled with Guile 2.2.0.  These files
    may contain macros that contain legacy syntax object constants.  See the
    discussion of "allow-legacy-syntax-objects?" in "Syntax Transformer
    Helpers" in the manual for full details.

And the documentation formatted as text is below.  I guess a 2.2.1 is
coming soon.  Thanks all for the review!

Andy



1.12 Sandboxed Evaluation
-------------------------

Sometimes you would like to evaluate code that comes from an untrusted
party.  The safest way to do this is to buy a new computer, evaluate the
code on that computer, then throw the machine away.  However if you are
unwilling to take this simple approach, Guile does include a limited
"sandbox" facility that can allow untrusted code to be evaluated with
some confidence.

   To use the sandboxed evaluator, load its module:

     (use-modules (ice-9 sandbox))

   Guile's sandboxing facility starts with the ability to restrict the
time and space used by a piece of code.

 -- Scheme Procedure: call-with-time-limit limit thunk limit-reached
     Call THUNK, but cancel it if LIMIT seconds of wall-clock time have
     elapsed.  If the computation is cancelled, call LIMIT-REACHED in
     tail position.  THUNK must not disable interrupts or prevent an
     abort via a 'dynamic-wind' unwind handler.
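
     As a minimal sketch of how this might be used (the names
     'spin-forever', 'timed-out-result' and 'quick-result' are
     illustrative only, and a Guile that ships '(ice-9 sandbox)' is
     assumed), a busy loop can be cancelled like so:

```scheme
(use-modules (ice-9 sandbox))

;; A thunk that never returns on its own.
(define (spin-forever)
  (let loop () (loop)))

;; The busy loop is cancelled after about 0.1 seconds of wall-clock
;; time; the value of the LIMIT-REACHED thunk is returned instead.
(define timed-out-result
  (call-with-time-limit 0.1 spin-forever (lambda () 'timed-out)))

;; A thunk that finishes within its limit returns its own value.
(define quick-result
  (call-with-time-limit 1 (lambda () (+ 1 2)) (lambda () 'timed-out)))
```

     Here LIMIT-REACHED is passed as a procedure of no arguments; its
     return value stands in for the cancelled computation.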

 -- Scheme Procedure: call-with-allocation-limit limit thunk
          limit-reached
     Call THUNK, but cancel it if LIMIT bytes have been allocated.  If
     the computation is cancelled, call LIMIT-REACHED in tail position.
     THUNK must not disable interrupts or prevent an abort via a
     'dynamic-wind' unwind handler.

     This limit applies to both stack and heap allocation.  The
     computation will not be aborted before LIMIT bytes have been
     allocated, but for the heap allocation limit, the check may be
     postponed until the next garbage collection.

     Note that as a current shortcoming, the heap size limit applies to
     all threads; concurrent allocation by other unrelated threads
     counts towards the allocation limit.
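
     The following sketch, with illustrative names ('cons-forever',
     'alloc-result') that are not part of the module, shows an
     unbounded allocation being cut off.  Because the heap check runs
     after a garbage collection, the loop may allocate somewhat more
     than the limit before it is cancelled:

```scheme
(use-modules (ice-9 sandbox))

;; Keep consing onto a list that stays reachable, so the amount of
;; allocated memory grows without bound.  The loop is a tail call, so
;; only the heap grows, not the stack.
(define (cons-forever)
  (let loop ((acc '()))
    (loop (cons 1 acc))))

;; Cancel the loop once roughly one million bytes have been allocated.
(define alloc-result
  (call-with-allocation-limit #e1e6 cons-forever
                              (lambda () 'allocation-limit-reached)))
```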

 -- Scheme Procedure: call-with-time-and-allocation-limits time-limit
          allocation-limit thunk
     Invoke THUNK in a dynamic extent in which its execution is limited
     to TIME-LIMIT seconds of wall-clock time, and its allocation to
     ALLOCATION-LIMIT bytes.  THUNK must not disable interrupts or
     prevent an abort via a 'dynamic-wind' unwind handler.

     If successful, return all values produced by invoking THUNK.  Any
     uncaught exception thrown by the thunk will propagate out.  If the
     time or allocation limit is exceeded, an exception will be thrown
     to the 'limit-exceeded' key.
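
     Since exceeding either limit is reported as an exception, a caller
     will usually want to catch the 'limit-exceeded' key.  A possible
     wrapper, where 'run-limited' and the chosen limits are assumptions
     made for this example, might look like:

```scheme
(use-modules (ice-9 sandbox))

;; Hypothetical helper: run THUNK under a 0.1 second time limit and a
;; ten-million-byte allocation limit, returning the symbol
;; `limit-exceeded' if either limit is hit.
(define (run-limited thunk)
  (catch 'limit-exceeded
    (lambda ()
      (call-with-time-and-allocation-limits 0.1 #e10e6 thunk))
    (lambda (key . args) 'limit-exceeded)))

(define fine     (run-limited (lambda () (* 6 7))))
(define too-slow (run-limited (lambda () (let lp () (lp)))))
```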

   The time limit and stack limit are both very precise, but the heap
limit only gets checked asynchronously, after a garbage collection.  In
particular, if the heap is already very large, the number of allocated
bytes between garbage collections will be large, and therefore the
precision of the check is reduced.

   Additionally, due to the mechanism used by the allocation limit (the
'after-gc-hook'), large single allocations like '(make-vector #e1e7)'
are only detected after the allocation completes, even if the allocation
itself causes garbage collection.  It's possible therefore for user code
to not only exceed the allocation limit set, but also to exhaust all
available memory, causing out-of-memory conditions at any allocation
site.  Failure to allocate memory in Guile itself should be safe and
cause an exception to be thrown, but most systems are not designed to
handle 'malloc' failures.  An allocation failure may therefore exercise
unexpected code paths in your system, so it is a weakness of the sandbox
(and therefore an interesting point of attack).

   The main sandbox interface is 'eval-in-sandbox'.

 -- Scheme Procedure: eval-in-sandbox exp [#:time-limit 0.1]
          [#:allocation-limit #e10e6] [#:bindings all-pure-bindings]
          [#:module (make-sandbox-module bindings)] [#:sever-module? #t]
     Evaluate the Scheme expression EXP within an isolated "sandbox".
     Limit its execution to TIME-LIMIT seconds of wall-clock time, and
     limit its allocation to ALLOCATION-LIMIT bytes.

     The evaluation will occur in MODULE, which defaults to the result
     of calling 'make-sandbox-module' on BINDINGS, which itself defaults
     to 'all-pure-bindings'.  This is the core of the sandbox: creating
     a scope for the expression that is "safe".

     A safe sandbox module has two characteristics.  Firstly, it will
     not allow the expression being evaluated to avoid being cancelled
     due to time or allocation limits.  This ensures that the expression
     terminates in a timely fashion.

     Secondly, a safe sandbox module will prevent the evaluation from
     receiving information from previous evaluations, or from affecting
     future evaluations.  All combinations of binding sets exported by
     '(ice-9 sandbox)' form safe sandbox modules.

     The BINDINGS should be given as a list of import sets.  One import
     set is a list whose car names an interface, like '(ice-9 q)', and
     whose cdr is a list of imports.  An import is either a bare symbol
     or a pair of '(OUT . IN)', where OUT and IN are both symbols and
     denote the name under which a binding is exported from the module,
     and the name under which to make the binding available,
     respectively.  Note that BINDINGS is only used as an input to the
     default initializer for the MODULE argument; if you pass
     '#:module', BINDINGS is unused.  If SEVER-MODULE? is true (the
     default), the module will be unlinked from the global module tree
     after the evaluation returns, to allow MODULE to be
     garbage-collected.

     If successful, return all values produced by EXP.  Any uncaught
     exception thrown by the expression will propagate out.  If the time
     or allocation limit is exceeded, an exception will be thrown to the
     'limit-exceeded' key.
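
     A short sketch of typical calls follows.  The helper
     'sandbox-eval-or-key' is hypothetical; it converts any exception
     into its key so that the three outcomes can be compared side by
     side:

```scheme
(use-modules (ice-9 sandbox))

;; Hypothetical helper: evaluate EXP in a fresh sandbox and return
;; either its value or the key of the exception it raised.
(define (sandbox-eval-or-key exp)
  (catch #t
    (lambda ()
      (eval-in-sandbox exp
                       #:time-limit 0.5
                       #:allocation-limit #e10e6))
    (lambda (key . args) key)))

;; Plain arithmetic is part of the default pure bindings.
(define ok (sandbox-eval-or-key '(* 6 7)))

;; A busy loop runs into the time limit.
(define too-slow (sandbox-eval-or-key '(let lp () (lp))))

;; File access is not in scope inside the sandbox, so this reference
;; fails as an unbound variable instead of opening the file.
(define no-files (sandbox-eval-or-key '(open-input-file "/etc/passwd")))
```

     Each call builds its own module from the default bindings, so
     nothing defined by one evaluation is visible to the next.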

   Constructing a safe sandbox module is tricky in general.  Guile
defines an easy way to construct safe modules from predefined sets of
bindings.  Before getting to that interface, here are some general notes
on safety.

  1. The time and allocation limits rely on the ability to interrupt and
     cancel a computation.  For this reason, no binding included in a
     sandbox module should be able to indefinitely postpone interrupt
     handling, nor should a binding be able to prevent an abort.  In
     practice this second consideration means that 'dynamic-wind' should
     not be included in any binding set.
  2. The time and allocation limits apply only to the 'eval-in-sandbox'
     call.  If the call returns a procedure which is later called, no
     limit is "automatically" in place.  Users of 'eval-in-sandbox' have
     to be very careful to reimpose limits when calling procedures that
     escape from sandboxes.
  3. Similarly, the dynamic environment of the 'eval-in-sandbox' call is
     not necessarily in place when any procedure that escapes from the
     sandbox is later called.

     This detail prevents us from exposing 'primitive-eval' to the
     sandbox, for two reasons.  The first is that it's possible for
     legacy code to forge references to any binding, if the
     'allow-legacy-syntax-objects?' parameter is true.  The default for
     this parameter is true; *note Syntax Transformer Helpers:: for the
     details.  The parameter is bound to '#f' for the duration of the
     'eval-in-sandbox' call itself, but that will not be in place during
     calls to escaped procedures.

     The second reason we don't expose 'primitive-eval' is that
     'primitive-eval' implicitly works in the current module, which for
     an escaped procedure will probably be different than the module
     that is current for the 'eval-in-sandbox' call itself.

     The common denominator here is that if an interface exposed to the
     sandbox relies on dynamic environments, it is easy to mistakenly
     grant the sandboxed procedure additional capabilities in the form
     of bindings that it should not have access to.  For this reason,
     the default sets of predefined bindings do not depend on any
     dynamically scoped value.
  4. Mutation may allow a sandboxed evaluation to break some invariant
     in users of data supplied to it.  A lot of code culturally doesn't
     expect mutation, but if you hand mutable data to a sandboxed
     evaluation and you also grant mutating capabilities to that
     evaluation, then the sandboxed code may indeed mutate that data.
     The default set of bindings to the sandbox does not include any
     mutating primitives.

     Relatedly, 'set!' may allow a sandbox to mutate a primitive,
     invalidating many system-wide invariants.  Guile is currently quite
     permissive when it comes to imported bindings and mutability.
     Although 'set!' to a module-local or lexically bound variable would
     be fine, we don't currently have an easy way to disallow 'set!' to
     an imported binding, so currently no binding set includes 'set!'.
  5. Mutation may allow a sandboxed evaluation to keep state, or make a
     communication mechanism with other code.  On the one hand this
     sounds cool, but on the other hand maybe this is part of your
     threat model.  Again, the default set of bindings doesn't include
     mutating primitives, preventing sandboxed evaluations from keeping
     state.
  6. The sandbox should probably not be able to open a network
     connection, or write to a file, or open a file from disk.  The
     default binding set includes no interaction with the operating
     system.
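
   To illustrate the second note, here is a sketch of reimposing limits
on a procedure that has escaped from the sandbox.  The helper
'call-escaped' and its limits are assumptions made for this example; it
restores only the resource limits, not the rest of the dynamic
environment discussed in the third note.

```scheme
(use-modules (ice-9 sandbox))

;; The sandboxed expression itself returns at once, but the procedure
;; it returns would loop forever when called.
(define escaped
  (eval-in-sandbox '(lambda () (let lp () (lp)))))

;; Hypothetical helper: call an escaped procedure under fresh limits,
;; because the limits of the original eval-in-sandbox call no longer
;; apply once that call has returned.
(define (call-escaped proc . args)
  (catch 'limit-exceeded
    (lambda ()
      (call-with-time-and-allocation-limits
       0.1 #e10e6
       (lambda () (apply proc args))))
    (lambda (key . rest) 'limit-exceeded)))

(define outcome (call-escaped escaped))
```

   Calling 'escaped' directly, without such a wrapper, would never
return.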

   If you, dear reader, find the above discussion interesting, you will
enjoy Jonathan Rees' dissertation, "A Security Kernel Based on the
Lambda Calculus".

 -- Scheme Variable: all-pure-bindings
     All "pure" bindings that together form a safe subset of those
     bindings available by default to Guile user code.

 -- Scheme Variable: all-pure-and-impure-bindings
     Like 'all-pure-bindings', but additionally including mutating
     primitives like 'vector-set!'.  This set is still safe in the sense
     mentioned above, with the caveats about mutation.
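
     The difference between the two composite sets can be sketched as
     follows; 'mutating-exp' and 'try-with' are names invented for
     this example:

```scheme
(use-modules (ice-9 sandbox))

;; An expression that needs a mutating primitive.
(define mutating-exp
  '(let ((v (vector 1 2 3)))
     (vector-set! v 0 0)
     v))

;; Hypothetical helper: evaluate EXP with the given binding set and
;; return either its value or the key of the exception it raised.
(define (try-with bindings exp)
  (catch #t
    (lambda () (eval-in-sandbox exp #:bindings bindings))
    (lambda (key . args) key)))

;; With the default pure set, vector-set! is simply not in scope.
(define pure-outcome (try-with all-pure-bindings mutating-exp))

;; With the larger set, the mutation goes through.
(define impure-outcome
  (try-with all-pure-and-impure-bindings mutating-exp))
```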

   The components of these composite sets are as follows:
 -- Scheme Variable: alist-bindings
 -- Scheme Variable: array-bindings
 -- Scheme Variable: bit-bindings
 -- Scheme Variable: bitvector-bindings
 -- Scheme Variable: char-bindings
 -- Scheme Variable: char-set-bindings
 -- Scheme Variable: clock-bindings
 -- Scheme Variable: core-bindings
 -- Scheme Variable: error-bindings
 -- Scheme Variable: fluid-bindings
 -- Scheme Variable: hash-bindings
 -- Scheme Variable: iteration-bindings
 -- Scheme Variable: keyword-bindings
 -- Scheme Variable: list-bindings
 -- Scheme Variable: macro-bindings
 -- Scheme Variable: nil-bindings
 -- Scheme Variable: number-bindings
 -- Scheme Variable: pair-bindings
 -- Scheme Variable: predicate-bindings
 -- Scheme Variable: procedure-bindings
 -- Scheme Variable: promise-bindings
 -- Scheme Variable: prompt-bindings
 -- Scheme Variable: regexp-bindings
 -- Scheme Variable: sort-bindings
 -- Scheme Variable: srfi-4-bindings
 -- Scheme Variable: string-bindings
 -- Scheme Variable: symbol-bindings
 -- Scheme Variable: unspecified-bindings
 -- Scheme Variable: variable-bindings
 -- Scheme Variable: vector-bindings
 -- Scheme Variable: version-bindings
     The components of 'all-pure-bindings'.

 -- Scheme Variable: mutating-alist-bindings
 -- Scheme Variable: mutating-array-bindings
 -- Scheme Variable: mutating-bitvector-bindings
 -- Scheme Variable: mutating-fluid-bindings
 -- Scheme Variable: mutating-hash-bindings
 -- Scheme Variable: mutating-list-bindings
 -- Scheme Variable: mutating-pair-bindings
 -- Scheme Variable: mutating-sort-bindings
 -- Scheme Variable: mutating-srfi-4-bindings
 -- Scheme Variable: mutating-string-bindings
 -- Scheme Variable: mutating-variable-bindings
 -- Scheme Variable: mutating-vector-bindings
     The additional components of 'all-pure-and-impure-bindings'.

   Finally, what do you do with a binding set?  What is a binding set
anyway?  'make-sandbox-module' is here for you.

 -- Scheme Procedure: make-sandbox-module bindings
     Return a fresh module that only contains BINDINGS.

     The BINDINGS should be given as a list of import sets.  One import
     set is a list whose car names an interface, like '(ice-9 q)', and
     whose cdr is a list of imports.  An import is either a bare symbol
     or a pair of '(OUT . IN)', where OUT and IN are both symbols and
     denote the name under which a binding is exported from the module,
     and the name under which to make the binding available,
     respectively.

   So you see that binding sets are just lists, and
'all-pure-and-impure-bindings' is really just the result of appending
all of the component binding sets.
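
   As a sketch of building a custom binding set (the names
'arithmetic-bindings' and 'head' are invented for this example),
one can append an extra import set to 'core-bindings' and rename a
binding with an '(OUT . IN)' pair:

```scheme
(use-modules (ice-9 sandbox))

;; Core syntax plus a handful of procedures from the (guile) module.
;; The pair (car . head) makes `car' available under the name `head'.
(define arithmetic-bindings
  (append core-bindings
          '(((guile) + - * cons (car . head)))))

;; Evaluate in an explicitly constructed module.  With the default
;; #:sever-module? #t the module is unlinked afterwards, so it is used
;; for one evaluation only here.
(define result
  (eval-in-sandbox '(head (cons (+ 1 2) 4))
                   #:module (make-sandbox-module arithmetic-bindings)))

;; `car' itself was not imported under its own name, so referring to
;; it raises an exception; here the exception key is returned.
(define car-outcome
  (catch #t
    (lambda ()
      (eval-in-sandbox '(car (cons 1 2))
                       #:bindings arithmetic-bindings))
    (lambda (key . args) key)))
```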




end of thread, other threads:[~2017-04-18 19:48 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-31  9:27 RFC: (ice-9 sandbox) Andy Wingo
2017-03-31 11:33 ` Ludovic Courtès
2017-03-31 16:26   ` Andy Wingo
2017-03-31 21:41     ` Ludovic Courtès
2017-04-02 10:18       ` Andy Wingo
2017-04-03 15:35         ` Ludovic Courtès
2017-04-14 10:52           ` Andy Wingo
2017-04-14 12:17             ` tomas
2017-04-14 12:32             ` Ludovic Courtès
2017-03-31 14:41 ` Mike Gran
2017-04-01 14:33 ` Christopher Allan Webber
2017-04-06 21:41 ` Freja Nordsiek
2017-04-14 10:58   ` Andy Wingo
2017-04-15 17:23 ` Nala Ginrut
2017-04-17  8:07   ` Andy Wingo
2017-04-17  9:12     ` Nala Ginrut
2017-04-18 19:48 ` Andy Wingo
