From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Stefan Israelsson Tampe Newsgroups: gmane.lisp.guile.devel Subject: Typechecking I Date: Tue, 16 Nov 2010 00:10:53 +0100 Message-ID: <201011160010.53435.stefan.itampe@gmail.com> References: <201010212223.23822.stefan.itampe@gmail.com> <87bp66q8g5.fsf@gnu.org> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_93b4MfLAoxmrPJi" X-Trace: dough.gmane.org 1289862632 6602 80.91.229.12 (15 Nov 2010 23:10:32 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Mon, 15 Nov 2010 23:10:32 +0000 (UTC) To: guile-devel@gnu.org Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Tue Nov 16 00:10:25 2010 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1PI8Bq-0005XE-7Z for guile-devel@m.gmane.org; Tue, 16 Nov 2010 00:10:22 +0100 Original-Received: from localhost ([127.0.0.1]:35815 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PI8Bp-0006CV-2y for guile-devel@m.gmane.org; Mon, 15 Nov 2010 18:10:21 -0500 Original-Received: from [140.186.70.92] (port=40510 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PI8Bl-0006AQ-O1 for guile-devel@gnu.org; Mon, 15 Nov 2010 18:10:19 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PI8Bj-0004WF-Ri for guile-devel@gnu.org; Mon, 15 Nov 2010 18:10:17 -0500 Original-Received: from mail-ey0-f169.google.com ([209.85.215.169]:44361) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PI8Bj-0004W9-FL for guile-devel@gnu.org; Mon, 15 Nov 2010 18:10:15 -0500 Original-Received: by eydd26 with SMTP id d26so13904eyd.0 for ; Mon, 15 Nov 2010 15:10:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:to:subject:date :user-agent:references:in-reply-to:mime-version:content-type :message-id; bh=aKzDTIYHXiEuU22WGQpNNbIrtecW3luF2Zyde6DMOwI=; b=tSZn2OMdqYEoiQ3XK2rXkMy1stROwLw0nPj3IKI9HB4u64U8fIPw2+Luycfs/rIbsd YGTwrsznmG4wMrLnnDP4bY3m36GPG9RqfkE18CptHQwMC4zx2IFCCr78h3fmZgSrNIQE WwXvUqHuxj0me0VYx75RruJez4KZxfXzU7z2s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:subject:date:user-agent:references:in-reply-to:mime-version :content-type:message-id; b=ryJZ28lpKVB2i7LKPxMPp6Hfr/tvGo/ZnRPFqb+jW0D7j6myLbUP3BIAMji64XyRrw 9lF2cSvi5t+tAhitfGVuR9zvmeyz2StbSBPNxYJkmMD85LMutYe1h5iIokG4BhS8GR3T ubrDhcs9Vydscp+AJFyCDJRStVKnOxEtQBMnE= Original-Received: by 10.213.101.7 with SMTP id a7mr729380ebo.14.1289862614082; Mon, 15 Nov 2010 15:10:14 -0800 (PST) Original-Received: from linux-s4gz.localnet (1-1-1-39a.veo.vs.bostream.se [82.182.254.46]) by mx.google.com with ESMTPS id q58sm464894eeh.3.2010.11.15.15.10.11 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 15 Nov 2010 15:10:13 -0800 (PST) User-Agent: KMail/1.13.5 (Linux/2.6.34.7-0.5-desktop; KDE/4.4.4; x86_64; ; ) In-Reply-To: X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2) X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:11137 Archived-At: --Boundary-00=_93b4MfLAoxmrPJi Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: 7bit Hi, Ok, here is a first step to analyse code. Note, attached to the mail is a modul /put it at a suitable place and fixup the reference for the module. Then use e.g. (check '(if 1 1 1)) and you will get the output you see further down in the email. We will assume that local variables in the following is gensymed to be unique and no name clashes apear. For typechecking i find it interesting to consider an equational approach first. e.g. translate code to an equation system. let's take a first simple step here and and assume that the typecheck function just finds the equation we need to solve in order to understand type consistency and discovery. we will assume that typecheck is of the form (typecheck Eq Code Type) or in short Eq is current equational state, Code the code to draw equations out of. Type the type associated with the result of Code ******************* Item I ******************************* [let X Y Z] : T Translates to (typecheck (typecheck Eq Y X) Z T) Example of equation generation: (let x 1 x): ((= integer x) (= x Texp)) (let x 1 (let y x y)): ((= integer x) (= x y) (= y Texp)) TODO: Note: (= y x) should be more nicer for the understanding!!! This is not earthshaken. But I think it is wise to handle first such an translational step in order to write a solver in kanren or prolog or racklog or guilelog :-) for the more general case. ******************* Item II *************************** [begin x ... xx] : T (typecheck x e) o ... o (typecheck xx T) the symbol e will mean an anything object so that essentially [= e x] is true for all x. You could say everything that we have not assigned a type to has type e and then (the idea) is that the all scheme expression should typecheck. It remains to define the properties and rules associated with e to make this work. The target is to be able to gradually mix typed expressions with untyped expression. Example: (begin 1 "a" 1): ((= integer e) (= string e) (= integer Texp)) ******************* Item III **************************** [lambda [x ...] code] What to look up for here is that essentialle every usage of a lambda inside a function has to, to be general, have a unique set of equations and do a unique deduction. This can be expensive, but is the correct way to treat the typechecking of the lambda. So we must tell the solver later on that we are going to treat a lambda, e.g. x ... will be a unique version at every usage of the lambda the directive for the solver is (lambda vars res equation) lambda conceptually translates to an abstraction of an equations. Example (lambda (x y) x): ((= (lambda (x y) Tlam6583 ((= x Tlam6583))) Texp)) ********************** Item IV ************************** [apply f [x ...]] : T First of all we need to deduce the equation of f, to yield a lambda. e.g. (typecheck f [lamda [ArgType(x) ...] ResType]) o (typecheck x [arg ArgType(x)] ) o ... o (add-equation [= [res ResType] T]) so 1. define the functional signatur. 2. find the equations typececked to [arg X], arg is a directive to the solver to tell it that we are equating a function argument why? well the equality needs to be treated differnetly in order to be correct. 3. the same reason for the addition of the last equation. we need to inform the solver that ResType is of a result type. Example: (apply (lambda (x y) x) (1 2)) : ((= (lambda (x y) Tlam6954 ((= x Tlam6954))) ;; a lambda type abstraction (lambda (Targ6951 Targ6952) Tres6953)) ;; a lambda type (= integer (arg Targ6951)) (= integer (arg Targ6952)) (= (res Tres6953) Texp)) Can you see how res and arg differ (= T (arg (or A B C))) ; argument can be A or B or C (= T (res (or A B C))) ; function produces A or B or C **************** Item V ************* [if P X Y] : T (integer? X) makes a promise that if true then X is an integer and is a string. Consider (= (res A) (arg integer)) if the result of a function is anything it follows that it can't unify with an argument that has to take an inte integer as in the case (+ (g X) 1) and g is not typehcecked and we cannot ussume anything about it hence the typecheck should fail. On the other hand (if (integer? (g X)) (+ (g X) 1) #f) should be able to typecheck against (or integer boolean) and here we would have an equality of the type (= (validated integer) (res A)) and then A=integer in the rest of the equations in the or. Example: (let x 1 (if (integer? x) "a" 2)) ================================= (= integer x) (or ((= x (validated integer)) (= string Texp)) ((= x (validated (not integer))) (= integer Texp)))) So maybe it's an unessesary step but at least I appriciate this little transformation to get an understanding the subject. I'll will now work a little with solving the equations. And yes, all tree-il is not included, but I fix the complexity at this level and move over to solving the equations. Play around and have fun /Stefan --Boundary-00=_93b4MfLAoxmrPJi Content-Type: text/x-scheme; charset="UTF-8"; name="equationalize.scm" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="equationalize.scm" (define-module (language prolog typecheck equationalize) #:use-module (srfi srfi-1 ) #:use-module (ice-9 pretty-print ) #:use-module (ice-9 match ) #:use-module (ice-9 receive ) #:export (equationalize check)) ;; the main function is equationalize, which takes a format that should be derived from ;; tree-il representation with suitable tags to be able to communicate to the compiler ;; about type-checking conclusions. For know it's just a toy translator. The idea is to ;; translate this code to a set of equations that later on is fed into the typechecker and ;; is a good starting point to see the type structure ;; ;; (equationalize Bindings Equations Code Outer-type) ;; To get the equations of (if 1 1 1) you would write ;; (recieve (Bind Eq) (equationalize '() '() '(if 1 1 1) 'T-outer) ...) ;; ;; the language have just a few functions implemented for illustration ;; (number? X),(integer? X), (symbol? X), (char? X), (string? X)., (boolean? X). ;; plain simple (if X Y Z) (let Var Val Code) (lambda (x ...) code) is available as ;; well. functions application should be written as (apply symbol? X). ;; and or and not is supported as is because they are special ;; ;; So currently not that exciting but it is very instructive to play with a few expressions ;; an look att the corresponding equations that results. esiest is to use the check function ;; like (check Code) e.g. (check `(if 1 1 1)) which produces the type equation, ;;(if 1 1 1) ;;((or ((= 1 boolean) (= integer Texp)) ;; ((not (= 1 boolean)) (= integer Texp)))) ;;Have a play! (define (pp X) (begin (pretty-print X) X)) (define union (lambda (x y) (lset-union eq? x y))) ;;**************************** PRELUDE TO TYPECHECKING *********************** ;;This is just to kill of receive nesting buitifies the code alot (define-syntax rlet (syntax-rules () ((_ (((a ...) c) . l) code ...) (receive (a ...) c (rlet l code ...))) ((_ ((a c) . l) code ...) (let ((a c)) (rlet l code ...))) ((_ () code ...) (begin code ...)))) (define (compute-type Const Code) (define (type? X) (and (symbol? X) (char-upper-case? (car (string->list (symbol->string X)))))) (define (gen-map Code Bind Lam) (match Code ([X . L] (gen-map X Bind (lambda (Bind) (gen-map L Bind Lam)))) ((? type? X) (if (member X Bind) (Lam Bind) (let ((T (Const))) (Lam (cons (cons X T) Bind))))) (_ (Lam Bind)))) (define (make-rep X Bind) (match X ([X . L] (cons (make-rep X Bind) (make-rep L Bind))) ([] '[]) (X (let ((R (assoc-ref Bind X))) (if R R X))))) (make-rep Code (gen-map Code '() (lambda (x) x)))) ;;typecheck will create a list of equation that defines the rules for the predicates, the rules are ;; [= A B] A and B should unify ;; [and A B] [A B] A holds and B holds ;; [or A B] A holds or B holds ;; [not A] A is not true ;; [arg A] A is an argument variable ;; [res A] A is the result of a function ;; e Accepts everything (define (equationalize Bind Eq Expr Tp) ;;(pp Expr) (match Expr (['let X V Code] (rlet (((Bind Eq) (equationalize Bind Eq V X))) (equationalize (cons X Bind) Eq Code Tp ))) (['lambda V Code] (rlet (( Tlam (gensym "Tlam")) ((Bind-lam Eq-lam) (equationalize (append V Bind) '() Code Tlam))) (values Bind (cons `(= (lambda ,V ,Tlam ,Eq-lam) ,Tp) Eq)))) (['if P X Y] (rlet ((F (equationalize-bool Bind '() P)) ((Bind1 Eq1) (F #t)) ((Bind0 Eq0) (F #f)) ((Bind1 Eq1) (equationalize Bind1 Eq1 X Tp)) ((Bind0 Eq0) (equationalize Bind0 Eq0 Y Tp))) (values (union Bind0 Bind1) (append `((or ,(reverse Eq1) ,(reverse Eq0))) Eq)))) (['apply F A] (rlet ((Ta (map (lambda x (gensym "Targ")) A)) (Tr (gensym "Tres")) ((Bind Eq) (equationalize Bind Eq F `[lambda ,Ta ,Tr])) ((Bind Eq) (equationalize-arg Bind Eq A Ta))) (values Bind (cons `[= [res ,Tr] ,Tp] Eq)))) (['begin A] (equationalize Bind Eq A Tp)) (['begin A . L] (rlet (((Bind Eq) (equationalize Bind Eq A 'e))) (equationalize Bind Eq (cons 'begin L) Tp))) ((? symbol? X) (if (member X Bind) (values Bind (cons `(= ,X ,Tp) Eq)) (let ((Ts (symbol-property X ':type))) (if Ts (values Bind (cons `(= ,(compute-type (lambda () (gensym "T")) Ts) ,Tp) Eq)) (values Bind (cons `(= e ,Tp) Eq)))))) (('quote (? symbol?)) (values Bind (cons `(= symbol ,Tp) Eq))) ((? integer?) (values Bind (cons `(= integer ,Tp) Eq))) ((? boolean?) (values Bind (cons `(= boolean ,Tp) Eq))) ((? number? ) (values Bind (cons `(= number ,Tp) Eq))) ((? char? ) (values Bind (cons `(= character ,Tp) Eq))) ((? string? ) (values Bind (cons `(= string ,Tp) Eq))))) (define (check X) (rlet ((A X) ((Ba Ea) (equationalize '() '() A 'Texp))) (pp A) (pp (reverse Ea))) #t) (define typemap '((+ (lambda (number number) number)) (id (lam (A) A)) (- (lambda (number number) number)))) (map (lambda (x) (match x ((Sym T) (set-symbol-property! Sym ':type T)))) typemap) (define (equationalize-arg Bind Eq Xs Tps) (match `(,Xs ,Tps) (([X . Xs] [Tp . Tps]) (rlet (((Bind Eq) (equationalize Bind Eq X (list 'arg Tp)))) (equationalize-arg Bind Eq Xs Tps))) (([] [] ) (values Bind Eq)))) (define *predtypes* '((string? string ) (symbol? symbol ) (integer? integer ) (number? number ) (boolean? boolean ) (char? char ))) (define (equationalize-bool Bind Eq Expr) (define (pred? X) (define (f L) (if (pair? L) (if (eq? X (caar L)) #t (f (cdr L))) #f)) (f *predtypes*)) (define (type-of-pred X) (define (f L) (if (pair? L) (if (eq? X (caar L)) (cadar L) (f (cdr L))) #f)) (f *predtypes*)) ;;(pk Expr) (match Expr (['and X Y] (let ((F1 (equationalize-bool Bind Eq X)) (F2 (equationalize-bool Bind Eq Y))) (lambda (X) (if X (rlet (((Bind1 Eq1) (F1 #t)) ((Bind2 Eq2) (F2 #t))) (values (union Bind1 Bind2) `(and ,Eq1 ,Eq2))) (rlet (((Bind1 Eq1) (F1 #f)) ((Bind2 Eq2) (F2 #f))) (values (union Bind1 Bind2) `(or ,Eq1 ,Eq2))))))) (['or X Y] (let ((F1 (equationalize-bool Bind Eq X)) (F2 (equationalize-bool Bind Eq Y))) (lambda (X) (if X (rlet (((Bind1 Eq1) (F1 #t)) ((Bind2 Eq2) (F2 #t))) (values (union Bind1 Bind2) `(or ,Eq1 ,Eq2))) (rlet (((Bind1 Eq1) (F1 #f)) ((Bind2 Eq2) (F2 #f))) (values (union Bind1 Bind2) `(and ,Eq1 ,Eq2))))))) (['not X ] (let ((F (equationalize-bool Bind Eq X))) (lambda (X) (if X (F #f) (F #t))))) ([(? pred? String?) X] (lambda (P) (if P (equationalize Bind Eq X `(validated ,(type-of-pred String?))) (equationalize Bind Eq X `(validated (not ,(type-of-pred String?))))))) (X (lambda (P) (if P (values Bind (cons `[= ,X boolean] Eq)) (values Bind (cons `[not [= ,X boolean]] Eq))))))) --Boundary-00=_93b4MfLAoxmrPJi--