* Q on (language <lang> spec)
From: Matt Wette @ 2015-10-17 19:02 UTC
To: guile-user

I am playing with the compiler tower and have been digging through the
(system base language) module to try to get my hands around writing to
the compiler tower. Here is the define-language signature:

  define-language [#:name] [#:title] [#:reader] [#:printer]
                  [#:parser=#f] [#:compilers='()] [#:decompilers='()]
                  [#:evaluator=#f] [#:joiner=#f] [#:for-humans?=#t]
                  [#:make-default-environment=make-fresh-user-module]

Here are my assumptions. I'd appreciate corrections if I have missed
something.

reader is a procedure that must be provided. The procedure takes an input
port and an environment and returns a form of the implementer's choice.
The text read from the input port is (nominally) in the supported
language.

parser is an optional procedure. If provided, it takes the form generated
by the reader and returns another form of the implementer's choice.

compilers is an a-list of (symbol . procedure). For each symbol, the
associated procedure takes as input the form produced by the parser (or
by the reader, when no parser is provided) and generates code in the
language named by the symbol. For example, if no parser is defined, an
entry of `(tree-il . ,compile-tree-il) means the implementer provides a
procedure compile-tree-il that takes a form (returned by the reader), an
environment, and options (an a-list?) and generates tree-il.

decompilers is an a-list of (symbol . procedure). The procedure takes an
expression in the symbol-designated form, along with an environment and
an option a-list, and returns something in the implementer's intermediate
form (the output of the parser, or of the reader when no parser is
specified).

What did I miss or get wrong? I have not yet dug into joiner or
evaluator.

I have been able to do the following, but I am not sure I've got things
laid out correctly yet:

  scheme@(guile-user)> ,L javascript
  Happy hacking with javascript!  To switch back, type `,L scheme'.
  javascript@(guile-user)> var abc = 123
  javascript@(guile-user)> ,L scheme
  Happy hacking with Scheme!  To switch back, type `,L javascript'.
  scheme@(guile-user)> abc
  $1 = 123

Matt
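The reader and compiler contracts described in the message above can be sketched with a toy language. This is only a minimal sketch under stated assumptions, not code from the thread: the `echo` language and the procedures `echo-reader` and `echo->scheme` are hypothetical names, and targeting `scheme` instead of `tree-il` is a simplification chosen to keep the example short. The three return values of the compiler follow the convention used by calc-sxml->tree-il in the follow-up message.

```scheme
(use-modules (system base language))

;; Reader: takes an input port and an environment, returns one form.
;; The toy language is plain s-expressions, so `read` does all the work.
(define (echo-reader port env)
  (read port))

;; Compiler: takes the form, an environment, and an options list.
;; It returns three values: the translated code, the environment,
;; and the continuation environment.
(define (echo->scheme exp env opts)
  (values exp env env))

;; Hypothetical language definition, using only keywords that appear
;; in the define-language signature quoted in the message.
(define-language echo
  #:title "echo"
  #:reader echo-reader
  #:compilers `((scheme . ,echo->scheme))
  #:printer write)

;; Exercise the reader and the compiler directly, without the REPL.
(define form
  ((language-reader echo) (open-input-string "(+ 1 2)") (current-module)))

(call-with-values
    (lambda () (echo->scheme form (current-module) '()))
  (lambda (code env cenv)
    (write code)
    (newline)))
```

For `,L echo` to work at the REPL, the definition would have to live in a module named (language echo spec) on the load path, which is how the calc example in the next message is laid out.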
* Re: Q on (language <lang> spec)
From: Matt Wette @ 2015-10-18 20:00 UTC
To: guile-user

> On Oct 17, 2015, at 12:02 PM, Matt Wette <matthew.wette@verizon.net> wrote:
> I am playing with the compiler tower and have been digging through the
> (system base language) module to try to get my hands around writing to
> the compiler tower.

Here is a simple calculator example that I have working with my own
intermediate (SXML-based) language.

  scheme@(guile-user)> ,L calc
  Happy hacking with calc!  To switch back, type `,L scheme'.
  calc@(guile-user)> a = (2.5 + 4.5)/(9.3 - 1)
  calc@(guile-user)> ,L scheme
  Happy hacking with Scheme!  To switch back, type `,L calc'.
  scheme@(guile-user)> a
  $1 = 0.8433734939759036

The implementation consists of the files spec.scm, parser.scm and
compiler.scm, which are listed below.

All files are:

;;; Copyright (C) 2015 Matthew R. Wette
;;;
;;; This program is free software: you can redistribute it and/or modify
;;; it under the terms of the GNU General Public License as published by
;;; the Free Software Foundation, either version 3 of the License, or
;;; (at your option) any later version.

and are to appear at https://savannah.nongnu.org/projects/nyacc.

spec.scm:

(define-module (language calc spec)
  #:export (calc)
  #:use-module (system base language)
  #:use-module (nyacc lang calc parser)
  #:use-module (nyacc lang calc compiler))

;; calc-parse is called without a port argument, so make PORT the
;; current input port for the duration of the parse.
(define (calc-reader port env)
  (let ((iport (current-input-port)))
    (dynamic-wind
      (lambda () (set-current-input-port port))
      (lambda () (calc-parse #:debug #f))
      (lambda () (set-current-input-port iport)))))

(define-language calc
  #:title "calc"
  #:reader calc-reader
  #:compilers `((tree-il . ,calc-sxml->tree-il))
  #:printer write)


parser.scm:

(define-module (nyacc lang calc parser)
  #:export (calc-parse calc-spec calc-mach)
  #:use-module (nyacc lalr)
  #:use-module (nyacc lex)
  #:use-module (nyacc parse))

(define calc-spec
  (lalr-spec
   (prec< (left "+" "-") (left "*" "/"))
   (start stmt-list-proxy)
   (grammar

    (stmt-list-proxy
     (stmt-list "\n" ($$ (cons 'stmt-list (reverse $1)))))

    (stmt-list
     (stmt ($$ (list $1)))
     (stmt-list ";" stmt ($$ (cons $3 $1))))

    (stmt
     (ident "=" expr ($$ `(assn-stmt ,$1 ,$3)))
     (expr ($$ `(expr-stmt ,$1)))
     ( ($$ '(empty-stmt))))

    (expr
     (expr "+" expr ($$ `(add ,$1 ,$3)))
     (expr "-" expr ($$ `(sub ,$1 ,$3)))
     (expr "*" expr ($$ `(mul ,$1 ,$3)))
     (expr "/" expr ($$ `(div ,$1 ,$3)))
     ('$fixed ($$ `(fixed ,$1)))
     ('$float ($$ `(float ,$1)))
     ("(" expr ")" ($$ $2)))

    (ident ('$ident ($$ `(ident ,$1))))
    )))

(define calc-mach
  (compact-machine
   (hashify-machine
    (make-lalr-machine calc-spec))))

(define calc-parse
  (let ((gen-lexer (make-lexer-generator (assq-ref calc-mach 'mtab)
                                         #:space-chars " \t"))
        (parser (make-lalr-ia-parser calc-mach)))
    (lambda* (#:key (debug #f)) (parser (gen-lexer) #:debug debug))))


compiler.scm:

(define-module (nyacc lang calc compiler)
  #:export (calc-sxml->tree-il)
  #:use-module (sxml match)
  #:use-module (sxml fold)
  ;;#:use-module (system base language)
  #:use-module (language tree-il))

;; Rewrite one SXML node, whose children have already been rewritten,
;; into the s-expression form of tree-il.
(define (fup tree)
  (sxml-match tree
    ((fixed ,fx) `(const ,(string->number fx)))
    ((float ,fl) `(const ,(string->number fl)))
    ((ident ,id) `(toplevel ,(string->symbol id)))
    ((add ,lt ,rt) `(apply (toplevel +) ,lt ,rt))
    ((sub ,lt ,rt) `(apply (toplevel -) ,lt ,rt))
    ((mul ,lt ,rt) `(apply (toplevel *) ,lt ,rt))
    ((div ,lt ,rt) `(apply (toplevel /) ,lt ,rt))
    ((assn-stmt (toplevel ,lhs) ,rhs) `(define ,lhs ,rhs))
    ((empty-stmt) '(begin))
    ((stmt-list ,items ...) `(begin ,items ...))
    (,otherwise tree)))

;; Returns three values: the code, the environment, and the
;; continuation environment.
(define (calc-sxml->tree-il exp env opts)
  (let* ((tree (foldt fup identity exp))
         (code (parse-tree-il tree)))
    (values code env env)))
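The bottom-up rewrite that fup and foldt perform in compiler.scm can be tried in isolation. The sketch below is an illustration under stated assumptions, not the code from the message: it swaps sxml-match for (ice-9 match) and emits plain Scheme forms instead of tree-il, so that it does not depend on a particular tree-il dialect. The `sample` tree is hand-written to match the shape the grammar's actions build for the statement shown in the REPL session.

```scheme
(use-modules (sxml fold) (ice-9 match))

;; Rewrite one node whose children have already been rewritten.
;; Emits plain Scheme forms (an assumption made for this sketch),
;; not tree-il as in the original compiler.scm.
(define (fup tree)
  (match tree
    (('fixed fx) (string->number fx))
    (('float fl) (string->number fl))
    (('ident id) (string->symbol id))
    (('add lt rt) `(+ ,lt ,rt))
    (('sub lt rt) `(- ,lt ,rt))
    (('mul lt rt) `(* ,lt ,rt))
    (('div lt rt) `(/ ,lt ,rt))
    (('assn-stmt lhs rhs) `(define ,lhs ,rhs))
    (('stmt-list items ...) `(begin ,@items))
    (_ tree)))

;; Hand-written parse tree for:  a = (2.5 + 4.5)/(9.3 - 1)
(define sample
  '(stmt-list
    (assn-stmt (ident "a")
               (div (add (float "2.5") (float "4.5"))
                    (sub (float "9.3") (fixed "1"))))))

;; foldt applies `identity` to the atoms and `fup` to each list
;; after its elements have been processed, i.e. leaves first.
(define code (foldt fup identity sample))

(write code)
(newline)
;; prints (begin (define a (/ (+ 2.5 4.5) (- 9.3 1))))

;; Evaluate the generated form and look up the variable it defines.
(eval code (current-module))
(write (module-ref (current-module) 'a))
(newline)
```

Because each node is rewritten only after its children, the assn-stmt rule sees an already converted left-hand side. That is why the original fup matches (assn-stmt (toplevel ,lhs) ,rhs) and not (assn-stmt (ident ...) ...).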
* Re: Q on (language <lang> spec)
From: Nala Ginrut @ 2015-10-19 3:53 UTC
To: Matt Wette
Cc: guile-user

Nice work!

For a more general discussion of multi-language support, I can share
some opinions. Most of the time, we just need to convert our AST/IR to
tree-il. But I saw that the old guile-lua frontend added something at a
lower level. I haven't looked into it more deeply.

There are two equivalent forms of tree-il: the s-expr form and the record
form. Personally, I like the s-expr one; it's simple to use. But the
record form can store the source metadata, which is considered better.

And I'm glad that you wrote a new lexer generator (before it I only knew
silex), it's great! Would you like to make the generated tokens
compatible with scm-lalr? If so, people could rewrite their lexer module
with your lexer generator without needing to rewrite the parser. I saw
that the token name is a string rather than a symbol, so I guess it's not
compatible with scm-lalr.

Happy hacking!
* Re: Q on (language <lang> spec)
From: Matt Wette @ 2015-10-23 0:20 UTC
To: Nala Ginrut
Cc: guile-user

> On Oct 18, 2015, at 8:53 PM, Nala Ginrut <nalaginrut@gmail.com> wrote:
> And I'm glad that you wrote a new lexer generator (before it I only knew
> silex), it's great! Would you like to make the generated tokens
> compatible with scm-lalr? If so, people may rewrite their lexer module
> with your lexer generator, and no need to rewrite the parser. I saw the
> token name is string rather than symbol, so I guess it's not compatible
> with scm-lalr.

Actually, the lexer generator follows the convention of internally
turning certain lexemes into symbols: strings become '$string and
integers become '$fixed. The argument to the lexer generator is a "match
table", which says how to map the items read from the input, either
identifiers (e.g., "while") or character sequences (e.g., "+="), to
something the parser wants to see. For example, if you use the symbol
WHILE to denote the source text "while", then you would have an entry
("while" . WHILE) in the match table. So I think the lexer generator
should be adaptable to other parsers.

As a side note, the parser generated by nyacc can be "hashified", which
means the lexer should return integers. In that case the match table has
entries that look like ("while" . 45).

Matt
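The match-table idea in the message above can be illustrated with ordinary association lists. This is a minimal sketch, not nyacc code: `lexeme->token` is a hypothetical helper that classifies an already isolated lexeme, and the table contents (other than the "while" entries taken from the message) are made up for illustration.

```scheme
;; Match table for a parser that expects symbols.
(define sym-mtab
  '(("while" . WHILE) ("+=" . PLUS-ASSIGN) ($fixed . INT) ($ident . IDENT)))

;; The same table for a "hashified" parser that expects integers.
;; Apart from ("while" . 45), the integer codes are invented.
(define int-mtab
  '(("while" . 45) ("+=" . 46) ($fixed . 4) ($ident . 5)))

;; Hypothetical helper: map one lexeme to a (token . text) pair.
;; Keywords and operators are looked up by their text; anything else
;; falls back to the special $fixed or $ident entries.
(define (lexeme->token mtab lexeme)
  (cond
   ((assoc-ref mtab lexeme) => (lambda (tok) (cons tok lexeme)))
   ((string->number lexeme) (cons (assq-ref mtab '$fixed) lexeme))
   (else (cons (assq-ref mtab '$ident) lexeme))))

(write (map (lambda (lx) (lexeme->token sym-mtab lx))
            '("while" "+=" "42" "abc")))
(newline)
;; prints ((WHILE . "while") (PLUS-ASSIGN . "+=") (INT . "42") (IDENT . "abc"))

(write (lexeme->token int-mtab "while"))
(newline)
;; prints (45 . "while")
```

The lexing logic is the same in both cases. Only the table decides whether the parser sees symbols or integers, which is the property that would make such a lexer adaptable to another parser.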
* Re: Q on (language <lang> spec)
From: Matt Wette @ 2015-10-23 13:10 UTC
To: guile-user

> On Oct 22, 2015, at 5:20 PM, Matt Wette <matthew.wette@verizon.net> wrote:
>
>> On Oct 18, 2015, at 8:53 PM, Nala Ginrut <nalaginrut@gmail.com> wrote:
>> And I'm glad that you wrote a new lexer generator (before it I only knew
>> silex), it's great! Would you like to make the generated tokens
>> compatible with scm-lalr? If so, people may rewrite their lexer module
>> with your lexer generator, and no need to rewrite the parser. I saw the
>> token name is string rather than symbol, so I guess it's not compatible
>> with scm-lalr.
>
> Actually, the lexer generator follows the convention of internally
> turning certain lexemes into symbols: strings become '$string and
> integers become '$fixed. The argument to the lexer generator is a "match
> table", which says how to map the items read from the input, either
> identifiers (e.g., "while") or character sequences (e.g., "+="), to
> something the parser wants to see. For example, if you use the symbol
> WHILE to denote the source text "while", then you would have an entry
> ("while" . WHILE) in the match table. So I think the lexer generator
> should be adaptable to other parsers.

I didn't describe this very well. I will try again.

The code actually provides a lexical analyzer (aka lexer)
generator-generator. To make a lexer generator, you call
make-lexer-generator with a match table as its argument:

  (define gen-lexer (make-lexer-generator match-table))

Then you pass a freshly generated lexer each time you call the parser:

  (parse (gen-lexer))

The reason is that the lexer keeps state information (e.g., the
beginning-of-line condition).

Now, the match table argument indicates how the user wants lexemes read
from the input to be reported to the parser. If you want "while" in the
input to be reported to the parser as 'WHILE, then the match table would
include an entry '("while" . WHILE). The generator uses special symbols
to represent quoted strings, numbers and comments. If you want quoted
strings returned with the symbol 'STRING, then the match table would
include an entry '($string . STRING).

In many cases I have nyacc "hashify" my parser so that it uses integers
instead of symbols. Here is the match table generated for the hashified
matlab parser:

(define mtab
  '(($lone-comm . 1) ($string . 2) ($float . 3) ($fixed . 4) ($ident . 5)
    (";" . 6) (".'" . 7) ("'" . 8) ("~" . 9) (".^" . 10) (".\\" . 11)
    ("./" . 12) (".*" . 13) ("^" . 14) ("\\" . 15) ("/" . 16) ("*" . 17)
    ("-" . 18) ("+" . 19) (">=" . 20) ("<=" . 21) (">" . 22) ("<" . 23)
    ("~=" . 24) ("==" . 25) ("&" . 26) ("|" . 27) (":" . 28) ("case" . 29)
    ("elseif" . 30) ("clear" . 31) ("global" . 32) ("return" . 33)
    ("otherwise" . 34) ("switch" . 35) ("else" . 36) ("if" . 37)
    ("while" . 38) ("for" . 39) ("," . 40) (")" . 41) ("(" . 42) ("=" . 43)
    ("]" . 44) ("[" . 45) ("function" . 46) (#\newline . 47) ("end" . 48)
    ($end . 49)))

and here is the match table generated for the non-hashified parser for
the same language:

(define mtab
  '(($lone-comm . $lone-comm) ($string . $string) ($float . $float)
    ($fixed . $fixed) ($ident . $ident) (";" . #{$:;}#) (".'" . $:.')
    ("'" . $:') ("~" . $:~) (".^" . $:.^) (".\\" . $:.\) ("./" . $:./)
    (".*" . $:.*) ("^" . $:^) ("\\" . $:\) ("/" . $:/) ("*" . $:*)
    ("-" . $:-) ("+" . $:+) (">=" . $:>=) ("<=" . $:<=) (">" . $:>)
    ("<" . $:<) ("~=" . $:~=) ("==" . $:==) ("&" . $:&) ("|" . $:|)
    (":" . $::) ("case" . $:case) ("elseif" . $:elseif) ("clear" . $:clear)
    ("global" . $:global) ("return" . $:return) ("otherwise" . $:otherwise)
    ("switch" . $:switch) ("else" . $:else) ("if" . $:if) ("while" . $:while)
    ("for" . $:for) ("," . $:,) (")" . #{$:\x29;}#) ("(" . #{$:\x28;}#)
    ("=" . $:=) ("]" . #{$:\x5d;}#) ("[" . #{$:\x5b;}#)
    ("function" . $:function) (#\newline . #\newline) ("end" . $:end)
    ($end . $end)))
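The reason for handing the parser a fresh lexer on every call can be shown with closures. The sketch below is a toy under stated assumptions, not nyacc's implementation: `make-toy-lexer-generator` and `lex-string` are hypothetical names, tokens are single characters, and a line counter stands in for the beginning-of-line state mentioned in the message.

```scheme
;; Hypothetical stand-in for make-lexer-generator: takes a match table
;; and returns a procedure that creates lexers.
(define (make-toy-lexer-generator mtab)
  (lambda ()                          ; the lexer generator
    (let ((line 1))                   ; state private to one lexer
      (lambda ()                      ; the lexer: one token per call
        (let loop ((ch (read-char)))
          (cond
           ((eof-object? ch)
            (list (assq-ref mtab '$end) "" line))
           ((char=? ch #\newline)
            (set! line (+ line 1))
            (loop (read-char)))
           ((char=? ch #\space)
            (loop (read-char)))
           (else
            (let ((text (string ch)))
              (list (or (assoc-ref mtab text) (assq-ref mtab '$ident))
                    text
                    line)))))))))

;; Drain LEXER over the text STR, collecting (token text line) triples
;; up to and including the END token.
(define (lex-string lexer str)
  (with-input-from-string str
    (lambda ()
      (let loop ((acc '()))
        (let ((tok (lexer)))
          (if (eq? (car tok) 'END)
              (reverse (cons tok acc))
              (loop (cons tok acc))))))))

(define mtab '(("+" . PLUS) ($ident . IDENT) ($end . END)))
(define gen-lexer (make-toy-lexer-generator mtab))

;; Fresh lexers: each run starts counting at line 1.
(define first-run  (lex-string (gen-lexer) "a\n+"))
(define second-run (lex-string (gen-lexer) "a\n+"))

;; A reused lexer carries its state into the next input.
(define stale (gen-lexer))
(lex-string stale "a\n+")
(define stale-run (lex-string stale "b"))

(write first-run) (newline)
;; prints ((IDENT "a" 1) (PLUS "+" 2) (END "" 2))
(write stale-run) (newline)
;; prints ((IDENT "b" 2) (END "" 2))
```

The reused lexer reports "b" on line 2 although it is the first line of the new input, which is the kind of leftover state that calling (gen-lexer) for every parse avoids.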