unofficial mirror of guile-user@gnu.org 
* nyacc 0.65.0 released
@ 2015-12-29 21:01 Matt Wette
  2015-12-30 14:35 ` Ludovic Courtès
  2015-12-31  4:11 ` Nala Ginrut
  0 siblings, 2 replies; 9+ messages in thread
From: Matt Wette @ 2015-12-29 21:01 UTC (permalink / raw)
  To: guile-user

nyacc version 0.65.0 is released as beta

nyacc is a LALR parser generator written from the ground up in guile

Features/Updates:
* clean scheme-flavored syntax for grammar specification 
* updated documentation (but still rough draft)
* prototype parsers for c, javascript, matlab that output parse trees in a SXML format
* partial sxml-parse-tree to il-tree conversion for javascript and C
* partial sxml-parse-tree pretty printer for javascript and C

Demo:
Use C parser and pretty printer to clean up C expressions (e.g., remove unneeded paren’s):

(use-modules (nyacc lang c99 pprint))
(use-modules (nyacc lang c99 xparser))
(use-modules (ice-9 pretty-print))

(let* ((st0 "(int)(((((foo_t*)0)->x)->y)->z)")
       (sx0 (parse-cx st0 #:tyns '("foo_t")))
       (st1 (with-output-to-string (lambda () (pretty-print-c99 sx0))))
       (sx1 (parse-cx st1 #:tyns '("foo_t"))))
  (simple-format #t "~S => \n" st0)
  (pretty-print sx0 #:per-line-prefix " ")
  (simple-format #t "==[pretty-print-c99]==>\n")
  (simple-format #t "~S =>\n" st1)
  (pretty-print sx1 #:per-line-prefix " "))

"(int)(((((foo_t*)0)->x)->y)->z)" => 
 (cast (type-name
         (decl-spec-list (type-spec (fixed-type "int"))))
       (i-sel (ident "z")
              (i-sel (ident "y")
                     (i-sel (ident "x")
                            (cast (type-name
                                    (decl-spec-list
                                      (type-spec (typename "foo_t")))
                                    (abs-declr (pointer)))
                                  (p-expr (fixed "0")))))))
==[pretty-print-c99]==>
"(int)((foo_t*)0)->x->y->z" =>
 (cast (type-name
         (decl-spec-list (type-spec (fixed-type "int"))))
       (i-sel (ident "z")
              (i-sel (ident "y")
                     (i-sel (ident "x")
                            (cast (type-name
                                    (decl-spec-list
                                      (type-spec (typename "foo_t")))
                                    (abs-declr (pointer)))
                                  (p-expr (fixed "0")))))))

http://download.savannah.gnu.org/releases/nyacc/
or
git clone git://git.savannah.nongnu.org/nyacc.git



* Re: nyacc 0.65.0 released
  2015-12-29 21:01 nyacc 0.65.0 released Matt Wette
@ 2015-12-30 14:35 ` Ludovic Courtès
  2015-12-30 16:54   ` Matt Wette
  2015-12-31  4:11 ` Nala Ginrut
  1 sibling, 1 reply; 9+ messages in thread
From: Ludovic Courtès @ 2015-12-30 14:35 UTC (permalink / raw)
  To: guile-user

Hi!

Matt Wette <matthew.wette@verizon.net> skribis:

> nyacc version 0.65.0 is released as beta
>
> nyacc is a LALR parser generator written from the ground up in guile
>
> Features/Updates:
> * clean scheme-flavored syntax for grammar specification 
> * updated documentation (but still rough draft)
> * prototype parsers for c, javascript, matlab that output parse trees in a SXML format
> * partial sxml-parse-tree to il-tree conversion for javascript and C
> * partial sxml-parse-tree pretty printer for javascript and C
>
> Demo:
> Use C parser and pretty printer to clean up C expressions (e.g., remove unneeded paren’s):

The demo is already quite impressive!

What subset of C99 and GNU99 is currently supported?

What are you aiming for with the tree-il conversion?  That sounds
interesting.  :-)

Thanks,
Ludo’.





* Re: nyacc 0.65.0 released
  2015-12-30 14:35 ` Ludovic Courtès
@ 2015-12-30 16:54   ` Matt Wette
  2015-12-30 17:38     ` Ludovic Courtès
  2015-12-30 20:58     ` Christopher Allan Webber
  0 siblings, 2 replies; 9+ messages in thread
From: Matt Wette @ 2015-12-30 16:54 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-user


On Dec 30, 2015, at 6:35 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>> Demo:
>> Use C parser and pretty printer to clean up C expressions (e.g., remove unneeded paren’s):
> 
> The demo is already quite impressive!

Thanks.

> What subset of C99 and GNU99 is currently supported?

I can’t remember if any C99 items are not in.  I do not have any GNU99 extensions in there.  However, this is not a faithful parser for C99.  The purpose is not to compile, but to generate translators and auto-coders.  I have built the parser with the following features:
1) The c-preprocessor parsing is merged in with the lexical analyzer.
2) The parse tree is “file oriented”, so that you can track what elements are in the top-level file.
3) The parser will parse comments.

so
#ifdef ABC
#include "xyz.h"
#endif
foo_t x;   /* this is a decl */

comes out
(trans-unit
  (cpp-stmt (if (defined "ABC")))
  (cpp-stmt
    (include
      "\"xyz.h\""
      (trans-unit
        (decl (decl-spec-list
                (stor-spec (typedef))
                (type-spec (fixed-type "int")))
              (init-declr-list (init-declr (ident "foo_t")))))))
  (cpp-stmt (endif))
  (decl (decl-spec-list (type-spec (typename "foo_t")))
        (init-declr-list (init-declr (ident "x")))
        (comment " this is a decl ")))

and the pretty-printer generates
#if defined(ABC)
#include "xyz.h"
#endif
foo_t x; /* this is a decl */
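
The "file oriented" layout can be illustrated with plain list operations on the tree shown above. This is a minimal sketch, not part of nyacc: the helper names `top-level-decls` and `decl-name` are hypothetical, and the tree literal is copied from the parser output in this message. Declarations that arrive through `#include` stay nested under `cpp-stmt`, so scanning only the direct children of the outer `trans-unit` picks out what the top-level file itself declares.

```scheme
(use-modules (srfi srfi-1))

;; Parse tree copied from the example output above.
(define tree
  '(trans-unit
    (cpp-stmt (if (defined "ABC")))
    (cpp-stmt
     (include
      "\"xyz.h\""
      (trans-unit
       (decl (decl-spec-list
              (stor-spec (typedef))
              (type-spec (fixed-type "int")))
             (init-declr-list (init-declr (ident "foo_t")))))))
    (cpp-stmt (endif))
    (decl (decl-spec-list (type-spec (typename "foo_t")))
          (init-declr-list (init-declr (ident "x")))
          (comment " this is a decl "))))

;; Hypothetical helper: direct children of the outer trans-unit tagged
;; `decl'.  Declarations from included files are nested deeper and skipped.
(define (top-level-decls tu)
  (filter (lambda (node) (and (pair? node) (eq? (car node) 'decl)))
          (cdr tu)))

;; Hypothetical helper: name declared by a decl node of the shape above.
(define (decl-name decl)
  (let* ((declr-list (assq 'init-declr-list (cdr decl)))
         (init-declr (assq 'init-declr (cdr declr-list)))
         (ident (assq 'ident (cdr init-declr))))
    (cadr ident)))

(write (map decl-name (top-level-decls tree)))
(newline)
```

Here only `x` is reported; the `foo_t` typedef from xyz.h is left out because it sits inside the `include` node.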



> What are you aiming for with the tree-il conversion?  That sounds
> interesting.  :-)

The purpose of the tree-il conversion is to support “,L <language>” at the guile prompt.  And the tree-il code is done for javascript and a simple calculator, not C.  I just tried the javascript and it is not working, but here is a small demo with the simple calculator.  The calc files (parser, compiler and spec) are included below.

scheme@(guile-user)> ,L calc
Happy hacking with calc!  To switch back, type `,L scheme'.
calc@(guile-user)> a = 3 + 4
calc@(guile-user)> ,L scheme
Happy hacking with Scheme!  To switch back, type `,L calc'.
scheme@(guile-user)> a
$1 = 7
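
For readers unfamiliar with tree-il, the effect of the `a = 3 + 4` statement in the session above can be approximated by building the tree-il by hand and compiling it. This is a sketch under assumptions, not output from nyacc: the external form is written the way the calc compiler in this message would emit it, and the `app` variable is a hypothetical workaround for the fact that Guile 2.0 spells procedure application `apply` in tree-il while later versions spell it `call`.

```scheme
(use-modules (system base compile)
             (language tree-il))

;; Assumption: Guile 2.0 uses `apply' for application in tree-il's
;; external form; Guile 2.2 and later use `call'.
(define app (if (string=? (effective-version) "2.0") 'apply 'call))

;; External tree-il for the calc statement "a = 3 + 4".
(define il
  `(define a (,app (toplevel +) (const 3) (const 4))))

;; Parse the external form into tree-il records, then compile and run it
;; in the current module, which defines `a' there.
(compile (parse-tree-il il)
         #:from 'tree-il
         #:to 'value
         #:env (current-module))

(write (module-ref (current-module) 'a))
(newline)
```

The toplevel `define` is what makes `a` visible after switching back to Scheme with `,L scheme`.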


PARSER:
(define-module (nyacc lang calc parser)
  #:export (calc-parse calc-spec calc-mach)
  #:use-module (nyacc lalr)
  #:use-module (nyacc lex)
  #:use-module (nyacc parse)
  )

(define calc-spec
  (lalr-spec
   (prec< (left "+" "-") (left "*" "/"))
   (start stmt-list-proxy)
   (grammar

    (stmt-list-proxy
     (stmt-list "\n" ($$ `(stmt-list ,@(reverse $1)))))

    (stmt-list
     (stmt ($$ (list $1)))
     (stmt-list ";" stmt ($$ (cons $3 $1))))

    (stmt
     (ident "=" expr ($$ `(assn-stmt ,$1 ,$3)))
     (expr ($$ `(expr-stmt ,$1)))
     ( ($$ '(empty-stmt))))

    (expr
     (expr "+" expr ($$ `(add ,$1 ,$3)))
     (expr "-" expr ($$ `(sub ,$1 ,$3)))
     (expr "*" expr ($$ `(mul ,$1 ,$3)))
     (expr "/" expr ($$ `(div ,$1 ,$3)))
     ('$fixed ($$ `(fixed ,$1)))
     ('$float ($$ `(float ,$1)))
     ("(" expr ")" ($$ $2)))

    (ident ('$ident ($$ `(ident ,$1))))
    )))

(define calc-mach
  (compact-machine
   (hashify-machine
     (make-lalr-machine calc-spec))))

(define calc-parse
  (let ((gen-lexer (make-lexer-generator (assq-ref calc-mach 'mtab)
					 #:space-chars " \t"))
	(parser (make-lalr-ia-parser calc-mach)))
    (lambda* (#:key (debug #f)) (parser (gen-lexer) #:debug debug))))

COMPILER:
(define-module (nyacc lang calc compiler)
  #:export (calc-sxml->tree-il)
  #:use-module (sxml match)
  #:use-module (sxml fold)
  #:use-module (language tree-il))

(define (fup tree)
  (sxml-match tree
    ((fixed ,fx) `(const ,(string->number fx)))
    ((float ,fl) `(const ,(string->number fl)))
    ((ident ,id) `(toplevel ,(string->symbol id)))
    ((add ,lt ,rt) `(apply (toplevel +) ,lt ,rt))
    ((sub ,lt ,rt) `(apply (toplevel -) ,lt ,rt))
    ((mul ,lt ,rt) `(apply (toplevel *) ,lt ,rt))
    ((div ,lt ,rt) `(apply (toplevel /) ,lt ,rt))
    ((assn-stmt (toplevel ,lhs) ,rhs) `(define ,lhs ,rhs))
    ((empty-stmt) '(begin))
    ((stmt-list ,items ...) `(begin ,items ...))
    (,otherwise tree)))

(define (calc-sxml->tree-il exp env opts)
  (let* ((tree (foldt fup identity exp))
	 (code (parse-tree-il tree)))
    (values code env env)))


SPEC:
(define-module (language calc spec)
  #:export (calc)
  #:use-module (system base language)
  #:use-module (nyacc lang calc parser)
  #:use-module (nyacc lang calc compiler))

(define (calc-reader port env)
  (let ((iport (current-input-port)))
    (dynamic-wind
	(lambda () (set-current-input-port port))
	(lambda () (calc-parse #:debug #f))
	(lambda () (set-current-input-port iport)))))

(define-language calc
  #:title	"calc"
  #:reader	calc-reader
  #:compilers   `((tree-il . ,calc-sxml->tree-il))
  #:printer	write)



* Re: nyacc 0.65.0 released
  2015-12-30 16:54   ` Matt Wette
@ 2015-12-30 17:38     ` Ludovic Courtès
  2015-12-30 20:58     ` Christopher Allan Webber
  1 sibling, 0 replies; 9+ messages in thread
From: Ludovic Courtès @ 2015-12-30 17:38 UTC (permalink / raw)
  To: Matt Wette; +Cc: guile-user

Matt Wette <matthew.wette@verizon.net> skribis:

> I can’t remember if any C99 items are not in.  I do not have any GNU99 extensions in there.  However, this is not a faithful parser for C99.  The purpose is not to compile, but to generate translators and auto-coders.  I have built the parser with the following features:
> 1) The c-preprocessor parsing is merged in with the lexical analyzer.
> 2) The parse tree is “file oriented”, so that you can track what elements are in the top-level file.
> 3) The parser will parse comments.

OK.

Keeping things this way, with CPP parsing built in, is useful to
analyzers + pretty-printers, and also for semantic patching tools like
Coccinelle.

Ludo’.




* Re: nyacc 0.65.0 released
  2015-12-30 16:54   ` Matt Wette
  2015-12-30 17:38     ` Ludovic Courtès
@ 2015-12-30 20:58     ` Christopher Allan Webber
  2015-12-30 21:32       ` Matt Wette
  1 sibling, 1 reply; 9+ messages in thread
From: Christopher Allan Webber @ 2015-12-30 20:58 UTC (permalink / raw)
  To: Matt Wette; +Cc: Ludovic Courtès, guile-user

Matt Wette writes:

> On Dec 30, 2015, at 6:35 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>> Demo:
>>> Use C parser and pretty printer to clean up C expressions (e.g., remove unneeded paren’s):
>> 
>> The demo is already quite impressive!
>
> Thanks.

I agree.  Holy damn, this looks incredible.

>> What subset of C99 and GNU99 is currently supported?
>
> I can’t remember if any C99 items are not in.  I do not have any GNU99
> extensions in there.  However, this is not a faithful parser for C99.
> The purpose is not to compile, but to generate translators and
> auto-coders.  I have built the parser with the following features:
> 1) The c-preprocessor parsing is merged in with the lexical analyzer.
> 2) The parse tree is “file oriented”, so that you can track what
>    elements are in the top-level file.
> 3) The parser will parse comments.
>
> so
> #ifdef ABC
> #include "xyz.h"
> #endif
> foo_t x;   /* this is a decl */
>
> comes out
> (trans-unit
>   (cpp-stmt (if (defined "ABC")))
>   (cpp-stmt
>     (include
>       "\"xyz.h\""
>       (trans-unit
>         (decl (decl-spec-list
>                 (stor-spec (typedef))
>                 (type-spec (fixed-type "int")))
>               (init-declr-list (init-declr (ident "foo_t")))))))
>   (cpp-stmt (endif))
>   (decl (decl-spec-list (type-spec (typename "foo_t")))
>         (init-declr-list (init-declr (ident "x")))
>         (comment " this is a decl ")))
>
> and the pretty-printer generates
> #if defined(ABC)
> #include "xyz.h"
> #endif
> foo_t x; /* this is a decl */

Still super cool.

>> What are you aiming for with the tree-il conversion?  That sounds
>> interesting.  :-)
>
> The purpose of the tree-il conversion is to support “,L <language>” at
> the guile prompt.  And the tree-il code is done for javascript and a
> simple calculator, not C.  I just tried the javascript and it is not
> working, but here is a small demo with the simple calculator.  The
> calc files (parser, compiler and spec) are included below.
>
> scheme@(guile-user)> ,L calc
> Happy hacking with calc!  To switch back, type `,L scheme'.
> calc@(guile-user)> a = 3 + 4
> calc@(guile-user)> ,L scheme
> Happy hacking with Scheme!  To switch back, type `,L calc'.
> scheme@(guile-user)> a
> $1 = 7

That's really interesting.  Do you have more specific applications in
mind beyond this?




* Re: nyacc 0.65.0 released
  2015-12-30 20:58     ` Christopher Allan Webber
@ 2015-12-30 21:32       ` Matt Wette
  0 siblings, 0 replies; 9+ messages in thread
From: Matt Wette @ 2015-12-30 21:32 UTC (permalink / raw)
  To: guile-user



> On Dec 30, 2015, at 12:58 PM, Christopher Allan Webber <cwebber@dustycloud.org> wrote:
> 
> Matt Wette writes
>> The purpose of the tree-il conversion is to support “,L <language>” at
>> the guile prompt.  And the tree-il code is done for javascript and a
>> simple calculator, not C.  I just tried the javascript and it is not
>> working, but here is a small demo with the simple calculator.  The
>> calc files (parser, compiler and spec) are included below.
>> 
>> scheme@(guile-user)> ,L calc
>> Happy hacking with calc!  To switch back, type `,L scheme'.
>> calc@(guile-user)> a = 3 + 4
>> calc@(guile-user)> ,L scheme
>> Happy hacking with Scheme!  To switch back, type `,L calc'.
>> scheme@(guile-user)> a
>> $1 = 7
> 
> That's really interesting.  Do you have more specific applications in
> mind beyond this?

Well.  I wanted to go through the exercise of writing to the tree-il, but have no real goals as of yet.

Matt



* Re: nyacc 0.65.0 released
  2015-12-29 21:01 nyacc 0.65.0 released Matt Wette
  2015-12-30 14:35 ` Ludovic Courtès
@ 2015-12-31  4:11 ` Nala Ginrut
  2015-12-31 15:21   ` Matt Wette
  1 sibling, 1 reply; 9+ messages in thread
From: Nala Ginrut @ 2015-12-31  4:11 UTC (permalink / raw)
  To: Matt Wette; +Cc: guile-user

That's very nice! Thanks for all the work!
This may provide a new weapon for our multi-lang plan.
I also saw the C99 parser in your code; I think it would help in building a
better FFI code generator and in parsing C code directly.

How's the javascript part? Is it complete enough to cover ES6? And maybe
it's good enough to replace our current ecmascript frontend?

Writing a parser for an industrial language is painful; it's better to take
advantage of *.y files that other people have already written.
That would save our time for the IR and optimization parts.
I still don't understand the purpose of gram.y in your project.
Does nyacc use it?







* Re: nyacc 0.65.0 released
  2015-12-31  4:11 ` Nala Ginrut
@ 2015-12-31 15:21   ` Matt Wette
  2015-12-31 17:22     ` Matt Wette
  0 siblings, 1 reply; 9+ messages in thread
From: Matt Wette @ 2015-12-31 15:21 UTC (permalink / raw)
  To: guile-user

> On Dec 30, 2015, at 8:11 PM, Nala Ginrut <nalaginrut@gmail.com> wrote:
> 
> That's very nice! Thanks for all the work!
> This may provide a new weapon for our multi-lang plan.
> I also saw the C99 parser in your code; I think it would help in building a
> better FFI code generator and in parsing C code directly.
> 
> How's the javascript part? Is it complete enough to cover ES6? And maybe
> it's good enough to replace our current ecmascript frontend?

I generated my lalr-spec from the ecmascript specification (ECMA-262).  However, the grammar in the spec is not LALR.  I would like to extend nyacc to handle this specification; the idea is to be able to prune certain productions.  Also, in my opinion, implementing ecmascript to specification would be tough in guile for two reasons:
1) in ecmascript code points are 16 bits, in guile 32 bits
2) in ecmascript a number is a 64 bit number where certain bit patterns are to be interpreted as ints and others as floats
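
Both mismatches can be observed from Guile itself. The following is a small sketch, not taken from nyacc; the variable names are illustrative. It assumes only that ECMAScript strings are sequences of 16-bit UTF-16 code units and that ECMAScript numbers are IEEE 754 doubles, both of which ECMA-262 specifies.

```scheme
(use-modules (rnrs bytevectors))

;; U+1D11E (musical symbol G clef) lies outside the Basic Multilingual Plane.
(define clef (string (integer->char #x1D11E)))

;; Guile sees a single character, i.e. one full code point ...
(define guile-length (string-length clef))

;; ... while its UTF-16 encoding needs a surrogate pair, i.e. two 16-bit
;; code units, which is what an ECMAScript string length would count.
(define utf16-units (/ (bytevector-length (string->utf16 clef)) 2))

;; 2^53 + 1 is not representable as a double, so the flonum sum rounds
;; back to 2^53.  Guile's exact integers keep the two values distinct.
(define two53 9007199254740992)
(define exact-differs? (not (= (+ two53 1) two53)))
(define flonum-collapses?
  (= (+ (exact->inexact two53) 1.0) (exact->inexact two53)))

(write (list guile-length utf16-units exact-differs? flonum-collapses?))
(newline)
```

A faithful implementation would therefore have to emulate UTF-16 string indexing and double-only arithmetic on top of Guile's native strings and numeric tower.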

> Writing a parser for an industrial language is painful; it's better to take
> advantage of *.y files that other people have already written.

I’m not sure I get what you mean here.  I started with some C99 .y files on the net, but I really wanted to be able to go back to a documented specification.  I then went to the grammar spec in Harbison and Steele, “C: A Reference Manual,” 5th ed., but found that the book had many gross errors in the grammar specification.  So I went to some draft specs I found on the net.  For the javascript I went to the ECMA-262 specification.


> That would save our time for the IR and optimization parts.

I’m not sure what you are getting at here.  Are you suggesting adopting some other intermediate form?  I wanted something simple to work with.  The feature I really like about using SXML for the parse trees is that I can then use ice-9 pretty-print to see the structure.  It has been really helpful in my projects derived from nyacc.

> I still don't understand the purpose of gram.y in your project.
> Does nyacc use it?

The gram.y file is generated from the nyacc/export module.  It has no direct use in the parsers.  I have run this file through “bison -r all” to generate the machine description in order to validate my parsers (by hand).  A feature that would be nice is to use the xml-format of bison to generate an automated validator.  One possible issue here is that bison token precedence is totally ordered whereas nyacc is partially ordered.
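
The difference between a total and a partial precedence order can be sketched in a few lines. Everything here is hypothetical and for illustration only: the `chains` representation and the `prec-compare` procedure are not nyacc's internal data structures or API. Each chain lists precedence levels from lowest to highest, and two tokens are comparable only if some chain mentions both.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical data: two independent precedence chains.
(define chains
  '((("+" "-") ("*" "/"))     ; additive below multiplicative
    (("==") ("<" ">"))))      ; equality below relational

;; Index of the level holding TOKEN within CHAIN, or #f if absent.
(define (level-of token chain)
  (list-index (lambda (level) (member token level)) chain))

;; Compare two tokens: 'lt, 'eq or 'gt, or #f when no chain relates them.
(define (prec-compare a b)
  (any (lambda (chain)
         (let ((la (level-of a chain))
               (lb (level-of b chain)))
           (and la lb
                (cond ((< la lb) 'lt)
                      ((> la lb) 'gt)
                      (else 'eq)))))
       chains))

(write (list (prec-compare "+" "*")    ; related within the first chain
             (prec-compare "+" "-")    ; same level
             (prec-compare "+" "<")))  ; unrelated: no chain holds both
(newline)
```

In a totally ordered scheme such as bison's, the last comparison must have an answer, so an exported grammar has to pick some ordering that the partial order never stated.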

Matt





* Re: nyacc 0.65.0 released
  2015-12-31 15:21   ` Matt Wette
@ 2015-12-31 17:22     ` Matt Wette
  0 siblings, 0 replies; 9+ messages in thread
From: Matt Wette @ 2015-12-31 17:22 UTC (permalink / raw)
  To: guile-user

One other thing to mention on why I think SXML is good for generating the parse trees:
I believe SXML can provide a natural way to use attribute grammar semantics.   So, if an expression comes out of the parser as
	(expr (add (ident "x") (ident "y")))
then after semantic analysis it may look something like
	(expr (@ (type "double"))
	      (add (@ (type "double") (lt "int") (rt "double"))
	           (ident (@ (type "int")) "x")
	           (ident (@ (type "double")) "y")))
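
A toy version of such an annotation pass can be written with `(ice-9 match)`. This is a minimal sketch, not nyacc code: the symbol table `env`, the `promote` rule and the `annotate` procedure are all assumptions made for illustration, and only the three node kinds from the example are handled.

```scheme
(use-modules (ice-9 match))

;; Hypothetical symbol table mapping identifier names to C types.
(define env '(("x" . "int") ("y" . "double")))

;; Read the `type' attribute back out of an annotated node.
(define (node-type node)
  (match node
    ((_ ('@ . attrs) . _) (cadr (assq 'type attrs)))))

;; Toy promotion rule: double wins, otherwise int.
(define (promote lt rt)
  (if (or (string=? lt "double") (string=? rt "double")) "double" "int"))

;; Bottom-up pass that attaches synthesized attributes as SXML (@ ...) lists.
(define (annotate tree)
  (match tree
    (('ident name)
     `(ident (@ (type ,(assoc-ref env name))) ,name))
    (('add lhs rhs)
     (let* ((l (annotate lhs))
            (r (annotate rhs))
            (lt (node-type l))
            (rt (node-type r)))
       `(add (@ (type ,(promote lt rt)) (lt ,lt) (rt ,rt)) ,l ,r)))
    (('expr e)
     (let ((a (annotate e)))
       `(expr (@ (type ,(node-type a))) ,a)))))

(write (annotate '(expr (add (ident "x") (ident "y")))))
(newline)
```

Because the attributes live in the standard SXML `@` position, the annotated tree stays valid SXML and can still be inspected with ice-9 pretty-print.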

Matt




