my current project: new lalr module

unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed

From: Matt Wette <matthew.wette@verizon.net>
To: guile-devel@gnu.org
Subject: my current project: new lalr module
Date: Wed, 03 Jun 2015 20:37:17 -0700	[thread overview]
Message-ID: <E45D6E7C-3864-4CEB-91DE-2560CA145BF1@verizon.net> (raw)

[-- Attachment #1: Type: text/plain, Size: 3867 bytes --]

Having fun with all the great code, and documentation, in guile 2. 

My current project is a lalr module. Last December I wanted to start working on an interpreter for modelica in guile.  Guile has a lalr parser generator (system base lalr).   When I started looking at it, I was not crazy about the input language. I looked to see if a better interface could be pasted on, but the code impenetrable for me.  So I started coding up my own LALR parser generator.  I was hoping it would take a month of nights/weekends but ended up taking me close to four months.  For now I am calling my implementation "lalr1".

A major difference between my lalr1 and lalr in guile is that the guile version is a translation of the C-coded bison tool and mine is a from-scratch implementation in Scheme based on algorithms from the Dragon Book.  (The bison implementation uses the DeRemer-Pennello algorithms, yacc does not.  I'd guess the bison-based implementation will generate a parser generator faster, but I think that is a minor advantage.)

Some features of lalr1:
0) It's implemented with syntax-rules and syntax-case (versus define-macro for lalr).

1) There is no specification of terminals.  Terminals are detected in rules (via fenders) as quoted symbols or characters.
    Example of terminals:  'number, #\+

2) Actions are specified as ($$ action ...).  (Identifiers beginning with $ are reserved.) 

3)  Mid-rule actions (Bison Manual section 3.4.8 ) are supported.  (Not supported by lalr in guile.)

4) Some handy "proxy" macros are included.  The macro expression results in a generated symbol and additional productions.  Examples:
    ($? a b c) will generate symbol $Pk and production ($Pk () (a b c)) 
    ($* a b c) will generate symbol $Pk and production ($Pk () ($Pk a b c)) 
    ($+ a b c) will generate symbol $Pk and production ($Pk (a b c) ($Pk a b c))
   where $Pk will be $P0, $P1, ...

5) It can generate Bison and guile lalr input languages from lalr1 spec's.  (I have been using the Bison converter to debug lookahead generation.)

6) There is a macro to generate lalr1 input language from a lalr-parser specification.

7) Table compaction is optional.  (The current compaction procedure will generate a default action if max duplication is more than 3.)

8) The bulk of the code resides in one module of ~ 1500 lines of code and ~ 500 lines of comments.

Here is a partial specification from the C-parser I am working on.  It is being implemented with attributed-grammar semantics so that I can process code with functions from the SXML modules.  For example, I am using foldts to detect typedefs need to feed back to the lexer.)

(define clang-spec
  (lalr-spec
   (start translation-unit-proxy)
   (grammar

    (translation-unit-proxy (translation-unit ($$ (tl->list $1))))

    (translation-unit
     (external-declaration ($$ (make-tl 'trans-unit $1)))
     (translation-unit external-declaration ($$ (tl-app! $1 $2)))
     )

    (external-declaration
     (function-definition)
     (declaration)
     )
   ...
    (declaration
     (declaration-specifier init-declarator-list  ($$ (...)) #\;  ;; <=mid-rule action to capture typedef's
	  ($$ (list 'decl (tl->list $1) (tl->list $2)))))
     (declaration-specifier #\;
          ($$ (list 'decl (tl->list $1))
     )
    ....
    (direct-declarator
     ('ident ($$ (list 'ident $1)))
     (#\( declarator #\) ($$ (list 'scope $2)))
     (direct-declarator #\[ constant-expression #\] ($$ (list 'array $3 $1)))
     (direct-declarator #\[ #\] ($$ (list 'array $3 $1)))
     (direct-declarator #\( parameter-type-list #\) ($$ (list 'function $1 $3)))
     (direct-declarator #\( identifier-list #\) ($$ (list 'function $1 $3)))
     (direct-declarator #\( #\) ($$ (list 'function $1 $3)))
     )
   ....

Matt

[-- Attachment #2: Type: text/html, Size: 6633 bytes --]

                 reply	other threads:[~2015-06-04  3:37 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E45D6E7C-3864-4CEB-91DE-2560CA145BF1@verizon.net \
    --to=matthew.wette@verizon.net \
    --cc=guile-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).