From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Matt Wette Newsgroups: gmane.lisp.guile.devel Subject: my current project: new lalr module Date: Wed, 03 Jun 2015 20:37:17 -0700 Message-ID: NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Content-Type: multipart/alternative; boundary="Apple-Mail=_CAB7AD02-ABA3-4DD4-9851-2D28BE97CF37" X-Trace: ger.gmane.org 1433393524 12381 80.91.229.3 (4 Jun 2015 04:52:04 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 4 Jun 2015 04:52:04 +0000 (UTC) To: guile-devel@gnu.org Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Thu Jun 04 06:51:57 2015 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Z0N81-0006c3-Ak for guile-devel@m.gmane.org; Thu, 04 Jun 2015 06:51:41 +0200 Original-Received: from localhost ([::1]:40137 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z0N80-00088h-ST for guile-devel@m.gmane.org; Thu, 04 Jun 2015 00:51:40 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:36412) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z0Lyp-0005QU-F4 for guile-devel@gnu.org; Wed, 03 Jun 2015 23:38:09 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z0Lyk-0000ZF-DO for guile-devel@gnu.org; Wed, 03 Jun 2015 23:38:07 -0400 Original-Received: from vms173021pub.verizon.net ([206.46.173.21]:26914) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z0Lyk-0000Xb-6L for guile-devel@gnu.org; Wed, 03 Jun 2015 23:38:02 -0400 Original-Received: from [192.168.2.127] ([71.108.227.241]) by vms173021.mailsrvcs.net (Oracle Communications Messaging Server 7.0.5.32.0 64bit (built Jul 16 2014)) with ESMTPA id <0NPE00FO4I2D6P40@vms173021.mailsrvcs.net> for guile-devel@gnu.org; Wed, 03 Jun 2015 22:37:28 -0500 (CDT) X-CMAE-Score: 0 X-CMAE-Analysis: v=2.1 cv=S6gku9YP c=1 sm=1 tr=0 a=C7TgS9Pb229lFodQust5qw==:117 a=o1OHuDzbAAAA:8 a=oR5dmqMzAAAA:8 a=-9mUelKeXuEA:10 a=XAFQembCKUMA:10 a=xNt5mQs2o9PMavvHCdAA:9 a=CjuIK1q_8ugA:10 a=84jDh8sh7gN3EfJBSpgA:9 a=axxJ3iXRi21RHSXD:21 a=_W_S_7VecoQA:10 X-Mailer: Apple Mail (2.1878.6) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 206.46.173.21 X-Mailman-Approved-At: Thu, 04 Jun 2015 00:51:35 -0400 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:17731 Archived-At: --Apple-Mail=_CAB7AD02-ABA3-4DD4-9851-2D28BE97CF37 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Having fun with all the great code, and documentation, in guile 2.=20 My current project is a lalr module. Last December I wanted to start = working on an interpreter for modelica in guile. Guile has a lalr = parser generator (system base lalr). When I started looking at it, I = was not crazy about the input language. I looked to see if a better = interface could be pasted on, but the code impenetrable for me. So I = started coding up my own LALR parser generator. I was hoping it would = take a month of nights/weekends but ended up taking me close to four = months. For now I am calling my implementation "lalr1". A major difference between my lalr1 and lalr in guile is that the guile = version is a translation of the C-coded bison tool and mine is a = from-scratch implementation in Scheme based on algorithms from the = Dragon Book. (The bison implementation uses the DeRemer-Pennello = algorithms, yacc does not. I'd guess the bison-based implementation = will generate a parser generator faster, but I think that is a minor = advantage.) Some features of lalr1: 0) It's implemented with syntax-rules and syntax-case (versus = define-macro for lalr). 1) There is no specification of terminals. Terminals are detected in = rules (via fenders) as quoted symbols or characters. Example of terminals: 'number, #\+ 2) Actions are specified as ($$ action ...). (Identifiers beginning = with $ are reserved.)=20 3) Mid-rule actions (Bison Manual section 3.4.8 ) are supported. (Not = supported by lalr in guile.) 4) Some handy "proxy" macros are included. The macro expression results = in a generated symbol and additional productions. Examples: ($? a b c) will generate symbol $Pk and production ($Pk () (a b c))=20= ($* a b c) will generate symbol $Pk and production ($Pk () ($Pk a b = c))=20 ($+ a b c) will generate symbol $Pk and production ($Pk (a b c) ($Pk = a b c)) where $Pk will be $P0, $P1, ... 5) It can generate Bison and guile lalr input languages from lalr1 = spec's. (I have been using the Bison converter to debug lookahead = generation.) 6) There is a macro to generate lalr1 input language from a lalr-parser = specification. 7) Table compaction is optional. (The current compaction procedure will = generate a default action if max duplication is more than 3.) 8) The bulk of the code resides in one module of ~ 1500 lines of code = and ~ 500 lines of comments. Here is a partial specification from the C-parser I am working on. It = is being implemented with attributed-grammar semantics so that I can = process code with functions from the SXML modules. For example, I am = using foldts to detect typedefs need to feed back to the lexer.) (define clang-spec (lalr-spec (start translation-unit-proxy) (grammar (translation-unit-proxy (translation-unit ($$ (tl->list $1)))) (translation-unit (external-declaration ($$ (make-tl 'trans-unit $1))) (translation-unit external-declaration ($$ (tl-app! $1 $2))) ) =20 (external-declaration (function-definition) (declaration) ) ... (declaration (declaration-specifier init-declarator-list ($$ (...)) #\; ;; = <=3Dmid-rule action to capture typedef's ($$ (list 'decl (tl->list $1) (tl->list $2))))) (declaration-specifier #\; ($$ (list 'decl (tl->list $1)) ) .... (direct-declarator ('ident ($$ (list 'ident $1))) (#\( declarator #\) ($$ (list 'scope $2))) (direct-declarator #\[ constant-expression #\] ($$ (list 'array $3 = $1))) (direct-declarator #\[ #\] ($$ (list 'array $3 $1))) (direct-declarator #\( parameter-type-list #\) ($$ (list 'function = $1 $3))) (direct-declarator #\( identifier-list #\) ($$ (list 'function $1 = $3))) (direct-declarator #\( #\) ($$ (list 'function $1 $3))) ) .... Matt --Apple-Mail=_CAB7AD02-ABA3-4DD4-9851-2D28BE97CF37 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii Having = fun with all the great code, and documentation, in guile = 2. 

My current project is a lalr module. Last = December I wanted to start working on an interpreter for modelica in = guile.  Guile has a lalr parser generator (system base lalr). =   When I started looking at it, I was not crazy about the input = language. I looked to see if a better interface could be pasted on, but = the code impenetrable for me.  So I started coding up my own LALR = parser generator.  I was hoping it would take a month of = nights/weekends but ended up taking me close to four months.  For = now I am calling my implementation "lalr1".

A major = difference between my lalr1 and lalr in guile is that the guile version = is a translation of the C-coded bison tool and mine is a from-scratch = implementation in Scheme based on algorithms from the Dragon Book. =  (The bison implementation uses the DeRemer-Pennello algorithms, = yacc does not.  I'd guess the bison-based implementation will = generate a parser generator faster, but I think that is a minor = advantage.)

Some features of = lalr1:
0) It's implemented with syntax-rules and syntax-case = (versus define-macro for lalr).

1) There is no = specification of terminals.  Terminals are detected in rules (via = fenders) as quoted symbols or characters.
    = Example of terminals:  'number, #\+

2) = Actions are specified as ($$ action ...).  (Identifiers beginning = with $ are reserved.) 

3)  Mid-rule = actions (Bison Manual section 3.4.8 ) are supported.  (Not = supported by lalr in guile.)

4) Some handy = "proxy" macros are included.  The macro expression results in a = generated symbol and additional productions. =  Examples:
    ($? a b c) will generate symbol = $Pk and production ($Pk () (a b c)) 
    ($* a = b c) will generate symbol $Pk and production ($Pk () ($Pk a b = c)) 
    ($+ a b c) will generate symbol $Pk = and production ($Pk (a b c) ($Pk a b c))
   where = $Pk will be $P0, $P1, ...

5) It can generate = Bison and guile lalr input languages from lalr1 spec's.  (I have = been using the Bison converter to debug lookahead = generation.)

6) There is a macro to generate = lalr1 input language from a lalr-parser = specification.

7) Table compaction is optional. =  (The current compaction procedure will generate a default action = if max duplication is more than 3.)

8) The bulk = of the code resides in one module of ~ 1500 lines of code and ~ 500 = lines of comments.

Here is a partial = specification from the C-parser I am working on.  It is being = implemented with attributed-grammar semantics so that I can process code = with functions from the SXML modules.  For example, I am using = foldts to detect typedefs need to feed back to the = lexer.)

(define = clang-spec
  = (lalr-spec
   (start = translation-unit-proxy)
  =  (grammar

  =   (translation-unit-proxy (translation-unit ($$ (tl->list = $1))))

    (translation-unit
     (external-declaration ($$ (make-tl = 'trans-unit $1)))
    =  (translation-unit external-declaration ($$ (tl-app! $1 = $2)))
    =  )
  =   
    = (external-declaration
  =    (function-definition)
    =  (declaration)
  =    )
   ...
    (declaration
     (declaration-specifier init-declarator-list =  ($$ = (...)) #\;  ;; = <=3Dmid-rule action to capture typedef's
=   ($$ (list 'decl (tl->list $1) (tl->list = $2)))))
    =  (declaration-specifier #\;
          = ($$ (list 'decl = (tl->list $1))
    =  )
    = ....
    = (direct-declarator
    =  ('ident ($$ (list 'ident $1)))
     (#\( declarator #\) ($$ (list = 'scope $2)))
    =  (direct-declarator #\[ constant-expression #\] ($$ (list 'array $3 = $1)))
    =  (direct-declarator #\[ #\] ($$ (list 'array $3 = $1)))
    =  (direct-declarator #\( parameter-type-list #\) ($$ (list 'function = $1 $3)))
    =  (direct-declarator #\( identifier-list #\) ($$ (list 'function $1 = $3)))
    =  (direct-declarator #\( #\) ($$ (list 'function $1 = $3)))
    =  )
  =  ....

Matt

<= br>


= --Apple-Mail=_CAB7AD02-ABA3-4DD4-9851-2D28BE97CF37--