From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Mike Mattie Newsgroups: gmane.emacs.help Subject: Re: early preview of a recursive descent parser compiler implemented as a macro Date: Sun, 13 Jan 2008 20:23:42 -0800 Message-ID: <20080113202342.701f3d69@reforged> References: <20080108023436.714c4289@reforged> <20080111150738.773df4e4@reforged> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1824511807==" X-Trace: ger.gmane.org 1200285013 5770 80.91.229.12 (14 Jan 2008 04:30:13 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 14 Jan 2008 04:30:13 +0000 (UTC) To: emacs users ml Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Mon Jan 14 05:30:34 2008 Return-path: Envelope-to: geh-help-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1JEGxx-00070W-KJ for geh-help-gnu-emacs@m.gmane.org; Mon, 14 Jan 2008 05:30:31 +0100 Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1JEGxX-0008RU-RW for geh-help-gnu-emacs@m.gmane.org; Sun, 13 Jan 2008 23:30:03 -0500 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1JEGwq-0008L2-GA for help-gnu-emacs@gnu.org; Sun, 13 Jan 2008 23:29:20 -0500 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1JEGwo-0008Ji-Il for help-gnu-emacs@gnu.org; Sun, 13 Jan 2008 23:29:19 -0500 Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1JEGwo-0008JT-6i for help-gnu-emacs@gnu.org; Sun, 13 Jan 2008 23:29:18 -0500 Original-Received: from wa-out-1112.google.com ([209.85.146.177]) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1JEGwn-0004ms-BE for help-gnu-emacs@gnu.org; Sun, 13 Jan 2008 23:29:18 -0500 Original-Received: by wa-out-1112.google.com with SMTP id k34so3466077wah.10 for ; Sun, 13 Jan 2008 20:29:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:subject:message-id:in-reply-to:references:x-mailer:mime-version:content-type; bh=N2QeVxKm9YiySK0NjDjdbxqiBc2S16QAt0trsYZeb7g=; b=l/aJIsINTjELYPB+g+KUXV5OaZUs/I+obpgxmv2iZEYRk7vQDEVIS0F8VqL9MuE43RTwCxCeTSfyxPxjY/Vo5ANMFuVSHmC8oVDnQz9Z7rjIuWJ1V2yoeuke379TNrVEzy00umD4NTK5RhCKz2/XteRw7hOrrFrbsQpvy8s7io4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:subject:message-id:in-reply-to:references:x-mailer:mime-version:content-type; b=C5L8udrU03BoTXTJN0c3KDuIP+Xx+ZK7H8qkI0oqYtnP6IpL0z8cHYOLLPiEtNuTw++e0UmsEdV4N7AfYlJukpuxr30Yy+quWk3oiYX6De0jGUKxxTVrSi0vSoJHzpOu6FWydmoJAUPmUfuWjtE7W//rFHI4X/I8/DBB3bKShcE= Original-Received: by 10.114.89.1 with SMTP id m1mr2309567wab.77.1200284955318; Sun, 13 Jan 2008 20:29:15 -0800 (PST) Original-Received: from reforged ( [71.217.198.62]) by mx.google.com with ESMTPS id k24sm15609441waf.22.2008.01.13.20.29.13 (version=SSLv3 cipher=OTHER); Sun, 13 Jan 2008 20:29:13 -0800 (PST) In-Reply-To: <20080111150738.773df4e4@reforged> X-Mailer: Claws Mail 3.0.2 (GTK+ 2.12.1; i686-pc-linux-gnu) X-detected-kernel: by monty-python.gnu.org: Linux 2.6 (newer, 2) X-BeenThere: help-gnu-emacs@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Users list for the GNU Emacs text editor List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.help:50703 Archived-At: --===============1824511807== Content-Type: multipart/signed; boundary="Sig_/k.4S4T62e3p2jZ5nPBK7lu5"; protocol="application/pgp-signature"; micalg=PGP-SHA1 --Sig_/k.4S4T62e3p2jZ5nPBK7lu5 Content-Type: multipart/mixed; boundary="MP_/YV6fjump4LhE6gLKYoZe0R3" --MP_/YV6fjump4LhE6gLKYoZe0R3 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Fri, 11 Jan 2008 15:07:38 -0800 Mike Mattie wrote: > On Tue, 8 Jan 2008 02:34:36 -0800 > Mike Mattie wrote: >=20 > > Hello, > >=20 > > I recently found myself needing a top down parser, so I wrote up a > > parser compiler as a macro. It is turning out quite nicely for my > > purposes so I would like to submit it for comments and suggestions. > > I am still working out a couple of kinks, but it is at the point > > where I can start integrating feedback. > >=20 > > The goal is a lightweight parser for simple jobs such as scraping a > > data structure out of essentially static buffers, which is the > > problem I was solving when I developed it. > >=20 > > I am in-lining the actual parser compiler, and attaching a testing > > file so people can play with it if they find it interesting. > >=20 > > Status: > >=20 > > It's my first medium sized macro so it may have some thinko's or > > oddities. criticism welcome. > >=20 > > Simple tests work. The backtracking hasn't been extensively tested > > yet. The and form doesn't seem to return the full AST it should. > > Other than that all the basic bits are in place. >=20 > I reworked the runtime functions a bit and the and operator works > correctly now. test-parser.el has some minor changes so I have > included the latest version. quoting is fixed for the production symbols. > > The catch for semantic-error throws is not in place yet. I need > > something like (condition-case) however that construct seemed a > > little over-powered for my purpose. Suggestions on handling multiple > > error conditions in a single form would be appreciated. > My next priority is completing the error handling. I chose the condition-case route. I ended up writing up a quick and dirty d= efine-error which is evidently in XEmacs. Any advice for how to handle this compatibility iss= ue would be appreciated. After some good suggestions following review on IRC (thanks guys), I have c= leaned up the code for readability and re-organized it. > > TODO: > >=20 > > 1. add ? for optional productions. Not a big deal to implement once > > the basic implementation is sound. >=20 > > 2. Full Documentation >=20 > 3. add a (define ) list so that tokens and productions can be defined > without using them in place. It's implemented but not tested yet. > > Review Questions: > >=20 > > Do the Emacs developers consider this tool useful ? Does it fill a > > niche in Emacs ? If so I will document it. > >=20 > > Is it elegant ? > >=20 > > Is the terminology correct ? I am not entirely sure I used curry in > > the strict sense. This version cleans up the use of parsing terminology. > > Are there any essential features missing ? keeping in mind the goal > > is simplicity, not a fully optimized parser. > >=20 > > Is it interesting in a creative sense, or have I just ground out > > another lisp right of passage ? With some pointers from #emacs on freenode I found one similar work on On L= isp by Paul Graham. They are technical similarities of course, but the interface and implementation are quite different. Any other pointers to Related works would be greatly appreciated. =20 > > Examples: > >=20 > > A couple of contrived examples I have used for testing. A couple of > > things to note. Keywords are token|and|or. token defines a regex > > match, while or and and construct terminals. The definition of a > > token or terminal creates a match object at the point of definition. > >=20 > > There is a built-in start symbol that is a or-terminal, so the first > > example will match to the start symbol either token.=20 > >=20 > > (parser-compile test-parser > > (token whitespace "[[:blank:]]+") > > (token word "[[:alpha:]]+")) > >=20 > > The parser is compiled to the function value of test-parser. You can > > call it like this: (test-parser (point)) > >=20 > > It parses a single start production. Looping is done outside of the > > parser itself.=20 > >=20 > > as the second example shows you can reference previously created > > tokens. > >=20 > > (parser-compile test-parser > > (and indented (token whitespace "[[:blank:]]+") (token word > > "[[:alpha:]]+")) (and inverted word whitespace)) > >=20 > > The attached file test-parser.el has some test drivers for playing > > with the parser interactively. Some of the features for defining > > tokens such as defining your own actions for tokens are shown there. > >=20 > > P.S, here is a fun little lisp comic in the spirit of finding > > amusement in your work: http://xkcd.com/297/ > >=20 > > Without further ado here is parser.el , and enjoy ! > > >=20 Here is the third posting, v669. It will probably be the last mailing list = post. If I continue to publish it I will probably put it on EmacsWiki. I wanted t= o drop off a decently clean version of the code in case anybody stumbles across th= is thread and is interested. ;;---------------------------------------------------------------------- ;; parser.el ;; Primary Author: Mike Mattie (codermattie@gmail.com) ;; Copyright: Mike Mattie (2008) ;;---------------------------------------------------------------------- ;; ->Summary. ;; parser.el aims to be a lightweight implementation of Recursive ;; Descent parsing. The parser-compile macro is given a symbol and a ;; grammar definition. It returns a compiled parser bound to the ;; function of the symbol. ;; -> Application. ;; Building or extracting a nested data structure by parsing static ;; text. ;; -> Comparison with Emacs parsing facilities. ;; Emacs supports parsing in two forms: Regular Expressions and tools ;; like Syntax Tables geared towards using the parse results to ;; annotate buffers with text-properties and overlays. ;; This Dynamic Programming approach works well for overlaying ;; functions that interpret the meaning of text in the buffer, such as ;; syntax highlighting, on a paradigm of unstructured text. The ;; analysis is preserved when the buffer is edited at a character ;; level. ;; When you need to build an interface that rests entirely on the ;; parse analysis to the degree that the user or program does not ;; modify or traverse the buffer at a character level, this tool ;; simplifies construction of a nested data structure that maps tokens ;; to beginning and ending positions of the match. ;; -> Related Works ;; * CEDET. ;; CEDET appears to be a parser generator implemented on-top of a CLOS ;; emulation. This tool-set has developed towards the problem of ;; analyzing source code as a project which is bloated for simpler ;; requirements. ;; CEDET is likely to offer much higher performance than an un-optimized ;; recursive descent parser. If backtracking needs to be optimized ;; something like CEDET is a better choice than parser-compile. ;; ->Concept ;; A parser compiler that combines lexical (regex) analysis with recursive ;; descent parsing compiled by macro expansion. ;; The concept started with the idea to build a parser as nested cond ;; forms. The matching would go in the predicate part and the action ;; in the body. ;; This essential simplicity is still present in the design but the ;; implementation is now more complex with Match Functions replacing ;; the cond clause form. Match Functions allow previous definitions of ;; a token or rule to be referenced later in later rules. Backtracking ;; was another essential complexity once the and non-terminal was ;; added. ;; ->Requirements ;; 1. Easy to use for trivial problems, powerful enough to solve ;; most of them. ;; 2. Data structures are orthogonal to the parser and not hard-wired ;; in. Tokens can perform user defined functions or construct custom ;; data structures with the beginning and ending positions of the ;; match in the input. ;; ;; This allows the user to choose between positions, markers, overlays ;; according to the requirements. ;; ->Characteristics ;; parser-compile produces a no frills recursive descent parser. ;; ->Terminology ;; My reference for parsing terminology is the Dragon Book: ;; Compilers ;; Principles,Techniques,and Tools ;; Alfred V.Aho, Ravi Sethi, Jeffrey D.Ullman ;; 1986, Addison Wesley ;; ->TODO ;; 1. define list where Matches can be defined without inserting a ;; match at the definition point [implemented, but not tested] ;; 2. optional matching with ? for Match Function references. [easy] ;; 3. Canonical tree walk implemented as parser-ast-node. (require 'cl) ;; not defined in Gnu Emacs evidently. (defmacro define-error ( symbol message &rest isa-list ) "define a error symbol with a isa list and a error message" `(progn (put ',symbol 'error-conditions (append '(error ,symbol) ',isa-list)) (put ',symbol 'error-message ,message) )) ;; A recursive macro expansion would be nice for creating a hierarchy. (define-error parser-compile-error "parser error") (define-error parser-syntactic-error "syntactic error" parser-compile-er= ror) (define-error parser-semantic-error "semantic error" parser-compile-err= or) ;;---------------------------------------------------------------------- ;; Backtracking. ;;---------------------------------------------------------------------- ;; parser-position ;; The position of the parser in the input stream is maintained as a ;; stack of indexes into the buffer. As matching progresses the top of ;; the stack (car) is updated. The push and pop operations copy a ;; position to a next or previous stack position. backtrack discards ;; the top of the stack. (defsubst parser-pos () ;; tested "Return the current position of the parser in the buffer" (car parser-position)) (defun parser-push () "Copy the parser position to a new stack level so the parser can backtrac= k when necessary." (push (parser-pos) parser-position)) (defun parser-pop () "Copy the parser position to a previous stack level When the possibility = of a backtrack has been eliminated by matching." (let ((current (pop parser-position))) (setcar parser-position current) )) (defun parser-backtrack () "Restore the previous parser position by discarding the top of the parser= -position stack." (pop parser-position) (goto-char (parser-pos))) (defun parser-advance ( consumed ) "Advance the input position of the parser to the next un-matched characte= r: input consumed + 1." (if (> consumed 0) (lexical-let ((pos (+ 1 consumed (parser-pos)))) (progn (setcar parser-position pos) (goto-char pos))) )) (defun parser-consumed () "The number of input characters consumed by the token's match in the inpu= t." (- (match-end 0) (match-beginning 0))) ;;---------------------------------------------------------------------- ;; Match Result ;;---------------------------------------------------------------------- ;; The parser functions have a standard structure for returning the ;; result of a Match Function. ;; nil | (parser . AST ). ;; nil indicates failure to match. ;; A successful match has a parser and AST parts. The parser is ;; the ending position of the match relative to the starting ;; match position. ;; The AST part is created by a token action handler or a combination ;; operator like and. ;; token ( match-symbol . ( start . end ) | "user value" | "user function r= eturn value") ;; and ( match-symbol . "a list of Match Result AST parts" ) ;; This consistent return value structure allows token/and/or ;; match-functions to nest arbitrarily [except that tokens are ;; strictly leafs]. ;; My rationale for using inline functions is that I am essentially naming ;; the car and cdr of my cons cell for readability. (defsubst parser-make-match ( consumed data ) "create a match result from the input consumed by the match, and the matc= h data." (cons consumed data)) (defsubst parser-match-consumed ( match-result ) "return the input consumed by the match" (car match-result)) (defsubst parser-match-data ( match-result ) "return the data of the match" (cdr match-result)) (defsubst parser-make-match-data ( name data ) (cons name data)) ;;---------------------------------------------------------------------- ;; Combination Operators ;;---------------------------------------------------------------------- ;; The parser uses two combination operators: parser-and, parser-or as ;; nodes in the parser tree. Significantly parser-and can backtrack ;; and parser-or never does. ;; and/or have the same essential meaning as the lisp and/or forms ;; with two specializations. Both functions treat their argument lists ;; as a list of Match Functions. Also the parser-and function returns ;; a production consisting of the AST parts of the Match Results, ;; instead of returning the last Match Result. (defun parser-or ( &rest match-list ) "Combine Match Functions by or ; the first successful match is returned. nil is returned if no matches are found" (catch 'match (dolist (match match-list) (lexical-let ((production (funcall match))) (if production (progn (parser-advance (parser-match-consumed production)) (throw 'match (parser-make-match 0 (parser-match-data productio= n)))) ))) (throw 'match nil) )) (defun parser-and ( &rest match-list ) "combine the matches with and. all of the match objects must return non-n= il in the parser part or the parser will backtrack and return nil." (parser-push) (lexical-let ((production (catch 'backtrack ;; we want to gather all the matches, so mapcar across the matc= h objects. (mapcar (lambda (match) (lexical-let ((production (funcall match))) (if production (progn (parser-advance (parser-match-consumed production)) (parser-match-data production)) (throw 'backtrack nil)) )) match-list) ))) (if production (progn (parser-pop) (parser-make-match 0 production)) ;; would be nice to filter optiona= l matches here. (progn (parser-backtrack) nil)) )) ;;---------------------------------------------------------------------- ;; compiler diagnostics ;;---------------------------------------------------------------------- ;; construct meaningful compiler error messages in the ;; "expected: foo got: bar" form. (defun parser-expr-diagnostic ( form ) ;; tested (format "type(%s) %s" (symbol-name (type-of form)) (pp (eval form)))) (defmacro parser-diagnostic ( form from expected ) ;; tested "syntax: (parser-diagnostic form from expected) Where form is the expr received, from is the component issuing the diagn= ostic, and expected is a message describing what the component expected" `(concat (format "[%s] expected: " ,from) ,expected " not: " ,(parser-ex= pr-diagnostic form))) ;;---------------------------------------------------------------------- ;; Match Functions ;;---------------------------------------------------------------------- ;; make-match and get-match implement a table of match functions. ;; Match functions are lambdas that take no arguments and return Match ;; Result values. They assume that the compiled parser has scoped a ;; parser-position stack to determine the parsing position in input. ;; match-table is objarray created at macro scope, and does not appear ;; in the compiled form. (defun parser-match-function ( identifier &optional definition ) "Retrieve or define a Match Function." (lexical-let* ((id (symbol-name identifier)) (lookup (intern-soft id match-table))) (if lookup (if (eq definition nil) lookup (signal 'parser-semantic-error (format "illegal redefinition of Mat= ch Function %s" id))) (progn (fset (intern id match-table) (eval definition)) (intern id match-table)) ))) ;;---------------------------------------------------------------------- ;; tokens ;;---------------------------------------------------------------------- ;; EXPERIMENT: allow other functions to be used for token matching ;; other than looking-at, such as re-search. (defun parser-build-token ( identifier ) ;; tested "parser-make-token is a built-in constructor that records the analysis and the location of the text (id . (begin . end))" (parser-make-match-data identifier (cons (match-beginning 0) (match-end 0= )))) (defun parser-make-anon-func ( sexp ) ;; tested "bind an un-evaluated anonymous function to an un-interned symbol" (let ((anon-func (make-symbol "parser-user-handler"))) (fset anon-func (eval sexp)) anon-func)) ;; the token interpreter was split into two functions to isolate the ;; flexibility of tokens (user functions or return values for ;; constructing AST) from the hard-wired parser part. (defun parser-interp-token-action ( identifier constructor ) ;; tested "Translate the AST constructor part of a token definition into Elisp." (unless (symbolp identifier) (signal 'parser-syntactic-error (parser-diagnostic identifier "parser token" "identifier: An unbound symbol used as an identifier" ))) ;; Warning: this code is very touchy, double quotation of AST ;; symbols is required. the saving grace is that the symbols don't ;; have a variable value bound so it fails noisily when the ;; quotation is incorrect. (cond ((eq nil constructor) `(parser-build-token '',identifier)) ((listp constructor) `(,(parser-make-anon-func constructor) (match-= beginning 0) (match-end 0))) ((functionp constructor) `(,constructor (match-beginning 0) (match-end = 0))) ((symbolp constructor) `(quote ',constructor)) ;; all other constructor types are un-handled. ((signal 'parser-syntactic-error (parser-diagnostic identifier "parser token: identifier" "A symbol")))) ) (defun parser-interp-token ( syntax ) ;; tested "Translate a token definition into a Match Function. The matching part is hard-wired into the function, while the AST part which contains a great degree of flexibility is translated by parser-interp-token-action" (lexical-let ((identifier (car syntax)) (regex (cadr syntax)) ;; eval ? many regexs are stored in var= iables. (constructor (caddr syntax))) `(lambda () (if (looking-at ,regex) (parser-make-match (parser-consumed) ,(parser-interp-token-action = identifier constructor)) nil )) )) (defun parser-compile-token ( syntax ) ;; tested "Compile a token into a Match Function." (parser-match-function (car syntax) (parser-interp-token syntax))) ;;---------------------------------------------------------------------- ;; rules ;;---------------------------------------------------------------------- ;; rules are non-terminals, with a left side or identifier, and a ;; right side containing matches that recognize a production of the ;; rule. (defun list-filter-nil ( list ) "filter nil symbols from a list" (if (consp list) (lexical-let ((head (car list))) (if (eq head 'nil) (list-filter-nil (cdr list)) (cons head (list-filter-nil (cdr list))) )) nil )) (defun parser-rule-left ( prod-left combine-operator prod-right ) "Translate the Left Side of a Rule into a Match Function currying the Right Side of a rule." (unless (symbolp prod-left) (parser-diagnostic prod-left "Rule Left interpreter" "left side of the production or an identifier")) (lexical-let ((matchf-list (list-filter-nil prod-right))) ;; filter nils created by = define lists. (parser-match-function prod-left `(lambda () (lexical-let ((production (apply ',combine-operator ',matchf-list))) (if production (parser-make-match (parser-match-consumed production) (parser-make-match-data '',prod-left (parser-match-data prod= uction))) nil) ))) )) (defun parser-rule-right ( rule ) "Translate a match in the Right Side of the rule into a compiled Match Function by retrieving the Match Function or recursively interpreting the grammar definition." (cond ((listp rule) (parser-compile-definition rule)) ((symbolp rule) (parser-match-function rule)) (signal 'parser-syntactic-error (parser-daignostic rule "Rule Right interpreter" "expected a grammar list e.g: token,and,or ; or a symbol as a rule = or token reference")) )) (defun parser-compile-rule ( combine-function prod-right ) (parser-rule-left ;; make a match function (car prod-right) ;; the identifier of the production combine-function ;; the combine operator (mapcar 'parser-rule-right (cdr prod-right)) ;; interpret the matching = definition. )) ;;---------------------------------------------------------------------- ;; grammar definition. ;;---------------------------------------------------------------------- (defun parser-compile-definition ( term ) "parser-compile-definition is the recursive heart of the compiler." (unless (listp term) (signal 'parser-syntactic-error (parser-diagnostic term "parser definition" "expected a definition of token|or|and|define"))) (lexical-let ((keyword (car term)) (syntax (cdr term))) (if (listp keyword) (parser-compile-definition keyword) (cond ((eq keyword 'token) (parser-compile-token syntax)) ((eq keyword 'or) (parser-compile-rule 'parser-or syntax)) ((eq keyword 'and) (parser-compile-rule 'parser-and syntax)) ((eq keyword 'define) ;; define discards the match functions as a return value so ;; tokens and rules can be defined before they are used. (mapcar 'parser-rule-right syntax) nil) ((signal 'parser-syntactic-error (parser-diagnostic term "parser definition" "definition keyword token|or|and|define"))) )) )) (defvar parser-mtable-init-size 13 "initial size of the match-table objarray for storing match functions. th= e value was chosen based on the recommendation of prime numbers for good hashing= .") (defmacro parser-compile ( parser &rest definition ) "compile a LL parser from the given grammar." (let ;; create a symbol table to store compiled terminal and ;; non-terminal match functions ((match-table (make-vector parser-mtable-init-size 0))) (condition-case diagnostic (progn (fset parser (eval `(lambda ( start-pos ) (let ((parser-position (cons start-pos nil))) ;; initialize the= backtrack stack (save-excursion (goto-char start-pos) ;; note that the start symbol of the grammar is built in= as an or combination ;; of the top-level definitions. (lexical-let ((parse (,(parser-rule-left 'start 'parser-or (mapcar 'parser-compile-definition definit= ion)) ))) (if parse ;; if we have a production return the position at wh= ich the ;; parser stopped along with the AST. (parser-make-match (parser-pos) (parser-match-data p= arse)) nil)) ))) )) t) (parser-syntactic-error (progn (message "parser-compile Syntax Error %s" diagnostic) nil )) (parser-semantic-error (progn (message "parser-compile invalid statement %s" diagnostic) )) ))) (provide 'parser) --MP_/YV6fjump4LhE6gLKYoZe0R3 Content-Type: text/x-emacs-lisp; name=test-parser.el Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename=test-parser.el ;;---------------------------------------------------------------------- ;; diagnostics ;;---------------------------------------------------------------------- ;; test the diagnostics of the parser. I think some sort of eval may be nec= essary ;; in diagnostics. ;; see what the expansion looks like (pp (macroexpand (parser-diagnostic `(foo bar) "test" "the right thing"))) ;; see what it looks like when handled by message. (message "%s" (parser-diagnostic `(foo bar) "test" "the right thing")))) ;;---------------------------------------------------------------------- ;; token interp phase ;;---------------------------------------------------------------------- (defmacro test-interp-token ( &rest syntax ) (lexical-let ((compile (catch 'syntax-error (parser-interp-token (cdr syntax))))) (if (stringp compile) (message "%s" compile) compile) )) (defun run-test-interp-token ( match-function ) (lexical-let ((result (funcall match-function))) (message "TI match? %s AST %s" (if (car result) "yes" "no") (pp (cdr re= sult))) )) (defun run-test-token () (interactive) (run-test-interp-token (test-interp-token token whitespace "[[:blank:]]+"= ))) (pp (macroexpand (test-interp-token token whitespace "[[:blank:]]"))) ;;---------------------------------------------------------------------- ;; test tokens ;;---------------------------------------------------------------------- ;; eval this, then eval various tests. (defun run-parser () "run test-parser interactively for testing and debugging." (interactive) (lexical-let ((parse-result (test-parser (point)))) (message "PROD match? %s" (if parse-result (format "Yes matched to: %s, AST: %s" (car parse-result) (pp (cdr p= arse-result)) "No") )) )) (parser-compile test-parser (token whitespace "[[:blank:]]+") (token word "[[:alpha:]]+")) (parser-compile test-parser (and indented (token whitespace "[[:blank:]]+") (token word "[[:alpha:]]+= ")) (and inverted word whitespace)) (parser-compile test-parser (token whitespace "[[:blank:]]+")) (parser-compile test-parser (token whitespace "[[:blank:]]+" (lambda ( start stop ) (message "bar by = far")))) (parser-compile test-parser (token whitespace "[[:blank:]]+" bingo)) ;;---------------------------------------------------------------------- ;; test productions ;;---------------------------------------------------------------------- (parser-compile test-parser (token whitespace "[[:blank:]]+") (token word "[[:alpha:]]+")) (parser-compile test-parser (and indented (token whitespace "[[:blank:]]+") (token word "[[:alpha:]]+= ")) (and inverted word whitespace)) fooo bar ;;---------------------------------------------------------------------- ;; experimental ;;---------------------------------------------------------------------- now create a spiffy function that walks the AST tree for you. something like start/indented/whitespace or start/indented this should lay the grounds for verifying wether the AST is generated as ex= pected, with parser-walk defining a canonical traversal. --MP_/YV6fjump4LhE6gLKYoZe0R3-- --Sig_/k.4S4T62e3p2jZ5nPBK7lu5 Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) iD4DBQFHiuPOdfRchrkBInkRAvM2AJ9CvSFGPuuQfkY6HOA4bFfo7oc4jACY0bbq 5wc19tmlT2jOqzLRn2XQbw== =bKGD -----END PGP SIGNATURE----- --Sig_/k.4S4T62e3p2jZ5nPBK7lu5-- --===============1824511807== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ help-gnu-emacs mailing list help-gnu-emacs@gnu.org http://lists.gnu.org/mailman/listinfo/help-gnu-emacs --===============1824511807==--