unofficial mirror of guile-user@gnu.org 
From: Matt Wette <matt.wette@gmail.com>
To: Guile User <guile-user@gnu.org>
Subject: source tracking for nyacc parser is coming
Date: Fri, 22 Oct 2021 06:00:09 -0700
Message-ID: <2dae51a4-971c-af6d-46bd-e3daa55574af@gmail.com>

Hi All,

I just wanted to give an update on this requested feature for nyacc.

I've been working on adding source location tracking to nyacc's
parser, and I believe I have a working implementation.  It should
appear in the upcoming release 1.06.0 (the latest release is 1.05.0).
If you want to try it, it's on the dev-1.06 branch in git
(i.e., git://git.savannah.nongnu.org/nyacc.git).  The parser
code, with "=>" annotations marking the changes, is shown below.

To demonstrate, I have implemented it in a language I'm working
on called TCLish.  In nyacc, the lexical analyzer returns pairs
(token-type . token-value).  I attach source-properties to these
pairs.  The parser is able to propagate them through the parsing
phase.  In the AST-to-tree-IL phase I transfer the source-properties
to the external tree-IL representation.  Guile takes care of the
rest.  (Note: in the lexical analyzer, I'm not bothering to
track the column: that gets set to zero.)
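
As a rough illustration of the tagging described above, here is a
minimal sketch using Guile's standard source-property procedures
(set-source-properties!, source-property, source-properties).  The
helper make-token, the token type $ident, and the line number are
made up for the example; they are not taken from the nyacc or TCLish
sources, and the real lexer may organize this differently.

```scheme
;; Hypothetical helper (not part of nyacc): build a token pair
;; (token-type . token-value) and attach source properties to it.
;; The column is fixed at zero, as in the note above.
(define (make-token type value filename line)
  (let ((tok (cons type value)))
    (set-source-properties! tok
                            `((filename . ,filename)
                              (line . ,line)
                              (column . 0)))
    tok))

;; An illustrative token for the identifier "baz" in demo.tsh.
(define tok (make-token '$ident "baz" "demo.tsh" 6))

;; Later phases can read the properties back from the pair.
(define tok-line (source-property tok 'line))
(define tok-file (source-property tok 'filename))

;; Transferring the properties to an external tree-IL style form
;; amounts to copying them onto the freshly built list.
(define il `(call (toplevel ,(string->symbol (cdr tok)))))
(set-source-properties! il (source-properties tok))
```

Since the properties are attached to the pair itself, any phase that
holds on to the pair (or copies its properties forward) can recover
the file and line later.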

I created the file demo.tsh with the following contents:

     proc baz { } {
       puts 1 2 3
     }

     proc bar { } {
       set x (1 + 2)
       baz
     }

     proc foo { } {
       set x (3 + 4)
       bar
     }

Now I run the tsh interpreter and source demo.tsh, then call "foo":
     scheme@(guile-user)> ,L nx-tsh
     Happy hacking with nx-tsh!  To switch back, type `,L scheme'.
     nx-tsh@(guile-user)> source "demo.tsh"
     nx-tsh@(guile-user)> foo
     ice-9/boot-9.scm:1685:16: In procedure raise-exception:
     In procedure string=: Wrong type argument in position 1 (expecting string): 1

     Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
     nx-tsh@(guile-user) [1]> ,bt
     In demo.tsh:
          13:0  5 (foo)                <= line number traceback
           8:0  4 (bar)                <= line number traceback
           3:0  3 (baz)                <= line number traceback
     In nyacc/lang/tsh/xlib.scm:
         81:12  2 (tsh:puts _ _ _)
     In unknown file:
                1 (string=? 1 "-nonewline")
     In ice-9/boot-9.scm:
       1685:16  0 (raise-exception _ #:continuable? _)


Here is the updated code for the (numeric) parser,
with new or modified lines marked with `=>':

(define* (make-lalr-parser/num mach #:key (skip-if-unexp '()) interactive env)
   (let* ((len-v (assq-ref mach 'len-v))
	 (rto-v (assq-ref mach 'rto-v))
	 (pat-v (assq-ref mach 'pat-v))
	 (xct-v (make-xct (assq-ref mach 'act-v) env))
	 (ntab (assq-ref mach 'ntab))
	 (start (assq-ref (assq-ref mach 'mtab) '$start)))
     (lambda* (lexr #:key debug)
       (let loop ((state (list 0))	; state stack
		 (stack (list '$@))	; semantic value stack
		 (nval #f)		; non-terminal from prev reduction
		 (lval #f))		; lexical value (from lex'r)
	(cond
	 ((and interactive nval
	       (eqv? (car nval) start)
	       (zero? (car state)))     ; done
	  (cdr nval))
	 ((not (or nval lval))
	  (if (eqv? $default (caar (vector-ref pat-v (car state))))
=>	      (loop state stack (cons-source stack $default #f) lval)
	      (loop state stack nval (lexr))))		 ; reload
	 (else
	  (let* ((laval (or nval lval))
		 (tval (car laval))
		 (sval (cdr laval))
		 (stxl (vector-ref pat-v (car state)))
		 (stx (or (assq-ref stxl tval)
			  (and (not (memq tval skip-if-unexp))
			       (assq-ref stxl $default))
			  #f)))		; error
	    (if debug (dmsg/n (car state) (if nval tval sval) stx ntab))
	    (cond
	     ((eq? #f stx)		; error
	      (if (memq tval skip-if-unexp)
		  (loop state stack #f #f)
		  (parse-error state laval)))
	     ((negative? stx)		; reduce
	      (let* ((gx (abs stx))
		     (gl (vector-ref len-v gx))
		     ($$ (apply (vector-ref xct-v gx) stack))
=>		     (pobj (if (zero? gl) laval (list-tail stack (1- gl))))
=>		     (pval (source-properties pobj))
=>		     (tval (cons-source pobj (vector-ref rto-v gx) $$)))
=>		(if (supports-source-properties? $$)
=>		    (set-source-properties! $$ pval))
		(loop (list-tail state gl) (list-tail stack gl) tval lval)))
	     ((positive? stx)		; shift
=>	      (loop (cons stx state) (cons-source laval sval stack)
		    #f (if nval lval #f)))
	     (else			; accept
	      (car stack))))))))))
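
To make the shift and reduce changes above easier to follow, here is
a small standalone sketch of the same idea outside the parser loop.
It assumes a production with three right-hand-side symbols; the
helpers tok and shift, the symbol names, and the line numbers are
hypothetical and only serve the illustration.

```scheme
;; Hypothetical helper: a lexer-style token pair tagged with a line.
(define (tok type val line)
  (let ((p (cons type val)))
    (set-source-properties! p `((filename . "demo.tsh")
                                (line . ,line)
                                (column . 0)))
    p))

;; Shift: push the semantic value, copying the token's source
;; properties onto the new stack cell (as in the `=>' shift line).
(define (shift stack laval)
  (cons-source laval (cdr laval) stack))

;; Shift three tokens; the stack grows at the front.
(define stack
  (shift (shift (shift (list '$@)
                       (tok 'proc "proc" 4))
                (tok '$ident "bar" 4))
         (tok 'rbrace "}" 7)))

;; Reduce with gl = 3: the cell holding the leftmost right-hand-side
;; symbol sits (1- gl) cells down from the top of the stack.
(define gl 3)
(define pobj (list-tail stack (1- gl)))
(define pval (source-properties pobj))
(define $$ (list 'proc "bar"))          ; stand-in for the action result
(define nval (cons-source pobj 'proc-defn $$))
(if (supports-source-properties? $$)
    (set-source-properties! $$ pval))

(define nval-line (source-property nval 'line))
(define $$-line (source-property $$ 'line))
```

In this sketch both the new non-terminal pair and the semantic value
end up carrying the location of the first symbol of the production,
which is what the cons-source calls in the parser are arranging.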

Matt
