From: Matt Wette
Newsgroups: gmane.lisp.guile.user
To: Guile User
Subject: source tracking for nyacc parser is coming
Date: Fri, 22 Oct 2021 06:00:09 -0700
Message-ID: <2dae51a4-971c-af6d-46bd-e3daa55574af@gmail.com>

Hi All,

I just wanted to give an update on this requested feature for nyacc.

I've been working on adding source location tracking to nyacc's parser.
I believe I have a working implementation. It should appear in the
upcoming release 1.06.0 (the latest release is 1.05.0). If you want to
try it, it's on the dev-1.06 branch in git (i.e.,
git://git.savannah.nongnu.org/nyacc.git). The parser code, with
annotations "=>" indicating changes, is shown below.

To demonstrate, I have implemented it in a language I'm working on
called TCLish. In nyacc, the lexical analyzer returns pairs
(token-type . token-value). I attach source-properties to these pairs.
The parser is able to propagate them through the parsing phase. In the
AST-to-tree-IL phase I transfer the source-properties to the external
tree-IL representation. Guile takes care of the rest. (Note: in the
lexical analyzer, I'm not bothering to track the column: that gets set
to zero.)
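
As a rough illustration of the idea (a sketch only, not nyacc's actual
lexer code), a token pair can be tagged and its location carried onto a
new pair like this. The helper make-token, the token type $ident, and
the node tag command are made up for the example; the procedures
set-source-properties!, source-property, and cons-source are the
standard Guile ones.

```scheme
;; Hypothetical helper (not part of nyacc): build a
;; (token-type . token-value) pair and attach the file name and line
;; where the token was read.  The column is fixed at zero, as in the
;; note above.
(define (make-token type value filename line)
  (let ((tok (cons type value)))
    (set-source-properties! tok
                            `((filename . ,filename)
                              (line . ,line)
                              (column . 0)))
    tok))

;; An example token; the type, value, and location are illustrative.
(define tok (make-token '$ident "puts" "demo.tsh" 2))

;; cons-source builds a fresh pair and copies the source properties of
;; its first argument onto it, so the location follows the value.
(define node (cons-source tok 'command (list (cdr tok))))

(display (source-property node 'filename)) (newline)
(display (source-property node 'line)) (newline)
```

The point of going through cons-source is that every pair built from a
token (or from something built from a token) keeps a location without
the grammar actions having to know about it.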

I generated the file, demo.tsh, with contents:

   proc baz { } {
     puts 1 2 3
   }

   proc bar { } {
     set x (1 + 2)
     baz
   }

   proc foo { } {
     set x (3 + 4)
     bar
   }

Now I run the tsh interpreter and source demo.tsh, then call "foo":

   scheme@(guile-user)> ,L nx-tsh
   Happy hacking with nx-tsh!  To switch back, type `,L scheme'.
   nx-tsh@(guile-user)> source "demo.tsh"
   nx-tsh@(guile-user)> foo
   ice-9/boot-9.scm:1685:16: In procedure raise-exception:
   In procedure string=: Wrong type argument in position 1 (expecting string): 1

   Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
   nx-tsh@(guile-user) [1]> ,bt
   In demo.tsh:
        13:0  5 (foo)                    <= line number traceback
         8:0  4 (bar)                    <= line number traceback
         3:0  3 (baz)                    <= line number traceback
   In nyacc/lang/tsh/xlib.scm:
       81:12  2 (tsh:puts _ _ _)
   In unknown file:
              1 (string=? 1 "-nonewline")
   In ice-9/boot-9.scm:
     1685:16  0 (raise-exception _ #:continuable? _)

Here is the updated code for the (numeric) parser, with new/mod lines
denoted with `=>':

   (define* (make-lalr-parser/num mach #:key (skip-if-unexp '()) interactive env)
     (let* ((len-v (assq-ref mach 'len-v))
            (rto-v (assq-ref mach 'rto-v))
            (pat-v (assq-ref mach 'pat-v))
            (xct-v (make-xct (assq-ref mach 'act-v) env))
            (ntab (assq-ref mach 'ntab))
            (start (assq-ref (assq-ref mach 'mtab) '$start)))
       (lambda* (lexr #:key debug)
         (let loop ((state (list 0))      ; state stack
                    (stack (list '$@))    ; semantic value stack
                    (nval #f)             ; non-terminal from prev reduction
                    (lval #f))            ; lexical value (from lex'r)
           (cond
            ((and interactive nval
                  (eqv? (car nval) start)
                  (zero? (car state)))    ; done
             (cdr nval))
            ((not (or nval lval))
             (if (eqv? $default (caar (vector-ref pat-v (car state))))
=>               (loop state stack (cons-source stack $default #f) lval)
                 (loop state stack nval (lexr)))) ; reload
            (else
             (let* ((laval (or nval lval))
                    (tval (car laval))
                    (sval (cdr laval))
                    (stxl (vector-ref pat-v (car state)))
                    (stx (or (assq-ref stxl tval)
                             (and (not (memq tval skip-if-unexp))
                                  (assq-ref stxl $default))
                             #f)))        ; error
               (if debug (dmsg/n (car state) (if nval tval sval) stx ntab))
               (cond
                ((eq? #f stx)             ; error
                 (if (memq tval skip-if-unexp)
                     (loop state stack #f #f)
                     (parse-error state laval)))
                ((negative? stx)          ; reduce
                 (let* ((gx (abs stx))
                        (gl (vector-ref len-v gx))
                        ($$ (apply (vector-ref xct-v gx) stack))
=>                      (pobj (if (zero? gl) laval (list-tail stack (1- gl))))
=>                      (pval (source-properties pobj))
=>                      (tval (cons-source pobj (vector-ref rto-v gx) $$)))
=>                 (if (supports-source-properties? $$)
=>                     (set-source-properties! $$ pval))
                   (loop (list-tail state gl) (list-tail stack gl) tval lval)))
                ((positive? stx)          ; shift
=>               (loop (cons stx state) (cons-source laval sval stack)
                       #f (if nval lval #f)))
                (else                     ; accept
                 (car stack))))))))))

Matt