From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Janneke Nieuwenhuizen Newsgroups: gmane.lisp.guile.devel Subject: [PATCH v2 2/5] peg: Add debug tracing. Date: Mon, 14 Oct 2024 09:31:06 +0200 Message-ID: <20241014073109.19774-2-janneke@gnu.org> References: <87seszdu1d.fsf@gnu.org> <20241014073109.19774-1-janneke@gnu.org> Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="23505"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Ekaitz Zarraga , Rutger van Beusekom To: guile-devel@gnu.org Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Mon Oct 14 09:32:18 2024 Return-path: Envelope-to: guile-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1t0FYv-0005yv-Px for guile-devel@m.gmane-mx.org; Mon, 14 Oct 2024 09:32:18 +0200 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1t0FYF-0008UT-Nn; Mon, 14 Oct 2024 03:31:35 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1t0FXw-0008Qn-BR for guile-devel@gnu.org; Mon, 14 Oct 2024 03:31:16 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1t0FXv-0007So-Dn; Mon, 14 Oct 2024 03:31:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:References:In-Reply-To:Date:Subject:To: From; bh=KMQmHkYgk1jrjYs9imP2m8FESuMJbvGaJxef0S50SG8=; b=Zj++oQyJwH/2sfdIW1lS 5i0rDl/m7N74Gso0J1N0V95g9mstXkB4Zy+xXOBR/BJaCnIzBr4bOtWWP0lUPJKW0At0kxNyC7i8b bFwFD+ehVSfmYz9AvzVL4G6dqft+Z+PCkj5qO9/0dnSuTj2a4YK/X4eBP5OkMXOZUjWd3BCh7JFU/ Xhleio3TJcXOBhbHIb5wurFYtqZk3/HrfU0bgVouEI9gxkUQmbYXYmDK9UeVYMdfls1Z5l18Zyt+p AvzrOnTBWBAooF2EFNkitnrIYVz931r1lWnO5sMdCKI/2f7sPpA79zZSw4uGdbe7GOx4B0dFfs07P Z11Wx/VJaw+8uw==; X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241014073109.19774-1-janneke@gnu.org> X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.lisp.guile.devel:22735 Archived-At: From: Rutger van Beusekom * module/ice-9/peg/codegen.scm (trace?): New function. (indent): New variable. (%peg:debug?): New exported parameter. (wrap-parser-for-users): Use them to provide debug tracing. * test-suite/tests/peg.test ("Parse tracing"): Test it. * doc/ref/api-peg.texi (Debug tracing): Document it. Co-authored-by: Janneke Nieuwenhuizen --- doc/ref/api-peg.texi | 18 ++++++++- module/ice-9/peg.scm | 3 +- module/ice-9/peg/codegen.scm | 74 ++++++++++++++++++++++++------------ test-suite/tests/peg.test | 39 ++++++++++++++++--- 4 files changed, 102 insertions(+), 32 deletions(-) diff --git a/doc/ref/api-peg.texi b/doc/ref/api-peg.texi index 0214f7ff1..dfa806832 100644 --- a/doc/ref/api-peg.texi +++ b/doc/ref/api-peg.texi @@ -1,6 +1,6 @@ @c -*-texinfo-*- @c This is part of the GNU Guile Reference Manual. -@c Copyright (C) 2006, 2010, 2011 +@c Copyright (C) 2006, 2010, 2011, 2024 @c Free Software Foundation, Inc. @c See the file guile.texi for copying conditions. @@ -1035,3 +1035,19 @@ symbol function)}, where @code{symbol} is the symbol that will indicate a form of this type and @code{function} is the code generating function described above. The function @code{add-peg-compiler!} is exported from the @code{(ice-9 peg codegen)} module. + +@subsubheading Debug tracing + +Due to the backtracking nature of PEG, the parser result is @code{#f} +when it cannot match the input text. It proves to be a big pain +determining whether the problem is actually in the input or in the +grammar, especially when changing the grammar itself. Setting the +parameter @var{%peg:debug?} to @code{#t} enables debug tracing, which +will make the PEG parser print for each production rule: its name, the +current state of the input, as well as the parse result. + +@lisp +(define-peg-string-patterns "grammar @dots{}") +(parameterize ((%peg:debug? #t)) + (and=> (match-pattern grammar input-text) peg:tree)) +@end lisp diff --git a/module/ice-9/peg.scm b/module/ice-9/peg.scm index 4e03131cd..499c3820c 100644 --- a/module/ice-9/peg.scm +++ b/module/ice-9/peg.scm @@ -1,6 +1,6 @@ ;;;; peg.scm --- Parsing Expression Grammar (PEG) parser generator ;;;; -;;;; Copyright (C) 2010, 2011 Free Software Foundation, Inc. +;;;; Copyright (C) 2010, 2011, 2024 Free Software Foundation, Inc. ;;;; ;;;; This library is free software; you can redistribute it and/or ;;;; modify it under the terms of the GNU Lesser General Public @@ -28,6 +28,7 @@ #:use-module (ice-9 peg cache) #:re-export (define-peg-pattern define-peg-string-patterns + %peg:debug? match-pattern search-for-pattern compile-peg-pattern diff --git a/module/ice-9/peg/codegen.scm b/module/ice-9/peg/codegen.scm index d80c3e849..c450be440 100644 --- a/module/ice-9/peg/codegen.scm +++ b/module/ice-9/peg/codegen.scm @@ -1,6 +1,6 @@ ;;;; codegen.scm --- code generation for composable parsers ;;;; -;;;; Copyright (C) 2011 Free Software Foundation, Inc. +;;;; Copyright (C) 2011, 2024 Free Software Foundation, Inc. ;;;; ;;;; This library is free software; you can redistribute it and/or ;;;; modify it under the terms of the GNU Lesser General Public @@ -18,7 +18,11 @@ ;;;; (define-module (ice-9 peg codegen) - #:export (compile-peg-pattern wrap-parser-for-users add-peg-compiler!) + #:export (compile-peg-pattern + wrap-parser-for-users + add-peg-compiler! + %peg:debug?) + #:use-module (ice-9 pretty-print) #:use-module (system base pmatch)) @@ -332,28 +336,48 @@ return EXP." "Not one of" (map car peg-compiler-alist))))))) ;; Packages the results of a parser + +(define %peg:debug? (make-parameter #f)) +(define (trace? symbol) + (%peg:debug?)) + +(define indent 0) + (define (wrap-parser-for-users for-syntax parser accumsym s-syn) - #`(lambda (str strlen at) + #`(lambda (str strlen at) + (when (trace? '#,s-syn) + (format (current-error-port) "~a~a\n" + (make-string indent #\space) + '#,s-syn)) + (set! indent (+ indent 4)) (let ((res (#,parser str strlen at))) - ;; Try to match the nonterminal. - (if res - ;; If we matched, do some post-processing to figure out - ;; what data to propagate upward. - (let ((at (car res)) - (body (cadr res))) - #,(cond - ((eq? accumsym 'name) - #`(list at '#,s-syn)) - ((eq? accumsym 'all) - #`(list (car res) - (cond - ((not (list? body)) - (list '#,s-syn body)) - ((null? body) '#,s-syn) - ((symbol? (car body)) - (list '#,s-syn body)) - (else (cons '#,s-syn body))))) - ((eq? accumsym 'none) #`(list (car res) '())) - (else #`(begin res)))) - ;; If we didn't match, just return false. - #f)))) + (set! indent (- indent 4)) + (let ((pos (or (and res (car res)) 0))) + (when (and (trace? '#,s-syn) (< at pos)) + (format (current-error-port) "~a~a := ~s\tnext: ~s\n" + (make-string indent #\space) + '#,s-syn + (substring str at pos) + (substring str pos (min strlen (+ pos 10))))) + ;; Try to match the nonterminal. + (if res + ;; If we matched, do some post-processing to figure out + ;; what data to propagate upward. + (let ((at (car res)) + (body (cadr res))) + #,(cond + ((eq? accumsym 'name) + #`(list at '#,s-syn)) + ((eq? accumsym 'all) + #`(list (car res) + (cond + ((not (list? body)) + (list '#,s-syn body)) + ((null? body) '#,s-syn) + ((symbol? (car body)) + (list '#,s-syn body)) + (else (cons '#,s-syn body))))) + ((eq? accumsym 'none) #`(list (car res) '())) + (else #`(begin res)))) + ;; If we didn't match, just return false. + #f))))) diff --git a/test-suite/tests/peg.test b/test-suite/tests/peg.test index f516571e8..6a8709794 100644 --- a/test-suite/tests/peg.test +++ b/test-suite/tests/peg.test @@ -1,14 +1,14 @@ +;;;;; PEG test suite. -*- scheme -*- ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; -;;;;; PEG test suite. ;; Tests the parsing capabilities of (ice-9 peg). Could use more ;; tests for edge cases. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (define-module (test-suite test-peg) - :use-module (test-suite lib) - :use-module (ice-9 peg) - :use-module (ice-9 pretty-print) - :use-module (srfi srfi-1)) + #:use-module (srfi srfi-1) + #:use-module (test-suite lib) + #:use-module (ice-9 peg) + #:use-module (ice-9 pretty-print)) ;; Doubled up for pasting into REPL. (use-modules (test-suite lib)) @@ -276,3 +276,32 @@ number <-- [0-9]+") (equal? (eq-parse "1+1/2*3+(1+1)/2") '(+ (+ 1 (* (/ 1 2) 3)) (/ (+ 1 1) 2))))) +(define-peg-string-patterns + "trace-grammar <-- foo bar* baz +foo <-- 'foo' +bar <-- bla+ +bla <-- 'bar' +baz <-- 'baz'") + +(with-test-prefix "Parse tracing" + (pass-if-equal + "trace" +"trace-grammar + foo + foo := \"foo\" next: \"barbarbaz\" + bar + bla + bla := \"bar\" next: \"barbaz\" + bla + bla := \"bar\" next: \"baz\" + bla + bar := \"barbar\" next: \"baz\" + bar + baz + baz := \"baz\" next: \"\" +trace-grammar := \"foobarbarbaz\" next: \"\" +" + (parameterize ((%peg:debug? #t)) + (with-error-to-string + (lambda _ (and=> (match-pattern trace-grammar "foobarbarbaz") + peg:tree)))))) -- 2.46.0