bug#73188: PEG: Fix bugs and add complex PEG for testing

From: Ekaitz Zarraga <ekaitz@elenq.tech>
To: 73188@debbugs.gnu.org
Cc: "Ludovic Courtès" <ludo@gnu.org>
Subject: bug#73188: PEG: Fix bugs and add complex PEG for testing
Date: Wed, 30 Oct 2024 20:04:02 +0100
Message-ID: <0994541b-538d-4f03-bf13-78ef8917099f@elenq.tech>
In-Reply-To: <78a81bc5-cd0d-0506-185b-c733c66e96ae@elenq.tech>

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

Hi,

I decided to improve the tests of the PEG module because I wasn't very 
confident about the [^...] functionality, and I found I had some minor 
bugs in the previous patch.

I attach a new version of the previous commits with an extra one that 
adds an HTML parser and tests against it. That's what made me find some 
of the errors and missing bits.

With the test I feel more confident about the changes.
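
For readers unfamiliar with `[^...]`, here is a minimal sketch of the
behaviour it is meant to have, written with patterns that already exist in
released Guile. The names `non-digit` and `non-digits` are made up for this
illustration; with patch 2/3 applied, the same pattern should be expressible
as `(not-in-range #\0 #\9)` or as `[^0-9]` in a PEG string.

```scheme
(use-modules (ice-9 peg))

;; One character that is NOT a digit: look ahead to rule out the range
;; 0-9 without consuming input, then consume any single character.
;; (Hypothetical name; not part of the patch series.)
(define-peg-pattern non-digit body
  (and (not-followed-by (range #\0 #\9)) peg-any))

;; One or more non-digit characters.
(define-peg-pattern non-digits all
  (+ non-digit))

;; Matching stops at the first digit.
(define m (match-pattern non-digits "abc123"))

(display (peg:substring m))   ; the matched text: abc
(newline)
(display (peg:end m))         ; index just past the match: 3
(newline)
```

The lookahead-then-consume shape is the same idea the NotInClass
translation in patch 2/3 relies on when a class lists several ranges: all
but one of the checks only peek at the input, and a single one consumes the
character.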

Thanks,
Ekaitz

[-- Attachment #2: v5-0003-PEG-Add-a-complex-PEG-grammar-test.patch --]
[-- Type: text/x-patch, Size: 7996 bytes --]

From e682544a68ff1d9e41fb954539884b7163124589 Mon Sep 17 00:00:00 2001
From: Ekaitz Zarraga <ekaitz@elenq.tech>
Date: Wed, 30 Oct 2024 19:52:28 +0100
Subject: [PATCH v5 3/3] PEG: Add a complex PEG grammar test

Add a complex PEG grammar for HTML parsing and test against it.
This properly tests for complex constructs, especially `[^...]`.

* test-suite/tests/peg.test (html-grammar): New variable.
(html-example): New variable.
(Parsing with complex grammars): New test.
---
 test-suite/tests/peg.test | 113 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)

diff --git a/test-suite/tests/peg.test b/test-suite/tests/peg.test
index 5570fbfa8..d8d047288 100644
--- a/test-suite/tests/peg.test
+++ b/test-suite/tests/peg.test
@@ -284,3 +284,116 @@ number <-- [0-9]+")
    "1+1/2*3+(1+1)/2"
    (equal? (eq-parse "1+1/2*3+(1+1)/2")
 	   '(+ (+ 1 (* (/ 1 2) 3)) (/ (+ 1 1) 2)))))
+
+
+(define html-grammar
+"
+# Based on code from https://github.com/Fantom-Factory/afHtmlParser
+# 2014-2023 Steve Eynon. This code was originally released under the following
+# terms:
+#
+#      Permission to use, copy, modify, and/or distribute this software for any
+#      purpose with or without fee is hereby granted, provided that the above
+#      copyright notice and this permission notice appear in all copies.
+#
+#      THE SOFTWARE IS PROVIDED \"AS IS\" AND THE AUTHOR DISCLAIMS ALL
+#      WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
+#      OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE
+#      FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY
+#      DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
+#      IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
+#      OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+
+# PEG Rules for parsing well formed HTML 5 documents
+# https://html.spec.whatwg.org/multipage/syntax.html
+
+html        <-- bom? blurb* doctype? blurb* xmlProlog? blurb* elem blurb*
+bom         <-- \"\\uFEFF\"
+xmlProlog   <-- \"<?xml\" (!\"?>\" .)+ \"?>\"
+
+# ---- Doctype ----
+
+doctype           <-- \"<!DOCTYPE\" [ \\t\\n\\f\\r]+ [a-zA-Z0-9]+ (doctypePublicId / doctypeSystemId)* [ \\t\\n\\f\\r]* \">\"
+doctypePublicId   <-- [ \\t\\n\\f\\r]+  \"PUBLIC\" [ \\t\\n\\f\\r]+   ((\"\\\"\" [^\"]* \"\\\"\") / (\"'\" [^']* \"'\"))
+doctypeSystemId   <-- [ \\t\\n\\f\\r]+ (\"SYSTEM\" [ \\t\\n\\f\\r]+)? ((\"\\\"\" [^\"]* \"\\\"\") / (\"'\" [^']* \"'\"))
+
+# ---- Elems ----
+
+elem              <-- voidElem / rawTextElem / escRawTextElem / selfClosingElem / normalElem
+voidElem          <-- \"<\"  voidElemName       attributes  \">\"
+rawTextElem       <-- \"<\"  rawTextElemName    attributes  \">\" rawTextContent endElem
+escRawTextElem    <-- \"<\"  escRawTextElemName attributes  \">\" escRawTextContent endElem
+selfClosingElem   <-- \"<\"  elemName           attributes \"/>\"
+normalElem        <-- \"<\"  elemName           attributes  \">\" normalContent? endElem?
+endElem           <-- \"</\" elemName                       \">\"
+
+elemName            <-- [a-zA-Z] [^\\t\\n\\f />]*
+voidElemName        <-- \"area\" / \"base\" / \"br\" / \"col\" / \"embed\" /
+                      \"hr\" / \"img\" / \"input\" / \"keygen\" / \"link\" /
+                      \"meta\" / \"param\" / \"source\" / \"track\" / \"wbr\"
+rawTextElemName     <-- \"script\" / \"style\"
+escRawTextElemName  <-- \"textarea\" / \"title\"
+
+rawTextContent      <-- (!(\"</script>\" / \"</style>\") .)+
+escRawTextContent   <-- ((!(\"</textarea>\" / \"</title>\" / \"&\") .)+ / charRef)*
+normalContent       <-- !\"</\" (([^<&]+ / charRef) / comment / cdata / elem)*
+
+# ---- Attributes ----
+
+attributes        <-- (&[^/>] ([ \\t]+ / doubleQuoteAttr / singleQuoteAttr / unquotedAttr / emptyAttr))*
+attrName          <-- [^ \\t\\n\\r\\f\"'>/=]+
+emptyAttr         <-- attrName+
+unquotedAttr      <-- attrName [ \\t]* \"=\" [ \\t]*      (charRef / [^ \\t\\n\\r\\f\"'=<>`&]+)+
+singleQuoteAttr   <-- attrName [ \\t]* \"=\" [ \\t]* \"'\"  (charRef / [^'&]+)* \"'\"
+doubleQuoteAttr   <-- attrName [ \\t]* \"=\" [ \\t]* \"\\\"\" (charRef / [^\"&]+)* \"\\\"\"
+
+# ---- Character References ----
+
+charRef         <-- &\"&\" (decNumCharRef / hexNumCharRef / namedCharRef / borkedRef)
+namedCharRef    <-- \"&\"   [^;>]+ \";\"
+decNumCharRef   <-- \"&#\"  [0-9]+ \";\"
+hexNumCharRef   <-- \"&#x\" [a-fA-F0-9]+ \";\"
+borkedRef       <-- \"&\"  &[ \\t]
+
+# ---- Misc ----
+
+cdata       <-- \"<![CDATA[\" (!\"]]>\" .)+ \"]]>\"
+comment     <-- \"<!--\" (!\"--\" .)+ \"-->\"
+blurb       <-- [ \\t\\n\\f\\r]+ / comment")
+
+(define html-example "
+<!DOCTYPE html>
+<html>
+<head>
+    <title>Example Domain</title>
+    <meta charset=\"utf-8\" />
+    <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\" />
+    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\" />
+    <style type=\"text/css\">
+    body {
+        background-color: #f0f0f2;
+        margin: 0;
+        padding: 0;
+    }
+    </style>
+</head>
+
+<body>
+<div>
+    <h1>Example Domain</h1>
+    <p>This domain is for use in illustrative examples in documents. You may
+    use this domain in literature without prior coordination or asking for
+    permission.</p> <p><a href=\"https://www.iana.org/domains/example\">More
+    information...</a></p>
+</div>
+</body>
+</html>
+")
+
+(with-test-prefix "Parsing with complex grammars"
+  (eeval `(define-peg-string-patterns ,html-grammar))
+  (pass-if
+    "HTML parsing"
+    (equal?
+      (peg:tree (match-pattern html html-example))
+      '(html (blurb "\n") (doctype "<!DOCTYPE html>") (blurb "\n") (elem (normalElem "<" (elemName "html") attributes ">" (normalContent "\n" (elem (normalElem "<" (elemName "head") attributes ">" (normalContent "\n    " (elem (escRawTextElem "<" (escRawTextElemName "title") attributes ">" (escRawTextContent "Example Domain") (endElem "</" (elemName "title") ">"))) "\n    " (elem (selfClosingElem "<" (elemName "meta") (attributes " " (doubleQuoteAttr (attrName "charset") "=\"utf-8\"") " ") "/>")) "\n    " (elem (selfClosingElem "<" (elemName "meta") (attributes " " (doubleQuoteAttr (attrName "http-equiv") "=\"Content-type\"") " " (doubleQuoteAttr (attrName "content") "=\"text/html; charset=utf-8\"") " ") "/>")) "\n    " (elem (selfClosingElem "<" (elemName "meta") (attributes " " (doubleQuoteAttr (attrName "name") "=\"viewport\"") " " (doubleQuoteAttr (attrName "content") "=\"width=device-width, initial-scale=1\"") " ") "/>")) "\n    " (elem (rawTextElem "<" (rawTextElemName "style") (attributes " " (doubleQuoteAttr (attrName "type") "=\"text/css\"")) ">" (rawTextContent "\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n    }\n    ") (endElem "</" (elemName "style") ">"))) "\n") (endElem "</" (elemName "head") ">"))) "\n\n" (elem (normalElem "<" (elemName "body") attributes ">" (normalContent "\n" (elem (normalElem "<" (elemName "div") attributes ">" (normalContent "\n    " (elem (normalElem "<" (elemName "h1") attributes ">" (normalContent "Example Domain") (endElem "</" (elemName "h1") ">"))) "\n    " (elem (normalElem "<" (elemName "p") attributes ">" (normalContent "This domain is for use in illustrative examples in documents. You may\n    use this domain in literature without prior coordination or asking for\n    permission.") (endElem "</" (elemName "p") ">"))) " " (elem (normalElem "<" (elemName "p") attributes ">" (normalContent (elem (normalElem "<" (elemName "a") (attributes " " (doubleQuoteAttr (attrName "href") "=\"https://www.iana.org/domains/example\"")) ">" (normalContent "More\n    information...") (endElem "</" (elemName "a") ">")))) (endElem "</" (elemName "p") ">"))) "\n") (endElem "</" (elemName "div") ">"))) "\n") (endElem "</" (elemName "body") ">"))) "\n") (endElem "</" (elemName "html") ">"))) (blurb "\n")))))
-- 
2.46.0


[-- Attachment #3: v5-0002-PEG-Add-support-for-not-in-range-and.patch --]
[-- Type: text/x-patch, Size: 8383 bytes --]

From d894303ed01de72703b815061f826da97dd303f0 Mon Sep 17 00:00:00 2001
From: Ekaitz Zarraga <ekaitz@elenq.tech>
Date: Fri, 11 Oct 2024 14:24:30 +0200
Subject: [PATCH v5 2/3] PEG: Add support for `not-in-range` and [^...]

Modern PEG supports inverted classes such as `[^a-z]`, which match any
character outside the `a-z` range. This commit adds support for that
syntax and for a new `not-in-range` PEG pattern for Scheme.

* module/ice-9/peg/codegen.scm (cg-not-in-range): New function.
* module/ice-9/peg/string-peg.scm: Add support for `[^...]`.
* test-suite/tests/peg.test: Add NotInClass to grammar-mapping.
* doc/ref/api-peg.texi: Document accordingly.
---
 NEWS                            |  3 ++-
 doc/ref/api-peg.texi            |  8 ++++++
 module/ice-9/peg/codegen.scm    | 22 +++++++++++++++
 module/ice-9/peg/string-peg.scm | 48 ++++++++++++++++++++++++++++++---
 test-suite/tests/peg.test       |  2 +-
 5 files changed, 77 insertions(+), 6 deletions(-)

diff --git a/NEWS b/NEWS
index df43f3754..17ef560b1 100644
--- a/NEWS
+++ b/NEWS
@@ -32,7 +32,8 @@ Changes in 3.0.11 (since 3.0.10)
 ** PEG parser
 
 PEG parser has been rewritten to cover all the functionality defined in
-<https://bford.info/pub/lang/peg.pdf>.
+<https://bford.info/pub/lang/peg.pdf>. The new `not-in-range` pattern in
+`(ice-9 peg)` is also available from PEG strings as `[^...]`.
 
 \f
 Changes in 3.0.10 (since 3.0.9)
diff --git a/doc/ref/api-peg.texi b/doc/ref/api-peg.texi
index 84a9e6c6b..edb090b20 100644
--- a/doc/ref/api-peg.texi
+++ b/doc/ref/api-peg.texi
@@ -147,6 +147,14 @@ Parses any character falling between @var{a} and @var{z}.
 @code{(range #\a #\z)}
 @end deftp
 
+@deftp {PEG Pattern} {inverse range of characters} a z
+Parses any character not falling between @var{a} and @var{z}.
+
+@code{"[^a-z]"}
+
+@code{(not-in-range #\a #\z)}
+@end deftp
+
 Example:
 
 @example
diff --git a/module/ice-9/peg/codegen.scm b/module/ice-9/peg/codegen.scm
index d80c3e849..82367ef55 100644
--- a/module/ice-9/peg/codegen.scm
+++ b/module/ice-9/peg/codegen.scm
@@ -140,6 +140,27 @@ return EXP."
                          ((none) #`(list (1+ pos) '()))
                          (else (error "bad accum" accum))))))))))
 
+;; Generates code for matching a range of characters not between start and end.
+;; E.g.: (cg-not-in-range syntax #\a #\z 'body)
+(define (cg-not-in-range pat accum)
+  (syntax-case pat ()
+    ((start end)
+     (if (not (and (char? (syntax->datum #'start))
+                   (char? (syntax->datum #'end))))
+         (error "not-in-range PEG should have characters after it; instead got"
+                #'start #'end))
+     #`(lambda (str len pos)
+         (and (< pos len)
+              (let ((c (string-ref str pos)))
+                (and (or (char<? c start) (char>? c end))
+                     #,(case accum
+                         ((all) #`(list (1+ pos)
+                                        (list 'cg-not-in-range (string c))))
+                         ((name) #`(list (1+ pos) 'cg-not-in-range))
+                         ((body) #`(list (1+ pos) (string c)))
+                         ((none) #`(list (1+ pos) '()))
+                         (else (error "bad accum" accum))))))))))
+
 ;; Generate code to match a pattern and do nothing with the result
 (define (cg-ignore pat accum)
   (syntax-case pat ()
@@ -304,6 +325,7 @@ return EXP."
         (assq-set! peg-compiler-alist symbol function)))
 
 (add-peg-compiler! 'range cg-range)
+(add-peg-compiler! 'not-in-range cg-not-in-range)
 (add-peg-compiler! 'ignore cg-ignore)
 (add-peg-compiler! 'capture cg-capture)
 (add-peg-compiler! 'and cg-and)
diff --git a/module/ice-9/peg/string-peg.scm b/module/ice-9/peg/string-peg.scm
index ede24181c..4b92b393c 100644
--- a/module/ice-9/peg/string-peg.scm
+++ b/module/ice-9/peg/string-peg.scm
@@ -54,7 +54,7 @@ Prefix <-- (AND / NOT)? Suffix
 Suffix <-- Primary (QUESTION / STAR / PLUS)?
 Primary <-- Identifier !LEFTARROW
            / OPEN Expression CLOSE
-           / Literal / Class / DOT
+           / Literal / Class / NotInClass / DOT
 
 # Lexical syntax
 Identifier <-- IdentStart IdentCont* Spacing
@@ -64,7 +64,8 @@ IdentCont <- IdentStart / [0-9]
 
 Literal <-- SQUOTE (!SQUOTE Char)* SQUOTE Spacing
         / DQUOTE (!DQUOTE Char)* DQUOTE Spacing
-Class <-- OPENBRACKET (!CLOSEBRACKET Range)* CLOSEBRACKET Spacing
+NotInClass <-- OPENBRACKET NOTIN  (!CLOSEBRACKET Range)* CLOSEBRACKET Spacing
+Class <-- OPENBRACKET !NOTIN  (!CLOSEBRACKET Range)* CLOSEBRACKET Spacing
 Range <-- Char DASH Char / Char
 Char <-- '\\\\' [nrtf'\"\\[\\]\\\\]
        / '\\\\' [0-7][0-7][0-7]
@@ -80,6 +81,7 @@ DASH < '-'
 OPENBRACKET < '['
 CLOSEBRACKET < ']'
 HEX <- [0-9a-fA-F]
+NOTIN < '^'
 SLASH < '/' Spacing
 AND <-- '&' Spacing
 NOT <-- '!' Spacing
@@ -124,6 +126,7 @@ EndOfFile < !.
       (and OPEN Expression CLOSE)
       Literal
       Class
+      NotInClass
       DOT))
 (define-sexp-parser Identifier all
   (and IdentStart (* IdentCont) Spacing))
@@ -135,7 +138,11 @@ EndOfFile < !.
   (or (and SQUOTE (* (and (not-followed-by SQUOTE) Char)) SQUOTE Spacing)
       (and DQUOTE (* (and (not-followed-by DQUOTE) Char)) DQUOTE Spacing)))
 (define-sexp-parser Class all
-  (and OPENBRACKET (* (and (not-followed-by CLOSEBRACKET) Range)) CLOSEBRACKET Spacing))
+  (and OPENBRACKET (not-followed-by NOTIN)
+       (* (and (not-followed-by CLOSEBRACKET) Range)) CLOSEBRACKET Spacing))
+(define-sexp-parser NotInClass all
+  (and OPENBRACKET NOTIN
+       (* (and (not-followed-by CLOSEBRACKET) Range)) CLOSEBRACKET Spacing))
 (define-sexp-parser Range all
   (or (and Char DASH Char) Char))
 (define-sexp-parser Char all
@@ -148,6 +155,8 @@ EndOfFile < !.
   (and (or "<--" "<-" "<") Spacing)) ; NOTE: <-- and < are extensions
 (define-sexp-parser HEX body
   (or (range #\0 #\9) (range #\a #\f) (range #\A #\F)))
+(define-sexp-parser NOTIN none
+  (and "^"))
 (define-sexp-parser SLASH none
   (and "/" Spacing))
 (define-sexp-parser AND all
@@ -284,6 +293,7 @@ EndOfFile < !.
       ('Identifier (Identifier->defn value for-syntax))
       ('Expression (Expression->defn value for-syntax))
       ('Literal    (Literal->defn value for-syntax))
+      ('NotInClass (NotInClass->defn value for-syntax))
       ('Class      (Class->defn value for-syntax)))))
 
 ;; (Identifier "hello")
@@ -296,13 +306,43 @@ EndOfFile < !.
 (define (Literal->defn lst for-syntax)
   (apply string (map (lambda (x) (Char->defn x for-syntax)) (cdr lst))))
 
-;; TODO: empty Class can happen: `[]`, but what does it represent?
+;; (NotInClass (Range ...) (Range ...))
+;;  `-> (and (followed-by (not-in-range ...))
+;;           (followed-by (not-in-range ...))
+;;           ...
+;;           (not-in-range ...))
+;; NOTE: the order doesn't matter, because every `not-in-range` parses
+;; exactly one character, but all the elements except the last must not
+;; consume any input.
+(define (NotInClass->defn lst for-syntax)
+  #`(and
+      #,@(map (lambda (x) #`(followed-by #,(NotInRange->defn x for-syntax)))
+              (cddr lst))
+      #,(NotInRange->defn (cadr lst) for-syntax)))
+
 ;; (Class ...)
 ;;  `-> (or ...)
 (define (Class->defn lst for-syntax)
   #`(or #,@(map (lambda (x) (Range->defn x for-syntax))
                 (cdr lst))))
 
+;; For one character:
+;; (Range (Char "a"))
+;;  `-> (not-in-range #\a #\a)
+;; Or for a range:
+;; (Range (Char "a") (Char "b"))
+;;  `-> (not-in-range #\a #\b)
+;; NOTE: It's coming from NotInClass.
+(define (NotInRange->defn lst for-syntax)
+  (match lst
+    (('Range c)
+     (let ((ch (Char->defn c for-syntax)))
+       #`(not-in-range #,ch #,ch)))
+    (('Range range-beginning range-end)
+     #`(not-in-range
+         #,(Char->defn range-beginning for-syntax)
+         #,(Char->defn range-end       for-syntax)))))
+
 ;; For one character:
 ;; (Range (Char "a"))
 ;;  `-> "a"
diff --git a/test-suite/tests/peg.test b/test-suite/tests/peg.test
index 1136c03f1..5570fbfa8 100644
--- a/test-suite/tests/peg.test
+++ b/test-suite/tests/peg.test
@@ -38,6 +38,7 @@
     (Identifier Identifier)
     (Literal Literal)
     (Class Class)
+    (NotInClass NotInClass)
     (Range Range)
     (Char Char)
     (LEFTARROW LEFTARROW)
@@ -283,4 +284,3 @@ number <-- [0-9]+")
    "1+1/2*3+(1+1)/2"
    (equal? (eq-parse "1+1/2*3+(1+1)/2")
 	   '(+ (+ 1 (* (/ 1 2) 3)) (/ (+ 1 1) 2)))))
-
-- 
2.46.0


[-- Attachment #4: v5-0001-PEG-Add-full-support-for-PEG-some-extensions.patch --]
[-- Type: text/x-patch, Size: 23039 bytes --]

From aa35db3cbc691b57c85374d6a269e8d344a0440a Mon Sep 17 00:00:00 2001
From: Ekaitz Zarraga <ekaitz@elenq.tech>
Date: Wed, 11 Sep 2024 21:19:26 +0200
Subject: [PATCH v5 1/3] PEG: Add full support for PEG + some extensions

This commit adds support for PEG as described in:

    <https://bford.info/pub/lang/peg.pdf>

It adds support for the missing features (comments, underscores in
identifiers and escaping) while keeping the extensions (dashes in
identifiers, < and <--).

The naming system tries to be as close as possible to the one proposed
in the paper.

* module/ice-9/peg/string-peg.scm: Rewrite PEG parser.
* test-suite/tests/peg.test: Fix import.
---
 NEWS                            |   7 +
 doc/ref/api-peg.texi            |   8 +-
 module/ice-9/peg/string-peg.scm | 466 ++++++++++++++++++++------------
 test-suite/tests/peg.test       |  32 ++-
 4 files changed, 333 insertions(+), 180 deletions(-)

diff --git a/NEWS b/NEWS
index 9fd14c39d..df43f3754 100644
--- a/NEWS
+++ b/NEWS
@@ -27,6 +27,13 @@ Changes in 3.0.11 (since 3.0.10)
 ** Guile is compiled with -fexcess-precision=standard for i[3456]86 when possible
    (<https://debbugs.gnu.org/43262>)
 
+* New interfaces and functionality
+
+** PEG parser
+
+PEG parser has been rewritten to cover all the functionality defined in
+<https://bford.info/pub/lang/peg.pdf>.
+
 \f
 Changes in 3.0.10 (since 3.0.9)
 
diff --git a/doc/ref/api-peg.texi b/doc/ref/api-peg.texi
index d34ddc64c..84a9e6c6b 100644
--- a/doc/ref/api-peg.texi
+++ b/doc/ref/api-peg.texi
@@ -17,6 +17,10 @@ Wikipedia has a clear and concise introduction to PEGs if you want to
 familiarize yourself with the syntax:
 @url{http://en.wikipedia.org/wiki/Parsing_expression_grammar}.
 
+The paper that introduced PEG describes how PEG works and specifies its
+syntax in detail:
+@url{https://bford.info/pub/lang/peg.pdf}
+
 The @code{(ice-9 peg)} module works by compiling PEGs down to lambda
 expressions.  These can either be stored in variables at compile-time by
 the define macros (@code{define-peg-pattern} and
@@ -216,8 +220,8 @@ should propagate up the parse tree.  The normal @code{<-} propagates the
 matched text up the parse tree, @code{<--} propagates the matched text
 up the parse tree tagged with the name of the nonterminal, and @code{<}
 discards that matched text and propagates nothing up the parse tree.
-Also, nonterminals may consist of any alphanumeric character or a ``-''
-character (in normal PEGs nonterminals can only be alphabetic).
+Also, nonterminals may include the ``-'' character, which standard PEG does
+not allow.
 
 For example, if we:
 @lisp
diff --git a/module/ice-9/peg/string-peg.scm b/module/ice-9/peg/string-peg.scm
index 45ed14bb1..ede24181c 100644
--- a/module/ice-9/peg/string-peg.scm
+++ b/module/ice-9/peg/string-peg.scm
@@ -1,6 +1,7 @@
 ;;;; string-peg.scm --- representing PEG grammars as strings
 ;;;;
-;;;; 	Copyright (C) 2010, 2011 Free Software Foundation, Inc.
+;;;; 	Copyright (C) 2010, 2011, Free Software Foundation, Inc.
+;;;; 	Copyright (C) 2024 Ekaitz Zarraga <ekaitz@elenq.tech>
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -21,10 +22,15 @@
   #:export (peg-as-peg
             define-peg-string-patterns
             peg-grammar)
+  #:use-module (ice-9 match)
   #:use-module (ice-9 peg using-parsers)
+  #:use-module (srfi srfi-1)
   #:use-module (ice-9 peg codegen)
   #:use-module (ice-9 peg simplify-tree))
 
+;; This module provides support for PEG as described in:
+;;   <https://bford.info/pub/lang/peg.pdf>
+
 ;; Gets the left-hand depth of a list.
 (define (depth lst)
   (if (or (not (list? lst)) (null? lst))
@@ -38,22 +44,60 @@
 
 ;; Grammar for PEGs in PEG grammar.
 (define peg-as-peg
-"grammar <-- (nonterminal ('<--' / '<-' / '<') sp pattern)+
-pattern <-- alternative (SLASH sp alternative)*
-alternative <-- ([!&]? sp suffix)+
-suffix <-- primary ([*+?] sp)*
-primary <-- '(' sp pattern ')' sp / '.' sp / literal / charclass / nonterminal !'<'
-literal <-- ['] (!['] .)* ['] sp
-charclass <-- LB (!']' (CCrange / CCsingle))* RB sp
-CCrange <-- . '-' .
-CCsingle <-- .
-nonterminal <-- [a-zA-Z0-9-]+ sp
-sp < [ \t\n]*
-SLASH < '/'
-LB < '['
-RB < ']'
+"# Hierarchical syntax
+Grammar <-- Spacing Definition+ EndOfFile
+Definition <-- Identifier LEFTARROW Expression
+
+Expression <-- Sequence (SLASH Sequence)*
+Sequence <-- Prefix*
+Prefix <-- (AND / NOT)? Suffix
+Suffix <-- Primary (QUESTION / STAR / PLUS)?
+Primary <-- Identifier !LEFTARROW
+           / OPEN Expression CLOSE
+           / Literal / Class / DOT
+
+# Lexical syntax
+Identifier <-- IdentStart IdentCont* Spacing
+# NOTE: `-` is an extension
+IdentStart <- [a-zA-Z_] / '-'
+IdentCont <- IdentStart / [0-9]
+
+Literal <-- SQUOTE (!SQUOTE Char)* SQUOTE Spacing
+        / DQUOTE (!DQUOTE Char)* DQUOTE Spacing
+Class <-- OPENBRACKET (!CLOSEBRACKET Range)* CLOSEBRACKET Spacing
+Range <-- Char DASH Char / Char
+Char <-- '\\\\' [nrtf'\"\\[\\]\\\\]
+       / '\\\\' [0-7][0-7][0-7]
+       / '\\\\' [0-7][0-7]?
+       / '\\\\' 'u' HEX HEX HEX HEX
+       / !'\\\\' .
+
+# NOTE: `<--` and `<` are extensions
+LEFTARROW <- ('<--' / '<-' / '<') Spacing
+SQUOTE < [']
+DQUOTE < [\"]
+DASH < '-'
+OPENBRACKET < '['
+CLOSEBRACKET < ']'
+HEX <- [0-9a-fA-F]
+SLASH < '/' Spacing
+AND <-- '&' Spacing
+NOT <-- '!' Spacing
+QUESTION <-- '?' Spacing
+STAR <-- '*' Spacing
+PLUS <-- '+' Spacing
+OPEN < '(' Spacing
+CLOSE < ')' Spacing
+DOT <-- '.' Spacing
+
+Spacing < (Space / Comment)*
+Comment < '#' (!EndOfLine .)* EndOfLine
+Space < ' ' / '\\t' / EndOfLine
+EndOfLine < '\\r\\n' / '\\n' / '\\r'
+EndOfFile < !.
 ")
 
+
 (define-syntax define-sexp-parser
   (lambda (x)
     (syntax-case x ()
@@ -63,35 +107,81 @@ RB < ']'
               (syn (wrap-parser-for-users x matchf accumsym #'sym)))
            #`(define sym #,syn))))))
 
-(define-sexp-parser peg-grammar all
-  (+ (and peg-nonterminal (or "<--" "<-" "<") peg-sp peg-pattern)))
-(define-sexp-parser peg-pattern all
-  (and peg-alternative
-       (* (and (ignore "/") peg-sp peg-alternative))))
-(define-sexp-parser peg-alternative all
-  (+ (and (? (or "!" "&")) peg-sp peg-suffix)))
-(define-sexp-parser peg-suffix all
-  (and peg-primary (* (and (or "*" "+" "?") peg-sp))))
-(define-sexp-parser peg-primary all
-  (or (and "(" peg-sp peg-pattern ")" peg-sp)
-      (and "." peg-sp)
-      peg-literal
-      peg-charclass
-      (and peg-nonterminal (not-followed-by "<"))))
-(define-sexp-parser peg-literal all
-  (and "'" (* (and (not-followed-by "'") peg-any)) "'" peg-sp))
-(define-sexp-parser peg-charclass all
-  (and (ignore "[")
-       (* (and (not-followed-by "]")
-               (or charclass-range charclass-single)))
-       (ignore "]")
-       peg-sp))
-(define-sexp-parser charclass-range all (and peg-any "-" peg-any))
-(define-sexp-parser charclass-single all peg-any)
-(define-sexp-parser peg-nonterminal all
-  (and (+ (or (range #\a #\z) (range #\A #\Z) (range #\0 #\9) "-")) peg-sp))
-(define-sexp-parser peg-sp none
-  (* (or " " "\t" "\n")))
+(define-sexp-parser Grammar all
+  (and Spacing (+ Definition) EndOfFile))
+(define-sexp-parser Definition all
+  (and Identifier LEFTARROW Expression))
+(define-sexp-parser Expression all
+  (and Sequence (* (and SLASH Sequence))))
+(define-sexp-parser Sequence all
+  (* Prefix))
+(define-sexp-parser Prefix all
+  (and (? (or AND NOT)) Suffix))
+(define-sexp-parser Suffix all
+  (and Primary (? (or QUESTION STAR PLUS))))
+(define-sexp-parser Primary all
+  (or (and Identifier (not-followed-by LEFTARROW))
+      (and OPEN Expression CLOSE)
+      Literal
+      Class
+      DOT))
+(define-sexp-parser Identifier all
+  (and IdentStart (* IdentCont) Spacing))
+(define-sexp-parser IdentStart body
+  (or (or (range #\a #\z) (range #\A #\Z) "_") "-")) ; NOTE: - is an extension
+(define-sexp-parser IdentCont body
+  (or IdentStart (range #\0 #\9)))
+(define-sexp-parser Literal all
+  (or (and SQUOTE (* (and (not-followed-by SQUOTE) Char)) SQUOTE Spacing)
+      (and DQUOTE (* (and (not-followed-by DQUOTE) Char)) DQUOTE Spacing)))
+(define-sexp-parser Class all
+  (and OPENBRACKET (* (and (not-followed-by CLOSEBRACKET) Range)) CLOSEBRACKET Spacing))
+(define-sexp-parser Range all
+  (or (and Char DASH Char) Char))
+(define-sexp-parser Char all
+  (or (and "\\" (or "n" "r" "t" "f" "'" "\"" "[" "]" "\\"))
+      (and "\\" (range #\0 #\7) (range #\0 #\7) (range #\0 #\7))
+      (and "\\" (range #\0 #\7) (? (range #\0 #\7)))
+      (and "\\" "u" HEX HEX HEX HEX)
+      (and (not-followed-by "\\") peg-any)))
+(define-sexp-parser LEFTARROW body
+  (and (or "<--" "<-" "<") Spacing)) ; NOTE: <-- and < are extensions
+(define-sexp-parser HEX body
+  (or (range #\0 #\9) (range #\a #\f) (range #\A #\F)))
+(define-sexp-parser SLASH none
+  (and "/" Spacing))
+(define-sexp-parser AND all
+  (and "&" Spacing))
+(define-sexp-parser NOT all
+  (and "!" Spacing))
+(define-sexp-parser QUESTION all
+  (and "?" Spacing))
+(define-sexp-parser STAR all
+  (and "*" Spacing))
+(define-sexp-parser PLUS all
+  (and "+" Spacing))
+(define-sexp-parser OPEN none
+  (and "(" Spacing))
+(define-sexp-parser CLOSE none
+  (and ")" Spacing))
+(define-sexp-parser DOT all
+  (and "." Spacing))
+(define-sexp-parser SQUOTE none "'")
+(define-sexp-parser DQUOTE none "\"")
+(define-sexp-parser OPENBRACKET none "[")
+(define-sexp-parser CLOSEBRACKET none "]")
+(define-sexp-parser DASH none "-")
+(define-sexp-parser Spacing none
+  (* (or Space Comment)))
+(define-sexp-parser Comment none
+  (and "#" (* (and (not-followed-by EndOfLine) peg-any)) EndOfLine))
+(define-sexp-parser Space none
+  (or " " "\t" EndOfLine))
+(define-sexp-parser EndOfLine none
+  (or "\r\n" "\n" "\r"))
+(define-sexp-parser EndOfFile none
+  (not-followed-by peg-any))
+
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;;;; PARSE STRING PEGS
@@ -101,7 +191,7 @@ RB < ']'
 ;; will define all of the nonterminals in the grammar with equivalent
 ;; PEG s-expressions.
 (define (peg-parser str for-syntax)
-  (let ((parsed (match-pattern peg-grammar str)))
+  (let ((parsed (match-pattern Grammar str)))
     (if (not parsed)
         (begin
           ;; (display "Invalid PEG grammar!\n")
@@ -110,11 +200,169 @@ RB < ']'
           (cond
            ((or (not (list? lst)) (null? lst))
             lst)
-           ((eq? (car lst) 'peg-grammar)
-            #`(begin
-                #,@(map (lambda (x) (peg-nonterm->defn x for-syntax))
-                        (context-flatten (lambda (lst) (<= (depth lst) 2))
-                                         (cdr lst))))))))))
+           ((eq? (car lst) 'Grammar)
+            (Grammar->defn lst for-syntax)))))))
+
+;; (Grammar (Definition ...) (Definition ...))
+(define (Grammar->defn lst for-syntax)
+  #`(begin
+      #,@(map (lambda (x) (Definition->defn x for-syntax))
+              (context-flatten (lambda (lst) (<= (depth lst) 1))
+                               (cdr lst)))))
+
+;; (Definition (Identifier "Something") "<--" (Expression ...))
+;;  `-> (define-peg-pattern Something all ...)
+(define (Definition->defn lst for-syntax)
+  (match lst
+    (('Definition ('Identifier identifier) grabber expression)
+     #`(define-peg-pattern
+         #,(datum->syntax for-syntax (string->symbol identifier))
+         #,(match grabber
+                  ("<--" (datum->syntax for-syntax 'all))
+                  ("<-"  (datum->syntax for-syntax 'body))
+                  ("<"   (datum->syntax for-syntax 'none)))
+         #,(compressor
+             (Expression->defn expression for-syntax)
+             for-syntax)))))
+
+;; (Expression X)
+;;  `-> (or X)
+;; (Expression X Y)
+;;  `-> (or X Y)
+;; (Expression X (Y Z ...))
+;;  `-> (or X Y Z ...)
+(define (Expression->defn lst for-syntax)
+  (match lst
+    (('Expression seq ...)
+     #`(or #,@(map (lambda (x) (Sequence->defn x for-syntax))
+                   (keyword-flatten '(Sequence) seq))))))
+
+;; (Sequence X)
+;;  `-> (and X)
+;; (Sequence X Y)
+;;  `-> (and X Y)
+;; (Sequence X (Y Z ...))
+;;  `-> (and X Y Z ...)
+(define (Sequence->defn lst for-syntax)
+  (match lst
+    (('Sequence pre ...)
+     #`(and #,@(map (lambda (x) (Prefix->defn x for-syntax))
+                    (keyword-flatten '(Prefix) pre))))))
+
+;; (Prefix (Suffix ...))
+;;  `-> (...)
+;; (Prefix (NOT "!") (Suffix ...))
+;;  `-> (not-followed-by ...)
+;; (Prefix (AND "&") (Suffix ...))
+;;  `-> (followed-by ...)
+(define (Prefix->defn lst for-syntax)
+  (match lst
+    (('Prefix ('AND _) su) #`(followed-by     #,(Suffix->defn su for-syntax)))
+    (('Prefix ('NOT _) su) #`(not-followed-by #,(Suffix->defn su for-syntax)))
+    (('Prefix suffix) (Suffix->defn suffix for-syntax))))
+
+;; (Suffix (Primary ...))
+;;  `-> (...)
+;; (Suffix (Primary ...) (STAR "*"))
+;;  `-> (* ...)
+;; (Suffix (Primary ...) (QUESTION "?"))
+;;  `-> (? ...)
+;; (Suffix (Primary ...) (PLUS "+"))
+;;  `-> (+ ...)
+(define (Suffix->defn lst for-syntax)
+  (match lst
+    (('Suffix prim)               (Primary->defn prim for-syntax))
+    (('Suffix prim ('STAR     _)) #`(* #,(Primary->defn prim for-syntax)))
+    (('Suffix prim ('QUESTION _)) #`(? #,(Primary->defn prim for-syntax)))
+    (('Suffix prim ('PLUS     _)) #`(+ #,(Primary->defn prim for-syntax)))))
+
+
+(define (Primary->defn lst for-syntax)
+  (let ((value (second lst)))
+    (match (car value)
+      ('DOT        #'peg-any)
+      ('Identifier (Identifier->defn value for-syntax))
+      ('Expression (Expression->defn value for-syntax))
+      ('Literal    (Literal->defn value for-syntax))
+      ('Class      (Class->defn value for-syntax)))))
+
+;; (Identifier "hello")
+;;  `-> hello
+(define (Identifier->defn lst for-syntax)
+  (datum->syntax for-syntax (string->symbol (second lst))))
+
+;; (Literal (Char "a") (Char "b") (Char "c"))
+;;  `-> "abc"
+(define (Literal->defn lst for-syntax)
+  (apply string (map (lambda (x) (Char->defn x for-syntax)) (cdr lst))))
+
+;; TODO: empty Class can happen: `[]`, but what does it represent?
+;; (Class ...)
+;;  `-> (or ...)
+(define (Class->defn lst for-syntax)
+  #`(or #,@(map (lambda (x) (Range->defn x for-syntax))
+                (cdr lst))))
+
+;; For one character:
+;; (Range (Char "a"))
+;;  `-> "a"
+;; Or for a range:
+;; (Range (Char "a") (Char "b"))
+;;  `-> (range #\a #\b)
+(define (Range->defn lst for-syntax)
+  (match lst
+    (('Range ch)
+     (string (Char->defn ch for-syntax)))
+    (('Range range-beginning range-end)
+     #`(range
+         #,(Char->defn range-beginning for-syntax)
+         #,(Char->defn range-end       for-syntax)))))
+
+;; (Char "a")
+;;  `-> #\a
+;; (Char "\\n")
+;;  `-> #\newline
+;; (Char "\\135")
+;;  `-> #\]
+(define (Char->defn lst for-syntax)
+  (let* ((charstr (second lst))
+         (first   (string-ref charstr 0)))
+    (cond
+      ((= 1 (string-length charstr)) first)
+      ((char-numeric? (string-ref charstr 1))
+       (integer->char
+         (reduce + 0
+                 (map
+                   (lambda (x y)
+                     (* (- (char->integer x) (char->integer #\0)) y))
+                   (reverse (string->list charstr 1))
+                   '(1 8 64)))))
+      ((char=? #\u (string-ref charstr 1))
+       (integer->char
+         (reduce + 0
+                 (map
+                   (lambda (x y)
+                     (* (cond
+                          ((char-numeric? x)
+                           (- (char->integer x) (char->integer #\0)))
+                          ((char-alphabetic? x)
+                           (+ 10 (- (char->integer x) (char->integer #\a)))))
+                        y))
+                   (reverse (string->list (string-downcase charstr) 2))
+                   '(1 16 256 4096)))))
+      (else
+        (case (string-ref charstr 1)
+          ((#\n) #\newline)
+          ((#\r) #\return)
+          ((#\t) #\tab)
+          ((#\f) #\page)
+          ((#\') #\')
+          ((#\") #\")
+          ((#\]) #\])
+          ((#\\) #\\)
+          ((#\[) #\[))))))
+
+(define peg-grammar Grammar)
 
 ;; Macro wrapper for PEG-PARSER.  Parses PEG grammars expressed as strings and
 ;; defines all the appropriate nonterminals.
@@ -124,119 +372,6 @@ RB < ']'
       ((_ str)
        (peg-parser (syntax->datum #'str) x)))))
 
-;; lst has format (nonterm grabber pattern), where
-;;   nonterm is a symbol (the name of the nonterminal),
-;;   grabber is a string (either "<", "<-" or "<--"), and
-;;   pattern is the parse of a PEG pattern expressed as as string.
-(define (peg-nonterm->defn lst for-syntax)
-  (let* ((nonterm (car lst))
-         (grabber (cadr lst))
-         (pattern (caddr lst))
-         (nonterm-name (datum->syntax for-syntax
-                                      (string->symbol (cadr nonterm)))))
-    #`(define-peg-pattern #,nonterm-name
-       #,(cond
-          ((string=? grabber "<--") (datum->syntax for-syntax 'all))
-          ((string=? grabber "<-") (datum->syntax for-syntax 'body))
-          (else (datum->syntax for-syntax 'none)))
-       #,(compressor (peg-pattern->defn pattern for-syntax) for-syntax))))
-
-;; lst has format ('peg-pattern ...).
-;; After the context-flatten, (cdr lst) has format
-;;   (('peg-alternative ...) ...), where the outer list is a collection
-;;   of elements from a '/' alternative.
-(define (peg-pattern->defn lst for-syntax)
-  #`(or #,@(map (lambda (x) (peg-alternative->defn x for-syntax))
-                (context-flatten (lambda (x) (eq? (car x) 'peg-alternative))
-                                 (cdr lst)))))
-
-;; lst has format ('peg-alternative ...).
-;; After the context-flatten, (cdr lst) has the format
-;;   (item ...), where each item has format either ("!" ...), ("&" ...),
-;;   or ('peg-suffix ...).
-(define (peg-alternative->defn lst for-syntax)
-  #`(and #,@(map (lambda (x) (peg-body->defn x for-syntax))
-                 (context-flatten (lambda (x) (or (string? (car x))
-                                             (eq? (car x) 'peg-suffix)))
-                                  (cdr lst)))))
-
-;; lst has the format either
-;;   ("!" ('peg-suffix ...)), ("&" ('peg-suffix ...)), or
-;;     ('peg-suffix ...).
-(define (peg-body->defn lst for-syntax)
-    (cond
-      ((equal? (car lst) "&")
-       #`(followed-by #,(peg-suffix->defn (cadr lst) for-syntax)))
-      ((equal? (car lst) "!")
-       #`(not-followed-by #,(peg-suffix->defn (cadr lst) for-syntax)))
-      ((eq? (car lst) 'peg-suffix)
-       (peg-suffix->defn lst for-syntax))
-      (else `(peg-parse-body-fail ,lst))))
-
-;; lst has format ('peg-suffix <peg-primary> (? (/ "*" "?" "+")))
-(define (peg-suffix->defn lst for-syntax)
-  (let ((inner-defn (peg-primary->defn (cadr lst) for-syntax)))
-    (cond
-      ((null? (cddr lst))
-       inner-defn)
-      ((equal? (caddr lst) "*")
-       #`(* #,inner-defn))
-      ((equal? (caddr lst) "?")
-       #`(? #,inner-defn))
-      ((equal? (caddr lst) "+")
-       #`(+ #,inner-defn)))))
-
-;; Parse a primary.
-(define (peg-primary->defn lst for-syntax)
-  (let ((el (cadr lst)))
-  (cond
-   ((list? el)
-    (cond
-     ((eq? (car el) 'peg-literal)
-      (peg-literal->defn el for-syntax))
-     ((eq? (car el) 'peg-charclass)
-      (peg-charclass->defn el for-syntax))
-     ((eq? (car el) 'peg-nonterminal)
-      (datum->syntax for-syntax (string->symbol (cadr el))))))
-   ((string? el)
-    (cond
-     ((equal? el "(")
-      (peg-pattern->defn (caddr lst) for-syntax))
-     ((equal? el ".")
-      (datum->syntax for-syntax 'peg-any))
-     (else (datum->syntax for-syntax
-                          `(peg-parse-any unknown-string ,lst)))))
-   (else (datum->syntax for-syntax
-                        `(peg-parse-any unknown-el ,lst))))))
-
-;; Trims characters off the front and end of STR.
-;; (trim-1chars "'ab'") -> "ab"
-(define (trim-1chars str) (substring str 1 (- (string-length str) 1)))
-
-;; Parses a literal.
-(define (peg-literal->defn lst for-syntax)
-  (datum->syntax for-syntax (trim-1chars (cadr lst))))
-
-;; Parses a charclass.
-(define (peg-charclass->defn lst for-syntax)
-  #`(or
-     #,@(map
-         (lambda (cc)
-           (cond
-            ((eq? (car cc) 'charclass-range)
-             #`(range #,(datum->syntax
-                         for-syntax
-                         (string-ref (cadr cc) 0))
-                      #,(datum->syntax
-                         for-syntax
-                         (string-ref (cadr cc) 2))))
-            ((eq? (car cc) 'charclass-single)
-             (datum->syntax for-syntax (cadr cc)))))
-         (context-flatten
-          (lambda (x) (or (eq? (car x) 'charclass-range)
-                          (eq? (car x) 'charclass-single)))
-          (cdr lst)))))
-
 ;; Compresses a list to save the optimizer work.
 ;; e.g. (or (and a)) -> a
 (define (compressor-core lst)
@@ -263,11 +398,10 @@ RB < ']'
      (let ((string (syntax->datum #'str-stx)))
        (compile-peg-pattern
         (compressor
-         (peg-pattern->defn
-          (peg:tree (match-pattern peg-pattern string)) #'str-stx)
+         (Expression->defn
+          (peg:tree (match-pattern Expression string)) #'str-stx)
          #'str-stx)
         (if (eq? accum 'all) 'body accum))))
      (else (error "Bad embedded PEG string" args))))
 
 (add-peg-compiler! 'peg peg-string-compile)
-
diff --git a/test-suite/tests/peg.test b/test-suite/tests/peg.test
index f516571e8..1136c03f1 100644
--- a/test-suite/tests/peg.test
+++ b/test-suite/tests/peg.test
@@ -28,17 +28,25 @@
 ;; the nonterminals defined in the PEG parser written with
 ;; S-expressions.
 (define grammar-mapping
-  '((grammar peg-grammar)
-    (pattern peg-pattern)
-    (alternative peg-alternative)
-    (suffix peg-suffix)
-    (primary peg-primary)
-    (literal peg-literal)
-    (charclass peg-charclass)
-    (CCrange charclass-range)
-    (CCsingle charclass-single)
-    (nonterminal peg-nonterminal)
-    (sp peg-sp)))
+  '((Grammar Grammar)
+    (Definition Definition)
+    (Expression Expression)
+    (Sequence Sequence)
+    (Prefix Prefix)
+    (Suffix Suffix)
+    (Primary Primary)
+    (Identifier Identifier)
+    (Literal Literal)
+    (Class Class)
+    (Range Range)
+    (Char Char)
+    (LEFTARROW LEFTARROW)
+    (AND AND)
+    (NOT NOT)
+    (QUESTION QUESTION)
+    (STAR STAR)
+    (PLUS PLUS)
+    (DOT DOT)))
 
 ;; Transforms the nonterminals defined in the PEG parser written as a PEG to the nonterminals defined in the PEG parser written with S-expressions.
 (define (grammar-transform x)
@@ -69,7 +77,7 @@
     (peg:tree (match-pattern (@@ (ice-9 peg) peg-grammar) (@@ (ice-9 peg) peg-as-peg)))
     (tree-map
      grammar-transform
-     (peg:tree (match-pattern grammar (@@ (ice-9 peg) peg-as-peg)))))))
+     (peg:tree (match-pattern Grammar (@@ (ice-9 peg) peg-as-peg)))))))
 
 ;; A grammar for pascal-style comments from Wikipedia.
 (define comment-grammar
-- 
2.46.0
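
A note on the escape handling in `Char->defn` above: the octal and `\u` branches sum digit values against hand-written place-value lists. The sketch below shows the same decoding using the standard `string->number` with an explicit radix. `decode-peg-char` is a hypothetical name used only for illustration and is not part of the patch. The `else` branch that returns the escaped character itself is an assumption of this sketch, whereas the patch lists each accepted escape explicitly.

```scheme
;; Hypothetical helper, NOT part of the patch: turn the text captured by
;; the Char rule into a Scheme character.
(define (decode-peg-char charstr)
  (cond
   ;; A plain, unescaped character such as "a".
   ((= 1 (string-length charstr))
    (string-ref charstr 0))
   ;; Octal escape such as "\\135": parse the digits after the backslash.
   ((char-numeric? (string-ref charstr 1))
    (integer->char (string->number (substring charstr 1) 8)))
   ;; Unicode escape such as "\\u005D": parse the digits after "\\u".
   ((char=? #\u (string-ref charstr 1))
    (integer->char (string->number (substring charstr 2) 16)))
   ;; Single-character escapes.
   (else
    (case (string-ref charstr 1)
      ((#\n) #\newline)
      ((#\r) #\return)
      ((#\t) #\tab)
      ((#\f) #\page)
      ;; Assumption of this sketch: quotes, brackets and the backslash
      ;; stand for themselves.
      (else (string-ref charstr 1))))))

(decode-peg-char "\\135")   ; => #\]
(decode-peg-char "\\u005D") ; => #\]
(decode-peg-char "\\n")     ; => #\newline
```

This version does not validate its input, so it relies on the `Char` rule having already restricted what can reach it, just as the patch does.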

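To make the tree shapes in the `Prefix->defn` and `Suffix->defn` comments easier to follow, here is a minimal datum-level sketch of the same mapping. It works on quoted lists with quasiquote instead of syntax objects, and it only handles `Identifier` primaries. The names `prefix->sexp`, `suffix->sexp` and `primary->sexp` are hypothetical and do not appear in the patch.

```scheme
(use-modules (ice-9 match))

;; Hypothetical, reduced analogue of Primary->defn: identifiers only.
(define (primary->sexp tree)
  (match tree
    (('Primary ('Identifier name)) (string->symbol name))))

;; Hypothetical analogue of Suffix->defn working on plain data.
(define (suffix->sexp tree)
  (match tree
    (('Suffix prim)               (primary->sexp prim))
    (('Suffix prim ('STAR     _)) `(* ,(primary->sexp prim)))
    (('Suffix prim ('QUESTION _)) `(? ,(primary->sexp prim)))
    (('Suffix prim ('PLUS     _)) `(+ ,(primary->sexp prim)))))

;; Hypothetical analogue of Prefix->defn working on plain data.
(define (prefix->sexp tree)
  (match tree
    (('Prefix ('AND _) su) `(followed-by     ,(suffix->sexp su)))
    (('Prefix ('NOT _) su) `(not-followed-by ,(suffix->sexp su)))
    (('Prefix su)          (suffix->sexp su))))

;; The parse tree that a pattern like "!digit+" would have, per the
;; comments in the patch:
(prefix->sexp
 '(Prefix (NOT "!") (Suffix (Primary (Identifier "digit")) (PLUS "+"))))
;; => (not-followed-by (+ digit))
```

The real functions build syntax objects with `#`( ... )` and `datum->syntax` so that the generated identifiers are resolved in the caller's lexical context. The sketch drops that part to show only the structural rewrite.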