From: Ekaitz Zarraga
Newsgroups: gmane.lisp.guile.bugs
To: 73188@debbugs.gnu.org
Cc: ludo@gnu.org, Ekaitz Zarraga
Subject: bug#73188: [PATCH 2/3] PEG: string-peg: better support for escaping
Date: Sun, 22 Dec 2024 21:01:07 +0100
Message-ID: <20241222200128.13782-2-ekaitz@elenq.tech>
References: <20241222200128.13782-1-ekaitz@elenq.tech>
In-Reply-To: <20241222200128.13782-1-ekaitz@elenq.tech>
List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language"

---
 module/ice-9/peg/string-peg.scm | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/module/ice-9/peg/string-peg.scm b/module/ice-9/peg/string-peg.scm
index 745d8e8e7..9891f2ae5 100644
--- a/module/ice-9/peg/string-peg.scm
+++ b/module/ice-9/peg/string-peg.scm
@@ -67,9 +67,10 @@ Literal <-- SQUOTE (!SQUOTE Char)* SQUOTE Spacing
 NotInClass <-- OPENBRACKET NOTIN (!CLOSEBRACKET Range)* CLOSEBRACKET Spacing
 Class <-- OPENBRACKET !NOTIN (!CLOSEBRACKET Range)* CLOSEBRACKET Spacing
 Range <-- Char DASH Char / Char
-Char <-- '\\\\' [nrt'\"\\[\\]\\\\]
+Char <-- '\\\\' [nrtf'\"\\[\\]\\\\]
        / '\\\\' [0-7][0-7][0-7]
        / '\\\\' [0-7][0-7]?
+       / '\\\\' 'u' HEX HEX HEX HEX
        / !'\\\\' .

 # NOTE: `<--` and `<` are extensions
@@ -79,6 +80,7 @@ DQUOTE < [\"]
 DASH < '-'
 OPENBRACKET < '['
 CLOSEBRACKET < ']'
+HEX <- [0-9a-fA-F]
 NOTIN < '^'
 SLASH < '/' Spacing
 AND <-- '&' Spacing
@@ -92,7 +94,7 @@ DOT <-- '.' Spacing

 Spacing < (Space / Comment)*
 Comment < '#' (!EndOfLine .)* EndOfLine
-Space < ' ' / '\t' / EndOfLine
+Space < ' ' / '\\t' / EndOfLine
 EndOfLine < '\\r\\n' / '\\n' / '\\r'
 EndOfFile < !.
 ")
@@ -144,12 +146,15 @@ EndOfFile < !.
 (define-sexp-parser Range all
   (or (and Char DASH Char) Char))
 (define-sexp-parser Char all
-  (or (and "\\" (or "n" "r" "t" "'" "\"" "[" "]" "\\"))
+  (or (and "\\" (or "n" "r" "t" "f" "'" "\"" "[" "]" "\\"))
       (and "\\" (range #\0 #\7) (range #\0 #\7) (range #\0 #\7))
       (and "\\" (range #\0 #\7) (? (range #\0 #\7)))
+      (and "\\" "u" HEX HEX HEX HEX)
       (and (not-followed-by "\\") peg-any)))
 (define-sexp-parser LEFTARROW body
   (and (or "<--" "<-" "<") Spacing)) ; NOTE: <-- and < are extensions
+(define-sexp-parser HEX body
+  (or (range #\0 #\9) (range #\a #\f) (range #\A #\F)))
 (define-sexp-parser NOTIN none
   (and "^"))
 (define-sexp-parser SLASH none
@@ -372,12 +377,27 @@ EndOfFile < !.
                  (* (- (char->integer x) (char->integer #\0)) y))
                (reverse (string->list charstr 1))
                '(1 8 64)))))
+      ((char=? #\u (string-ref charstr 1))
+       (integer->char
+         (reduce + 0
+            (map
+              (lambda (x y)
+                (* (cond
+                     ((char-numeric? x)
+                      (- (char->integer x) (char->integer #\0)))
+                     ((char-alphabetic? x)
+                      (+ 10 (- (char->integer x) (char->integer #\a)))))
+                   y))
+              (reverse (string->list (string-downcase charstr) 2))
+              '(1 16 256 4096)))))
       (else
         (case (string-ref charstr 1)
           ((#\n) #\newline)
           ((#\r) #\return)
           ((#\t) #\tab)
+          ((#\f) #\page)
           ((#\') #\')
+          ((#\") #\")
           ((#\]) #\])
           ((#\\) #\\)
           ((#\[) #\[))))))
-- 
2.46.0
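
Reviewer note, not part of the patch: a minimal sketch of what the new
`\uXXXX` branch computes, for checking the conversion by hand. The helper
name `hex-escape->char` is hypothetical. It also swaps the patch's
digit-by-digit `reduce`/`map` for the standard `string->number` with
radix 16, which should yield the same character when the input is a
backslash, a `u`, and exactly four valid hex digits (the shape the `Char`
rule accepts).

```scheme
;; Hypothetical helper, NOT in string-peg.scm: decode an escape of the
;; form backslash, "u", four hex digits into the corresponding character.
;; Assumes the input was already validated by the grammar, so indices 2..5
;; are hex digits in either case.
(define (hex-escape->char charstr)
  ;; Skip the leading backslash and "u", then read the rest in base 16.
  (integer->char (string->number (substring charstr 2) 16)))

;; "\\u0041" is the six-character string \ u 0 0 4 1.
(hex-escape->char "\\u0041")   ; code point #x41, the character A
```

`string->number` accepts both upper- and lower-case hex digits, so this
variant would not need the `string-downcase` step the patch uses.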