From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Matt Wette Newsgroups: gmane.lisp.guile.user Subject: [potluck dish] the module (potluck regexc) Date: Tue, 16 Feb 2016 05:30:18 -0800 Message-ID: <09B4C160-3751-414E-9A8A-19594972F95F@verizon.net> References: <87mvr8ood1.fsf@gnu.org> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\)) Content-Type: multipart/alternative; boundary="Apple-Mail=_3681BDE2-C171-4F57-BABA-087860B884F7" X-Trace: ger.gmane.org 1455629484 23721 80.91.229.3 (16 Feb 2016 13:31:24 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 16 Feb 2016 13:31:24 +0000 (UTC) To: guile-user@gnu.org Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Tue Feb 16 14:31:15 2016 Return-path: Envelope-to: guile-user@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1aVfik-0003gm-PU for guile-user@m.gmane.org; Tue, 16 Feb 2016 14:31:15 +0100 Original-Received: from localhost ([::1]:45951 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aVfik-0000le-2U for guile-user@m.gmane.org; Tue, 16 Feb 2016 08:31:14 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:43954) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aVfiN-0000lO-SW for guile-user@gnu.org; Tue, 16 Feb 2016 08:30:53 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aVfiK-0006xc-Cl for guile-user@gnu.org; Tue, 16 Feb 2016 08:30:51 -0500 Original-Received: from vms173021pub.verizon.net ([206.46.173.21]:58211) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aVfiJ-0006xA-FL for guile-user@gnu.org; Tue, 16 Feb 2016 08:30:48 -0500 Original-Received: from vz-proxy-m004.mx.aol.com ([64.236.83.2]) by vms173021.mailsrvcs.net (Oracle Communications Messaging Server 7.0.5.32.0 64bit (built Jul 16 2014)) with ESMTPA id <0O2N00F026UKFR90@vms173021.mailsrvcs.net> for guile-user@gnu.org; Tue, 16 Feb 2016 07:30:21 -0600 (CST) X-CMAE-Score: 0 X-CMAE-Analysis: v=2.1 cv=WcjxEBVX c=1 sm=1 tr=0 a=EDCZS3Slg7tGr17y9w9vtg==:117 a=jFJIQSaiL_oA:10 a=EDXxJTuq73VxJTcqhW8A:9 a=CjuIK1q_8ugA:10 a=c2xXBKqDWU23xWAV31QA:9 a=3dermav-TaB7qCg_:21 a=_W_S_7VecoQA:10 a=BDyur4InJX6bKE1j05MA:9 a=zQwYGS0TsWX4bTwHTqEA:9 a=rk_YDjOiByrIMFgRIVAA:9 a=cCPdUmxMbq7Pk0bOWvUA:9 a=9ezu8pfE0iSI3Z5cCOkA:9 Original-Received: by 72.87.204.128 with SMTP id 61140e0d; Tue, 16 Feb 2016 13:30:20 GMT In-reply-to: <87mvr8ood1.fsf@gnu.org> X-Mailer: Apple Mail (2.2104) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 206.46.173.21 X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Original-Sender: guile-user-bounces+guile-user=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.user:12403 Archived-At: --Apple-Mail=_3681BDE2-C171-4F57-BABA-087860B884F7 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii The (potluck regexc) module provides a macro for throwing a string at a sequence of regular expressions, executing an associated body when a match is found. Attached are three files: * regexc.scm: the source code * regexc.texi: documentation * regexc.test: test code Regexp Utilities ---------------- -- Scheme Procedure: regexp-case str case ... [else body] Match the string STR against each CASE in turn. Each CASE is of the form ((pat VAR1 VAR2 ...) body) where pat is a regular expression string literal, VAR1 ... are bound to the ordered list of matched subexpressions, and BODY is a sequence of expressions. If no match is found and the optional ELSE case exists, the associated body is executed, otherwise an error is signaled. The following example matches a string aginst either a simple variable name, or a simple variable name with an array reference, and returns a list with the variable name and the string index, or '"1"'. If no match is found, '#f' is returned. (define str "foo") (regexp-case str (("^([a-z]+)\\(([0-9]+)\\)$" var idx) (list var idx)) (("^([a-z]+)$" var) (list var "1")) (else #f)) ==> ("foo" "1") -- Scheme Procedure: make-string-matcher (str ...) case ... [else body] This is similar to 'regexp-case' but generates a procedure '(lambda (str ...) ...)' that matches its string argument STR againt each CASE in turn. (define my-matcher (make-string-matcher (str a b c) (("^([a-z]+)\\(([0-9]+)\\)$" var idx) (list var idx)) (("^([a-z]+)$" var) (list var "1")) (else #f)) --Apple-Mail=_3681BDE2-C171-4F57-BABA-087860B884F7 Content-Type: multipart/mixed; boundary="Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2" --Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii
The (potluck regexc) module provides a macro = for throwing a string
at a sequence of regular expressions, executing an associated = body
when a = match is found.

Attached are three files:
* regexc.scm: the source = code
* = regexc.texi: documentation
* regexc.test: test code


Regexp Utilities
----------------

 -- Scheme Procedure: = regexp-case str case ... [else body]
     Match the string STR = against each CASE in turn.  Each CASE is of
     the = form
  =         ((pat VAR1 VAR2 ...)
        =    body)
     where pat is a regular expression string = literal, VAR1 ... are
     bound to the ordered list of matched = subexpressions, and BODY is a
     sequence of expressions. =  If no match is found and the optional
     ELSE case = exists, the associated body is executed, otherwise an
     error is = signaled.

   The following example matches a string aginst = either a simple
variable name, or a simple variable name with an array = reference, and
returns a list with the variable name and the string index, = or '"1"'.
If = no match is found, '#f' is returned.

     (define = str "foo")
     (regexp-case str
      = (("^([a-z]+)\\(([0-9]+)\\)$" var idx)
       (list var = idx))
  =     (("^([a-z]+)$" var)
       (list var = "1"))
  =     (else #f))
     =3D=3D>
     ("foo" = "1")

 -- Scheme Procedure: make-string-matcher (str ...) case = ... [else body]
     This is similar to 'regexp-case' but = generates a procedure '(lambda
     (str ...) ...)' that = matches its string argument STR againt each
     CASE in = turn.

     (define my-matcher
      =  (make-string-matcher (str a b c)
        = (("^([a-z]+)\\(([0-9]+)\\)$" var idx)
         (list var = idx))
  =       (("^([a-z]+)$" var)
         (list var = "1"))
  =       (else #f))

= --Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Disposition: attachment; filename=regexc.scm Content-Type: application/octet-stream; name="regexc.scm" Content-Transfer-Encoding: 7bit ;;; potluck/regexc.scm ;;; ;;; Copyright (C) 2016 Matthew R. Wette ;;; ;;; This library is free software; you can redistribute it and/or ;;; modify it under the terms of the GNU Lesser General Public ;;; License as published by the Free Software Foundation; either ;;; version 3 of the License, or (at your option) any later version. (define-module (potluck regexc) #:export-syntax (regexp-case make-string-matcher) #:use-module (ice-9 regex) #:re-export (match:count match:string match:prefix match:suffix regexp-match? regexp-quote match:start match:end match:substring string-match regexp-substitute fold-matches list-matches regexp-substitute/global) ) ;; (rx-let m (v ...) exp ...) => (let ((v (match:substring m 1)) ...) exp ...) ;; This syntax expands application of a regex match to a body of expressions. ;; This is a helper macro for regexp-case. (define-syntax rx-let (lambda (x) (syntax-case x () ((_ m (v ...) exp ...) (with-syntax (((i ...) (let f ((n 1) (vl #'(v ...))) ; fold (v ...) to (1 ...) (if (null? vl) '() (cons n (f (1+ n) (cdr vl))))))) #'(let ((v (match:substring m i)) ...) exp ...)))))) ;; (all-const-string? pattern-list) => #t|#f ;; This syntax checks if each member of pattern-list is a string literal. ;; This is a helper for regexp-case. (define-syntax-rule (all-const-string? pattern-list) (let iter ((res #t) (pl pattern-list)) (if (null? pl) res (iter (and res (or (and (not (identifier? (car pl))) (string? (syntax->datum (car pl)))) (syntax-violation "regexp-case" "expecting string literal" (car pl)))) (cdr pl))))) ;; I'm not sure eval-when is doing the optimization I hoped for. (define-syntax regexp-case (lambda (x) (syntax-case x (else) ((_ str ((pat v ...) exp ...) ... (else else-exp ...)) (with-syntax (((id ...) (generate-temporaries #'(pat ...)))) (all-const-string? #'(pat ...)) #'(let ((id (eval-when (expand load eval) (make-regexp pat))) ...) (cond ((regexp-exec id str) => (lambda (m) (rx-let m (v ...) exp ...))) ... (else else-exp ...))))) ((_ str ((pat v ...) exp ...) ...) (with-syntax (((id ...) (generate-temporaries #'(pat ...)))) (all-const-string? #'(pat ...)) #'(let ((id (eval-when (expand load eval) (make-regexp pat))) ...) (cond ((regexp-exec id str) => (lambda (m) (rx-let m (v ...) exp ...))) ... (else (scm-error 'misc-error "regexp-case" "no match found: ~S" (list str) #f))))))))) (define-syntax make-string-matcher (lambda (x) (syntax-case x (else) ((_ (str . args) ((pat v ...) exp ...) ... (else else-exp ...)) (with-syntax (((id ...) (generate-temporaries #'(pat ...)))) (all-const-string? #'(pat ...)) #'(let ((id (make-regexp pat)) ...) (lambda (str . args) (cond ((regexp-exec id str) => (lambda (m) (rx-let m (v ...) exp ...))) ... (else else-exp ...)))))) ((_ (str . args) ((pat v ...) exp ...) ...) (with-syntax (((id ...) (generate-temporaries #'(pat ...)))) (all-const-string? #'(pat ...)) #'(let ((id (make-regexp pat)) ...) (lambda (str . args) (cond ((regexp-exec id str) => (lambda (m) (rx-let m (v ...) exp ...))) ... (else (scm-error 'misc-error "regexp-case" "no match found: ~S" (list str) #f)))))))))) ;; --- last line --- --Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Transfer-Encoding: 7bit Content-Type: text/html; charset=us-ascii
--Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Disposition: attachment; filename=regexc.test Content-Type: application/octet-stream; name="regexc.test" Content-Transfer-Encoding: 7bit ;; regexc.test -*- scheme -*- ;; ;; Copyright (C) 2016 Matthew R. Wette ;; ;; Copying and distribution of this file, with or without modification, ;; are permitted in any medium without royalty provided the copyright ;; notice and this notice are preserved. This file is offered as-is, ;; without any warranty. (use-modules (potluck regexc)) (use-modules (ice-9 pretty-print)) (use-modules (system base compile)) (define* (expand-form e #:key (opts '())) (call-with-values (lambda () (decompile (compile e #:from 'scheme #:to 'tree-il #:env (current-module)) #:from 'tree-il #:to 'scheme #:opts opts)) (lambda (exp env) exp))) (define-syntax-rule (expand _expression_) (expand-form '_expression_)) ;; just show the expansion (define (test0) (pretty-print (expand (regexp-case str (("^([a-z]+)\\(([0-9]+)\\)$" v i) (list v i)) (("^([a-z]+)$" v) (list v "1"))) ))) (define (test1) (define (f1 str) (regexp-case str (("^([a-z]+)\\(([0-9]+)\\)$" v i) (list v i)) (("^([a-z]+)$" v) (list v "1")) (else #f) )) (if (not (equal? (f1 "foo") (list "foo" "1"))) (error "not working"))) ;; The following should generate a syntax error, but how to I trap that? ;;(define (test2) (regexp-case str (("abc") #t) ((pat) #f) (("ghi") #t))) (define (test3) (define f3 (make-string-matcher (str a b) (("^([a-z]+)\\(([0-9]+)\\)$" v i) (list v i b a)) (("^([a-z]+)$" v) (list v "1" b a)) (else #f) )) (if (not (equal? (f3 "foo" 'hello 'world) (list "foo" "1" 'world 'hello))) (error "not working"))) ;;(test0) (test1) ;;(test2) (test3) ;; --- last line --- --Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Transfer-Encoding: 7bit Content-Type: text/html; charset=us-ascii
--Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Disposition: attachment; filename=regexc.texi Content-Type: application/octet-stream; name="regexc.texi" Content-Transfer-Encoding: 7bit @c -*-texinfo-*- @c Copyright (C) 2016 Matthew R. Wette @c @c Permission is granted to copy, distribute and/or modify this document @c under the terms of the GNU Free Documentation License, Version 1.3 or @c any later version published by the Free Software Foundation; with no @c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. @node Regexp Utilities @subsection Regexp Utilities @deffn {Scheme Procedure} regexp-case str case @dots{} [else body] Match the string @var{str} against each @var{case} in turn. Each @var{case} is of the form @example ((@i{pat} @var{var1} @var{var2} @dots{}) @i{body}) @end example @noindent where @i{pat} is a regular expression string literal, @var{var1} @dots{} are bound to the ordered list of matched subexpressions, and @var{body} is a sequence of expressions. If no match is found and the optional @var{else} case exists, the associated body is executed, otherwise an error is signaled. @end deffn The following example matches a string aginst either a simple variable name, or a simple variable name with an array reference, and returns a list with the variable name and the string index, or @code{"1"}. If no match is found, @code{#f} is returned. @example (define str "foo") (regexp-case str (("^([a-z]+)\\(([0-9]+)\\)$" var idx) (list var idx)) (("^([a-z]+)$" var) (list var "1")) (else #f)) ==> ("foo" "1") @end example @deffn {Scheme Procedure} make-string-matcher (str @dots{}) case @dots{} [else body] This is similar to @code{regexp-case} but generates a procedure @code{(lambda (str @dots{}) @dots{})} that matches its string argument @var{str} againt each @var{case} in turn. @end deffn @example (define my-matcher (make-string-matcher (str a b c) (("^([a-z]+)\\(([0-9]+)\\)$" var idx) (list var idx)) (("^([a-z]+)$" var) (list var "1")) (else #f)) @end example @c --- last line --- --Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2 Content-Transfer-Encoding: 7bit Content-Type: text/html; charset=us-ascii
--Apple-Mail=_2396693E-78D6-40B6-9217-2E8F986811A2-- --Apple-Mail=_3681BDE2-C171-4F57-BABA-087860B884F7--