From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: "dskr@mac.com" Newsgroups: gmane.lisp.guile.user Subject: Re: Converting s-expressions to XML Date: Thu, 17 Jun 2010 15:21:08 -0400 Message-ID: <282B74EA-192F-42AE-9B32-D8ED5E1B7E51@mac.com> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="Boundary_(ID_Bgfu5TRR8govrHzipO/M2w)" X-Trace: dough.gmane.org 1276802352 28922 80.91.229.12 (17 Jun 2010 19:19:12 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Thu, 17 Jun 2010 19:19:12 +0000 (UTC) Cc: "guile-user@gnu.org" To: Josef Wolf Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Thu Jun 17 21:19:07 2010 connect(): No such file or directory Return-path: Envelope-to: guile-user@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1OPKcA-0006Ny-PW for guile-user@m.gmane.org; Thu, 17 Jun 2010 21:19:06 +0200 Original-Received: from localhost ([127.0.0.1]:45998 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OPKc9-0003H7-Hr for guile-user@m.gmane.org; Thu, 17 Jun 2010 15:19:01 -0400 Original-Received: from [140.186.70.92] (port=58654 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OPKbj-0002za-G2 for guile-user@gnu.org; Thu, 17 Jun 2010 15:18:38 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OPKbh-0006Ax-EW for guile-user@gnu.org; Thu, 17 Jun 2010 15:18:35 -0400 Original-Received: from asmtpout028.mac.com ([17.148.16.103]:52058) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OPKbh-0006Ak-45 for guile-user@gnu.org; Thu, 17 Jun 2010 15:18:33 -0400 Original-Received: from [192.168.1.2] ([108.96.61.229]) by asmtp028.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0L46001WKAXY5N80@asmtp028.mac.com> for guile-user@gnu.org; Thu, 17 Jun 2010 12:18:05 -0700 (PDT) X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=6.0.2-1004200000 definitions=main-1006170110 X-Proofpoint-Virus-Version: vendor=fsecure engine=1.12.8161:2.4.5,1.2.40,4.0.166 definitions=2010-06-17_03:2010-02-06, 2010-06-17, 2010-06-17 signatures=0 X-Mailer: iPad Mail (7B367) X-detected-operating-system: by eggs.gnu.org: Solaris 10 (1203?) X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: guile-user-bounces+guile-user=m.gmane.org@gnu.org Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.user:7883 Archived-At: --Boundary_(ID_Bgfu5TRR8govrHzipO/M2w) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Here's an excerpt from my sexp to XML interface for Proverb: Generating XML in Scheme Proverb lets you generate and present XML-valued objects from Scheme with little fuss. The most crude interface is the procedure make-raw . This procedure is called with a valid XML fragment as a string and returns a new XML-valued object whose contents are that fragment. (make-raw "An Emphasized Phrase") yields An Emphasized Phrase . Any Scheme value can be converted into an XML fragment as a string with xmlize-value . (xmlize-value (make-raw "An Emphasized Phrase")) yields "An Emphasized Phrase" . And Scheme values can be converted into an XML document as a string with xmlize . (xmlize (make-raw "An Emphasized Phrase")) yields " An Emphasized Phrase " . It is unsurprising to see that the text supplied to make-raw is passed through into the ultimate XML unmolested. We can see values like strings and lists are adulterated but not quite molested. (xmlize-value "foo") yields "foo" (xmlize-value (list "foo")) yields "foo" Most users and procedures that generate XML values neither explicitly generate these corresponding XML string fragments nor embed XML string fragments directly in their programs. Scheme values are transformed automatically into XML for the Proverb web interface and XML-valued objects can be generated more elegantly with the procedure with-xml-tag . (with-xml-tag 'em (make-raw "A Better Way")) yields A Better Way That example certainly looks a bit more like Scheme but papers over an irritating detail. The call to make-raw expects a string laden with XML markup. Any XML characters in the text would need to be escaped with entity codes. We would need to write: (with-xml-tag 'b (make-raw "<-o->")) just to get Darth Vader's TIE fighter <-o-> String escaping always ends in tears. This also ends in tears but for a better reason: (with-xml-tag 'b "<-o->") yields "<-o->" When we generate XML for Scheme values that are not naturally XML, like Scheme strings, we go to some effort to provide the type of print fidelity provided by Scheme write . We put enough markup around these values to allow them to be distinguished in the browser. This decorates our classy ASCII with extra quotes and other styling. We square this circle with the helper procedure make-text that generates free XML text from its string argument. (with-xml-tag 'b (make-text "<-o->")) yields <-o-> after too much work! A better way to square circles generally would be with SVG: (with-xml-tag '(svg xmlns "http://www.w3.org/2000/svg" width 50 height 50 viewbox "0 0 50 50") (with-xml-tag 'g (with-xml-tag '(rect width 50 height 50 fill green)) (with-xml-tag '(circle cx 25 cy 25 r 25 fill red)) ) ) which yields OMITTED FROM THIS EMAIL And which introduces a new detail of with-xml-tag . This procedure accepts as its first argument either a symbol naming an XML tag or a list naming an XML tag and encoding a number of XML tag attribute keys and values. --------- The five referenced functions and syntax forms: xmlize xmlize-value with-xml-tag make-raw make-text have pretty straightforward definitions. The largest, and the one that most resembles your example kernel, is xmlize-value. Its definition is provided below: (define xmlize-value (lambda (value) (cond ((and (vector? value) (> (vector-length value) 1)) (cond ((eq? (vector-ref value 0) xml-xhtml) (string-append "" (apply string-append (map xmlize-value (vector-ref value 1))) "")) ((eq? (vector-ref value 0) xml-raw) (vector-ref value 1)) ((eq? (vector-ref value 0) xml-text) (let* () (letrec ((p (object-property value (quote charset)))) (if p (string-append "" (xml-escape-string (vector-ref value 1)) "") (string-append "" (xml-body-escape-string (vector-ref value 1)) ""))))) (else (apply string-append "" (append (map xmlize-value (vector->list value)) '("")))))) ((vector? value) (apply string-append "" (append (map xmlize-value (vector->list value)) '("")))) ((null? value) "") ((list? value) (cond ((and (eq? (car value) (quote quote)) (eq? (length value) 2)) (string-append "" (xmlize-value (cadr value)) "")) ((and (eq? (car value) (quote quasiquote)) (eq? (length value) 2)) (string-append "" (xmlize-value (cadr value)) "")) (else (apply string-append "" (append (map xmlize-value value) '("")))))) ((pair? value) (string-append "" (xmlize-value (car value)) (xmlize-value (cdr value)) "")) ((hash-table? value) (apply string-append "" (hash-fold (lambda (k v acc) (cons (xmlize-value (vector k v)) acc)) '("") value))) ((string? value) (let* () (letrec ((p (object-property value (quote charset)))) (if p (string-append "" (xml-escape-string value) "") (string-append "" (xml-body-escape-string value) ""))))) ((symbol? value) (string-append "string value)) "\"/>")) ((boolean? value) (if value "" "")) ((number? value) (string-append "xml-string value) "\"/>")) ((record? value) (string-append "" (xml-body-escape-string (object->string value)) "")) ((unspecified? value) "") ((char? value) (string-append "" (xml-body-escape-string (object->string value)) "")) (else (string-append "" (xml-body-escape-string (object->string value)) ""))))) Sent from my iPad On Jun 17, 2010, at 11:00 AM, Josef Wolf wrote: > Hello, > > I am trying to write a (simple) function to convert s-expressions to XML. > I've come up with following function, which (somehow) works: > > (use-modules (ice-9 rdelim)) > (use-modules (ice-9 pretty-print)) > > (define atom? > (lambda (x) > (and (not (pair? x)) (not (null? x))))) > > (define (indent string count) > (if (< count 1) > "" > (string-append string (indent string (- count 1))))) > > (define (walklist expr level) > (if (null? expr) > '() > (begin > (sexp->xml (car expr) (+ level 1)) > (walklist (cdr expr) level)))) > > (define (unknown->string expr) > (let ((s (open-output-string))) > (display expr s) > (get-output-string s))) > > (define (sexp->xml expr . params) > (let ((level (if (null? params) 0 (car params)))) > (cond > ((atom? expr) > (simple-format #t "~A~A\n" (indent " " level) expr)) > > ((list? (car expr)) > (sexp->xml (car expr) (+ level 1)) > (walklist (cdr expr) level)) > > ; ((pair? expr) > ; (simple-format #t "~A~A\n" (indent " " level) (car expr)) > ; (simple-format #t "~A~A\n" (indent " " level) (cdr expr))) > > (#t > (let ((s (unknown->string (car expr)))) > (simple-format #t "~A~A~A~A\n" (indent " " level) "<" s ">") > (walklist (cdr expr) level) > (simple-format #t "~A~A~A~A\n" (indent " " level) "")))))) > > > (sexp->xml '(person > (name "myself") > (address > (street "somestreet" 2) > (town 1234 "thistown") > (country "wonderland") > (test 'a b) > ; (test1 . asd) > (test2 '(asd "asd"))))) > > > While this seems to work, it has some drawbacks, which I can not figure out > how to get rid of them: > > - It doesn't work on pairs. When I uncomment the case which checks for pairs, > the result is no longer XML. Instead, it looks more like the output of > (display) > > - There is no way to tell symbols from strings in the XML output. > > - I'd like atoms to be enclosed into its tags without any whitespace. While > it is easy to get rid of the indentation and the trailing newline, I can't > figure out how to get rid of the newline that comes behind the opening tag > > Any hints? > > BTW: In addition, I'd appreciate any hints how to make this thing more > scheme'ish. I think there is lots of potential for improvement here. > --Boundary_(ID_Bgfu5TRR8govrHzipO/M2w) Content-type: text/html; charset=utf-8 Content-transfer-encoding: quoted-printable
Here's an excerpt from my = sexp to XML interface for Proverb:

make-raw . = This procedure is called with a valid XML fragment as = a string and returns a new XML-valued object whose = contents are that fragment.

(make-raw "<em>An Emphasized = Phrase</em>") yields An Emphasized = Phrase .

Any Scheme value can be = converted into an XML fragment as a string with xmlize-value .

(xmlize-value (make-raw "<em>An = Emphasized Phrase</em>")) yields "<em>An Emphasized Phrase</em&g= t;" .

And Scheme values can be = converted into an XML document as a string with xmlize .

(xmlize (make-raw "<em>An Emphasized = Phrase</em>")) yields "<?xml version=3D\"1.0\"?>
<?xml-= stylesheet type=3D\"text/xsl\" href=3D\"/sexp.xsl\"?>
<= sexp xmlns=3D\"http://www.w3.org/2000/sexp\= ">
<em>An Emphasized Phrase</em></sexp><= br>"
 .

It is unsurprising to see = that the text supplied to make-raw is passed through into = the ultimate XML unmolested.

We can see values like = strings and lists are adulterated but not quite molested.

(xmlize-value = "foo") yields "<string>foo</string>"
(xmlize-value = (list "foo")) yields "<list><string>foo</string></= list>"

Most users and procedures = that generate XML values neither explicitly = generate these corresponding XML string fragments nor = embed XML string fragments directly in their = programs.

Scheme values are = transformed automatically into XML for the Proverb web interface and XML-valued = objects can be generated more elegantly with the = procedure with-xml-tag .

(with-xml-tag 'em (make-raw "A Better = Way")) yields A Better = Way

That example certainly = looks a bit more like Scheme but papers over = an irritating detail. The call to make-raw expects = a string laden with XML markup. Any XML characters in = the text would need to be escaped with entity codes. = We would need to write:

(with-xml-tag 'b (make-raw = "&lt;-o->")) just to get Darth Vader's TIE = fighter <-o->

String escaping always = ends in tears. This also ends in tears but for a better = reason:

(with-xml-tag 'b = "<-o->") yields "<-o->"
=
When we generate XML = for Scheme values that are not naturally XML, like = Scheme strings, we go to some effort to provide the type of = print fidelity provided by Scheme write . We put enough = markup around these values to allow them to be = distinguished in the browser. This decorates our  = classy ASCII with extra quotes and other styling.


We square this circle = with the helper procedure make-text that = generates free XML text from its string argument.

(with-xml-tag 'b (make-text = "<-o->")) yields <-o->=  after too much work!

A better way to square = circles generally would be with SVG: 

(with-xml-tag '(svg
          = ;        xmlns "http://www.w3.org/2000/svg"
          = ;        width 50
<= span = class=3D"line">          = ;        height 50

=           = ;        viewbox "0 0 50 50")
<= span class=3D"line">  (with-xml-tag 'g
    (with-xml-tag '(rect widt= h 50 height 50 fill green))
    (with-xml-tag '(circle cx=  25 cy 25 r 25 fill red))
  )
)

which yields  OMITTED = FROM THIS EMAIL

And which introduces a = new detail of with-xml-tag . T= his procedure accepts as its first argument either a symbol naming an = XML tag or a list naming an XML tag and encoding a = number of XML tag attribute keys and = values. 

---------

The five referenced functions and syntax = forms:
xmlize
xmlize-value
with-xml-tag
make-raw
make-text

have pretty = straightforward definitions. The largest, and the one that most = resembles your example kernel, is xmlize-value. Its definition is = provided below:

(define xmlize-value = (lambda (value)
  (cond ((and (vector? value) = ;(> (vector-length value) 1))
         (cond=  ((eq? (vector-ref value 0) xml-xhtml)
=           = ;      (string-append
          = ;        "<xhtml>"          = ;        (apply string-append=
          = ;            &= nbsp;  (map xmlize-value (vector-ref value 1= )))
          = ;        "</xhtml>"))=
          = ;     ((eq? (vector-ref value 0)&n= bsp;xml-raw)
          = ;      (vector-ref value 1))
          = ;     ((eq? (vector-ref value 0)&n= bsp;xml-text)
          = ;      (let* ()
          = ;        (letrec ((p (ob= ject-property value (quote charset))))
          = ;          (if p
          = ;            (= string-append
          = ;            &= nbsp; "<text charset=3D'"
          = ;            &= nbsp; (xml-escape-string p)
          = ;            &= nbsp; "'>"
          = ;            &= nbsp; (xml-escape-string (vector-ref value 1))<= br>          = ;            &= nbsp; "</text>")
          = ;            (= string-append
          = ;            &= nbsp; "<text>"
          = ;            &= nbsp; (xml-body-escape-string (vector-ref value 1))
          = ;            &= nbsp; "</text>")))))
          = ;     (else
          = ;      (apply string-append
<= span = class=3D"line">          = ;            &= nbsp;"<vector>"

          = ;            &= nbsp;(append
          = ;            &= nbsp;  (map xmlize-value (vector->list value))=
          = ;            &= nbsp;  '("</vector>"))))))
        ((vector?&n= bsp;value)
         (appl= y string-append
          = ;      "<vector>"
          = ;      (append
          = ;        (map xmlize-value&nb= sp;(vector->list value))
          = ;        '("</vector>"))))
        ((null?&nbs= p;value) "<list/>")
        ((list?&nbs= p;value)
         (cond=  ((and (eq? (car value) (quote quote))
          = ;           (eq?&nb= sp;(length value) 2))
          = ;      (string-append
          = ;        "<quote>"          = ;        (xmlize-value (cadr&= nbsp;value))
          = ;        "</quote>"))=
          = ;     ((and (eq? (car value) = (quote quasiquote))
          = ;           (eq?&nb= sp;(length value) 2))
          = ;      (string-append
          = ;        "<quasiquote>"
          = ;        (xmlize-value (cadr&= nbsp;value))
          = ;        "</quasiquote>"))
          = ;     (else
          = ;      (apply string-append
<= span = class=3D"line">          = ;            &= nbsp;"<list>"

          = ;            &= nbsp;(append
          = ;            &= nbsp;  (map xmlize-value value)
          = ;            &= nbsp;  '("</list>"))))))
        ((pair?&nbs= p;value)
         (stri= ng-append
          = ; "<pair>"
          = ; (xmlize-value (car value))
          = ; (xmlize-value (cdr value))
          = ; "</pair>"))
        ((hash-tabl= e? value)
         (appl= y string-append
          = ;      "<hash>"
          = ;      (hash-fold
          = ;        (lambda (k v&nb= sp;acc)
          = ;          (cons (x= mlize-value (vector k v)) acc))
          = ;        '("</hash>")=
          = ;        value)))
        ((string?&n= bsp;value)
         (let*=  ()
          = ; (letrec ((p (object-property value (quote = charset))))
          = ;   (if p
          = ;     (string-append
          = ;       "<string charset=3D'"
          = ;       (xml-escape-string p)
          = ;       "'>"
          = ;       (xml-escape-string value)<= /span>
          = ;       "</string>")
          = ;     (string-append

          = ;       "<string>"
          = ;       (xml-body-escape-string va= lue)

          = ;       "</string>")))))        ((symbol?&n= bsp;value)
         (stri= ng-append
          = ; "<symbol name=3D\""
          = ; (xml-quote-escape-string (symbol->string value))
          = ; "\"/>"))
        ((boolean?&= nbsp;value)
         (if&n= bsp;value
          = ; "<boolean value=3D\"true\"/>"
          = ; "<boolean value=3D\"false\"/>"))
        ((number?&n= bsp;value)
         (stri= ng-append
          = ; "<number value=3D\""
          = ; (number->xml-string value)
          = ; "\"/>"))
        ((record?&n= bsp;value)
         (stri= ng-append
          = ; "<text>"
          = ; (xml-body-escape-string (object->string value))=
          = ; "</text>"))
        ((unspecifi= ed? value) "<unspecified/>")
        ((char?&nbs= p;value)
         (stri= ng-append
          = ; "<char code=3D'"
          = ; (number->string (char->integer value))
          = ; "'>"

          = ; (xml-body-escape-string (object->string value))=
          = ; "</char>"))
        (else
         (stri= ng-append
          = ; "<text>"
          = ; (xml-body-escape-string (object->string value))=
          = ; "</text>")))))


Sent from my iPad

On = Jun 17, 2010, at 11:00 AM, Josef Wolf <jw@raven.inka.de> = wrote:

Hello,

I am = trying to write a (simple) function to convert s-expressions to = XML.
I've come up with following function, which = (somehow) works:

 (use-modules = (ice-9 rdelim))
 (use-modules (ice-9 = pretty-print))

 (define = atom?
   (lambda (x)
=      (and (not (pair? x)) (not (null? = x)))))

 (define (indent string = count)
   (if (< count = 1)
=        ""
=        (string-append string (indent = string (- count 1)))))

 (define = (walklist expr level)
   (if (null? = expr)
=        '()
=        (begin
=          (sexp->xml (car = expr) (+ level 1))
=          (walklist (cdr = expr) level))))

 (define = (unknown->string expr)
   (let ((s = (open-output-string)))
=      (display expr s)
=      (get-output-string = s)))

 (define (sexp->xml expr = . params)
   (let ((level (if (null? = params) 0 (car params))))
=      (cond
=       ((atom? expr)
=        (simple-format #t "~A~A\n" = (indent "  " level) expr))

=       ((list? (car expr))
=        (sexp->xml (car expr) (+ = level 1))
=        (walklist (cdr expr) = level))

 ; =     ((pair? expr)
 ; =      (simple-format #t "~A~A\n" (indent " =  " level) (car expr))
 ; =      (simple-format #t "~A~A\n" (indent " =  " level) (cdr expr)))

=       (#t
=        (let ((s =  (unknown->string (car expr))))
=          (simple-format #t = "~A~A~A~A\n" (indent "  " level) "<" s ">")
=          (walklist (cdr = expr) level)
=          (simple-format #t = "~A~A~A~A\n" (indent "  " level) "</" s = ">"))))))


=  (sexp->xml '(person
=             &n= bsp; (name "myself")
=             &n= bsp; (address
=             &n= bsp;  (street "somestreet" 2)
=             &n= bsp;  (town 1234 "thistown")
=             &n= bsp;  (country "wonderland")
=             &n= bsp;  (test 'a b)
; =             &n= bsp;  (test1 . asd)
=             &n= bsp;  (test2 '(asd = "asd")))))


While this = seems to work, it has some drawbacks, which I can not figure = out
how to get rid of = them:

- It doesn't work on pairs. When = I uncomment the case which checks for pairs,
=   the result is no longer XML. Instead, it looks more like the = output of
=   (display)

- There is no = way to tell symbols from strings in the XML = output.

- I'd like atoms to be = enclosed into its tags without any whitespace. While
=   it is easy to get rid of the indentation and the trailing = newline, I can't
  figure out how to get rid = of the newline that comes behind the opening = tag

Any = hints?

BTW: In addition, I'd appreciate = any hints how to make this thing more
=     scheme'ish. I think there is lots of potential = for improvement = here.

= --Boundary_(ID_Bgfu5TRR8govrHzipO/M2w)--