* Converting s-expressions to XML
@ 2010-06-17 15:00 Josef Wolf
2010-06-17 19:33 ` dsmich
2010-06-17 19:33 ` dsmich
0 siblings, 2 replies; 5+ messages in thread
From: Josef Wolf @ 2010-06-17 15:00 UTC (permalink / raw)
To: guile-user
Hello,
I am trying to write a (simple) function to convert s-expressions to XML.
I've come up with following function, which (somehow) works:
(use-modules (ice-9 rdelim))
(use-modules (ice-9 pretty-print))
(define atom?
(lambda (x)
(and (not (pair? x)) (not (null? x)))))
(define (indent string count)
(if (< count 1)
""
(string-append string (indent string (- count 1)))))
(define (walklist expr level)
(if (null? expr)
'()
(begin
(sexp->xml (car expr) (+ level 1))
(walklist (cdr expr) level))))
(define (unknown->string expr)
(let ((s (open-output-string)))
(display expr s)
(get-output-string s)))
(define (sexp->xml expr . params)
(let ((level (if (null? params) 0 (car params))))
(cond
((atom? expr)
(simple-format #t "~A~A\n" (indent " " level) expr))
((list? (car expr))
(sexp->xml (car expr) (+ level 1))
(walklist (cdr expr) level))
; ((pair? expr)
; (simple-format #t "~A~A\n" (indent " " level) (car expr))
; (simple-format #t "~A~A\n" (indent " " level) (cdr expr)))
(#t
(let ((s (unknown->string (car expr))))
(simple-format #t "~A~A~A~A\n" (indent " " level) "<" s ">")
(walklist (cdr expr) level)
(simple-format #t "~A~A~A~A\n" (indent " " level) "</" s ">"))))))
(sexp->xml '(person
(name "myself")
(address
(street "somestreet" 2)
(town 1234 "thistown")
(country "wonderland")
(test 'a b)
; (test1 . asd)
(test2 '(asd "asd")))))
While this seems to work, it has some drawbacks, which I can not figure out
how to get rid of them:
- It doesn't work on pairs. When I uncomment the case which checks for pairs,
the result is no longer XML. Instead, it looks more like the output of
(display)
- There is no way to tell symbols from strings in the XML output.
- I'd like atoms to be enclosed into its tags without any whitespace. While
it is easy to get rid of the indentation and the trailing newline, I can't
figure out how to get rid of the newline that comes behind the opening tag
Any hints?
BTW: In addition, I'd appreciate any hints how to make this thing more
scheme'ish. I think there is lots of potential for improvement here.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Converting s-expressions to XML
@ 2010-06-17 19:21 dskr
2010-06-17 19:50 ` Thien-Thi Nguyen
0 siblings, 1 reply; 5+ messages in thread
From: dskr @ 2010-06-17 19:21 UTC (permalink / raw)
To: Josef Wolf; +Cc: guile-user@gnu.org
[-- Attachment #1: Type: text/plain, Size: 10307 bytes --]
Here's an excerpt from my sexp to XML interface for Proverb:
Generating XML in Scheme
Proverb lets you generate and present XML-valued objects from Scheme with little fuss.
The most crude interface is the procedure make-raw . This procedure is called with a valid XML fragment as a string and returns a new XML-valued object whose contents are that fragment.
(make-raw "<em>An Emphasized Phrase</em>") yields An Emphasized Phrase .
Any Scheme value can be converted into an XML fragment as a string with xmlize-value .
(xmlize-value (make-raw "<em>An Emphasized Phrase</em>")) yields "<em>An Emphasized Phrase</em>" .
And Scheme values can be converted into an XML document as a string with xmlize .
(xmlize (make-raw "<em>An Emphasized Phrase</em>")) yields "<?xml version=\"1.0\"?>
<?xml-stylesheet type=\"text/xsl\" href=\"/sexp.xsl\"?>
<sexp xmlns=\"http://www.w3.org/2000/sexp\">
<em>An Emphasized Phrase</em></sexp>
" .
It is unsurprising to see that the text supplied to make-raw is passed through into the ultimate XML unmolested.
We can see values like strings and lists are adulterated but not quite molested.
(xmlize-value "foo") yields "<string>foo</string>"
(xmlize-value (list "foo")) yields "<list><string>foo</string></list>"
Most users and procedures that generate XML values neither explicitly generate these corresponding XML string fragments nor embed XML string fragments directly in their programs.
Scheme values are transformed automatically into XML for the Proverb web interface and XML-valued objects can be generated more elegantly with the procedure with-xml-tag .
(with-xml-tag 'em (make-raw "A Better Way")) yields A Better Way
That example certainly looks a bit more like Scheme but papers over an irritating detail. The call to make-raw expects a string laden with XML markup. Any XML characters in the text would need to be escaped with entity codes. We would need to write:
(with-xml-tag 'b (make-raw "<-o->")) just to get Darth Vader's TIE fighter <-o->
String escaping always ends in tears. This also ends in tears but for a better reason:
(with-xml-tag 'b "<-o->") yields "<-o->"
When we generate XML for Scheme values that are not naturally XML, like Scheme strings, we go to some effort to provide the type of print fidelity provided by Scheme write . We put enough markup around these values to allow them to be distinguished in the browser. This decorates our classy ASCII with extra quotes and other styling.
We square this circle with the helper procedure make-text that generates free XML text from its string argument.
(with-xml-tag 'b (make-text "<-o->")) yields <-o-> after too much work!
A better way to square circles generally would be with SVG:
(with-xml-tag '(svg
xmlns "http://www.w3.org/2000/svg"
width 50
height 50
viewbox "0 0 50 50")
(with-xml-tag 'g
(with-xml-tag '(rect width 50 height 50 fill green))
(with-xml-tag '(circle cx 25 cy 25 r 25 fill red))
)
)
which yields OMITTED FROM THIS EMAIL
And which introduces a new detail of with-xml-tag . This procedure accepts as its first argument either a symbol naming an XML tag or a list naming an XML tag and encoding a number of XML tag attribute keys and values.
---------
The five referenced functions and syntax forms:
xmlize
xmlize-value
with-xml-tag
make-raw
make-text
have pretty straightforward definitions. The largest, and the one that most resembles your example kernel, is xmlize-value. Its definition is provided below:
(define xmlize-value (lambda (value)
(cond ((and (vector? value) (> (vector-length value) 1))
(cond ((eq? (vector-ref value 0) xml-xhtml)
(string-append
"<xhtml>"
(apply string-append
(map xmlize-value (vector-ref value 1)))
"</xhtml>"))
((eq? (vector-ref value 0) xml-raw)
(vector-ref value 1))
((eq? (vector-ref value 0) xml-text)
(let* ()
(letrec ((p (object-property value (quote charset))))
(if p
(string-append
"<text charset='"
(xml-escape-string p)
"'>"
(xml-escape-string (vector-ref value 1))
"</text>")
(string-append
"<text>"
(xml-body-escape-string (vector-ref value 1))
"</text>")))))
(else
(apply string-append
"<vector>"
(append
(map xmlize-value (vector->list value))
'("</vector>"))))))
((vector? value)
(apply string-append
"<vector>"
(append
(map xmlize-value (vector->list value))
'("</vector>"))))
((null? value) "<list/>")
((list? value)
(cond ((and (eq? (car value) (quote quote))
(eq? (length value) 2))
(string-append
"<quote>"
(xmlize-value (cadr value))
"</quote>"))
((and (eq? (car value) (quote quasiquote))
(eq? (length value) 2))
(string-append
"<quasiquote>"
(xmlize-value (cadr value))
"</quasiquote>"))
(else
(apply string-append
"<list>"
(append
(map xmlize-value value)
'("</list>"))))))
((pair? value)
(string-append
"<pair>"
(xmlize-value (car value))
(xmlize-value (cdr value))
"</pair>"))
((hash-table? value)
(apply string-append
"<hash>"
(hash-fold
(lambda (k v acc)
(cons (xmlize-value (vector k v)) acc))
'("</hash>")
value)))
((string? value)
(let* ()
(letrec ((p (object-property value (quote charset))))
(if p
(string-append
"<string charset='"
(xml-escape-string p)
"'>"
(xml-escape-string value)
"</string>")
(string-append
"<string>"
(xml-body-escape-string value)
"</string>")))))
((symbol? value)
(string-append
"<symbol name=\""
(xml-quote-escape-string (symbol->string value))
"\"/>"))
((boolean? value)
(if value
"<boolean value=\"true\"/>"
"<boolean value=\"false\"/>"))
((number? value)
(string-append
"<number value=\""
(number->xml-string value)
"\"/>"))
((record? value)
(string-append
"<text>"
(xml-body-escape-string (object->string value))
"</text>"))
((unspecified? value) "<unspecified/>")
((char? value)
(string-append
"<char code='"
(number->string (char->integer value))
"'>"
(xml-body-escape-string (object->string value))
"</char>"))
(else
(string-append
"<text>"
(xml-body-escape-string (object->string value))
"</text>")))))
Sent from my iPad
On Jun 17, 2010, at 11:00 AM, Josef Wolf <jw@raven.inka.de> wrote:
> Hello,
>
> I am trying to write a (simple) function to convert s-expressions to XML.
> I've come up with following function, which (somehow) works:
>
> (use-modules (ice-9 rdelim))
> (use-modules (ice-9 pretty-print))
>
> (define atom?
> (lambda (x)
> (and (not (pair? x)) (not (null? x)))))
>
> (define (indent string count)
> (if (< count 1)
> ""
> (string-append string (indent string (- count 1)))))
>
> (define (walklist expr level)
> (if (null? expr)
> '()
> (begin
> (sexp->xml (car expr) (+ level 1))
> (walklist (cdr expr) level))))
>
> (define (unknown->string expr)
> (let ((s (open-output-string)))
> (display expr s)
> (get-output-string s)))
>
> (define (sexp->xml expr . params)
> (let ((level (if (null? params) 0 (car params))))
> (cond
> ((atom? expr)
> (simple-format #t "~A~A\n" (indent " " level) expr))
>
> ((list? (car expr))
> (sexp->xml (car expr) (+ level 1))
> (walklist (cdr expr) level))
>
> ; ((pair? expr)
> ; (simple-format #t "~A~A\n" (indent " " level) (car expr))
> ; (simple-format #t "~A~A\n" (indent " " level) (cdr expr)))
>
> (#t
> (let ((s (unknown->string (car expr))))
> (simple-format #t "~A~A~A~A\n" (indent " " level) "<" s ">")
> (walklist (cdr expr) level)
> (simple-format #t "~A~A~A~A\n" (indent " " level) "</" s ">"))))))
>
>
> (sexp->xml '(person
> (name "myself")
> (address
> (street "somestreet" 2)
> (town 1234 "thistown")
> (country "wonderland")
> (test 'a b)
> ; (test1 . asd)
> (test2 '(asd "asd")))))
>
>
> While this seems to work, it has some drawbacks, which I can not figure out
> how to get rid of them:
>
> - It doesn't work on pairs. When I uncomment the case which checks for pairs,
> the result is no longer XML. Instead, it looks more like the output of
> (display)
>
> - There is no way to tell symbols from strings in the XML output.
>
> - I'd like atoms to be enclosed into its tags without any whitespace. While
> it is easy to get rid of the indentation and the trailing newline, I can't
> figure out how to get rid of the newline that comes behind the opening tag
>
> Any hints?
>
> BTW: In addition, I'd appreciate any hints how to make this thing more
> scheme'ish. I think there is lots of potential for improvement here.
>
[-- Attachment #2: Type: text/html, Size: 32191 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-06-17 19:50 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-17 15:00 Converting s-expressions to XML Josef Wolf
2010-06-17 19:33 ` dsmich
2010-06-17 19:33 ` dsmich
-- strict thread matches above, loose matches on Subject: below --
2010-06-17 19:21 dskr
2010-06-17 19:50 ` Thien-Thi Nguyen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).