unofficial mirror of guile-devel@gnu.org 
* sxml simple, sxml->xml and namespaces
From: tomas @ 2015-04-08 20:55 UTC
  To: guile-devel



Gentle guile folks,

I'm playing around with (sxml simple) and stumbled upon something
I think might be a bug. Consider the following snippet:

  #!/usr/bin/guile -s
  !#
  (use-modules (sxml simple))
  
  ;; An XML with two namespaces (one default)
  (define the-svg "<svg xmlns='http://www.w3.org/2000/svg'
       xmlns:xlink='http://www.w3.org/1999/xlink'>
    <rect x='5' y='5' width='20' height='20'
          stroke-width='2' stroke='purple' fill='yellow'
          id='rect1' />
    <rect x='30' y='5' width='20' height='20'
          ry='5' rx='8' stroke-width='2' stroke='purple' fill='blue'
          xlink:href='#rect1' />
  </svg>")
  
  ;; Note how SXML handles QNames (just concatenating NS and
  ;; local-name with a colon):
  (define the-sxml
    (with-input-from-string the-svg xml->sxml))
  (format #t "~A\n" the-sxml)
  
  ;; If we try to serialize this: kaboom!
  (sxml->xml the-sxml)
  
The parsing into SXML goes well, the (format ...) outputs what
I'd expect. But the (sxml->xml ...) dies with:

  ERROR: In procedure scm-error:
  ERROR: Invalid QName: more than one colon http://www.w3.org/2000/svg:svg

I had a look at sxml simple and think the problem is that the
function check-name (which is the one throwing the error) expects
the name to be a QName (i.e. either a Name or a namespace abbreviation
plus a colon plus a Name).

But SXML tacks the whole namespace URI onto names (i.e. the whole
"http://www.w3.org/1999/xlink", for example -- not the "xlink").

When serializing to XML, we should go the way back, finding abbreviations
for the namespaces used, prefixing the names with those abbreviations
and issuing namespace declarations for those abbreviations (those funny
xmlns:foo attributes).

I've tried my hand at a patch which "works for me". Basically, what it
does is to thread an extra parameter "nsmap", representing a mapping
(namespace -> ns-abbreviation) valid at "this" position and below in
the tree. When new, unseen namespaces come up, new abbreviations are
"invented" (ns-abbr-new), collected and the corresponding declarations
printed. When recursing to sub-elements, the new mappings are added to
the nsmap passed down.
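
The splitting step this relies on can be sketched on its own. This is
only an illustration, not the attached patch: split-sxml-name is a
hypothetical helper, and it assumes the namespace and the local name
are joined by the last colon in the symbol.

```scheme
;; Hypothetical helper, for illustration only; not part of (sxml simple).
(define (split-sxml-name name)
  "Split the symbol NAME at its last colon.  Return two values: the
namespace string (or #f when there is no colon) and the local name."
  (let* ((str (symbol->string name))
         (i (string-rindex str #\:)))
    (if i
        (values (substring str 0 i) (substring str (+ i 1)))
        (values #f str))))

;; Example: a namespaced attribute name as xml->sxml produces it.
(call-with-values
    (lambda ()
      (split-sxml-name
       (string->symbol "http://www.w3.org/1999/xlink:href")))
  (lambda (ns local)
    (format #t "namespace: ~a, local name: ~a\n" ns local)))
```

Returning two values keeps the lookup in nsmap separate from the
splitting, which might make each piece easier to follow.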

The result after the patch for the above example (a bit embellished)
looks like this:

  <ns1:svg xmlns:ns1="http://www.w3.org/2000/svg">
    <ns1:rect y="5" x="5" width="20" stroke-width="2"
              stroke="purple" id="rect1" height="20" fill="yellow" />
    <ns1:rect ns2:href="#rect1" y="5" x="30" width="20" stroke-width="2"
              stroke="purple" ry="5" rx="8" height="20" fill="blue"
              xmlns:ns2="http://www.w3.org/1999/xlink" />
  </ns1:svg>
  
Pretty clumsy, but basically correct.

The attached patch is against "GNU Guile 2.0.5-deb+1-3". The relevant
code hasn't changed up to the current development version.

I'm not very happy with the patch as-is. Among other things,

 - I had a hard time doing what I wanted in a non-clumsy way.
   Especially, ns-abbr is a strange function and not very clear
   because it tries to do several things at once: replace the
   namespace by its abbreviation, signal a new mapping item
   whenever this abbreviation was new. But how to achieve this
   elegantly without doing several look-ups?

 - The namespace declarations are tacked at the end of the attribute
   list. This is plain opportunism: the tag may carry a namespace,
   and each of the attribute names too. Thus, it's very handy to
   collect all the unseen mappings (new-namespaces in element->xml)
   and output them at the end of the attribute list.

   But in XML it is usual to put the namespace declarations before
   the attributes (the "canonical" XML order even prescribes that).

 - The sxml code is pretty careful to not munge around too much
   with strings, but to output things ASAP to the port. I think
   the patch should be a bit more careful in that department.

 - In other XML libraries the user gets a choice on preferred
   namespace mappings (e.g. I'd like http://www.w3.org/2000/svg
   to be the default namespace -- or http://www.w3.org/1999/xlink
   to be abbreviated as 'xlink'). This could be achieved by
   passing a function as an optional parameter which gets a try
   at a new namespace before ns-abbr-new gets at it.
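
The last point could look roughly like the following. This is a sketch
under assumptions, not code from the patch: make-ns-abbreviator and
svg-preferences are hypothetical names, and the preference function is
assumed to return either a prefix string or #f.

```scheme
;; Hypothetical helper, not part of (sxml simple) or of the patch.
(define (make-ns-abbreviator preferred)
  "Return a procedure mapping a namespace URI to a prefix string.
PREFERRED is a procedure URI -> prefix-or-#f that is consulted first;
generated prefixes ns1, ns2, ... are the fallback."
  (let ((counter 0))
    (lambda (uri)
      (or (preferred uri)
          (begin
            (set! counter (+ counter 1))
            (string-append "ns" (number->string counter)))))))

;; A user-supplied preference: known URIs get their customary prefix.
(define (svg-preferences uri)
  (assoc-ref '(("http://www.w3.org/2000/svg" . "svg")
               ("http://www.w3.org/1999/xlink" . "xlink"))
             uri))

(define abbreviate (make-ns-abbreviator svg-preferences))

(display (abbreviate "http://www.w3.org/1999/xlink"))  ; preferred prefix
(newline)
(display (abbreviate "http://example.org/other"))      ; generated prefix
(newline)
```

Keeping the counter inside the closure means each serialization could
get its own numbering instead of sharing one global counter.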

I'd be happy to prepare a patch against whatever version makes
sense once we get some consensus on how to do it right.

Thanks & regards
-- tomás

[-- Attachment #1.2: simple.diff --]
[-- Type: text/x-diff, Size: 7416 bytes --]

--- /usr/share/guile/2.0/sxml/simple.scm	2012-03-18 20:16:21.000000000 +0100
+++ /home/tomas/lib/guile/sxml/simple.scm	2015-04-08 22:29:30.049277842 +0200
@@ -37,29 +37,38 @@
 argument, @var{port}, which defaults to the current input port."
   (ssax:xml->sxml port '()))
 
-(define check-name
-  (let ((*good-cache* (make-hash-table)))
-    (lambda (name)
-      (if (not (hashq-ref *good-cache* name))
-          (let* ((str (symbol->string name))
-                 (i (string-index str #\:))
-                 (head (or (and i (substring str 0 i)) str))
-                 (tail (and i (substring str (1+ i)))))
-            (and i (string-index (substring str (1+ i)) #\:)
-                 (error "Invalid QName: more than one colon" name))
-            (for-each
-             (lambda (s)
-               (and s
-                    (or (char-alphabetic? (string-ref s 0))
-                        (eq? (string-ref s 0) #\_)
-                        (error "Invalid name starting character" s name))
-                    (string-for-each
-                     (lambda (c)
-                       (or (char-alphabetic? c) (string-index "0123456789.-_" c)
-                           (error "Invalid name character" c s name)))
-                     s)))
-             (list head tail))
-            (hashq-set! *good-cache* name #t))))))
+(define (ns-lookup ns nsmap)
+  "Look up namespace ns in nsmap. Return its abbreviation or #f"
+  (assoc-ref nsmap ns))
+
+(define ns-abbr-new
+  (let ((*nscounter* 0))
+    (lambda ()
+      (set! *nscounter* (1+ *nscounter*))
+      (string-append "ns" (number->string *nscounter*)))))
+
+(define (ns-abbr name nsmap)
+  "Takes a QName, SXML style (i.e. a symbol whose string value is either a
+clean local name or a colon-concatenated pair namespace:name), and returns
+a list whose car is a string <nsabbrev>:<local-name> and whose cdr holds
+a pair (<namespace> . <nsabbrev>) whenever <namespace> wasn't found in nsmap"
+  ;; FIXME check for empty ns (e.g ":foo")
+  ;; check (worse!) for empty locname (e.g. "foo:")
+  (let* ((str (symbol->string name))
+         (i (string-rindex str #\:))
+         (ns (and i (substring str 0 i)))
+         (locname (or (and i (substring str (1+ i))) str)))
+    (if ns
+        (let ((nsabbr (ns-lookup ns nsmap)))
+          (if nsabbr
+              ;; known namespace:
+              (list (string-append nsabbr ":" locname))
+              ;; unknown namespace
+              (let ((nsabbr (ns-abbr-new)))
+                (list (string-append nsabbr ":" locname)
+                      (cons ns nsabbr)))))
+        ;; empty namespace: clean local-name:
+        (list locname))))
 
 ;; The following two functions serialize tags and attributes. They are
 ;; being used in the node handlers for the post-order function, see
@@ -82,42 +91,58 @@
      port))))
 
 (define (attribute->xml attr value port)
-  (check-name attr)
   (display attr port)
   (display "=\"" port)
   (attribute-value->xml value port)
   (display #\" port))
 
-(define (element->xml tag attrs body port)
-  (check-name tag)
-  (display #\< port)
-  (display tag port)
-  (if attrs
-      (let lp ((attrs attrs))
-        (if (pair? attrs)
-            (let ((attr (car attrs)))
-              (display #\space port)
-              (if (pair? attr)
-                  (attribute->xml (car attr) (cdr attr) port)
-                  (error "bad attribute" tag attr))
-              (lp (cdr attrs)))
-            (if (not (null? attrs))
-                (error "bad attributes" tag attrs)))))
-  (if (pair? body)
-      (begin
-        (display #\> port)
-        (let lp ((body body))
-          (cond
-           ((pair? body)
-            (sxml->xml (car body) port)
-            (lp (cdr body)))
-           ((null? body)
-            (display "</" port)
-            (display tag port)
-            (display ">" port))
-           (else
-            (error "bad element body" tag body)))))
-      (display " />" port)))
+(define (element->xml tag attrs body port nsmap)
+  (let* ((ab (ns-abbr tag  nsmap))
+         (abname (car ab))
+         (new-namespaces (cdr ab)))
+    (display #\< port)
+    (display abname port)
+    (if attrs
+        (let lp ((attrs attrs))
+          (if (pair? attrs)
+              (let ((attr (car attrs)))
+                (display #\space port)
+                (if (pair? attr)
+                    (let* ((ab (ns-abbr (car attr) nsmap))
+                           (abname (car ab))
+                           (nsplus (cdr ab)))
+                      (unless (null? nsplus)
+                        (set! new-namespaces
+                              (cons (car nsplus) new-namespaces)))
+                      (attribute->xml abname (cdr attr) port))
+                    (error "bad attribute" tag attr))
+                (lp (cdr attrs)))
+              (if (not (null? attrs))
+                  (error "bad attributes" tag attrs)))))
+    ;; Output namespace declarations
+    (let lp ((new-namespaces new-namespaces))
+      (unless (null? new-namespaces)
+        ;; remember: car is namespace, cdr is abbrev
+        (let ((ns (caar new-namespaces))
+              (nsabbr (cdar new-namespaces)))
+          (display #\space port)
+          (attribute->xml (string-append "xmlns:" nsabbr) ns port))
+        (lp (cdr new-namespaces))))
+    (if (pair? body)
+        (begin
+          (display #\> port)
+          (let lp ((body body))
+            (cond
+             ((pair? body)
+              (sxml->xml (car body) port (append new-namespaces nsmap))
+              (lp (cdr body)))
+             ((null? body)
+              (display "</" port)
+              (display abname port)
+              (display ">" port))
+             (else
+              (error "bad element body" tag body)))))
+        (display " />" port))))
 
 ;; FIXME: ensure name is valid
 (define (entity->xml name port)
@@ -133,7 +158,8 @@
   (display str port)
   (display "?>" port))
 
-(define* (sxml->xml tree #:optional (port (current-output-port)))
+(define* (sxml->xml tree #:optional (port (current-output-port))
+                    (nsmap '()))
   "Serialize the sxml tree @var{tree} as XML. The output will be written
 to the current output port, unless the optional argument @var{port} is
 present."
@@ -144,7 +170,7 @@
         (let ((tag (car tree)))
           (case tag
             ((*TOP*)
-             (sxml->xml (cdr tree) port))
+             (sxml->xml (cdr tree) port nsmap))
             ((*ENTITY*)
              (if (and (list? (cdr tree)) (= (length (cdr tree)) 1))
                  (entity->xml (cadr tree) port)
@@ -158,9 +184,9 @@
                     (attrs (and (pair? elems) (pair? (car elems))
                                 (eq? '@ (caar elems))
                                 (cdar elems))))
-               (element->xml tag attrs (if attrs (cdr elems) elems) port)))))
+               (element->xml tag attrs (if attrs (cdr elems) elems) port nsmap)))))
         ;; A nodelist.
-        (for-each (lambda (x) (sxml->xml x port)) tree)))
+        (for-each (lambda (x) (sxml->xml x port nsmap)) tree)))
    ((string? tree)
     (string->escaped-xml tree port))
    ((null? tree) *unspecified*)


* Re: sxml simple, sxml->xml and namespaces
From: Andy Wingo @ 2016-06-20  8:56 UTC
  To: tomas; +Cc: guile-devel

Greetings gentle Guiler,

Apologies for the long delay here.  I'm with you regarding namespaces
and sxml->xml.  In the past I made sure to always get the namespaces
attached to the root element via the @ xmlns attributes, and then have
namespaced uses just be local names, not qnames, and that way sxml->xml
works fine.  But, perhaps that doesn't cover all cases in a nice way.
Do you still have thoughts on this patch?  Is it the right thing for you?
In any case we need better documentation in the manual about how to deal
with namespaces and SXML, in practice, with examples.
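
A minimal sketch of the root-element approach described above, reusing
the element and attribute names from the SVG example earlier in the
thread.  The helper sxml->string is hypothetical, not part of
(sxml simple); it only collects the output in a string.

```scheme
(use-modules (sxml simple))

;; Namespace declarations are ordinary attributes on the root element.
;; Element and attribute names carry at most a short prefix, so each
;; name contains no more than one colon.
(define doc
  '(svg (@ (xmlns "http://www.w3.org/2000/svg")
           (xmlns:xlink "http://www.w3.org/1999/xlink"))
        (rect (@ (id "rect1") (x "5") (y "5")))
        (rect (@ (xlink:href "#rect1") (x "30") (y "5")))))

;; Hypothetical convenience wrapper around sxml->xml.
(define (sxml->string tree)
  (call-with-output-string
    (lambda (port) (sxml->xml tree port))))

(display (sxml->string doc))
(newline)
```

The cost is that the prefixes are fixed by whoever builds the tree, so
trees coming from xml->sxml with full namespace URIs in their names
would need rewriting first.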

Regards,

Andy

On Wed 08 Apr 2015 22:55, <tomas@tuxteam.de> writes:

> [...]




* Re: sxml simple, sxml->xml and namespaces
From: Ricardo Wurmus @ 2016-06-20 10:52 UTC
  To: Andy Wingo; +Cc: guile-devel



Andy Wingo <wingo@pobox.com> writes:

> Apologies for the long delay here.  I'm with you regarding namespaces
> and sxml->xml.  In the past I made sure to always get the namespaces
> attached to the root element via the @ xmlns attributes, and then have
> namespaced uses just be local names, not qnames, and that way sxml->xml
> works fine.  But, perhaps that doesn't cover all cases in a nice way.
> Do you still have thoughts on this patch?  Is it the right thing for you?
> In any case we need better documentation in the manual about how to deal
> with namespaces and SXML, in practice, with examples.

Here is another proposal, mirroring what is done in “xml->sxml”:


[-- Attachment #2: 0001-sxml-Write-XML-namespaces-when-serializing.patch --]
[-- Type: text/x-patch, Size: 2343 bytes --]

From 845badc7d4b748bc13c532e940b8a18ffb10f426 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <rekado@elephly.net>
Date: Sun, 30 Aug 2015 10:57:00 +0200
Subject: [PATCH] sxml: Write XML namespaces when serializing.

* module/sxml/simple.scm (sxml->xml): Add optional keyword argument
  "namespaces".
---
 module/sxml/simple.scm | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad91..8cc20dd 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -311,7 +311,8 @@ port."
   (display str port)
   (display "?>" port))
 
-(define* (sxml->xml tree #:optional (port (current-output-port)))
+(define* (sxml->xml tree #:optional (port (current-output-port)) #:key
+                    (namespaces '()))
   "Serialize the sxml tree @var{tree} as XML. The output will be written
 to the current output port, unless the optional argument @var{port} is
 present."
@@ -322,7 +323,7 @@ present."
         (let ((tag (car tree)))
           (case tag
             ((*TOP*)
-             (sxml->xml (cdr tree) port))
+             (sxml->xml (cdr tree) port #:namespaces namespaces))
             ((*ENTITY*)
              (if (and (list? (cdr tree)) (= (length (cdr tree)) 1))
                  (entity->xml (cadr tree) port)
@@ -335,10 +336,16 @@ present."
              (let* ((elems (cdr tree))
                     (attrs (and (pair? elems) (pair? (car elems))
                                 (eq? '@ (caar elems))
-                                (cdar elems))))
-               (element->xml tag attrs (if attrs (cdr elems) elems) port)))))
+                                (cdar elems)))
+                    (xmlns (map (lambda (x)
+                                  (cons (symbol-append 'xmlns: (car x))
+                                        (cdr x)))
+                                namespaces)))
+               (element->xml tag
+                             (if attrs (append xmlns attrs) xmlns)
+                             (if attrs (cdr elems) elems) port)))))
         ;; A nodelist.
-        (for-each (lambda (x) (sxml->xml x port)) tree)))
+        (for-each (lambda (x) (sxml->xml x port #:namespaces namespaces)) tree)))
    ((string? tree)
     (string->escaped-xml tree port))
    ((null? tree) *unspecified*)
-- 
2.8.4




~~ Ricardo


* Re: sxml simple, sxml->xml and namespaces
From: tomas @ 2016-06-20 11:18 UTC
  To: Andy Wingo; +Cc: guile-devel


On Mon, Jun 20, 2016 at 10:56:26AM +0200, Andy Wingo wrote:
> Greetings gentle Guiler,
> 
> Apologies for the long delay here.

No worries. Thanks for looking into it.

I'm myself deeply embroiled in things (not Guile, alas!) so my
response times might be... less than ideal too. And I appreciate
your spare cycles being spent in parts of Guile which are far
beyond my grasp ;-)

>                                  I'm with you regarding namespaces
> and sxml->xml.  In the past I made sure to always get the namespaces
> attached to the root element via the @ xmlns attributes, and then have
> namespaced uses just be local names, not qnames, and that way sxml->xml
> works fine.  But, perhaps that doesn't cover all cases in a nice way.
> Do you still have thoughts on this patch?  Is it the right thing for you?
> In any case we need better documentation in the manual about how to deal
> with namespaces and SXML, in practice, with examples.

The patch is less than elegant. I'll have a look at the patch Ricardo
proposes, which I think is much more Scheme-y, and will try to respond
within this week.

Thanks&regards
-- tomás

* Re: sxml simple, sxml->xml and namespaces
From: tomas @ 2016-06-20 11:20 UTC
  To: Ricardo Wurmus; +Cc: Andy Wingo, guile-devel


On Mon, Jun 20, 2016 at 12:52:47PM +0200, Ricardo Wurmus wrote:

[...]

> Here is another proposal, mirroring what is done in “xml->sxml”:
> 

Hey, thanks, Ricardo!

Let me have a look at your patch and see whether I can wrap my head
around it :-)

I'll have some feedback this week, promised.

Thanks a lot
-- tomás

* Re: sxml simple, sxml->xml and namespaces
From: Andy Wingo @ 2016-06-20 12:11 UTC
  To: Ricardo Wurmus; +Cc: guile-devel

On Mon 20 Jun 2016 12:52, Ricardo Wurmus <rekado@elephly.net> writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> Apologies for the long delay here.  I'm with you regarding namespaces
>> and sxml->xml.  In the past I made sure to always get the namespaces
>> attached to the root element via the @ xmlns attributes, and then have
>> namespaced uses just be local names, not qnames, and that way sxml->xml
>> works fine.  But, perhaps that doesn't cover all cases in a nice way.
>> Do you still have thoughts on this patch?  Is the right thing for you?
>> In any case we need better documentation in the manual about how to deal
>> with namespaces and SXML, in practice, with examples.
>
> Here is another proposal, mirroring what is done in “xml->sxml”:

Neat!  Can you elaborate on how it is supposed to work?  In a final form
it would need documentation, tests, and an update to the docstring, but
I'd be interested in some xml->sxml->xml round trips as an example.

Andy




* Re: sxml simple, sxml->xml and namespaces
  2016-06-20 12:11     ` Andy Wingo
@ 2016-06-21 20:36       ` Ricardo Wurmus
  2016-06-21 20:58         ` Andy Wingo
  0 siblings, 1 reply; 8+ messages in thread
From: Ricardo Wurmus @ 2016-06-21 20:36 UTC (permalink / raw
  To: Andy Wingo; +Cc: guile-devel


Andy Wingo <wingo@pobox.com> writes:

> On Mon 20 Jun 2016 12:52, Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Andy Wingo <wingo@pobox.com> writes:
>>
>>> Apologies for the long delay here.  I'm with you regarding namespaces
>>> and sxml->xml.  In the past I made sure to always get the namespaces
>>> attached to the root element via the @ xmlns attributes, and then have
>>> namespaced uses just be local names, not qnames, and that way sxml->xml
>>> works fine.  But, perhaps that doesn't cover all cases in a nice way.
>>> Do you still have thoughts on this patch?  Is the right thing for you?
>>> In any case we need better documentation in the manual about how to deal
>>> with namespaces and SXML, in practice, with examples.
>>
>> Here is another proposal, mirroring what is done in “xml->sxml”:
>
> Neat!  Can you elaborate on how it is supposed to work?  In a final form
> it would need documentation, tests, and an update to the docstring, but
> I'd be interested in some xml->sxml->xml round trips as an example.

This is the same patch I sent to the discussion of bug#20339 about a
year ago.

The patch is not very ambitious: it only gives the user a way around the
error by letting them pass an alist of namespaces.  The patched
“sxml->xml” does not attempt to be smart about anything.  It will still
fail if it encounters an undeclared namespace.  My primary goal was to
get around the error.  Maybe “sxml->xml” really should be smarter than
that.

What follows is a copy of my original message:

    >> Since xml->sxml accepts a namespace alist I suppose it would make sense
    >> to extend sxml->xml to do the same.

    Attached is a minimal patch to extend "sxml->xml" such that it accepts an
    optional keyword argument "namespaces" with an alist of prefixes to
    URLs, analogous to "xml->sxml".

    When the namespaces alist is provided, "xmlns:prefix=url" attributes are
    prepended to the element's list of attributes.


        ;; Define SVG document with namespaces
        (define the-svg "<svg xmlns='http://www.w3.org/2000/svg'
           xmlns:xlink='http://www.w3.org/1999/xlink'>
        <rect x='5' y='5' width='20' height='20'
              stroke-width='2' stroke='purple' fill='yellow'
              id='rect1' />
        <rect x='30' y='5' width='20' height='20'
              ry='5' rx='8' stroke-width='2' stroke='purple' fill='blue'
              xlink:href='#rect1' />
        </svg>")

        ;; Define alist of namespaces
        (define ns '((svg . "http://www.w3.org/2000/svg")
                     (xlink . "http://www.w3.org/1999/xlink")))

        ;; Convert to SXML, abbreviate namespaces according to ns alist
        (define the-sxml (xml->sxml the-svg #:namespaces ns))

        ;; Convert back to XML
        (sxml->xml the-sxml #:namespaces ns)

        => <svg:svg xmlns:svg="http://www.w3.org/2000/svg"
                    xmlns:xlink="http://www.w3.org/1999/xlink">
             <svg:rect y="5" x="5"
                       width="20"
                       stroke-width="2"
                       stroke="purple"
                       id="rect1"
                       height="20"
                       fill="yellow" />
             <svg:rect xlink:href="#rect1"
                       y="5" x="30"
                       width="20"
                       stroke-width="2"
                       stroke="purple"
                       ry="5" rx="8"
                       height="20"
                       fill="blue" />
           </svg:svg>

~~ Ricardo





* Re: sxml simple, sxml->xml and namespaces
  2016-06-21 20:36       ` Ricardo Wurmus
@ 2016-06-21 20:58         ` Andy Wingo
  0 siblings, 0 replies; 8+ messages in thread
From: Andy Wingo @ 2016-06-21 20:58 UTC (permalink / raw
  To: Ricardo Wurmus; +Cc: guile-devel

Hi,

On Tue 21 Jun 2016 22:36, Ricardo Wurmus <rekado@elephly.net> writes:

> This is the same patch I sent to the discussion of bug#20339 about a
> year ago.

Yeah sorry for the miscommunication; two lists, split brain.

> The patch is not very ambitious: it only gives the user a way around the
> error by letting them pass an alist of namespaces.  The patched
> “sxml->xml” does not attempt to be smart about anything.  It will still
> fail if it encounters an undeclared namespace.

This sounds fine to me to be honest.

>         ;; Define SVG document with namespaces
>         (define the-svg "<svg xmlns='http://www.w3.org/2000/svg'
>            xmlns:xlink='http://www.w3.org/1999/xlink'>
>         <rect x='5' y='5' width='20' height='20'
>               stroke-width='2' stroke='purple' fill='yellow'
>               id='rect1' />
>         <rect x='30' y='5' width='20' height='20'
>               ry='5' rx='8' stroke-width='2' stroke='purple' fill='blue'
>               xlink:href='#rect1' />
>         </svg>")
>
>         ;; Define alist of namespaces
>         (define ns '((svg . "http://www.w3.org/2000/svg")
>                      (xlink . "http://www.w3.org/1999/xlink")))
>
>         ;; Convert to SXML, abbreviate namespaces according to ns alist
>         (define the-sxml (xml->sxml the-svg #:namespaces ns))
>
>         ;; Convert back to XML
>         (sxml->xml the-sxml #:namespaces ns)
>
>         => <svg:svg xmlns:svg="http://www.w3.org/2000/svg"
>                     xmlns:xlink="http://www.w3.org/1999/xlink">
>              <svg:rect y="5" x="5"
>                        width="20"
>                        stroke-width="2"
>                        stroke="purple"
>                        id="rect1"
>                        height="20"
>                        fill="yellow" />
>              <svg:rect xlink:href="#rect1"
>                        y="5" x="30"
>                        width="20"
>                        stroke-width="2"
>                        stroke="purple"
>                        ry="5" rx="8"
>                        height="20"
>                        fill="blue" />
>            </svg:svg>

This doesn't seem quite right to me, in the sense that it should be
possible to specify default namespaces (i.e. xmlns=..., not requiring
xmlns:svg=...).  What do you think?  The patch window is wide open :)
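
For reference, here is a minimal sketch of the default-namespace style mentioned earlier in the thread: the namespaces are declared as plain attributes on the root element and the elements use local names. It relies only on the stock "sxml->xml" from (sxml simple), not on the proposed #:namespaces argument. The names "the-sxml" and "the-xml" and the trimmed-down attribute lists are made up for illustration, and the exact whitespace of the output may differ between Guile versions.

```scheme
(use-modules (sxml simple))

;; SXML written by hand: the default namespace and the xlink prefix are
;; ordinary attributes on the root element, and element names are local
;; names, so no name contains more than one colon.
(define the-sxml
  '(svg (@ (xmlns "http://www.w3.org/2000/svg")
           (xmlns:xlink "http://www.w3.org/1999/xlink"))
        (rect (@ (id "rect1") (x "5") (y "5")
                 (width "20") (height "20")))
        (rect (@ (xlink:href "#rect1") (x "30") (y "5")
                 (width "20") (height "20")))))

;; Serialize into a string instead of the current output port so the
;; result can be inspected or parsed again.
(define the-xml
  (with-output-to-string
    (lambda () (sxml->xml the-sxml))))

(display the-xml)
(newline)
```

Parsing "the-xml" again with "xml->sxml" and no #:namespaces argument yields URI-qualified names, which is exactly the form the unpatched "sxml->xml" rejects, so this style does not round-trip by itself.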

Andy




Thread overview: 8+ messages
2015-04-08 20:55 sxml simple, sxml->xml and namespaces tomas
2016-06-20  8:56 ` Andy Wingo
2016-06-20 10:52   ` Ricardo Wurmus
2016-06-20 11:20     ` tomas
2016-06-20 12:11     ` Andy Wingo
2016-06-21 20:36       ` Ricardo Wurmus
2016-06-21 20:58         ` Andy Wingo
2016-06-20 11:18   ` tomas
