unofficial mirror of bug-guile@gnu.org 
From: <tomas@tuxteam.de>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: 20339@debbugs.gnu.org
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Date: Mon, 8 Apr 2019 14:14:03 +0200
Message-ID: <20190408121403.GA781@tuxteam.de>
In-Reply-To: <87r2cmgzq0.fsf@elephly.net>

On Tue, Feb 05, 2019 at 01:57:11PM +0100, Ricardo Wurmus wrote:
> 
> Ricardo Wurmus <rekado@elephly.net> writes:
> 
> > In that case we coud have FINISH-ELEMENT add all namespace declarations
> > that are in scope to the current node that is about to be returned.  It
> > would be a little verbose, but more correct.
> 
> Like this:

Thanks again for your patch, and sorry for my glacial pace.

I have now gotten around to testing it (against Guile 2.2.4, commit
791cae940afcb2b2eb2c167fe438be1dc1008a73).

TL;DR:

 - The default namespace is still a problem (see below).
 - It would be nice to inhibit the downward inheritance of
   namespace declarations at xml->sxml time. The sxml
   representation would then closely mimic the XML, which
   has an obvious advantage: it would give the user much
   more control over the generated XML.

I'd be willing to prepare a patch along these lines, but
for that, I'd like to get an idea of which direction we
want to take this whole thing in.

To see what's going on, I tried with a small XML example:

First with explicit (aka non-default) namespace:

  #+NAME: minimal-explicit
  #+BEGIN_EXAMPLE
  <?xml version="1.0"?>
  <myns:root xmlns:myns="http://example.org/namespaces/myns">
    <myns:subnode/>
  </myns:root>
  #+END_EXAMPLE

Before your patch:

  #+NAME: minimal-explicit-before
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-explicit-before
  : <stdin>:12:0: warning: possibly unbound variable `pretty-print'
  : <stdin>:12:14: warning: possibly unbound variable `xml->sxml'
  : (*TOP* (*PI* xml "version=\"1.0\"")
  :        (http://example.org/namespaces/myns:root
  :          "\n  "
  :          (http://example.org/namespaces/myns:subnode)
  :          "\n"))

As we know, this replaces the namespace prefixes with the namespace URIs.
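
For comparison, stock xml->sxml already lets the caller choose prefixes
through its documented #:namespaces keyword, an alist mapping a prefix
symbol to a namespace URI. A minimal sketch, assuming a whitespace-free
variant of the example above (the names xml-string and result are
illustrative only); note that it does not keep the xmlns declaration in
the tree, which is what the patch is about:

```scheme
(use-modules (sxml simple))

;; Whitespace-free variant of the minimal-explicit example (an assumption
;; made here to keep the resulting tree small).
(define xml-string
  "<myns:root xmlns:myns=\"http://example.org/namespaces/myns\"><myns:subnode/></myns:root>")

;; Map the namespace URI back to a short, user-chosen prefix instead of
;; letting the full URI end up in the element names.
(define result
  (xml->sxml xml-string
             #:namespaces
             '((myns . "http://example.org/namespaces/myns"))))

(write result)
(newline)
```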

After your patch:

  #+NAME: minimal-explicit-after
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-explicit-after
  #+begin_example
  <stdin>:13:0: warning: possibly unbound variable `pretty-print'
  <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
  ;;; note: source file ./sxml/simple.scm
  ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
  ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
  (*TOP* (*PI* xml "version=\"1.0\"")
         (myns:root
           (@ (xmlns:myns "http://example.org/namespaces/myns"))
           "\n  "
           (myns:subnode
             (@ (xmlns:myns "http://example.org/namespaces/myns")))
           "\n"))
  #+end_example

(I've put sxml/simple.scm in the current directory, hence the manipulation
of %load-path.) This mimics the XML more closely, using namespace prefixes
instead of namespace URIs in the sxml. This is compelling :-)

The only difference from the XML is that the namespace declaration is repeated
on lower-level nodes (which is why sxml->xml emits it on each of them, too).

This works, with the above downside, which you noted too.
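
One way to see what "not inheriting" could look like is a post-processing
pass over the sxml tree that drops any xmlns declaration already in scope
with the same value. This is only a sketch under assumptions: the helper
names ns-decl? and strip-inherited-ns are hypothetical and not part of
(sxml simple) or of the patch, and a real fix would more likely live
inside xml->sxml itself.

```scheme
(use-modules (srfi srfi-1)
             (ice-9 match))

;; True for attribute entries such as (xmlns "uri") or (xmlns:myns "uri").
(define (ns-decl? attr)
  (and (pair? attr)
       (symbol? (car attr))
       (let ((name (symbol->string (car attr))))
         (or (string=? name "xmlns")
             (string-prefix? "xmlns:" name)))))

;; Walk NODE and drop namespace declarations that merely repeat the
;; nearest enclosing declaration. IN-SCOPE is an alist of attribute
;; entries, innermost first, so assq finds the closest binding.
(define (strip-inherited-ns node in-scope)
  (match node
    (((? symbol? tag) ('@ attrs ...) children ...)
     (let* ((kept (remove (lambda (a)
                            (and (ns-decl? a)
                                 (equal? (assq (car a) in-scope) a)))
                          attrs))
            (scope (append (filter ns-decl? attrs) in-scope)))
       `(,tag
         ,@(if (null? kept) '() `((@ ,@kept)))
         ,@(map (lambda (c) (strip-inherited-ns c scope)) children))))
    (((? symbol? tag) children ...)
     `(,tag ,@(map (lambda (c) (strip-inherited-ns c in-scope)) children)))
    (other other)))

;; The tree printed above for minimal-explicit-after.
(define patched-tree
  '(*TOP* (*PI* xml "version=\"1.0\"")
          (myns:root
           (@ (xmlns:myns "http://example.org/namespaces/myns"))
           "\n  "
           (myns:subnode
            (@ (xmlns:myns "http://example.org/namespaces/myns")))
           "\n")))

(define stripped (strip-inherited-ns patched-tree '()))
```

Applied to the tree above, the declaration stays on myns:root and is
dropped from myns:subnode; a child that rebinds the prefix to a different
URI keeps its own declaration.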

It doesn't work with a default namespace, though:

  #+NAME: minimal-implicit
  #+BEGIN_EXAMPLE
  <?xml version="1.0"?>
  <root xmlns="http://example.org/namespaces/myns">
    <subnode/>
  </root>
  #+END_EXAMPLE

With your patch:

  #+NAME: minimal-implicit-after
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-implicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-implicit-after
  : <stdin>:13:0: warning: possibly unbound variable `pretty-print'
  : <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
  : ;;; note: source file ./sxml/simple.scm
  : ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
  : ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
  : (*TOP* (*PI* xml "version=\"1.0\"")
  :        (*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n"))

Note that the namespace declaration for *DEFAULT* is missing,
so we lost that bit of information. Besides, the result cannot
be serialized back to XML:

  #+NAME: reserialize-implicit
  #+BEGIN_SRC scheme :results output verbatim
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (define the-sxml
    '(*TOP* (*PI* xml "version=\"1.0\"")
       (*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n")))
  (sxml->xml the-sxml)
  #+END_SRC

sxml->xml rejects the invalid XML name, which starts with a star:

  #+RESULTS: reserialize-implicit
  : ERROR: In procedure scm-error:
  : Invalid name starting character "*DEFAULT*" *DEFAULT*:root
  : 
  : Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
  : scheme@(guile-user) [1]> 
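
By contrast, a tree that keeps the default namespace as a plain xmlns
attribute and leaves the element names unprefixed does serialize with
sxml->xml. The tree below is written by hand as one possible target
representation; it is an assumption for illustration, not something the
patch currently produces.

```scheme
(use-modules (sxml simple))

;; Hand-written sxml: the default namespace is an ordinary attribute,
;; and the element names carry no *DEFAULT*: prefix.
(define the-sxml
  '(*TOP* (root (@ (xmlns "http://example.org/namespaces/myns"))
                (subnode))))

;; Serialize into a string instead of the current output port.
(define the-xml
  (call-with-output-string
    (lambda (port) (sxml->xml the-sxml port))))

(display the-xml)
(newline)
```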

Cheers
-- tomás
