From: Andrew Gierth <andrew@tao11.riddles.org.uk>
To: 38269@debbugs.gnu.org
Subject: bug#38269: SSAX incorrect handling of &gt; in CDATA
Date: Tue, 19 Nov 2019 13:41:54 +0000
Message-ID: <87zhgsyost.fsf@news-spur.riddles.org.uk>
The bug:
> (xml->sxml "<e><![CDATA[&gt;]]></e>")
$2 = (*TOP* (e ">"))
The expected result is (*TOP* (e "&gt;")).
In upstream/SSAX.scm:
; procedure+: ssax:read-cdata-body PORT STR-HANDLER SEED
[...]
; Within a CDATA section all characters are taken at their face value,
; with only three exceptions:
[..]
; &gt; is treated as an embedded #\> character
This handling of &gt; is contrary to the XML specification, in which
there are no special character sequences inside CDATA except newline and
the "]]>" closing tag. I have confirmed this by checking other XML
parsers. The code seems to be based on a wild misreading of another
section of the specification that does not apply here. (And
unfortunately, the W3C validation suite for XML happens not to contain
any instances of &gt; inside CDATA.)
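For illustration, one such cross-check can be sketched with Python's standard xml.etree.ElementTree (expat-based). This is only an example of the kind of comparison meant above, not necessarily one of the parsers actually consulted; the helper name element_text is made up for the sketch.

```python
import xml.etree.ElementTree as ET


def element_text(doc: str) -> str:
    """Parse a one-element XML document and return the element's text."""
    return ET.fromstring(doc).text


# Inside CDATA the five characters "&gt;" are literal text, not an entity.
cdata_text = element_text("<e><![CDATA[&gt;]]></e>")

# Outside CDATA the same sequence is an entity reference and is expanded.
plain_text = element_text("<e>&gt;</e>")

print(repr(cdata_text))
print(repr(plain_text))
```

The CDATA case keeps the text as the literal string "&gt;", while the non-CDATA case yields ">", which is the distinction SSAX currently loses.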
I believe the fix should be as simple as removing the entire (#\&) case
from the function (and fixing the test cases).
This bug seems to exist in all versions of SSAX.
--
Andrew.