From: "Linus Björnstam" <linus.bjornstam@veryfast.biz>
To: guile-devel@gnu.org
Subject: Re: [PATCH] Fix of upstream parsing of CDATA
Date: Thu, 12 Mar 2020 09:35:07 +0100
Message-ID: <8d8bc0a1-04b0-4aeb-8be6-16864bfa288d@www.fastmail.com>
In-Reply-To: <5811db42-ecbe-4ad9-a44f-87481f1ac9a6@www.fastmail.com>
Oleg replied that he will look into this bug when he has time (and that the patch looked reasonable), which sounded non-imminent :) Hopefully this means that there will be an upstream patch sometime in the future.
--
Linus Björnstam
On Thu, 16 Jan 2020, at 13:00, Linus Björnstam wrote:
> Hello Guilers!
>
> RhodiumToad found an error in sxml where it would not properly parse
> CDATA: &gt; would be converted to > inside CDATA blocks. This is
> probably due to some wrong reading of the XML spec:
>
> "Within a CDATA section, only the CDEnd string is recognized as
> markup, so that left angle brackets and ampersands may occur in their
> literal form; they need not (and cannot) be escaped using ' &lt; ' and
> ' &amp; '.".
>
> Notice that it mentions that only CDEnd is recognized, but omits &gt;
> in the enumeration of things that need-not-and-cannot be escaped.
>
> No other XML libraries behave this way. Take for example python's Etree:
>
> Python 2.7.17 (default, Dec 23 2019, 21:25:33)
> >>> import xml.etree.ElementTree as ET
> >>> root = ET.fromstring("<e><![CDATA[&gt;]]></e>")
> >>> root.text
> '&gt;'
>
> The same thing with the unpatched (sxml ssax) (or rather (sxml
> simple)) looks different:
>
> (xml->sxml "<e><![CDATA[&gt;]]></e>")
> ;; => (*TOP* (e ">"))
>
> The question is whether this patch should be sent upstream. Since there
> has been very little activity there, I suspect it is a lost cause.
>
> Failing tests have been looked through, verified and fixed. No
> unexpected errors were encountered. All SXML tests pass after this
> patch.
>
> Best regards
> Linus Björnstam
> Attachments:
> * 0001-module-sxml-upstream-SSAX.scm-Fix-improper-handling-.patch
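
The rule quoted from the XML spec above can be checked against Python's ElementTree, the library used in the report. This is only a minimal sketch, assuming Python 3 and the standard-library parser; the helper name element_text is hypothetical and not part of any library. It parses the same three entity references once inside a CDATA section and once as ordinary character data:

```python
import xml.etree.ElementTree as ET


def element_text(xml_source):
    """Parse a single-element document and return the root element's text.

    Hypothetical helper for illustration only.
    """
    return ET.fromstring(xml_source).text


# Inside CDATA only the closing "]]>" is markup, so the references stay literal.
inside_cdata = element_text("<e><![CDATA[&gt; &lt; &amp;]]></e>")

# In ordinary character data the same references are decoded.
outside_cdata = element_text("<e>&gt; &lt; &amp;</e>")

print(repr(inside_cdata))   # '&gt; &lt; &amp;'
print(repr(outside_cdata))  # '> < &'
```

The unpatched SSAX code would be expected to differ only in the first case, where &gt; is decoded even though it sits inside the CDATA section.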