* [PATCH] Fix of upstream parsing of CDATA
@ 2020-01-16 12:00 Linus Björnstam
2020-03-12 8:35 ` Linus Björnstam
From: Linus Björnstam @ 2020-01-16 12:00 UTC
To: guile-devel
[-- Attachment #1: Type: text/plain, Size: 1286 bytes --]
Hello Guilers!
RhodiumToad found an error in sxml where it would not properly parse CDATA: &gt; would be converted to > inside CDATA blocks. This is probably due to a misreading of the XML spec:
"Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using ' &lt; ' and ' &amp; '."
Notice that it says only CDEnd is recognized, but omits &gt; from the enumeration of things that need-not-and-cannot be escaped.
No other XML library I know of behaves this way. Take for example Python's ElementTree:
Python 2.7.17 (default, Dec 23 2019, 21:25:33)
>>> import xml.etree.ElementTree as ET
>>> root = ET.fromstring("<e><![CDATA[&gt;]]></e>")
>>> root.text
'&gt;'
The same input with the un-patched (sxml ssax) (or rather (sxml simple)) gives a different result:
(xml->sxml "<e><![CDATA[&gt;]]></e>")
;; => (*TOP* (e ">"))
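As a rough illustration of how I read the spec text (this is not the code in the attached patch), here is a small Python sketch. The names read_cdata, CD_START and CD_END are made up for the example. The reader looks for nothing except the CDEnd string and returns everything before it untouched, so &gt;, &lt; and &amp; all stay literal. The result is then compared with what ElementTree returns for the same document.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical names, chosen only for this sketch.
CD_START = "<![CDATA["
CD_END = "]]>"


def read_cdata(source: str, start: int = 0) -> Tuple[str, int]:
    """Return (literal text, index just past CDEnd) for the CDATA
    section beginning at `start`.

    Only the CDEnd string is treated as markup; no entity or
    character references are expanded in the body.
    """
    if not source.startswith(CD_START, start):
        raise ValueError("no CDATA section at this position")
    body_start = start + len(CD_START)
    end = source.find(CD_END, body_start)
    if end == -1:
        raise ValueError("unterminated CDATA section")
    return source[body_start:end], end + len(CD_END)


doc = "<e><![CDATA[&gt; < &amp;]]></e>"
literal, _ = read_cdata(doc, doc.index(CD_START))

# The body comes back character for character.
print(repr(literal))
# ElementTree agrees with the literal reading.
print(ET.fromstring(doc).text == literal)
```

With the un-patched sxml the &gt; in such a document comes back as >, which is the difference the patch is meant to remove.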
The question is whether this patch should be sent upstream. Since there has been very little activity there, I suspect it is a lost cause.
Failing tests have been looked through, verified and fixed. No unexpected errors were encountered. All SXML tests pass after this patch.
Best regards
Linus Björnstam
[-- Attachment #2: 0001-module-sxml-upstream-SSAX.scm-Fix-improper-handling-.patch --]
[-- Type: application/octet-stream, Size: 4803 bytes --]
* Re: [PATCH] Fix of upstream parsing of CDATA
2020-01-16 12:00 [PATCH] Fix of upstream parsing of CDATA Linus Björnstam
@ 2020-03-12 8:35 ` Linus Björnstam
From: Linus Björnstam @ 2020-03-12 8:35 UTC
To: guile-devel
Oleg replied that he will look into this bug when he has time (and that the patch looked reasonable), which sounded non-imminent :) Hopefully this means that there will be an upstream patch sometime in the future.
--
Linus Björnstam
On Thu, 16 Jan 2020, at 13:00, Linus Björnstam wrote:
> Hello Guilers!
>
> RhodiumToad found an error in sxml where it would not properly parse
> CDATA: &gt; would be converted to > inside CDATA blocks. This is
> probably due to a misreading of the XML spec:
>
> "Within a CDATA section, only the CDEnd string is recognized as
> markup, so that left angle brackets and ampersands may occur in their
> literal form; they need not (and cannot) be escaped using ' &lt; ' and
> ' &amp; '."
>
> Notice that it says only CDEnd is recognized, but omits &gt; from
> the enumeration of things that need-not-and-cannot be escaped.
>
> No other XML libraries behave this way. Take for example python's Etree:
>
> Python 2.7.17 (default, Dec 23 2019, 21:25:33)
> >>> import xml.etree.ElementTree as ET
> >>> root = ET.fromstring("<e><![CDATA[&gt;]]></e>")
> >>> root.text
> '&gt;'
>
> The same input with the un-patched (sxml ssax) (or rather (sxml
> simple)) gives a different result:
>
> (xml->sxml "<e><![CDATA[&gt;]]></e>")
> ;; => (*TOP* (e ">"))
>
> The question is whether this patch should be sent upstream. Since there
> has been very little activity there, I suspect it is a lost cause.
>
> Failing tests have been looked through, verified and fixed. No
> unexpected errors were encountered. All SXML tests pass after this
> patch.
>
> Best regards
> Linus Björnstam
> Attachments:
> * 0001-module-sxml-upstream-SSAX.scm-Fix-improper-handling-.patch
unofficial mirror of guile-devel@gnu.org