unofficial mirror of guile-devel@gnu.org 
* [PATCH] Fix of upstream parsing of CDATA
From: Linus Björnstam @ 2020-01-16 12:00 UTC
  To: guile-devel


Hello Guilers!

RhodiumToad found an error in sxml where CDATA is not parsed properly: &gt; is converted to > inside CDATA blocks. This is probably due to a misreading of the XML spec:

    "Within a CDATA section, only the CDEnd string is recognized as markup, so that left angle brackets and ampersands may occur in their literal form; they need not (and cannot) be escaped using ' &lt; ' and ' &amp; '."

Notice that it says only CDEnd is recognized as markup, but omits &gt; from the enumeration of things that need not (and cannot) be escaped.
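
As a minimal sketch of what that sentence of the spec implies (this is not the SSAX code and not the attached patch): once inside a CDATA section, a reader only has to search for CDEnd and can return everything before it untouched. The helper name read_cdata and the sample string are made up for illustration.

```python
CDATA_START = "<![CDATA["
CDATA_END = "]]>"  # the CDEnd string from the XML spec


def read_cdata(text, pos=0):
    """Return (content, next_pos) for the CDATA section starting at pos.

    The content is returned verbatim: nothing that looks like an entity
    reference (&gt;, &lt;, &amp;, ...) is expanded.
    """
    if not text.startswith(CDATA_START, pos):
        raise ValueError("no CDATA section at position %d" % pos)
    begin = pos + len(CDATA_START)
    end = text.find(CDATA_END, begin)  # CDEnd is the only markup recognized
    if end == -1:
        raise ValueError("unterminated CDATA section")
    return text[begin:end], end + len(CDATA_END)


sample = "<![CDATA[&gt; < & &amp;]]></e>"
content, next_pos = read_cdata(sample)
print(content)            # &gt; < & &amp;
print(sample[next_pos:])  # </e>
```

Under this reading, &gt; gets no special treatment: it is just four literal characters, like everything else before CDEnd.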

No other XML library behaves this way. Take Python's ElementTree, for example:

Python 2.7.17 (default, Dec 23 2019, 21:25:33)
>>> import xml.etree.ElementTree as ET
>>> root = ET.fromstring("<e><![CDATA[&gt;]]></e>")
>>> root.text
'&gt;'
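
For comparison, a small Python 3 sketch of the same check. It also shows that the entity is expanded as usual outside of CDATA, so the literal result above is specific to CDATA sections. The variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Inside CDATA: the text is kept literally, entity-like strings included.
inside = ET.fromstring("<e><![CDATA[&gt;]]></e>").text

# Outside CDATA: the same reference is expanded to the character it names.
outside = ET.fromstring("<e>&gt;</e>").text

# Raw '<' and '&' are also allowed inside CDATA and come back unchanged.
mixed = ET.fromstring("<e><![CDATA[a < b && c &amp; d]]></e>").text

print(repr(inside))   # '&gt;'
print(repr(outside))  # '>'
print(repr(mixed))    # 'a < b && c &amp; d'
```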

The same input with the unpatched (sxml ssax), or rather (sxml simple), gives a different result:

(xml->sxml "<e><![CDATA[&gt;]]></e>")
;; => (*TOP* (e ">"))

The question is whether this patch should be sent upstream. Since there has been very little activity there, I suspect it is a lost cause.

Failing tests have been looked through, verified and fixed. No unexpected errors were encountered. All SXML tests pass after this patch.

Best regards
  Linus Björnstam

[-- Attachment #2: 0001-module-sxml-upstream-SSAX.scm-Fix-improper-handling-.patch --]
[-- Type: application/octet-stream, Size: 4803 bytes --]


* Re: [PATCH] Fix of upstream parsing of CDATA
From: Linus Björnstam @ 2020-03-12  8:35 UTC
  To: guile-devel

Oleg replied that he will look into this bug when he has time (and that the patch looked reasonable), which sounded non-imminent :) Hopefully this means that there will be an upstream patch sometime in the future.

-- 
  Linus Björnstam
