From: Maxime Devos
To: Aleix Conchillo Flaqué
Cc: guile-user, guile-devel
Newsgroups: gmane.lisp.guile.devel,gmane.lisp.guile.user
Subject: Re: avoid character encoding/escaping in sxml->xml or htmlprag's sxml->html
Date: Sun, 21 Aug 2022 12:16:54 +0200

On 21-08-2022 02:05, Aleix Conchillo Flaqué wrote:

> According to the spec, embedding inline content in the <script> tag should conform to the language defined by the "type" attribute (defaults to javascript). So, I would expect you could put any string that conforms to JS.
>
> """
> When used to include dynamic scripts, the scripts may either be embedded inline or may be imported from an external file using the src attribute. If the language is not that described by "text/javascript", then the type attribute must be present, as described below. Whatever language is used, the contents of the script element must conform with the requirements of that language's specification

I am proposing to use XHTML (which is XML), not HTML. HTML's special parsing quirks are irrelevant here.

> It does, browsers (at least Chrome) don't interpret that correctly, since it's not valid JavaScript.

As <script> ... </script> is XML, the XML parser (not the HTML parser, this is XHTML!) will decode the &lt; inside the <script>...</script>; the result _after decoding_ is valid JavaScript.  In XML, <script> is not special -- everything is parsed the same way in XML.
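
To illustrate the decoding step without a browser, here is a minimal sketch using Guile's (sxml simple) module. The names `js`, `serialized` and `decoded` are only for illustration, and Guile's SSAX-based parser stands in for a browser's XML parser, which is an assumption of this sketch rather than something tested above:

```scheme
(use-modules (sxml simple))

;; Illustrative inline script whose source contains '<' and '>'.
(define js "console.log(\"<Hi!>\");")

;; Serialize a <script> element; sxml->xml escapes the markup characters
;; in the text node.
(define serialized
  (call-with-output-string
    (lambda (port)
      (sxml->xml `(script ,js) port))))

;; Parse the result back as XML and collect the text content.
(define decoded (sxml->string (xml->sxml serialized)))

(display serialized) (newline)
;; Compare what the XML parser hands back with the original script text.
(display (string=? decoded js)) (newline)
```

If this behaves as I expect, the serialized form contains &lt;Hi!&gt; while the decoded text is the original JavaScript again, so the escaping is invisible to whatever consumes the parsed document.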

Anyway, it seems to work for me, both in icecat and ungoogled-chromium:

(use-modules (web server))

;; XHTML document; the < and > inside the script are written as &lt; and &gt;.
(define document
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<html xmlns=\"http://www.w3.org/1999/xhtml\">
<head>
<title>Test</title>
</head>
<body>
<script type=\"text/javascript\">
console.log(\"&lt;Hi!&gt;\");
</script>
</body>
</html>")

;; Serve it as application/xhtml+xml so that the browser uses its XML parser.
(define (handler request request-body)
  (values '((content-type application/xhtml+xml))
          document))

(run-server handler 'http)
-- on the console, <Hi!> is logged, not &lt;Hi!&gt;.

If I replace &lt; by < and &gt; by > to make it 'valid JavaScript' as you appear to be proposing, I get a parsing error:

(ungoogled-chromium)
This page contains the following errors:

error on line 8 at column 17: error parsing attribute name

Below is a rendering of the page up to the first error.


and

(icecat):
XML Parsing Error: not well-formed
Location: http://localhost:8080/
Line Number 8, Column 17:
console.log("<Hi!>");
----------------^
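
The same well-formedness check can be sketched in Guile itself. This assumes that xml->sxml from (sxml simple) signals an error on ill-formed input in the same situations where the browsers above do; `well-formed?`, `escaped` and `raw` are hypothetical names introduced for this example:

```scheme
(use-modules (sxml simple))

;; Return #t if STR parses as XML, #f if the parser raises any error.
(define (well-formed? str)
  (catch #t
    (lambda () (xml->sxml str) #t)
    (lambda (key . args) #f)))

;; The script body with entity references, as in the document above.
(define escaped "<script>console.log(\"&lt;Hi!&gt;\");</script>")

;; The same script body with raw angle brackets.
(define raw "<script>console.log(\"<Hi!>\");</script>")

(display (well-formed? escaped)) (newline)
(display (well-formed? raw)) (newline)
```

The raw variant should be rejected because the parser tries to read <Hi!> as a start tag, which matches the "error parsing attribute name" message from ungoogled-chromium.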

Greetings,
Maxime.
