* bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
From: Eshel Yaron via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-16 13:24 UTC
To: 68508
Tags: patch
This makes `dom-print` encode HTML reserved characters that occur in
string elements of the DOM, to ensure the validity of the result.
For example, put the following in `foo.html`:
--8<---------------cut here---------------start------------->8---
<html><body>
Add ‘<samp class="samp">&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;</samp>’ tags around the fontified body.
</body></html>
--8<---------------cut here---------------end--------------->8---
(Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html)
Open that file in Emacs and say `M-: (require 'dom)` and then
`(dom-print (libxml-parse-html-region))` in the HTML buffer. This
produces invalid HTML since `libxml-parse-html-region` correctly decodes
HTML entities, but `dom-print` doesn't encode (without this patch).
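The reproduction above can also be sketched non-interactively. The following is a minimal sketch, assuming an Emacs built with libxml2 support; `my-dom-round-trip` is a hypothetical helper name used only for illustration, not part of dom.el.

```elisp
(require 'dom)

;; Hypothetical helper (not part of dom.el): parse HTML text with
;; libxml, then print the resulting DOM back into a string.
(defun my-dom-round-trip (html)
  "Parse HTML, print the DOM with `dom-print', and return the text."
  (let ((dom (with-temp-buffer
               (insert html)
               ;; Decodes entities such as &lt; into plain characters.
               (libxml-parse-html-region (point-min) (point-max)))))
    (with-temp-buffer
      (dom-print dom)
      (buffer-string))))

;; The text child of the samp element contains encoded markup.
(my-dom-round-trip
 (concat "<html><body><samp class=\"samp\">"
         "&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;"
         "</samp></body></html>"))
```

Without the patch, the text inside the `samp` element should come back as a literal `<div class="default"> </div>`, which a browser would then parse as markup; with the patch it should be printed as entities again.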
[-- Attachment #2: 0001-dom-print-Use-HTML-entities-for-reserved-characters.patch --]
[-- Type: text/patch, Size: 704 bytes --]
From 259c0138623c352acc7bcd79a1fda42ec606a0cf Mon Sep 17 00:00:00 2001
From: Eshel Yaron <me@eshelyaron.com>
Date: Fri, 5 Jan 2024 16:40:44 +0100
Subject: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
---
lisp/dom.el | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lisp/dom.el b/lisp/dom.el
index f7043ba8252..b329379fdc3 100644
--- a/lisp/dom.el
+++ b/lisp/dom.el
@@ -288,7 +288,7 @@ dom-print
         (insert ">")
         (dolist (child children)
           (if (stringp child)
-              (insert child)
+              (insert (url-insert-entities-in-string child))
             (setq non-text t)
             (when pretty
               (insert "\n" (make-string (+ column 2) ?\s)))
--
2.42.0
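
For reference, the effect of the function this patch calls can be checked in isolation. A minimal sketch, relying only on `url-insert-entities-in-string` from url-util.el, which replaces `&`, `<`, `>` and `"` with the corresponding entities:

```elisp
(require 'url-util)

;; Encode the characters that are reserved in HTML text.
(url-insert-entities-in-string "<div class=\"default\"> </div>")
;; => "&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;"
```

The function returns a new string and leaves its argument untouched, so calling it on each string child in `dom-print` does not modify the DOM itself.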
* bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
From: Eli Zaretskii @ 2024-01-16 13:47 UTC
To: Eshel Yaron; +Cc: 68508
> Date: Tue, 16 Jan 2024 14:24:40 +0100
> From: Eshel Yaron via "Bug reports for GNU Emacs,
> the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
>
> This makes `dom-print` encode HTML reserved characters that occur in
> string elements of the DOM, to ensure the validity of the result.
>
> For example, put the following in `foo.html`:
>
> --8<---------------cut here---------------start------------->8---
> <html><body>
> Add ‘<samp class="samp">&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;</samp>’ tags around the fontified body.
> </body></html>
> --8<---------------cut here---------------end--------------->8---
> (Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html)
>
> Open that file in Emacs and say `M-: (require 'dom)` and then
> `(dom-print (libxml-parse-html-region))` in the HTML buffer. This
> produces invalid HTML since `libxml-parse-html-region` correctly decodes
> HTML entities, but `dom-print` doesn't encode (without this patch).
Thanks, but could you please also add tests for this?
* bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
From: Eshel Yaron via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-01-16 16:29 UTC
To: Eli Zaretskii; +Cc: 68508
Eli Zaretskii <eliz@gnu.org> writes:
>> Date: Tue, 16 Jan 2024 14:24:40 +0100
>> From: Eshel Yaron via "Bug reports for GNU Emacs,
>> the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
>>
>> This makes `dom-print` encode HTML reserved characters that occur in
>> string elements of the DOM, to ensure the validity of the result.
>>
>> For example, put the following in `foo.html`:
>>
>> --8<---------------cut here---------------start------------->8---
>> <html><body>
>> Add ‘<samp class="samp">&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;</samp>’ tags around the fontified body.
>> </body></html>
>> --8<---------------cut here---------------end--------------->8---
>> (Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html)
>>
>> Open that file in Emacs and say `M-: (require 'dom)` and then
>> `(dom-print (libxml-parse-html-region))` in the HTML buffer. This
>> produces invalid HTML since `libxml-parse-html-region` correctly decodes
>> HTML entities, but `dom-print` doesn't encode (without this patch).
>
> Thanks, but could you please also add tests for this?
Sure, I've added a test to dom-tests.el in the updated patch below.
[-- Attachment #2: v2-0001-Use-HTML-entities-for-reserved-characters-in-dom-.patch --]
[-- Type: text/x-patch, Size: 1732 bytes --]
From 8d60074053ee1ebc04fc3fda417d53ddc5a4fac9 Mon Sep 17 00:00:00 2001
From: Eshel Yaron <me@eshelyaron.com>
Date: Fri, 5 Jan 2024 16:40:44 +0100
Subject: [PATCH v2] ; Use HTML entities for reserved characters in 'dom-print'
* lisp/dom.el (dom-print): Encode HTML reserved characters in strings.
* test/lisp/dom-tests.el (dom-tests-print): New test. (Bug#68508)
---
lisp/dom.el | 2 +-
test/lisp/dom-tests.el | 10 ++++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/lisp/dom.el b/lisp/dom.el
index f7043ba8252..b329379fdc3 100644
--- a/lisp/dom.el
+++ b/lisp/dom.el
@@ -288,7 +288,7 @@ dom-print
         (insert ">")
         (dolist (child children)
           (if (stringp child)
-              (insert child)
+              (insert (url-insert-entities-in-string child))
             (setq non-text t)
             (when pretty
               (insert "\n" (make-string (+ column 2) ?\s)))
diff --git a/test/lisp/dom-tests.el b/test/lisp/dom-tests.el
index 8cbfb9ad9df..a4e913541bf 100644
--- a/test/lisp/dom-tests.el
+++ b/test/lisp/dom-tests.el
@@ -209,6 +209,16 @@ dom-tests-pp
       (dom-pp node t)
       (should (equal (buffer-string) "(\"foo\" nil)")))))
 
+(ert-deftest dom-tests-print ()
+  "Test that `dom-print' correctly encodes HTML reserved characters."
+  (with-temp-buffer
+    (dom-print '(samp ((class . "samp")) "<div class=\"default\"> </div>"))
+    (should (equal
+             (buffer-string)
+             (concat "<samp class=\"samp\">"
+                     "&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;"
+                     "</samp>")))))
+
 (ert-deftest dom-test-search ()
   (let ((dom '(a nil (b nil (c nil)))))
     (should (equal (dom-search dom (lambda (d) (eq (dom-tag d) 'a)))
--
2.42.0
* bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
From: Eli Zaretskii @ 2024-01-20 9:42 UTC
To: Eshel Yaron; +Cc: 68508-done
> From: Eshel Yaron <me@eshelyaron.com>
> Cc: 68508@debbugs.gnu.org
> Date: Tue, 16 Jan 2024 17:29:12 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Thanks, but could you please also add tests for this?
>
> Sure, I've added a test to dom-tests.el in the updated patch below.
Thanks, installed on master, and closing the bug.