From: Robert Pluim <rpluim@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com
Subject: bug#34469: 26.1; EWW stops renderring web page on null byte
Date: Tue, 19 Feb 2019 18:37:26 +0100
Message-ID: <m27edvel2h.fsf@gmail.com>
In-Reply-To: <83mumrivuv.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 19 Feb 2019 18:30:48 +0200")

Eli Zaretskii <eliz@gnu.org> writes:
>> From: Robert Pluim <rpluim@gmail.com>
>> Date: Tue, 19 Feb 2019 11:06:37 +0100
>> Cc: 34469@debbugs.gnu.org, Nicholas Drozd <nicholasdrozd@gmail.com>
>>
>> Glenn Morris <rgm@gnu.org> writes:
>>
>> > Perhaps eww-display-html should replace null bytes (with whatever the
>> > html standard says is appropriate) before calling
>> > libxml-parse-html-region. It already replaces CRLF.
>>
>> Chrome at least just strips the null byte completely.
>>
>> There is apparently a class of attacks that uses the null character
>> for nefarious purposes, so how about something like this:
>>
>> diff --git a/lisp/net/eww.el b/lisp/net/eww.el
>> index 1cc4557ce1..9b57bc43e4 100644
>> --- a/lisp/net/eww.el
>> +++ b/lisp/net/eww.el
>> @@ -448,8 +448,8 @@ eww-display-html
>>              (decode-coding-region (point) (point-max) encode)
>>            (coding-system-error nil))
>>          (save-excursion
>> -          ;; Remove CRLF before parsing.
>> -          (while (re-search-forward "\r$" nil t)
>> +          ;; Remove CRLF and NULL before parsing.
>> +          (while (re-search-forward "\r$\\|\000" nil t)
>>              (replace-match "" t t)))
>
> It is un-Emacsy, IMO, to remove content without a trace. (CR is
> different: we simply convert text to Unix LF-only EOL format.) So I'd
> suggest to replace with "^@" or "\000" or "NUL" or something to that
> effect. Even U+FFFD would be better than removing.
>

Since this is all due to a C-ism in the handling of content, Iʼd vote
for "\0", although this is inside Emacs, so perhaps "^@" is best.

> (We could get fancy and have a defcustom for those who do want the
> null bytes removed.)

I really donʼt think this is something that needs to be configurable.

Robert