From: "Sebastián Monía" <sebastian@sebasmonia.com>
To: Jim Porter <jporterbugs@gmail.com>
Cc: 73133@debbugs.gnu.org,
"Mattias Engdegård" <mattias.engdegard@gmail.com>,
"Eli Zaretskii" <eliz@gnu.org>,
ganimard@tuta.io
Subject: bug#73133: 29.2; EWW fails to render some webpages
Date: Thu, 24 Oct 2024 13:13:26 -0400
Message-ID: <thqnsesl2zm1.fsf@sebasmonia.com>
In-Reply-To: <87zfmufa1g.fsf@sebasmonia.com> ("Sebastián Monía"'s message of "Wed, 23 Oct 2024 23:35:07 -0400")
[-- Attachment #1: Type: text/plain, Size: 558 bytes --]
Sebastián Monía <sebastian@sebasmonia.com> writes:
> Jim Porter <jporterbugs@gmail.com> writes:
>> Thoughts on just simplifying to checking for "<!doctype html"? That
>> way, we'd also guess "text/html" for all the (mostly obsolete) HTML
>> doctypes here: <https://www.w3.org/QA/2002/04/valid-dtd-list.html>.
>
> It sounds like a good idea, can provide a patch in a couple days (maybe
> tomorrow). That leaves some time for dissenting voices to express any
> concerns with this approach.
Attached a patch with the corrections mentioned so far.
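To illustrate what the simplified check changes, here is a minimal sketch that runs the old and the new regexp against a few sample documents. The names `sketch-doctype-match-p`, `sketch-old-doctype-regexp` and `sketch-new-doctype-regexp` are hypothetical helpers for this illustration only and are not part of eww.el; the two regexps are copied from the patch.

```elisp
;; Hypothetical helpers for illustration; none of these names exist in eww.el.

(defconst sketch-old-doctype-regexp
  "<!doctype +html *\\(>\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"
  "Regexp used before the patch (copied from the removed lines).")

(defconst sketch-new-doctype-regexp "<!doctype html"
  "Regexp used after the patch (copied from the added lines).")

(defun sketch-doctype-match-p (regexp text)
  "Return t if REGEXP matches anywhere in TEXT, ignoring case."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Bind `case-fold-search' inside the buffer being searched, as the
    ;; patched function does.
    (let ((case-fold-search t))
      (and (re-search-forward regexp nil t) t))))

(defconst sketch-html5 "<!DOCTYPE html><html></html>")

(defconst sketch-html401
  (concat "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" "
          "\"http://www.w3.org/TR/html4/strict.dtd\"><html></html>"))

(defconst sketch-rss "<?xml version=\"1.0\"?><rss></rss>")

;; Both regexps accept the plain HTML5 doctype.
(sketch-doctype-match-p sketch-old-doctype-regexp sketch-html5)   ; => t
(sketch-doctype-match-p sketch-new-doctype-regexp sketch-html5)   ; => t

;; Only the simplified regexp accepts the legacy HTML 4.01 doctype.
(sketch-doctype-match-p sketch-old-doctype-regexp sketch-html401) ; => nil
(sketch-doctype-match-p sketch-new-doctype-regexp sketch-html401) ; => t

;; Neither regexp matches a document without an HTML doctype.
(sketch-doctype-match-p sketch-new-doctype-regexp sketch-rss)     ; => nil
```

The sample strings are made up for this sketch; the HTML 4.01 one stands in for the legacy doctypes from the W3C list quoted above.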
[-- Attachment #2: bug#73133 --]
[-- Type: text/x-patch, Size: 1683 bytes --]
From 952930c78dcfe7e4bb3a32504805239ae32073e9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sebasti=C3=A1n=20Mon=C3=ADa?=
<sebastian.monia@sebasmonia.com>
Date: Thu, 24 Oct 2024 13:09:11 -0400
Subject: [PATCH] More lax doctype check in EWW (bug#73133)
The regexp to match doctype tags was simplified and will match
more legacy entries; also correct binding of case-fold-search.
* lisp/net/eww.el (eww--html-if-doctype): Update function.
---
lisp/net/eww.el | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index 7bbbeadaedd..71e4d720b74 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -660,15 +660,14 @@ eww--html-if-doctype
   "Return \"text/html\" if RESPONSE-BUFFER has an HTML doctype declaration.
 HEADERS is unused."
   ;; https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
-  (let ((case-fold-search t)
-        (target
-         "<!doctype +html *\\(>\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"))
-    (with-current-buffer response-buffer
-      (goto-char (point-min))
-      ;; match basic <!doctype html> and also legacy variants as
-      ;; specified in link above
-      (when (re-search-forward target nil t)
-        "text/html"))))
+  (with-current-buffer response-buffer
+    (let ((case-fold-search t))
+      (save-excursion
+        (goto-char (point-min))
+        ;; match basic <!doctype html> and also legacy variants as
+        ;; specified in link above - being purposely lax about it
+        (when (re-search-forward "<!doctype html" nil t)
+          "text/html")))))
 
 (defun eww--rename-buffer ()
   "Rename the current EWW buffer.
--
2.45.2.windows.1
[-- Attachment #3: Type: text/plain, Size: 54 bytes --]
--
Sebastián Monía
https://site.sebasmonia.com/