From: "Sebastián Monía" <sebastian@sebasmonia.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: jporterbugs@gmail.com, 73133@debbugs.gnu.org, ganimard@tuta.io
Subject: bug#73133: 29.2; EWW fails to render some webpages
Date: Sat, 19 Oct 2024 13:56:14 -0400
Message-ID: <87zfn0f03l.fsf@sebasmonia.com>
In-Reply-To: <86wmi4lem4.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 19 Oct 2024 10:46:11 +0300")
[-- Attachment #1: Type: text/plain, Size: 449 bytes --]
Eli Zaretskii <eliz@gnu.org> writes:
> The legal paperwork is now done, so Sebastián, please update the patch
> to fix the nit with unused argument HEADERS in eww--html-if-doctype,
> and resubmit, so we could install the changes.
>
> Thanks.
What a momentous occasion :)
Attached is the patch with that correction (and a small docstring fix
that 'checkdoc' caught).
Thank you everyone for your help in this process.
Regards,
Seb
[-- Attachment #2: bug73133-doctype --]
[-- Type: text/x-patch, Size: 3003 bytes --]
From e35f4502383e368747d5f2bd8bcb9ed872315029 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sebasti=C3=A1n=20Mon=C3=ADa?= <sebastian@sebasmonia.com>
Date: Tue, 8 Oct 2024 23:26:42 -0400
Subject: [PATCH] Add customization to let EWW guess content-type if needed
(bug#73133)
---
lisp/net/eww.el | 39 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 38 insertions(+), 1 deletion(-)
diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index b5d2f20781a..147982057c5 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -108,6 +108,19 @@ eww-suggest-uris
eww-current-url
eww-bookmark-urls))
+(defcustom eww-guess-content-type-functions
+ '(eww--html-if-doctype)
+ "List of functions used to guess a page's content-type.
+These are only used when the page does not have a valid Content-Type
+header. Functions are called in order, until one of them returns the
+value to be used as Content-Type. They receive two parameters: an alist
+of headers, and the buffer that holds the complete response. If the
+list is exhausted, eww assumes \"text/plain\" so the user can see the
+markup."
+ :version "31.1"
+ :group 'eww
+ :type '(repeat function))
+
(defcustom eww-bookmarks-directory user-emacs-directory
"Directory where bookmark files will be stored."
:version "25.1"
@@ -630,6 +643,30 @@ eww-html-p
(member content-type '("text/html"
"application/xhtml+xml")))
+(defun eww--guess-content-type (headers response-buffer)
+ "Use HEADERS and RESPONSE-BUFFER to guess the Content-Type.
+Will call each function in `eww-guess-content-type-functions', until one
+of them returns a value. This mechanism is used only if there isn't a
+valid Content-Type header. If none of the functions can guess, return
+\"text/plain\", so at least the mark up is displayed."
+ (or (run-hook-with-args-until-success
+ 'eww-guess-content-type-functions headers response-buffer)
+ "text/plain"))
+
+(defun eww--html-if-doctype (_headers response-buffer)
+ "Return \"text/html\" if RESPONSE-BUFFER has an HTML doctype declaration.
+HEADERS is unused."
+ ;; https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
+ (let ((case-fold-search t)
+ (target
+ "<!doctype +html *\\(>\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"))
+ (with-current-buffer response-buffer
+ (goto-char (point-min))
+ ;; match basic <!doctype html> and also legacy variants as
+ ;; specified in link above
+ (when (re-search-forward target nil t)
+ "text/html"))))
+
(defun eww--rename-buffer ()
"Rename the current EWW buffer.
The renaming scheme is performed in accordance with
@@ -659,7 +696,7 @@ eww-render
(content-type
(mail-header-parse-content-type
(if (zerop (length (cdr (assoc "content-type" headers))))
- "text/plain"
+ (eww--guess-content-type headers (current-buffer))
(cdr (assoc "content-type" headers)))))
(charset (intern
(downcase
--
2.43.0
[-- Attachment #3: Type: text/plain, Size: 56 bytes --]
--
Sebastián Monía
https://site.sebasmonia.com/
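
The docstring of eww-guess-content-type-functions in the patch above says that each function receives an alist of headers and the response buffer, and returns the Content-Type to use (or nil to let the next function try). A minimal sketch of a user-defined guesser follows. The function name my/eww-html-if-html-tag and its heuristic (looking for an opening <html> tag) are assumptions made for illustration only; they are not part of the patch.

```elisp
;; Hypothetical guesser: the name and the heuristic are illustrative
;; assumptions, not something defined by the patch.
(defun my/eww-html-if-html-tag (_headers response-buffer)
  "Return \"text/html\" if RESPONSE-BUFFER contains an opening <html> tag.
Return nil otherwise, so the next guessing function gets a chance."
  (with-current-buffer response-buffer
    (save-excursion
      (goto-char (point-min))
      (let ((case-fold-search t))
        ;; Match "<html" followed by whitespace or ">" so that tags
        ;; with attributes (e.g. <html lang="en">) are also found.
        (when (re-search-forward "<html[ \t\n>]" nil t)
          "text/html")))))

;; Register the function only once eww is loaded, so the default value
;; from the defcustom is kept.  The final t appends it to the list, so
;; `eww--html-if-doctype' still runs first.
(with-eval-after-load 'eww
  (add-hook 'eww-guess-content-type-functions
            #'my/eww-html-if-html-tag t))
```

Because eww--guess-content-type calls the list with run-hook-with-args-until-success, returning nil from a guesser is what hands control to the next function, and "text/plain" remains the fallback when every function returns nil.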