From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sebastián Monía
Newsgroups: gmane.emacs.bugs
Subject: bug#73133: 29.2; EWW fails to render some webpages
Date: Sat, 19 Oct 2024 13:56:14 -0400
Message-ID: <87zfn0f03l.fsf@sebasmonia.com>
References: <86plox4bef.fsf@gnu.org>
 <7eb7b048-06ea-5751-56e1-590689c8c318@gmail.com>
 <8e285069-6e95-de49-dd46-92ce49b94372@gmail.com>
 <5e49a521-a191-15db-6368-6ca0f046d68a@gmail.com>
 <87y12y7y2s.fsf@sebasmonia.com>
 <9d90789a-ef06-1f7d-c340-2bba315dda5f@gmail.com>
 <87ttdk90c1.fsf@sebasmonia.com>
 <220e88e6-cbd4-331f-f25a-abb906852f6b@gmail.com>
 <86sesysrxi.fsf@gnu.org>
 <86wmi4lem4.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: jporterbugs@gmail.com, 73133@debbugs.gnu.org, ganimard@tuta.io
To: Eli Zaretskii
In-Reply-To: <86wmi4lem4.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 19 Oct 2024 10:46:11 +0300")
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:293903

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Eli Zaretskii writes:

> The legal paperwork is now done, so Sebastián, please update the patch
> to fix the nit with unused argument HEADERS in eww--html-if-doctype,
> and resubmit, so we could install the changes.
>
> Thanks.

What a momentous occasion :)

Attached is the patch with that correction (and a small docstring fix
that 'checkdoc' caught).

Thank you everyone for your help in this process.

Regards,
Seb

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment;
 filename=0001-Add-customization-to-let-EWW-guess-content-type-if-n.patch
Content-Description: bug73133-doctype

From e35f4502383e368747d5f2bd8bcb9ed872315029 Mon Sep 17 00:00:00 2001
From: Sebastián Monía
Date: Tue, 8 Oct 2024 23:26:42 -0400
Subject: [PATCH] Add customization to let EWW guess content-type if needed
 (bug#73133)

---
 lisp/net/eww.el | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index b5d2f20781a..147982057c5 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -108,6 +108,19 @@ eww-suggest-uris
     eww-current-url
     eww-bookmark-urls))
 
+(defcustom eww-guess-content-type-functions
+  '(eww--html-if-doctype)
+  "List of functions used to guess a page's content-type.
+These are only used when the page does not have a valid Content-Type
+header.  Functions are called in order, until one of them returns the
+value to be used as Content-Type.  They receive two parameters: an alist
+of headers, and the buffer that holds the complete response.  If the
+list is exhausted, eww assumes \"text/plain\" so the user can see the
+markup."
+  :version "31.1"
+  :group 'eww
+  :type '(repeat function))
+
 (defcustom eww-bookmarks-directory user-emacs-directory
   "Directory where bookmark files will be stored."
   :version "25.1"
@@ -630,6 +643,30 @@ eww-html-p
   (member content-type '("text/html"
                          "application/xhtml+xml")))
 
+(defun eww--guess-content-type (headers response-buffer)
+  "Use HEADERS and RESPONSE-BUFFER to guess the Content-Type.
+Will call each function in `eww-guess-content-type-functions', until one
+of them returns a value.  This mechanism is used only if there isn't a
+valid Content-Type header.  If none of the functions can guess, return
+\"text/plain\", so at least the mark up is displayed."
+  (or (run-hook-with-args-until-success
+       'eww-guess-content-type-functions headers response-buffer)
+      "text/plain"))
+
+(defun eww--html-if-doctype (_headers response-buffer)
+  "Return \"text/html\" if RESPONSE-BUFFER has an HTML doctype declaration.
+HEADERS is unused."
+  ;; https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
+  (let ((case-fold-search t)
+        (target
+         "<!doctype +html\\(>\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"))
+    (with-current-buffer response-buffer
+      (goto-char (point-min))
+      ;; match basic <!doctype html> and also legacy variants as
+      ;; specified in link above
+      (when (re-search-forward target nil t)
+        "text/html"))))
+
 (defun eww--rename-buffer ()
   "Rename the current EWW buffer.
 The renaming scheme is performed in accordance with
@@ -659,7 +696,7 @@ eww-render
          (content-type
           (mail-header-parse-content-type
            (if (zerop (length (cdr (assoc "content-type" headers))))
-               "text/plain"
+               (eww--guess-content-type headers (current-buffer))
              (cdr (assoc "content-type" headers)))))
          (charset (intern
                    (downcase
-- 
2.43.0


--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

-- 
Sebastián Monía
https://site.sebasmonia.com/

--=-=-=--
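
Usage note on the patch above: eww-guess-content-type-functions is a
plain list of functions, each called with the header alist and the
response buffer, so a user could presumably add a guesser of their own.
The sketch below is not part of the patch. The name
my/eww-html-if-html-tag is hypothetical, and it assumes that an opening
<html> tag is a good enough hint for pages that send neither a
Content-Type header nor a doctype.

```elisp
;; Hypothetical user-side guesser; not part of the submitted patch.
(defun my/eww-html-if-html-tag (_headers response-buffer)
  "Return \"text/html\" if RESPONSE-BUFFER contains an opening <html> tag.
Return nil otherwise, so the next guess function gets a chance.
_HEADERS is ignored."
  (let ((case-fold-search t))
    (with-current-buffer response-buffer
      (save-excursion
        (goto-char (point-min))
        ;; Match "<html>" as well as "<html lang=...>", in any case.
        (when (re-search-forward "<html[ \t\n>]" nil t)
          "text/html")))))

;; Register the guesser after the default one.  The `boundp' guard is
;; there because the variable only exists once this patch is installed.
(with-eval-after-load 'eww
  (when (boundp 'eww-guess-content-type-functions)
    (add-to-list 'eww-guess-content-type-functions
                 #'my/eww-html-if-html-tag t)))
```

Appending rather than prepending keeps the doctype check first, so the
looser <html> heuristic only runs when the stricter test has declined.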