From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: =?UTF-8?Q?Sebasti=C3=A1n_?= =?UTF-8?Q?Mon=C3=ADa?=
Newsgroups: gmane.emacs.bugs
Subject: bug#73133: 29.2; EWW fails to render some webpages
Date: Wed, 09 Oct 2024 22:08:14 -0400
Message-ID: <87ttdk90c1.fsf@sebasmonia.com>
References: <86plox4bef.fsf@gnu.org>
 <7eb7b048-06ea-5751-56e1-590689c8c318@gmail.com>
 <8e285069-6e95-de49-dd46-92ce49b94372@gmail.com>
 <5e49a521-a191-15db-6368-6ca0f046d68a@gmail.com>
 <87y12y7y2s.fsf@sebasmonia.com>
 <9d90789a-ef06-1f7d-c340-2bba315dda5f@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
 logging-data="5485"; mail-complaints-to="usenet@ciao.gmane.io"
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: Eli Zaretskii , 73133@debbugs.gnu.org, ganimard@tuta.io
To: Jim Porter
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Thu Oct 10 04:09:16 2024
Return-path:
Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
 by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.92) (envelope-from ) id 1syic6-0001FS-Ry
 for geb-bug-gnu-emacs@m.gmane-mx.org; Thu, 10 Oct 2024 04:09:16 +0200
Original-Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from )
 id 1syibm-0000CE-Nb; Wed, 09 Oct 2024 22:08:54 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from ) id 1syibk-0000C1-1x
 for bug-gnu-emacs@gnu.org; Wed, 09 Oct 2024 22:08:52 -0400
Original-Received: from debbugs.gnu.org ([2001:470:142:5::43])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from ) id 1syibj-0004X0-Pg
 for bug-gnu-emacs@gnu.org; Wed, 09 Oct 2024 22:08:51 -0400
In-Reply-To: <9d90789a-ef06-1f7d-c340-2bba315dda5f@gmail.com> (Jim Porter's
 message of "Tue, 8 Oct 2024 20:42:47 -0700")
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org
Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org
Xref: news.gmane.io gmane.emacs.bugs:293237

--=-=-=
Content-Type: text/plain

Jim Porter writes:

> I believe this could be:
>
>   (or (run-hook-with-args-until-success
>        'eww-guess-content-type-functions headers response-buffer)
>       "text/plain")

TIL. I landed in seq-some looking for something like
run-hook-with-args-until-success. So I actually learned two days in a
row! :)

Attached a modified patch. I also noticed and corrected another error
that broke things when using the "g" (reload) command.

As for testing, I used this:

(defun do-ask (headers response)
  (when (y-or-n-p "decide?")
    (if (y-or-n-p "render?")
        "text/html"
      "text/plain")))

(setq eww-guess-content-type-functions '(do-ask eww--html-if-doctype))

Then I reversed the order of the functions. I tested with "regular"
pages and the one reported in the bug, and also with no functions in
the list.
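For anyone who wants to try a guesser of their own, here is a minimal
sketch of one more function that could go in the list. It assumes the
calling convention described in the patch (two arguments: the headers
alist and the response buffer; return a content-type string, or nil to
let the next function try). The name my-eww-html-if-html-tag is
hypothetical and not part of the patch.

```elisp
;; Hypothetical example, not part of the attached patch.
(defun my-eww-html-if-html-tag (_headers response-buffer)
  "Return \"text/html\" if RESPONSE-BUFFER contains an opening html tag.
Return nil otherwise, so the next function in the list gets a turn."
  (with-current-buffer response-buffer
    (save-excursion
      (goto-char (point-min))
      ;; Bind `case-fold-search' inside the response buffer so the
      ;; binding applies to the buffer that is actually searched.
      (let ((case-fold-search t))
        (when (re-search-forward "<html[ \t\n>]" nil t)
          "text/html")))))

;; Register it only after eww is loaded, so the default value of the
;; option (with `eww--html-if-doctype') is already in place.
(with-eval-after-load 'eww
  (add-hook 'eww-guess-content-type-functions
            #'my-eww-html-if-html-tag))
```

Since add-hook puts new functions at the front by default, this one
would run before eww--html-if-doctype. Passing a non-nil DEPTH argument
to add-hook would append it instead.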
--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment;
 filename=0001-Add-customization-to-let-EWW-guess-content-type-if-n.patch
Content-Description: patch bug 73133

>From 5239cf0add09f69276ae21c13efb2fe665297234 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sebasti=C3=A1n=20Mon=C3=ADa?=
Date: Tue, 8 Oct 2024 23:26:42 -0400
Subject: [PATCH] Add customization to let EWW guess content-type if needed
 (bug#73133)

---
 lisp/net/eww.el | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index b5d2f20781a..30e780a44d9 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -108,6 +108,19 @@ eww-suggest-uris
           eww-current-url
           eww-bookmark-urls))
 
+(defcustom eww-guess-content-type-functions
+  '(eww--html-if-doctype)
+  "List of functions used to guess a page's content-type.
+These are only used when the page does not have a valid Content-Type
+header.  Functions are called in order, until one of them returns the
+value to be used as Content-Type.  They receive two parameters: an alist
+of headers, and the buffer that holds the complete response.  If the
+list is exhausted, eww assumes \"text/plain\" so the user can see the
+markup."
+  :version "31.1"
+  :group 'eww
+  :type '(repeat function))
+
 (defcustom eww-bookmarks-directory user-emacs-directory
   "Directory where bookmark files will be stored."
   :version "25.1"
@@ -630,6 +643,30 @@ eww-html-p
   (member content-type '("text/html"
                          "application/xhtml+xml")))
 
+(defun eww--guess-content-type (headers response-buffer)
+  "Use HEADERS and RESPONSE-BUFFER to guess the Content-Type.
+Will call each function in `eww-guess-content-type-functions', until one
+of them returns a value.  This mechanism is used only if there isn't a
+valid Content-Type header.  If none of the functions can guess, return
+\"text/plain\", so at least the markup is displayed."
+  (or (run-hook-with-args-until-success
+       'eww-guess-content-type-functions headers response-buffer)
+      "text/plain"))
+
+(defun eww--html-if-doctype (headers response-buffer)
+  "Return \"text/html\" if RESPONSE-BUFFER has an HTML doctype declaration.
+HEADERS is unused."
+  ;; https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
+  (let ((case-fold-search t)
+        (target
+         "\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"))
+    (with-current-buffer response-buffer
+      (goto-char (point-min))
+      ;; match basic and also legacy variants as
+      ;; specified in link above
+      (when (re-search-forward target nil t)
+        "text/html"))))
+
 (defun eww--rename-buffer ()
   "Rename the current EWW buffer.
 The renaming scheme is performed in accordance with
@@ -659,7 +696,7 @@ eww-render
          (content-type
           (mail-header-parse-content-type
            (if (zerop (length (cdr (assoc "content-type" headers))))
-               "text/plain"
+               (eww--guess-content-type headers (current-buffer))
              (cdr (assoc "content-type" headers)))))
          (charset (intern
                    (downcase
-- 
2.43.0

--=-=-=--