From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sebastián Monía
Newsgroups: gmane.emacs.bugs
Subject: bug#73133: 29.2; EWW fails to render some webpages
Date: Thu, 24 Oct 2024 13:13:26 -0400
References: <86613F3D-B7C8-4498-B435-7AAF342264C2@gmail.com>
 <2eb287fc-b73e-f7d0-ed5d-fa52063224e8@gmail.com>
 <87zfmufa1g.fsf@sebasmonia.com>
In-Reply-To: <87zfmufa1g.fsf@sebasmonia.com> ("Sebastián Monía"'s message of
 "Wed, 23 Oct 2024 23:35:07 -0400")
To: Jim Porter
Cc: 73133@debbugs.gnu.org, Mattias Engdegård, Eli Zaretskii,
 ganimard@tuta.io
User-Agent: Gnus/5.13 (Gnus v5.13)
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:294198
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Sebastián Monía writes:

> Jim Porter writes:
>> Thoughts on just simplifying to checking for "[...]
>> way, we'd also guess "text/html" for all the (mostly obsolete) HTML
>> doctypes here: .
>
> It sounds like a good idea, can provide a patch in a couple days (maybe
> tomorrow). That leaves some time for dissenting voices to express any
> concerns with this approach.

Attached a patch with the corrections mentioned so far.

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment;
 filename=0001-More-lax-doctype-check-in-EWW-bug-73133.patch
Content-Description: bug#73133

>From 952930c78dcfe7e4bb3a32504805239ae32073e9 Mon Sep 17 00:00:00 2001
From: Sebastián Monía
Date: Thu, 24 Oct 2024 13:09:11 -0400
Subject: [PATCH] More lax doctype check in EWW (bug#73133)

The regexp to match doctype tags was simplified and will match more
legacy entries; also correct binding of case-fold-search.

* lisp/net/eww.el (eww--html buffer-list): Update function.
---
 lisp/net/eww.el | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index 7bbbeadaedd..71e4d720b74 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -660,15 +660,14 @@ eww--html-if-doctype
   "Return \"text/html\" if RESPONSE-BUFFER has an HTML doctype declaration.
 HEADERS is unused."
   ;; https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
-  (let ((case-fold-search t)
-        (target
-         "\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"))
-    (with-current-buffer response-buffer
-      (goto-char (point-min))
-      ;; match basic and also legacy variants as
-      ;; specified in link above
-      (when (re-search-forward target nil t)
-        "text/html"))))
+  (with-current-buffer response-buffer
+    (let ((case-fold-search t))
+      (save-excursion
+        (goto-char (point-min))
+        ;; match basic and also legacy variants as
+        ;; specified in link above - being purposely lax about it
+        (when (re-search-forward "