From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#37009: EWW Gets Confused on Invalid HTML
Date: Tue, 13 Aug 2019 21:13:46 +0300
Message-ID: <83d0h9rl0l.fsf@gnu.org>
References: <855zn12bnu.fsf@gmail.com>
Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="243459"; mail-complaints-to="usenet@blaine.gmane.org"
Cc: 37009@debbugs.gnu.org, nick.m.daly@gmail.com
To: Noam Postavsky
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Tue Aug 13 20:15:11 2019
Return-path:
Envelope-to: geb-bug-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hxbKA-0011Bb-3Q for geb-bug-gnu-emacs@m.gmane.org; Tue, 13 Aug 2019 20:15:10 +0200
Original-Received: from localhost ([::1]:54692 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hxbK8-0004Fz-Fe for geb-bug-gnu-emacs@m.gmane.org; Tue, 13 Aug 2019 14:15:08 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:42474) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hxbK3-0004Fc-8W for bug-gnu-emacs@gnu.org; Tue, 13 Aug 2019 14:15:04 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hxbK2-0002c4-4m for bug-gnu-emacs@gnu.org; Tue, 13 Aug 2019 14:15:03 -0400
Original-Received: from debbugs.gnu.org ([209.51.188.43]:39806) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hxbK2-0002bo-1v for bug-gnu-emacs@gnu.org; Tue, 13 Aug 2019 14:15:02 -0400
Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1hxbK1-0004Ey-Lp for bug-gnu-emacs@gnu.org; Tue, 13 Aug 2019 14:15:01 -0400
X-Loop: help-debbugs@gnu.org
Resent-From: Eli Zaretskii
Original-Sender: "Debbugs-submit"
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Tue, 13 Aug 2019 18:15:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 37009
X-GNU-PR-Package: emacs
Original-Received: via spool by 37009-submit@debbugs.gnu.org id=B37009.156572004616221 (code B ref 37009); Tue, 13 Aug 2019 18:15:01 +0000
Original-Received: (at 37009) by debbugs.gnu.org; 13 Aug 2019 18:14:06 +0000
Original-Received: from localhost ([127.0.0.1]:48627 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hxbJ8-0004DZ-2t for submit@debbugs.gnu.org; Tue, 13 Aug 2019 14:14:06 -0400
Original-Received: from eggs.gnu.org ([209.51.188.92]:58135) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hxbJ6-0004Cw-Jw for 37009@debbugs.gnu.org; Tue, 13 Aug 2019 14:14:05 -0400
Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:42691) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hxbJ1-0001VU-H3; Tue, 13 Aug 2019 14:13:59 -0400
Original-Received: from [176.228.60.248] (port=3164 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hxbJ0-0003Ph-Dr; Tue, 13 Aug 2019 14:13:59 -0400
In-reply-to: <855zn12bnu.fsf@gmail.com> (message from Noam Postavsky on Tue, 13 Aug 2019 13:55:01 -0400)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 209.51.188.43
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: "bug-gnu-emacs"
Xref: news.gmane.org gmane.emacs.bugs:164945
Archived-At:

> From: Noam Postavsky
> Date: Tue, 13 Aug 2019 13:55:01 -0400
> Cc: 37009@debbugs.gnu.org
>
> > Unfortunately, the page does not escape the less-than symbol before "xs"
> > on the second line, so the "<-" (and several more characters) aren't
> > displayed.
>
> I'm not sure how feasible it will be to fix this at all. Eww relies on
> libxml for parsing, and it's not as flexible as a typical web browser:
>
> (with-temp-buffer
>   (insert "
> abc <- xyz
> ")
>   (libxml-parse-html-region (point-min) (point-max)))
>
> ;=> (html nil (body nil "abc\n"))

Maybe we should report this to libxml developers and hear their opinion?
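
For comparison, here is a minimal sketch that parses the same text twice, once with the bare "<" and once with it escaped as "&lt;". It assumes an Emacs built with libxml2 support; `my-parse-html-string` is a hypothetical helper, not part of Emacs or eww. The exact DOM may differ between libxml2 versions, so no expected output is shown.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs or eww.
(defun my-parse-html-string (html)
  "Parse the string HTML with libxml and return the resulting DOM.
Return nil if this Emacs was built without libxml2 support."
  (when (fboundp 'libxml-parse-html-region)
    (with-temp-buffer
      (insert html)
      (libxml-parse-html-region (point-min) (point-max)))))

;; Unescaped less-than sign: the case quoted in this report.
(my-parse-html-string "\nabc <- xyz\n")

;; Same text with the less-than sign written as an entity, which is
;; what a page that escaped its content correctly would contain.
(my-parse-html-string "\nabc &lt;- xyz\n")
```

Evaluating both forms side by side should show whether the text is lost only in the unescaped case, which may be useful detail for a report to the libxml developers.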