From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.bugs
Subject: bug#30789: 26.0.91; xml-parse-region works but libxml-parse-html-region doesn't
Date: Tue, 13 Mar 2018 01:44:22 +0100
To: Katsumi Yamaoka
Cc: 30789@debbugs.gnu.org, 積丹尼 Dan Jacobson
In-Reply-To: (Katsumi Yamaoka's message of "Tue, 13 Mar 2018 08:38:09 +0900")

Katsumi Yamaoka writes:

> When I read the mail using Gnus + shr, the text after the broken
> point is all cut off. That is what libxml-parse-html-region does,
> whereas xml-parse-region doesn't cut it. Moreover a web browser,
> to which I send the html data using the `K H' command, shows all
> the text (the broken character is shown as is, though).
>
> This is not necessarily a libxml bug anyway, but I hope it works
> like xml-parse.

libxml is more strict about correctness of the input than most other
HTML parsers. I don't think there's anything we can do about this
problematic input other than ponder whether Emacs should use a
different HTML parser, which I think sounds unlikely. :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no