From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chong Yidong
Newsgroups: gmane.emacs.bugs
Subject: bug#4950: `xml-parse-file' returns incorrect results strings after `>' before `<' when CR\LF TAB+
Date: Sun, 01 Jul 2012 19:22:33 +0800
Message-ID: <874npr6806.fsf@gnu.org>
Cc: 4950@debbugs.gnu.org
To: MON KEY
In-Reply-To: (MON KEY's message of "Tue, 17 Nov 2009 17:12:37 -0500")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

MON KEY writes:

> CR\LF
> TAB TAB TAB
>
> Returns (:NOTE with my pp-ing to help clarify the problem):
>
> (ELEMENT nil
> ((attr1 . "a1")
> (attr2 . "a2")
> (attr3 . "a3")
> (attr4 . "a4")
> (attr5 . "a5") "
> " ;; <-i.e. (mapconcat #'char-to-string '(32 10 9 9 9) "")
> (NEXT-NODE nil (...
>
> Is it fair/safe to assume that where these types of sequences occur
> they are not part of the XML and can be removed with a regexp?

No.  XML 1.0 Recommendation, Section 2.10 White Space Handling: "An XML
processor MUST always pass all characters in a document that are not
markup through to the application."
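
In other words, the whitespace-only string is character data that the parser is required to hand over. Whether to discard it is a decision for the application. A minimal sketch of doing that on the parsed tree rather than with a regexp on the raw file, assuming the usual xml.el node shape (NAME ATTRIBUTES . CHILDREN); the `my-xml-' function names are hypothetical and are not part of xml.el:

```elisp
(require 'xml)

(defun my-xml-parse-string (xml-text)
  "Parse XML-TEXT with `xml-parse-region' and return the list of nodes.
Hypothetical helper; it only wraps the existing xml.el entry point."
  (with-temp-buffer
    (insert xml-text)
    (xml-parse-region (point-min) (point-max))))

(defun my-xml-blank-string-p (object)
  "Return non-nil if OBJECT is a string made only of XML white space."
  (and (stringp object)
       (string-match-p "\\`[ \t\r\n]*\\'" object)))

(defun my-xml-drop-blank-strings (node)
  "Return a copy of NODE without whitespace-only string children.
NODE is assumed to have the form (NAME ATTRIBUTES . CHILDREN).
Strings that contain any non-whitespace character are kept as-is."
  (if (not (consp node))
      node
    (append (list (car node) (cadr node))
            (delq nil
                  (mapcar (lambda (child)
                            (unless (my-xml-blank-string-p child)
                              (my-xml-drop-blank-strings child)))
                          (cddr node))))))

;; Usage: parse first, then filter each top-level node.
;; (mapcar #'my-xml-drop-blank-strings
;;         (my-xml-parse-string "<a x=\"1\">\n\t\t\t<b/></a>"))
```

This is only safe when the application knows that white space between elements carries no meaning in its own vocabulary. In mixed content it can be significant, which is why the parser itself must not drop it.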