From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Boylan, Ross"
Newsgroups: gmane.emacs.help
Subject: nXML gags on XML with one long line
Date: Wed, 24 Dec 2014 09:02:43 +0000
To: "help-gnu-emacs@gnu.org"

I opened a relatively large .xml file in GNU Emacs 23.4.1 (x86_64-pc-linux-gnu, GTK+ Version 2.24.10) of 2012-09-08 on trouble, modified by Debian. It opened in nXML mode, but then started using up all the CPU (I think after I asked it to use outline mode) and became unresponsive. Before that it showed a message saying the file was 87% validated (very roughly, from memory) for quite a while (a minute?), with low CPU use. It did eventually show as completely validated.

The XML file is one long block of text with no whitespace between entries; in fact, there isn't even a line break at the end of the file. Some of the nXML documentation, specifically on paragraphs, refers to identifying paragraphs by line breaks. Perhaps nXML can't cope with files without newlines?

At any rate, any suggestions for how to deal with the file, or at least debug the problem?

$ wc KHC-Endnote2.xml # the file referred to above
      0  344068 4883375 KHC-Endnote2.xml

The file was exported from EndNote in its XML format.

Thanks.
Ross Boylan

P.S. I was able to change to text mode, but that isn't too helpful since the thing is just one long string.
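
One possible workaround for a file like the one described above, offered only as a minimal sketch: break the single line at tag boundaries before opening it in Emacs. It assumes Python 3 is available, that the file is in an ASCII-compatible encoding such as UTF-8, and that a newline between adjacent tags is insignificant in this export. None of that is confirmed above. The helper names are hypothetical, not part of nXML or Emacs, and the plain `><` substitution would also touch that byte sequence inside CDATA sections or comments.

```python
# Hypothetical helpers for illustration only; not part of nXML or Emacs.


def break_between_tags(data: bytes) -> bytes:
    """Insert a newline wherever one tag ends and the next begins ('><').

    Assumption: whitespace between adjacent tags does not matter for this
    file. Text content and attribute values are left as they are, unless
    they happen to contain the literal bytes '><' (e.g. inside CDATA).
    """
    return data.replace(b"><", b">\n<")


def longest_line(data: bytes) -> int:
    """Return the length in bytes of the longest newline-delimited line."""
    return max(len(line) for line in data.split(b"\n"))


def reflow_file(src: str, dst: str) -> None:
    """Read src, break it at tag boundaries, and write the result to dst."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "wb") as f:
        f.write(break_between_tags(raw))
```

Calling `reflow_file("KHC-Endnote2.xml", "reflowed.xml")` (the output name is arbitrary) writes a copy and leaves the original untouched. Comparing `longest_line` before and after shows whether the split actually shortened the lines; whether that is enough for nXML to stay responsive is untested.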