From: William F Hammond
Newsgroups: gmane.emacs.help
Subject: Re: sgml-validate
Date: Wed, 16 Feb 2011 19:56:00 -0500
Organization: Dept of Math & Stat, Univ at Albany (SUNY), Albany, NY
To: help-gnu-emacs@gnu.org

>> onsgmls is a very reliable tool. In general, however, it requires
>> complicated command lines.
>
> Agreed. I have given up since I was not able to figure out how to have
> it "not validate" and xmllint has that ability. The validation adds
> minutes to the test so needs to be used sparingly.

If it's adding minutes, then it's probably because external entities
are being loaded from across the network.  With most tools there are
ways to avoid this, and one should avoid it.

For a non-validating parse with onsgmls use something like this shell
script:

------
#!/bin/sh
# interface to "onsgmls" for non-validating XML parse
pname=`basename "$0"`

# Tell onsgmls to treat the input as XML (fixed character set).
SP_CHARSET_FIXED=YES
export SP_CHARSET_FIXED
SP_ENCODING=XML
export SP_ENCODING
SGML_CATALOG_FILES=/usr/local/xml/pubtext/myFavoriteCatalog
export SGML_CATALOG_FILES

# Accept either a full file name or a base name without ".xml".
if [ -f "$1" -a -r "$1" ] ; then
  fname="$1"
else
  fname="${1}.xml"
fi

if [ -f "$fname" ] ; then
  onsgmls -wxml -wno-valid -c "$SGML_CATALOG_FILES" "$fname"
else
  # Report the failure on stderr and exit with a non-zero status.
  echo "${pname}: Cannot find $fname" >&2
  exit 1
fi
------

For a non-validating parse the specified catalog should contain the
line

   SGMLDECL "xml.dcl"

where "xml.dcl" is James Clark's SGML declaration for XML, usually
distributed with OpenSP, which should be located parallel to the
catalog or else referenced in the catalog with a pathname relative to
the location of the catalog.
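
As a rough sketch of the catalog setup described above, the following
writes a one-line catalog and checks that it names an SGML
declaration.  The scratch directory "catdir" and the variable names
are hypothetical, chosen only for illustration; a real catalog would
live in the directory that already holds xml.dcl from the OpenSP
distribution.

```shell
#!/bin/sh
# Sketch only: build a minimal catalog for a non-validating parse.
# catdir is a hypothetical scratch directory, removed when the shell exits.
catdir=`mktemp -d`
trap 'rm -rf "$catdir"' EXIT
catalog="$catdir/catalog"

# The single line the catalog needs.  The bare name "xml.dcl" is resolved
# relative to the catalog's own location, so xml.dcl must sit beside it.
printf 'SGMLDECL "xml.dcl"\n' > "$catalog"

# Confirm that the catalog names an SGML declaration before relying on it.
if grep -q '^SGMLDECL ' "$catalog" ; then
  status="catalog ok"
else
  status="catalog lacks SGMLDECL"
fi
echo "$status"
```

With a copy of xml.dcl placed next to such a catalog, it could be
passed to the parser in the same way as in the script above, e.g.
onsgmls -wxml -wno-valid -c "$catalog" myfile.xml, where myfile.xml
stands for whatever instance is being checked.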

> The only annoyance is that xmllint keeps telling me that
> &ge; et al are not valid entities.

Correct.  The only such entities that are available by default are
'&amp;' for '&', '&lt;' for '<', and '&gt;' for '>' (along with
'&apos;' and '&quot;' for the two quote characters).

Something like '&ge;' (which I imagine to be U+2265, "greater than or
equal to") can be defined for in-house use but will not work well
across the network.  For publication on the network use the character
itself if the XML instance is given a suitable character encoding,
e.g., utf-8, or use the numeric character reference '&#x2265;'.

                                    -- Bill