* sgml-validate
From: Allan Gottlieb @ 2011-02-05 3:48 UTC
To: help-gnu-emacs

(I use gentoo)

When I tried M-x sgml-validate, it offers to run
    nsgmls -s myfile.html

I don't have nsgmls so looked around and found
opensp providing onsgmls and a gentoo bug report claimed
a symlink nsgmls --> onsgmls.

Emerged opensp and now have onsgmls.

But onsgmls -s myfile.html
just hangs and uses no CPU time.

onsgmls claims it will validate "the SGML document whose document entity is
specified by the system identifiers SYSID..." so perhaps my error is in
supplying just a file name.  What is needed for a valid SYSID?

thanks.
allan

* Re: sgml-validate
From: Andreas Röhler @ 2011-02-05 13:02 UTC
To: help-gnu-emacs

On 05.02.2011 04:48, Allan Gottlieb wrote:
> (I use gentoo)
>
> When I tried M-x sgml-validate, it offers to run
>     nsgmls -s myfile.html
>
> I don't have nsgmls so looked around and found
> opensp providing onsgmls and a gentoo bug report claimed
> a symlink nsgmls --> onsgmls.
>
> Emerged opensp and now have onsgmls.
>
> But onsgmls -s myfile.html
> just hangs and uses no CPU time.
>
> onsgmls claims it will validate "the SGML document whose document entity is
> specified by the system identifiers SYSID..." so perhaps my error is in
> supplying just a file name.  What is needed for a valid SYSID?
>
> thanks.
> allan

Hi Allan,

AFAIU that's rather an sgml-issue.

Seems the command onsgmls fails, which should happen too for the very
reasons also from the command line outside Emacs.

BTW I use `xmllint' for validation, maybe it's OK for your purposes too.

HTH

Andreas

* Re: sgml-validate
From: William F Hammond @ 2011-02-13 0:39 UTC
To: help-gnu-emacs

>> When I tried M-x sgml-validate, it offers to run
>>     nsgmls -s myfile.html
>>
>> I don't have nsgmls so looked around and found
>> opensp providing onsgmls and a gentoo bug report claimed
>> a symlink nsgmls --> onsgmls.
>>
>> Emerged opensp and now have onsgmls.
>>
>> But onsgmls -s myfile.html
>> just hangs and uses no CPU time.
>>
>> onsgmls claims it will validate "the SGML document whose document entity is
>> specified by the system identifiers SYSID..." so perhaps my error is in
>> supplying just a file name.  What is needed for a valid SYSID?

> AFAIU that's rather an sgml-issue.
>
> Seems the command onsgmls fails, which should happen too for the very
> reasons also from the command line outside Emacs.
>
> BTW I use `xmllint' for validation, maybe it's OK for your purposes too.

onsgmls is a very reliable tool.  In general, however, it requires
complicated command lines.

AIUI "xmllint" works only on xml files, and, therefore, probably won't
work with most extant html files, even those self-identified as
the xml form of html, which commonly are not "well-formed xml".
(Xml well-formedness is easily and quickly checked using James Clark's
"xmlwf" that is included with "expat" distributions.)

If your html is dependably well-formed xml, then take your questions
to comp.text.xml.  Otherwise maybe take them to
comp.infosystems.www.authoring.html unless you really, really want a
full understanding of the sgml background for html in which case ask
in the rather quiet group comp.text.sgml.

Furthermore, be aware that the proposal for future html5 served as
"text/html" falls outside of all classical sgml/xml validation
paradigms.

                                    -- Bill

* Re: sgml-validate
From: Allan Gottlieb @ 2011-02-13 15:29 UTC
To: help-gnu-emacs

On Sat, Feb 12 2011, William F. Hammond wrote:

>>> When I tried M-x sgml-validate, it offers to run
>>>     nsgmls -s myfile.html
>>>
>>> I don't have nsgmls so looked around and found
>>> opensp providing onsgmls and a gentoo bug report claimed
>>> a symlink nsgmls --> onsgmls.
>>>
>>> Emerged opensp and now have onsgmls.
>>>
>>> But onsgmls -s myfile.html
>>> just hangs and uses no CPU time.
>>>
>>> onsgmls claims it will validate "the SGML document whose document entity is
>>> specified by the system identifiers SYSID..." so perhaps my error is in
>>> supplying just a file name.  What is needed for a valid SYSID?
>
>> AFAIU that's rather an sgml-issue.
>>
>> Seems the command onsgmls fails, which should happen too for the very
>> reasons also from the command line outside Emacs.
>>
>> BTW I use `xmllint' for validation, maybe it's OK for your purposes too.
>
> onsgmls is a very reliable tool.  In general, however, it requires
> complicated command lines.

Agreed.  I have given up since I was not able to figure out how to have
it "not validate" and xmllint has that ability.  The validation adds
minutes to the test so needs to be used sparingly.

> AIUI "xmllint" works only on xml files, and, therefore, probably won't
> work with most extant html files, even those self-identified as
> the xml form of html, which commonly are not "well-formed xml".
> (Xml well-formedness is easily and quickly checked using James Clark's
> "xmlwf" that is included with "expat" distributions.)

Thank you very much for this tip.  I ran xmlwf and my html passed!
This was made possible in large part by sgml-xml-mode.

I believe xmllint also validates, but xmlwf has the plus of speed and
not commenting about entities (see below).

> If your html is dependably well-formed xml, then take your questions
> to comp.text.xml.  Otherwise maybe take them to
> comp.infosystems.www.authoring.html unless you really, really want a
> full understanding of the sgml background for html in which case ask
> in the rather quiet group comp.text.sgml.

Actually I am now doing OK thanks to the help I received here.
The only annoyance is that xmllint keeps telling me that
&ge; et al are not valid entities.

> Furthermore, be aware that the proposal for future html5 served as
> "text/html" falls outside of all classical sgml/xml validation
> paradigms.

Thanks for the warning.

allan

* Re: sgml-validate
From: William F Hammond @ 2011-02-17 0:56 UTC
To: help-gnu-emacs

>> onsgmls is a very reliable tool.  In general, however, it requires
>> complicated command lines.
>
> Agreed.  I have given up since I was not able to figure out how to have
> it "not validate" and xmllint has that ability.  The validation adds
> minutes to the test so needs to be used sparingly.

If it's adding minutes, then it's probably because external entities
are being loaded from across the network.  With most tools there are
ways to avoid this, and one should avoid it.

For a non-validating parse with onsgmls use something like this shell
script:

------
#!/bin/sh
# interface to "onsgmls" for non-validating XML parse
pname=`basename "$0"`
SP_CHARSET_FIXED=YES
export SP_CHARSET_FIXED
SP_ENCODING=XML
export SP_ENCODING
SGML_CATALOG_FILES=/usr/local/xml/pubtext/myFavoriteCatalog
export SGML_CATALOG_FILES
if [ -f "$1" -a -r "$1" ] ; then
  fname="$1"
else
  fname="${1}.xml"
fi
if [ -f "$fname" ] ; then
  onsgmls -wxml -wno-valid -c "$SGML_CATALOG_FILES" "$fname"
else
  echo "${pname}: Cannot find $fname"
fi
------

For a non-validating parse the specified catalog should contain
the line

  SGMLDECL "xml.dcl"

where "xml.dcl" is James Clark's sgml declaration for xml, usually
distributed with opensp, which should be located parallel to the
catalog or else referenced in the catalog with a pathname relative
to the location of the catalog.

> The only annoyance is that xmllint keeps telling me that
> &ge; et al are not valid entities.

Correct.  The only such entities that are available by default are
'&amp;' for '&', '&lt;' for '<', and '&gt;' for '>'.

Something like '&ge;' (which I imagine to be U-2265, "greater than or
equal to") can be defined for in-house use but will not work well
across the network.  For publication on the network use the character
itself if the XML instance is given a suitable content-encoding, e.g.,
utf-8, or use '&#x2265;'.

                                    -- Bill
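
A catalog of the kind described above might be generated as in the sketch below. Only the `SGMLDECL "xml.dcl"` line comes from the message; the function name `write_catalog`, the file name `catalog`, and the local DTD copy `xhtml1-strict.dtd` are hypothetical. The `PUBLIC` entry is one way to keep the parser from fetching the DTD over the network, assuming a local copy sits next to the catalog.

```shell
#!/bin/sh
# Sketch: write a small catalog for non-validating XML parses with onsgmls.
# All file and directory names here are illustrative placeholders.
write_catalog() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # xml.dcl and the DTD copy are expected to live in the same directory,
    # so both can be referenced relative to the catalog.
    cat > "$dir/catalog" <<'EOF'
SGMLDECL "xml.dcl"
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xhtml1-strict.dtd"
EOF
}
```

Calling `write_catalog /some/dir` and pointing `SGML_CATALOG_FILES` at `/some/dir/catalog` would match the script in the message. The XHTML DTD also refers to separate entity files, which would need similar `PUBLIC` entries if they are to be resolved locally.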

* Re: sgml-validate
From: Allan Gottlieb @ 2011-02-17 4:21 UTC
To: help-gnu-emacs

On Wed, Feb 16 2011, William F. Hammond wrote:

> For a non-validating parse with onsgmls use something like this shell
> script:

[ snipped ]

> For a non-validating parse the specified catalog should contain
> the line
>
>   SGMLDECL "xml.dcl"
>
> where "xml.dcl" is James Clark's sgml declaration for xml, usually
> distributed with opensp,

Thank you for all the information.

>> The only annoyance is that xmllint keeps telling me that
>> &ge; et al are not valid entities.
>
> Correct.  The only such entities that are available by default are
> '&amp;' for '&', '&lt;' for '<', and '&gt;' for '>'.
>
> Something like '&ge;' (which I imagine to be U-2265, "greater than or
> equal to") can be defined for in-house use but will not work well
> across the network.  For publication on the network use the character
> itself if the XML instance is given a suitable content-encoding, e.g.,
> utf-8, or use '&#x2265;'.

I had hoped I would be able to put somewhere in my .html file
some version of
   <entity-defs>
     ge #x2265
     ...
   </entity-defs>
But I see from your reply that this will not be possible.

Thanks again for your expert commentary.

allan

* Re: sgml-validate
From: Yuri Khan @ 2011-02-20 17:25 UTC
To: Allan Gottlieb; +Cc: help-gnu-emacs

On Thu, Feb 17, 2011 at 10:21, Allan Gottlieb <gottlieb@nyu.edu> wrote:

> I had hoped I would be able to put somewhere in my .html file
> some version of
>   <entity-defs>
>     ge #x2265
>     ...
>   </entity-defs>
> But I see from your reply that this will not be possible.

In fact, in an XML document you should be able to specify an internal
subset in your document type declaration:

<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
  <!ENTITY ge "&#x2265;">
]>

Any conforming XML parser should then understand &ge; in the rest of
the document, along with the built-in &lt;, &gt;, &amp;, &quot;, and
&apos;. Whether the web browsers use conforming XML parsers is an
entirely different matter.
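
When more than a handful of entities are needed, the declarations for the internal subset could be generated rather than typed. The following is a minimal sketch, not part of the original advice: the function name `entity_decls` and the "name hex-code-point" input format are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: read "name hex-code-point" pairs on stdin and print the matching
# <!ENTITY ...> declarations for pasting into a DOCTYPE internal subset.
entity_decls() {
    while read -r name hex; do
        # ignore blank lines
        [ -n "$name" ] || continue
        printf '<!ENTITY %s "&#x%s;">\n' "$name" "$hex"
    done
}
```

For instance, `printf 'ge 2265\nle 2264\n' | entity_decls` prints one declaration for `ge` and one for `le`, which then go between the `[` and `]>` of the document type declaration.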

* Re: sgml-validate
From: Allan Gottlieb @ 2011-02-22 14:44 UTC
To: help-gnu-emacs

On Sun, Feb 20 2011, Yuri Khan wrote:

> On Thu, Feb 17, 2011 at 10:21, Allan Gottlieb <gottlieb@nyu.edu> wrote:
>
>> I had hoped I would be able to put somewhere in my .html file
>> some version of
>>   <entity-defs>
>>     ge #x2265
>>     ...
>>   </entity-defs>
>> But I see from your reply that this will not be possible.
>
> In fact, in an XML document you should be able to specify an internal
> subset in your document type declaration:
>
> <!DOCTYPE html
>   PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
>   <!ENTITY ge "&#x2265;">
> ]>

Thank you.  I put a bunch in my file and it now validates (I had
previously written a sed script to convert before validating).

> Any conforming XML parser should then understand &ge; in the rest of
> the document, along with the built-in &lt;, &gt;, &amp;, &quot;, and
> &apos;. Whether the web browsers use conforming XML parsers is an
> entirely different matter.

What I can say is that firefox doesn't complain and does show the
characters correctly, but it also recognized the native &ge;, etc.,
so perhaps it is just ignoring the <!ENTITY ge "&#x2265;">.

Thanks again.

allan

* Re: sgml-validate
From: Yuri Khan @ 2011-02-22 20:03 UTC
To: help-gnu-emacs

On Tue, Feb 22, 2011 at 20:44, Allan Gottlieb <gottlieb@nyu.edu> wrote:

>> <!DOCTYPE html
>>   PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>>   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
>>   <!ENTITY ge "&#x2265;">
>> ]>

> What I can say is that firefox doesn't complain and does show the
> characters correctly, but it also recognized the native &ge;, etc.,
> so perhaps it is just ignoring the <!ENTITY ge "&#x2265;">.

This may happen if the browser is led to believe that your document is
an HTML one. It will therefore parse it using HTML rules, which
include the whole lot of HTML entities. Many servers serve XHTML
documents as text/html because of one widely used browser that did not
support application/xhtml+xml until its most recent version 9.
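
One way to see which of the two cases applies is to look at the Content-Type header the server sends. The sketch below assumes HTTP response headers arrive on stdin (for example from `curl -sI`); the function name `media_type` is made up for this illustration.

```shell
#!/bin/sh
# Sketch: print the media type from the Content-Type header(s) on stdin,
# dropping parameters such as "; charset=utf-8".
media_type() {
    tr -d '\r' |
    awk -F': *' 'tolower($1) == "content-type" { split($2, a, ";"); print a[1] }'
}
```

A call such as `curl -sI http://example.org/page.html | media_type` (the URL is a placeholder) would then show `text/html` for a page parsed under HTML rules, or `application/xhtml+xml` for one handed to the browser's XML parser.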

* Re: sgml-validate
From: Allan Gottlieb @ 2011-02-23 14:34 UTC
To: help-gnu-emacs

On Tue, Feb 22 2011, Yuri Khan wrote:

> On Tue, Feb 22, 2011 at 20:44, Allan Gottlieb <gottlieb@nyu.edu> wrote:
>>> <!DOCTYPE html
>>>   PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>>>   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
>>>   <!ENTITY ge "&#x2265;">
>>> ]>
>
>> What I can say is that firefox doesn't complain and does show the
>> characters correctly, but it also recognized the native &ge;, etc.,
>> so perhaps it is just ignoring the <!ENTITY ge "&#x2265;">.
>
> This may happen if the browser is led to believe that your document is
> an HTML one. It will therefore parse it using HTML rules, which
> include the whole lot of HTML entities. Many servers serve XHTML
> documents as text/html because of one widely used browser that did not
> support application/xhtml+xml until its most recent version 9.

I see.  Thanks.

allan

* Re: sgml-validate
From: William F Hammond @ 2011-02-19 0:25 UTC
To: help-gnu-emacs

Allan Gottlieb <gottlieb@nyu.edu> writes:

> I had hoped I would be able to put somewhere in my .html file
> some version of
>    <entity-defs>
>      ge #x2265
>      ...
>    </entity-defs>

That is essentially a macro definition.

You can put something like that in your source file that is then
auto-translated to public HTML.

                                    -- Bill

* Re: sgml-validate
From: Allan Gottlieb @ 2011-02-20 13:19 UTC
To: help-gnu-emacs

On Fri, Feb 18 2011, William F. Hammond wrote:

> Allan Gottlieb <gottlieb@nyu.edu> writes:
>
>> I had hoped I would be able to put somewhere in my .html file
>> some version of
>>    <entity-defs>
>>      ge #x2265
>>      ...
>>    </entity-defs>
>
> That is essentially a macro definition.

Agreed; I was hoping that html would support such.

> You can put something like that in your source file that is then
> auto-translated to public HTML.

Right.  I will write a script.

thanks.
allan
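
A translation script of the kind discussed here could be as small as a sed filter that rewrites named entities into numeric character references, which any XML parser accepts without extra declarations. This is only a sketch under assumptions: the function name `named_to_numeric` and the three entities chosen are illustrative, and it is not the script actually used in the thread.

```shell
#!/bin/sh
# Sketch: replace a few named entities with numeric character references.
# In the replacement text a literal "&" must be escaped, because sed
# otherwise expands "&" to the matched text.
named_to_numeric() {
    sed -e 's/&ge;/\&#x2265;/g' \
        -e 's/&le;/\&#x2264;/g' \
        -e 's/&ne;/\&#x2260;/g'
}
```

Usage would be along the lines of `named_to_numeric < source.html > public.html`, with both file names being placeholders. Entities not listed, including the built-in `&amp;`, pass through unchanged.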