* sgml-validate
@ 2011-02-05 3:48 Allan Gottlieb
2011-02-05 13:02 ` sgml-validate Andreas Röhler
[not found] ` <mailman.6.1296910957.7938.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 12+ messages in thread
From: Allan Gottlieb @ 2011-02-05 3:48 UTC (permalink / raw)
To: help-gnu-emacs
(I use gentoo)
When I tried M-x sgml-validate, it offers to run
nsgmls -s myfile.html
I don't have nsgmls so looked around and found
opensp providing onsgmls and a gentoo bug report claimed
a symlink nsgmls --> onsgmls.
Emerged opensp and now have onsgmls.
But onsgmls -s myfile.html
just hangs and uses no CPU time.
onsgmls claims it will validate "the SGML document whose document entity is
specified by the system identifiers SYSID..." so perhaps my error is in
supplying just a file name. What is needed for a valid SYSID?
thanks.
allan
* Re: sgml-validate
2011-02-05 3:48 sgml-validate Allan Gottlieb
@ 2011-02-05 13:02 ` Andreas Röhler
[not found] ` <mailman.6.1296910957.7938.help-gnu-emacs@gnu.org>
1 sibling, 0 replies; 12+ messages in thread
From: Andreas Röhler @ 2011-02-05 13:02 UTC (permalink / raw)
To: help-gnu-emacs
On 05.02.2011 04:48, Allan Gottlieb wrote:
> (I use gentoo)
>
> When I tried M-x sgml-validate, it offers to run
> nsgmls -s myfile.html
>
> I don't have nsgmls so looked around and found
> opensp providing onsgmls and a gentoo bug report claimed
> a symlink nsgmls --> onsgmls.
>
> Emerged opensp and now have onsgmls.
>
> But onsgmls -s myfile.html
> just hangs and uses no CPU time.
>
> onsgmls claims it will validate "the SGML document whose document entity is
> specified by the system identifiers SYSID..." so perhaps my error is in
> supplying just a file name. What is needed for a valid SYSID?
>
> thanks.
> allan
>
>
Hi Allan,
AFAIU that's an SGML issue rather than an Emacs one.
It seems the onsgmls command itself fails, so it should fail for the
same reasons when run from the command line outside Emacs.
BTW I use `xmllint' for validation; maybe it's OK for your purposes too.
HTH
Andreas
* Re: sgml-validate
[not found] ` <mailman.6.1296910957.7938.help-gnu-emacs@gnu.org>
@ 2011-02-13 0:39 ` William F Hammond
2011-02-13 15:29 ` sgml-validate Allan Gottlieb
[not found] ` <mailman.10.1297610953.15503.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 12+ messages in thread
From: William F Hammond @ 2011-02-13 0:39 UTC (permalink / raw)
To: help-gnu-emacs
>> When I tried M-x sgml-validate, it offers to run
>> nsgmls -s myfile.html
>>
>> I don't have nsgmls so looked around and found
>> opensp providing onsgmls and a gentoo bug report claimed
>> a symlink nsgmls --> onsgmls.
>>
>> Emerged opensp and now have onsgmls.
>>
>> But onsgmls -s myfile.html
>> just hangs and uses no CPU time.
>>
>> onsgmls claims it will validate "the SGML document whose document entity is
>> specified by the system identifiers SYSID..." so perhaps my error is in
>> supplying just a file name. What is needed for a valid SYSID?
> AFAIU that's rather an sgml-issue.
>
> Seems the command onsgmls fails, which should happen too for the very
> reasons also from the command line outside Emacs.
>
> BTW I use `xmllint' for validation, may be it's ok for your purposes too.
onsgmls is a very reliable tool. In general, however, it requires
complicated command lines.
AIUI "xmllint" works only on xml files, and, therefore, probably won't
work with most extant html files, even those self-identified as
the xml form of html, which commonly are not "well-formed xml".
(Xml well-formedness is easily and quickly checked using James Clark's
"xmlwf" that is included with "expat" distributions.)
If your html is dependably well-formed xml, then take your questions
to comp.text.xml. Otherwise maybe take them to
comp.infosystems.www.authoring.html unless you really, really want a
full understanding of the sgml background for html in which case ask
in the rather quiet group comp.text.sgml.
Furthermore, be aware that the proposal for future html5 served as
"text/html" falls outside of all classical sgml/xml validation
paradigms.
-- Bill
* Re: sgml-validate
2011-02-13 0:39 ` sgml-validate William F Hammond
@ 2011-02-13 15:29 ` Allan Gottlieb
[not found] ` <mailman.10.1297610953.15503.help-gnu-emacs@gnu.org>
1 sibling, 0 replies; 12+ messages in thread
From: Allan Gottlieb @ 2011-02-13 15:29 UTC (permalink / raw)
To: help-gnu-emacs
On Sat, Feb 12 2011, William F. Hammond wrote:
>>> When I tried M-x sgml-validate, it offers to run
>>> nsgmls -s myfile.html
>>>
>>> I don't have nsgmls so looked around and found
>>> opensp providing onsgmls and a gentoo bug report claimed
>>> a symlink nsgmls --> onsgmls.
>>>
>>> Emerged opensp and now have onsgmls.
>>>
>>> But onsgmls -s myfile.html
>>> just hangs and uses no CPU time.
>>>
>>> onsgmls claims it will validate "the SGML document whose document entity is
>>> specified by the system identifiers SYSID..." so perhaps my error is in
>>> supplying just a file name. What is needed for a valid SYSID?
>
>> AFAIU that's rather an sgml-issue.
>>
>> Seems the command onsgmls fails, which should happen too for the very
>> reasons also from the command line outside Emacs.
>>
>> BTW I use `xmllint' for validation, may be it's ok for your purposes too.
>
> onsgmls is a very reliable tool. In general, however, it requires
> complicated command lines.
Agreed. I have given up since I was not able to figure out how to have
it "not validate" and xmllint has that ability. The validation adds
minutes to the test so needs to be used sparingly.
> AIUI "xmllint" works only on xml files, and, therefore, probably won't
> work with most extant html files, even those self-identified as
> the xml form of html, which commonly are not "well-formed xml".
> (Xml well-formedness is easily and quickly checked using James Clark's
> "xmlwf" that is included with "expat" distributions.)
Thank you very much for this tip. I ran xmlwf and my html passed! This
was made possible in large part by sgml-xml-mode. I believe xmllint
also validates, but xmlwf has the plus of speed and not commenting about
entities (see below).
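
For anyone following along, here is a minimal sketch of how such a check could be wrapped in a shell function. It relies on xmlwf staying silent for a well-formed file and printing a message otherwise; the function name is_well_formed is made up for illustration and is not part of expat.

```shell
#!/bin/sh
# is_well_formed FILE (hypothetical helper name)
# Runs xmlwf on FILE and treats empty output as success, since xmlwf
# prints a message only when it finds a problem.
is_well_formed() {
    report=$(xmlwf "$1" 2>&1)
    if [ -n "$report" ]; then
        # Pass xmlwf's complaint on to the caller via stderr.
        printf '%s\n' "$report" >&2
        return 1
    fi
    return 0
}
```

It could be used as "is_well_formed myfile.html && echo OK". Testing the output rather than the exit status is a deliberate choice in this sketch, on the assumption that not every xmlwf version signals errors through its exit code.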
> If your html is dependably well-formed xml, then take your questions
> to comp.text.xml. Otherwise maybe take them to
> comp.infosystems.www.authoring.html unless you really, really want a
> full understanding of the sgml background for html in which case ask
> in the rather quiet group comp.text.sgml.
Actually I am now doing OK thanks to the help I received here.
The only annoyance is that xmllint keeps telling me that
&ge; et al are not valid entities.
> Furthermore, be aware that the proposal for future html5 served as
> "text/html" falls outside of all classical sgml/xml validation
> paradigms.
Thanks for the warning.
allan
* Re: sgml-validate
[not found] ` <mailman.10.1297610953.15503.help-gnu-emacs@gnu.org>
@ 2011-02-17 0:56 ` William F Hammond
2011-02-17 4:21 ` sgml-validate Allan Gottlieb
[not found] ` <mailman.6.1297916501.11971.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 12+ messages in thread
From: William F Hammond @ 2011-02-17 0:56 UTC (permalink / raw)
To: help-gnu-emacs
>> onsgmls is a very reliable tool. In general, however, it requires
>> complicated command lines.
>
> Agreed. I have given up since I was not able to figure out how to have
> it "not validate" and xmllint has that ability. The validation adds
> minutes to the test so needs to be used sparingly.
If it's adding minutes, then it's probably because external entities
are being loaded from across the network. With most tools there are
ways to avoid this, and one should avoid it.
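
With xmllint, for example, a rough sketch might look like the following. The wrapper names parse_only and validate_offline are invented for illustration; the options used are --noout (no serialized output), --nonet (refuse network fetches of DTDs and entities) and --valid (DTD validation). The second function assumes the DTD can be resolved locally, for instance through an XML catalog, which may not be the case on every system.

```shell
#!/bin/sh
# Hypothetical wrappers around xmllint (part of libxml2).

# Parse only: checks well-formedness, performs no DTD validation,
# and never goes out to the network.
parse_only() {
    xmllint --noout --nonet "$1"
}

# Validate against the DTD without downloading anything. This only
# works if the DTD named in the DOCTYPE is available locally.
validate_offline() {
    xmllint --noout --nonet --valid "$1"
}
```

Running parse_only routinely and validate_offline only occasionally keeps the slow step out of the normal edit cycle.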
For a non-validating parse with onsgmls use something like this shell
script:
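------
#!/bin/sh
# Interface to "onsgmls" for a non-validating XML parse.
# Takes one argument: a file name, or a base name without ".xml".

pname=$(basename "$0")

# OpenSP environment settings for XML input.
SP_CHARSET_FIXED=YES
export SP_CHARSET_FIXED
SP_ENCODING=XML
export SP_ENCODING

# Catalog that names the SGML declaration for XML (see below).
SGML_CATALOG_FILES=/usr/local/xml/pubtext/myFavoriteCatalog
export SGML_CATALOG_FILES

# Use the argument as given if it is a readable file; otherwise try
# the same name with ".xml" appended.
if [ -f "$1" ] && [ -r "$1" ] ; then
  fname="$1"
else
  fname="${1}.xml"
fi

if [ -f "$fname" ] ; then
  # -wno-valid turns off validity errors, giving a non-validating parse.
  onsgmls -wxml -wno-valid -c "$SGML_CATALOG_FILES" "$fname"
else
  echo "${pname}: Cannot find $fname" >&2
  exit 1
fi
------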
For a non-validating parse the specified catalog should contain
the line
SGMLDECL "xml.dcl"
where "xml.dcl" is James Clark's sgml declaration for xml, usually
distributed with opensp, which should be located parallel to the
catalog or else referenced in the catalog with a pathname relative to
the location of the catalog.
> The only annoyance is that xmllint keeps telling me that
> &ge; et al are not valid entities.
Correct. The only such entities that are available by default are
'&amp;' for '&', '&lt;' for '<', and '&gt;' for '>'.
Something like '&ge;' (which I imagine to be U+2265, "greater than or
equal to") can be defined for in-house use but will not work well
across the network. For publication on the network use the character
itself if the XML instance is given a suitable content-encoding, e.g.,
utf-8, or use '&#x2265;'.
-- Bill
* Re: sgml-validate
2011-02-17 0:56 ` sgml-validate William F Hammond
@ 2011-02-17 4:21 ` Allan Gottlieb
2011-02-20 17:25 ` sgml-validate Yuri Khan
[not found] ` <mailman.6.1297916501.11971.help-gnu-emacs@gnu.org>
1 sibling, 1 reply; 12+ messages in thread
From: Allan Gottlieb @ 2011-02-17 4:21 UTC (permalink / raw)
To: help-gnu-emacs
On Wed, Feb 16 2011, William F. Hammond wrote:
> For a non-validating parse with onsgmls use something like this shell
> script:
[ snipped]
> For a non-validating parse the specified catalog should contain
> the line
>
> SGMLDECL "xml.dcl"
>
> where "xml.dcl" is James Clark's sgml declaration for xml, usually
> distributed with opensp,
Thank you for all the information
>> The only annoyance is that xmllint keeps telling me that
>> &ge; et al are not valid entities.
>
> Correct. The only such entities that are available by default are
> '&amp;' for '&', '&lt;' for '<', and '&gt;' for '>'.
>
> Something like '&ge;' (which I imagine to be U+2265, "greater than or
> equal to") can be defined for in-house use but will not work well
> across the network. For publication on the network use the character
> itself if the XML instance is given a suitable content-encoding, e.g.,
> utf-8, or use '&#x2265;'.
I had hoped I would be able to put somewhere in my .html file
some version of
<entity-defs>
ge #x2265
...
</entity-defs>
But I see from your reply that this will not be possible.
Thanks again for your expert commentary.
allan
* Re: sgml-validate
[not found] ` <mailman.6.1297916501.11971.help-gnu-emacs@gnu.org>
@ 2011-02-19 0:25 ` William F Hammond
2011-02-20 13:19 ` sgml-validate Allan Gottlieb
0 siblings, 1 reply; 12+ messages in thread
From: William F Hammond @ 2011-02-19 0:25 UTC (permalink / raw)
To: help-gnu-emacs
Allan Gottlieb <gottlieb@nyu.edu> writes:
> I had hoped I would be able to put somewhere in my .html file
> some version of
> <entity-defs>
> ge #x2265
> ...
> </entity-defs>
That is essentially a macro definition.
You can put something like that in your source file that is then
auto-translated to public HTML.
-- Bill
* Re: sgml-validate
2011-02-19 0:25 ` sgml-validate William F Hammond
@ 2011-02-20 13:19 ` Allan Gottlieb
0 siblings, 0 replies; 12+ messages in thread
From: Allan Gottlieb @ 2011-02-20 13:19 UTC (permalink / raw)
To: help-gnu-emacs
On Fri, Feb 18 2011, William F. Hammond wrote:
> Allan Gottlieb <gottlieb@nyu.edu> writes:
>
>> I had hoped I would be able to put somewhere in my .html file
>> some version of
>> <entity-defs>
>> ge #x2265
>> ...
>> </entity-defs>
>
> That is essentially a macro definition.
Agreed; I was hoping that html would support such.
> You can put something like that in your source file that is then
> auto-translated to public HTML.
Right. I will write a script.
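
A minimal sketch of what such a script could look like, assuming sed and a hand-picked list of entities. The function name named_to_numeric and the particular entities shown are illustrative only; the list would need extending to cover whatever the source file actually uses.

```shell
#!/bin/sh
# named_to_numeric (hypothetical helper): read text on stdin and
# rewrite a few HTML named entities as numeric character references,
# which an XML parser accepts without any entity declarations.
# The "&" in each replacement is escaped because sed would otherwise
# expand it to the matched text.
named_to_numeric() {
    sed -e 's/&ge;/\&#x2265;/g' \
        -e 's/&le;/\&#x2264;/g' \
        -e 's/&ne;/\&#x2260;/g' \
        -e 's/&nbsp;/\&#xA0;/g'
}
```

It could be run as "named_to_numeric < source.html > public.html" (both file names are placeholders). The five entities built into XML, such as &amp;, are left untouched.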
thanks.
allan
* Re: sgml-validate
2011-02-17 4:21 ` sgml-validate Allan Gottlieb
@ 2011-02-20 17:25 ` Yuri Khan
2011-02-22 14:44 ` sgml-validate Allan Gottlieb
0 siblings, 1 reply; 12+ messages in thread
From: Yuri Khan @ 2011-02-20 17:25 UTC (permalink / raw)
To: Allan Gottlieb; +Cc: help-gnu-emacs
On Thu, Feb 17, 2011 at 10:21, Allan Gottlieb <gottlieb@nyu.edu> wrote:
> I had hoped I would be able to put somewhere in my .html file
> some version of
> <entity-defs>
> ge #x2265
> ...
> </entity-defs>
> But I see from your reply that this will not be possible.
In fact, in an XML document you should be able to specify an internal
subset in your document type declaration:
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
<!ENTITY ge "&#x2265;">
]>
Any conforming XML parser should then understand &ge; in the rest of
the document, along with the built-in &lt;, &gt;, &amp;, &quot;, and
&apos;. Whether the web browsers use conforming XML parsers is an
entirely different matter.
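
To try this out, something like the following sketch writes a small test document with an internal subset and, if xmlwf happens to be installed, checks it for well-formedness. The file name sample.xhtml and the document body are made up for illustration.

```shell
#!/bin/sh
# Write a small XHTML file whose DOCTYPE carries an internal subset
# defining the ge entity. The quoted 'EOF' keeps the shell from
# touching anything inside the here-document.
cat > sample.xhtml <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
  <!ENTITY ge "&#x2265;">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Entity test</title></head>
  <body><p>x &ge; y</p></body>
</html>
EOF

# Well-formedness check only; xmlwf prints nothing if the file is fine.
if command -v xmlwf >/dev/null 2>&1; then
    xmlwf sample.xhtml
fi
```

Additional entities go on their own <!ENTITY ...> lines between the square brackets.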
* Re: sgml-validate
2011-02-20 17:25 ` sgml-validate Yuri Khan
@ 2011-02-22 14:44 ` Allan Gottlieb
2011-02-22 20:03 ` sgml-validate Yuri Khan
0 siblings, 1 reply; 12+ messages in thread
From: Allan Gottlieb @ 2011-02-22 14:44 UTC (permalink / raw)
To: help-gnu-emacs
On Sun, Feb 20 2011, Yuri Khan wrote:
> On Thu, Feb 17, 2011 at 10:21, Allan Gottlieb <gottlieb@nyu.edu> wrote:
>
>> I had hoped I would be able to put somewhere in my .html file
>> some version of
>> <entity-defs>
>> ge #x2265
>> ...
>> </entity-defs>
>> But I see from your reply that this will not be possible.
>
> In fact, in an XML document you should be able to specify an internal
> subset in your document type declaration:
>
> <!DOCTYPE html
> PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
> <!ENTITY ge "&#x2265;">
> ]>
Thank you. I put a bunch in my file and it now validates.
(I had previously written a sed script to convert before validating.)
> Any conforming XML parser should then understand &ge; in the rest of
> the document, along with the built-in &lt;, &gt;, &amp;, &quot;, and
> &apos;. Whether the web browsers use conforming XML parsers is an
> entirely different matter.
What I can say is that firefox doesn't complain and does show the
characters correctly, but it also recognized the native &ge;, etc.,
so perhaps it is just ignoring the <!ENTITY ge "&#x2265;">.
Thanks again.
allan
* Re: sgml-validate
2011-02-22 14:44 ` sgml-validate Allan Gottlieb
@ 2011-02-22 20:03 ` Yuri Khan
2011-02-23 14:34 ` sgml-validate Allan Gottlieb
0 siblings, 1 reply; 12+ messages in thread
From: Yuri Khan @ 2011-02-22 20:03 UTC (permalink / raw)
To: help-gnu-emacs
On Tue, Feb 22, 2011 at 20:44, Allan Gottlieb <gottlieb@nyu.edu> wrote:
>> <!DOCTYPE html
>> PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
>> <!ENTITY ge "&#x2265;">
>> ]>
> What I can say is that firefox doesn't complain and does show the
> characters correctly, but it also recognized the native &ge;, etc.,
> so perhaps it is just ignoring the <!ENTITY ge "&#x2265;">.
This may happen if the browser is led to believe that your document is
an HTML one. It will therefore parse it using HTML rules, which
include the whole lot of HTML entities. Many servers serve XHTML
documents as text/html because of one widely used browser that did not
support application/xhtml+xml until its most recent version 9.
* Re: sgml-validate
2011-02-22 20:03 ` sgml-validate Yuri Khan
@ 2011-02-23 14:34 ` Allan Gottlieb
0 siblings, 0 replies; 12+ messages in thread
From: Allan Gottlieb @ 2011-02-23 14:34 UTC (permalink / raw)
To: help-gnu-emacs
On Tue, Feb 22 2011, Yuri Khan wrote:
> On Tue, Feb 22, 2011 at 20:44, Allan Gottlieb <gottlieb@nyu.edu> wrote:
>>> <!DOCTYPE html
>>> PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>>> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
>>> <!ENTITY ge "&#x2265;">
>>> ]>
>
>> What I can say is that firefox doesn't complain and does show the
>> characters correctly, but it also recognized the native &ge;, etc.,
>> so perhaps it is just ignoring the <!ENTITY ge "&#x2265;">.
>
> This may happen if the browser is led to believe that your document is
> an HTML one. It will therefore parse it using HTML rules, which
> include the whole lot of HTML entities. Many servers serve XHTML
> documents as text/html because of one widely used browser that did not
> support application/xhtml+xml until its most recent version 9.
I see. Thanks.
allan