unofficial mirror of help-gnu-emacs@gnu.org
* Suggestions? Better filetype sniffing -- XHTML vs. HTML
From: D. D. Brierton @ 2004-02-24 14:56 UTC (permalink / raw)


I'd like Emacs to autodetect whether a file is an HTML file or an XHTML
file. The file extension alone is not enough for this, as HTML and XHTML
files tend to share the same extensions. I have to edit a lot of files
created by other people, and they are often hopelessly invalid, so I have
no hope of perfectly differentiating XHTML from HTML. However, there are
some good clues to go on:

If a file name matches one of the following:

\.inc$
\.php[34]?$
\.[sjp]?html?$

then (in my case) it is *either* HTML *or* XHTML.

If a file with one of the above extensions has very near the beginning one
or both of:

<?xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML

then it is XHTML. Otherwise it is probably just HTML.

I know how to register the file extensions with
(add-to-list 'auto-mode-alist ...), but I don't know how to also check
the first few lines of the file. Can anyone offer any suggestions?

Further details:

I use psgml, and I define two derived modes:

(define-derived-mode xml-html-mode xml-mode "XHTML"
  "This version of html mode is just a wrapper around xml mode."
  (make-local-variable 'sgml-declaration)
  (make-local-variable 'sgml-default-doctype-name)
  (setq
   sgml-default-doctype-name    "html"
   sgml-declaration             "/usr/share/sgml/xml.dcl"
   sgml-always-quote-attributes t
   sgml-indent-step             2
   sgml-indent-data             t
   sgml-minimize-attributes     nil
   sgml-omittag                 nil
   sgml-shorttag                nil))

(define-derived-mode sgml-html-mode sgml-mode "HTML"
  "This version of html mode is just a wrapper around sgml mode."
  (make-local-variable 'sgml-declaration)
  (make-local-variable 'sgml-default-doctype-name)
  (setq
   sgml-default-doctype-name    "html"
   sgml-declaration             "~/lib/DTD/html401/HTML4.decl"
   sgml-always-quote-attributes t
   sgml-indent-step             2
   sgml-indent-data             t
   sgml-minimize-attributes     nil
   sgml-omittag                 nil
   sgml-shorttag                nil))

I also have the following:

;; Files that should open in sgml-html-mode.
(add-to-list 'auto-mode-alist '("\\.inc\\'" . sgml-html-mode))
(add-to-list 'auto-mode-alist '("\\.php[34]?\\'" . sgml-html-mode))
(add-to-list 'auto-mode-alist '("\\.[sj]?html?\\'" . sgml-html-mode))

So basically I end up in sgml-html-mode whenever I open an (X)HTML file,
and if it turns out to be an XHTML file I have to switch manually with
M-x xml-html-mode.
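
One possible way to do the sniffing described above is to point
auto-mode-alist at a small dispatcher function instead of at a mode. The
following is only a minimal sketch under some assumptions: the name
my-html-or-xhtml-mode is hypothetical (it is not part of Emacs or psgml),
the 1000-character search limit is an arbitrary stand-in for "very near
the beginning", and both derived modes above are assumed to be defined
already.

```elisp
;; Hypothetical dispatcher, not part of Emacs or psgml.
(defun my-html-or-xhtml-mode ()
  "Select `xml-html-mode' or `sgml-html-mode' from the start of the buffer."
  (interactive)
  (if (save-excursion
        (save-restriction
          (widen)
          (goto-char (point-min))
          ;; Look for either XHTML clue, but only within the first
          ;; 1000 characters (an arbitrary limit chosen for this sketch).
          (re-search-forward
           "<\\?xml\\|<!DOCTYPE[ \t\n]+html[ \t\n]+PUBLIC[ \t\n]+\"-//W3C//DTD XHTML"
           (min (point-max) (+ (point-min) 1000))
           t)))
      (xml-html-mode)
    (sgml-html-mode)))

;; Same file-name patterns as before, now routed through the dispatcher.
(add-to-list 'auto-mode-alist '("\\.inc\\'" . my-html-or-xhtml-mode))
(add-to-list 'auto-mode-alist '("\\.php[34]?\\'" . my-html-or-xhtml-mode))
(add-to-list 'auto-mode-alist '("\\.[sj]?html?\\'" . my-html-or-xhtml-mode))
```

Because add-to-list puts new entries at the front of the list, these
entries would take precedence over the earlier sgml-html-mode ones for
the same patterns. Emacs 22 and later also provide magic-mode-alist,
which selects a mode from the buffer contents alone and may be worth a
look, although it is not restricted to particular file extensions.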

TIA, Darren

-- 
======================================================================
D. D. Brierton            darren@dzr-web.com           www.dzr-web.com
       Trying is the first step towards failure (Homer Simpson)
======================================================================
