From: "D. D. Brierton" <darren@dzr-web.com>
Subject: Suggestions? Better filetype sniffing -- XHTML vs. HTML
Date: Tue, 24 Feb 2004 14:56:38 +0000
Message-ID: <pan.2004.02.24.14.56.37.935485@dzr-web.com>

I'd like Emacs to autodetect whether a file is an HTML file or an XHTML
file. The file extension alone is not enough for this, as HTML and XHTML
files tend to share the same extensions. I have to edit a lot of files
created by other people, and they are often hopelessly invalid, so I have
no hope of perfectly differentiating XHTML from HTML. However, there are
some good clues to go on:

If a file name matches one of the following:

\.inc$
\.php[34]?$
\.[sjp]?html?$

then (in my case) it is *either* HTML *or* XHTML.

If a file with one of the above extensions has very near the beginning one
or both of:

<?xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML

then it is XHTML. Otherwise it is probably just HTML.
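
The content check described above could be sketched roughly as follows.
This is only an illustration: the function name
my-buffer-looks-like-xhtml-p and the 1000-character window standing in
for "very near the beginning" are assumptions, not anything taken from
psgml or from Emacs itself.

```elisp
;; Hypothetical helper; the name and the default LIMIT are assumptions.
(defun my-buffer-looks-like-xhtml-p (&optional limit)
  "Return non-nil if the current buffer starts like an XHTML document.
Only the first LIMIT characters (default 1000) are searched for an
XML declaration or an XHTML DOCTYPE."
  (save-excursion
    (save-restriction
      (widen)
      (goto-char (point-min))
      (let ((case-fold-search nil)
            (bound (min (point-max) (+ (point-min) (or limit 1000)))))
        ;; Returns the position after the match, or nil when neither
        ;; clue is found within BOUND.
        (re-search-forward
         "<\\?xml\\|<!DOCTYPE[ \t\n]+html[ \t\n]+PUBLIC[ \t\n]+\"-//W3C//DTD XHTML"
         bound t)))))

;; Trying it on a scratch buffer rather than a real file:
(with-temp-buffer
  (insert "<?xml version=\"1.0\"?>\n<html></html>")
  (my-buffer-looks-like-xhtml-p))   ; non-nil

(with-temp-buffer
  (insert "<html><body></body></html>")
  (my-buffer-looks-like-xhtml-p))   ; nil
```

Anything that matches neither pattern is treated as plain HTML, which
fits the "otherwise it is probably just HTML" fallback.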

I know how to add the file extensions with (add-to-list 'auto-mode-alist
...), but I don't know how to also check the first few lines of the file.
Can anyone offer any suggestions?

Further details:

I use psgml, and I define two derived modes:

(define-derived-mode xml-html-mode xml-mode "XHTML"
  "This version of html mode is just a wrapper around xml mode."
  ;; Keep the declaration and doctype name local to each buffer.
  (make-local-variable 'sgml-declaration)
  (make-local-variable 'sgml-default-doctype-name)
  (setq sgml-default-doctype-name    "html"
        sgml-declaration             "/usr/share/sgml/xml.dcl"
        sgml-always-quote-attributes t
        sgml-indent-step             2
        sgml-indent-data             t
        sgml-minimize-attributes     nil
        sgml-omittag                 nil
        sgml-shorttag                nil))

(define-derived-mode sgml-html-mode sgml-mode "HTML"
  "This version of html mode is just a wrapper around sgml mode."
  ;; Keep the declaration and doctype name local to each buffer.
  (make-local-variable 'sgml-declaration)
  (make-local-variable 'sgml-default-doctype-name)
  (setq sgml-default-doctype-name    "html"
        sgml-declaration             "~/lib/DTD/html401/HTML4.decl"
        sgml-always-quote-attributes t
        sgml-indent-step             2
        sgml-indent-data             t
        sgml-minimize-attributes     nil
        sgml-omittag                 nil
        sgml-shorttag                nil))

I also have the following:

;; Which files should invoke the new HTML mode?
(add-to-list 'auto-mode-alist '("\\.inc\\'" . sgml-html-mode))
(add-to-list 'auto-mode-alist '("\\.php[34]?\\'" . sgml-html-mode))
(add-to-list 'auto-mode-alist '("\\.[sj]?html?\\'" . sgml-html-mode))

So I basically end up in sgml-html-mode whenever I open an (X)HTML file,
and if it turns out to be an XHTML file I then have to run
M-x xml-html-mode manually.
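
One way to avoid the manual step might be to point auto-mode-alist at a
small dispatcher function that looks at the start of the buffer and then
calls whichever of the two derived modes fits. The following is a sketch
under assumptions rather than a tested setup: my-html-or-xhtml-mode is a
hypothetical name, the 1000-character window is an arbitrary choice, and
it relies on the xml-html-mode and sgml-html-mode definitions above
already being loaded.

```elisp
;; Sketch only: the function name and the 1000-character window are
;; assumptions made for illustration.
(defun my-html-or-xhtml-mode ()
  "Enter `xml-html-mode' if the buffer starts like XHTML, else `sgml-html-mode'."
  (interactive)
  (if (save-excursion
        (goto-char (point-min))
        (let ((case-fold-search nil))
          ;; Look for either clue near the beginning of the buffer.
          (re-search-forward
           "<\\?xml\\|<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML"
           (min (point-max) (+ (point-min) 1000)) t)))
      (xml-html-mode)
    (sgml-html-mode)))

;; Point the same extensions at the dispatcher instead of sgml-html-mode.
;; `add-to-list' prepends, so these entries are found before older ones.
(add-to-list 'auto-mode-alist '("\\.inc\\'" . my-html-or-xhtml-mode))
(add-to-list 'auto-mode-alist '("\\.php[34]?\\'" . my-html-or-xhtml-mode))
(add-to-list 'auto-mode-alist '("\\.[sj]?html?\\'" . my-html-or-xhtml-mode))
```

Because the dispatcher is itself called like a major mode command, the
extension matching stays in auto-mode-alist and only the HTML versus
XHTML decision moves into the function.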

TIA, Darren

-- 
======================================================================
D. D. Brierton            darren@dzr-web.com           www.dzr-web.com
       Trying is the first step towards failure (Homer Simpson)
======================================================================
