unofficial mirror of emacs-devel@gnu.org 
* [user42@zip.com.au: html-coding.el -- coding system from meta tag]
From: Richard M. Stallman @ 2005-07-20 22:08 UTC


Could people who know more than I about HTML specifications please
look at this, and tell me whether they think it is good to add to Emacs?

------- Start of forwarded message -------
From: Kevin Ryde <user42@zip.com.au>
To: gnu-emacs-sources@gnu.org
Organization: Bah Humbug
Date: Wed, 20 Jul 2005 10:47:38 +1000
Subject: html-coding.el -- coding system from meta tag
Sender: gnu-emacs-sources-bounces+rms=gnu.org@gnu.org


This is a little spot of code for getting the coding system from the
meta tag when visiting an HTML file.

The Emacs CVS head already has this feature, so this code is only for
Emacs 21.

I'd be surprised if something like this isn't already in some or most
of the heavy-duty HTML/SGML editing and viewing packages, though I
couldn't find the right bits on cursory inspection.  In any case, all I
wanted was to see the right chars in a plain old find-file of some
random HTML.



;;; html-coding.el --- coding system from meta tag when visiting html files.

;; Copyright 2005 Kevin Ryde
;;
;; html-coding.el is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by the
;; Free Software Foundation; either version 2, or (at your option) any later
;; version.
;;
;; html-coding.el is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
;; Public License for more details.
;;
;; You can get a copy of the GNU General Public License online at
;; http://www.gnu.org/licenses/gpl.txt, or you should have one in the file
;; COPYING which comes with GNU Emacs and other GNU programs.  Failing that,
;; write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
;; Boston, MA 02111-1307, USA.


;;; Commentary:

;; This is a spot of code for getting the coding system from an HTML <meta>
;; tag when visiting a .html, .shtml or .htm file.  mm-util.el (from Gnus)
;; is used to map a MIME charset name in the HTML to an Emacs coding system.
;;
;; This code is designed for Emacs 21.  The Emacs cvs head (which will be
;; Emacs 22 or whatever) already has this feature (in
;; sgml-html-meta-auto-coding-function), so nothing is done there.

;; If you have a file with a slightly bogus charset name, like "iso8859-1"
;; where it should be "iso-8859-1", you can map to the right one in
;; `mm-charset-synonym-alist', like
;;
;;     (eval-after-load "mm-util"
;;       '(add-to-list 'mm-charset-synonym-alist '(iso8859-1 . iso-8859-1)))
;;
;; But note that the mm-util.el which comes with Emacs 21.4a has a bug that
;; stops this working.  The test (mm-coding-system-p charset) should be
;; (mm-coding-system-p cs), ie. validate the mapped good name, not the bad
;; one.  You can make that change, or it's fixed in the separately packaged
;; Gnus.


;;; Install:

;; Put html-coding.el somewhere in your `load-path', and in your .emacs put
;;
;;     (require 'html-coding)

;;; History:

;; Version 1 - the first version.


;;; Code:

;; Emacs 22's `sgml-html-meta-auto-coding-function' already performs this
;; coding system determination, so skip our code in that case.
;;
(unless (fboundp 'sgml-html-meta-auto-coding-function)

  (defun html-coding-system (args)
    "Return the coding system for reading a HTML file, based on the <meta> tag.
If there's no charset in the file, this function checks what other rules say.

This function is for use in `file-coding-system-alist', the ARGS parameter
is a list, the only form handled here is `(insert-file-contents ...)'."
    (or (and (eq (car args) 'insert-file-contents)
             (file-exists-p (cadr args))
             (with-temp-buffer
               (insert-file-contents-literally (cadr args))
               (and (re-search-forward "<meta\\s-[^>]*charset=\\([^\">]+\\)"
                                       ;; first 10 lines, like emacs 22
                                       (save-excursion (forward-line 10)
                                                       (point))
                                       t)
                    (let ((charset (match-string 1)))
                      (require 'mm-util)
                      (or (mm-charset-to-coding-system charset)
                          (progn
                            (message "Unrecognised HTML MIME charset: %s"
                                     charset)
                            nil))))))
        (progn
          (require 'cl)
          (let ((file-coding-system-alist
                 (remove* 'html-coding-system file-coding-system-alist
                          :key 'cdr)))
            (apply 'find-operation-coding-system args)))))

  (modify-coding-system-alist 'file "\\.\\(html\\|shtml\\|htm\\)\\'"
                              'html-coding-system))

(provide 'html-coding)

;;; html-coding.el ends here

------- End of forwarded message -------
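
For readers evaluating the approach, the two core steps of html-coding.el
can be tried in isolation.  The following is a minimal sketch, not part of
Kevin Ryde's file.  The names `my-html-meta-charset' and
`my-alist-without-handler' are hypothetical, and `cl-remove' from cl-lib is
assumed here as the current equivalent of the `remove*' call that the
attachment uses on Emacs 21.  The first function applies the same regexp and
10-line limit to a string rather than a file; the second shows the filtering
that keeps the fallback lookup from calling `html-coding-system' again.

```elisp
(require 'cl-lib)

(defun my-html-meta-charset (html)
  "Return the charset named in a <meta> tag of the string HTML, or nil.
Only the first 10 lines are searched, as in html-coding.el.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    ;; Match case-insensitively, so <META ... CHARSET=...> is found too.
    (let ((case-fold-search t))
      (when (re-search-forward "<meta\\s-[^>]*charset=\\([^\">]+\\)"
                               ;; Search bound: start of the 11th line.
                               (save-excursion (forward-line 10)
                                               (point))
                               t)
        (match-string 1)))))

(defun my-alist-without-handler (handler alist)
  "Return a copy of ALIST without the entries whose cdr is HANDLER.
This mirrors how the fallback branch drops its own entry from
`file-coding-system-alist' before asking for the default answer.
Hypothetical helper, for illustration only."
  (cl-remove handler alist :key #'cdr))
```

Calling `my-html-meta-charset' on a header that contains
content="text/html; charset=iso-8859-1" should give the string
"iso-8859-1", which html-coding.el then hands to
`mm-charset-to-coding-system'.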


* Re: [user42@zip.com.au: html-coding.el -- coding system from meta tag]
From: Arne Jørgensen @ 2005-07-20 22:36 UTC


"Richard M. Stallman" <rms@gnu.org> writes:

> Could people who know more than I about HTML specifications please
> look at this, and tell me whether they think it is good to add to Emacs?

[...]

> From: Kevin Ryde <user42@zip.com.au>   

[...]

> ;; This code is designed for Emacs 21.  The Emacs cvs head (which will be
> ;; Emacs 22 or whatever) already has this feature (in
> ;; sgml-html-meta-auto-coding-function), so nothing is done there.

No need to add it to Emacs 22.

Kind regards,
-- 
Arne Jørgensen <http://arnested.dk/>
