unofficial mirror of emacs-devel@gnu.org 
From: Colin Walters <walters@gnu.org>
Subject: Re: auto-detecting encoding for XML
Date: 20 May 2002 03:04:56 -0400
Message-ID: <1021878296.16796.2926.camel@space-ghost>
In-Reply-To: <200205192313.g4JNDmk24770@rum.cs.yale.edu>

On Sun, 2002-05-19 at 19:13, Stefan Monnier wrote:
> > 	* international/mule.el (auto-coding-functions): New variable.
> 
> Why not extend auto-coding-regexp-alist so it can associate a regexp
> to a function (rather than a coding-system) ?

Hm.  It seems cleaner to just have the function do the searching in the
first place, rather than matching against a regexp and then calling a
function which will probably have to do the same searching again...
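
To make the "function does the searching itself" idea concrete, here is a
minimal sketch in Python.  It is not the Emacs Lisp from the patch; the name
xml_declared_encoding and both patterns are assumptions made purely for
illustration.  The function looks at the head of the undecoded data for an
XML declaration and returns the declared encoding, or None when the data
does not look like XML.

```python
import re
from typing import Optional

# Hypothetical patterns, for illustration only.  The first matches an XML
# declaration at the very start of the data (leading whitespace tolerated);
# the second picks out its encoding pseudo-attribute.
_XML_DECL = re.compile(rb"\A\s*<\?xml\s[^>]*\?>")
_ENCODING = re.compile(rb"encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def xml_declared_encoding(head: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None.

    None means "this does not look like XML", so a caller can fall
    through to another detector.  A declaration without an encoding
    pseudo-attribute yields "utf-8", the XML default when no byte-order
    mark is present.  Only ASCII-compatible encodings are handled; a
    UTF-16 file would not match these byte patterns.
    """
    decl = _XML_DECL.search(head)
    if decl is None:
        return None
    enc = _ENCODING.search(decl.group(0))
    if enc is None:
        return "utf-8"
    return enc.group(1).decode("ascii").lower()
```

A single search both decides whether the data is XML and extracts the
encoding, which is the duplication the regexp-then-function design would
introduce.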

> Or why not do what po.el does (i.e. use file-coding-system-alist) ?
> Admittedly, the file-coding-system-alist approach is pretty
> hairy/heavy-weight.

Well, it also has the disadvantage in this case that it depends on file
extensions; XML tends to be used as an encoding for other types of
files, which use their own extension.  So using file names as a way to
detect XML is probably a bad approach.

Just as a random sample on my system:

~/.gconf/* contains XML files, and their extension is .xml.
/etc/oglerc is an XML file, but doesn't have an extension at all.
~/local-cvs/resume/resume.fo is an XML file.
.nautilus-metafile.xml is XML.
/foreign-cvs/cvs.gnome.org/evolution/views/mail/Messages.galview is XML.
/foreign-cvs/cvs.gnome.org/evolution/views/mail/galview.xml is XML.

So only about 50% of the XML files have an "obvious" extension like
".xml".

> In any case we should come up with some way to do those things conveniently,
> because it applies to po-mode, to sgml-mode to tex-mode and probably
> a lot more.

auto-coding-functions should be able to handle those.

> Note that these are always associated with a mode, so
> it would be good if the implementation also was mode-specific so
> that it automatically works if you open an xml file called
> foo.myxmlextension (as long as "\\.myxmlextension\\'" is in the
> auto-mode-alist).

Yes.  It's very tricky though.  We can't possibly cover all the file
name extensions that would be used for XML.  I agree that it would be
great if we had a way to associate it with a mode.  The problem with
that though is that by the time the major mode function is called, the
file will have already been read from disk, and the only way to change
the coding system is to reread it from disk (as I understand things). 
And doing that in a major mode function is kind of a hack.  Maybe that's
the best solution, but auto-coding-functions certainly does the trick
here, and it seems to be extensible to handle the analogous po-mode and
tex-mode problems.
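
As a rough illustration of why a list of functions extends to the po-mode
and tex-mode cases, here is a sketch in Python (again not the Emacs Lisp in
question).  All names in it (AUTO_CODING_FUNCTIONS, xml_detector,
po_detector, detect_coding, decode_contents) and the patterns are
hypothetical.  Each detector inspects the undecoded head of the data, the
first one to name a known encoding wins, and the data is decoded only once,
so nothing needs to be reread.

```python
import codecs
import re
from typing import Callable, List, Optional

# A detector takes the undecoded head of a file and returns an encoding
# name, or None if it has no opinion.
Detector = Callable[[bytes], Optional[str]]


def xml_detector(head: bytes) -> Optional[str]:
    # Hypothetical pattern: encoding pseudo-attribute of an XML declaration.
    m = re.match(rb"\s*<\?xml\s[^>]*encoding\s*=\s*[\"']([\w.-]+)[\"']", head)
    return m.group(1).decode("ascii") if m else None


def po_detector(head: bytes) -> Optional[str]:
    # Hypothetical pattern: charset in a PO file's Content-Type header line.
    m = re.search(rb"Content-Type:[^\n]*charset=([\w.-]+)", head)
    return m.group(1).decode("ascii") if m else None


# Adding support for another format means appending one more function.
AUTO_CODING_FUNCTIONS: List[Detector] = [xml_detector, po_detector]


def detect_coding(head: bytes, default: str = "utf-8") -> str:
    """Return the first known encoding a detector reports, else DEFAULT."""
    for detect in AUTO_CODING_FUNCTIONS:
        name = detect(head)
        if name is None:
            continue
        try:
            return codecs.lookup(name).name
        except LookupError:
            # Unknown encoding name: let the next detector try.
            continue
    return default


def decode_contents(raw: bytes, head_size: int = 1024) -> str:
    """Pick the encoding from the head, then decode the data once."""
    return raw.decode(detect_coding(raw[:head_size]))
```

Because detection runs on the raw bytes before any decoding, it does not
depend on the file name or on which major mode is later selected.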
