From: Colin Walters
Newsgroups: gmane.emacs.devel
Subject: Re: auto-detecting encoding for XML
Date: 20 May 2002 03:04:56 -0400
Message-ID: <1021878296.16796.2926.camel@space-ghost>
References: <1021775271.29752.2282.camel@space-ghost> <200205192313.g4JNDmk24770@rum.cs.yale.edu>
Original-To: emacs-devel@gnu.org

On Sun, 2002-05-19 at 19:13, Stefan Monnier wrote:
> > * international/mule.el (auto-coding-functions): New variable.
>
> Why not extend auto-coding-regexp-alist so it can associate a regexp
> to a function (rather than a coding-system) ?

Hm.  It seems cleaner to have the function do the searching in the
first place, rather than matching against a regexp and then calling a
function which will probably have to repeat the same search.

> Or why not do what po.el does (i.e. use file-coding-system-alist) ?
> Admittedly, the file-coding-system-alist approach is pretty
> hairy/heavy-weight.

Well, it also has the disadvantage in this case that it depends on
file extensions; XML tends to be used as the underlying format for
other types of files, which use their own extensions.  So using file
names as a way to detect XML is probably a bad approach.  Just as a
random sample on my system:

~/.gconf/* contains XML files, and their extension is .xml.
/etc/oglerc is an XML file, but doesn't have an extension at all.
~/local-cvs/resume/resume.fo is an XML file.
.nautilus-metafile.xml is XML.
/foreign-cvs/cvs.gnome.org/evolution/views/mail/Messages.galview is XML.
/foreign-cvs/cvs.gnome.org/evolution/views/mail/galview.xml is XML.

So only about 50% of the XML files have an "obvious" extension like
".xml".

> In any case we should come up with some way to do those things conveniently,
> because it applies to po-mode, to sgml-mode to tex-mode and probably
> a lot more.

auto-coding-functions should be able to handle those.

> Note that these are always associated with a mode, so
> it would be good if the implementation also was mode-specific so
> that it automatically works if you open an xml file called
> foo.myxmlextension (as long as "\\.myxmlextension\\'" is in the
> auto-mode-alist).

Yes.  It's very tricky though.  We can't possibly cover all the file
name extensions that would be used for XML.  I agree that it would be
great if we had a way to associate it with a mode.  The problem with
that, though, is that by the time the major mode function is called,
the file will have already been read from disk, and the only way to
change the coding system is to reread it from disk (as I understand
things).  And doing that in a major mode function is kind of a hack.
Maybe that's the best solution, but auto-coding-functions certainly
does the trick here, and it seems to be extensible to handle the
analogous po-mode and tex-mode problems.
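
To make the content-based idea concrete, here is a rough sketch of the
kind of check such a detection function might perform on the raw,
undecoded start of a file.  It is written in Python purely for
illustration; it is not Emacs Lisp and not the patch under discussion.
The name sniff_xml_encoding, the leniency about leading whitespace, and
the UTF-8 fallback for a declaration with no encoding are all
assumptions of this sketch.

```python
import re
from typing import Optional

# An XML declaration at the start of the data: "<?xml", whitespace,
# pseudo-attributes, then "?>".  Leading whitespace is tolerated here,
# which is more lenient than strict XML (an assumption of this sketch).
_DECL = re.compile(rb"\A\s*<\?xml\s([^>]*)\?>")

# The encoding pseudo-attribute inside that declaration.
_ENCODING = re.compile(rb"\bencoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def sniff_xml_encoding(head: bytes) -> Optional[str]:
    """Guess an encoding name from the undecoded start of a file.

    Returns None when the data does not begin with an XML declaration,
    so that other detection methods can be tried instead.
    """
    decl = _DECL.match(head)
    if decl is None:
        return None
    enc = _ENCODING.search(decl.group(1))
    if enc is None:
        # Assumption: fall back to UTF-8 when no encoding is declared.
        return "utf-8"
    return enc.group(1).decode("ascii").lower()


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<oglerc/>'
    print(sniff_xml_encoding(sample))
```

Note that the file name never enters into it, so an extensionless file
like /etc/oglerc would be handled the same way as one ending in .xml.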