From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ralf Angeli
Newsgroups: gmane.emacs.devel
Subject: Re: [angeli@iwi.uni-sb.de: Coding problem with Euro sign]
Date: Thu, 15 Dec 2005 17:20:09 +0100
Original-To: emacs-devel@gnu.org
User-Agent: Gnus/5.110004 (No Gnus v0.4) Emacs/22.0.50 (gnu/linux)
Xref: news.gmane.org gmane.emacs.devel:47799

* Kevin Rodgers (2005-12-15) writes:

> Ralf Angeli wrote:
>> * Kevin Rodgers (2005-12-14) writes:
>>>I think the OP is confused:
>>
>> Was confused.  That was cleared up on emacs-pretest-bug.
>
> Good!  I hope you didn't take offense at my remark.

Oh well ... something like that was to be expected as my knowledge
about coding systems is only improving slowly. (c:

>>>And the OP should try visiting the file with the cp1252 coding system.
>>
>> Well, the question now is if it is possible for Emacs to figure out
>> the coding system on itself with the example at hand.
>
> You could try something like this:
>
> (setq auto-coding-regexp-alist
>       (cons '("[\040-\177][\200-\237]" . cp1252)
>             auto-coding-regexp-alist))
>
> I don't think that's a general purpose solution since (1)
> auto-coding-regexp-alist actually has precedence over `-*-coding:-*-'
> file variables and (2) other encodings probably use those o200 - o237
> bytes (certainly other Microsoft Windows code pages do).

This doesn't seem to work here.
I still see the byte codes of the 8-bit characters when opening the
file after evaluating the above form.  And a customization is actually
not what I am interested in; I'd like Emacs to figure this out by
itself, out of the box.

I am not sure how common a case like the one at hand is, but it is
certainly not academic.  And if one is working with different
operating systems, or exchanging files with people working on
different operating systems, the failure to detect the correct coding
system could lead people to regard Emacs as a truly inferior piece
of software.  I can already hear them: "What?  It displays the Euro
sign as \200?  Even Notepad gets this right!"  On these grounds it may
become a bit hard to convince people that Emacs is the one true
editor.

Anyway, I tested a bit, and under Windows (surprise) every application
I tried (e.g. Notepad and OpenOffice) managed to display the file
correctly.  On GNU/Linux no application got it right.  I checked with
less, more, vim, nano, pico, and OpenOffice.  Either "garbage" was
displayed or (in the case of OpenOffice) a dialog appeared asking the
user to specify the encoding.  So it's not like Emacs isn't in good
company.

Nevertheless it would be nice if Emacs got it right.  Unfortunately I
lack the knowledge to judge whether this is possible at all without
resorting to all sorts of unreliable heuristics which are costly to
implement.

-- 
Ralf
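
To make the heuristic in question a bit more concrete, here is a rough
sketch in Python.  It is not how Emacs detects coding systems; it only
illustrates the idea behind the quoted regexp.  Bytes in the octal
range 200-237 (0x80-0x9F) are control codes in ISO-8859-1 but mostly
printable characters in cp1252, so their presence hints at cp1252.
The function name `guess_encoding` and the candidate order are
assumptions made for this example.

```python
def guess_encoding(data: bytes) -> str:
    """Guess among ascii, utf-8, cp1252 and latin-1 (illustrative only)."""
    # Pure 7-bit data needs no guessing.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass

    # Valid UTF-8 is unlikely to occur by accident, so try it next.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Bytes 0x80-0x9F (octal 200-237) are C1 control codes in
    # ISO-8859-1 but mostly printable in cp1252 (0x80 is the Euro sign).
    if any(0x80 <= b <= 0x9F for b in data):
        try:
            data.decode("cp1252")
            return "cp1252"
        except UnicodeDecodeError:
            # A few byte values (e.g. 0x81) are unassigned in cp1252.
            pass

    # Fallback: latin-1 can decode any byte sequence.
    return "latin-1"


if __name__ == "__main__":
    sample = b"Price: 10 \x80"
    encoding = guess_encoding(sample)
    print(encoding, repr(sample.decode(encoding)))
```

As Kevin's point (2) suggests, this remains a guess: other Windows
code pages use the same byte range, so a cp1250 or cp1251 file would
be reported as cp1252 here as well.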