From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: Auto-detection of windows-1252 fails
Date: Wed, 09 Jan 2008 15:33:18 +0900
To: emacs-devel@gnu.org
Cc: rms@gnu.org, reinersteib+gmane@imap.cc
In-reply-to: (message from Richard Stallman on Sun, 06 Jan 2008 03:09:03 -0500)

In article , Richard Stallman writes:

> Can you please DTRT on this, and ack?
[...]
> From: Reiner Steib
> Date: Sat, 05 Jan 2008 14:22:37 +0100
> Subject: Auto-detection of windows-1252 fails
[...]
> in September/October 2006 we had a long thread on emacs-pretest-bugs
> about auto-detection of windows-1252 text files:
>
> Subject: local chars displayed as numbers
>
> [ I include a summary of this thread below. ]
>
> windows-1252 files were supposed to be detected automatically in the
> "Latin-1" and "German" language environments.  This doesn't work
> (anymore?) in Emacs 22.1, the Emacs_22 branch and in the trunk.
>
> * Summary of the September/October 2006 discussion:
>
> The following change was installed...
>
> ,----[ ChangeLog.12 ]
> | 2006-09-21  Kenichi Handa
> |
> |     * language/european.el ("Latin-1"): Add windows-1252 to
> |     coding-priority.
> |     ("German"): Likewise.
> `----
>
> ...
> and was supposed to result in the following behavior:
>
> Kenichi Handa wrote in
> :
> | A file containing a windows-1252 char that doesn't appear in
> | iso-8859-1 is detected as windows-1252.  Bad effect is that some (or
> | many) binary files are also detected as windows-1252.
>
> Some people pointed out that this may lead to the bad effect that some
> (or many) binary files are also detected as windows-1252.  Eli
> suggested to implement null-byte detection which should solve this
> problem.
>
> In
> Kenichi Handa wrote:
> | Reiner Steib writes:
> |
> | > (6) Implement null-byte detection (to prevent binary files
> | > mis-detected as windows-12xx), keep the current code (windows-1252)
> | > and add windows-1254/1255 accordingly.
> |
> | I think that change results in the best behavior.
>
> ... and Richard agreed on that.  But I don't think this has been done.
> ("the current code" refers to the 2006-09-21 change, see above.)

I've just installed the null-byte detection code and some
improvements to the handling of latin-extra-code-table in the
trunk.  Could you please test the latest code?

> | > and add windows-1254/1255 accordingly.

I've not yet done that.  Could someone tell me which to add
where?

> * Additionally, the addition of windows-1252 to "German" has been lost
> in the emacs-unicode-2 branch:
>
> --- european.el   26 Jul 2007 05:27:10 -0000   1.100
> +++ european.el   25 Dec 2007 10:57:51 -0000   1.86.4.13
> @@ -277,16 +414,15 @@
>  (set-language-info-alist
>   "German" '((tutorial . "TUTORIAL.de")
> -            (charset ascii latin-iso8859-1)
> +            (charset iso-8859-1)
>              (coding-system iso-latin-1 iso-latin-9)
> -            (coding-priority iso-latin-1 windows-1252)
> +            (coding-priority iso-latin-1)
> +            (nonascii-translation . iso-8859-1)
>              (input-method . "german-postfix")

Oops, I don't know why that change was lost.  I'll fix it
soon, along with the equivalent changes for null-byte
detection and the latin-extra-code-table handling improvement.

---
Kenichi Handa
handa@ni.aist.go.jp