From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#2497: 23.0.91; Fails to read UTF-8 on Win2k
Date: Sat, 28 Feb 2009 12:49:58 +0200
References: <877i3c55tg.fsf@tum.de> <87ljrromgg.fsf@tum.de>
To: Stefan Monnier, Kenichi Handa
Cc: 2497@emacsbugs.donarmstrong.com, uwe.siart@tum.de

> From: Stefan Monnier
> Cc: 2497@emacsbugs.donarmstrong.com, uwe.siart@tum.de
> Date: Fri, 27 Feb 2009 23:40:01 -0500
>
> >> It works with "C-x RET c utf-8 RET" immediately prior to "C-x C-f".
> >> > If it does, then the problem is with guessing the encoding, not with
> >> > decoding it.
> >> That's also my impression.
> >> > Also, what is the default value of buffer-file-coding-system, and was
> >> > it the same in 23.0.90?
> >> iso-latin-1-dos in 23.0.90 and in 23.0.91.
> > Then you shouldn't expect Emacs to guess UTF-8 encoding correctly in
> > every single instance.  Distinguishing between UTF-8 and Latin-1 is
>
> The guessing shouldn't give priority to buffer-file-coding-system.
> Instead we have the set-coding-system-priority instead.

Please give me some credit: I said ``the _default_value_ of
buffer-file-coding-system''.  That default speaks volumes about the
coding-system priorities.

> And IIUC utf-8 should always have a pretty high priority

With today's CVS on a Windows XP machine I get this:

  M-: (coding-system-priority-list) RET

   => (iso-latin-1 utf-8 iso-2022-7bit iso-2022-7bit-lock
       iso-2022-8bit-ss2 emacs-mule raw-text iso-2022-jp
       in-is13194-devanagari chinese-iso-8bit utf-8-auto
       utf-8-with-signature utf-16 utf-16be-with-signature
       utf-16le-with-signature utf-16be utf-16le japanese-shift-jis
       undecided)

So UTF-8 is indeed ``pretty high'', but lower than the locale's
default.

> So this still looks like a real bug.

Perhaps it is, but I didn't know Emacs 23 can reliably distinguish
between Latin-1 and UTF-8, even when UTF-8 sequences are present in
the text.  Can we do that reliably?  Perhaps Handa-san can shed some
light on this.
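
To make the question about reliability concrete, here is a minimal Python sketch of why the guess is heuristic.  It is not Emacs's detection code; it uses Python's own `utf-8` and `latin-1` codecs as stand-ins, and the helper name `candidate_encodings` is made up for this illustration.  It shows that bytes which are valid UTF-8 are also acceptable as Latin-1, so validity alone cannot settle the choice and some priority order has to break the tie.

```python
def candidate_encodings(data):
    """Return which of UTF-8 and Latin-1 can decode `data` without error.

    Hypothetical helper for illustration only; it relies on Python's
    codecs, not on Emacs's coding-system detection.
    """
    candidates = []
    for name in ("utf-8", "latin-1"):
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        candidates.append(name)
    return candidates


# "Groesse" spelled with o-umlaut and sharp s, written with escapes so
# the example does not depend on the encoding of this message.
word = "Gr\u00f6\u00dfe"
utf8_bytes = word.encode("utf-8")
latin1_bytes = word.encode("latin-1")

# Valid UTF-8 is also decodable as Latin-1, so both candidates survive.
print(candidate_encodings(utf8_bytes))    # ['utf-8', 'latin-1']

# Latin-1 bytes with non-ASCII characters are often invalid UTF-8,
# so in this direction the check is conclusive.
print(candidate_encodings(latin1_bytes))  # ['latin-1']
```

The asymmetry is the point: text that fails UTF-8 validation is certainly not UTF-8, but text that passes could in principle still be Latin-1.  A hypothetical detector that simply took the first acceptable entry from a priority list with Latin-1 ahead of UTF-8 would pick Latin-1 for the first case; whether Emacs 23 behaves that way is exactly the open question above.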