From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#2497: 23.0.91; Fails to read UTF-8 on Win2k
Date: Sat, 28 Feb 2009 14:09:04 +0200
Message-ID:
References: <877i3c55tg.fsf@tum.de> <87ljrromgg.fsf@tum.de> <87zlg7t1pc.fsf@tum.de> <87tz6e3m2v.fsf@engster.org>
Reply-To: Eli Zaretskii , 2497@emacsbugs.donarmstrong.com
NNTP-Posting-Host: lo.gmane.org
X-Trace: ger.gmane.org 1235824043 5650 80.91.229.12 (28 Feb 2009 12:27:23 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sat, 28 Feb 2009 12:27:23 +0000 (UTC)
Cc: uwe.siart@tum.de
To: David Engster , 2497@emacsbugs.donarmstrong.com
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Sat Feb 28 13:28:39 2009
Return-path:
Envelope-to: geb-bug-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
 by lo.gmane.org with esmtp (Exim 4.50) id 1LdOJ4-0004K9-QQ
 for geb-bug-gnu-emacs@m.gmane.org; Sat, 28 Feb 2009 13:28:39 +0100
Original-Received: from localhost ([127.0.0.1]:45419 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.43) id 1LdOHj-00059O-1H
 for geb-bug-gnu-emacs@m.gmane.org; Sat, 28 Feb 2009 07:27:15 -0500
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
 id 1LdOEJ-0004NX-Bn for bug-gnu-emacs@gnu.org; Sat, 28 Feb 2009 07:23:43 -0500
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
 id 1LdOEI-0004N5-2b for bug-gnu-emacs@gnu.org; Sat, 28 Feb 2009 07:23:42 -0500
Original-Received: from [199.232.76.173] (port=58147 helo=monty-python.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.43) id 1LdOEG-0004My-V4
 for bug-gnu-emacs@gnu.org; Sat, 28 Feb 2009 07:23:41 -0500
Original-Received: from rzlab.ucr.edu ([138.23.92.77]:48571)
 by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32)
 (Exim 4.60) (envelope-from ) id 1LdOEG-0001IW-76
 for bug-gnu-emacs@gnu.org; Sat, 28 Feb 2009 07:23:40 -0500
Original-Received: from rzlab.ucr.edu (rzlab.ucr.edu [127.0.0.1])
 by rzlab.ucr.edu (8.13.8/8.13.8/Debian-3) with ESMTP id n1SCNcHw011328;
 Sat, 28 Feb 2009 04:23:38 -0800
Original-Received: (from debbugs@localhost)
 by rzlab.ucr.edu (8.13.8/8.13.8/Submit) id n1SCF494008891;
 Sat, 28 Feb 2009 04:15:04 -0800
X-Loop: owner@emacsbugs.donarmstrong.com
Resent-From: Eli Zaretskii
Resent-To: bug-submit-list@donarmstrong.com
Resent-CC: Emacs Bugs
Resent-Date: Sat, 28 Feb 2009 12:15:04 +0000
Resent-Message-ID:
Resent-Sender: owner@emacsbugs.donarmstrong.com
X-Emacs-PR-Message: followup 2497
X-Emacs-PR-Package: emacs
X-Emacs-PR-Keywords:
Original-Received: via spool by 2497-submit@emacsbugs.donarmstrong.com
 id=B2497.12358230338590 (code B ref 2497); Sat, 28 Feb 2009 12:15:04 +0000
Original-Received: (at 2497) by emacsbugs.donarmstrong.com; 28 Feb 2009 12:10:33 +0000
X-Spam-Bayes: score:0.5 Bayes not run. spammytokens:Tokens not available. hammytokens:Tokens not available.
Original-Received: from mtaout1.012.net.il (mtaout1.012.net.il [84.95.2.1])
 by rzlab.ucr.edu (8.13.8/8.13.8/Debian-3) with ESMTP id n1SCAScP008573
 for <2497@emacsbugs.donarmstrong.com>; Sat, 28 Feb 2009 04:10:30 -0800
Original-Received: from conversion-daemon.i-mtaout1.012.net.il
 by i-mtaout1.012.net.il (HyperSendmail v2007.08)
 id <0KFR00K00YXKPN00@i-mtaout1.012.net.il>
 for 2497@emacsbugs.donarmstrong.com; Sat, 28 Feb 2009 14:09:38 +0200 (IST)
Original-Received: from HOME-C4E4A596F7 ([77.127.167.119])
 by i-mtaout1.012.net.il (HyperSendmail v2007.08) with ESMTPA
 id <0KFR004VLZ400VA0@i-mtaout1.012.net.il>; Sat, 28 Feb 2009 14:09:37 +0200 (IST)
In-reply-to: <87tz6e3m2v.fsf@engster.org>
X-012-Sender: halo1@inter.net.il
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6 (newer, 3)
Resent-Date: Sat, 28 Feb 2009 07:23:42 -0500
X-BeenThere: bug-gnu-emacs@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.bugs:25840
Archived-At:

> From: David Engster
> Date: Sat, 28 Feb 2009 11:14:16 +0100
> Cc: 2497@emacsbugs.donarmstrong.com
>
> I once again confirmed that this behaviour can be tracked down to this
> change in detect_coding_charset in coding.c (revision 1.413):
>
> --- coding.c 7 Feb 2009 10:49:39 -0000 1.412
> +++ coding.c 9 Feb 2009 00:42:37 -0000 1.413
> @@ -5101,7 +5101,7 @@
>    valids = AREF (attrs, coding_attr_charset_valids);
>    name = CODING_ID_NAME (coding->id);
>    if (VECTORP (Vlatin_extra_code_table)
> -      && strcmp ((char *) SDATA (SYMBOL_NAME (name)), "iso-8859-"))
> +      && strcmp ((char *) SDATA (SYMBOL_NAME (name)), "iso-8859-") == 0)
>      check_latin_extra = 1;
>    if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs)))
>      src += head_ascii;
>
> I'm inclined to say that this change is wrong, since strcmp will only
> return 0 if two strings are exactly equal. In this case though, the
> string "iso-8859-" is compared to "iso-8859-1" (in my case), so it
> returns 1 and therefore check_latin_extra is not set.

You are right.  But in my case, it was not enough to test for
"iso-8859-", as the symbol's name was "iso-latin-1", not "iso-8859-1".

I installed the patch below, which does seem to fix the problem with
the OP's .gnus.el, although I don't know how general that problem is,
nor whether Emacs is capable of distinguishing UTF-8 from Latin-N in
general.

2009-02-28  Eli Zaretskii

        * coding.c (detect_coding_charset): Fix change from 2008-10-21.
        Also, check iso-latin-*, not only iso-8859-*.

Index: src/coding.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/coding.c,v
retrieving revision 1.419
diff -u -r1.419 coding.c
--- src/coding.c 22 Feb 2009 15:48:03 -0000 1.419
+++ src/coding.c 28 Feb 2009 12:01:18 -0000
@@ -5103,7 +5103,10 @@
   valids = AREF (attrs, coding_attr_charset_valids);
   name = CODING_ID_NAME (coding->id);
   if (VECTORP (Vlatin_extra_code_table)
-      && strcmp ((char *) SDATA (SYMBOL_NAME (name)), "iso-8859-") == 0)
+      && (strncmp ((char *) SDATA (SYMBOL_NAME (name)),
+                   "iso-8859-", sizeof ("iso-8859-") - 1) == 0
+          || strncmp ((char *) SDATA (SYMBOL_NAME (name)),
+                      "iso-latin-", sizeof ("iso-latin-") - 1) == 0))
     check_latin_extra = 1;
   if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs)))
     src += head_ascii;
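
For illustration only, here is a standalone sketch of the prefix test the
patch relies on.  It is not part of the installed change: the helper names
has_prefix and is_latin_coding_name are made up for this example, and plain
C strings stand in for SDATA (SYMBOL_NAME (name)).  strcmp returns 0 only
when both strings are identical, so it can never match "iso-8859-1" against
"iso-8859-"; strncmp limited to the prefix length matches any name that
starts with the prefix.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not in coding.c: return nonzero if NAME
   begins with PREFIX.  Only the first strlen (PREFIX) bytes are
   compared, so trailing characters in NAME are ignored.  */
static int
has_prefix (const char *name, const char *prefix)
{
  return strncmp (name, prefix, strlen (prefix)) == 0;
}

/* Hypothetical stand-in for the condition in the patch: nonzero for
   names that start with "iso-8859-" or "iso-latin-".  */
int
is_latin_coding_name (const char *name)
{
  return has_prefix (name, "iso-8859-") || has_prefix (name, "iso-latin-");
}

/* Small self-check contrasting the exact and the prefix comparison.  */
void
check_latin_coding_names (void)
{
  /* Exact comparison: the strings differ, so strcmp is nonzero.  */
  assert (strcmp ("iso-8859-1", "iso-8859-") != 0);
  /* Prefix comparison: both spellings are accepted.  */
  assert (is_latin_coding_name ("iso-8859-1"));
  assert (is_latin_coding_name ("iso-latin-1"));
  assert (!is_latin_coding_name ("utf-8"));
}
```

The patch itself uses sizeof ("iso-8859-") - 1 instead of strlen, which
gives the same length for a string literal and is computed at compile time.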