From: Juri Linkov
Newsgroups: gmane.emacs.bugs,gmane.emacs.pretest.bugs
Subject: bug#4051: Character Soup
Date: Thu, 06 Aug 2009 00:09:20 +0300
Organization: JURTA
Message-ID: <87my6eary7.fsf@mail.jurta.org>
Reply-To: Juri Linkov , 4051@emacsbugs.donarmstrong.com
To: emacs-pretest-bug@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (x86_64-pc-linux-gnu)

The coding system for the buffer with the Latin-1 character á in the
Cyrillic KOI8 language environment is detected as Chinese gb2312.
How funny!

I noticed this while reporting the bug#4037 that was sent by message.el
with charset=gb2312. Mail readers incorrectly display this message
due to ugly fonts associated with gb2312 (this is a separate problem).
I think it would be more natural to encode this as Latin-1 (in this
particular case) or generally UTF-8 - the universal coding specially
designed for mixing different scripts.

The easiest way to reproduce this problem:

1. emacs -Q
2. C-x RET l Cyrillic-KOI8
3. C-x 8 ' a
4. C-x C-s
5. File to save in: /tmp/file

After that the prompt says:

  Select coding system (default chinese-iso-8bit):

and the buffer `*Warning*' contains:

  These default coding systems were tried to encode text
  in the buffer `file':
    (cyrillic-koi8-unix (192 . 225))
  However, each of them encountered characters it couldn't encode:
    cyrillic-koi8-unix cannot encode these: á

  Click on a character (or switch to this window by `C-x o'
  and select the characters by RET) to jump to the place it appears,
  where `C-u C-x =' will give information about it.
  Select one of the safe coding systems listed below,
  or cancel the writing with C-g and edit the buffer
  to remove or modify the problematic characters,
  or specify any other coding system (and risk losing
  the problematic characters).

    gb2312 utf-8 euc-jis-2004 euc-jp windows-1258 viscii
    iso-2022-jp-2004 cp862 iso-8859-16 hp-roman8 next mac-roman cp437
    cp865 cp861 cp860 cp858 cp857 cp852 cp850 windows-1254 windows-1252
    windows-1250 iso-8859-15 iso-8859-14 iso-8859-10 iso-8859-9
    iso-8859-4 iso-8859-3 iso-8859-2 gb18030 gbk hz-gb-2312 utf-7
    iso-8859-1 utf-16 utf-16be-with-signature utf-16le-with-signature
    utf-16be utf-16le iso-2022-7bit utf-8-auto utf-8-with-signature
    eucjp-ms vietnamese-tcvn vietnamese-viqr vietnamese-vscii
    japanese-shift-jis-2004 japanese-iso-7bit-1978-irv ibm1047
    utf-7-imap utf-8-emacs

I already figured out how to fix this problem for message.el using

  (setq mm-coding-system-priorities
        (cons 'utf-8 mm-coding-system-priorities))

But as shown by the test case above this is a general problem.

-- 
Juri Linkov
http://www.jurta.org/emacs/
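
The encodability facts behind the report can be checked outside Emacs. The sketch below is only an illustration and is not part of the original message: it assumes Python's codec names (`koi8_r`, `gb2312`, `latin_1`, `utf_8`) as stand-ins for the corresponding Emacs coding systems, and the helper `encodable` is hypothetical. It shows why gb2312 is a legitimate "safe" choice for á even though it is an unexpected default; it says nothing about how Emacs orders its candidates.

```python
# Illustration only: Python codec names stand in for Emacs coding systems.
# koi8_r ~ cyrillic-koi8, gb2312 ~ chinese-iso-8bit,
# latin_1 ~ iso-8859-1, utf_8 ~ utf-8 (assumed correspondences).
CODECS = ["koi8_r", "gb2312", "latin_1", "utf_8"]


def encodable(char: str, codec: str) -> bool:
    """Return True if `codec` can represent `char` (hypothetical helper)."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# U+00E1 is the character typed by `C-x 8 ' a` in the recipe above.
results = {codec: encodable("\u00e1", codec) for codec in CODECS}

for codec, ok in results.items():
    print(f"{codec:8} {'can' if ok else 'cannot'} encode U+00E1")
```

KOI8-R has no slot for á, while GB2312 carries it among its pinyin letters, so the character is encodable there as well as in Latin-1 and UTF-8. The question raised in the report is which of those safe candidates should be offered first.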