From: Miles Bader
Newsgroups: gmane.emacs.devel
Subject: auto-recognizing utf-16le ?
Date: Mon, 15 Jun 2009 20:40:46 +0900
Reply-To: Miles Bader
To: emacs-devel@gnu.org
System-Type: x86_64-unknown-linux-gnu
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8

Someone on #emacs noticed that Emacs doesn't seem to auto-recognize
files encoded in utf-16le.  Visiting such a file results in the
buffer having coding-system "no-conversion (alias: binary)", and lots
of ^@ (NUL) characters in the buffer.

Forcing the encoding with "C-x C-m r utf-16le RET" results in the
correct thing happening.

[He was on Windows, where this coding system is common, so it's kind
of annoying for him.]

I noticed that the same happens on Debian.

I thought maybe he could just do:

   (prefer-coding-system 'utf-16le-dos)

but it seems to have no effect.

To reproduce:

 1. Save this message's attachment to a file "/tmp/oink"

 2. Start emacs with:  HOME=/tmp emacs -Q

 3. Visit the file you saved:  C-x C-f /tmp/oink RET

 4. ** Notice that the buffer contains ^@ (NUL) characters, and
    that the buffer coding-system is "no-conversion (binary)"

 5. Re-visit the file, forcing the coding-system:
    C-x C-m r utf-16le RET yes RET

 6. ** Notice that the file contents are now correct

 7. Kill the current buffer:  C-x k RET

 8. Evaluate:  M-: (prefer-coding-system 'utf-16le) RET

 9. Visit the file again:  C-x C-f /tmp/oink RET

10. ** Notice that prefer-coding-system didn't seem to have any effect

Thanks,

-Miles

--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=oink
Content-Transfer-Encoding: base64
Content-Description: test file encoded using utf-16le

dABoAGkAcwAgAGkAcwAgAGEAIAB0AGUAcwB0AAoA
--=-=-=

-- 
Justice, n. A commodity which in a more or less adulterated condition
the State sells to the citizen as a reward for his allegiance, taxes
and personal service.
--=-=-=--
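
As a cross-check outside Emacs, the attachment's base64 payload can be decoded with a short Python sketch. This is only an illustration: `inspect_attachment` is a hypothetical helper name, not part of Emacs or of the original report. It confirms what the bytes are (UTF-16LE text with no byte-order mark, where every other byte is NUL); it does not model or explain Emacs's own coding-system detection.

```python
import base64
import codecs

# Base64 payload copied verbatim from the "oink" attachment in the message.
PAYLOAD = "dABoAGkAcwAgAGkAcwAgAGEAIAB0AGUAcwB0AAoA"


def inspect_attachment(payload: str) -> dict:
    """Decode the payload and summarize the byte-level properties.

    Hypothetical helper for illustration only.
    """
    raw = base64.b64decode(payload)
    return {
        # Total number of bytes in the decoded file.
        "size": len(raw),
        # How many of those bytes are NUL (shown as ^@ in an Emacs buffer).
        "nul_bytes": raw.count(0),
        # Whether the file starts with a UTF-16 byte-order mark (LE or BE).
        "has_bom": raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)),
        # The text obtained when the bytes are read as UTF-16LE.
        "text": raw.decode("utf-16-le"),
    }


if __name__ == "__main__":
    for key, value in inspect_attachment(PAYLOAD).items():
        print(f"{key}: {value!r}")
```

Run as a script, this reports 30 bytes, 15 of them NUL, no byte-order mark, and the text "this is a test" followed by a newline. The NUL high bytes are consistent with the ^@ characters seen in step 4 when the file is read without conversion.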