From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: coding tags and utf-16
Date: Fri, 06 Jan 2006 15:31:03 +0900
References: <20051221.090033.182620434.wl@gnu.org>
Cc: emacs-devel@gnu.org
Original-To: Stefan Monnier
In-reply-to: (message from Stefan Monnier on Thu, 05 Jan 2006 10:56:52 -0500)

Stefan Monnier writes:

>> So, in any cases, a tag value itself is useless.  Then how
>> to detect utf-16 more reliably?  In the current Emacs
>> (i.e. Ver.22), I think we can use auto-coding-regexp-alist
>> or auto-coding-alist.  In the former case, we can register
>> BOM patterns and also something like "\\`\\(\0[\0-\177]\\)+"
>> for utf-16be.  In the latter case, you can use more
>> complicated heuristics in a registered function.

> Can't it be somehow added to detect_coding_utf_16?

Yes, but usually that has no effect if, for instance,
iso-8859-1 has a higher priority.  If only ASCII and Latin-1
characters are encoded in utf-16, all bytes (including the
BOM) are also valid iso-8859-1 bytes, so the higher-priority
coding system still matches.

---
Kenichi Handa
handa@m17n.org
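
The two points above (a BOM or NUL/ASCII-pair pattern as a utf-16be hint, and utf-16 bytes always being valid iso-8859-1) can be sketched outside Emacs. This is only an illustration in Python, not Emacs code: the helper name looks_like_utf16be and the sample string are hypothetical, and the byte regexp is an assumed translation of the Emacs regexp quoted in the message.

```python
import codecs
import re

# Assumed Python translation of the Emacs regexp "\\`\\(\0[\0-\177]\\)+":
# at the very start of the data, one or more (NUL byte, ASCII byte) pairs.
UTF16BE_ASCII_PREFIX = re.compile(rb"\A(?:\x00[\x00-\x7f])+")


def looks_like_utf16be(data: bytes) -> bool:
    """Hypothetical heuristic: check for a BOM, then the NUL/ASCII pattern."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return True
    return UTF16BE_ASCII_PREFIX.match(data) is not None


sample = "coding tags"  # illustrative text, ASCII only
with_bom = codecs.BOM_UTF16_BE + sample.encode("utf-16-be")
without_bom = sample.encode("utf-16-be")

# Every byte value 0x00-0xFF maps to some iso-8859-1 character, so this
# decode cannot fail. A detector that prefers iso-8859-1 therefore never
# gets a reason to reject the data, BOM included.
as_latin1 = with_bom.decode("iso-8859-1")
```

The decode succeeding is the reason a check placed only inside the utf-16 detector would be masked: the higher-priority coding system has already accepted the same bytes, so the hint has to be consulted before priority-based detection runs.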