From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: `decode-coding-string' question
Date: Fri, 07 Jul 2006 12:17:03 +0300
To: Paul Pogonyshev
Cc: bug-cc-mode@gnu.org, emacs-devel@gnu.org
In-reply-to: <200607062334.21288.pogonyshev@gmx.net> (message from Paul Pogonyshev on Thu, 6 Jul 2006 23:34:21 +0300)

> From: Paul Pogonyshev
> Date: Thu, 6 Jul 2006 23:34:21 +0300
> Cc: handa@m17n.org
>
> There is indeed a misunderstanding. The characters in the buffer _are_
> decoded. However the characters form C escape sequence, like "\xc2\xa9"

Right, I see the problem now.

> To know what character is encoded by this C sequence, I first translate
> strings "\xc2" and "\xa9" to the appropriate (undecoded!) characters.
> The resulting string of length 2 is encoded in UTF-8 and I decode it
> to receive the copyright character or whatever.

Why not use `(decode-coding-string "\xc2\xa9" 'utf-8)' right away? It
gives me the right character directly.

Btw, why don't we have a feature in cc-mode to transparently decode
and encode such strings when the source file is read/written? If
detecting the encoding is an issue, we could for starters ask that
users state that in some file-local variable.
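
To make the one-step decoding concrete, here is a minimal sketch. It assumes the escape sequence has been copied out of the buffer as literal text (a backslash, an `x', and two hex digits per byte). The helper name `my-decode-c-hex-escapes' is hypothetical; it is not an existing Emacs or cc-mode function.

```elisp
;; Sketch only: `my-decode-c-hex-escapes' is a hypothetical helper,
;; not part of Emacs or cc-mode.
(defun my-decode-c-hex-escapes (text coding)
  "Decode the C hex escapes found in TEXT using coding system CODING.
Only two-digit hex escapes are considered; any other text is skipped."
  (let ((pos 0)
        (bytes '()))
    ;; Collect the byte value of each two-digit hex escape, in order.
    (while (string-match "\\\\x\\([[:xdigit:]]\\{2\\}\\)" text pos)
      (push (string-to-number (match-string 1 text) 16) bytes)
      (setq pos (match-end 0)))
    ;; Build a unibyte string from the collected bytes and decode it
    ;; in a single call.
    (decode-coding-string (apply #'unibyte-string (nreverse bytes))
                          coding)))
```

With this definition, `(my-decode-c-hex-escapes "\\xc2\\xa9" 'utf-8)' should return the copyright character. Note the difference from the expression quoted above: in the Lisp literal "\xc2\xa9" the reader has already turned the escapes into two bytes, whereas text taken from a C source buffer still contains the backslashes, so the bytes have to be extracted first.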