From: Alexander Kotelnikov
Newsgroups: gmane.emacs.devel
Subject: Re: converting between charsets
Date: Tue, 16 May 2006 00:30:44 +0400
Organization: Global disintoxication
Message-ID: <84d5ef3ua3.fsf@vinci.loc>
User-Agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

>>>>> On Mon, 15 May 2006 10:11:48 -0400
>>>>> "SM" == Stefan Monnier wrote:
SM>
SM> Then show us how and when you call encode-coding-region.
SM> I.e. repeat the above but in elisp rather than english.
SM> Assume you're explaining it to a complete idiot.
SM>
SM> Please take seriously the bit about the idiot.
SM>
>>>> For example.
>>>> 1. (find-file "/tmp/test.txt")
SM>
SM> How did you start Emacs?
SM>
>> For example 'emacs -q'; even if I do not suppress reading my ~/.emacs
>> the result is the same.
SM>
SM> Under X or under a tty?

X

>>>> 2. enter some text in Russian (after I toggled xkb layout)
>>>> 3. M-: (encode-coding-region (point-min) (point-max) 'koi8-r) and
>>>> Russian characters become '?'.
SM> What did you expect instead?
>> I expect that Cyrillic characters will be encoded to their koi8-r values.
SM>
SM> If you put the cursor on the Russian chars before calling
SM> encode-coding-region and hit C-u C-x = what does it say?

  character: Т (01212102, 332866, 0x51442)
    charset: mule-unicode-0100-24ff
             (Unicode characters of the range U+0100..U+24FF.)
 code point: 40 66
     syntax: word
   category: y:Cyrillic
buffer code: 0x9C 0xF4 0xA8 0xC2
  file code: 0xD0 0xA2 (encoded by coding system utf-8)
       font: -monotype-courier new-medium-r-normal--13-94-99-99-m-80-iso10646-1

SM> If you put the cursor on the `?' that replaced that char and hit C-u C-x =
SM> what does it say?

  character: ? (077, 63, 0x3f)
    charset: ascii (ASCII (ISO646 IRV))
 code point: 63
     syntax: punctuation
   category: a:ASCII l:Latin
buffer code: 0x3F
  file code: 0x3F (encoded by coding system utf-8)
       font: -monotype-courier new-medium-r-normal--13-94-99-99-m-80-adobe-standard

And for français I get:

  character: ç (04347, 2279, 0x8e7)
    charset: latin-iso8859-1
             (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
 code point: 103
     syntax: word
   category: l:Latin
buffer code: 0x81 0xE7
  file code: 0xC3 0xA7 (encoded by coding system utf-8)
       font: -monotype-courier new-medium-r-normal--13-94-99-99-m-80-iso8859-1

and after (the representation is \347):

  character: ç (0347, 231, 0xe7)
    charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
 code point: 231
     syntax: whitespace
   category:
buffer code: 0xE7
  file code: 0xE7 (encoded by coding system utf-8)
       font: -monotype-courier new-medium-r-normal--13-94-99-99-m-80-adobe-standard

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia
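
For reference, the byte values quoted in the report can be cross-checked outside Emacs. The sketch below is not part of the original exchange. It assumes Python 3 and uses its standard codecs (koi8_r, latin_1, utf_8) as stand-ins for the Emacs coding systems; hex_bytes is a hypothetical helper introduced only for this illustration. It shows what a successful koi8-r encoding of the inspected character should look like, as opposed to the 0x3F ('?') seen in the report.

```python
# Cross-check of the byte values quoted in the message, done outside Emacs.
# Assumption: Python's built-in "koi8_r", "latin_1" and "utf_8" codecs stand in
# for the Emacs coding systems koi8-r, iso-8859-1 and utf-8.


def hex_bytes(text, codec):
    """Return the encoded bytes of `text` as a list of strings like '0xD0'."""
    return ["0x%02X" % b for b in text.encode(codec)]


te = "\u0422"         # CYRILLIC CAPITAL LETTER TE, the character inspected above
c_cedilla = "\u00e7"  # LATIN SMALL LETTER C WITH CEDILLA, from "français"

# UTF-8 bytes; these should agree with the "file code" lines in the report.
print("Te in utf-8:   ", hex_bytes(te, "utf_8"))
print("c-ced in utf-8:", hex_bytes(c_cedilla, "utf_8"))

# Single-byte encodings; a working koi8-r conversion yields one non-ASCII byte.
print("Te in koi8-r:  ", hex_bytes(te, "koi8_r"))
print("c-ced in latin1:", hex_bytes(c_cedilla, "latin_1"))
```

The Latin-1 result corresponds to the \347 representation mentioned above (octal 347 is 0xE7), which is why the ç case appears to work while the Cyrillic case does not.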