From: Alexander Kotelnikov
Newsgroups: gmane.emacs.devel
Subject: Re: converting between charsets
Date: Sun, 07 May 2006 23:40:23 +0400
Organization: Global disintoxication
Message-ID: <84veshaajc.fsf@vinci.loc>
References: <87lktejh6f.fsf@myxomop.com> <87u082109z.fsf-monnier+emacs@gnu.org>
User-Agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

>>>>> On Sun, 07 May 2006 08:43:56 -0400
>>>>> "SM" == Stefan Monnier wrote:
SM>
>> After I switched to utf-8 as my basic environment encoding (on Linux)
>> I got need of converting some texts sometimes back to koi8-r. Typical
>> task here is to convert outgoing mail to persons and newsgroups
>> hierarchies which do not understand multibyte encodings.
SM>
SM> Emacs always converts from/to the encoding you use. So you don't really
SM> need to "convert from utf-8 to koi8", when sending email because, before the
SM> email is sent, it's not any more in utf-8 than in any other encoding (other
SM> than the internal encoding).

SM> I.e. all you need is to tell Emacs that when sending to newsgroups such and
SM> such, it should use koi8 rather than utf-8. How to do that depends on the
SM> newsreader you're using.

I am using Gnus, which does not have such functionality, so I am going
to implement it. What I need, then, is a way to convert from the
internal representation to some code page (mostly koi8-r).

>> Theoretically something like
>> (encode-coding-region (point-min) (point-max) 'koi8-r)
>> should work, but it does not.
SM>
SM> I don't think that's true in theory.

Why?

>> Relevant lines in my ~/.emacs are:
SM>
>> (set-language-environment "UTF-8")
>> (set-terminal-coding-system 'utf-8)
>> (set-selection-coding-system 'utf-8)
>> (setq default-buffer-file-coding-system 'utf-8)
>> (set-input-mode (car (current-input-mode)) (nth 1 (current-input-mode)) 0)
>> (setq default-input-method "cyrillic-jcuken")
SM>
SM> Looks like you have some problems here. Try to remove most of the lines
SM> (if your locale is using utf-8 already, you really don't need to do
SM> anything at all in your .emacs).

SM> At the very least try removing the set-selection-coding-system and
SM> set-input-mode.
SM>
>> 1. Paste in X (from non-Emacs to Emacs) does not work correctly. It
>> seems to be broken in different ways for singlebyte and multibyte.
SM>
SM> Probably caused by your set-selection-coding-system.
SM>
>> 2. With my utf-8 setup non-ascii input does not work on terminal (for
>> example, when emacs is run in xterm as emacs -nw) when I switch input
>> with system means (X keyboard layout, console input mode), instead of
>> toggle-input-method.
SM>
SM> Could be because of your set-input-mode.

I started Emacs without ~/.emacs and evaluated

(setq default-input-method "cyrillic-jcuken")

What I got:

1. Pasting into an Emacs frame behaves strangely: a font different from
the normal one is used, and on save I am asked in the minibuffer:

Select coding system (default euc-jp)

and I am shown a *Warning* buffer with these lines:

Start of *Warning*
These default coding systems were tried:
  mule-utf-8
However, none of them safely encodes the target text.

Select one of the following safe coding systems:
  euc-jp shift_jis iso-2022-jp iso-2022-jp-2 x-ctext
  japanese-iso-7bit-1978-irv iso-2022-7bit raw-text emacs-mule
  no-conversion iso-2022-7bit-lock-ss2 ctext-no-compositions
  iso-2022-8bit-ss2 iso-2022-7bit-lock iso-2022-7bit-ss2
  tibetan-iso-8bit-with-esc thai-tis620-with-esc lao-with-esc
  korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
  greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
  iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
  iso-latin-2-with-esc iso-latin-1-with-esc
  in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
  chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
End of *Warning*

Cyrillic input in emacs -nw in xterm still does not work if I just
change the X keyboard layout.

Converting works just as before: it converts only text typed with the
toggled input method.

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia
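
A diagnostic sketch that might help narrow down why only text typed
through the input method converts. It finds the first character in the
buffer that a given coding system (koi8-r by default) cannot encode and
reports which charset that character belongs to. The helper name
my-first-unencodable-char is hypothetical and is not part of Emacs or
Gnus. It relies on unencodable-char-position, which as far as I know is
available only in Emacs 22 and later, so it would not run as is on
Emacs 21.4.

```elisp
;; Hypothetical helper for illustration; not part of Emacs or Gnus.
;; Assumes `unencodable-char-position' is available (Emacs 22 or later).
(defun my-first-unencodable-char (coding-system)
  "Move point to the first character CODING-SYSTEM cannot encode.
Report that character and its charset, or say that the whole
buffer can be encoded."
  (interactive
   ;; Prompt for a coding system, offering koi8-r as the default.
   (list (read-coding-system "Coding system (default koi8-r): " 'koi8-r)))
  (let ((pos (unencodable-char-position (point-min) (point-max)
                                        coding-system)))
    (if (null pos)
        (message "All text can be encoded with %s" coding-system)
      (goto-char pos)
      (message "Cannot encode %c (charset %s) at position %d"
               (char-after pos)
               (char-charset (char-after pos))
               pos))))
```

Running M-x my-first-unencodable-char once in a buffer with pasted
Cyrillic text and once in a buffer with typed Cyrillic text would show
whether the two end up in different charsets.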