From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Alexander Kotelnikov <sacha@myxomop.com>
Newsgroups: gmane.emacs.devel
Subject: Re: converting between charsets
Date: Sun, 07 May 2006 23:40:23 +0400
Organization: Global disintoxication
Message-ID: <84veshaajc.fsf@vinci.loc>
References: <87lktejh6f.fsf@myxomop.com> <87u082109z.fsf-monnier+emacs@gnu.org>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1147031163 28481 80.91.229.2 (7 May 2006 19:46:03 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Sun, 7 May 2006 19:46:03 +0000 (UTC)
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sun May 07 21:46:00 2006
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43)
	id 1FcpCQ-00077B-QX
	for ged-emacs-devel@m.gmane.org; Sun, 07 May 2006 21:45:51 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1FcpCQ-0001L2-7y
	for ged-emacs-devel@m.gmane.org; Sun, 07 May 2006 15:45:50 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1FcpCC-0001KV-Tp
	for emacs-devel@gnu.org; Sun, 07 May 2006 15:45:36 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1FcpCC-0001KJ-Cb
	for emacs-devel@gnu.org; Sun, 07 May 2006 15:45:36 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1FcpCC-0001KG-3p
	for emacs-devel@gnu.org; Sun, 07 May 2006 15:45:36 -0400
Original-Received: from [80.91.229.2] (helo=ciao.gmane.org)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA:32)
	(Exim 4.52) id 1FcpCr-0004T2-Ja
	for emacs-devel@gnu.org; Sun, 07 May 2006 15:46:17 -0400
Original-Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1FcpC0-00073U-TB
	for emacs-devel@gnu.org; Sun, 07 May 2006 21:45:24 +0200
Original-Received: from 81.211.124.120.adsl-spb.net.rol.ru ([81.211.124.120])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <emacs-devel@gnu.org>; Sun, 07 May 2006 21:45:24 +0200
Original-Received: from sacha by 81.211.124.120.adsl-spb.net.rol.ru with local (Gmexim
	0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <emacs-devel@gnu.org>; Sun, 07 May 2006 21:45:24 +0200
X-Injected-Via-Gmane: http://gmane.org/
Mail-Followup-To: emacs-devel@gnu.org
Original-To: emacs-devel@gnu.org
Original-Lines: 93
Original-X-Complaints-To: usenet@sea.gmane.org
X-Gmane-NNTP-Posting-Host: 81.211.124.120.adsl-spb.net.rol.ru
Mail-Copies-To: never
User-Agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)
Cancel-Lock: sha1:Eo5eczsTiaU1/rGzps6nEg7U4WI=
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:54041
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/54041>

>>>>> On Sun, 07 May 2006 08:43:56 -0400
>>>>> "SM" == Stefan Monnier <monnier@iro.umontreal.ca> wrote:
SM> 
>> After I switched to utf-8 as my basic environment encoding (on Linux)
>> I got need of converting some texts sometimes back to koi8-r.  Typical
>> task here is to convert outgoing mail to persons and newsgroups
>> hierarchies which do not understand multibyte encodings.
SM> 
SM> Emacs always converts from/to the encoding you use.  So you don't really
SM> need to "convert from utf-8 to koi8", when sending email because, before the
SM> email is sent, it's not any more in utf-8 than in any other encoding (other
SM> than the internal encoding).
SM> I.e. all you need is to tell Emacs that when sending to newsgroups such and
SM> such, it should use koi8 rather than utf-8.  How to do that depends on the
SM> newsreader you're using.

I am using Gnus, which does not have such functionality, so I am going
to implement it. What I need, then, is a way to convert from the
internal representation to some code page (mostly koi8-r).
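
As an illustration only (a sketch under assumptions, not something Gnus
provides): in current Emacs versions, encode-coding-string is one way to
turn text from the internal representation into koi8-r bytes. The helper
name my-encode-string-as-koi8-r is hypothetical, and behaviour under
Emacs 21.4 may differ.

```elisp
;; Hypothetical helper; not part of Emacs or Gnus.
(defun my-encode-string-as-koi8-r (text)
  "Return TEXT encoded as a unibyte string of koi8-r bytes."
  (encode-coding-string text 'koi8-r))

;; Round trip: decode some koi8-r bytes into the internal representation,
;; then encode the result back and compare with the original bytes.
(let* ((bytes "\360\322\311\327\305\324")            ; a Cyrillic word in koi8-r
       (text (decode-coding-string bytes 'koi8-r)))  ; internal representation
  ;; Non-nil when the round trip preserves every byte.
  (string= bytes (my-encode-string-as-koi8-r text)))
```

For a whole buffer the analogous call is encode-coding-region, as in the
form quoted below.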

>> Theoretically something like
>> (encode-coding-region (point-min) (point-max) 'koi8-r)
>> should work, but it does not.
SM> 
SM> I don't think that's true in theory.

Why?

>> Relevant lines in my ~/.emacs are:
SM> 
>> (set-language-environment "UTF-8")
>> (set-terminal-coding-system 'utf-8)
>> (set-selection-coding-system 'utf-8)
>> (setq default-buffer-file-coding-system 'utf-8)
>> (set-input-mode (car (current-input-mode)) (nth 1 (current-input-mode)) 0)
>> (setq default-input-method "cyrillic-jcuken")
SM> 
SM> Looks like you have some problems here.  Try to remove most of the lines
SM> (if your locale is using utf-8 already, you really don't need to do
SM> anything at all in your .emacs).
SM> At the very least try removing the set-selection-coding-system and
SM> set-input-mode.
SM> 
>> 1. Paste in X (from non-Emacs to Emacs) does not work correctly. It
>> seems to be broken in different ways for single-byte and multibyte.
SM> 
SM> Probably caused by your set-selection-coding-system.
SM> 
>> 2. With my utf-8 setup non-ascii input does not work on terminal (for
>> example, when emacs is run in xterm as emacs -nw) when I switch input
>> with system means (X keyboard layout, console input mode), instead of
>> toggle-input-method.
SM> 
SM> Could be because of your set-input-mode.

I started Emacs without ~/.emacs and evaluated
(setq default-input-method "cyrillic-jcuken")

What I got:
1. Pasting into an Emacs frame behaves strangely: a font different from
the normal one is used, and on save I am asked in the minibuffer:
Select coding system (default euc-jp)
and I am shown a *Warning* buffer with these lines:

Start of *Warning*
These default coding systems were tried:
  mule-utf-8
However, none of them safely encodes the target text.

Select one of the following safe coding systems:
  euc-jp shift_jis iso-2022-jp iso-2022-jp-2 x-ctext
  japanese-iso-7bit-1978-irv iso-2022-7bit raw-text emacs-mule
  no-conversion iso-2022-7bit-lock-ss2 ctext-no-compositions
  iso-2022-8bit-ss2 iso-2022-7bit-lock iso-2022-7bit-ss2
  tibetan-iso-8bit-with-esc thai-tis620-with-esc lao-with-esc
  korean-iso-8bit-with-esc hebrew-iso-8bit-with-esc
  greek-iso-8bit-with-esc iso-latin-9-with-esc iso-latin-8-with-esc
  iso-latin-5-with-esc iso-latin-4-with-esc iso-latin-3-with-esc
  iso-latin-2-with-esc iso-latin-1-with-esc
  in-is13194-devanagari-with-esc cyrillic-iso-8bit-with-esc
  chinese-iso-8bit-with-esc japanese-iso-8bit-with-esc
End of *Warning*

Cyrillic input in emacs -nw in xterm still does not work if I just
change the X keyboard layout.

Conversion works just as before: only text typed with the toggled
input method is converted.

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia