unofficial mirror of emacs-devel@gnu.org 
From: Alexander Kotelnikov <sacha@myxomop.com>
Subject: Re: converting between charsets
Date: Tue, 09 May 2006 09:41:08 +0400
Message-ID: <84ac9rah6z.fsf@vinci.loc>
In-Reply-To: jwvfyjkzjv9.fsf-monnier+emacs@gnu.org

>>>>> On Mon, 08 May 2006 10:30:48 -0400
>>>>> "SM" == Stefan Monnier <monnier@iro.umontreal.ca> wrote:
SM> 
>> Let's first talk about encoding regions. Why doesn't it work with
>> encode-coding-region?
SM> 
SM> It works.  Any evidence that it doesn't?

I started this thread with a note about problems with
encode-coding-region:

>>>>> On Sun, 07 May 2006 13:52:08 +0400
>>>>> "AK" == Alexander Kotelnikov <sacha@myxomop.com> wrote:
AK> 
AK> There are three different ways, all of which I checked, in which the
AK> characters to be converted can end up in an Emacs buffer:
AK>   a. when I open such file.
AK>   b. when I type in characters and my keyboard layout in X is different
AK>      from 'us', for me it is normally 'ru' then.
AK>   c. when I type in after I used toggle-input-method.
AK> 
AK> 
AK> And the trouble is that encode-coding-region converts only in case
AK> (c). In (a) and (b) the characters that need conversion are replaced
AK> with question marks. And even in (c), although the conversion is
AK> performed (if, for instance, I save the file afterwards, it appears to
AK> be in koi8-r), the converted characters are shown in the converted
AK> buffer as octal escapes such as \321.
AK> 
AK> So, it will be nice to get some help on this, thanks.
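
Both symptoms have a rough analogue outside Emacs. The sketch below uses
Python's standard codecs, which is an assumption made purely for
illustration and not how Emacs converts text internally. It shows that the
byte koi8-r assigns to the Cyrillic letter "я" is octal 321, so a display
like \321 is what raw koi8-r bytes look like, and that a character the
target encoding cannot represent turns into "?" when the encoder is told to
substitute rather than fail. The sample characters are chosen here and do
not come from the original report.

```python
# Python analogue of the two symptoms; Emacs's own conversion code differs.
text = "\u044f"  # CYRILLIC SMALL LETTER YA

# Encoded to koi8-r, the letter is the single byte 0xD1, i.e. octal 321.
koi8_bytes = text.encode("koi8_r")
octal_escapes = "".join("\\%o" % b for b in koi8_bytes)

# A character koi8-r cannot represent (a kanji, picked arbitrarily) becomes
# "?" when errors="replace" is requested.
unencodable = "\u6f22".encode("koi8_r", errors="replace")

print(octal_escapes)  # \321
print(unencodable)    # b'?'
```

If the analogy holds, question marks would suggest the encoder did not
recognise the source characters as Cyrillic it can encode, while \321-style
output would suggest the bytes are correct but are being displayed raw.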

>>>> 1. Paste into Emacs frame works strange:
SM> What text did you paste?  Where does it come from?
>> I type some Russian text in xterm and paste it into Emacs, have a look
>> at the attached screenshot.
SM> 
SM> Oh, I see.  I don't know enough of how this works to help you much further.
SM> If you hit C-u C-x = on the various chars (especially on two similar chars
SM> displayed with different fonts), you'll see that they come from different
SM> charsets (one is probably something like iso-8859-5 and the other may be
SM> unicode).  Emacs-22 doesn't unify them by default.  You can try to put
SM> (unify-8859-on-decoding-mode 1) in your .emacs.  And you can also try to
SM> play with utf-fragment-on-decoding.  And ask someone more knowledgeable
SM> about such problems.

On the first character, which looks like a Latin T:
  character: <I removed cyrillic character> (01212102, 332866, 0x51442)
    charset: mule-unicode-0100-24ff
	     (Unicode characters of the range U+0100..U+24FF.)
 code point: 40 66
     syntax: word
   category: y:Cyrillic  
buffer code: 0x9C 0xF4 0xA8 0xC2
  file code: 0xD0 0xA2 (encoded by coding system mule-utf-8)
       font: -monotype-courier new-medium-r-normal--13-94-99-99-m-80-iso10646-1

After the same character in the next line:
  character: <I removed cyrillic character shown with wrong font> (0151664, 54196, 0xd3b4)
    charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87)
 code point: 39 52
     syntax: word
   category: Y:Cyrillic characters of 2-byte character sets   j:Japanese  
	     |:While filling, we can break a line at this character.  
buffer code: 0x92 0xA7 0xB4
  file code: not encodable by coding system mule-utf-8
       font: -Misc-Fixed-Medium-R-Normal--14-130-75-75-C-140-JISX0208.1983-0

Something is not ok here...
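
The numbers in the two descriptions are consistent with both characters
being the same letter, CYRILLIC CAPITAL LETTER TE (U+0422), stored under two
different charsets. The Python sketch below checks this. It assumes that
mule-unicode-0100-24ff is laid out as a 96x96 grid starting at U+0100 with
32 added to each coordinate, and that the JIS X 0208 code point maps to
EUC-JP by setting the high bit of each byte; the helper name is hypothetical.

```python
def mule_unicode_0100_24ff(cp):
    """Assumed layout: 96x96 grid offset from U+0100, each position + 32."""
    offset = cp - 0x100
    return (offset // 96 + 32, offset % 96 + 32)

# First description: charset mule-unicode-0100-24ff, file code 0xD0 0xA2.
first = "\u0422"  # CYRILLIC CAPITAL LETTER TE
first_code_point = mule_unicode_0100_24ff(ord(first))
first_file_code = first.encode("utf-8")

# Second description: charset japanese-jisx0208, code point 39 52.
# Setting the high bit gives the EUC-JP bytes 0xA7 0xB4, which match the
# last two bytes of the reported buffer code.
jis_code_point = (39, 52)
euc_bytes = bytes(b | 0x80 for b in jis_code_point)
second = euc_bytes.decode("euc_jp")

print(first_code_point)   # (40, 66)
print(first_file_code)    # b'\xd0\xa2'
print(second == first)    # True
```

So the two glyphs appear to be one letter that reached the buffer through
two charsets, which would explain both the different fonts and why only one
of them is encodable by mule-utf-8.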

SM> You could even M-x report-emacs-bug about it, since maybe the default config
SM> in a cyrillic locale should already take care of it.
SM> 
>>>> Cyrillic input in emacs -nw in xterm still does not work, if I just
>>>> change X keyboard layout.
SM> 
SM> That doesn't give us much to go on, does it?  What does it do, other than
SM> "not work"?
SM> 
>> It beeps.
SM> 
SM> What does C-h l show after hitting a particular key?

M-P M-0 C-h l

-- 
Alexander Kotelnikov
Saint-Petersburg, Russia
