all messages for Emacs-related lists mirrored at yhetil.org
* recommended russian encoding
@ 2004-07-14 15:09 Bruce Ingalls
  2004-07-14 17:43 ` Kevin Rodgers
  0 siblings, 1 reply; 8+ messages in thread
From: Bruce Ingalls @ 2004-07-14 15:09 UTC (permalink / raw)


The latest version of EMacro I am releasing has i18n language support.
Both the setup and the html docs are being translated into multiple 
languages.
Keep in mind that I am trying to maintain OS & Emacs/XEmacs portability.

It will take me some time to figure out how to deal with Arabic. :/

Currently, I am trying to deal with Russian.
The biggest problem is that I don't know how to deal with the various 
encodings.
Is there a program that translates KOI8-R to UTF-8 and other encodings?
Better yet, is there a single encoding for all platforms?
If not, what do you recommend for which platforms for both (X)Emacs and 
html?
Thanks!


* Re: recommended russian encoding
  2004-07-14 15:09 recommended russian encoding Bruce Ingalls
@ 2004-07-14 17:43 ` Kevin Rodgers
  2004-07-15 13:59   ` Bruce Ingalls
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Rodgers @ 2004-07-14 17:43 UTC (permalink / raw)


Bruce Ingalls wrote:
 > Is there a program that translates KOI8-R to UTF-8 and other encodings?

http://www.gnu.org/directory/recode.html

 > Better yet, is there a single encoding for all platforms?

By platform, do you mean processor, operating system, or Emacs
implementation?

Whether a text is encoded as KOI8-R, UTF-8, or whatever has nothing to
do with chip and OS.  But you might consider UTF-8 for maximum
portability across Emacs implementations.

-- 
Kevin Rodgers
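
Where recode is not installed, the same KOI8-R to UTF-8 conversion can be
sketched with Python's built-in codecs.  This is only an illustration of
the idea, not a description of how recode works internally; the helper
name koi8r_to_utf8 is made up for this example.

```python
def koi8r_to_utf8(data: bytes) -> bytes:
    """Re-encode KOI8-R bytes as UTF-8 (hypothetical helper for illustration)."""
    # Decode with Python's built-in KOI8-R codec, then encode as UTF-8.
    text = data.decode("koi8_r")
    return text.encode("utf-8")


# Build a small KOI8-R sample from a Russian word, then convert it.
sample_koi8r = "привет".encode("koi8_r")
sample_utf8 = koi8r_to_utf8(sample_koi8r)

# KOI8-R uses one byte per Cyrillic letter; UTF-8 uses two.
print(len(sample_koi8r), len(sample_utf8))
```

The text itself is unchanged by the conversion; only the byte
representation differs, which is why the choice of encoding is
independent of processor and operating system.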


* Re: recommended russian encoding
  2004-07-14 17:43 ` Kevin Rodgers
@ 2004-07-15 13:59   ` Bruce Ingalls
  2004-07-15 15:45     ` Kevin Rodgers
  0 siblings, 1 reply; 8+ messages in thread
From: Bruce Ingalls @ 2004-07-15 13:59 UTC (permalink / raw)


Kevin Rodgers wrote:
> Whether a text is encoded as KOI8-R, UTF-8, or whatever has nothing to
> do with chip and OS.  But you might consider UTF-8 for maximum
> portability across Emacs implementations.

Thanks, UTF-8 does seem to work best, and it worked on w32.
It also worked in gedit on Linux.
However, when I tried opening the file in Emacs on Linux, the UTF-8 
encoded Russian characters displayed as garbage.

I am using the precompiled Emacs that is bundled with Fedora Core 2 
(latest RedHat). I assume that it has leim & mule bundled in.
How might I check what is going wrong?

BTW, koi8-r fails for both w32 & linux.


* Re: recommended russian encoding
  2004-07-15 13:59   ` Bruce Ingalls
@ 2004-07-15 15:45     ` Kevin Rodgers
  2004-07-16  4:57       ` Bruce Ingalls
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Rodgers @ 2004-07-15 15:45 UTC (permalink / raw)


Bruce Ingalls wrote:
 > Thanks, UTF-8 does seem to work best, and it worked on w32.
 > It also worked in gedit on Linux, as well.
 > However, when I tried opening the file in Emacs on Linux, the UTF-8
 > encoded Russian characters displayed as garbage.

Garbage?  Empty boxes would indicate that there's no available font, but
bogus glyphs indicate a problem with the encoding.  Did you visit the
file with `C-x RET c utf-8 RET C-x C-f'?

 > I am using the precompiled Emacs that is bundled with Fedora Core 2
 > (latest RedHat). I assume that it has leim & mule bundled in.
 > How might I check what is going wrong?

Use `C-h h' to display the HELLO file.  Is the Russian text correctly
displayed?  Does it make a difference if you invoke emacs with the
--font=fontset-standard option?  How about disabling any Fedora
customizations with --no-init-file --no-site-file?

 > BTW, koi8-r fails for both w32 & linux.

-- 
Kevin Rodgers
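
The distinction between "bogus glyphs" and "empty boxes" can be
illustrated outside Emacs.  The sketch below, using only Python's
standard codecs, shows what the encoding half of the problem looks like:
decoding UTF-8 bytes with the wrong codec produces wrong characters
rather than missing ones.  It says nothing about fonts, which is the
other half.

```python
original = "привет"
utf8_bytes = original.encode("utf-8")

# Reading the UTF-8 bytes with the wrong decoder yields wrong characters
# ("bogus glyphs"), not missing ones.
as_latin1 = utf8_bytes.decode("latin-1")
as_koi8r = utf8_bytes.decode("koi8_r")

# Reading them with the right decoder recovers the text.
as_utf8 = utf8_bytes.decode("utf-8")

print(repr(as_latin1))
print(repr(as_koi8r))
print(as_utf8)
```

If the characters are decoded correctly but still cannot be drawn, the
encoding is fine and the font is the thing to look at.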


* Re: recommended russian encoding
  2004-07-15 15:45     ` Kevin Rodgers
@ 2004-07-16  4:57       ` Bruce Ingalls
  2004-07-16 16:32         ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Bruce Ingalls @ 2004-07-16  4:57 UTC (permalink / raw)


Kevin Rodgers wrote:
> Bruce Ingalls wrote:
>  > However, when I tried opening the file in Emacs on Linux, the UTF-8
>  > encoded Russian characters displayed as garbage.
> 
> Garbage?  Empty boxes would indicate that there's no available font, but
Checked again; I am getting empty boxes.

> bogus glyphs indicate a problem with the encoding.  Did you visit the
> file with `C-x RET c utf-8 RET C-x C-f'?

I think you got the above syntax wrong.
I'd also like to know what elisp to eval, to invoke utf-8 encoding

> Use `C-h h' to display the HELLO file.  Is the Russian text correctly
> displayed?  Does it make a difference if you invoke emacs with the
> --font=fontset-standard option?  How about disabling any Fedora
> customizations with --no-init-file --no-site-file?

Disabling the startup file made no difference.
C-h h does display Russian "hello" properly.
Emacs also works fine in -nw mode in an xterm.

This seems to demonstrate that the font is available...


* Re: recommended russian encoding
  2004-07-16  4:57       ` Bruce Ingalls
@ 2004-07-16 16:32         ` Stefan Monnier
  2004-07-17 14:57           ` Bruce Ingalls
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2004-07-16 16:32 UTC (permalink / raw)


>> > However, when I tried opening the file in Emacs on Linux, the UTF-8
>> > encoded Russian characters displayed as garbage.
>> Garbage?  Empty boxes would indicate that there's no available font, but
> Checked again; I am getting empty boxes.

So it seems that Emacs correctly detected the utf-8 encoding but just
can't find the chars in the unicode font.
You can check with C-u C-x = when point is on one of those empty boxes.

>> bogus glyphs indicate a problem with the encoding.  Did you visit the
>> file with `C-x RET c utf-8 RET C-x C-f'?
> I think you got the above syntax wrong.

Care to tell us what was wrong?
[ Not that the C-x RET c utf-8 RET is necessary since Emacs seems to have
  figured that part on its own. ]

>> Use `C-h h' to display the HELLO file.  Is the Russian text correctly
>> displayed?  Does it make a difference if you invoke emacs with the
>> --font=fontset-standard option?  How about disabling any Fedora
>> customizations with --no-init-file --no-site-file?

> Disabling the startup file made no difference.
> C-h h does display Russian "hello" properly.
> Emacs also works fine in -nw mode in an xterm.

> This seems to demonstrate that the font is available...

You probably have a font for the koi-8 characters but not for the
russian unicode characters (and your Emacs doesn't realize that they are
the same).


        Stefan


* Re: recommended russian encoding
  2004-07-16 16:32         ` Stefan Monnier
@ 2004-07-17 14:57           ` Bruce Ingalls
  2004-07-18 15:17             ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Bruce Ingalls @ 2004-07-17 14:57 UTC (permalink / raw)


Stefan Monnier wrote:
>>>>However, when I tried opening the file in Emacs on Linux, the UTF-8
>>>>encoded Russian characters displayed as ... empty boxes.
> 
> So it seems that Emacs correctly detected the utf-8 encoding but just
> can't find the chars in the unicode font.
> You can check with C-u C-x = when point is on one of those empty boxes.
I get:
   character: а (01212120, 332880, 0x51450)
     charset: mule-unicode-0100-24ff
	     (Unicode characters of the range U+0100..U+24FF.)
  code point: 40 80
      syntax: word
    category: y:Cyrillic
buffer code: 0x9C 0xF4 0xA8 0xD0
   file code: 0xD0 0xB0 (encoded by coding system mule-utf-8)
        font: -Adobe-Courier-Medium-R-Normal--12-120-75-75-M-70-ISO10646-1

(the encoded char above might not have survived copy & pasting)

>>>bogus glyphs indicate a problem with the encoding.  Did you visit the
>>>file with `C-x RET c utf-8 RET C-x C-f'?
>>I think you got the above syntax wrong.
> 
> Care to tell us what was wrong?
I discovered that the better syntax, which worked for me, is:
	C-x C-m c utf-8 RET C-x C-f
-------------^
C-m is not the same as RET.

>>C-h h does display Russian "hello" properly.
>>Emacs also works fine in -nw mode in an xterm.
> 
>>This seems to demonstrate that the font is available...
> 
> Probably that you have a font for the koi-8 characters but not for the
> russian unicode characters (and your Emacs doesn't realize that they are
> the same).

That's it. koi-8-r encoding worked. Is there some way to tell Emacs to 
map the Cyrillic fonts that it has, to UTF-8 encodings?

Now, I'm going through all this again with XEmacs.
The precompiled v21.4.13 cygwin (XEmacs, not Cygwin distro) binary seems 
not to have LEIM/MULE support compiled in.
When I type 'C-x C-m c', the only completion choices are 'default' & 
'raw-text'
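
The `C-u C-x =' output above can be cross-checked with a short Python
sketch.  It confirms that file code 0xD0 0xB0 is the UTF-8 form of
Cyrillic small "a" and shows that the same letter is a different, single
byte in KOI8-R.  The last step is an assumption, not something stated in
the thread: it treats mule-unicode-0100-24ff as a 96x96 grid starting at
U+0100 with both coordinates offset by 32, which happens to reproduce
the "40 80" code point shown above.

```python
import unicodedata

char = "\u0430"  # the character reported by C-u C-x = above

name = unicodedata.name(char)
utf8 = char.encode("utf-8")    # compare with "file code: 0xD0 0xB0"
koi8r = char.encode("koi8_r")  # a single byte in KOI8-R

# Assumption: mule-unicode-0100-24ff is a 96x96 grid starting at U+0100,
# with both coordinates offset by 32.
offset = ord(char) - 0x100
code_point = (offset // 96 + 32, offset % 96 + 32)

print(name)
print(utf8.hex(), koi8r.hex(), code_point)
```

Same letter, two unrelated byte sequences: a font keyed to one
representation does not automatically serve the other unless something
maps between them.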


* Re: recommended russian encoding
  2004-07-17 14:57           ` Bruce Ingalls
@ 2004-07-18 15:17             ` Stefan Monnier
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2004-07-18 15:17 UTC (permalink / raw)


> I discovered that the better syntax, which worked for me, is:
> 	C-x C-m c utf-8 RET C-x C-f
> -------------^
> C-m is not the same as RET.

SPC, RET, and TAB (within an Emacs context) are notations for the ASCII
chars 32, 13, and 9, respectively.  So yes, RET is the same as C-m, which is
not necessarily the same as what you get when you hit the `return' key.


        Stefan
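
The ASCII arithmetic behind this can be checked with a small Python
sketch.  The helper ctrl is hypothetical and only models the usual ASCII
convention that a control character is the letter's code with the upper
bits masked off; it does not model how Emacs reads keys from a window
system.

```python
def ctrl(letter: str) -> int:
    """ASCII code of Control-<letter>: the letter's code masked to 5 bits."""
    return ord(letter.upper()) & 0x1F


ret = ord("\r")  # carriage return, written RET in Emacs notation
tab = ord("\t")  # horizontal tab, written TAB
spc = ord(" ")   # space, written SPC

print(ret, ctrl("m"), tab, ctrl("i"), spc)
```

So RET and C-m name the same ASCII character, and likewise TAB and C-i.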
