unofficial mirror of help-gnu-emacs@gnu.org
* problem with mule-utf-8 ?
@ 2003-09-06 14:59 Pascal Bourguignon
  2003-09-06 20:17 ` Eli Zaretskii
  2003-09-07 11:49 ` Alex Schroeder
  0 siblings, 2 replies; 6+ messages in thread
From: Pascal Bourguignon @ 2003-09-06 14:59 UTC



Hello, 

I'm trying to make a UTF-8 file containing some katakana characters.

So, I take a new buffer, type M-x set-input-method RET japanese-katakana RET
a SPACE e SPACE i SPACE o SPACE C-x C-s, and then it says:

------------------------------------------------------------------------
These default coding systems were tried to encode text
in the buffer `test':
  utf-8 iso-latin-1
However, each of them encountered these problematic characters:
  iso-latin-1: ア エ イ オ ウ ヤ
  utf-8: ア エ イ オ ウ ヤ
The first problematic character is at point in the displayed buffer,
and C-u C-x = will give information about it.

Select one of the following safe coding systems, or edit the buffer:
  euc-jp shift_jis iso-2022-jp japanese-iso-7bit-1978-irv
  iso-2022-7bit
Or specify any other coding system
on your risk of losing the problematic characters.
------------------------------------------------------------------------

So, what's the matter? I thought that Unicode included all the
characters, and that UTF-8 was able to transcribe all Unicode
characters, or not?

Of course, I insist and save it with utf-8 encoding, then when I load
this utf-8 file later, I get rectangle frames instead of katakana...


Is this a problem with mule, or is it with unicode?


-- 
__Pascal_Bourguignon__                   http://www.informatimago.com/
----------------------------------------------------------------------
Do not adjust your mind, there is a fault in reality.


* Re: problem with mule-utf-8 ?
  2003-09-06 14:59 problem with mule-utf-8 ? Pascal Bourguignon
@ 2003-09-06 20:17 ` Eli Zaretskii
  2003-09-08  5:47   ` Janusz S. Bień
  2003-09-07 11:49 ` Alex Schroeder
  1 sibling, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2003-09-06 20:17 UTC


> From: Pascal Bourguignon <spam@thalassa.informatimago.com>
> Newsgroups: gnu.emacs.help
> Date: 06 Sep 2003 16:59:24 +0200
> 
> So, what's the matter? I thought that Unicode included all the
> characters, and that UTF-8 was able to transcribe all Unicode
> characters, or not?
>
> Of course, I insist and save it with utf-8 encoding, then when I load
> this utf-8 file later, I get rectangle frames instead of katakana...

In what version of Emacs?  All released versions of Emacs support
only part of the BMP.  Specifically, these ranges of Unicode
codepoints are supported:

  0100..33ff
  e000..ffff

If katakana characters are not in these ranges, you cannot have them
in unicode.

The CVS version of Emacs can (I think) convert katakana to Unicode
when encoding and decoding text.  I also think you can have this with
released versions if you install ucs-tables (look on
gnu.emacs.sources).
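
A quick way to check the ranges quoted above against the characters
from the error message is sketched below. It is Python, not Emacs
Lisp, and the names SUPPORTED_RANGES and in_supported_range are made
up for illustration. The katakana block is U+30A0..U+30FF, so the six
characters should all land inside 0100..33ff. If so, the code points
themselves are probably not the obstacle, which would be consistent
with the explanation elsewhere in this thread that characters typed
through the Japanese input methods are not converted to Unicode by
released versions.

```python
# Ranges quoted in the message above (hex code points, inclusive).
SUPPORTED_RANGES = [(0x0100, 0x33FF), (0xE000, 0xFFFF)]

# The characters listed in the error message of the original report.
KATAKANA = "アエイオウヤ"


def in_supported_range(ch):
    """Return True if the code point of ch lies in one of the quoted ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SUPPORTED_RANGES)


# Print each character with its code point and the result of the check.
for ch in KATAKANA:
    print(f"U+{ord(ch):04X} {ch} in quoted ranges: {in_supported_range(ch)}")
```

This only tests the numeric ranges; it says nothing about how Emacs
maps characters from its Japanese charsets onto those code points.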


* Re: problem with mule-utf-8 ?
  2003-09-06 14:59 problem with mule-utf-8 ? Pascal Bourguignon
  2003-09-06 20:17 ` Eli Zaretskii
@ 2003-09-07 11:49 ` Alex Schroeder
  2003-09-09  0:08   ` Pascal Bourguignon
  1 sibling, 1 reply; 6+ messages in thread
From: Alex Schroeder @ 2003-09-07 11:49 UTC


Pascal Bourguignon <spam@thalassa.informatimago.com> writes:

> Is this a problem with mule, or is it with unicode?

Emacs doesn't do Han-Unification by default.  The CVS version has code
to do that, but until then, if you type Japanese characters using the
Japanese input methods, they can only be represented in the iso-2022
encodings, or in specific Japanese encodings such as euc-jp.
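
To make the difference between these encodings concrete, here is a
minimal sketch in Python (not Emacs). The codec names are Python's,
and I am assuming they correspond to the similarly named Emacs coding
systems; TEXT, CODECS and encode_all are illustrative names. It
encodes the same katakana string with each codec and checks that it
decodes back unchanged. UTF-8 as an encoding has no trouble with
these characters, so the limitation described above lies in how Emacs
handles its Japanese charsets, not in UTF-8 itself.

```python
# Katakana typed in the original report.
TEXT = "アエイオ"

# Python codec names; assumed to match the Emacs coding systems of similar name.
CODECS = ("utf-8", "euc_jp", "shift_jis", "iso2022_jp")


def encode_all(text):
    """Encode text with every codec and verify it decodes back unchanged."""
    result = {}
    for codec in CODECS:
        data = text.encode(codec)
        if data.decode(codec) != text:
            raise ValueError(f"{codec} did not round-trip")
        result[codec] = data
    return result


# Show the byte length and hex dump of the string under each codec.
for codec, data in encode_all(TEXT).items():
    print(f"{codec:12} {len(data):2d} bytes  {data.hex()}")
```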

Alex.
-- 
http://www.emacswiki.org/alex/
There is no substitute for experience.


* Re: problem with mule-utf-8 ?
  2003-09-06 20:17 ` Eli Zaretskii
@ 2003-09-08  5:47   ` Janusz S. Bień
  2003-09-08 14:24     ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Janusz S. Bień @ 2003-09-08  5:47 UTC


On Sat, 06 Sep 2003  "Eli Zaretskii" <eliz@elta.co.il> wrote:

> > From: Pascal Bourguignon <spam@thalassa.informatimago.com>
> > Newsgroups: gnu.emacs.help
> > Date: 06 Sep 2003 16:59:24 +0200
> > 
> > So, what's the matter? I thought that Unicode included all the
> > characters, and that UTF-8 was able to transcribe all Unicode
> > characters, or not?
> >
> > Of course, I insist and save it with utf-8 encoding, then when I load
> > this utf-8 file later, I get rectangle frames instead of katakana...
> 
> In what version of Emacs?  All released versions of Emacs support
> only part of the BMP.  Specifically, these ranges of Unicode
> codepoints are supported:
> 
>   0100..33ff
>   e000..ffff
> 
> If katakana characters are not in these ranges, you cannot have them
> in unicode.

You can extend Unicode support with Mule-UCS. Here is an excerpt
from the output of Help -> Find Extra Packages:

 * Mule-UCS: Universal Encoding System:
   <URL:ftp://ftp.m17n.org/pub/mule/Mule-UCS/>
   Extended coding systems for Mule, specifically for reading and
   writing UTF-8 encoded Unicode.  This does more than the built-in
   utf-8 coding system, specifically covering a lot of Far Eastern
   characters.  (See the entry in PROBLEMS concerning slow startup of
   Mule-UCS.)


> 
> The CVS version of Emacs can (I think) convert katakana to Unicode
> when encoding and decoding text.  I also think you can have this with
> released versions if you install ucs-tables (look on
> gnu.emacs.sources).

If you are going to use CVS, you may as well try emacs-unicode :-).

Regards

Janusz

-- 
dr hab. Janusz S. Bień, prof. UW
Prof. Janusz S. Bień, Warsaw University
jsbien@mimuw.edu.pl, jsbien@uw.edu.pl
http://www.orient.uw.edu.pl/~jsbien/
http://www.mimuw.edu.pl/~jsbien/


* Re: problem with mule-utf-8 ?
  2003-09-08  5:47   ` Janusz S. Bień
@ 2003-09-08 14:24     ` Eli Zaretskii
  0 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2003-09-08 14:24 UTC


> From: jsbien@mimuw.edu.pl (Janusz S. Bień)
> Date: 08 Sep 2003 07:47:24 +0200
> 
> If you are going to use CVS, you may as well try emacs-unicode :-).

AFAIK, the unicode branch is not yet stable enough to warrant such
advice, but perhaps my information is outdated.


* Re: problem with mule-utf-8 ?
  2003-09-07 11:49 ` Alex Schroeder
@ 2003-09-09  0:08   ` Pascal Bourguignon
  0 siblings, 0 replies; 6+ messages in thread
From: Pascal Bourguignon @ 2003-09-09  0:08 UTC


Alex Schroeder <alex@emacswiki.org> writes:

> Pascal Bourguignon <spam@thalassa.informatimago.com> writes:
> 
> > Is this a problem with mule, or is it with unicode?
> 
> Emacs doesn't do Han-Unification by default.  The CVS version has code
> to do that, but until then, if you type Japanese characters using the
> Japanese input methods, they can only be represented in the iso-2022
> encodings, or in specific Japanese encodings such as euc-jp.

I see. Thank you.

-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
Do not adjust your mind, there is a fault in reality.
