unofficial mirror of emacs-devel@gnu.org 
* cannot encode a buffer??
From: Sam Steingold @ 2003-06-24 14:52 UTC


GNU Emacs 21.3.50.1 (i386-msvc-nt5.0.2195)
 of 2003-06-16 on WINSTEINGOLDLAP
--with-msvc (12.00)

I have a file which starts with
";;; -*- coding: utf-8-unix -*-"

when I try to save it, I get this:

These default coding systems were tried to encode text
in the buffer `.bbdb':
  utf-8-unix
However, each of them encountered these problematic characters:
  utf-8-unix: 山 本 和 彦
The first problematic character is at point in the displayed buffer,
and C-u C-x = will give information about it.

Select one of the following safe coding systems, or edit the buffer:
  iso-2022-7bit compound-text-with-extensions
Or specify any other coding system
on your risk of losing the problematic characters.

clicking on the "problematic characters" gives:

  character: 山 (0156663, 56755, 0xddb3)
    charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87.)
 code point: 59 51
     syntax: w 	which means: word
   category: C:Chinese (Han) characters of 2-byte character sets   j:Japanese  
             |:While filling, we can break a line at this character.  
buffer code: 0x92 0xBB 0xB3
  file code: not encodable by coding system utf-8-unix
       font:
             -outline-MS Mincho-normal-r-normal-normal-13-97-96-96-c-70-jisx0208-sjis

now, when I type "utf-8-unix" at the "Select coding system (default
iso-2022-7bit):" prompt, the file is saved, but the next time I have to
save the file, the problem repeats, i.e., I am asked to enter
"utf-8-unix" &c.

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat9 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
Those who value Life above Freedom are destined to lose both.
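
The list of "safe coding systems" shown in the prompt above can also be queried directly. What follows is a minimal sketch, assuming `find-coding-systems-region` is available in the Emacs in use; the command name `my-safe-coding-systems` is hypothetical and not part of Emacs or of this thread.

```elisp
;; Hypothetical helper: report which coding systems can encode the
;; whole current buffer without loss.
(defun my-safe-coding-systems ()
  "Return the coding systems that can safely encode the current buffer.
The result is the list (undecided) when the buffer holds no
multibyte characters, i.e. when any coding system would do."
  (interactive)
  (let ((safe (find-coding-systems-region (point-min) (point-max))))
    (message "Safe coding systems: %S" safe)
    safe))
```

If the coding system named in the file's `coding:` cookie is missing from the returned list, saving will presumably trigger the prompt quoted in the report.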


* Re: cannot encode a buffer??
From: Kai Großjohann @ 2003-06-24 16:23 UTC


Sam Steingold <sds@gnu.org> writes:

> GNU Emacs 21.3.50.1 (i386-msvc-nt5.0.2195)
>  of 2003-06-16 on WINSTEINGOLDLAP
> --with-msvc (12.00)
>
> I have a file which starts with
> ";;; -*- coding: utf-8-unix -*-"
>
> when I try to save it, I get this:
>
> These default coding systems were tried to encode text
> in the buffer `.bbdb':
>   utf-8-unix
> However, each of them encountered these problematic characters:
>   utf-8-unix: 山 本 和 彦

Does it work after (utf-translate-cjk-mode 1)?
-- 
~/.signature
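
For an init file, the suggestion above can be written so that it does nothing on an Emacs that lacks the command. This is a minimal sketch, assuming the snapshot discussed here defines `utf-translate-cjk-mode` as a function; the `fboundp` guard is an addition and not part of the original advice.

```elisp
;; Enable translation between CJK charsets and Unicode for the utf-*
;; coding systems, but only if this Emacs actually provides the command.
(when (fboundp 'utf-translate-cjk-mode)
  (utf-translate-cjk-mode 1))
```

As a later message in this thread points out, the mode is off by default to save memory, so enabling it unconditionally trades memory for fewer prompts on save.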


* Re: cannot encode a buffer??
From: Sam Steingold @ 2003-06-24 19:17 UTC


> * In message <847k7bv40q.fsf@lucy.is.informatik.uni-duisburg.de>
> * On the subject of "Re: cannot encode a buffer??"
> * Sent on Tue, 24 Jun 2003 18:23:01 +0200
> * Honorable kai.grossjohann@gmx.net (Kai Großjohann) writes:
>
> Sam Steingold <sds@gnu.org> writes:
> 
> > GNU Emacs 21.3.50.1 (i386-msvc-nt5.0.2195)
> >  of 2003-06-16 on WINSTEINGOLDLAP
> > --with-msvc (12.00)
> >
> > I have a file which starts with
> > ";;; -*- coding: utf-8-unix -*-"
> >
> > when I try to save it, I get this:
> >
> > These default coding systems were tried to encode text
> > in the buffer `.bbdb':
> >   utf-8-unix
> > However, each of them encountered these problematic characters:
> >   utf-8-unix: 山 本 和 彦
> 
> Does it work after (utf-translate-cjk-mode 1)?

indeed it does.  thanks.

I wonder what is going on here: why did Emacs insist on asking me about
the new coding system on every save and then gleefully accepted the
coding system in the file header?

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat9 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
Daddy, why doesn't this magnet pick up this floppy disk?


* Re: cannot encode a buffer??
From: Kai Großjohann @ 2003-06-25 7:03 UTC


Sam Steingold <sds@gnu.org> writes:

> I wonder what is going on here: why did Emacs insist on asking me about
> the new coding system on every save and then gleefully accepted the
> coding system in the file header?

Yes, I almost overlooked that part.  It is indeed quite strange.  Any
Mule Meister here who can help out?
-- 
~/.signature


* Re: cannot encode a buffer??
From: Kenichi Handa @ 2003-06-25 7:52 UTC
  Cc: emacs-devel

In article <ur85j5lpg.fsf@gnu.org>, Sam Steingold <sds@gnu.org> writes:
>>  > I have a file which starts with
>>  > ";;; -*- coding: utf-8-unix -*-"
>>  >
>>  > when I try to save it, I get this:
>>  >
>>  > These default coding systems were tried to encode text
>>  > in the buffer `.bbdb':
>>  >   utf-8-unix
>>  > However, each of them encountered these problematic characters:
>>  >   utf-8-unix: 山 本 和 彦
>>  
>>  Does it work after (utf-translate-cjk-mode 1)?

> indeed it does.  thanks.

> I wonder what is going on here: why did Emacs insist on asking me about
> the new coding system on every save and then gleefully accepted the
> coding system in the file header?

It seems that when Emacs first read the file .bbdb, those
Japanese characters didn't exist, but they were inserted in
the buffer by yourself or automatically by some package in
your Emacs session.  Correct?

If utf-translate-cjk-mode is not turned on, those characters
can't be encoded by utf-8.  If you force saving them by
utf-8, the encoder generates a utf-8 byte sequence
corresponding to U+FFFD for each of them.  In this
situation, when you modify the buffer and try to save it,
Emacs again detects that those characters can't be handled
by utf-8, and thus asks you to select some other safe coding
system.  There's no mechanism to distinguish once saved
characters from the newly inserted characters.

---
Ken'ichi HANDA
handa@m17n.org
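
The check described above, in which Emacs finds characters that the target coding system cannot encode, can be reproduced by hand. This is a minimal sketch, assuming an Emacs that provides `unencodable-char-position` and `read-non-nil-coding-system` (not confirmed for the 21.3.50 snapshot in this thread); the command name `my-first-unencodable-char` is hypothetical.

```elisp
;; Hypothetical helper: locate the first character in the current
;; buffer that CODING-SYSTEM cannot encode.
(defun my-first-unencodable-char (coding-system)
  "Move point to the first character CODING-SYSTEM cannot encode.
Return its position, or nil if the whole buffer is encodable."
  (interactive (list (read-non-nil-coding-system "Coding system: ")))
  (let ((pos (unencodable-char-position (point-min) (point-max)
                                        coding-system)))
    (if pos
        (progn
          (goto-char pos)
          (message "Cannot encode %c at position %d"
                   (char-after pos) pos))
      (message "Every character is encodable by %s" coding-system))
    pos))
```

Because the buffer keeps the original characters even after a forced save, running such a check again before the next save would find the same positions, which matches the repeated prompt described in the report.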


* Re: cannot encode a buffer??
From: Stephen J. Turnbull @ 2003-06-25 11:22 UTC
  Cc: sds

>>>>> "Kenichi" == Kenichi Handa <handa@m17n.org> writes:

    Kenichi> If you force saving them by utf-8, the encoder generates
    Kenichi> a utf-8 byte sequence corresponding to U+FFFD for each of
    Kenichi> them.

Isn't that a violation of the Unicode standard?

I agree that the preferences of those who would rather that Emacs keep
the different flavors of Han different should be respected.  FWIW, I'd
default `utf-translate-cjk-mode' to on (to encourage development of a
Unicode-based way to disambiguate Unihan), but that does risk a lot of
annoyance for Asian polyglots.

However, if somebody insists on saving as UTF-8, the result should be
unification of Japanese to Chinese (which after all can be read, if
you can read both languages), not destruction of text.  I.e., it should
not be possible for Emacs to convert any JIS X 0208 character to
U+FFFD, ever.


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.


* Re: cannot encode a buffer??
From: Jason Rumney @ 2003-06-25 11:52 UTC
  Cc: emacs-devel

Stephen J. Turnbull wrote:

> I agree that the preferences of those who would rather that Emacs keep
> the different flavors of Han different should be respected.  FWIW, I'd
> default `utf-translate-cjk-mode' to on (to encourage development of a
> Unicode-based way to disambiguate Unihan), but that does risk a lot of
> annoyance for Asian polyglots.

I think the reason it is not on by default is not to avoid disagreement
over whether unification should happen (as unification can easily be 
avoided by not saving as utf-*), but to save memory.

There was talk about a month ago of automatically loading it when
required, but AFAIK this has not been implemented yet.
