unofficial mirror of emacs-devel@gnu.org 
* Emacs 22.1.90 can't save chinese-gb2312 file
@ 2008-01-31 16:05 Zhang Wei
  2008-02-01  4:30 ` Zhang Wei
  0 siblings, 1 reply; 8+ messages in thread
From: Zhang Wei @ 2008-01-31 16:05 UTC (permalink / raw)
  To: emacs-pretest-bug, emacs-devel

When I save a file in the gb2312 coding system, I get the following
complaint: none of the Chinese punctuation characters can be encoded
with gb2312:
------------------------------------------------------------------------------
These default coding systems were tried to encode text
in the buffer `test':
  chinese-iso-8bit
However, each of them encountered characters it couldn't encode:
  chinese-iso-8bit cannot encode these: , 。 、 ?

Click on a character (or switch to this window by `C-x o'
and select the characters by RET) to jump to the place it appears,
where `C-u C-x =' will give information about it.

Select one of the safe coding systems listed below,
or cancel the writing with C-g and edit the buffer
   to remove or modify the problematic characters,
or specify any other coding system (and risk losing
   the problematic characters).

  utf-8 utf-16 utf-16 utf-16 utf-16be utf-16le iso-2022-7bit
--------------------------------------------------------------------------
C-u C-x = gives:
--------------------------------------------------------------------------
  character: 。 (302786, #o1117302, #x49ec2, U+3002)
    charset: mule-unicode-2500-33ff (Unicode characters of the range U+2500..U+33FF.)
 code point: #x3D #x42
     syntax: w 	which means: word
buffer code: #x9C #xF2 #xBD #xC2
  file code: not encodable by coding system chinese-iso-8bit
    display: by this font (glyph code)
     -outline-Courier New-normal-r-normal-normal-13-97-96-96-c-*-iso10646-1 (#x3002)

[back]
--------------------------------------------------------------------------
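The numbers in the `C-u C-x =' output above are mutually consistent, which can be checked with a little arithmetic. What follows is only a sketch in Python, not Emacs code. It assumes that mule-unicode-2500-33ff is laid out as a 96x96 grid with each position byte offset by 32, that the buffer form sets the high bit on each position byte after the two leading bytes shown in the dump, and that the character code packs (charset id - #xE0) and the two position bytes into 14/7/7-bit fields. The function names are invented for illustration, and the assumptions were chosen because they reproduce the dump, not because they were checked against the Emacs sources.

```python
# Sketch only: reproduces the numbers `C-u C-x =` reports for U+3002.
# The layout rules below are assumptions that happen to match the dump.

BASE = 0x2500                  # first code point in mule-unicode-2500-33ff
LAST = 0x33FF                  # last code point in that charset
LEADING_BYTES = (0x9C, 0xF2)   # copied from the "buffer code" line


def position_code(ucs):
    """Return the two position bytes, assuming a 96x96 grid offset by 32."""
    if not BASE <= ucs <= LAST:
        raise ValueError(f"U+{ucs:04X} is outside U+2500..U+33FF")
    row, col = divmod(ucs - BASE, 96)
    return (row + 32, col + 32)


def buffer_code(ucs):
    """Leading bytes from the dump, then position bytes with the high bit set."""
    c1, c2 = position_code(ucs)
    return bytes(LEADING_BYTES) + bytes((c1 | 0x80, c2 | 0x80))


def char_code(ucs):
    """Pack charset id and position bytes (assumed 14/7/7-bit layout)."""
    c1, c2 = position_code(ucs)
    charset_id = LEADING_BYTES[1]
    return ((charset_id - 0xE0) << 14) | (c1 << 7) | c2


if __name__ == "__main__":
    print([hex(b) for b in position_code(0x3002)])   # ['0x3d', '0x42']
    print(buffer_code(0x3002).hex(" "))              # 9c f2 bd c2
    print(char_code(0x3002), hex(char_code(0x3002))) # 302786 0x49ec2
```

Under these assumptions the code point, buffer code and character code lines of the dump all describe the same character, U+3002, stored in a Unicode-based charset rather than in chinese-gb2312.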





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-01-31 16:05 Emacs 22.1.90 can't save chinese-gb2312 file Zhang Wei
@ 2008-02-01  4:30 ` Zhang Wei
  2008-02-01  5:00   ` Kenichi Handa
  0 siblings, 1 reply; 8+ messages in thread
From: Zhang Wei @ 2008-02-01  4:30 UTC (permalink / raw)
  To: emacs-pretest-bug, emacs-devel, jasonr

On 2/1/08, Zhang Wei <id.brep@gmail.com> wrote:
> When I save a file in the gb2312 coding system, I get the following
> complaint: none of the Chinese punctuation characters can be encoded
> with gb2312:
>

I think this bug crept in with the changes to w32term.c since
2007-12-12; when I revert the changes to this file, the bug is gone.





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-02-01  4:30 ` Zhang Wei
@ 2008-02-01  5:00   ` Kenichi Handa
  2008-02-01  6:55     ` Zhang Wei
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Kenichi Handa @ 2008-02-01  5:00 UTC (permalink / raw)
  To: Zhang Wei; +Cc: emacs-pretest-bug, jasonr, emacs-devel

In article <ebaf065f0801312030x70d3fb7aj96368c58cb8763c7@mail.gmail.com>, "Zhang Wei" <id.brep@gmail.com> writes:


> I think this bug crept in with the changes to w32term.c since
> 2007-12-12; when I revert the changes to this file, the bug is gone.

Are you sure?  It's quite surprising that the code in
w32term.c affects encoding of characters.

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-02-01  5:00   ` Kenichi Handa
@ 2008-02-01  6:55     ` Zhang Wei
  2008-02-01  7:09     ` Zhang Wei
  2008-02-01  8:46     ` Jason Rumney
  2 siblings, 0 replies; 8+ messages in thread
From: Zhang Wei @ 2008-02-01  6:55 UTC (permalink / raw)
  To: Kenichi Handa, emacs-pretest-bug, emacs-devel

On 2/1/08, Kenichi Handa <handa@ni.aist.go.jp> wrote:

> Are you sure?  It's quite surprising that the code in
> w32term.c affects encoding of characters.

Yes, I'm sure. I checked out the w32term.c dated 2007-12-12, and the bug is gone.





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-02-01  5:00   ` Kenichi Handa
  2008-02-01  6:55     ` Zhang Wei
@ 2008-02-01  7:09     ` Zhang Wei
  2008-02-01  8:14       ` Kenichi Handa
  2008-02-01  8:46     ` Jason Rumney
  2 siblings, 1 reply; 8+ messages in thread
From: Zhang Wei @ 2008-02-01  7:09 UTC (permalink / raw)
  To: emacs-devel, emacs-pretest-bug

On 2/1/08, Kenichi Handa <handa@ni.aist.go.jp> wrote:

> Are you sure?  It's quite surprising that the code in
> w32term.c affects encoding of characters.

It seems that with the changes to w32term.c, Chinese punctuation
entered from the keyboard is represented internally in the
mule-unicode-2500-33ff charset, whereas without the changes those
characters are represented in the chinese-gb2312 charset, although
they look the same.





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-02-01  7:09     ` Zhang Wei
@ 2008-02-01  8:14       ` Kenichi Handa
  2008-02-01  8:38         ` Zhang Wei
  0 siblings, 1 reply; 8+ messages in thread
From: Kenichi Handa @ 2008-02-01  8:14 UTC (permalink / raw)
  To: Zhang Wei; +Cc: emacs-devel

In article <ebaf065f0801312309l5d3eb7bbob1659ce45bbbd4cf@mail.gmail.com>, "Zhang Wei" <id.brep@gmail.com> writes:

> On 2/1/08, Kenichi Handa <handa@ni.aist.go.jp> wrote:
> > Are you sure?  It's quite surprising that the code in
> > w32term.c affects encoding of characters.

> It seems that with the changes to w32term.c, Chinese punctuation
> entered from the keyboard is represented internally in the
> mule-unicode-2500-33ff charset, whereas without the changes those
> characters are represented in the chinese-gb2312 charset, although
> they look the same.

Ah, it seems that you are not using the builtin Chinese
input method of Emacs but an input method of Windows, right?

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-02-01  8:14       ` Kenichi Handa
@ 2008-02-01  8:38         ` Zhang Wei
  0 siblings, 0 replies; 8+ messages in thread
From: Zhang Wei @ 2008-02-01  8:38 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On 2/1/08, Kenichi Handa <handa@ni.aist.go.jp> wrote:

> Ah, it seems that you are not using the builtin Chinese
> input method of Emacs but an input method of Windows, right?

Yes, I use a Windows system input method.





* Re: Emacs 22.1.90 can't save chinese-gb2312 file
  2008-02-01  5:00   ` Kenichi Handa
  2008-02-01  6:55     ` Zhang Wei
  2008-02-01  7:09     ` Zhang Wei
@ 2008-02-01  8:46     ` Jason Rumney
  2 siblings, 0 replies; 8+ messages in thread
From: Jason Rumney @ 2008-02-01  8:46 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-pretest-bug, Zhang Wei, emacs-devel

Kenichi Handa wrote:
> In article <ebaf065f0801312030x70d3fb7aj96368c58cb8763c7@mail.gmail.com>, "Zhang Wei" <id.brep@gmail.com> writes:
>
>> I think this bug crept in with the changes to w32term.c since
>> 2007-12-12; when I revert the changes to this file, the bug is gone.
>
> Are you sure?  It's quite surprising that the code in
> w32term.c affects encoding of characters.


The change affects how keyboard input is handled. Before the change,
input was sent in gb2312 encoding one byte at a time, and encoded-kbd
was used to decode the input. After the change, input is sent in
mule-unicode-* charsets where possible.

So the punctuation has changed in representation from gb2312 to
mule-unicode-2500-33ff, while Hanzi is outside the range covered by the
mule-unicode charsets, so continues to be sent as gb2312.

Emacs knows how to do the conversion, since writing the file as utf-8,
then reading it back and writing it out as gb2312, is reported to work;
it just doesn't do the conversion in this case.
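The situation described above can be modelled outside Emacs. The Python sketch below is an analogy, not Emacs code: it tags each typed character with a charset name in the way the explanation describes, shows a strict encoder that rejects anything not tagged chinese-gb2312, and shows a hypothetical `unify` step that retags characters GB2312 can represent. All function names are invented for illustration, and Python's built-in `gb2312` codec stands in for the real conversion tables.

```python
# Analogy only: characters are modelled as (charset name, character) pairs.
GB2312 = "chinese-gb2312"
ASCII = "ascii"
MULE_UNICODE_RANGES = [
    (0x0100, 0x24FF, "mule-unicode-0100-24ff"),
    (0x2500, 0x33FF, "mule-unicode-2500-33ff"),
    (0xE000, 0xFFFF, "mule-unicode-e000-ffff"),
]


def tag_typed_char(ch, unicode_input):
    """Tag a typed character; unicode_input=True models the new behaviour."""
    cp = ord(ch)
    if cp < 0x80:
        return (ASCII, ch)
    if unicode_input:
        for first, last, name in MULE_UNICODE_RANGES:
            if first <= cp <= last:
                return (name, ch)
    # Hanzi fall outside the mule-unicode ranges, so they stay gb2312.
    return (GB2312, ch)


def encode_strict(tagged):
    """Encode to GB2312 bytes, refusing characters tagged with other charsets."""
    out = bytearray()
    for charset, ch in tagged:
        if charset not in (ASCII, GB2312):
            raise ValueError(f"cannot encode {ch!r} tagged as {charset}")
        out += ch.encode("gb2312")
    return bytes(out)


def unify(tagged):
    """Hypothetical fix: retag every character that GB2312 can represent."""
    result = []
    for charset, ch in tagged:
        try:
            ch.encode("gb2312")
        except UnicodeEncodeError:
            result.append((charset, ch))      # leave untranslatable chars alone
        else:
            result.append((ASCII if ord(ch) < 0x80 else GB2312, ch))
    return result


if __name__ == "__main__":
    text = "\u4e2d\u6587\u3002"   # two Hanzi followed by U+3002
    old = [tag_typed_char(c, unicode_input=False) for c in text]
    new = [tag_typed_char(c, unicode_input=True) for c in text]
    print(encode_strict(old).hex(" "))
    try:
        encode_strict(new)
    except ValueError as err:
        print("save fails:", err)
    print(encode_strict(unify(new)).hex(" "))
```

In this model only the punctuation changes tag, which mirrors the report: the Hanzi still save, U+3002 does not, and a retagging pass before encoding makes the save succeed again.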





