* Fwd: 23.0.50; can't input chinese punctuation on w32 platform
[not found] <42b562540801260432h43921157k7d4034ddfff28862@mail.gmail.com>
@ 2008-01-26 15:04 ` Jason Rumney
2008-02-01 7:30 ` Kenichi Handa
[not found] ` <47A8FEBA.8000304@gnu.org>
1 sibling, 1 reply; 9+ messages in thread
From: Jason Rumney @ 2008-01-26 15:04 UTC (permalink / raw)
To: emacs-pretest-bug@gnu.org; +Cc: yu jie
This seems to be a problem with mule-unicode-2500-33ff to gb2312
encoding. I doubt it is limited to w32.
yu jie wrote:
> Hi,
> I ran into a new Chinese-related issue. When I try to save a buffer with
> the Chinese period 。, I get an error message: gb2312-dos cannot encode
> these: 。 If I change the coding system to utf8, save the buffer, call
> revert-buffer, and then set the buffer encoding back to GB2312, I
> can save the buffer. The glyphs of the two periods are different.
> Here's the output of describe-char:
>
> character: 。 (302786, #o1117302, #x49ec2, U+3002)
> charset: mule-unicode-2500-33ff (Unicode characters of the range
> U+2500..U+33FF.)
> code point: #x3D #x42
> syntax: w which means: word
> buffer code: #x9C #xF2 #xBD #xC2
> file code: not encodable by coding system gb2312-dos
> display: by this font (glyph code)
> -outline-Consolas-normal-r-normal-normal-14-105-96-96-c-*-iso10646-1
> (#x3002)
>
> character: 。 (37027, #o110243, #x90a3, U+3002)
> charset: chinese-gb2312 (GB2312 Chinese simplified: ISO-IR-58.)
> code point: #x21 #x23
> syntax: . which means: punctuation
> category: c:Chinese |:While filling, we can break a line at this
> character.
> buffer code: #x91 #xA1 #xA3
> file code: #xA1 #xA3 (encoded by coding system gb2312-dos)
> display: by this font (glyph code)
> -outline-MS YaHei-normal-r-normal-normal-16-120-96-96-p-*-iso10646-1
> (#x3002)
>
> Thanks.
>
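The encodability question raised in the report can also be asked from Lisp. The following is a minimal sketch, assuming Emacs 23 or later (where a character is a Unicode code point) and a build that defines the `gb2312` coding-system alias. The helper name `my-gb2312-encodable-p` is hypothetical and does not come from this thread.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-gb2312-encodable-p (char)
  "Return t if CHAR can be safely encoded by the `gb2312' coding system."
  ;; `encode-coding-char' returns nil when the coding system
  ;; cannot safely encode CHAR, and the encoded string otherwise.
  (and (encode-coding-char char 'gb2312) t))

;; U+3002 IDEOGRAPHIC FULL STOP, the character from the report.
(my-gb2312-encodable-p #x3002)
```

On the affected builds the answer depended on which charset the character had been read into, which is what the two describe-char outputs above show.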
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-01-26 15:04 ` Fwd: 23.0.50; can't input chinese punctuation on w32 platform Jason Rumney
@ 2008-02-01 7:30 ` Kenichi Handa
2008-02-01 12:33 ` yu jie
2008-02-01 14:07 ` Jason Rumney
0 siblings, 2 replies; 9+ messages in thread
From: Kenichi Handa @ 2008-02-01 7:30 UTC (permalink / raw)
To: Jason Rumney; +Cc: emacs-pretest-bug, yujie052
In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
> This seems to be a problem with mule-unicode-2500-33ff to gb2312
> encoding. I doubt it is limited to w32.
Right. This is because of the limitation of Emacs 22's
Unicode handling. If you want to handle U+3002, you have
to use UTF-* coding systems.
It will be fixed by Emacs 23.
---
Kenichi Handa
handa@ni.aist.go.jp
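The workaround of switching to a UTF-* coding system can be sketched as follows. This is an illustration only: the helper name `my-use-utf-8-if-needed` is hypothetical, and it relies on the standard functions `unencodable-char-position` and `set-buffer-file-coding-system`.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-use-utf-8-if-needed (coding-system)
  "Switch the buffer to UTF-8 if CODING-SYSTEM cannot encode all of its text.
Return the position of the first unencodable character, or nil."
  (let ((pos (unencodable-char-position (point-min) (point-max)
                                        coding-system)))
    ;; Only change the file coding system when something would be lost.
    (when pos
      (set-buffer-file-coding-system 'utf-8))
    pos))
```

Called as `(my-use-utf-8-if-needed 'gb2312)` before saving, it leaves buffers that gb2312 can represent untouched.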
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-02-01 7:30 ` Kenichi Handa
@ 2008-02-01 12:33 ` yu jie
2008-02-01 14:07 ` Jason Rumney
1 sibling, 0 replies; 9+ messages in thread
From: yu jie @ 2008-02-01 12:33 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-pretest-bug, Jason Rumney
Thanks for your reply.
But there is no such bug in Emacs 22.1.
Someone has mentioned that this bug is caused by w32term.c, and Mr. Handa has
discussed it with him in this newsgroup. :)...
2008/2/1 Kenichi Handa <handa@ni.aist.go.jp>:
> In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org>
> writes:
>
> > This seems to be a problem with mule-unicode-2500-33ff to gb2312
> > encoding. I doubt it is limited to w32.
>
> Right. This is because of the limitation of Emacs 22's
> Unicode handling. If you want to handle U+3002, you have
> to use UTF-* coding systems.
>
> It will be fixed by Emacs 23.
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-02-01 7:30 ` Kenichi Handa
2008-02-01 12:33 ` yu jie
@ 2008-02-01 14:07 ` Jason Rumney
2008-02-01 15:37 ` Eli Zaretskii
1 sibling, 1 reply; 9+ messages in thread
From: Jason Rumney @ 2008-02-01 14:07 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-pretest-bug, yujie052
Kenichi Handa wrote:
> In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
>
>
>> This seems to be a problem with mule-unicode-2500-33ff to gb2312
>> encoding. I doubt it is limited to w32.
>>
>
> Right. This is because of the limitation of Emacs 22's
> Unicode handling. If you want to handle U+3002, you have
> to use UTF-* coding systems.
>
> It will be fixed by Emacs 23.
>
Meanwhile we need to handle keyboard input in Emacs 22.2 in a way that
is not any worse than 22.1.
AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
Thai and possibly Greek and Cyrillic are potentially problematic. Am I
correct in thinking that Latin character sets are not affected?
Is there a well-defined range of Unicode that does or doesn't support
conversion? Doesn't pasting from the clipboard have the same problem?
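One way to explore the range question empirically is to probe a block of code points against a coding system. This is a minimal sketch, assuming Emacs 23 or later, where characters are Unicode code points. The function name `my-unencodable-in-range` is hypothetical, and the result describes the running Emacs, not the Emacs 22 behaviour discussed here.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-unencodable-in-range (from to coding-system)
  "Return the code points in FROM..TO that CODING-SYSTEM cannot safely encode."
  (let (result)
    (dotimes (i (1+ (- to from)))
      (let ((ch (+ from i)))
        ;; `encode-coding-char' returns nil for unencodable characters.
        (unless (encode-coding-char ch coding-system)
          (push ch result))))
    (nreverse result)))

;; CJK Symbols and Punctuation block, U+3000..U+303F, against gb2312.
(my-unencodable-in-range #x3000 #x303F 'gb2312)
```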
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-02-01 14:07 ` Jason Rumney
@ 2008-02-01 15:37 ` Eli Zaretskii
2008-02-01 16:05 ` Jason Rumney
0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2008-02-01 15:37 UTC (permalink / raw)
To: Jason Rumney; +Cc: emacs-pretest-bug, handa, yujie052
> Date: Fri, 01 Feb 2008 14:07:35 +0000
> From: Jason Rumney <jasonr@gnu.org>
> Cc: emacs-pretest-bug@gnu.org, yujie052@gmail.com
>
> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
> Thai and possibly Greek and Cyrillic are potentially problematic.
I thought mule-unicode-* covers Greek and Cyrillic quite well.
Doesn't it?
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-02-01 15:37 ` Eli Zaretskii
@ 2008-02-01 16:05 ` Jason Rumney
2008-02-02 1:37 ` YAMAMOTO Mitsuharu
2008-02-02 10:22 ` Eli Zaretskii
0 siblings, 2 replies; 9+ messages in thread
From: Jason Rumney @ 2008-02-01 16:05 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-pretest-bug, handa, yujie052
Eli Zaretskii wrote:
>> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
>> Thai and possibly Greek and Cyrillic are potentially problematic.
>>
>
> I thought mule-unicode-* covers Greek and Cyrillic quite well.
> Doesn't it?
>
mule-unicode-* covers all the above. The issue is whether files
containing such characters can be written in the relevant non-UTF coding
systems.
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-02-01 16:05 ` Jason Rumney
@ 2008-02-02 1:37 ` YAMAMOTO Mitsuharu
2008-02-02 10:22 ` Eli Zaretskii
1 sibling, 0 replies; 9+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-02-02 1:37 UTC (permalink / raw)
To: Jason Rumney; +Cc: emacs-pretest-bug, Eli Zaretskii, handa, yujie052
>>>>> On Fri, 01 Feb 2008 16:05:56 +0000, Jason Rumney <jasonr@gnu.org> said:
>>> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK
>>> radicals, Thai and possibly Greek and Cyrillic are potentially
>>> problematic.
>>
>> I thought mule-unicode-* covers Greek and Cyrillic quite well.
>> Doesn't it?
>
> mule-unicode-* covers all the above. The issue is whether files
> containing such characters can be written in the relevant non-UTF
> coding systems.
Even for Greek and Cyrillic, a user may want them mapped to
japanese-jisx0208 rather than mule-unicode-* in some situations.
FWIW, the Mac Carbon port handles Unicode keyboard events in the
following way:
* ASCII character, with or without modifiers
-> ASCII_KEYSTROKE_EVENT
* Non-ASCII character with some modifiers
-> MULTIBYTE_CHAR_KEYSTROKE_EVENT with either CHARSET_8_BIT_CONTROL,
charset_latin_iso8859_1, or charset_mule_unicode_* code.
* Non-ASCII character without any modifiers
-> The event comes with some script/language information. So we can
distinguish mule-unicode-0100-24ff Greek from japanese-jisx0208
Greek in principle even though they have the same Unicode
codepoint. Likewise for CJK characters. Because the usual Emacs
keyboard events cannot carry such script/language information, we
pack the raw Unicode text input data and the script/language info
into a special event MAC_APPLE_EVENT instead. Then it is decoded
at the Lisp level.
(define-key special-event-map [mac-apple-event] 'mac-dispatch-apple-event)

(define-key mac-apple-event-map [text-input unicode-for-key-event]
  'mac-ts-unicode-for-key-event)

(defun mac-ts-unicode-for-key-event (event)
  "Convert Unicode key EVENT to Emacs key events and unread them."
  (interactive "e")
  (let* ((ae (mac-event-ae event))
         (text (cdr (mac-ae-parameter ae "tstx" "utxt")))
         (script-language (mac-ae-script-language ae "tssl"))
         (coding (or (cdr (assq (car script-language)
                                mac-script-code-coding-systems))
                     'mac-roman)))
    (if text
        (mac-unread-string (mac-utxt-to-string text coding)))))
As for the W32 port, Emacs 22.2 should avoid drastic changes in
general. How about using the new code (i.e., mapping to
mule-unicode-* etc.) only for the with-modifier case, and leaving the
without-modifier case to encoded-kb as in Emacs 22.1?
YAMAMOTO Mitsuharu
mituharu@math.s.chiba-u.ac.jp
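The three-way split described above can be summarised as a small classifier. This is only an illustration of the case analysis, not code from the Mac Carbon or W32 ports. The function name and the returned symbols are hypothetical.

```elisp
;; Hypothetical sketch of the case analysis only; the real ports
;; generate keyboard events in C, not through a function like this.
(defun my-classify-key-input (char modifiers)
  "Classify a key press by CHAR and its list of MODIFIERS.
Return `ascii', `multibyte-with-modifiers', or `text-input'."
  (cond
   ;; ASCII character, with or without modifiers.
   ((< char 128) 'ascii)
   ;; Non-ASCII character with some modifiers.
   (modifiers 'multibyte-with-modifiers)
   ;; Non-ASCII character without modifiers: decoded later using
   ;; script/language information (Carbon) or encoded-kb (Emacs 22.1 on W32).
   (t 'text-input)))
```

Under the proposal for 22.2, only the second branch would use the new mule-unicode-* mapping on W32.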
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
2008-02-01 16:05 ` Jason Rumney
2008-02-02 1:37 ` YAMAMOTO Mitsuharu
@ 2008-02-02 10:22 ` Eli Zaretskii
1 sibling, 0 replies; 9+ messages in thread
From: Eli Zaretskii @ 2008-02-02 10:22 UTC (permalink / raw)
To: Jason Rumney; +Cc: emacs-pretest-bug, handa, yujie052
> Date: Fri, 01 Feb 2008 16:05:56 +0000
> From: Jason Rumney <jasonr@gnu.org>
> CC: handa@ni.aist.go.jp, emacs-pretest-bug@gnu.org, yujie052@gmail.com
>
> Eli Zaretskii wrote:
> >> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
> >> Thai and possibly Greek and Cyrillic are potentially problematic.
> >>
> >
> > I thought mule-unicode-* covers Greek and Cyrillic quite well.
> > Doesn't it?
> >
>
> mule-unicode-* covers all the above. The issue is whether files
> containing such characters can be written in the relevant non-UTF coding
> systems.
For Cyrillic, those would be windows-1251, koi8-r, and ISO-8859-5,
right? If so, I think we have no problems here. Likewise for Greek.
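A quick way to check this for one Cyrillic letter is sketched below, assuming Emacs 23 or later and that the three coding-system names are defined in the running build. The variable name `my-cyrillic-check` is hypothetical.

```elisp
;; Hypothetical check, for illustration only.
;; U+0416 is CYRILLIC CAPITAL LETTER ZHE.
(defvar my-cyrillic-check
  (mapcar (lambda (cs)
            ;; Each entry is (CODING-SYSTEM . ENCODED-STRING-OR-NIL);
            ;; nil would mean the coding system cannot encode the letter.
            (cons cs (encode-coding-char #x0416 cs)))
          '(windows-1251 koi8-r iso-8859-5))
  "Result of encoding U+0416 in three Cyrillic coding systems.")
```

The same probe with Greek coding systems would cover the other half of the claim.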
* Re: 23.0.50; can't input chinese punctuation on win32 platform
[not found] ` <47A8FEBA.8000304@gnu.org>
@ 2008-02-06 9:57 ` Zhang Wei
0 siblings, 0 replies; 9+ messages in thread
From: Zhang Wei @ 2008-02-06 9:57 UTC (permalink / raw)
To: Jason Rumney, emacs-devel
On 2/6/08, Jason Rumney <jasonr@gnu.org> wrote:
> yu jie wrote:
> > Hi,
> > I ran into a new Chinese-related issue. When I try to save a buffer with
> > the Chinese period 。, I get an error message: gb2312-dos cannot encode
> > these: 。
>
> Zhang Wei wrote:
>
> > When I save a file in the gb2312 coding system, I get the following
> > complaint: none of the Chinese punctuation characters can be encoded
> > with gb2312:
>
> This problem should be fixed now. Thank you both for reporting it.
>
The bug has gone. Thank you.