* Fwd: 23.0.50; can't input chinese punctuation on w32 platform
       [not found] <42b562540801260432h43921157k7d4034ddfff28862@mail.gmail.com>
@ 2008-01-26 15:04 ` Jason Rumney
  2008-02-01  7:30   ` Kenichi Handa
       [not found]   ` <47A8FEBA.8000304@gnu.org>
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Rumney @ 2008-01-26 15:04 UTC
To: emacs-pretest-bug@gnu.org; +Cc: yu jie

This seems to be a problem with mule-unicode-2500-33ff to gb2312
encoding. I doubt it is limited to w32.

yu jie wrote:
> Hi,
> I met a new Chinese related issue. When I try to save a buffer with
> Chinese period: 。, I meet a error message: gb2312-dos cannot encode
> these: 。I change the coding system to utf8, save the buffer and call
> revert-buffer, then set buffer encoding back to GB2312, and now I
> could save the buffer. The glyphs of the two period is different.
> Here's output of describe-char:
>
>     character: 。 (302786, #o1117302, #x49ec2, U+3002)
>       charset: mule-unicode-2500-33ff (Unicode characters of the range
>                U+2500..U+33FF.)
>    code point: #x3D #x42
>        syntax: w    which means: word
>   buffer code: #x9C #xF2 #xBD #xC2
>     file code: not encodable by coding system gb2312-dos
>       display: by this font (glyph code)
>      -outline-Consolas-normal-r-normal-normal-14-105-96-96-c-*-iso10646-1
>      (#x3002)
>
>     character: 。 (37027, #o110243, #x90a3, U+3002)
>       charset: chinese-gb2312 (GB2312 Chinese simplified: ISO-IR-58.)
>    code point: #x21 #x23
>        syntax: .    which means: punctuation
>      category: c:Chinese |:While filling, we can break a line at this
>                character.
>   buffer code: #x91 #xA1 #xA3
>     file code: #xA1 #xA3 (encoded by coding system gb2312-dos)
>       display: by this font (glyph code)
>      -outline-MS YaHei-normal-r-normal-normal-16-120-96-96-p-*-iso10646-1
>      (#x3002)
>
> Thanks.

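The two describe-char entries above show the same Unicode character, U+3002, stored under two different internal charsets, and only the chinese-gb2312 one has a file code. As a minimal sketch outside Emacs (using Python's built-in `gb2312` codec as a stand-in, which is an assumption of this example and not something the thread uses), the following checks that GB2312 itself does have a byte form for U+3002, and shows how the reported file code `#xA1 #xA3` relates to the code point `#x21 #x23`:

```python
# U+3002 IDEOGRAPHIC FULL STOP, the character from the report.
period = "\u3002"

# Python's "gb2312" codec produces the EUC-CN byte form of GB 2312.
file_code = period.encode("gb2312")

# EUC-CN sets the high bit on each 7-bit GB 2312 code point byte,
# so clearing that bit recovers the code point pair.
code_point = bytes(b & 0x7F for b in file_code)

print(" ".join(f"{b:02X}" for b in file_code))   # A1 A3
print(" ".join(f"{b:02X}" for b in code_point))  # 21 23
```

So the target encoding is not the obstacle; the thread goes on to attribute the failure to how Emacs 22 represents the character internally.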
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-01-26 15:04 ` Fwd: 23.0.50; can't input chinese punctuation on w32 platform Jason Rumney
@ 2008-02-01  7:30   ` Kenichi Handa
  2008-02-01 12:33     ` yu jie
  2008-02-01 14:07     ` Jason Rumney
  0 siblings, 2 replies; 9+ messages in thread
From: Kenichi Handa @ 2008-02-01  7:30 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, yujie052

In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:

> This seems to be a problem with mule-unicode-2500-33ff to gb2312
> encoding. I doubt it is limited to w32.

Right.  This is because of the limitation of Emacs 22's
Unicode handling.  If you want to handle U+3002, you have
to use UTF-* coding systems.

It will be fixed by Emacs 23.

---
Kenichi Handa
handa@ni.aist.go.jp

* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01  7:30   ` Kenichi Handa
@ 2008-02-01 12:33     ` yu jie
  2008-02-01 14:07     ` Jason Rumney
  1 sibling, 0 replies; 9+ messages in thread
From: yu jie @ 2008-02-01 12:33 UTC
To: Kenichi Handa; +Cc: emacs-pretest-bug, Jason Rumney

Thanks for your reply. But there is no such bug in Emacs 22.1.
Someone has mentioned that this bug is caused by w32term.c, and Mr. Handa
has discussed it with him in this newsgroup. :)

2008/2/1 Kenichi Handa <handa@ni.aist.go.jp>:

> In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org>
> writes:
>
> > This seems to be a problem with mule-unicode-2500-33ff to gb2312
> > encoding. I doubt it is limited to w32.
>
> Right.  This is because of the limitation of Emacs 22's
> Unicode handling.  If you want to handle U+3002, you have
> to use UTF-* coding systems.
>
> It will be fixed by Emacs 23.
>
> ---
> Kenichi Handa
> handa@ni.aist.go.jp

* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01  7:30   ` Kenichi Handa
  2008-02-01 12:33     ` yu jie
@ 2008-02-01 14:07     ` Jason Rumney
  2008-02-01 15:37       ` Eli Zaretskii
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Rumney @ 2008-02-01 14:07 UTC
To: Kenichi Handa; +Cc: emacs-pretest-bug, yujie052

Kenichi Handa wrote:
> In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
>
>> This seems to be a problem with mule-unicode-2500-33ff to gb2312
>> encoding. I doubt it is limited to w32.
>
> Right.  This is because of the limitation of Emacs 22's
> Unicode handling.  If you want to handle U+3002, you have
> to use UTF-* coding systems.
>
> It will be fixed by Emacs 23.

Meanwhile we need to handle keyboard input in Emacs 22.2 in a way that
is not any worse than 22.1.

AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
Thai and possibly Greek and Cyrillic are potentially problematic.
Am I correct in thinking that Latin character sets are not affected?
Is there a well defined range of unicode that does or doesn't support
conversion?

Doesn't pasting from the clipboard have the same problem?

* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 14:07     ` Jason Rumney
@ 2008-02-01 15:37       ` Eli Zaretskii
  2008-02-01 16:05         ` Jason Rumney
  0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2008-02-01 15:37 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, handa, yujie052

> Date: Fri, 01 Feb 2008 14:07:35 +0000
> From: Jason Rumney <jasonr@gnu.org>
> Cc: emacs-pretest-bug@gnu.org, yujie052@gmail.com
>
> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
> Thai and possibly Greek and Cyrillic are potentially problematic.

I thought mule-unicode-* covers Greek and Cyrillic quite well.
Doesn't it?

* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 15:37       ` Eli Zaretskii
@ 2008-02-01 16:05         ` Jason Rumney
  2008-02-02  1:37           ` YAMAMOTO Mitsuharu
  2008-02-02 10:22           ` Eli Zaretskii
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Rumney @ 2008-02-01 16:05 UTC
To: Eli Zaretskii; +Cc: emacs-pretest-bug, handa, yujie052

Eli Zaretskii wrote:
>> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
>> Thai and possibly Greek and Cyrillic are potentially problematic.
>
> I thought mule-unicode-* covers Greek and Cyrillic quite well.
> Doesn't it?

mule-unicode-* covers all the above. The issue is whether files
containing such characters can be written in the relevant non-UTF coding
systems.

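The question raised above, whether a given character can be written in a non-UTF coding system, has two parts: whether the target encoding has a byte form for the character at all, and whether Emacs 22 can translate its internal mule-unicode-* representation to it. The sketch below only probes the first part, using Python's standard codecs as a stand-in; the helper name `can_encode`, the sample characters, and the codec choices are all illustrative assumptions, not anything taken from Emacs:

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text has a byte form in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# (script, sample character, legacy codec): samples picked for illustration.
samples = [
    ("CJK punctuation", "\u3002", "gb2312"),
    ("Kana", "\u3042", "euc_jp"),
    ("Thai", "\u0e01", "tis_620"),
    ("Greek", "\u03b1", "iso8859_7"),
]

for script, ch, codec in samples:
    print(f"{script:16} U+{ord(ch):04X} {codec:10} {can_encode(ch, codec)}")
```

A `True` here only means the legacy encoding could hold the character; it does not show that Emacs 22 would map the mule-unicode-* form of that character into it.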
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 16:05         ` Jason Rumney
@ 2008-02-02  1:37           ` YAMAMOTO Mitsuharu
  2008-02-02 10:22           ` Eli Zaretskii
  1 sibling, 0 replies; 9+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-02-02  1:37 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, Eli Zaretskii, handa, yujie052

>>>>> On Fri, 01 Feb 2008 16:05:56 +0000, Jason Rumney <jasonr@gnu.org> said:

>>> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK
>>> radicals, Thai and possibly Greek and Cyrillic are potentially
>>> problematic.
>>
>> I thought mule-unicode-* covers Greek and Cyrillic quite well.
>> Doesn't it?

> mule-unicode-* covers all the above. The issue is whether files
> containing such characters can be written in the relevant non-UTF
> coding systems.

Even for Greek and Cyrillic, a user may want it mapped to
japanese-jisx0208 rather than mule-unicode-* in some situation.

FWIW, the Mac Carbon port handles Unicode keyboard events in the
following way:

  * ASCII character, with or without modifiers
    -> ASCII_KEYSTROKE_EVENT

  * Non-ASCII character with some modifiers
    -> MULTIBYTE_CHAR_KEYSTROKE_EVENT with either
       CHARSET_8_BIT_CONTROL, charset_latin_iso8859_1, or
       charset_mule_unicode_* code.

  * Non-ASCII character without any modifiers
    -> The event comes with some script/language information.  So we
       can distinguish mule-unicode-0100-24ff Greek from
       japanese-jisx0208 Greek in principle even though they have the
       same Unicode codepoint.  Likewise for CJK characters.  Because
       the usual Emacs keyboard events cannot carry such
       script/language information, we pack the raw Unicode text
       input data and the script/language info into a special event
       MAC_APPLE_EVENT instead.  Then it is decoded at the Lisp level.

(define-key special-event-map [mac-apple-event] 'mac-dispatch-apple-event)
(define-key mac-apple-event-map [text-input unicode-for-key-event]
  'mac-ts-unicode-for-key-event)

(defun mac-ts-unicode-for-key-event (event)
  "Convert Unicode key EVENT to Emacs key events and unread them."
  (interactive "e")
  (let* ((ae (mac-event-ae event))
         (text (cdr (mac-ae-parameter ae "tstx" "utxt")))
         (script-language (mac-ae-script-language ae "tssl"))
         (coding (or (cdr (assq (car script-language)
                                mac-script-code-coding-systems))
                     'mac-roman)))
    (if text
        (mac-unread-string (mac-utxt-to-string text coding)))))

As for the W32 port, Emacs 22.2 should avoid drastic changes in
general.  How about using the new code (i.e., mapping to
mule-unicode-* etc.) only for the with-modifier case, and leaving the
without-modifier case to encoded-kb as in Emacs 22.1?

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

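The three-way dispatch described in the message above can be modelled as a small decision function. This is only a toy sketch in Python: the function name `classify_key_event` and its parameters are hypothetical, and the real logic lives in the Carbon port's C and Lisp code; only the three event names are taken from the message.

```python
def classify_key_event(char: str, has_modifiers: bool) -> str:
    """Pick the event kind for one typed character, per the scheme above."""
    if ord(char) < 0x80:
        # ASCII goes through the ordinary path, modifiers or not.
        return "ASCII_KEYSTROKE_EVENT"
    if has_modifiers:
        # Non-ASCII with modifiers is delivered as a multibyte keystroke.
        return "MULTIBYTE_CHAR_KEYSTROKE_EVENT"
    # Non-ASCII without modifiers keeps its script/language info by
    # travelling as a special event that is decoded later in Lisp.
    return "MAC_APPLE_EVENT"


print(classify_key_event("a", has_modifiers=True))
print(classify_key_event("\u3002", has_modifiers=True))
print(classify_key_event("\u3002", has_modifiers=False))
```

The proposal for the W32 port at the end of the message follows the same split: use the new mapping only in the with-modifier branch and leave the last branch to the Emacs 22.1 behaviour.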
* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 16:05         ` Jason Rumney
  2008-02-02  1:37           ` YAMAMOTO Mitsuharu
@ 2008-02-02 10:22           ` Eli Zaretskii
  1 sibling, 0 replies; 9+ messages in thread
From: Eli Zaretskii @ 2008-02-02 10:22 UTC
To: Jason Rumney; +Cc: emacs-pretest-bug, handa, yujie052

> Date: Fri, 01 Feb 2008 16:05:56 +0000
> From: Jason Rumney <jasonr@gnu.org>
> CC: handa@ni.aist.go.jp, emacs-pretest-bug@gnu.org, yujie052@gmail.com
>
> Eli Zaretskii wrote:
> >> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
> >> Thai and possibly Greek and Cyrillic are potentially problematic.
> >
> > I thought mule-unicode-* covers Greek and Cyrillic quite well.
> > Doesn't it?
>
> mule-unicode-* covers all the above. The issue is whether files
> containing such characters can be written in the relevant non-UTF coding
> systems.

For Cyrillic, those would be windows-1251, koi8-r, and ISO-8859-5,
right?  If so, I think we have no problems here.  Likewise for Greek.

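For reference, the three Cyrillic coding systems named above all cover the basic Cyrillic letters but assign them different bytes. A minimal sketch, again using Python's codecs as a stand-in (`cp1251`, `koi8_r`, and `iso8859_5` are Python's names for them; the sample letter is an arbitrary choice for illustration):

```python
zhe = "\u0416"  # CYRILLIC CAPITAL LETTER ZHE

# Encode the same Unicode character in each legacy Cyrillic encoding.
codecs = ("cp1251", "koi8_r", "iso8859_5")
encoded = {codec: zhe.encode(codec) for codec in codecs}

for codec, data in encoded.items():
    # Each encoding round-trips the letter, but through a different byte.
    assert data.decode(codec) == zhe
    print(f"{codec:10} 0x{data.hex().upper()}")
```

This shows only that the encodings can represent the letter; whether Emacs 22 maps its mule-unicode-* form into each of them is the point being confirmed in the message above.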
[parent not found: <47A8FEBA.8000304@gnu.org>]
* Re: 23.0.50; can't input chinese punctuation on win32 platform
       [not found] ` <47A8FEBA.8000304@gnu.org>
@ 2008-02-06  9:57   ` Zhang Wei
  0 siblings, 0 replies; 9+ messages in thread
From: Zhang Wei @ 2008-02-06  9:57 UTC
To: Jason Rumney, emacs-devel

On 2/6/08, Jason Rumney <jasonr@gnu.org> wrote:
> yu jie wrote:
> > Hi,
> > I met a new Chinese related issue. When I try to save a buffer with
> > Chinese period: 。, I meet a error message: gb2312-dos cannot encode
> > these: 。
>
> Zhang Wei wrote:
>
> > When I save a file in gb2312 coding system, I got the following
> > compliant, all of the chinese punctuation characters can't be encoded
> > with gb2312:
>
> This problem should be fixed now. Thank you both for reporting it.

The bug has gone. Thank you.
