unofficial mirror of emacs-devel@gnu.org 
* Fwd: 23.0.50; can't input chinese punctuation on w32 platform
       [not found] <42b562540801260432h43921157k7d4034ddfff28862@mail.gmail.com>
@ 2008-01-26 15:04 ` Jason Rumney
  2008-02-01  7:30   ` Kenichi Handa
       [not found] ` <47A8FEBA.8000304@gnu.org>
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Rumney @ 2008-01-26 15:04 UTC (permalink / raw)
  To: emacs-pretest-bug@gnu.org; +Cc: yu jie

This seems to be a problem with mule-unicode-2500-33ff to gb2312
encoding. I doubt it is limited to w32.

yu jie wrote:
> Hi,
> I ran into a new Chinese-related issue. When I try to save a buffer
> containing the Chinese period 。, I get an error message: gb2312-dos
> cannot encode these: 。 If I change the coding system to utf8, save the
> buffer, call revert-buffer, and then set the buffer encoding back to
> GB2312, I can save the buffer. The glyphs of the two periods are different.
> Here's the output of describe-char:
>
> character: 。 (302786, #o1117302, #x49ec2, U+3002)
> charset: mule-unicode-2500-33ff (Unicode characters of the range
> U+2500..U+33FF.)
> code point: #x3D #x42
> syntax: w which means: word
> buffer code: #x9C #xF2 #xBD #xC2
> file code: not encodable by coding system gb2312-dos
> display: by this font (glyph code)
> -outline-Consolas-normal-r-normal-normal-14-105-96-96-c-*-iso10646-1
> (#x3002)
>
> character: 。 (37027, #o110243, #x90a3, U+3002)
> charset: chinese-gb2312 (GB2312 Chinese simplified: ISO-IR-58.)
> code point: #x21 #x23
> syntax: . which means: punctuation
> category: c:Chinese |:While filling, we can break a line at this
> character.
> buffer code: #x91 #xA1 #xA3
> file code: #xA1 #xA3 (encoded by coding system gb2312-dos)
> display: by this font (glyph code)
> -outline-MS YaHei-normal-r-normal-normal-16-120-96-96-p-*-iso10646-1
> (#x3002)
>
> Thanks.
>
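
The numbers in the two describe-char outputs above can be checked by hand.
The following is only a sketch in Python, with Python's "gb2312" codec
standing in for Emacs's gb2312 coding system. The helper name
mule_unicode_2500_33ff and the 96x96 layout it assumes are illustrative; the
layout is inferred from the reported code point #x3D #x42, not taken from the
Emacs sources.

```python
# Illustrative only: Python codecs stand in for Emacs coding systems.
FULL_STOP = "\u3002"  # IDEOGRAPHIC FULL STOP, the character in the report

# GB2312 file code (EUC-CN form): the code-point bytes with the high bit set.
gb_file_code = FULL_STOP.encode("gb2312")
gb_code_point = bytes(b & 0x7F for b in gb_file_code)


def mule_unicode_2500_33ff(ucs):
    """Split a code in U+2500..U+33FF into two bytes (assumed 96x96 layout)."""
    if not 0x2500 <= ucs <= 0x33FF:
        raise ValueError("outside U+2500..U+33FF")
    high, low = divmod(ucs - 0x2500, 96)
    return (high + 32, low + 32)


def as_hex(values):
    """Format a sequence of byte values the way describe-char prints them."""
    return " ".join("#x%02X" % v for v in values)


print("gb2312 file code: ", as_hex(gb_file_code))
print("gb2312 code point:", as_hex(gb_code_point))
print("mule-unicode code:", as_hex(mule_unicode_2500_33ff(0x3002)))
```

Under these assumptions the GB2312 file code is the code point #x21 #x23 with
#x80 added to each byte, and (#x3002 - #x2500) split by 96 gives #x3D #x42,
which matches both outputs. Both entries therefore describe the same U+3002
stored in two different Emacs charsets.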

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-01-26 15:04 ` Fwd: 23.0.50; can't input chinese punctuation on w32 platform Jason Rumney
@ 2008-02-01  7:30   ` Kenichi Handa
  2008-02-01 12:33     ` yu jie
  2008-02-01 14:07     ` Jason Rumney
  0 siblings, 2 replies; 9+ messages in thread
From: Kenichi Handa @ 2008-02-01  7:30 UTC (permalink / raw)
  To: Jason Rumney; +Cc: emacs-pretest-bug, yujie052

In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:

> This seems to be a problem with mule-unicode-2500-33ff to gb2312
> encoding. I doubt it is limited to w32.

Right.  This is because of the limitation of Emacs 22's
Unicode handling.  If you want to handle U+3002, you have
to use UTF-* coding systems.

It will be fixed by Emacs 23.
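
For reference, here is what U+3002 looks like in the UTF-* encodings. This
is a small Python sketch, not Emacs code; the variable names are
illustrative.

```python
# Illustrative only: show the UTF-8 and UTF-16 (big-endian) bytes of U+3002.
FULL_STOP = "\u3002"

utf8_bytes = FULL_STOP.encode("utf-8")
utf16_bytes = FULL_STOP.encode("utf-16-be")

print("utf-8:    ", " ".join("%02X" % b for b in utf8_bytes))
print("utf-16-be:", " ".join("%02X" % b for b in utf16_bytes))

# Both encodings round-trip the character without loss.
assert utf8_bytes.decode("utf-8") == FULL_STOP
assert utf16_bytes.decode("utf-16-be") == FULL_STOP
```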

---
Kenichi Handa
handa@ni.aist.go.jp


* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01  7:30   ` Kenichi Handa
@ 2008-02-01 12:33     ` yu jie
  2008-02-01 14:07     ` Jason Rumney
  1 sibling, 0 replies; 9+ messages in thread
From: yu jie @ 2008-02-01 12:33 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-pretest-bug, Jason Rumney


Thanks for your reply.
But there is no such bug in Emacs 22.1.
Someone has mentioned that this bug is caused by w32term.c, and Mr. Handa has
discussed it with him in this newsgroup. :)

2008/2/1 Kenichi Handa <handa@ni.aist.go.jp>:

> In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org>
> writes:
>
> > This seems to be a problem with mule-unicode-2500-33ff to gb2312
> > encoding. I doubt it is limited to w32.
>
> Right.  This is because of the limitation of Emacs 22's
> Unicode handling.  If you want to handle U+3002, you have
> to use UTF-* coding systems.
>
> It will be fixed by Emacs 23.
>
> ---
> Kenichi Handa
> handa@ni.aist.go.jp

* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01  7:30   ` Kenichi Handa
  2008-02-01 12:33     ` yu jie
@ 2008-02-01 14:07     ` Jason Rumney
  2008-02-01 15:37       ` Eli Zaretskii
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Rumney @ 2008-02-01 14:07 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-pretest-bug, yujie052

Kenichi Handa wrote:
> In article <479B4BE3.2000204@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:
>
>   
>> This seems to be a problem with mule-unicode-2500-33ff to gb2312
>> encoding. I doubt it is limited to w32.
>>     
>
> Right.  This is because of the limitation of Emacs 22's
> Unicode handling.  If you want to handle U+3002, you have
> to use UTF-* coding systems.
>
> It will be fixed by Emacs 23.
>   
Meanwhile we need to handle keyboard input in Emacs 22.2 in a way that
is not any worse than 22.1.

AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
Thai and possibly Greek and Cyrillic are potentially problematic. Am I
correct in thinking that Latin character sets are not affected?

Is there a well-defined range of Unicode that does or doesn't support
conversion? Doesn't pasting from the clipboard have the same problem?
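
One way to probe which characters a legacy coding system can represent is to
try encoding a sample from each script. The sketch below does this in Python;
it reflects the tables in Python's codecs, not Emacs 22's translation tables,
and the names encodable, unencodable_in and SAMPLES are illustrative.

```python
# Illustrative only: probe a legacy codec with one sample character per script.
def encodable(char, codec):
    """Return True if CHAR can be encoded by CODEC."""
    try:
        char.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def unencodable_in(text, codec):
    """Return the characters of TEXT that CODEC cannot encode, in order."""
    return [c for c in text if not encodable(c, codec)]


SAMPLES = {
    "CJK punctuation": "\u3002",  # ideographic full stop
    "Kana": "\u3042",             # hiragana A
    "Greek": "\u03b1",            # alpha
    "Cyrillic": "\u0416",         # ZHE
    "Thai": "\u0e01",             # KO KAI
}

for script, char in SAMPLES.items():
    print("%-16s gb2312: %s" % (script, encodable(char, "gb2312")))
```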





* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 14:07     ` Jason Rumney
@ 2008-02-01 15:37       ` Eli Zaretskii
  2008-02-01 16:05         ` Jason Rumney
  0 siblings, 1 reply; 9+ messages in thread
From: Eli Zaretskii @ 2008-02-01 15:37 UTC (permalink / raw)
  To: Jason Rumney; +Cc: emacs-pretest-bug, handa, yujie052

> Date: Fri, 01 Feb 2008 14:07:35 +0000
> From: Jason Rumney <jasonr@gnu.org>
> Cc: emacs-pretest-bug@gnu.org, yujie052@gmail.com
> 
> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
> Thai and possibly Greek and Cyrillic are potentially problematic.

I thought mule-unicode-* covers Greek and Cyrillic quite well.
Doesn't it?





* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 15:37       ` Eli Zaretskii
@ 2008-02-01 16:05         ` Jason Rumney
  2008-02-02  1:37           ` YAMAMOTO Mitsuharu
  2008-02-02 10:22           ` Eli Zaretskii
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Rumney @ 2008-02-01 16:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-pretest-bug, handa, yujie052

Eli Zaretskii wrote:
>> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
>> Thai and possibly Greek and Cyrillic are potentially problematic.
>>     
>
> I thought mule-unicode-* covers Greek and Cyrillic quite well.
> Doesn't it?
>   

mule-unicode-* covers all the above. The issue is whether files 
containing such characters can be written in the relevant non-UTF coding 
systems.






* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 16:05         ` Jason Rumney
@ 2008-02-02  1:37           ` YAMAMOTO Mitsuharu
  2008-02-02 10:22           ` Eli Zaretskii
  1 sibling, 0 replies; 9+ messages in thread
From: YAMAMOTO Mitsuharu @ 2008-02-02  1:37 UTC (permalink / raw)
  To: Jason Rumney; +Cc: emacs-pretest-bug, Eli Zaretskii, handa, yujie052

>>>>> On Fri, 01 Feb 2008 16:05:56 +0000, Jason Rumney <jasonr@gnu.org> said:

>>> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK
>>> radicals, Thai and possibly Greek and Cyrillic are potentially
>>> problematic.

>> I thought mule-unicode-* covers Greek and Cyrillic quite well.
>> Doesn't it?

> mule-unicode-* covers all the above. The issue is whether files
> containing such characters can be written in the relevant non-UTF
> coding systems.

Even for Greek and Cyrillic, a user may want them mapped to
japanese-jisx0208 rather than mule-unicode-* in some situations.

FWIW, the Mac Carbon port handles Unicode keyboard events in the
following way:

* ASCII character, with or without modifiers
  -> ASCII_KEYSTROKE_EVENT

* Non-ASCII character with some modifiers
  -> MULTIBYTE_CHAR_KEYSTROKE_EVENT with either CHARSET_8_BIT_CONTROL,
     charset_latin_iso8859_1, or charset_mule_unicode_* code.

* Non-ASCII character without any modifiers
  -> The event comes with some script/language information.  So we can
     distinguish mule-unicode-0100-24ff Greek from japanese-jisx0208
     Greek in principle even though they have the same Unicode
     codepoint.  Likewise for CJK characters.  Because the usual Emacs
     keyboard events cannot carry such script/language information, we
     pack the raw Unicode text input data and the script/language info
     into a special event MAC_APPLE_EVENT instead.  Then it is decoded
     at the Lisp level.

     (define-key special-event-map [mac-apple-event] 'mac-dispatch-apple-event)
     (define-key mac-apple-event-map [text-input unicode-for-key-event]
       'mac-ts-unicode-for-key-event)

     (defun mac-ts-unicode-for-key-event (event)
       "Convert Unicode key EVENT to Emacs key events and unread them."
       (interactive "e")
       (let* ((ae (mac-event-ae event))
              ;; Raw Unicode text input data carried by the event.
              (text (cdr (mac-ae-parameter ae "tstx" "utxt")))
              ;; Script/language information packed with the event.
              (script-language (mac-ae-script-language ae "tssl"))
              ;; Coding system for that script, falling back to mac-roman.
              (coding (or (cdr (assq (car script-language)
                                     mac-script-code-coding-systems))
                          'mac-roman)))
         (if text
             (mac-unread-string (mac-utxt-to-string text coding)))))
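
The three cases above amount to a small decision procedure. The Python sketch
below only restates that dispatch; it is not the Carbon port's code, and the
function name classify_key_event and its arguments are hypothetical. The
returned strings are the event names from the list.

```python
# Illustrative only: restates the three-way dispatch described in the list.
def classify_key_event(char, has_modifiers):
    """Return the event kind used for CHAR (hypothetical helper)."""
    if ord(char) < 0x80:
        # ASCII character, with or without modifiers.
        return "ASCII_KEYSTROKE_EVENT"
    if has_modifiers:
        # Non-ASCII character with some modifiers.
        return "MULTIBYTE_CHAR_KEYSTROKE_EVENT"
    # Non-ASCII character without modifiers: raw text plus script/language
    # information is packed into a special event and decoded in Lisp.
    return "MAC_APPLE_EVENT"


for char, mods in (("a", False), ("a", True), ("\u3002", True), ("\u3002", False)):
    print("U+%04X modifiers=%s -> %s" % (ord(char), mods, classify_key_event(char, mods)))
```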

As for the W32 port, Emacs 22.2 should avoid drastic changes in
general.  How about using the new code (i.e., mapping to
mule-unicode-* etc.) only for the with-modifier case, and leaving the
without-modifier case to encoded-kb as in Emacs 22.1?

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp





* Re: Fwd: 23.0.50; can't input chinese punctuation on w32 platform
  2008-02-01 16:05         ` Jason Rumney
  2008-02-02  1:37           ` YAMAMOTO Mitsuharu
@ 2008-02-02 10:22           ` Eli Zaretskii
  1 sibling, 0 replies; 9+ messages in thread
From: Eli Zaretskii @ 2008-02-02 10:22 UTC (permalink / raw)
  To: Jason Rumney; +Cc: emacs-pretest-bug, handa, yujie052

> Date: Fri, 01 Feb 2008 16:05:56 +0000
> From: Jason Rumney <jasonr@gnu.org>
> CC: handa@ni.aist.go.jp, emacs-pretest-bug@gnu.org, yujie052@gmail.com
> 
> Eli Zaretskii wrote:
> >> AFAICT, CJK punctuation, Kana, Jamo, Kanbun, Bopomofo, CJK radicals,
> >> Thai and possibly Greek and Cyrillic are potentially problematic.
> >>     
> >
> > I thought mule-unicode-* covers Greek and Cyrillic quite well.
> > Doesn't it?
> >   
> 
> mule-unicode-* covers all the above. The issue is whether files 
> containing such characters can be written in the relevant non-UTF coding 
> systems.

For Cyrillic, those would be windows-1251, koi8-r, and ISO-8859-5,
right?  If so, I think we have no problems here.  Likewise for Greek.
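
As a quick illustration of the Cyrillic case, the Python sketch below encodes
one Cyrillic letter in each of the three coding systems named above. It uses
Python's codec names (cp1251, koi8_r, iso8859_5) and says nothing about how
Emacs 22 maps mule-unicode-* characters to them; the names CODECS and
single_byte_codes are illustrative.

```python
# Illustrative only: one Cyrillic letter in three single-byte encodings.
CODECS = ("cp1251", "koi8_r", "iso8859_5")


def single_byte_codes(char):
    """Return a dict mapping each codec name to the bytes it produces for CHAR."""
    return {codec: char.encode(codec) for codec in CODECS}


ZHE = "\u0416"  # CYRILLIC CAPITAL LETTER ZHE
codes = single_byte_codes(ZHE)

for codec, raw in codes.items():
    # Each codec uses a different byte for the same character.
    print("%-10s #x%02X" % (codec, raw[0]))
    assert raw.decode(codec) == ZHE
```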





* Re: 23.0.50; can't input chinese punctuation on win32 platform
       [not found] ` <47A8FEBA.8000304@gnu.org>
@ 2008-02-06  9:57   ` Zhang Wei
  0 siblings, 0 replies; 9+ messages in thread
From: Zhang Wei @ 2008-02-06  9:57 UTC (permalink / raw)
  To: Jason Rumney, emacs-devel

On 2/6/08, Jason Rumney <jasonr@gnu.org> wrote:
> yu jie wrote:
> > Hi,
> > I ran into a new Chinese-related issue. When I try to save a buffer
> > containing the Chinese period 。, I get an error message: gb2312-dos
> > cannot encode these: 。
>
> Zhang Wei wrote:
>
> > When I save a file in the gb2312 coding system, I get the following
> > complaint; none of the Chinese punctuation characters can be encoded
> > with gb2312:
>
> This problem should be fixed now. Thank you both for reporting it.
>

The bug is gone. Thank you.



