* UTF-8 character question
From: horatio @ 2008-05-12 6:49 UTC
To: help-gnu-emacs

I downloaded Emacs 22.2.1 for Windows, and I was pleased to find that
Chinese characters work "out of the box" on my computer. However, I
have a weird visualization problem for some characters. One example
is 你你. These two characters appear the same in Firefox, in Notepad,
in the file system (ie Explorer), and in various other places.
However, in Emacs, the character on the left appears as an empty
square, but the character on the right shows up as the Chinese
character for "you". Is there some way to make Emacs correctly
display both versions of this character?

* Re: UTF-8 character question
From: Harald Hanche-Olsen @ 2008-05-12 7:14 UTC
To: help-gnu-emacs

+ horatio@gmail.com:

> I downloaded Emacs 22.2.1 for Windows, and I was pleased to find that
> Chinese characters work "out of the box" on my computer. However, I
> have a weird visualization problem for some characters. One example
> is 你你. These two characters appear the same in Firefox, in Notepad,
> in the file system (ie Explorer), and in various other places.
> However, in Emacs, the character on the left appears as an empty
> square, but the character on the right shows up as the Chinese
> character for "you".

I am confused. They not only /look/ the same, they /are/ the same
character (U+4F60). Maybe your news posting software knows what emacs
doesn't, and has changed one of those so they are equal?

I'm afraid you will have to describe the difference between the two
characters somehow.

--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

* Re: UTF-8 character question
From: horatio @ 2008-05-12 7:51 UTC
To: help-gnu-emacs

On May 12, 12:14 am, Harald Hanche-Olsen <han...@math.ntnu.no> wrote:
> + hora...@gmail.com:
>
> > I downloaded Emacs 22.2.1 for Windows, and I was pleased to find that
> > Chinese characters work "out of the box" on my computer. However, I
> > have a weird visualization problem for some characters. One example
> > is 你你. These two characters appear the same in Firefox, in Notepad,
> > in the file system (ie Explorer), and in various other places.
> > However, in Emacs, the character on the left appears as an empty
> > square, but the character on the right shows up as the Chinese
> > character for "you".
>
> I am confused. They not only /look/ the same, they /are/ the same
> character (U+4F60). Maybe your news posting software knows what emacs
> doesn't, and has changed one of those so they are equal?
>
> I'm afraid you will have to describe the difference between the two
> characters somehow.

I used Firefox to post, and yes, it replaced one of the characters for
me. I don't know how to figure out what the encoding is for the
character Emacs is correctly displaying, but the character U+4F60 does
not display correctly in my version of Emacs. Instead, it shows up as
the empty square. There's another version of the same character that
does show up correctly in Emacs, but unfortunately it's not the one
used elsewhere.

* Re: UTF-8 character question
From: horatio @ 2008-05-12 8:07 UTC
To: help-gnu-emacs

On May 12, 12:51 am, hora...@gmail.com wrote:
> On May 12, 12:14 am, Harald Hanche-Olsen <han...@math.ntnu.no> wrote:
>
> > + hora...@gmail.com:
>
> > > I downloaded Emacs 22.2.1 for Windows, and I was pleased to find that
> > > Chinese characters work "out of the box" on my computer. However, I
> > > have a weird visualization problem for some characters. One example
> > > is 你你. These two characters appear the same in Firefox, in Notepad,
> > > in the file system (ie Explorer), and in various other places.
> > > However, in Emacs, the character on the left appears as an empty
> > > square, but the character on the right shows up as the Chinese
> > > character for "you".
>
> > I am confused. They not only /look/ the same, they /are/ the same
> > character (U+4F60). Maybe your news posting software knows what emacs
> > doesn't, and has changed one of those so they are equal?
>
> > I'm afraid you will have to describe the difference between the two
> > characters somehow.
>
> I used Firefox to post, and yes, it replaced one of the characters for
> me. I don't know how to figure out what the encoding is for the
> character Emacs is correctly displaying, but the character U+4F60 does
> not display correctly in my version of Emacs. Instead, it shows up as
> the empty square. There's another version of the same character that
> does show up correctly in Emacs, but unfortunately it's not the one
> used elsewhere.

Fascinating. I just found something else out. When I save the file,
and then reload it, the character that was successfully displayed
earlier is now displayed as an empty box. Maybe there is only one 你
character, and sometimes Emacs can show it, and sometimes it can't.

Furthermore, when I use Options->Mule->List Character Sets, some of
the supported character sets are entirely empty boxes. The strange
thing about that is there are definitely some characters that it shows
fine, with none of these issues. It's pretty strange that for some
characters, it can show the Chinese characters, and for others it
can't.

My guess is there's some basic option or package that I'm missing that
will make the problem go away. Can you (or anyone else) copy and
paste that character into an Emacs buffer? If it works, can you think
of anything in your setup that I might not have done? I'll take a
look myself in the meantime.

Thanks for the help.

* Re: UTF-8 character question
From: Harald Hanche-Olsen @ 2008-05-12 8:16 UTC
To: help-gnu-emacs

+ horatio@gmail.com:

> My guess is there's some basic option or package that I'm missing that
> will make the problem go away. Can you (or anyone else) copy and
> paste that character into an Emacs buffer? If it works, can you think
> of anything in your setup that I might not have done? I'll take a
> look myself in the meantime.

I can copy and paste it just fine. However, you said you're running
emacs 22 on windows, right? I am running various versions of emacs 23
(the development version) on unix, so I very much doubt that you can
learn anything useful from my setup. I don't do anything out of the
ordinary with font setup anyway (other than using the Vera Sans Mono
font, which will affect only the latin characters). I think some other
users of emacs on windows will have to step in.

--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

* Re: UTF-8 character question
From: David Kastrup @ 2008-05-12 8:35 UTC
To: help-gnu-emacs

Harald Hanche-Olsen <hanche@math.ntnu.no> writes:

> + horatio@gmail.com:
>
>> My guess is there's some basic option or package that I'm missing
>> that will make the problem go away. Can you (or anyone else) copy
>> and paste that character into an Emacs buffer? If it works, can you
>> think of anything in your setup that I might not have done? I'll
>> take a look myself in the meantime.
>
> I can copy and paste it just fine. However, you said you're running
> emacs 22 on windows, right? I am running various versions of emacs 23
> (the development version) on unix, so I very much doubt that you can
> learn anything useful from my setup. I don't do anything out of the
> ordinary with font setup anyway (other than using the Vera Sans Mono
> font, which will affect only the latin characters). I think some other
> users of emacs on windows will have to step in.

If he is using Chinese or other CJK stuff a lot, he might want to bite
the bullet and switch to Emacs 23.

Almost all Emacs implementations that are around use MULE as an internal
encoding. Emacs>=23 and XEmacs, starting from some quite unstable 21.5
version, use utf-8 as an internal encoding.

The "problem" with MULE is that it represents characters as a
charset/character pair, and characters from different charsets are
basically different. But character sets are coupled with encodings, and
so some characters exist in quite a number of charsets (like the basic
accented letters). This necessitated functions for "charset
unification" which do a better or worse job depending on what they are
working with, and how much code has been written for the charsets.

Now Emacs 23 loses this information and keeps around only the Unicode
codepoint. That means that you can't represent as much information as
previously, but usually the information you lose is that which you would
want to have disregarded, anyway.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum

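[Editorial illustration] The charset/character-pair idea described above can be sketched outside Emacs. The following is only a rough analogy, assuming Python's built-in `gb2312` and `big5` codecs as stand-ins for legacy charsets; it is not how MULE is implemented, and the helper names `charset_pairs` and `unify` are made up for this sketch. It shows one character having different codes in different charsets, and a "unification" step that maps each pair back to a single Unicode code point.

```python
# Illustration only: Python codecs stand in for legacy charsets here.
# This is an analogy, not MULE's actual internal representation.

CHAR = "\u4f60"  # the character discussed in this thread, U+4F60

# Python codec names, chosen for illustration (an assumption of this sketch).
LEGACY_CHARSETS = ["gb2312", "big5"]


def charset_pairs(char, charsets):
    """Return (charset, bytes) pairs, one per charset that contains char."""
    pairs = []
    for name in charsets:
        try:
            pairs.append((name, char.encode(name)))
        except UnicodeEncodeError:
            # This charset has no slot for the character; skip it.
            continue
    return pairs


def unify(pairs):
    """Map every (charset, bytes) pair back to its Unicode code point."""
    return {ord(raw.decode(name)) for name, raw in pairs}


if __name__ == "__main__":
    pairs = charset_pairs(CHAR, LEGACY_CHARSETS)
    for name, raw in pairs:
        # Same character, different code in each charset.
        print(f"{name:8s} {raw.hex()}")
    # After unification only the code point is left.
    print("unified:", [f"U+{cp:04X}" for cp in sorted(unify(pairs))])
```

Once only the code point is kept, the information about which charset the character came from is gone, which is the trade-off described for Emacs 23.
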
* Re: UTF-8 character question
From: horatio @ 2008-05-12 9:37 UTC
To: help-gnu-emacs

On May 12, 1:35 am, David Kastrup <d...@gnu.org> wrote:
> Harald Hanche-Olsen <han...@math.ntnu.no> writes:
> > + hora...@gmail.com:
>
> >> My guess is there's some basic option or package that I'm missing
> >> that will make the problem go away. Can you (or anyone else) copy
> >> and paste that character into an Emacs buffer? If it works, can you
> >> think of anything in your setup that I might not have done? I'll
> >> take a look myself in the meantime.
>
> > I can copy and paste it just fine. However, you said you're running
> > emacs 22 on windows, right? I am running various versions of emacs 23
> > (the development version) on unix, so I very much doubt that you can
> > learn anything useful from my setup. I don't do anything out of the
> > ordinary with font setup anyway (other than using the Vera Sans Mono
> > font, which will affect only the latin characters). I think some other
> > users of emacs on windows will have to step in.
>
> If he is using Chinese or other CJK stuff a lot, he might want to bite
> the bullet and switch to Emacs 23.
>
> Almost all Emacs implementations that are around use MULE as an internal
> encoding. Emacs>=23 and XEmacs, starting from some quite unstable 21.5
> version, use utf-8 as an internal encoding.
>
> The "problem" with MULE is that it represents characters as a
> charset/character pair, and characters from different charsets are
> basically different. But character sets are coupled with encodings, and
> so some characters exist in quite a number of charsets (like the basic
> accented letters). This necessitated functions for "charset
> unification" which do a better or worse job depending on what they are
> working with, and how much code has been written for the charsets.
>
> Now Emacs 23 loses this information and keeps around only the Unicode
> codepoint. That means that you can't represent as much information as
> previously, but usually the information you lose is that which you would
> want to have disregarded, anyway.
>
> --
> David Kastrup, Kriemhildstr. 15, 44793 Bochum

Hey, thanks for the suggestion. I found a zip online built from the
cvs just a week ago (on the emacsw32 site), and that seems to fix my
unicode problems.

Thanks to everyone for the help. It seems that the best way to work
with Chinese in emacs is to track down a recent build of Emacs 23.

John

* Re: UTF-8 character question
From: Harald Hanche-Olsen @ 2008-05-12 7:17 UTC
To: help-gnu-emacs

I should have made the following addition to my previous post:

+ horatio@gmail.com:

> However, in Emacs, the character on the left appears as an empty
> square, but

The empty box is emacs' way of displaying a character it doesn't know
how to display, meaning it is not present in the current fontset.

This doesn't tell you how to solve the problem of course, but it tells
you something about where to look for a solution.

--
* Harald Hanche-Olsen <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

* Re: UTF-8 character question
From: Jason Rumney @ 2008-05-12 9:21 UTC
To: help-gnu-emacs

On May 12, 7:49 am, hora...@gmail.com wrote:
> I downloaded Emacs 22.2.1 for Windows, and I was pleased to find that
> Chinese characters work "out of the box" on my computer. However, I
> have a weird visualization problem for some characters. One example
> is 你你. These two characters appear the same in Firefox, in Notepad,
> in the file system (ie Explorer), and in various other places.
> However, in Emacs, the character on the left appears as an empty
> square, but the character on the right shows up as the Chinese
> character for "you". Is there some way to make Emacs correctly
> display both versions of this character?

What is your default language set to in Windows? If it is not Chinese,
then Emacs might be making the wrong decision about how to interpret
你 when it is encoded as Unicode, and is looking for a Japanese or
Korean font which you don't have.

* Re: UTF-8 character question
From: Peter Dyballa @ 2008-05-12 15:42 UTC
To: horatio; +Cc: help-gnu-emacs

On 12.05.2008 at 08:49, horatio wrote:

> Is there some way to make Emacs correctly
> display both versions of this character?

You can check with C-u C-x = on each of the characters/boxes what they
actually are. There might be another problem with the way MS encodes
the snippet. Maybe you need to tell what selection-coding-system is
used ...

--
Greetings

  Pete

A morning without coffee is like something without something else.

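[Editorial illustration] The check suggested above (C-u C-x = on each character) can be approximated outside Emacs when the text is available as a string, for instance after pasting it into a script. This is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is hypothetical, and the output says nothing about fonts or about how Emacs 22 represents the character internally. It only shows whether two glyphs that look alike are in fact the same code point.

```python
import unicodedata


def describe(text):
    """Return (code point, UTF-8 bytes in hex, Unicode name) per character."""
    rows = []
    for ch in text:
        # Fall back to a placeholder for characters without a Unicode name.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", ch.encode("utf-8").hex(), name))
    return rows


if __name__ == "__main__":
    # The two characters from the original post, as they arrived on the list.
    for code_point, utf8_hex, name in describe("\u4f60\u4f60"):
        print(code_point, utf8_hex, name)
```

If both rows come out identical, as Harald observed for the text that reached the list, the difference seen in Emacs must come from something other than the code point itself.
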