* Chinese characters support @ 2003-05-07 23:08 Gaoyan Xie 2003-05-08 6:27 ` Charles Muller [not found] ` <mailman.5739.1052375326.21513.help-gnu-emacs@gnu.org> 0 siblings, 2 replies; 41+ messages in thread From: Gaoyan Xie @ 2003-05-07 23:08 UTC (permalink / raw) Hi all, I am trying to explore GNU emacs's multilingual support, and what I want is the display and input of Chinese characters. Have any of you done this before? I tried according to GNU emacs' online manual, but still couldn't make it work. BTW, I am using Redhat Linux 7.2 and GNU emacs 20.7. Thanks for any help for this issue. Gaoyan Xie ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-07 23:08 Chinese characters support Gaoyan Xie @ 2003-05-08 6:27 ` Charles Muller [not found] ` <mailman.5739.1052375326.21513.help-gnu-emacs@gnu.org> 1 sibling, 0 replies; 41+ messages in thread From: Charles Muller @ 2003-05-08 6:27 UTC (permalink / raw) Cc: help-gnu-emacs Gaoyan Xie wrote > I am trying to explore GNU emacs's multilingual support, and what I want > is the display and input of Chinese characters. Have any of you done > this before? I tried according to GNU emacs' online manual, but still > couldn't make it work. BTW, I am using Redhat Linux 7.2 and GNU emacs > 20.7. I would recommend first that you consider installing a newer 21.x version of Emacs if you are concerned about international script support. A newer version of RedHat would not hurt either. I am using RH9 with Emacs 21.2 and Chinese and Japanese display without me having to do anything, as long as the documents are encoded in JIS for Japanese and Big5 for Chinese. When it comes to working with UTF-8, I have never heard of anyone succeeding in displaying East Asian scripts without installing the TEI-Emacs add-on. Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
[parent not found: <mailman.5739.1052375326.21513.help-gnu-emacs@gnu.org>]
* Re: Chinese characters support [not found] ` <mailman.5739.1052375326.21513.help-gnu-emacs@gnu.org> @ 2003-05-08 7:33 ` Robin Hu 2003-05-10 14:28 ` Kai Großjohann 1 sibling, 0 replies; 41+ messages in thread From: Robin Hu @ 2003-05-08 7:33 UTC (permalink / raw) >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Charles> Gaoyan Xie wrote >> I am trying to explore GNU emacs's multilingual support, and what >> I want is the display and input of Chinese characters. Have any >> of you done this before? I tried according to GNU emacs' online >> manual, but still couldn't make it work. BTW, I am using Redhat >> Linux 7.2 and GNU emacs 20.7. Charles> I would recommend first that you consider installing a Charles> newer 21.x version of Emacs if you are concerned about Charles> international script support. A newer version of RedHat Charles> would not hurt either. I am using RH9 with Emacs 21.2 and Charles> Chinese and Japanese display without me having to do Charles> anything, as long as the documents are encoded in JIS for Charles> Japanese and Big5 for Chinese. Charles> When it comes to working with UTF-8, I have never heard of Charles> anyone succeeding in displaying East Asian scripts without Charles> installing the TEI-Emacs add-on. I am using Mule-UCS 0.84 (with the Debian patches applied) on Emacs 21.3.50, and it seems to work fine. So what is the "TEI-Emacs add-on"? Can you point me to related URLs? Charles> Chuck -- The goal of science is to build better mousetraps. The goal of nature is to build better mice. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support [not found] ` <mailman.5739.1052375326.21513.help-gnu-emacs@gnu.org> 2003-05-08 7:33 ` Robin Hu @ 2003-05-10 14:28 ` Kai Großjohann 1 sibling, 0 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-10 14:28 UTC (permalink / raw) Charles Muller <acmuller@gol.com> writes: > When it comes to working with UTF-8, I have never heard of anyone > succeeding in displaying East Asian scripts without installing the > TEI-Emacs add-on. The CVS version of Emacs has utf-translate-cjk-mode, which allows me to do this: C-x C-f /some/nonexisting/file/name RET C-u C-\ chinese-py RET nihao (enter Chinese here) C-x RET c utf-8 RET C-x C-s After this, I get a UTF-8 encoded file with Chinese characters in it. utf-translate-cjk-mode used to be called utf-translate-cjk. I don't know when it appeared in Emacs. Probably it isn't in 21.3. -- file-error; Data: (Opening input file no such file or directory ~/.signature) ^ permalink raw reply [flat|nested] 41+ messages in thread
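The key sequence in the message above ends with a UTF-8 file on disk. As a rough way to check such a file outside Emacs, here is a minimal Python sketch. It is not from the thread: the helper name `write_and_inspect` is hypothetical, and the two characters 你好 are only assumed to be what typing "nihao" selects in the chinese-py input method.

```python
import os
import tempfile


def write_and_inspect(text, encoding="utf-8"):
    """Write text to a temporary file and return the raw bytes found on disk."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Save the text in the requested encoding, as C-x RET c utf-8 RET C-x C-s would.
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(text)
        # Read the file back in binary mode to see the actual byte sequence.
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


raw = write_and_inspect("你好")
# Each of these two characters occupies three bytes in UTF-8.
print(len(raw), raw.hex())
```

If a file saved from Emacs shows a different byte sequence for the same text, it was probably written in another coding system (for example gb2312 or an ISO-2022 variant) rather than UTF-8.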
[parent not found: <mailman.5730.1052348993.21513.help-gnu-emacs@gnu.org>]
* Re: Chinese characters support [not found] <mailman.5730.1052348993.21513.help-gnu-emacs@gnu.org> @ 2003-05-10 14:26 ` Kai Großjohann 2003-05-10 16:17 ` Charles Muller ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-10 14:26 UTC (permalink / raw) Gaoyan Xie <gxie@eecs.wsu.edu> writes: > I am trying to explore GNU emacs's multilingual support, and what I > want is the display and input of Chinese characters. Have any of you > done this before? I tried according to GNU emacs' online manual, but > still couldn't make it work. BTW, I am using Redhat Linux 7.2 and GNU > emacs 20.7. I don't know anything about Chinese support in general. But with Emacs, it was very easy. I compiled and installed Emacs and I also installed some Chinese fonts. (The GNU intlfonts package, available from ftp.gnu.org, is a good starting point.) Then I typed M-x view-hello-file RET. This showed me some Chinese (and Japanese, and Korean) characters. If you see empty boxes instead of the Chinese characters, then some fonts are missing. Then I typed C-\ chinese-py RET to select a Pinyin input method. Then I typed nihao and saw two Chinese characters. -- file-error; Data: (Opening input file no such file or directory ~/.signature) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 14:26 ` Kai Großjohann @ 2003-05-10 16:17 ` Charles Muller 2003-05-10 16:45 ` Kai Großjohann ` (2 more replies) 2003-05-12 23:05 ` Michael Na Li [not found] ` <mailman.5922.1052583563.21513.help-gnu-emacs@gnu.org> 2 siblings, 3 replies; 41+ messages in thread From: Charles Muller @ 2003-05-10 16:17 UTC (permalink / raw) Cc: help-gnu-emacs Kai wrote: > Then I typed M-x view-hello-file RET. This showed me some Chinese > (and Japanese, and Korean) characters. If you see empty boxes > instead of the Chinese characters, then some fonts are missing. It should be pointed out, nonetheless, that it is a bad idea to cite the hello file as an example of international script functionality, since it is set in an encoding that virtually no one ever uses (at least in the CJK world), and it is quite often the case that the file will display fine even though CJK won't work in utf-8 or native East Asian encodings. Someone should either get rid of that file or save it in a relevant encoding. Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 16:17 ` Charles Muller @ 2003-05-10 16:45 ` Kai Großjohann 2003-05-10 17:31 ` Charles Muller [not found] ` <mailman.5927.1052587973.21513.help-gnu-emacs@gnu.org> 2003-05-10 17:58 ` Eli Zaretskii [not found] ` <mailman.5936.1052589798.21513.help-gnu-emacs@gnu.org> 2 siblings, 2 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-10 16:45 UTC (permalink / raw) Cc: help-gnu-emacs Charles Muller <acmuller@gol.com> writes: > Kai wrote: > >> Then I typed M-x view-hello-file RET. This showed me some Chinese >> (and Japanese, and Korean) characters. If you see empty boxes >> instead of the Chinese characters, then some fonts are missing. > > I should be pointed out, nonetheless, that it is a bad idea to cite > the hello file as an example of international script functionality, > since it is set in an encoding that virtually no one ever uses (at > least in the CJK world), Really? The HELLO file shows characters from a lot of different encodings, and if used as such, then it is quite useful. > and it is quite often the case that that file will display fine > despite the fact that CJK won't work in utf-8 There are known problems with CJK support in UTF-8, but the situation has improved greatly in the development version of Emacs. > or native East Asian encodings. Can you cite examples? I have had no problem with gb2312 and Chinese characters, at least. Others routinely use Shift-JIS and EUC-JP for Japanese, I gather. > Someone should either get rid of that file or save it in a relevant > encoding. The file is in a relevant encoding: it's the encoding used by Emacs internally. (Or rather, an encoding close to the internal encoding.) This fact has its disadvantages, but it also has advantages. -- file-error; Data: (Opening input file no such file or directory ~/.signature) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 16:45 ` Kai Großjohann @ 2003-05-10 17:31 ` Charles Muller 2003-05-10 18:43 ` Eli Zaretskii 2003-05-10 19:24 ` Kai Großjohann [not found] ` <mailman.5927.1052587973.21513.help-gnu-emacs@gnu.org> 1 sibling, 2 replies; 41+ messages in thread From: Charles Muller @ 2003-05-10 17:31 UTC (permalink / raw) Cc: help-gnu-emacs Kai wrote: > Really? The HELLO file shows characters from a lot of different > encodings, and if used as such, then it is quite useful. > > > and it is quite often the case that that file will display fine > > despite the fact that CJK won't work in utf-8 > > There are known problems with CJK support in UTF-8, but the situation > has improved greatly in the development version of Emacs. I know that, and I am not contesting that point. But again, the HELLO file is not a utf-8 file. It is also not a form of JIS or other East Asian encoding, so the fact that one can display multilingual scripts by opening that file does not mean that one will be able to display them in Big5, JIS, or whatever. If you check the archives for "utf-8+cjk", you will see that we have had a few threads in the past year that dealt with problems trying to display CJK and other international scripts, in which the advice was given to look at the Hello file. As a person who has been working with international scripts and utf-8 for a number of years, I know firsthand that the ability to read this file doesn't usually mean much. People who recommend checking this file are usually people who don't use double-byte East Asian languages. > The file is in a relevant encoding: it's the encoding used by Emacs > internally. (Or rather, an encoding close to the internal encoding.) Relevant to whom? It's not in utf-8, right?
Most of the problems people have been having with CJK display in Emacs (at least until the appearance of 21.3.5) have to do with problems getting utf-8 to work, and the hello file will still display even when these problems are not resolved. No one that I know who works in XML or with East Asian international scripts works in utf-7, so while that encoding format may be relevant for those who are programming Emacs internally, it is not relevant for anyone using Emacs to do multilingual XML or HTML publication, because no one uses it. That's what I mean when I say "not relevant." It is not my purpose to badmouth Emacs handling of Unicode. I know that people have been working very hard to resolve these problems, and from what I have been hearing, once everyone has copies of 21.3.5 installed with the right Mule setup, this will be a past issue. Hopefully, somewhere along the line, the Hello file will also graduate to utf-8. Regards, Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 17:31 ` Charles Muller @ 2003-05-10 18:43 ` Eli Zaretskii 2003-05-11 2:11 ` Charles Muller 2003-05-10 19:24 ` Kai Großjohann 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2003-05-10 18:43 UTC (permalink / raw) > Date: Sun, 11 May 2003 02:31:49 +0900 (JST) > From: Charles Muller <acmuller@gol.com> > > the HELLO file > is not a utf-8 file. It is also not a form of JIS or other East Asian > encoding, so the fact that one can display multilingual scripts by opening > that file does not mean that they will be able to display them in Big5, JIS, > or whatever. It does demonstrate that Emacs can display, read, and write Chinese characters, Japanese characters, and other characters. UTF-8 is not the only way to do that, and there's lots of other non-trivial machinery, both inside Emacs and outside it, that should be set up correctly for it to be able to display etc/HELLO, even without UTF-8. > If you check the archives for "utf-8+cjk", you will see that we have had a few > threads in the past year that dealt with problems trying to display CJK and > other international scripts, in which the advice was given to look at the > Hello file. If you read the archives of this forum (and of gnu.emacs.bug), you will see that it's been recommended _a_lot_. > It is not my purpose to badmouth Emacs handling of Unicode. However, you've actually done precisely that. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 18:43 ` Eli Zaretskii @ 2003-05-11 2:11 ` Charles Muller 2003-05-11 3:32 ` Eli Zaretskii 0 siblings, 1 reply; 41+ messages in thread From: Charles Muller @ 2003-05-11 2:11 UTC (permalink / raw) Eli wrote: > If you read the archives of this forum (and of gnu.emacs.bug), you > will see that it's been recommended _a_lot_. > > > It is not my purpose to badmouth Emacs handling of Unicode. > > However, you've actually done precisely that. All I'm trying to do, as a person who does a lot of work with CJK and utf-8 is point out something that can be done a little better. I can't understand why you guys can't simply say "hmm.. perhaps this is something we should look into." What's the big deal? Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-11 2:11 ` Charles Muller @ 2003-05-11 3:32 ` Eli Zaretskii 2003-05-11 13:59 ` Charles Muller [not found] ` <mailman.5976.1052661651.21513.help-gnu-emacs@gnu.org> 0 siblings, 2 replies; 41+ messages in thread From: Eli Zaretskii @ 2003-05-11 3:32 UTC (permalink / raw) Cc: help-gnu-emacs > Date: Sun, 11 May 2003 11:11:48 +0900 (JST) > From: Charles Muller <acmuller@gol.com> > > Eli wrote: > > > If you read the archives of this forum (and of gnu.emacs.bug), you > > will see that it's been recommended _a_lot_. > > > > > It is not my purpose to badmouth Emacs handling of Unicode. > > > > However, you've actually done precisely that. > > All I'm trying to do, as a person who does a lot of work with CJK and utf-8 > is point out something that can be done a little better. I can't understand > why you guys can't simply say "hmm.. perhaps this is something we should > look into." What's the big deal? Is anything but complete acceptance of your opinions going to convince you that we know what we are talking about? Look, all I was trying to do, as a person who did some work in this area for Emacs, and as someone who does get to answer lots of questions about this, is to tell you that etc/HELLO _is_ useful, and that the last thing I'd expect Emacs maintainers to do is to remove it. Why cannot you accept that? It goes without saying that in a Unicode Emacs, the one that will have its internal representation of characters based on Unicode codepoints, HELLO will be recoded, either in that internal representation or in UTF-8. But until that happens, I don't see any reason to recode it. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-11 3:32 ` Eli Zaretskii @ 2003-05-11 13:59 ` Charles Muller [not found] ` <mailman.5976.1052661651.21513.help-gnu-emacs@gnu.org> 1 sibling, 0 replies; 41+ messages in thread From: Charles Muller @ 2003-05-11 13:59 UTC (permalink / raw) Cc: help-gnu-emacs Eli wrote: > Is anything but complete acceptance of your opinions going to convince > you that we know what we are talking about? It seems like this discussion has expanded out of proportion. > Look, all I was trying to do, as a person who did some work in this > area for Emacs, and as someone who does get to answer lots of > questions about this, is to tell you that etc/HELLO _is_ useful, and > that the last thing I'd expect Emacs maintainers to do is to remove > it. Why cannot you accept that? I never requested the removal of the HELLO file. All I said was that as long as it was maintained in utf-7, it was not especially useful as a test file for people who are trying to get their CJK working right. > It goes without saying that in a Unicode Emacs, the one that will have > its internal representation of characters based on Unicode codepoints, > HELLO will be recoded, either in that internal representation or in > UTF-8. This seems like a good approach. I appreciate your efforts to understand the issue. > But until that happens, I don't see any reason to recode it. Whatever. Hopefully, once Emacs 21.3-5 with the appropriate Mule settings is widely deployed, this will be a moot issue. Emacs is a superb piece of work, and I appreciate the efforts of all its developers to continually expand its versatility. Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
[parent not found: <mailman.5976.1052661651.21513.help-gnu-emacs@gnu.org>]
* Re: Chinese characters support [not found] ` <mailman.5976.1052661651.21513.help-gnu-emacs@gnu.org> @ 2003-05-12 19:29 ` Jason Rumney 2003-05-12 19:58 ` Kai Großjohann 2003-05-13 7:40 ` Lee Sau Dan 2 siblings, 0 replies; 41+ messages in thread From: Jason Rumney @ 2003-05-12 19:29 UTC (permalink / raw) Charles Muller <acmuller@gol.com> writes: > I never requested the removal of the HELLO file. All I said was that as long > as it was maintained in utf-7, It has never been maintained in utf-7, and no suggestion has ever been made to encode it in utf-7. The current encoding is based on iso-2022 AFAIK. ^ permalink raw reply [flat|nested] 41+ messages in thread
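For readers unfamiliar with ISO-2022 style encodings: they stay 7-bit and use escape sequences to announce which character set the following bytes belong to. The sketch below is only an illustration in Python, using the standard `iso2022_jp` codec as a stand-in. It is an assumption that this narrow codec behaves like the Emacs-specific ISO-2022 based encoding of etc/HELLO in this respect; HELLO itself covers many more character sets. The helper name `iso2022_bytes` is hypothetical.

```python
def iso2022_bytes(text):
    """Encode text as ISO-2022-JP and return the raw byte string."""
    return text.encode("iso2022_jp")


data = iso2022_bytes("Hello 日本語")

# ESC $ B switches to the JIS X 0208 character set; ESC ( B switches back to ASCII.
print(data)
# Every byte stays below 0x80, so the stream is 7-bit clean.
print(all(byte < 0x80 for byte in data))
```

Because each run of characters is tagged with its character set, an encoding of this family can keep a Big5-derived character and its JIS counterpart apart, which a single unified code point cannot do.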
* Re: Chinese characters support [not found] ` <mailman.5976.1052661651.21513.help-gnu-emacs@gnu.org> 2003-05-12 19:29 ` Jason Rumney @ 2003-05-12 19:58 ` Kai Großjohann 2003-05-13 7:40 ` Lee Sau Dan 2 siblings, 0 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-12 19:58 UTC (permalink / raw) Charles Muller <acmuller@gol.com> writes: > I never requested the removal of the HELLO file. All I said was that as long > as it was maintained in utf-7, (It's not in utf-7. Though it's not relevant here.) > it was not especially useful as test file for people who are trying > to get their CJK working right. It's *very* useful. It halves the problem space. -- This line is not blank. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support [not found] ` <mailman.5976.1052661651.21513.help-gnu-emacs@gnu.org> 2003-05-12 19:29 ` Jason Rumney 2003-05-12 19:58 ` Kai Großjohann @ 2003-05-13 7:40 ` Lee Sau Dan 2003-05-13 9:57 ` acmuller 2003-05-13 10:02 ` Robin Hu 2 siblings, 2 replies; 41+ messages in thread From: Lee Sau Dan @ 2003-05-13 7:40 UTC (permalink / raw) >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Charles> I never requested the removal of the HELLO file. All I Charles> said was that as long as it was maintained in utf-7, Are you sure it's utf-7? Then how come it can distinguish characters in BIG5 and equivalent characters in JIS? Charles> it was not especially useful as test file for people who Charles> are trying to get their CJK working right. It seems to me that when you talk about "CJK", you're actually referring to "utf-8". Many people using the CJK parts of Emacs only work with the national encodings (Big5, GB, JIS, KSC, etc.), and in those cases Emacs works excellently. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee ^ permalink raw reply [flat|nested] 41+ messages in thread
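The signature above gives the same name twice, once as Big5 text and once in HZ notation. A small Python sketch (not from the thread; the helper name `byte_forms` is hypothetical, and the codec names are those of Python's standard library) shows how differently the national encodings and UTF-8 represent the same three characters, which is why a setup that handles one of them says little about the others.

```python
NAME = "李守敦"


def byte_forms(text, codecs=("big5", "gb2312", "hz", "utf-8")):
    """Return a mapping from codec name to the encoded bytes of text."""
    return {codec: text.encode(codec) for codec in codecs}


for codec, data in byte_forms(NAME).items():
    # Big5 and GB2312 use two bytes per character here, UTF-8 uses three.
    print(f"{codec:8} {len(data):2d} bytes  {data!r}")

# The HZ string from the signature decodes back to the same name.
print(b"~{@nJX6X~}".decode("hz") == NAME)
```

HZ is GB2312 with the high bit stripped and wrapped in `~{` and `~}`, so it survives 7-bit mail and news transports.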
* Re: Chinese characters support 2003-05-13 7:40 ` Lee Sau Dan @ 2003-05-13 9:57 ` acmuller 2003-05-13 10:02 ` Robin Hu 1 sibling, 0 replies; 41+ messages in thread From: acmuller @ 2003-05-13 9:57 UTC (permalink / raw) On 5/13/2003, Lee Sau Dan wrote: >Are you sure it's utf-7? Then how come it can distinguish characters >in BIG5 and equivalent characters in JIS? As was corrected in an earlier message, it is iso-2022, not utf-7. > Charles> it was not especially useful as test file for people who > Charles> are trying to get their CJK working right. > >It seems to me that when you talk about "CJK", you're actually >refering to "utf-8". No, I am not. But my discussion from the outset has been centered on utf-8 related problems. As you point out (and as I noted earlier in this thread), most people are able to get CJK working with localized DBCS encodings without too much trouble. I and the people I am collaborating with on XML-based data projects are all at the bleeding edge using utf-8, and thus I have had to deal with this problem extensively in trying to get everyone's systems set up right. Chuck Charles Muller <acmuller@gol.com> Toyo Gakuen University Digital Dictionary of Buddhism: www.acmuller.net ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-13 7:40 ` Lee Sau Dan 2003-05-13 9:57 ` acmuller @ 2003-05-13 10:02 ` Robin Hu 2003-05-15 8:07 ` Lee Sau Dan 1 sibling, 1 reply; 41+ messages in thread From: Robin Hu @ 2003-05-13 10:02 UTC (permalink / raw) >>>>> "Lee" == Lee Sau Dan <danlee@informatik.uni-freiburg.de> writes: >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Lee> Many people using the CJK parts of Emacs only work with the Lee> national encodings (Big5, GB, JIS, KSC, etc.) and in those Lee> cases, they Emacs works excellently. I think you are oversimplifying this problem. ;-( Most CJK characters are not encoded in Big5, GB, JIS, or KSC; that's why the GB coding standard changed from gb2312 to gbk and then to gb18030. AFAIK, most Chinese characters also cannot be encoded within Mule, and the existing Unicode support does not solve this problem. Of course, Emacs is enough for most people most of the time, but I really hesitate to tell my friend again and again, "Sorry, but your name (羽中) is not supported by my emacs." -- The goal of science is to build better mousetraps. The goal of nature is to build better mice. ^ permalink raw reply [flat|nested] 41+ messages in thread
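The widening coverage from gb2312 to gbk to gb18030 mentioned above can be probed with Python's standard codecs. This is a sketch only: the helper name `can_encode` is hypothetical, and the two sample characters are my own choice rather than anything from the thread (镕 is a character often cited as missing from GB2312; U+3400 is the first ideograph of CJK Extension A).

```python
def can_encode(char, codec):
    """Return True if the codec has a byte representation for char."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


samples = {
    "U+4E2D": "\u4e2d",  # a very common character, present everywhere
    "U+9555": "\u9555",  # assumed absent from gb2312, present in gbk
    "U+3400": "\u3400",  # assumed absent from gbk, present in gb18030
}

for label, char in samples.items():
    coverage = {codec: can_encode(char, codec) for codec in ("gb2312", "gbk", "gb18030")}
    print(label, coverage)
```

A character that only gb18030 (or Unicode) can represent is the kind of case where a setup limited to the older national character sets has no way to store the text at all, independent of fonts.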
* Re: Chinese characters support 2003-05-13 10:02 ` Robin Hu @ 2003-05-15 8:07 ` Lee Sau Dan 0 siblings, 0 replies; 41+ messages in thread From: Lee Sau Dan @ 2003-05-15 8:07 UTC (permalink / raw) >>>>> "Robin" == Robin Hu <huxw@knight.6test.edu.cn> writes: >>>>> "Lee" == Lee Sau Dan <danlee@informatik.uni-freiburg.de> writes: >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Lee> Many people using the CJK parts of Emacs only work with the Lee> national encodings (Big5, GB, JIS, KSC, etc.) and in those Lee> cases, they Emacs works excellently. Robin> I think you are over-simpilify this problem. ;-( Most Robin> CJK characters are not encoded in either Big5 or GB or JIS Robin> or KSC, that's why the GB coding standard change from Robin> gb2312 to gbk then to gb18030. It depends on what you mean by "most". Yes, if you include those tens of thousands of *rare* characters, then even Unicode can fall short. Most Chinese text, for instance, uses only around 5000 distinct characters, of which around 1000 account for more than 90% of the characters in a text. Big5 is quite sufficient for normal use. If it were not, the Chinese people would have thrown it away (e.g. in favour of Unicode). Similarly, Japanese texts employ around 3000 distinct characters, and there is a government standard list of characters to use. Characters outside that list should, in theory, be avoided. The characters in JIS are based on this set, AFAIK. Robin> AFAIK, most chinese characters also cannot be coded within Robin> mule, and exists unicode support does not solve this Robin> problem. As long as 99.99% of the characters that I need for Chinese text files can be encoded in Big5 and emacs-mule, what's the problem? Robin> Of course, emacs is enough for most people in most Robin> time, but I am really hesitated to tell my friend once and Robin> once again "Sorry, but your name (羽中) is not supported by Robin> my emacs." No, that's not my name. I think Gnus sets the charset of my postings to big5.
And which Emacs is your emacs? Emacs since version 20 has been displaying Chinese (I can't speak for Japanese and Korean) very satisfactorily. And I find it, together with Gnus, to be the most practical tool on Linux to read/write Chinese files/news/mails. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 17:31 ` Charles Muller 2003-05-10 18:43 ` Eli Zaretskii @ 2003-05-10 19:24 ` Kai Großjohann 2003-05-11 2:15 ` Charles Muller [not found] ` <mailman.5956.1052619415.21513.help-gnu-emacs@gnu.org> 1 sibling, 2 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-10 19:24 UTC (permalink / raw) Cc: help-gnu-emacs Charles Muller <acmuller@gol.com> writes: > Kai wrote: > >> Really? The HELLO file shows characters from a lot of different >> encodings, and if used as such, then it is quite useful. >> >> > and it is quite often the case that that file will display fine >> > despite the fact that CJK won't work in utf-8 >> >> There are known problems with CJK support in UTF-8, but the situation >> has improved greatly in the development version of Emacs. > > I know that, and I am not contesting that point. But again, the > HELLO file is not a utf-8 file. It is also not a form of JIS or > other East Asian encoding, so the fact that one can display > multilingual scripts by opening that file does not mean that they > will be able to display them in Big5, JIS, or whatever. If you check > the archives for "utf-8+cjk", you will see that we have had a few > threads in the past year that dealt with problems trying to display > CJK and other international scripts, in which the advice was given > to look at the Hello file. As a person who has been working with > international scripts and utf-8 for a number years, I know firsthand > the ability to be able to read this file doesn't usually mean > much. People who recommend checking this file are usually people who > don't use double-byte East Asian languages. I know that I often suggest that people have a look at the HELLO file. I do that to answer one question: does Emacs find the right fonts? While a correct-looking HELLO file is not sufficient for CJK to function correctly, it is a requirement. That is, if HELLO looks bad, that needs to be fixed first.
However, I don't use CJK myself (much), and therefore I am quite ready to be proven wrong. Is it common to have a garbled HELLO file but still CJK is working right? -- This line is not blank. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-10 19:24 ` Kai Großjohann @ 2003-05-11 2:15 ` Charles Muller 2003-05-11 3:34 ` Eli Zaretskii [not found] ` <mailman.5956.1052619415.21513.help-gnu-emacs@gnu.org> 1 sibling, 1 reply; 41+ messages in thread From: Charles Muller @ 2003-05-11 2:15 UTC (permalink / raw) Kai wrote: > However, I don't use CJK myself (much), and therefore I am quite > ready to be proven wrong. Is it common to have a garbled HELLO file > but still CJK is working right? No, it is exactly the opposite. It is common for one to be able to display the HELLO file without problems, but to still have difficulty displaying CJK in other encodings, especially UTF-8. Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-11 2:15 ` Charles Muller @ 2003-05-11 3:34 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2003-05-11 3:34 UTC (permalink / raw) > Date: Sun, 11 May 2003 11:15:32 +0900 (JST) > From: Charles Muller <acmuller@gol.com> > > No, it is exactly the opposite. It is common for one to be able to display > the HELLO file without problems, but to still have difficulty displaying CJK > in other encodings, especially UTF-8. So etc/HELLO is not the ultimate solution for testing how well CJK is handled. So what? It certainly helps to test certain aspects of that. And for a file that requires near-zero maintenance, that's a lot. ^ permalink raw reply [flat|nested] 41+ messages in thread
[parent not found: <mailman.5956.1052619415.21513.help-gnu-emacs@gnu.org>]
* Re: Chinese characters support [not found] ` <mailman.5956.1052619415.21513.help-gnu-emacs@gnu.org> @ 2003-05-12 19:56 ` Kai Großjohann 2003-05-13 3:36 ` Charles Muller [not found] ` <mailman.6084.1052797097.21513.help-gnu-emacs@gnu.org> 0 siblings, 2 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-12 19:56 UTC (permalink / raw) Charles Muller <acmuller@gol.com> writes: > Kai wrote: > >> However, I don't use CJK myself (much), and therefore I am quite >> ready to be proven wrong. Is it common to have a garbled HELLO file >> but still CJK is working right? > > No, it is exactly the opposite. It is common for one to be able to > display the HELLO file without problems, but to still have > difficulty displaying CJK in other encodings, especially UTF-8. Well, that's good, then. The HELLO file does exactly what I need: it helps me to narrow down the area where people are having problems. If HELLO displays fine, then the problem is not the fonts. Maybe HELLO does not do what you need, but that doesn't make it useless in general, IMVHO. -- This line is not blank. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-12 19:56 ` Kai Großjohann @ 2003-05-13 3:36 ` Charles Muller 2003-05-14 3:14 ` Eli Zaretskii [not found] ` <mailman.6084.1052797097.21513.help-gnu-emacs@gnu.org> 1 sibling, 1 reply; 41+ messages in thread From: Charles Muller @ 2003-05-13 3:36 UTC (permalink / raw) Cc: help-gnu-emacs Kai wrote: > Maybe HELLO does not do what you need, but that doesn't make it > useless in general, IMVHO. I never said it was useless in general, and have never suggested that the HELLO file should be relegated to oblivion. One more time: Since the HELLO file is used for internal testing by Emacs coders it almost always works correctly in any recent Emacs "out of the box." The common misunderstanding occurs when people who are trying to get CJK working in utf-8 write to this, or another list for help, and list members, in the spirit of trying to be helpful, suggest that all is fine if the HELLO file displays right. But since the HELLO file is encoded in iso-2022 (not utf-7, as I originally stated) it is the case, in my fairly extensive experience with the matter, that the HELLO file will almost invariably display fine, while the original problem (usually Mule-related) remains untouched upon. Since the people who usually make the suggestion to test via the HELLO are those who do not regularly use CJK, it seems that they are not aware of this discrepancy, and I wanted to point this out. It seems strange to see people react so emotionally to the exposure of this simple point. No one is asking that the hallowed HELLO file be sent to oblivion--although a reincarnation as utf-8 would certainly not hurt! :-) Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-13 3:36 ` Charles Muller @ 2003-05-14 3:14 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2003-05-14 3:14 UTC (permalink / raw) > Date: Tue, 13 May 2003 12:36:28 +0900 (JST) > Newsgroups: gnu.emacs.help > From: Charles Muller <acmuller@gol.com> > > I never said it was useless in general, and have never > suggested that the HELLO file should be relegated to oblivion. Perhaps someone else in this thread did, then. > Since the HELLO file is used for internal testing by Emacs coders it almost > always works correctly in any recent Emacs "out of the box." The ability to display HELLO depends on the local configuration (fonts), so it is not guaranteed to work on every platform. > Since the people who usually make the suggestion to test via the HELLO are > those who do not regularly use CJK, it seems that they are not aware of this > discrepancy, and I wanted to point this out. Actually, HELLO was created by people who use CJK every day. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support [not found] ` <mailman.6084.1052797097.21513.help-gnu-emacs@gnu.org> @ 2003-05-13 7:05 ` Kai Großjohann 2003-05-14 6:14 ` Lee Sau Dan 1 sibling, 0 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-13 7:05 UTC (permalink / raw) Charles Muller <acmuller@gol.com> writes: > Since the HELLO file is used for internal testing by Emacs coders it almost > always works correctly in any recent Emacs "out of the box." Ah, I see. Actually, some people see empty boxes when they display HELLO. But you're right, usually it Just Works. And you're also right in that something more is needed to test Unicode support in Emacs. It seems that installing Mule-UCS on Emacs 20 also makes it Just Work, more or less. I've had less success with installing Mule-UCS on Emacs 21 -- there was some double-UTF-8 encoding in messages written with Gnus. -- This line is not blank.
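The "double-UTF-8 encoding" symptom mentioned in the message above can be reproduced outside Emacs. What follows is a minimal Python sketch, not a description of what Gnus or Mule-UCS actually did: it assumes the second pass misread the UTF-8 bytes as Latin-1, which is one common way this kind of damage arises. The function names `double_encode` and `repair` are hypothetical.

```python
def double_encode(text: str) -> bytes:
    """Encode as UTF-8, misread those bytes as Latin-1, then encode again."""
    once = text.encode("utf-8")
    misread = once.decode("latin-1")  # every byte becomes one character
    return misread.encode("utf-8")


def repair(data: bytes) -> str:
    """Undo one layer of double encoding (assumes the misreading was Latin-1)."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


original = "Großjohann 中文"
damaged = double_encode(original)

# The damaged form is longer, because every non-ASCII byte was encoded twice.
print(len(original.encode("utf-8")), len(damaged))
print(repair(damaged) == original)
```

The repair only works when the wrong intermediate charset is known; if some other charset was involved, the same round trip may fail or produce different text.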
* Re: Chinese characters support [not found] ` <mailman.6084.1052797097.21513.help-gnu-emacs@gnu.org> 2003-05-13 7:05 ` Kai Großjohann @ 2003-05-14 6:14 ` Lee Sau Dan 2003-05-14 16:27 ` Kai Großjohann 1 sibling, 1 reply; 41+ messages in thread From: Lee Sau Dan @ 2003-05-14 6:14 UTC (permalink / raw) >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Charles> One more time: Charles> Since the HELLO file is used for internal testing by Charles> Emacs coders it almost always works correctly in any Charles> recent Emacs "out of the box." No. If you have problems with the font installation (esp. when none of your font servers offers the relevant fonts or your sys. admin. simply doesn't care about your non-English needs), HELLO won't display the glyphs. It only displays boxes there. Charles> The common misunderstanding occurs when people who are Charles> trying to get CJK working in utf-8 write to this, or Charles> another list for help, and list members, in the spirit of Charles> trying to be helpful, suggest that all is fine if the Charles> HELLO file displays right. For utf-8 testing, I'd refer someone to the test files in the MuleUCS package. Charles> Since the people who usually make the suggestion to test Charles> via the HELLO are those who do not regularly use CJK, it Charles> seems that they are not aware of this discrepancy, and I Charles> wanted to point this out. No. Those people often use CJK regularly. They just don't use utf-8. Like me (using Big5), they use a national encoding (e.g. GB2312, JIS, KSC). Charles> It seems strange to see people react so emotionally to Charles> the exposure of this simple point. No one is asking that Charles> the hallowed HELLO file be sent to oblivion--although a Charles> reincarnation as utf-8 would certainly not hurt! :-) That WILL certainly HURT. Look carefully at the section "Difference among chinese characters in GB, JIS, KSC, BIG5:" in HELLO. 
The same thing cannot be reproduced in vanilla utf-8, because Unicode unifies the various characters in these encodings into one single code point. (Most efforts in the earlier versions of Unicode were devoted to _unifying_ characters from different languages, employing different national encodings. The result is that you can no longer tell whether a unified character comes from Korean, Japanese or Chinese, which write it in slightly different ways.) If you want to test UTF-8 (Why not UTF-16? People who really use computers for Far East languages (CJK) would have to waste 50% disk space if they use UTF-8 to store their text files. UTF-16 is more space efficient.), do suggest including a UTF-8 test file. (Add a line in HELLO to instruct anyone how to open the UTF-8 test file, favourably with hot-key bindings.) And why stop there? Also have UTF-16 and UTF-7 test files. UTF-8 is simply NOT the magic panacea. It sucks when you have a file full of Chinese characters, for instance. The 3-byte per Chinese character "feature" of UTF-8 sucks. HELLO should remain a test file for the internal encoding "emacs-mule" and for displaying the true multilingual capabilities of Emacs. It has also been serving well to test font installation. It should never be recoded in utf-8, IMO. If all you care about is UTF-8, have another test file. Assuming that all CJK users should use UTF-8 is like assuming that everyone should pledge faith to the Vatican. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee
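The storage argument in the message above is easy to check. This is a small Python sketch using the standard codecs; the sample strings and the helper name `encoded_sizes` are made up for illustration, and "utf-16-le" is used so that no byte-order mark is counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` in each encoding."""
    encodings = ("utf-8", "utf-16-le", "big5")
    return {name: len(text.encode(name)) for name in encodings}


chinese_only = "李守敦"           # three Chinese characters
mixed = "<name>李守敦</name>"     # the same characters inside ASCII markup

print(encoded_sizes(chinese_only))
print(encoded_sizes(mixed))
```

For the pure Chinese string, UTF-8 needs 9 bytes against 6 for UTF-16 and Big5, which is the 50% overhead the message refers to. Once ASCII markup is added, the picture changes: the mixed string takes 22 bytes in UTF-8, 32 in UTF-16 and 19 in Big5, so which encoding is smallest depends on the proportion of ASCII in the file.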
* Re: Chinese characters support 2003-05-14 6:14 ` Lee Sau Dan @ 2003-05-14 16:27 ` Kai Großjohann 2003-05-14 21:07 ` Jason Rumney 0 siblings, 1 reply; 41+ messages in thread From: Kai Großjohann @ 2003-05-14 16:27 UTC (permalink / raw) Lee Sau Dan <danlee@informatik.uni-freiburg.de> writes: > If you want to test UTF-8 (Why not UTF-16? People who really use > computers for Far East languages (CJK) would have to waste 50% disk > space if they use UTF-8 to store their text files. UTF-16 is more > space efficient.), do suggest including a UTF-8 test file. (Add a > line in HELLO to instruct anyone how to open the UTF-8 test file, > favourably with hot-key bindings.) And why stop there? Also have > UTF-16 and UTF-7 test files. UTF-8 is simply NOT the magic panacea. > It sucks when you have a file full of Chinese characters, for > instance. The 3-byte per Chinese character "feature" of UTF-8 sucks. Why not include UTF-8 characters in the HELLO file? I gather that iso-2022 is general enough to also allow using UTF-8 as one of the encodings it supports. But I'm not an expert. -- This line is not blank. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-14 16:27 ` Kai Großjohann @ 2003-05-14 21:07 ` Jason Rumney 0 siblings, 0 replies; 41+ messages in thread From: Jason Rumney @ 2003-05-14 21:07 UTC (permalink / raw) kai.grossjohann@gmx.net (Kai Großjohann) writes: > Why not include UTF-8 characters in the HELLO file? I gather that > iso-2022 is general enough to also allow using UTF-8 as one of the > encodings it supports. compound-text-with-extensions does, but not pure iso-2022 AFAIK. Anyway, HELLO is in emacs-mule encoding, which is a 16-bit encoding based on iso-2022, so only supports a certain fixed number of character sets (most iso-2022 based encodings only support 2 or 3 character sets). CVS Emacs does contain some unicode text in HELLO, but only a subset of Unicode is supported without conversion (hence the existence of `utf-translate-cjk-mode' which causes the large translation tables to be loaded). ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support [not found] ` <mailman.5927.1052587973.21513.help-gnu-emacs@gnu.org> @ 2003-05-12 19:27 ` Jason Rumney 2003-05-13 7:40 ` Lee Sau Dan 1 sibling, 0 replies; 41+ messages in thread From: Jason Rumney @ 2003-05-12 19:27 UTC (permalink / raw) Charles Muller <acmuller@gol.com> writes: > If you check the archives for "utf-8+cjk", you will see that we have > had a few threads in the past year that dealt with problems trying > to display CJK and other international scripts, in which the advice > was given to look at the Hello file. As a person who has been > working with international scripts and utf-8 for a number of years, I > know firsthand that the ability to be able to read this file doesn't > usually mean much. If HELLO displays correctly, it means that Emacs has all it needs in order to display characters, and the problem lies in the encoding/decoding process. If we recoded HELLO into UTF-8, then if people were having problems displaying utf-8 encoded text, looking at HELLO would just not work. At least now, it is a way to quickly narrow down the problem. > People who recommend checking this file are usually people who don't > use double-byte East Asian languages. No, they are people that know how Emacs works, and are trying to help by narrowing down the problem.
* Re: Chinese characters support [not found] ` <mailman.5927.1052587973.21513.help-gnu-emacs@gnu.org> 2003-05-12 19:27 ` Jason Rumney @ 2003-05-13 7:40 ` Lee Sau Dan 2003-05-13 10:11 ` acmuller ` (2 more replies) 1 sibling, 3 replies; 41+ messages in thread From: Lee Sau Dan @ 2003-05-13 7:40 UTC (permalink / raw) >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Charles> I know that, and I am not contesting that point. But Charles> again, the HELLO file is not a utf-8 file. I think you're being religious. Why must it be utf-8? Charles> It is also not a form of JIS or other East Asian Charles> encoding, It's the emacs-mule encoding --- Emacs's own representation of the information about characters/encodings that it keeps. Charles> so the fact that one can display multilingual Charles> scripts by opening that file does not mean that they will Charles> be able to display them in Big5, JIS, or whatever. If one can see the Big5 text in that file, he can see all other Big5 files. If one can see the Thai characters in that file, he can also see the Thai characters when he opens a Thai text file with the suitable encoding (the default if he has done set-language-environment correctly). And so on. Charles> People who recommend checking this file are usually Charles> people who don't use double-byte East Asian languages. Sorry, I use Big5 very often. And I do recommend C-h h as a quick test to see if he has installed the big5 fonts correctly. (Big5 fonts do not come with XFree86, and many Linux distros have been ignoring the "leim" and "intlfont" packages for years.) >> The file is in a relevant encoding: it's the encoding used by >> Emacs internally. (Or rather, an encoding close to the >> internal encoding.) Charles> Relevant to whom? To Emacs. Charles> It's not in utf-8, right? So what? My .signature is in Big5 and it is not in utf-8, either. And my .emacs file is in emacs-mule encoding, which is not utf-8, either. Neither are utf-16 files utf-8. 
I think you're being religious when you worship utf-8. For Chinese text, utf-8 takes 50% more storage space than utf-16. I'd rather use utf-16. But big5 has the same storage efficiency (and a better one once you include some English text) and it is more common. Charles> No one that I know who works in XML or with East Asian Charles> international scripts works in utf-7, And for XML in Chinese, utf-8 wastes lots of space. To be practical, we often use big5 for XML files with Chinese. Charles> so while that encoding format may be relevant for those Charles> who are programming Emacs internally, it is not relevant Charles> for anyone using Emacs to do multilingual XML or HTML Charles> publication, because no one uses it. That's what I mean Charles> when I say "not relevant." My experience with Emacs's utf-8 <--> internal conversion has been good. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee
* Re: Chinese characters support 2003-05-13 7:40 ` Lee Sau Dan @ 2003-05-13 10:11 ` acmuller 2003-05-13 10:54 ` Charles Muller [not found] ` <mailman.6097.1052826249.21513.help-gnu-emacs@gnu.org> 2 siblings, 0 replies; 41+ messages in thread From: acmuller @ 2003-05-13 10:11 UTC (permalink / raw) On 5/13/2003, "Lee Sau Dan" <danlee@informatik.uni-freiburg.de> wrote: >I think you're being religious when you worship utf-8. Who said anything about worship? Why the sarcasm? I am an XML developer. UTF-8 is the standard encoding for XML documents. See http://www.w3.org/TR/REC-xml Chuck Charles Muller <acmuller@gol.com> Toyo Gakuen University Digital Dictionary of Buddhism: www.acmuller.net ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-13 7:40 ` Lee Sau Dan 2003-05-13 10:11 ` acmuller @ 2003-05-13 10:54 ` Charles Muller [not found] ` <mailman.6097.1052826249.21513.help-gnu-emacs@gnu.org> 2 siblings, 0 replies; 41+ messages in thread From: Charles Muller @ 2003-05-13 10:54 UTC (permalink / raw) Lee Sau Dan wrote: > And for XML in Chinese, utf-8 wastes lots of space. To be practical, > we often use big5 for XML files with Chinese. That's fine, if all you are doing is Chinese. The documents in my project include terms from over 15 languages, including Tibetan, Nepalese, Sanskrit, Pali, and several European languages. Unicode has codepoints for these characters, while Big5 (and other Chinese codesets) do not. Chuck --------------------------- Charles Muller <acmuller@gol.com> Faculty of Humanities, Toyo Gakuen University Digital Dictionary of Buddhism and CJKV-English Dictionary [http://www.acmuller.net] H-Buddhism List Editor [http://www2.h-net.msu.edu/~buddhism/] Mobile Phone: 090-9310-1787 ^ permalink raw reply [flat|nested] 41+ messages in thread
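The coverage point in the message above can be illustrated with Python's standard codecs. This is only a sketch: the sample words and the helper name `can_encode` are chosen for illustration and do not come from the project being described.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Report whether every character of `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = {
    "Chinese": "佛",
    "Sanskrit (Devanagari)": "बुद्ध",
    "Tibetan": "སངས་རྒྱས",
}

for label, text in samples.items():
    print(label, "big5:", can_encode(text, "big5"), "utf-8:", can_encode(text, "utf-8"))
```

Only the Chinese sample survives a Big5 round trip, while UTF-8 accepts all three, which is why a document mixing these scripts cannot be stored in a single national Chinese encoding.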
* Re: Chinese characters support [not found] ` <mailman.6097.1052826249.21513.help-gnu-emacs@gnu.org> @ 2003-05-15 8:07 ` Lee Sau Dan 0 siblings, 0 replies; 41+ messages in thread From: Lee Sau Dan @ 2003-05-15 8:07 UTC (permalink / raw) >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Charles> Lee Sau Dan wrote: >> And for XML in Chinese, utf-8 wastes lots of space. To be >> practical, we often use big5 for XML files with Chinese. Charles> That's fine, if all you are doing is Chinese. The Charles> documents in my project include terms from over 15 Charles> languages, including Tibetan, Nepalese, Sanskrit, Pali, Charles> and several European languages. Unicode has codepoints Charles> for these characters, while Big5 (and other Chinese Charles> codesets) do not. The emacs-mule encoding also has code points for these characters. Moreover, it can distinguish big5 characters from JIS characters (see the "Difference among chinese characters in GB, JIS, KSC, BIG5:" section in the HELLO file.), while Unicode (and hence utf-7, utf-8, utf-16, ucs2, ucs4) cannot. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee ^ permalink raw reply [flat|nested] 41+ messages in thread
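The unification described in the message above can be observed directly. The following Python sketch encodes one character in four national encodings and decodes each result back to Unicode. The choice of the character 中 and the helper names are assumptions made for illustration; "euc_jp" and "euc_kr" stand in for the JIS and KSC character sets, since those codecs encode JIS X 0208 and KS X 1001 respectively.

```python
LEGACY_ENCODINGS = ("gb2312", "big5", "euc_jp", "euc_kr")


def legacy_bytes(char: str) -> dict:
    """Encode one character in each national encoding."""
    return {name: char.encode(name) for name in LEGACY_ENCODINGS}


def decoded_code_points(char: str) -> set:
    """Decode each national byte sequence and collect the resulting code points."""
    return {ord(raw.decode(name)) for name, raw in legacy_bytes(char).items()}


char = "中"
for name, raw in legacy_bytes(char).items():
    print(name, raw.hex())

# All four byte sequences collapse to one Unicode code point.
print({hex(cp) for cp in decoded_code_points(char)})
```

The byte sequences differ per encoding, but after decoding there is a single code point left, so the information about which national character set the text came from is gone. That is the property the HELLO section on GB, JIS, KSC and BIG5 differences relies on emacs-mule to preserve.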
* Re: Chinese characters support 2003-05-10 16:17 ` Charles Muller 2003-05-10 16:45 ` Kai Großjohann @ 2003-05-10 17:58 ` Eli Zaretskii [not found] ` <mailman.5936.1052589798.21513.help-gnu-emacs@gnu.org> 2 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2003-05-10 17:58 UTC (permalink / raw) > Date: Sun, 11 May 2003 01:17:25 +0900 (JST) > Newsgroups: gnu.emacs.help > From: Charles Muller <acmuller@gol.com> > > It should be pointed out, nonetheless, that it is a bad idea to > cite the hello file as an example of international script functionality, > since it is set in an encoding that virtually no one ever uses (at least in > the CJK world) It is certainly useful to see whether Emacs is set up correctly for its non-ASCII support, including coding systems, fonts, and other facilities. Whether other software understands the way that file was encoded is irrelevant for this. > Someone should either get rid of that file or save it in a > relevant encoding. Until Emacs supports the full range of Unicode characters, the encoding used now to save etc/HELLO is about _the_only_ one that can do the job. Let me remind you that in the released versions of Emacs, only a subset of the BMP is supported.
* Re: Chinese characters support [not found] ` <mailman.5936.1052589798.21513.help-gnu-emacs@gnu.org> @ 2003-05-13 7:40 ` Lee Sau Dan 2003-05-14 3:15 ` Eli Zaretskii [not found] ` <mailman.6156.1052882447.21513.help-gnu-emacs@gnu.org> 0 siblings, 2 replies; 41+ messages in thread From: Lee Sau Dan @ 2003-05-13 7:40 UTC (permalink / raw) >>>>> "Eli" == Eli Zaretskii <eliz@elta.co.il> writes: Eli> Until Emacs supports the full range of Unicode characters, Eli> the encoding used now to save etc/HELLO is about _the_only_ Eli> one that can do the job. Let me remind you that in the Eli> released versions of Emacs, only a subset of the BMP is Eli> supported. I don't think so. Unicode will never be able to handle the "Difference among chinese characters in GB, JIS, KSC, BIG5:" section in the etc/HELLO file. Unicode simply unifies the "equivalent" characters that show up differently in that section of the HELLO file into single code points. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-13 7:40 ` Lee Sau Dan @ 2003-05-14 3:15 ` Eli Zaretskii [not found] ` <mailman.6156.1052882447.21513.help-gnu-emacs@gnu.org> 1 sibling, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2003-05-14 3:15 UTC (permalink / raw) > From: Lee Sau Dan <danlee@informatik.uni-freiburg.de> > Newsgroups: gnu.emacs.help > Date: 13 May 2003 09:40:16 +0200 > > >>>>> "Eli" == Eli Zaretskii <eliz@elta.co.il> writes: > > Eli> Until Emacs supports the full range of Unicode characters, > Eli> the encoding used now to save etc/HELLO is about _the_only_ > Eli> one that can do the job. Let me remind you that in the > Eli> released versions of Emacs, only a subset of the BMP is > Eli> supported. > > I don't think so. Unicode will never be able to handle the > "Difference among chinese characters in GB, JIS, KSC, BIG5:" section > in the etc/HELLO file. Unicode doesn't, but the Unicode Emacs will. Trust me ;-) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support [not found] ` <mailman.6156.1052882447.21513.help-gnu-emacs@gnu.org> @ 2003-05-15 8:07 ` Lee Sau Dan 2003-05-16 11:36 ` Eli Zaretskii 0 siblings, 1 reply; 41+ messages in thread From: Lee Sau Dan @ 2003-05-15 8:07 UTC (permalink / raw) >>>>> "Eli" == Eli Zaretskii <eliz@elta.co.il> writes: Eli> Until Emacs supports the full range of Unicode characters, Eli> the encoding used now to save etc/HELLO is about _the_only_ Eli> one that can do the job. Let me remind you that in the Eli> released versions of Emacs, only a subset of the BMP is Eli> supported. >> I don't think so. Unicode will never be able to handle the >> "Difference among chinese characters in GB, JIS, KSC, BIG5:" >> section in the etc/HELLO file. Eli> Unicode doesn't, but the Unicode Emacs will. Trust me ;-) So, you're agreeing that converting HELLO to utf-8 (which only represents Unicode) is not a good idea? Or are you resorting to dirty tricks using the Private Use Area? -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-15 8:07 ` Lee Sau Dan @ 2003-05-16 11:36 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2003-05-16 11:36 UTC (permalink / raw) > From: Lee Sau Dan <danlee@informatik.uni-freiburg.de> > Newsgroups: gnu.emacs.help > Date: 15 May 2003 10:07:01 +0200 > > >> I don't think so. Unicode will never be able to handle the > >> "Difference among chinese characters in GB, JIS, KSC, BIG5:" > >> section in the etc/HELLO file. > > Eli> Unicode doesn't, but the Unicode Emacs will. Trust me ;-) > > So, you're agreeing that converting HELLO to utf-8 (which only > represents Unicode) is not a good idea? I don't know yet. AFAIK, the issue of encoding etc/HELLO in the Unicode Emacs has not been discussed yet, but I expect it to be encoded in the internal Emacs representation of characters, because that by definition will support all the characters supported by Emacs, and do that unambiguously.
* Re: Chinese characters support 2003-05-10 14:26 ` Kai Großjohann 2003-05-10 16:17 ` Charles Muller @ 2003-05-12 23:05 ` Michael Na Li 2003-05-13 7:02 ` Kai Großjohann [not found] ` <mailman.5922.1052583563.21513.help-gnu-emacs@gnu.org> 2 siblings, 1 reply; 41+ messages in thread From: Michael Na Li @ 2003-05-12 23:05 UTC (permalink / raw) On 10 May 2003, Kai Großjohann spake thusly: > Gaoyan Xie <gxie@eecs.wsu.edu> writes: > > > I am trying to explore GNU emacs's multilingual support, and what I > > want is the display and input of Chinese characters. Have any of you > > done this before? I tried according to GNU emacs' online manual, but > > still couldn't make it work. BTW, I am using Redhat Linux 7.2 and GNU > > emacs 20.7. > > I don't know anything about Chinese support in general. But with > Emacs, it was very easy. > > I compiled and installed Emacs and I also installed some Chinese > fonts. (The GNU intlfonts package, available from ftp.gnu.org, is a > good starting point.) > > Then I typed M-x view-hello-file RET. This showed me some Chinese > (and Japanese, and Korean) characters. If you see empty boxes > instead of the Chinese characters, then some fonts are missing. > > Then I typed C-\ chinese-py RET to select a Pinyin input method. Don't you need M-x set-language-environment RET Chinese-GB RET such that the file is saved in gb2312 coding? The chinese-py-punct method also provides ways to input Chinese style punctuations. Michael ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support 2003-05-12 23:05 ` Michael Na Li @ 2003-05-13 7:02 ` Kai Großjohann 0 siblings, 0 replies; 41+ messages in thread From: Kai Großjohann @ 2003-05-13 7:02 UTC (permalink / raw) Michael Na Li <lina@u.washington.edu> writes: > Don't you need M-x set-language-environment RET Chinese-GB RET such that the > file is saved in gb2312 coding? It seems to offer gb2312 by default. That's because the characters in the buffer are gb2312. But if I was using Chinese all the time, I'd surely set my language environment to Chinese-GB. However, right now LC_CTYPE is set to de_DE@euro... -- This line is not blank. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: Chinese characters support [not found] ` <mailman.5922.1052583563.21513.help-gnu-emacs@gnu.org> @ 2003-05-13 7:40 ` Lee Sau Dan 0 siblings, 0 replies; 41+ messages in thread From: Lee Sau Dan @ 2003-05-13 7:40 UTC (permalink / raw) >>>>> "Charles" == Charles Muller <acmuller@gol.com> writes: Charles> Kai wrote: >> Then I typed M-x view-hello-file RET. This showed me some >> Chinese (and Japanese, and Korean) characters. If you see >> empty boxes instead of the Chinese characters, then some fonts >> are missing. Charles> It should be pointed out, nonetheless, that it is a bad Charles> idea to cite the hello file as an example of Charles> international script functionality, Why not? That file really illustrates the international script functionality. Charles> since it is set in an encoding that virtually no one ever Charles> uses (at least in the CJK world), That's a problem with encoding, not Emacs's international script functionality. Maybe you had "conformance to Unicode and national encodings" in mind when you said "international script functionality". They're different issues. Charles> and it is quite often the case that that file will Charles> display fine despite the fact that CJK won't work in Charles> utf-8 or native East Asian encodings. C-x RET c utf-8 C-x s ... does save my Chinese text files in UTF-8. C-x RET c big5 C-x s ... does save my Chinese text files in BIG5 -- the "native" encoding for traditional Chinese. And needless to say, I can read files in UTF-8 and big5 using C-x RET c ... C-x C-f. (For Emacs 20, I need to install an external package for Unicode encodings: MuleUCS or something like that.) Charles> Someone should either get rid of that file or save it in Charles> a relevant encoding. Since no "native" encoding preserves the details that the emacs-mule encoding saves, that "showoff" file must be kept in emacs-mule. e.g. 
the section "Difference among chinese characters in GB, JIS, KSC, BIG5" would be impossible with Unicode, GB, JIS, KSC or BIG5. Only emacs-mule has enough coding space to accommodate all characters from these encodings without unifying them, so that they can still be displayed differently. -- Lee Sau Dan 李守敦(Big5) ~{@nJX6X~}(HZ) E-mail: danlee@informatik.uni-freiburg.de Home page: http://www.informatik.uni-freiburg.de/~danlee
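The "C-x RET c big5" and "C-x RET c utf-8" steps in the message above amount to reading a file in one coding system and writing it in another. Outside Emacs the same conversion can be sketched in Python as follows. The function name `transcode`, the file names and the sample text are hypothetical, and nothing here reflects how Emacs implements the conversion internally.

```python
import os
import tempfile


def transcode(src: str, dst: str, src_encoding: str, dst_encoding: str) -> None:
    """Read `src` in one encoding and write the same text to `dst` in another."""
    with open(src, "r", encoding=src_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as outfile:
        outfile.write(text)


with tempfile.TemporaryDirectory() as tmp:
    big5_path = os.path.join(tmp, "sample.big5.txt")
    utf8_path = os.path.join(tmp, "sample.utf8.txt")

    # Create a small Big5 file to convert.
    with open(big5_path, "wb") as f:
        f.write("中文測試\n".encode("big5"))

    transcode(big5_path, utf8_path, "big5", "utf-8")

    with open(utf8_path, "rb") as f:
        converted = f.read()

    print(converted == "中文測試\n".encode("utf-8"))
```

Going from Big5 to UTF-8 always succeeds for valid input; the reverse direction raises an error for any character that Big5 does not contain.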