* utf-8 cjk translation bug?
From: Miles Bader @ 2003-09-30 8:30 UTC

I have `utf-translate-cjk-mode' enabled.

I have the following string in a buffer:

   ＮＥＣエレクトロニクス(株)

If I write it using, say, the `euc-jp' coding system, no problem. According to `C-u C-x =', all the japanese characters are in the charset japanese-jisx0208.

However, if I save it using utf-8, I get no complaints, but when I read it back in, the first 3 characters show up as little boxes. `C-u C-x =' shows the boxes as being in charset mule-unicode-e000-ffff; the rest of the characters are still listed as being in japanese-jisx0208.

I presume this is representable in utf-8, because unicode is supposed to be able to represent all characters in any component character set simultaneously, so it would seem to be a bug in utf-translate-cjk-mode.

Any ideas?

Thanks,

-Miles
--
Is it true that nothing can be known? If so how do we know this? -Woody Allen

* Re: utf-8 cjk translation bug?
From: Jason Rumney @ 2003-09-30 9:50 UTC
Cc: emacs-devel

Miles Bader wrote:

> `C-u C-x =' shows the boxes as being in charset mule-unicode-e000-ffff; the rest of
> the characters are still listed as being in japanese-jisx0208.

I guess you would need a unicode font that includes double-width roman characters. Or frob utf-translate-cjk-mode to ignore the fact that those characters are within the representable range of unicode characters and convert them to jisx0208 anyway.

* Re: utf-8 cjk translation bug?
From: Miles Bader @ 2003-09-30 10:05 UTC
Cc: emacs-devel

Jason Rumney <jasonr@gnu.org> writes:
> > `C-u C-x =' shows the boxes as being in charset
> > mule-unicode-e000-ffff; the rest of the characters are still listed
> > as being in japanese-jisx0208.
>
> I guess you would need a unicode font that includes double-width roman
> characters. Or frob utf-translate-cjk-mode to ignore the fact that those
> characters are within the representable range of unicode characters and
> convert them to jisx0208 anyway.

Why would that be necessary? The purpose of utf-translate-cjk-mode is to translate external unicode encodings to/from emacs charsets. For instance, after reading the utf-8 file, the katakana characters in the example I gave are represented in emacs using japanese-jisx0208, and displayed using a JISX0208.1983 encoded font. I don't think unicode fonts come into play at all.

-Miles
--
Yo mama's so fat when she gets on an elevator it HAS to go down.

* Re: utf-8 cjk translation bug?
From: Kenichi Handa @ 2003-09-30 12:59 UTC
Cc: d.love, emacs-devel

In article <buo7k3q3cgu.fsf@mcspd15.ucom.lsi.nec.co.jp>, Miles Bader <miles@lsi.nec.co.jp> writes:

> I have `utf-translate-cjk-mode' enabled.
> I have the following string in a buffer:
>    ＮＥＣエレクトロニクス(株)
> If I write it using, say, the `euc-jp' coding system, no problem. According
> to `C-u C-x =', all the japanese characters are in the charset
> japanese-jisx0208.
> However, if I save it using utf-8, I get no complaints, but when I read
> it back in, the first 3 characters show up as little boxes. `C-u C-x ='
> shows the boxes as being in charset mule-unicode-e000-ffff; the rest of
> the characters are still listed as being in japanese-jisx0208.
> I presume this is representable in utf-8, because unicode is supposed to be
> able to represent all characters in any component character set
> simultaneously, so it would seem to be a bug in utf-translate-cjk-mode.

The first three letters are "FULL WIDTH LATIN ?? LETTER" (U+FF??). Yes, they are representable in utf-8. But, in subst-jis.el, we have this code:

   (mapc
    (lambda (pair)
      (let ((unicode (car pair))
            (char (cadr pair)))
        ;; exclude non-CJK components from decode table
        (if (and (>= unicode #x2e80) (<= unicode #xd7a3))
            (puthash unicode char ucs-unicode-to-mule-cjk))
        (puthash char unicode ucs-mule-cjk-to-unicode)))

So, #xFF?? are excluded from ucs-unicode-to-mule-cjk, thus they are not translated to japanese-jisx0208 on decoding. If you have an ISO10646-1 font that contains full width glyphs for those characters, you can see correct glyphs.

I think the reason why they are excluded from the translation is that they are representable by the charset mule-unicode-e000-ffff, thus there's no need of translation. It seems to be a reasonable decision, but considering that most users don't have an ISO10646-1 font containing those glyphs, and that those characters can also be regarded as CJK components (only CJK users use them), I think we had better not exclude them from the translation.

So, I suggest changing the above line (and similar lines in the other subst-XXX.el) to:

   (if (>= unicode #x2e80)
       (puthash unicode char ucs-unicode-to-mule-cjk))

and modify ccl-decode-mule-utf-8 to check translation also for those characters.

Dave, what do you think? Does such a change lead to any problem? Is there anything else we should change?

---
Ken'ichi HANDA
handa@m17n.org

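The effect of the two range tests discussed in the message above can be tried out in isolation. The following is only a minimal sketch in stand-alone Emacs Lisp: the names `demo-fill', `demo-pairs', `demo-original' and `demo-proposed' are hypothetical, the CHAR values are arbitrary placeholder integers rather than real japanese-jisx0208 characters, and only the two comparisons are taken from the thread.

```elisp
;; Hypothetical stand-in for the loop that fills the decode table.
(defun demo-fill (pairs keep-p)
  "Build a decode table from PAIRS, keeping code points accepted by KEEP-P.
Each element of PAIRS is a list (UNICODE CHAR)."
  (let ((table (make-hash-table :test 'eql)))
    (dolist (pair pairs table)
      (let ((unicode (car pair))
            (char (cadr pair)))
        (when (funcall keep-p unicode)
          (puthash unicode char table))))))

(defvar demo-pairs
  '((#x30a8 1)    ; KATAKANA LETTER E
    (#x682a 2)    ; the ideograph in "(株)"
    (#xff2e 3))   ; FULLWIDTH LATIN CAPITAL LETTER N
  "Sample (UNICODE CHAR) pairs; the CHAR values are placeholders.")

;; Test quoted from subst-jis.el: only U+2E80..U+D7A3 enter the table.
(defvar demo-original
  (demo-fill demo-pairs
             (lambda (u) (and (>= u #x2e80) (<= u #xd7a3)))))

;; Test proposed in the message: everything from U+2E80 upward.
(defvar demo-proposed
  (demo-fill demo-pairs (lambda (u) (>= u #x2e80))))

(gethash #xff2e demo-original)   ; nil: left untranslated on decoding
(gethash #xff2e demo-proposed)   ; 3: would now be translated
```

With the original test the fullwidth letter has no entry in the decode table, which matches the boxes Miles reports; the katakana and the ideograph are present under both tests.
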
* Re: utf-8 cjk translation bug?
From: Dave Love @ 2003-10-01 12:44 UTC
Cc: emacs-devel, miles

Kenichi Handa <handa@m17n.org> writes:

> So, #xFF?? are excluded from ucs-unicode-to-mule-cjk, thus
> they are not translated to japanese-jisx0208 on decoding.
> If you have an ISO10646-1 font that contains full width
> glyphs for those characters, you can see correct glyphs.

Or you can display them with a jisx font, for instance.

> I think the reason why they are excluded from the
> translation is that they are representable by the charset
> mule-unicode-e000-ffff, thus there's no need of translation.

That was part of the reason for it -- the hash-based translation code is only relevant because we more-or-less used up the code space for the BMP. I also chose the boundaries to avoid breaking the region between the mule-unicode and CJK charsets.

> It seems to be a reasonable decision, but considering that
> most users don't have an ISO10646-1 font containing those
> glyphs,

I thought they typically did if they had 10646 fonts at all. Is the problem that in recent XFree86, for instance, the double-width characters are in different fonts which have `adstyl' `ja' or `ko'? As far as I remember, the fontset code doesn't deal with that yet. (So many special cases, sigh.)

> and that those characters can also be regarded as
> CJK components (only CJK users use them), I think we had
> better not exclude them from the translation.

I'm not really convinced, but I don't feel strongly about it. (If the extra charsets hadn't been added before mule-unicode, we'd just have covered the BMP with more mule-unicode ones.)

> So, I suggest changing the above line (and similar lines in
> the other subst-XXX.el) to:
>
>   (if (>= unicode #x2e80)
>       (puthash unicode char ucs-unicode-to-mule-cjk))
>
> and modify ccl-decode-mule-utf-8 to check translation also
> for those characters.
>
> Dave, what do you think? Does such a change lead to any
> problem?

As far as I remember, it includes too much, and you end up displaying some characters double width that probably shouldn't be, but I don't remember which. How about including the ranges of the double-width Western characters and the high CJK stuff explicitly? I guess it doesn't expand the tables greatly.

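The alternative suggested at the end of the message above, listing the translated ranges explicitly instead of using an open-ended lower bound, could look roughly like this. It is a sketch only: `demo-translate-ranges' and `demo-translate-p' are hypothetical names, the first range is the one tested in the quoted subst-jis.el code, and the second is the U+FF00..U+FFEF range reported as installed later in the thread. Whether Dave had exactly these bounds in mind is not stated.

```elisp
(defvar demo-translate-ranges
  '((#x2e80 . #xd7a3)    ; range already tested in the quoted code
    (#xff00 . #xffef))   ; range reported as installed later in the thread
  "Hypothetical list of inclusive (FROM . TO) code point ranges.")

(defun demo-translate-p (unicode)
  "Return non-nil if UNICODE lies in one of `demo-translate-ranges'."
  (let ((found nil))
    (dolist (range demo-translate-ranges found)
      (when (and (>= unicode (car range))
                 (<= unicode (cdr range)))
        (setq found t)))))

(demo-translate-p #xff2e)   ; non-nil: FULLWIDTH LATIN CAPITAL LETTER N
(demo-translate-p #xe000)   ; nil: outside both ranges
```

Compared with a bare `(>= unicode #x2e80)' test, a list like this leaves everything between U+D7A3 and U+FF00 untouched, which is the over-inclusion Dave is worried about.
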
* Re: utf-8 cjk translation bug?
From: Kenichi Handa @ 2003-10-02 1:08 UTC
Cc: emacs-devel, miles

In article <rzq65j9nn4j.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

>> I think the reason why they are excluded from the
>> translation is that they are representable by the charset
>> mule-unicode-e000-ffff, thus there's no need of translation.

> That was part of the reason for it -- the hash-based translation code
> is only relevant because we more-or-less used up the code space for
> the BMP. I also chose the boundaries to avoid breaking the region
> between the mule-unicode and CJK charsets.

Sorry, I don't understand the meaning of the last sentence.

>> It seems to be a reasonable decision, but considering that
>> most users don't have an ISO10646-1 font containing those
>> glyphs,

> I thought they typically did if they had 10646 fonts at all. Is the
> problem that in recent XFree86, for instance, the double-width
> characters are in different fonts which have `adstyl' `ja' or `ko'?

Ah, right, they have double-width glyphs for those chars. But, I think there are still many people who are not using the recent XFree86, or who have not installed those fonts.

> As far as I remember, the fontset code doesn't deal with that yet.
> (So many special cases, sigh.)

Right. So, even for XFree86 users, to utilize those fonts, we need extra work.

>> and that those characters can also be regarded as
>> CJK components (only CJK users use them), I think we had
>> better not exclude them from the translation.

> I'm not really convinced, but I don't feel strongly about it. (If the
> extra charsets hadn't been added before mule-unicode, we'd just have
> covered the BMP with more mule-unicode ones.)

And if I had known it would take that long to release the code that contains mule-unicode charsets, I'd have implemented a single 3-dimensional charset that covers almost all Unicode characters (Charset-ID 159 is not yet used).

>> So, I suggest changing the above line (and similar lines in
>> the other subst-XXX.el) to:
>>
>>   (if (>= unicode #x2e80)
>>       (puthash unicode char ucs-unicode-to-mule-cjk))
>>
>> and modify ccl-decode-mule-utf-8 to check translation also
>> for those characters.
>>
>> Dave, what do you think? Does such a change lead to any
>> problem?

> As far as I remember, it includes too much, and you end up displaying
> some characters double width that probably shouldn't be, but I don't
> remember which. How about including the ranges of the double-width
> Western characters and the high CJK stuff explicitly? I guess it
> doesn't expand the tables greatly.

Ok, I've just installed code that includes U+FF00..U+FFEF in the decode tables. Now, in utf-translate-cjk mode:

   (decode-coding-string (encode-coding-string "ＮＥＣエレクトロニクス(株)" 'utf-8) 'utf-8)
    => "ＮＥＣエレクトロニクス(株)"

---
Ken'ichi HANDA
handa@m17n.org

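The round trip shown at the end of the message above can also be checked programmatically. The sketch below assumes a Unicode-based Emacs (23 or later), where characters are Unicode code points and `utf-translate-cjk-mode' plays no part; it does not reproduce the Emacs 21 decode tables. The helper names `demo-utf-8-round-trip' and `demo-fullwidth-p' are hypothetical.

```elisp
(defun demo-utf-8-round-trip (string)
  "Encode STRING as UTF-8, decode it again, and return the result."
  (decode-coding-string (encode-coding-string string 'utf-8) 'utf-8))

(defun demo-fullwidth-p (char)
  "Return non-nil if CHAR is in U+FF00..U+FFEF."
  (and (>= char #xff00) (<= char #xffef)))

;; "\uff2e\uff25\uff23" is the fullwidth "NEC" prefix, written with
;; escapes so that the example does not depend on the file encoding.
(let* ((original "\uff2e\uff25\uff23")
       (decoded (demo-utf-8-round-trip original)))
  (list (string= original decoded)
        (mapcar #'demo-fullwidth-p (string-to-list decoded))))
```

On such an Emacs the final form should evaluate to `(t (t t t))': the string survives the round trip and all three characters stay in the range that Handa added to the decode tables.
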
* Re: utf-8 cjk translation bug?
From: Dave Love @ 2003-10-03 16:04 UTC
Cc: emacs-devel, miles

Kenichi Handa <handa@m17n.org> writes:

>> I also chose the boundaries to avoid breaking the region
>> between the mule-unicode and CJK charsets.
>
> Sorry, I don't understand the meaning of the last sentence.

mule-unicode-2500-33ff overlaps with one of the CJK blocks. You want to avoid translating the part that overlaps to mule-unicode-2500-33ff so that the block is displayed in a consistent font by default. Is that clear?

> Ah, right, they have double-width glyphs for those chars.
> But, I think there are still many people who are not using
> the recent XFree86, or who have not installed those fonts.

I would have expected them to have iso10646 fonts if they are using utf-8 (for the sake of applications other than Emacs) but maybe that isn't the case. You are obviously in a better position than I am to decide the right thing.

> And if I had known it would take that long to release the code
> that contains mule-unicode charsets, I'd have implemented a
> single 3-dimensional charset that covers almost all Unicode
> characters (Charset-ID 159 is not yet used).

I may have the remains of the partial implementation somewhere. It almost looks attractive again, as I guess there is no likelihood of Emacs 22 being released remotely soon...

* Re: utf-8 cjk translation bug?
From: Jason Rumney @ 2003-10-03 16:34 UTC
Cc: emacs-devel

Dave Love wrote:

> I would have expected them to have iso10646 fonts if they are using
> utf-8 (for the sake of applications other than Emacs) but maybe that
> isn't the case.

I think the problem is not that they don't have iso10646 fonts, it is that the iso10646 fonts they do have do not contain any of the double width characters, including double width roman that is in the 2500-33ff range. Until Emacs gets a function to query which glyphs a font has (I see such a function in emacs-unicode-2), then it is safer to use localized fonts where possible instead of iso10646.

* Re: utf-8 cjk translation bug?
From: Miles Bader @ 2003-10-06 2:29 UTC
Cc: Dave Love, emacs-devel

Jason Rumney <jasonr@gnu.org> writes:
> > I would have expected them to have iso10646 fonts if they are using
> > utf-8 (for the sake of applications other than Emacs) but maybe that
> > isn't the case.
>
> I think the problem is not that they don't have iso10646 fonts, it is
> that the iso10646 fonts they do have do not contain any of the double
> width characters, including double width roman that is in the
> 2500-33ff range.

Yeah, that's definitely the case, and it's not just a problem with double-width characters -- the coverage of many iso10646 fonts seems completely crap.

E.g., see a post by `Danilo Segan' on this list. It apparently contains cyrillic characters encoded in UTF-8, which emacs dutifully tries to render using an iso10646 font, but show up as square boxes on my system...

Here's the output of `C-u C-x =', in case anyone is interested:

     character: с (01212141, 332897, 0x51461, U+0441)
       charset: mule-unicode-0100-24ff
                (Unicode characters of the range U+0100..U+24FF.)
    code point: 40 97
        syntax: w which means: word
      category: y:Cyrillic
   buffer code: 0x9C 0xF4 0xA8 0xE1
     file code: 0x9C 0xF4 0xA8 0xE1 (encoded by coding system raw-text-unix)
       display: by this font (glyph code)
        -bitstream-bitstream vera sans mono-medium-r-normal--16-122-95-95-c-100-iso10646-1 (0x441)

-Miles
--
`Suppose Korea goes to the World Cup final against Japan and wins,' Moon said.
`All the past could be forgiven.' [NYT]

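The `code point: 40 97' line in the output above is consistent with mule-unicode-0100-24ff being laid out as a 96x96 grid that starts at U+0100, with each coordinate offset by 32. That layout is an assumption here, inferred from the numbers (U+0100..U+24FF holds exactly 96 * 96 = 9216 code points) rather than stated in the thread, and `demo-mule-unicode-code-point' is a hypothetical helper.

```elisp
(defun demo-mule-unicode-code-point (unicode base)
  "Return (ROW COL) for UNICODE in a 96x96 charset starting at BASE.
Both coordinates are offset by 32, as in the output quoted above."
  (let ((offset (- unicode base)))
    (list (+ 32 (/ offset 96))
          (+ 32 (% offset 96)))))

;; U+0441 is offset 833 from U+0100; 833 = 8 * 96 + 65.
(demo-mule-unicode-code-point #x0441 #x0100)   ; => (40 97)
```

Adding 128 to each coordinate gives 0xA8 and 0xE1, the last two bytes of the `buffer code' line in the same output.
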
* Re: utf-8 cjk translation bug?
From: Miles Bader @ 2003-10-06 20:00 UTC

Miles Bader <miles@lsi.nec.co.jp> writes:
> Yeah, that's definitely the case, and it's not just a problem with
> double-width characters -- the coverage of many iso10646 fonts seems
> completely crap.

BTW, does this mean that the new unicode emacs will have problems rendering many charsets that are currently displayed properly by emacs?

-Miles
--
We live, as we dream -- alone....

* Re: utf-8 cjk translation bug?
From: Jason Rumney @ 2003-10-06 20:53 UTC
Cc: emacs-devel

Miles Bader <miles@gnu.org> writes:

> Miles Bader <miles@lsi.nec.co.jp> writes:
> > Yeah, that's definitely the case, and it's not just a problem with
> > double-width characters -- the coverage of many iso10646 fonts seems
> > completely crap.
>
> BTW, does this mean that the new unicode emacs will have problems
> rendering many charsets that are currently displayed properly by emacs?

I'm sure Handa-san can confirm for sure, but unicode Emacs has a new function x_get_font_repertory which seems to deal with this situation.

* Re: utf-8 cjk translation bug?
From: Kenichi Handa @ 2003-10-06 23:18 UTC
Cc: emacs-devel

In article <87u16mf867.fsf@tc-1-100.kawasaki.gol.ne.jp>, Miles Bader <miles@gnu.org> writes:

> Miles Bader <miles@lsi.nec.co.jp> writes:
>> Yeah, that's definitely the case, and it's not just a problem with
>> double-width characters -- the coverage of many iso10646 fonts seems
>> completely crap.

> BTW, does this mean that the new unicode emacs will have problems
> rendering many charsets that are currently displayed properly by emacs?

No. In emacs-unicode, we can assign multiple fonts for each script, charset, or a range of character codes, and Emacs selects one that has a requested glyph and has the highest priority depending on the current language environment.

---
Ken'ichi HANDA
handa@m17n.org

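The selection rule described in the message above (among several candidate fonts, take the highest-priority one that actually has the glyph) can be illustrated with a toy model. Everything in this sketch is hypothetical: the font names, the repertoires, and the functions `demo-font-covers-p' and `demo-select-font' are invented for illustration and are not the emacs-unicode fontset API.

```elisp
(defvar demo-fonts
  '(("latin-only-10646" (#x0020 . #x024f))
    ("jisx0208-font"    (#x3000 . #x30ff) (#x4e00 . #x9fff) (#xff00 . #xffef))
    ("cyrillic-font"    (#x0400 . #x04ff)))
  "Hypothetical fonts in priority order, each as (NAME RANGE...).
Every RANGE is an inclusive (FROM . TO) pair of code points.")

(defun demo-font-covers-p (font char)
  "Return non-nil if FONT lists a range that contains CHAR."
  (let ((covered nil))
    (dolist (range (cdr font) covered)
      (when (and (>= char (car range)) (<= char (cdr range)))
        (setq covered t)))))

(defun demo-select-font (char fonts)
  "Return the name of the first font in FONTS that covers CHAR, or nil."
  (let ((result nil))
    (dolist (font fonts result)
      (when (and (null result) (demo-font-covers-p font char))
        (setq result (car font))))))

(demo-select-font #x0441 demo-fonts)   ; "cyrillic-font"
(demo-select-font #xff2e demo-fonts)   ; "jisx0208-font"
```

In this model an iso10646 font with a small repertoire is simply skipped for characters it lacks, instead of producing empty boxes; reordering `demo-fonts' stands in for the dependence on the language environment.
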
* Re: utf-8 cjk translation bug?
From: Stephen J. Turnbull @ 2003-10-07 9:57 UTC
Cc: emacs-devel

>>>>> "Miles" == Miles Bader <miles@gnu.org> writes:

    Miles> Miles Bader <miles@lsi.nec.co.jp> writes:
    >> Yeah, that's definitely the case, and it's not just a problem
    >> with double-width characters -- the coverage of many iso10646
    >> fonts seems completely crap.

Microsoft Arial, for one of the egregious worst. :-)

Why should a Russian font designer be good at designing Sanskrit glyphs? Who knows which Thai fonts go well with a given Arabic one? ISO 10646 fonts should cover what their designers are good at, no more---and not necessarily no less.

It's not even obvious that there should be terminal fonts with universal coverage. Some Unihan users will surely object to any given choice of glyphs, for example.

    Miles> BTW, does this mean that the new unicode emacs will have
    Miles> problems rendering many charsets that are currently
    Miles> displayed properly by emacs?

In practice, probably yes ... but that will be a bug easily fixed, and long before release. :-)

Post-modern Emacsen will have to have support for efficiently querying font repertoire (I see emacs-unicode already has a defined API). For example, you'll get this for free with Xft2/fontconfig.

--
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.

* Re: utf-8 cjk translation bug?
From: Dave Love @ 2003-10-07 11:41 UTC
Cc: emacs-devel, Jason Rumney

Miles Bader <miles@lsi.nec.co.jp> writes:

> Yeah, that's definitely the case, and it's not just a problem with
> double-width characters -- the coverage of many iso10646 fonts seems
> completely crap.

You must have missed my rants on the topic (and Unicode fundamentalism in general). Emacs 22 tries to work around the problems with such fonts. Presumably the proper solution would be to have low-volume meta information available about the repertoires.

* Re: utf-8 cjk translation bug?
From: Dave Love @ 2003-10-07 11:40 UTC
Cc: emacs-devel

Jason Rumney <jasonr@gnu.org> writes:

> I think the problem is not that they don't have iso10646 fonts, it is
> that the iso10646 fonts they do have do not contain any of the double
> width characters,

I meant one with the appropriate repertoire.

> including double width roman that is in the 2500-33ff
> range. Until Emacs gets a function to query which glyphs a font has (I
> see such a function in emacs-unicode-2), then it is safer to use
> localized fonts where possible instead of iso10646.

See TODO, I think. However, checking the large number of tiny-repertoire fonts that are randomly encoded as iso10646 that you might have under XFree86, for instance, seems like bad news. You at least have to load them to extract information.

[The issue of which fonts you use is actually orthogonal to which charsets you decode into, as long as you have the translation tables available. The CCL for the font encoding may not exist currently in cases where it would be useful, but it is trivial to provide. See examples which update `font-ccl-encoder-alist'.]

* Re: utf-8 cjk translation bug?
From: Kenichi Handa @ 2003-10-06 23:53 UTC
Cc: emacs-devel, miles

In article <rzqzngigvdy.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
>>> I also chose the boundaries to avoid breaking the region
>>> between the mule-unicode and CJK charsets.
>>
>> Sorry, I don't understand the meaning of the last sentence.

> mule-unicode-2500-33ff overlaps with one of the CJK blocks. You want
> to avoid translating the part that overlaps to mule-unicode-2500-33ff
> so that the block is displayed in a consistent font by default. Is
> that clear?

Yes. I see.

>> Ah, right, they have double-width glyphs for those chars.
>> But, I think there are still many people who are not using
>> the recent XFree86, or who have not installed those fonts.

> I would have expected them to have iso10646 fonts if they are using
> utf-8 (for the sake of applications other than Emacs) but maybe that
> isn't the case. You are obviously in a better position than I am to
> decide the right thing.

"using utf-8" means many things. If they are using utf-8 locale, I think they surely have those fonts. But, as far as I know, ja_JP.UTF-8 is still not that popular in Japan. And, even in ja_JP.eucJP locale, people occasionally have to use utf-8 file for many reasons.

---
Ken'ichi HANDA
handa@m17n.org

* Re: utf-8 cjk translation bug?
From: Dave Love @ 2003-10-10 16:56 UTC
Cc: emacs-devel, miles

Kenichi Handa <handa@m17n.org> writes:

> "using utf-8" means many things. If they are using utf-8
> locale, I think they surely have those fonts. But, as far
> as I know, ja_JP.UTF-8 is still not that popular in Japan.
> And, even in ja_JP.eucJP locale, people occasionally have to
> use utf-8 file for many reasons.

Of course, but I assume they don't use Emacs in isolation and they probably would display such files in other ways sometimes.

Since we have to make the best of things which we probably can't get obviously right in all circumstances, I want to understand the context and I don't mean to argue with what you think is appropriate.

* Re: utf-8 cjk translation bug?
From: Kenichi Handa @ 2003-10-13 23:55 UTC
Cc: miles, emacs-devel

In article <rzqzng9vxou.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
>> "using utf-8" means many things. If they are using utf-8
>> locale, I think they surely have those fonts. But, as far
>> as I know, ja_JP.UTF-8 is still not that popular in Japan.
>> And, even in ja_JP.eucJP locale, people occasionally have to
>> use utf-8 file for many reasons.

> Of course, but I assume they don't use Emacs in isolation and they
> probably would display such files in other ways sometimes.

Sometimes, Emacs is the only way (or, at least, the only way he knows) to see files that are encoded in a way that his locale doesn't support.

> Since we have to make the best of things which we probably
> can't get obviously right in all circumstances, I want to
> understand the context and I don't mean to argue with what
> you think is appropriate.

Yes, I understand your intention.

---
Ken'ichi HANDA
handa@m17n.org
