* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Dave Love @ 2004-03-18 15:34 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

[I don't know what this has to do with the subject.]

Kenichi Handa <handa@m17n.org> writes:

> In article <buoeksjyee3.fsf@mcspd15.ucom.lsi.nec.co.jp>, Miles Bader <miles@lsi.nec.co.jp> writes:
>
>> Alexander Winston <alexander.winston@comcast.net> writes:
>>> Okay, back to UTF-8. With regard to CJK being disabled by default, I
>>> believe that this decision is rather prejudicial to many Asian users.

I don't think so. There's no reason why you shouldn't define a
language environment corresponding to ja_JP.UTF-8 which turned it on
(not that language environments are the right approach to locale
handling). Anyway, there seems to be very little interest from users;
I don't recall any of the contributions I expected. Mostly I've just
had unhelpful remarks from non-CJK users.

>> I've been told that the reason `utf-translate-cjk-mode' is disabled by
>> default is that it consumes some non-trivial amount of memory (and
>> loading time,

Yes. If I remember correctly, I posted measurements.

>> unless it's dumped I guess).

It doesn't make sense to dump it, the way it works.

> As we have post-read-conversion function for utf-8, it is
> possible to detect untranslated CJK characters and translate
> them.
>
> How about this?
>
> Change utf-translate-cjk-mode to a customizable variable
> utf-translate-cjk which is nil, t, or auto (default). The
> values nil and t mean the same thing as the current value of
> utf-translate-cjk-mode. The value `auto' means setting up
> tables for translating CJK characters automatically if
> necessary.
>
> By adding pre-write-conversion function, we can make the
> above work also on writing. But, in that case, it seems
> difficult to make find-coding-systems-region/string work
> consistently. To check if a text is encodable by utf-8, we
> must load translation tables.

As far as I remember, that's why I didn't implement that sort of
thing.

post-read-conversion machinery is already there, I think.

[Is this code base ever going to be released so that most users
actually can use it?]

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-04-07 12:30 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

In article <rzqvfl2yxsg.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

>> Change utf-translate-cjk-mode to a customizable variable
>> utf-translate-cjk which is nil, t, or auto (default). The
>> values nil and t mean the same thing as the current value of
>> utf-translate-cjk-mode. The value `auto' means setting up
>> tables for translating CJK characters automatically if
>> necessary.
>>
>> By adding pre-write-conversion function, we can make the
>> above work also on writing. But, in that case, it seems
>> difficult to make find-coding-systems-region/string work
>> consistently. To check if a text is encodable by utf-8, we
>> must load translation tables.

> As far as I remember, that's why I didn't implement that sort of
> thing.

Wait! If utf-translate-cjk-mode can encode all jis, ksc,
big5, and gb to utf-8, we can tell that they can be encoded
by utf-8 without loading tables. What we have to do is to
simply include those charsets in `safe-charsets' on defining
utf-8.

> post-read-conversion machinery is already there, I think.

Yes, utf-8 already has utf-8-post-read-conversion which
composes unencoded raw-bytes into Unicode U+FFFD.

> [Is this code base ever going to be released so that most users
> actually can use it?]

I'd like to ask that too.

---
Ken'ichi HANDA
handa@m17n.org

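The `safe-charsets' idea above can be illustrated outside Emacs. The following is only a rough Python sketch, not Emacs code: the set contents and the function name `can_encode_without_tables` are assumptions made for illustration. The point is that if every charset appearing in a text is declared safe up front, encodability can be answered from the declaration alone, without touching the large translation tables.

```python
# Hypothetical declaration made when the coding system is defined.
# The charset names are illustrative; the real list would come from
# the utf-8 coding system definition.
SAFE_CHARSETS = frozenset({
    "ascii",
    "japanese-jisx0208",
    "korean-ksc5601",
    "chinese-big5-1",
    "chinese-gb2312",
})


def can_encode_without_tables(charsets_in_text):
    """Return True if every charset used by the text is declared safe.

    Only the declaration is consulted; no translation table is loaded.
    """
    return set(charsets_in_text) <= SAFE_CHARSETS


if __name__ == "__main__":
    print(can_encode_without_tables({"ascii", "japanese-jisx0208"}))
    print(can_encode_without_tables({"ascii", "some-unlisted-charset"}))
```

The trade-off is that a declaration like this is an approximation: a code position that belongs to a safe charset but is not defined by the underlying standard would still be reported as encodable.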
* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Dave Love @ 2004-04-08 11:27 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

Kenichi Handa <handa@m17n.org> writes:

> Wait! If utf-translate-cjk-mode can encode all jis, ksc,
> big5, and gb to utf-8,

I don't think that's true (or I think it wasn't when I built the
tables). Maybe that's not so (now). Also, the tables are
customizable by design -- for instance, I anticipated people adding
characters from CNS.

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-04-09 11:28 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

In article <rzqu0zu1zh4.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
>> Wait! If utf-translate-cjk-mode can encode all jis, ksc,
>> big5, and gb to utf-8,

> I don't think that's true (or I think it wasn't when I built the
> tables). Maybe that's not so (now). Also, the tables are
> customizable by design -- for instance, I anticipated people adding
> characters from CNS.

I've just checked all subst-*.el. They all contain full
maps, i.e. all defined characters can be encoded into utf-8.
Of course, a character not defined in each standard (e.g. a
character made by (make-char 'japanese-jisx0208 37 126))
can't be encoded, but I think the merit of ignoring such a
character is higher than correctly telling that they can't
be encoded into utf-8.

---
Ken'ichi HANDA
handa@m17n.org

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-07 12:27 UTC

While fixing a bug in utf-8-post-read-conversion (it may
modify text out of range), I remembered this discussion,
and did some work.

In article <200404091128.UAA02120@etlken.m17n.org>, Kenichi Handa <handa@m17n.org> writes:

> In article <rzqu0zu1zh4.fsf@albion.dl.ac.uk>, Dave Love
> <d.love@dl.ac.uk> writes:
>> Kenichi Handa <handa@m17n.org> writes:
>>> Wait! If utf-translate-cjk-mode can encode all jis,
>>> ksc, big5, and gb to utf-8,

>> I don't think that's true (or I think it wasn't when I
>> built the tables). Maybe that's not so (now). Also, the
>> tables are customizable by design -- for instance, I
>> anticipated people adding characters from CNS.

> I've just checked all subst-*.el. They all contain full
> maps, i.e. all defined characters can be encoded into
> utf-8. Of course, a character not defined in each
> standard (e.g. a character made by (make-char
> 'japanese-jisx0208 37 126)) can't be encoded, but I think
> the merit of ignoring such a character is higher than
> correctly telling that they can't be encoded into utf-8.

I think I succeeded in loading subst-*.el not at the time of
customizing utf-translate-cjk-mode to t but only when it is
found that loading them is necessary on decoding or encoding
utf-8, or on running decode/encode-char. This means that we
can make the default value of utf-translate-cjk-mode t
without loading subst-*.el at build time.

I think it's a big improvement especially for CJK users, and
is an improvement of an existing feature rather than a new
feature. If people agree on making utf-translate-cjk-mode
default to t, I'll brush up the current working code and
install the changes.

---
Ken'ichi HANDA
handa@m17n.org

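The on-demand loading described in this message follows a familiar lazy-initialization pattern. Here is a minimal Python sketch of that pattern, not the actual Emacs implementation: `LazyTable`, `load_demo_table`, the code-point range check and the table contents are all hypothetical stand-ins. The large table is read only when a lookup actually needs it, so enabling the feature by default costs nothing for text that never contains such characters.

```python
# Illustrative range only: the CJK Unified Ideographs block.
CJK_FIRST, CJK_LAST = 0x4E00, 0x9FFF


def load_demo_table():
    """Stand-in for reading the large substitution tables (made-up data)."""
    return {0x4E2D: ("demo-charset", 0x1234)}


class LazyTable:
    """Defer loading the mapping until a lookup really requires it."""

    def __init__(self, loader):
        self._loader = loader
        self._table = None
        self.loads = 0  # how many times the expensive load happened

    def lookup(self, code_point):
        # Code points outside the range never trigger a load.
        if not (CJK_FIRST <= code_point <= CJK_LAST):
            return None
        if self._table is None:
            self._table = self._loader()
            self.loads += 1
        return self._table.get(code_point)


if __name__ == "__main__":
    table = LazyTable(load_demo_table)
    table.lookup(ord("a"))
    print("loads after ASCII lookup:", table.loads)
    table.lookup(0x4E2D)
    print("loads after CJK lookup:", table.loads)
```

The cost is not removed, only postponed: the first qualifying lookup still pays for the full load, even for a user who has no use for the result.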
* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Miles Bader @ 2004-06-07 12:36 UTC
Cc: mariano, alexander.winston, d.love, emacs-devel, danilo, monnier, miles

On Mon, Jun 07, 2004 at 09:27:36PM +0900, Kenichi Handa wrote:
> I think it's a big improvement especially for CJK users, and
> is an improvement of an existing feature rather than a new
> feature. If people agree on making utf-translate-cjk-mode
> to t, I'll brush-up the current working code and install the
> changes.

Absolutely! Then we can say "utf-8 is (almost) completely
supported"... I think this is a very important thing.

-Miles
--
"Though they may have different meanings, the cries of 'Yeeeee-haw!' and
'Allahu akbar!' are, in spirit, not actually all that different."

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-07 13:00 UTC
Cc: mariano, alexander.winston, d.love, emacs-devel, danilo, monnier, miles

In article <20040607123615.GA29450@fencepost>, Miles Bader <miles@gnu.org> writes:

> On Mon, Jun 07, 2004 at 09:27:36PM +0900, Kenichi Handa
> wrote:
>> I think it's a big improvement especially for CJK users,
>> and is an improvement of an existing feature rather than
>> a new feature. If people agree on making
>> utf-translate-cjk-mode to t, I'll brush-up the current
>> working code and install the changes.

> Absolutely! Then we can say "utf-8 is (almost) completely
> supported"... I think this is a very important thing.

I think "completely" is still too strong even with preceding
"(almost)". Perhaps "utf-8 support is fairly good" or
"Unicode BMP support is fairly good".

---
Ken'ichi HANDA
handa@m17n.org

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Dave Love @ 2004-06-08 18:02 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

Kenichi Handa <handa@m17n.org> writes:

>> Absolutely! Then we can say "utf-8 is (almost) completely
>> supported"... I think this is a very important thing.
>
> I think "completely" is still too strong even with preceding
> "(almost)".

I know what you mean, but I think that's the sort of thing that
encourages the established user confusion over encoding issues.

UTF-8 per se is fully supported up to some limit on the code point.
(I hope that's as large as the Emacs 22 maximum codepoint, but I don't
remember.) Whether or not valid unicodes can be decoded into a
character Emacs can actually encode/display/input properly is a
different matter, and the feature should affect all relevant CCL
coding systems, especially UTF-16.

> Perhaps "utf-8 support is fairly good" or
> "Unicode BMP support is fairly good".

The latter is much better. (Exceptions include at least: various
complex scripts, much of the CJK space (little used?), reliable
display of CJK e.g. with XFree86 10646-encoded fonts, locale support
(including customization of the font encodings preferred), and BIDI.)

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-09 7:37 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

In article <rzq4qpmgc80.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
> >> Absolutely! Then we can say "utf-8 is (almost) completely
> >> supported"... I think this is a very important thing.
> >
> > I think "completely" is still too strong even with preceding
> > "(almost)".

> I know what you mean, but I think that's the sort of thing that
> encourages the established user confusion over encoding issues.

> UTF-8 per se is fully supported up to some limit on the code point.
> (I hope that's as large as the Emacs 22 maximum codepoint, but I don't
> remember.)

No, the current support of UTF-8 is limited to U+10FFFF (the
maximum Unicode character).

> Whether or not valid unicodes can be decoded into a
> character Emacs can actually encode/display/input properly is a
> different matter,

Ah, yes. In that sense, we can say utf-8 encoding/decoding
is completely supported.

> and the feature should affect all relevant CCL
> coding systems, especially UTF-16.

As surrogate pairs were not handled well by the UTF-16 converter,
I've just fixed that too (not yet installed, I'm now adding
comments in the code). Untranslatable characters are decoded
into UTF-8 form represented by the sequence of
eight-bit-graphic/control characters (the same way as UTF-8
decoding, thus we can use utf-8-post-read-conversion). The
UTF-16 encoder encodes such a sequence back to the original
UTF-16 form. So, now the UTF-16 support is at the same
level as UTF-8.

---
Ken'ichi HANDA
handa@m17n.org

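For readers unfamiliar with surrogate pairs, the arithmetic a UTF-16 converter has to perform can be sketched as follows. This is the standard UTF-16 calculation written in Python for illustration, not the CCL code discussed in the thread; the function names are made up. It also shows why U+10FFFF is the ceiling mentioned above: it is the largest value a surrogate pair can express.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %#x" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %#x" % low)
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def split_to_surrogates(code_point):
    """Split a code point beyond the BMP into its surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("outside the supplementary planes: %#x" % code_point)
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    # U+1D11E (MUSICAL SYMBOL G CLEF) is a common test character.
    print(hex(combine_surrogates(0xD834, 0xDD1E)))
    print([hex(u) for u in split_to_surrogates(0x1D11E)])
```

A converter that treats each 16-bit unit as an independent character gets everything in the BMP right and silently mangles everything above it, which is consistent with the kind of bug described here.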
* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Stefan Monnier @ 2004-06-09 9:38 UTC
Cc: mariano, alexander.winston, d.love, emacs-devel, danilo, miles

> As surrogate pair was not handled well by UTF-16 converter,
> I've just fixed it too (not yet installed, I'm now adding
> comments in a code). Untranslatable characters are decoded
> into UTF-8 form represented by the sequence of
> eight-bit-graphic/control characters (the same way as UTF-8
> decoding, thus we can use utf-8-post-read-conversion). The
> UTF-16 encoder encodes such a sequence back to the original
> UTF-16 form. So, now the UTF-16 support is at the same
> level as UTF-8.

Does that mean that some sequences of eight-bit-graphic/control are not
encoded into the corresponding raw bytes?

If so, that makes me a bit uneasy, since those special chars were
introduced specifically to handle things like binary input or
bad-byte-sequences and make sure that we at least preserve the raw bytes in
those cases.

        Stefan

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-10 0:20 UTC
Cc: mariano, alexander.winston, d.love, emacs-devel, danilo, miles

In article <jwvu0xlhy6d.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> > As surrogate pair was not handled well by UTF-16 converter,
> > I've just fixed it too (not yet installed, I'm now adding
> > comments in a code). Untranslatable characters are decoded
> > into UTF-8 form represented by the sequence of
> > eight-bit-graphic/control characters (the same way as UTF-8
> > decoding, thus we can use utf-8-post-read-conversion). The
> > UTF-16 encoder encodes such a sequence back to the original
> > UTF-16 form. So, now the UTF-16 support is at the same
> > level as UTF-8.

> Does that mean that some sequences of eight-bit-graphic/control are not
> encoded into the corresponding raw bytes?

No. But, that's only the case that we encode a modified
text (i.e. eight-bit-graphic/control chars are
added/modified after we decoded a source).

> If so, that makes me a bit uneasy, since those special chars were
> introduced specifically to handle things like binary input or
> bad-byte-sequences and make sure that we at least preserve the raw bytes in
> those cases.

As far as we encode a non-modified text that is generated by
decoding a source, we can preserve the byte sequence even if
the original source contains bad-byte-sequence (for the case
of UTF-8, I found a case that doesn't work as expected and
fixed).

---
Ken'ichi HANDA
handa@m17n.org

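The property under discussion, that undecodable bytes survive a decode/encode round trip as long as the text is not modified in between, has a close analogue in Python's `surrogateescape` error handler. The sketch below uses that Python mechanism purely as an analogy; it is not how Emacs represents raw bytes, and the function name `round_trip` is made up. It also shows the caveat about modified text: placeholder characters assembled by hand can encode to bytes that happen to form valid UTF-8, so decoding them again yields a different string.

```python
def round_trip(raw):
    """Decode bytes leniently, then encode them back unchanged."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    # 0xFF and 0xFE can never appear in valid UTF-8.
    source = b"ok \xff\xfe caf\xc3\xa9"
    print(round_trip(source) == source)

    # Caveat: two placeholders inserted by hand encode to the bytes
    # C3 A9, which decode as one ordinary character, so this string
    # does not come back from a round trip through bytes.
    edited = "\udcc3\udca9"
    as_bytes = edited.encode("utf-8", errors="surrogateescape")
    print(as_bytes.decode("utf-8", errors="surrogateescape") == edited)
```

In both systems the guarantee is one-directional: bytes to text to bytes is lossless for unmodified text, while text containing placeholders created by editing has no such guarantee.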
* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Dave Love @ 2004-06-08 17:56 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

Kenichi Handa <handa@m17n.org> writes:

> I think I succeeded in loading subst-*.el not at the time of
> customizing utf-translate-cjk-mode to t but only when it is
> found that loading them is necessary on decoding or encoding
> utf-8, or on running decode/encode-char. This means that we
> can make the default value of utf-translate-cjk-mode to t
> without loading subst-*.el at building time.

It doesn't fix the potential effects on non-CJK users if decoding a
bit of Unicode text containing such a character will load the large
tables even if they're useless to the user. Maybe there aren't many
people now with 48MB P133s or old SPARCs like me, in which case it's a
reasonable default, but I suggest an entry in NEWS/PROBLEMS about it.

> I think it's a big improvement especially for CJK users,

I agree it should be on for CJK users anyway. (I thought it was now
conditional on the language environment.)

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-09 7:24 UTC
Cc: mariano, alexander.winston, emacs-devel, danilo, monnier, miles

In article <rzqd64agch6.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:

> Kenichi Handa <handa@m17n.org> writes:
> > I think I succeeded in loading subst-*.el not at the time of
> > customizing utf-translate-cjk-mode to t but only when it is
> > found that loading them is necessary on decoding or encoding
> > utf-8, or on running decode/encode-char. This means that we
> > can make the default value of utf-translate-cjk-mode to t
> > without loading subst-*.el at building time.

> It doesn't fix the potential effects on non-CJK users if decoding a
> bit of Unicode text containing such a character will load the large
> tables even if they're useless to the user. Maybe there aren't many
> people now with 48MB P133s or old SPARCs like me, in which case it's a
> reasonable default, but I suggest an entry in NEWS/PROBLEMS about it.

I'm going to modify the current entry in NEWS as below.

** The utf-8/16 coding systems have been enhanced.
By default, untranslatable utf-8 sequences are simply composed into
single quasi-characters. User option `utf-translate-cjk-mode' (it is
turned on by default) arranges to translate many utf-8 CJK character
sequences into real Emacs characters in a similar way to the Mule-UCS
system. As this loads a fairly large amount of data on demand, people
who are not interested in CJK characters may want to customize it to
nil. You can augment/amend the CJK translation via hash tables
`ucs-mule-cjk-to-unicode' and `ucs-unicode-to-mule-cjk'. The utf-8
coding system now also encodes characters from most of Emacs's
one-dimensional internal charsets, specifically the ISO-8859 ones.
The utf-16 coding system is affected similarly.

> > I think it's a big improvement especially for CJK users,

> I agree it should be on for CJK users anyway. (I thought it was now
> conditional on the language environment.)

It's not. I think we had better avoid turning on/off a user
option depending on language environment.

---
Ken'ichi HANDA
handa@m17n.org

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-12 2:41 UTC
Cc: mariano, alexander.winston, d.love, emacs-devel, danilo, monnier, miles

In article <200406071227.VAA06216@etlken.m17n.org>, Kenichi Handa <handa@m17n.org> writes:

> I think I succeeded in loading subst-*.el not at the time of
> customizing utf-translate-cjk-mode to t but only when it is
> found that loading them is necessary on decoding or encoding
> utf-8, or on running decode/encode-char. This means that we
> can make the default value of utf-translate-cjk-mode to t
> without loading subst-*.el at building time.

> I think it's a big improvement especially for CJK users, and
> is an improvement of an existing feature rather than a new
> feature. If people agree on making utf-translate-cjk-mode
> to t, I'll brush-up the current working code and install the
> changes.

As it seems there's no strong objection, I've just installed
the necessary changes. I also modified
set-language-environment to re-load substitution tables if
necessary, i.e. in the case that utf-translate-cjk-mode is
on and the tables are already loaded in the different
language environment.

---
Ken'ichi HANDA
handa@m17n.org

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Juanma Barranquero @ 2004-06-12 13:46 UTC
Cc: handa

On Sat, 12 Jun 2004 11:41:51 +0900 (JST), Kenichi Handa <handa@m17n.org> wrote:

> As it seems there's no strong objection, I've just installed
> the necessary changes. I also modified
> set-language-environment to re-load substitution tables if
> necessary, i.e. in the case that utf-translate-cjk-mode is
> on and the tables are already loaded in the different
> language environment.

I'm getting a perhaps related bootstrapping error:

Loading language/chinese (source)...
Loading language/cyrillic (source)...
Loading subst-ksc (source)...
Invalid read syntax: "?"

/L/e/k/t/u

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-13 8:42 UTC
Cc: emacs-devel

In article <20040612154533.152E.LEKTU@mi.madritel.es>, Juanma Barranquero <lektu@mi.madritel.es> writes:

> I'm getting a perhaps related bootstrapping error:

> Loading language/chinese (source)...
> Loading language/cyrillic (source)...
> Loading subst-ksc (source)...
> Invalid read syntax: "?"

It seems that my recent change caused this bug, but I can't
reproduce it. At which bootstrapping stage, does the above
happen? Before byte-compiling or after byte-compiling?

---
Ken'ichi HANDA
handa@m17n.org

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Juanma Barranquero @ 2004-06-13 11:36 UTC
Cc: Kenichi Handa

On Sun, 13 Jun 2004 17:42:19 +0900 (JST), Kenichi Handa <handa@m17n.org> wrote:

> It seems that my recent change caused this bug, but I can't
> reproduce it. At which bootstrapping stage, does the above
> happen? Before byte-compiling or after byte-compiling?

No byte-compiling has taken place. Right after dumping temacs.bin to
temacs.exe:

Loading loadup.el (source)...
Using load-path (../lisp c:/bin/emacs/HEAD/lisp/emacs-lisp c:/bin/emacs/HEAD/lisp/language c:/bin/emacs/HEAD/lisp/international c:/bin/emacs/HEAD/lisp/textmodes)
Loading emacs-lisp/byte-run (source)...
Loading emacs-lisp/backquote (source)...
Loading subr (source)...
Loading version.el (source)...
Loading widget (source)...
Loading custom (source)...
Loading emacs-lisp/map-ynp (source)...
Loading env (source)...
Loading cus-start (source)...
Note, built-in variable `x-use-underline-position-properties' not bound
Loading international/mule (source)...
Loading international/mule-conf.el (source)...
Loading format (source)...
Loading bindings (source)...
Loading files (source)...
Loading cus-face (source)...
Loading faces (source)...
Lists of integers (garbage collection statistics) are normal output
while building Emacs; they do not indicate a problem.
((86815 . 21581) (5221 . 22) (575 . 122) 322995 14120 (9 . 1) (18 . 0) (6522 . 975))
Loading loaddefs.el (source)...
((106390 . 4237) (7624 . 0) (583 . 114) 1111827 14120 (35 . 33) (18 . 0) (13741 . 56))
Loading simple (source)...
Loading help (source)...
Loading international/mule-cmds (source)...
Loading case-table (source)...
Loading international/utf-8 (source)...
Loading international/utf-16 (source)...
Loading international/characters (source)...
Loading international/latin-1 (source)...
Loading international/latin-2 (source)...
Loading international/latin-3 (source)...
Loading international/latin-4 (source)...
Loading international/latin-5 (source)...
Loading international/latin-8 (source)...
Loading international/latin-9 (source)...
Loading language/chinese (source)...
Loading language/cyrillic (source)...
Loading subst-ksc (source)...
Invalid read syntax: "?"
NMAKE : fatal error U1077: '"C:\bin\emacs\HEAD\src/obj-spd/i386/temacs.exe"' : return code '0xffffffff'
Stop.
NMAKE : fatal error U1077: '"C:\Archivos de programa\Microsoft Visual Studio .NET 2003\VC7\BIN\nmake.exe"' : return code '0x2'
Stop.

/L/e/k/t/u

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Andreas Schwab @ 2004-06-13 13:18 UTC
Cc: lektu, emacs-devel

Kenichi Handa <handa@m17n.org> writes:

> It seems that my recent change caused this bug, but I can't
> reproduce it. At which bootstrapping stage, does the above
> happen? Before byte-compiling or after byte-compiling?

It happens before byte-compiling, try removing all *.elc files first.

utf-translate-cjk-load-tables probably shouldn't load the tables during
dumping.

Andreas.

--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Kenichi Handa @ 2004-06-14 1:05 UTC
Cc: lektu, emacs-devel

In article <jept83d2ap.fsf@sykes.suse.de>, Andreas Schwab <schwab@suse.de> writes:

> Kenichi Handa <handa@m17n.org> writes:
> > It seems that my recent change caused this bug, but I can't
> > reproduce it. At which bootstrapping stage, does the above
> > happen? Before byte-compiling or after byte-compiling?

> It happens before byte-compiling, try removing all *.elc files first.

Thank you for the info. I found what was wrong and fixed
it. Please try the latest code.

The cause of the bug was that when cyrillic.el (not
cyrillic.elc) was loaded, code-pages.el was also loaded.
But some characters (an incorrect mapping) in this file
caused subst-ksc.el to be loaded, and that file is encoded
in euc-kr, which is defined in the not-yet-loaded korean.el.
I fixed that incorrect mapping.

> utf-translate-cjk-load-tables probably shouldn't load the tables during
> dumping.

Yes, but modifying utf-translate-cjk-load-tables not to load
the tables will just hide such bugs as above. In general,
preloaded files encoded in utf-8 should not contain a
Unicode character that will be translated in
utf-translate-cjk-mode because such a character may cause
incorrect behaviour when utf-translate-cjk-mode is turned
off.

---
Ken'ichi HANDA
handa@m17n.org

* Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Luc Teirlinck @ 2004-06-13 20:39 UTC
Cc: lektu, emacs-devel

Ken'ichi HANDA wrote:

   It seems that my recent change caused this bug, but I can't
   reproduce it. At which bootstrapping stage, does the above
   happen? Before byte-compiling or after byte-compiling?

Juanma already answered that question. Some extra info that may or
may not be useful, but I provide it just in case. Bootstrapping works
completely fine after emptying subst-ksc.el and subst-jis.el. Loading
(or byte-compiling) these files in a running Emacs works fine, but for
some strange reason apparently not at the early stage at which you are
trying to load them.

Just a wild guess: _maybe_ you might be able to reproduce the problem
by running `make maintainer-clean' before bootstrapping.

Sincerely,

Luc.