* what-cursor-position vs. Unicode
From: Dan Jacobson @ 2006-06-03 2:34 UTC
Cc: handa

Today we shall discuss what-cursor-position when given an argument of ^U.
We see that it gives Unicode information:
  character: Z (90, #o132, #x5a, U+005A)
Except when you really need it:
  character: 丹 (107109, #o321145, #x1a265)
It should mention U+4E39.
emacs-version "22.0.50.1"
* Re: what-cursor-position vs. Unicode
From: Eli Zaretskii @ 2006-06-03 8:09 UTC
Cc: bug-gnu-emacs, handa

> From: Dan Jacobson <jidanni@jidanni.org>
> Date: Sat, 03 Jun 2006 10:34:59 +0800
> Cc: handa@etl.go.jp
>
> It should mention U+4E39.
> emacs-version "22.0.50.1"

It does for me, at least when reading your mail.

Perhaps you should send the original file (as a binary attachment), or
describe how you produced the character, if it wasn't from a file.

Also, please tell when was your Emacs resync'ed with CVS, and please
try looking at the character in "emacs -Q".
* Re: what-cursor-position vs. Unicode
From: Dan Jacobson @ 2006-06-05 23:17 UTC
Cc: handa

EZ> It does for me, at least when reading your mail.

Hmmm, me too, but not from a file.

EZ> Also, please tell when was your Emacs resync'ed with CVS

All I know beyond emacs-version "22.0.50.1" is I use Debian
emacs-snapshot 20060518-1.

KH> #x1a265 is a character of chinese-cns11643-1, and the
KH> current Emacs doesn't support Unicode mapping for that
KH> character set.

All I know is me and my Unicode UTF-8 char sitting in the file.

>> Just wondering: Why not?

KH> I myself want to avoid spending a time on what becomes useless in
KH> the future.

I see, there is some funny level of indirection that will be
eliminated in the future. Good.
[parent not found: <mailman.2667.1149552104.9609.bug-gnu-emacs@gnu.org>]
* Re: what-cursor-position vs. Unicode
From: Miles Bader @ 2006-06-09 22:17 UTC
Cc: bug-gnu-emacs, handa

Dan Jacobson <jidanni@jidanni.org> writes:
> KH> I myself want to avoid spending a time on what becomes useless in
> KH> the future.
>
> I see, there is some funny level of indirection that will be
> eliminated in the future. Good.

In the future (well actually right now, on a CVS branch) Emacs will use
a unicode internal representation, where obviously this sort of thing
will be easier...

-Miles
--
"Most attacks seem to take place at night, during a rainstorm, uphill,
where four map sheets join." -- Anon. British Officer in WW I
* Re: what-cursor-position vs. Unicode
From: Kenichi Handa @ 2006-06-05 7:01 UTC
Cc: bug-gnu-emacs

In article <87irnjklv0.fsf@jidanni.org>, Dan Jacobson <jidanni@jidanni.org> writes:

> Today we shall discuss what-cursor-position when given an argument of ^U.
> We see that it gives Unicode information:
>   character: Z (90, #o132, #x5a, U+005A)
> Except when you really need it:
>   character: 丹 (107109, #o321145, #x1a265)
> It should mention U+4E39.
> emacs-version "22.0.50.1"

#x1a265 is a character of chinese-cns11643-1, and the
current Emacs doesn't support Unicode mapping for that
character set.

---
Kenichi Handa
handa@m17n.org
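
The point above can be illustrated with a short Python sketch. It is only a model of the output lines quoted in the report, not Emacs code: `describe_char` and its `internal_code` parameter are hypothetical names, and the rule "print U+XXXX only when the internal code equals the Unicode code point" is an assumption made for the illustration. It shows that 107109, #o321145 and #x1a265 are one and the same integer, and that this integer differs from the code point U+4E39.

```python
from typing import Optional


def describe_char(ch: str, internal_code: Optional[int] = None) -> str:
    """Format a character in the style of the lines quoted in the report.

    internal_code is a hypothetical stand-in for a non-Unicode internal
    character code; when it is omitted, the Unicode code point is used.
    """
    code = ord(ch) if internal_code is None else internal_code
    fields = [str(code), "#o%o" % code, "#x%x" % code]
    # Assumption for this sketch: the U+XXXX field is only available
    # when the internal code is itself the Unicode code point.
    if code == ord(ch):
        fields.append("U+%04X" % ord(ch))
    return "character: %s (%s)" % (ch, ", ".join(fields))


print(describe_char("Z"))
print(describe_char("丹", internal_code=0x1A265))
print(describe_char("丹"))
```

The first two calls reproduce the two lines from the report; the third shows what the line would look like if the character were stored under its Unicode code point.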
* Re: what-cursor-position vs. Unicode
From: Werner LEMBERG @ 2006-06-05 8:22 UTC
Cc: bug-gnu-emacs, jidanni

> #x1a265 is a character of chinese-cns11643-1, and the
> current Emacs doesn't support Unicode mapping for that
> character set.

Just wondering: Why not?

    Werner
* Re: what-cursor-position vs. Unicode
From: Kenichi Handa @ 2006-06-05 11:07 UTC
Cc: bug-gnu-emacs, jidanni

In article <20060605.102222.112830788.wl@gnu.org>, Werner LEMBERG <wl@gnu.org> writes:

>> #x1a265 is a character of chinese-cns11643-1, and the
>> current Emacs doesn't support Unicode mapping for that
>> character set.

> Just wondering: Why not?

Because no one has implemented it.

I myself want to avoid spending a time on what becomes useless in
the future. In addition, in the current Emacs code, adding something
like lisp/international/subst-cns.el leads to slower startup in CJK
locales, which I want to avoid.

But, if someone implement it and Richard agrees on including it
before the release, please go ahead.

---
Kenichi Handa
handa@m17n.org
* Re: what-cursor-position vs. Unicode
From: Werner LEMBERG @ 2006-06-09 15:09 UTC
Cc: emacs-devel

[-- Attachment #1: Type: Text/Plain, Size: 1209 bytes --]

> >> #x1a265 is a character of chinese-cns11643-1, and the
> >> current Emacs doesn't support Unicode mapping for that
> >> character set.
>
> > Just wondering: Why not?
>
> Because no one has implemented it.

I've sent a `subst-cns.el' file to you, Ken'ichi-san, and the
experimental diff for utf-8.el is attached. A great deal of character
codes is larger than U+20000; this works just fine.

> I myself want to avoid spending a time on what becomes useless in
> the future.

Well, it was rather simple; I just wrote a small perl script to
extract the data from the Unihan.txt data base. On the other hand, I
think it is *very* important to provide good conversion from and to
Unicode for all the charsets Emacs supports, thus it wasn't wasted
time IMHO.

> In addition, in the current Emacs code, adding something like
> lisp/international/subst-cns.el leads to slower startup in CJK
> locales, which I want to avoid.

Agreed -- my changes to utf-8.el don't take this into account. What
about an additional `unicode' language environment which loads really
all mapping tables?

BTW, I suggest to set up a `Chinese-EUC-TW' language environment for
which `subst-cns.el' is loaded by default.

    Werner

[-- Attachment #2: utf-8.el.diff --]
[-- Type: Text/Plain, Size: 3445 bytes --]

--- utf-8.el.old 2005-10-15 07:43:43.000000000 +0200
+++ utf-8.el 2006-06-09 17:01:46.000000000 +0200
@@ -1,7 +1,7 @@
 ;;; utf-8.el --- UTF-8 decoding/encoding support -*- coding: iso-2022-7bit -*-

 ;; Copyright (C) 2001, 2002, 2003, 2004 Free Software Foundation, Inc.
-;; Copyright (C) 2001, 2002, 2003, 2004
+;; Copyright (C) 2001, 2002, 2003, 2004, 2006
 ;;   National Institute of Advanced Industrial Science and Technology (AIST)
 ;;   Registration Number H14PRO021

@@ -194,6 +194,10 @@

 (defconst utf-translate-cjk-charsets '(chinese-gb2312
                                        chinese-big5-1 chinese-big5-2
+                                       chinese-cns11643-1 chinese-cns11643-2
+                                       chinese-cns11643-3 chinese-cns11643-4
+                                       chinese-cns11643-5 chinese-cns11643-6
+                                       chinese-cns11643-7
                                        japanese-jisx0208 japanese-jisx0212
                                        katakana-jisx0201
                                        korean-ksc5601)
@@ -267,7 +271,9 @@
         ucs-unicode-to-mule-cjk (make-hash-table :test 'eq)))

 (defcustom utf-translate-cjk-unicode-range '((#x2e80 . #xd7a3)
-                                             (#xff00 . #xffef))
+                                             (#xff00 . #xffef)
+                                             (#x20000 . #x2a6df)
+                                             (#x2f800 . #x2fa1f))
   "List of Unicode code ranges supported by `utf-translate-cjk-mode'.
 Setting this variable directly does not take effect;
 use either \\[customize] or the function
@@ -314,22 +320,26 @@
              (load "subst-jis")
              (load "subst-big5")
              (load "subst-gb2312")
-             (load "subst-ksc"))
+             (load "subst-ksc")
+             (load "subst-cns"))
             ((string= "Chinese-BIG5" current-language-environment)
              (load "subst-jis")
              (load "subst-ksc")
              (load "subst-gb2312")
-             (load "subst-big5"))
+             (load "subst-big5")
+             (load "subst-cns"))
             ((string= "Chinese-GB" current-language-environment)
              (load "subst-jis")
              (load "subst-ksc")
              (load "subst-big5")
-             (load "subst-gb2312"))
+             (load "subst-gb2312")
+             (load "subst-cns"))
             (t
              (load "subst-ksc")
              (load "subst-gb2312")
              (load "subst-big5")
-             (load "subst-jis")))) ; jis covers as much as big5, gb2312
+             (load "subst-jis")
+             (load "subst-cns")))) ; jis covers as much as big5, gb2312

     (when redefined
       (define-translation-hash-table 'utf-subst-table-for-decode
@@ -365,14 +375,22 @@
 zero or negative. This is a minor mode.
 Enabling this allows the coding systems mule-utf-8,
 mule-utf-16le and mule-utf-16be to encode characters in the charsets
-`korean-ksc5601', `chinese-gb2312', `chinese-big5-1',
-`chinese-big5-2', `japanese-jisx0208' and `japanese-jisx0212', and to
-decode the corresponding unicodes into such characters.
+
+  korean-ksc5601
+  chinese-gb2312
+  chinese-big5-1 chinese-big5-2
+  chinese-cns11643-1 chinese-cns11643-2 chinese-cns11643-3
+  chinese-cns11643-4 chinese-cns11643-5 chinese-cns11643-6
+  chinese-cns11643-7
+  japanese-jisx0208 japanese-jisx0212
+
+and to decode the corresponding unicodes into such characters.

 Where the charsets overlap, the one preferred for decoding is chosen
 according to the language environment in effect when this option is
 turned on: ksc5601 for Korean, gb2312 for Chinese-GB, big5 for
-Chinese-Big5 and jisx for other environments.
+Chinese-Big5 and jisx for other environments. The CNS charsets
+are always loaded last.

 This mode is on by default. If you are not interested in CJK
 characters and want to avoid some overhead on encoding/decoding

[-- Attachment #3: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
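
As a rough illustration of what the change to `utf-translate-cjk-unicode-range` in the attached diff means, here is a small Python sketch. The range values are copied from the diff; the names `OLD_RANGES`, `NEW_RANGES` and `in_ranges` are hypothetical, and treating each (start . end) pair as inclusive on both ends is an assumption of this sketch, not something the thread states.

```python
# Ranges before the patch, copied from the diff context lines.
OLD_RANGES = [(0x2E80, 0xD7A3), (0xFF00, 0xFFEF)]

# Ranges after the patch: the two added pairs lie above U+FFFF.
NEW_RANGES = OLD_RANGES + [(0x20000, 0x2A6DF), (0x2F800, 0x2FA1F)]


def in_ranges(code_point, ranges):
    """Return True if code_point lies in any (start, end) pair.

    Both ends are treated as inclusive (an assumption of this sketch).
    """
    return any(start <= code_point <= end for start, end in ranges)


for cp in (0x5A, 0x4E39, 0x20000, 0x2F800):
    print("U+%04X old=%s new=%s"
          % (cp, in_ranges(cp, OLD_RANGES), in_ranges(cp, NEW_RANGES)))
```

Under these assumptions U+4E39 is covered by both lists, while code points from U+20000 upwards are covered only by the extended list, which matches the remark in the message that many of the mapped character codes are larger than U+20000.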
end of thread, other threads:[~2006-06-09 22:17 UTC | newest]

Thread overview: 8+ messages
2006-06-03  2:34 what-cursor-position vs. Unicode Dan Jacobson
2006-06-03  8:09 ` Eli Zaretskii
2006-06-05 23:17   ` Dan Jacobson
     [not found]   ` <mailman.2667.1149552104.9609.bug-gnu-emacs@gnu.org>
2006-06-09 22:17     ` Miles Bader
2006-06-05  7:01 ` Kenichi Handa
2006-06-05  8:22   ` Werner LEMBERG
2006-06-05 11:07     ` Kenichi Handa
2006-06-09 15:09       ` Werner LEMBERG