From: Kenichi Handa <handa@m17n.org>
To: Juri Linkov <juri@jurta.org>
Cc: tzz@lifelogs.com, emacs-devel@gnu.org
Subject: Re: inputting characters by hexadigit
Date: Sun, 20 Jul 2008 10:23:57 +0900
Message-ID: <E1KKNeX-0003FP-7C@etlken.m17n.org>
In-Reply-To: <87sku5if8t.fsf_-_@jurta.org> (message from Juri Linkov on Sun, 20 Jul 2008 03:29:14 +0300)

In article <87sku5if8t.fsf_-_@jurta.org>, Juri Linkov <juri@jurta.org> writes:

> > I think it is better to skip these ranges:
> >   #x3400..#x4dbf   -- CJK Ideograph Extension A
> >   #x4e00..#x9fff   -- CJK Ideograph
> >   #xd800..#xfaff   -- surrogate pairs, private use, CJK COMPATIBILITY IDEOGRAPH
> >   #x20000..#x2ffff -- CJK Ideograph Extension B
> > and end the loop at #xeffff (#xf0000.. are for private use)

> Actually there are no Unicode names in these ranges in UnicodeData.txt.
> It has only lines for the first and the last character in these ranges:

Yes.  But, for CJK chars:

   (get-char-code-property CHAR 'name)

returns a valid name, something like "CJK IDEOGRAPH-3400"(*),
because get-char-code-property not only looks up
UnicodeData.txt but also computes a proper value if
necessary.

> If it were possible to loop over names instead of looping over all
> characters to check for their names, then this code would be faster,
> but I don't see how it would be possible to loop over all defined names
> in UnicodeData.txt.

> If this is not possible then we could optimize the loop over all
> characters in the chartable to skip these useless ranges.

I think it doesn't work, because the names of Hangul syllables
must also be computed algorithmically(*).  I think
just doing something like this is good enough:

  (dotimes (c #xEFFFF)
    (unless (CHAR-IS-IN-A-RANGE-TO-SKIP-P c)
      ...))
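
The skip test above is left as a placeholder.  A minimal sketch of
one way to fill it in, assuming the ranges quoted at the top of this
message.  The names my-char-name-skip-ranges,
my-char-in-skipped-range-p and my-collect-char-names are hypothetical,
not existing Emacs functions:

```elisp
;; Hypothetical helper names; the ranges are the ones proposed above.
(defvar my-char-name-skip-ranges
  '((#x3400 . #x4dbf)     ; CJK Ideograph Extension A
    (#x4e00 . #x9fff)     ; CJK Ideograph
    (#xd800 . #xfaff)     ; surrogates, private use, CJK compatibility
    (#x20000 . #x2ffff))  ; CJK Ideograph Extension B
  "Inclusive code point ranges (FROM . TO) whose names are not collected.")

(defun my-char-in-skipped-range-p (c)
  "Return non-nil if code point C lies in `my-char-name-skip-ranges'."
  (let ((ranges my-char-name-skip-ranges)
        found)
    (while (and ranges (not found))
      (setq found (and (>= c (car (car ranges)))
                       (<= c (cdr (car ranges))))
            ranges (cdr ranges)))
    found))

(defun my-collect-char-names ()
  "Return an alist of (NAME . CODE) for code points below #xEFFFF.
Code points in the skipped ranges, and those without a name, are omitted."
  (let (names)
    (dotimes (c #xEFFFF)
      (unless (my-char-in-skipped-range-p c)
        (let ((name (get-char-code-property c 'name)))
          (when name
            (push (cons name c) names)))))
    (nreverse names)))
```

Hangul syllables (#xac00..#xd7a3) fall outside the skipped ranges, so
the loop still asks get-char-code-property for their names.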


(*) "The Unicode Standard 5.1" has this section:

4.8 Name—Normative
[...]
Ideographs and Hangul Syllables. Names for ideographs and
Hangul syllables are derived algorithmically. Unified CJK
ideographs are named CJK UNIFIED IDEOGRAPH-x, where x is
replaced with the hexadecimal Unicode code point—for
example, CJK UNIFIED IDEOGRAPH-4E00. Similarly,
compatibility CJK ideographs are named “CJK COMPATIBILITY
IDEOGRAPH-x”. The names of Hangul syllables are generated as
described in “Hangul Syllable Names” in Section 3.12,
Conjoining Jamo Behavior.
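
The derivation described in that section can be sketched in Emacs
Lisp.  This is only an illustration, not how get-char-code-property
is implemented.  All function and variable names are hypothetical; the
constants and Jamo short names are assumed to follow the Hangul
syllable name algorithm of Section 3.12, and the ideograph function
does not check that its argument really is a unified ideograph:

```elisp
;; Jamo short names, indexed by position within each class.
(defconst my-hangul-leads
  ["G" "GG" "N" "D" "DD" "R" "M" "B" "BB" "S"
   "SS" "" "J" "JJ" "C" "K" "T" "P" "H"]
  "Short names of the 19 leading consonants.")

(defconst my-hangul-vowels
  ["A" "AE" "YA" "YAE" "EO" "E" "YEO" "YE" "O" "WA" "WAE"
   "OE" "YO" "U" "WEO" "WE" "WI" "YU" "EU" "YI" "I"]
  "Short names of the 21 vowels.")

(defconst my-hangul-trails
  ["" "G" "GG" "GS" "N" "NJ" "NH" "D" "L" "LG" "LM" "LB" "LS" "LT"
   "LP" "LH" "M" "B" "BS" "S" "SS" "NG" "J" "C" "K" "T" "P" "H"]
  "Short names of the 28 trailing consonants (the first means none).")

(defun my-hangul-syllable-name (c)
  "Return the derived name of Hangul syllable C, or nil if C is not one."
  (let ((index (- c #xAC00)))           ; offset from the first syllable
    (when (and (>= index 0) (< index (* 19 21 28)))
      (concat "HANGUL SYLLABLE "
              (aref my-hangul-leads (/ index (* 21 28)))
              (aref my-hangul-vowels (/ (% index (* 21 28)) 28))
              (aref my-hangul-trails (% index 28))))))

(defun my-cjk-unified-ideograph-name (c)
  "Return the derived name for unified ideograph C (range not checked)."
  (format "CJK UNIFIED IDEOGRAPH-%X" c))
```

For example, (my-hangul-syllable-name #xAC00) gives
"HANGUL SYLLABLE GA", and (my-cjk-unified-ideograph-name #x4E00)
gives "CJK UNIFIED IDEOGRAPH-4E00".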

---
Kenichi Handa
handa@ni.aist.go.jp




