describe-char and unicode data

* describe-char and unicode data
@ 2003-05-09 18:31 James H. Cloos Jr.
  2003-05-10 10:06 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: James H. Cloos Jr. @ 2003-05-09 18:31 UTC

describe-char shows the Unicode hex value of the character in question,
if it has one (some characters do not translate to Unicode).

Would a patch that expands that to also show the relevant data from
UnicodeData.txt be accepted?

Step one would be code to convert UnicodeData.txt to a suitable elisp
structure, generating a unicodedata.el file.  Given that, the
additional logic in describe-char is trivial.
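
As a rough illustration of that first step, here is a minimal sketch of
a converter, written in Python only for brevity.  The variable name
unicodedata-names, the alist layout, and both helper function names are
assumptions made for this example, not an agreed design; it keeps only
the code point and the character name.

```python
def elisp_string(s):
    """Quote S as an Emacs Lisp string literal."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_elisp_alist(lines, var_name="unicodedata-names"):
    """Return elisp source for an alist of (CODEPOINT . NAME) pairs.

    LINES are records in UnicodeData.txt format.  VAR_NAME is a
    hypothetical variable name chosen for this sketch.
    """
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code = int(fields[0], 16)          # field 0: code point in hex
        name = fields[1]                   # field 1: character name
        entries.append("(#x%04X . %s)" % (code, elisp_string(name)))
    body = "\n    ".join(entries)
    return "(defconst %s\n  '(%s))\n" % (var_name, body)


sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;",
]
print(to_elisp_alist(sample))
```

The output could be written to unicodedata.el as is.  Whether an alist
is the right container is a separate question; a hash table or
char-table may well suit lookups better.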

To give an idea of the amount of data available, UnicodeData.txt is a
semicolon-separated text database with 15 fields per record.  It
currently has 15100 records, so load time may be an issue.  The related
Unihan.txt has up to 78 possible entries for each of 71098 characters.
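
To make the record layout concrete, the following sketch splits one
UnicodeData.txt line into its 15 fields.  The short field labels are my
own, chosen to follow the order given in the Unicode file format
description; they are assumptions for this example, not official
identifiers.

```python
from collections import namedtuple

# Labels for the 15 fields, in file order (names are illustrative).
FIELD_NAMES = [
    "code", "name", "category", "combining", "bidi",
    "decomposition", "decimal", "digit", "numeric", "mirrored",
    "old_name", "comment", "upper", "lower", "title",
]
Record = namedtuple("Record", FIELD_NAMES)


def parse_record(line):
    """Split one UnicodeData.txt line into a Record of 15 strings."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELD_NAMES), len(fields)))
    return Record(*fields)


rec = parse_record("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")
print(rec.name, rec.category, rec.lower)
```

Most fields are empty for most characters, so a compact in-memory
representation should be much smaller than the raw record count
suggests.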

The name entry from UnicodeData.txt and probably the kDefinition
entries from Unihan.txt would be the useful additions for
describe-char.  The rest of the data may however be useful elsewhere.
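
For the Unihan side, a sketch of pulling out only the kDefinition
entries might look like the following.  It assumes the tab-separated
"U+XXXX<TAB>key<TAB>value" line layout with "#" comment lines; the
function name and the sample lines are illustrative only.

```python
def parse_unihan_definitions(lines):
    """Return a dict mapping code point (int) to its kDefinition text."""
    definitions = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue                      # skip malformed lines
        code, key, value = parts
        if key == "kDefinition" and code.startswith("U+"):
            definitions[int(code[2:], 16)] = value
    return definitions


sample = [
    "# illustrative sample lines, not copied from the real file",
    "U+4E00\tkDefinition\tone; a, an; alone",
    "U+4E00\tkMandarin\tYI1",
]
defs = parse_unihan_definitions(sample)
print(defs)
```

Filtering at conversion time keeps the generated file limited to the
entries describe-char would actually display.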

What is therefore the best structure to use for this data?

-JimC
