From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Stephen J. Turnbull" <stephen@xemacs.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Input method or help feature needed
Date: Mon, 21 Feb 2011 11:53:20 +0900
Message-ID: <87k4gunlnj.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <E1Pq9J5-0007tA-0a@fencepost.gnu.org> <m2ipwixolq.fsf@igel.home>
	<87lj1ew6d3.fsf@catnip.gol.com> <20110218083736.GA12190@tomas>
	<buomxltbuq8.fsf@dhlpc061.dev.necel.com>
	<20110220082705.GA4092@tomas> <E1Pr6jN-000806-Qe@fencepost.gnu.org>
	<E1PrGP9-0000NB-0m@fencepost.gnu.org> <83hbbytmvl.fsf@gnu.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: dough.gmane.org 1298256664 26400 80.91.229.12 (21 Feb 2011 02:51:04 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Mon, 21 Feb 2011 02:51:04 +0000 (UTC)
Cc: tomas@tuxteam.de, rms@gnu.org, emacs-devel@gnu.org
To: Eli Zaretskii <eliz@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Feb 21 03:50:59 2011
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1PrLrW-0000IZ-Uo
	for ged-emacs-devel@m.gmane.org; Mon, 21 Feb 2011 03:50:59 +0100
Original-Received: from localhost ([127.0.0.1]:57125 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1PrLrW-00070M-DT
	for ged-emacs-devel@m.gmane.org; Sun, 20 Feb 2011 21:50:58 -0500
Original-Received: from [140.186.70.92] (port=43947 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1PrLrS-00070C-2w
	for emacs-devel@gnu.org; Sun, 20 Feb 2011 21:50:54 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <stephen@xemacs.org>) id 1PrLrN-0001xj-0B
	for emacs-devel@gnu.org; Sun, 20 Feb 2011 21:50:53 -0500
Original-Received: from mgmt2.sk.tsukuba.ac.jp ([130.158.97.224]:41021)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stephen@xemacs.org>)
	id 1PrLrK-0001x0-Tm; Sun, 20 Feb 2011 21:50:47 -0500
Original-Received: from uwakimon.sk.tsukuba.ac.jp (uwakimon.sk.tsukuba.ac.jp
	[130.158.99.156])
	by mgmt2.sk.tsukuba.ac.jp (Postfix) with ESMTP id 486079706AB;
	Mon, 21 Feb 2011 11:50:42 +0900 (JST)
Original-Received: by uwakimon.sk.tsukuba.ac.jp (Postfix, from userid 1000)
	id 39CCA1A2884; Mon, 21 Feb 2011 11:53:20 +0900 (JST)
In-Reply-To: <83hbbytmvl.fsf@gnu.org>
X-Mailer: VM 8.1.93a under 21.5 (beta29) "garbanzo" ed3b274cc037 XEmacs Lucid
	(x86_64-unknown-linux)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 130.158.97.224
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:136313
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/136313>

Eli Zaretskii writes:

 > > (Excluding Korean and Han characters and whatever else ought to be
 > > excluded).
 > 
 > Why exclude them?

Because there are about 11,000 of the former and 21,000 (and counting)
of the latter.  The Korean Hangul are precomposed in an algorithmic
fashion from about 70 components called "jamo".  It makes very little
sense to page through a flat list when you can look up the jamo in
much smaller lists and drill down to exactly the Hangul you want, just
as it should be possible to type "i" and get a page of all characters
related to "i", including the Turkish dotless "i", Greek iota, etc.
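
To make the jamo drill-down concrete, here is a minimal sketch in
Python.  The constants and the arithmetic are those of the Unicode
Hangul syllable composition algorithm; the function names and the
three-step menu structure are only my illustration, not an existing
Emacs interface.

```python
# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE = 0xAC00          # first precomposed syllable, U+AC00
L_COUNT = 19             # leading consonants
V_COUNT = 21             # vowels
T_COUNT = 28             # trailing consonants, index 0 = "none"


def compose(l_index, v_index, t_index=0):
    """Return the precomposed syllable for the given jamo indices."""
    if not (0 <= l_index < L_COUNT
            and 0 <= v_index < V_COUNT
            and 0 <= t_index < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


def vowels_for_lead(l_index):
    """Second menu: the 21 open syllables sharing one leading consonant."""
    return [compose(l_index, v) for v in range(V_COUNT)]


def finals_for_lead_vowel(l_index, v_index):
    """Third menu: the 28 syllables sharing a lead and a vowel."""
    return [compose(l_index, v_index, t) for t in range(T_COUNT)]
```

With this layout the user never sees more than 28 candidates at once:
a menu of 19 leads, then 21 vowels, then 28 finals.  For example,
compose(18, 0, 4) is U+D55C, the first syllable of "Hangul" itself.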

Similarly, the Han characters are organized by radical and stroke
count, and it should be possible to look at the (relatively) short
list of 214 radicals, then drill down to an approximate stroke count,
and then page up and down by stroke count.  There are non-radical
components as well, many of which even total Han illiterates would be
likely to recognize.  I don't know if these are listed in the Unicode
tables, but if so they could be combined with the radical and
(optionally) approximate stroke count to drastically prune the search
tree in 90% or more of practical cases.
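
A rough sketch of the radical-plus-stroke-count lookup, in Python.  It
assumes the data comes from the Unihan database's kRSUnicode field,
whose values have the form "radical.additional-strokes".  The handful
of sample lines are hand-copied and should be checked against the real
Unihan files; build_index and candidates are hypothetical names, and
the non-radical components are not covered here.

```python
from collections import defaultdict

# A few lines in Unihan's tab-separated format (code point, field, value).
# Assumption: hand-copied sample; verify against the actual Unihan data.
SAMPLE_UNIHAN = """\
U+4E00\tkRSUnicode\t1.0
U+4E01\tkRSUnicode\t1.1
U+4E03\tkRSUnicode\t1.1
U+4E09\tkRSUnicode\t1.2
U+6C34\tkRSUnicode\t85.0
U+6C5F\tkRSUnicode\t85.3
U+6CB3\tkRSUnicode\t85.5
U+6D77\tkRSUnicode\t85.7
"""


def build_index(text):
    """Map radical number -> additional stroke count -> characters."""
    index = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        code, field, value = line.split("\t")
        if field != "kRSUnicode":
            continue
        # A value may hold several entries; use the first one.  An
        # apostrophe after the radical marks a simplified radical form.
        radical, strokes = value.split()[0].split(".")
        char = chr(int(code[2:], 16))
        index[int(radical.rstrip("'"))][int(strokes)].append(char)
    return index


def candidates(index, radical, strokes, slack=1):
    """Characters under RADICAL within SLACK strokes of the guess."""
    found = []
    for n in range(strokes - slack, strokes + slack + 1):
        found.extend(index[radical].get(n, []))
    return found
```

The slack parameter is the "approximate stroke count" part: a user who
miscounts by one stroke still lands on a short list containing the
character, and can page outward from there.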

However, a simple list of Hangul or Hanzi would be rather painful to
use.  If you don't know how to say a character (every Hangul has an
algorithmically constructed pronunciation), you're probably not fluent
enough in the language to easily pick the right one out of an array
of, say, 400 (20 x 20 seems like a reasonable size for a "page" of
characters).  The real differences are often subtle, many characters
have several variant glyphs, and these variations tend to confuse the
non-native speaker.
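
For a sense of scale, the arithmetic behind the flat-list approach can
be sketched in a few lines of Python.  The 20 x 20 page size is the
one suggested above; the Hangul syllable block boundaries (U+AC00 to
U+D7A3) are from the Unicode charts, and the function names are made
up for the illustration.

```python
import math

PAGE_ROWS = 20
PAGE_COLS = 20
PAGE_SIZE = PAGE_ROWS * PAGE_COLS   # 400 characters per page

HANGUL_FIRST = 0xAC00
HANGUL_LAST = 0xD7A3


def page_count(n_chars):
    """Number of full or partial pages needed for N_CHARS characters."""
    return math.ceil(n_chars / PAGE_SIZE)


def page(first, last, number):
    """Characters on zero-based page NUMBER of the range FIRST..LAST."""
    start = first + number * PAGE_SIZE
    stop = min(start + PAGE_SIZE, last + 1)
    return [chr(cp) for cp in range(start, stop)]


hangul_total = HANGUL_LAST - HANGUL_FIRST + 1
hangul_pages = page_count(hangul_total)
```

Here hangul_total is 11172 and hangul_pages is 28, so even the smaller
of the two scripts means paging through 28 screens of 400 similar
glyphs each.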

A pure list in Unicode order for these characters is better than
*nothing*, true, but it's not really an acceptable answer to Richard's
requirement.