From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Love
Newsgroups: gmane.emacs.devel
Subject: Re: Several serious problems
Date: 30 Aug 2002 00:19:14 +0100
References: <200208190748.QAA14278@etlken.m17n.org> <200208241211.g7OCBW111768@wijiji.santafe.edu> <200208261317.WAA27761@etlken.m17n.org>
Original-To: Kenichi Handa
Cc: rms@gnu.org, monnier+gnu/emacs@rum.cs.yale.edu, keichwa@gmx.net, emacs-devel@gnu.org
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Kenichi Handa writes:

> I don't know if they are the same as what Dave currently
> has.

I tried to install all the relevant stuff I had, but for the CVS head,
it's modified versions of what I've actually been using, and is
basically untested.  I wanted someone who was actually using that code
base to install it and test it, but no-one could or would -- I can't
remember, but rms leant on me to install it.

> But, I have not checked if they surely works as
> expected.  I believe Dave has done it.

Only in more-or-less Emacs 21.2.

> And, I don't understand why those many functions/variables
> are designed as the current way.  For instance,
>
> (1) Why does loadup.el has this code:
>       (ucs-unify-8859 'encode-only)
>     instead of:
>       (unify-8859-on-encoding-mode 1)

Indeed.  I didn't do that.  The obvious thing to do is to change the
default in the defcustom, if ucs-tables is preloaded.

> (2) Why doesn't utf-8-subst.el provide mappings of
>     non-Chinese characters for ksc, gb, and jisx charsets?
>     The document of utf-8-translate-cjk says as below:
> ----------------------------------------------------------------------
> Whether the `mule-utf-8' coding system should encode many CJK characters.
>
> Enabling this loads tables which enable the coding system to encode
> characters in the charsets `korean-ksc5601', `chinese-gb2312' and
> `japanese-jisx0208', and to decode the corresponding unicodes into
> ...
> ----------------------------------------------------------------------
>     but, currently only Chinese characters in those charsets are
>     handled.

I didn't realize that.  It may be coincidence.
What should be translated is the set of characters

  (japanese-jisx0208 ∪ chinese-gb2312 ∪ korean-ksc5601) \ mule-unicode-2500-33ff
                     ^                                  ^
                   union                          set difference

according to the Mule-UCS tables -- I just took the relevant codes from
there above U+33FF.  Perhaps that isn't how it actually is.

It needs someone with an interest in the CJK range to redo that stuff
anyhow; it shouldn't hardwire japanese-jisx0208 as the preferred set,
the sets used should probably be configurable, and it should allow
translating the relevant characters below U+3400.  (I didn't think
much about how best to do that without keeping large tables on the
heap that aren't actually used to do the translation.)

> (3) Why is utf-8-translate-cjk a variable, not a minor-mode
>     like unify-8859-on-(de/en)coding-mode?

I think because it can't be turned off.

>     Or, why the
>     latter is not a simple variable?  By the way, it seems
>     that once we customize utf-8-translate-cjk to t,
>     customize it back to nil doesn't cancel the translation.
>
> (4) It seems that the variable name
>     utf-8-fragment-on-decoding is not appropriate because it
>     is used also in utf-16.el.  Perhaps,
>     ucs-fragment-on-decoding is better.

Probably.  It was defined before I wrote utf-16.el.  Much of that stuff
would have been written differently for installation in 21.1, but it
was done during the campaign against anything Unicode-based, so that
users could have it in Emacs 21.2 as conveniently as possible.

> (5) It seems that mule-utf-16 can handle the same range of
>     characters as mule-utf-8, but `safe-charsets' property
>     doesn't contain, for instance, `latin-iso8859-2'.
>     Perhaps, this is simply a bug to be fixed easily.

Yes.  The coding system needs to register the relevant translation
table(s) for safe-chars, which would have to be updated in sync with
any changes.  I don't know why that didn't get done.
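
For concreteness, the set expression in the reply to (2), the union of
the three CJK charsets minus whatever mule-unicode-2500-33ff already
covers, could be sketched roughly as below.  This is only an
illustration under assumptions: the function name
`sketch-cjk-translation-set' is hypothetical, the arguments are plain
lists of Unicode code points rather than the real Mule-UCS tables, and
the subtracted set is taken to be simply the range U+2500..U+33FF that
the charset name suggests.  It is not the code in utf-8-subst.el.

```elisp
(require 'cl-lib)

;; Hypothetical helper, not part of Emacs: compute
;;   (JISX ∪ GB ∪ KSC) \ [U+2500, U+33FF]
;; where each argument is a list of Unicode code points.
(defun sketch-cjk-translation-set (jisx gb ksc)
  "Return the sorted union of JISX, GB and KSC minus U+2500..U+33FF."
  (let ((all (cl-union jisx (cl-union gb ksc))))
    ;; Copy before sorting, since `sort' is destructive and the
    ;; intermediate list may share structure with the arguments.
    (sort (copy-sequence
           (cl-remove-if (lambda (cp) (<= #x2500 cp #x33FF)) all))
          #'<)))

;; Toy data only; these short lists stand in for the real tables.
(defvar sketch-jisx '(#x3042 #x4E00 #x4E9C))
(defvar sketch-gb   '(#x3042 #x4E00 #x554A))
(defvar sketch-ksc  '(#x3042 #x4E00 #xAC00))
```

With the toy lists above, U+3042 falls inside U+2500..U+33FF and is
dropped, while U+4E00, which appears in all three lists, is kept once.
Making the three input sets arguments rather than fixed names is one
way the sets used could be made configurable.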