From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa
Newsgroups: gmane.emacs.bugs
Subject: Re: decode-char & utf-8-fragment-on-decoding
Date: Wed, 4 Sep 2002 17:18:36 +0900 (JST)
Sender: bug-gnu-emacs-admin@gnu.org
Message-ID: <200209040818.RAA12414@etlken.m17n.org>
References: <87y9aii9hc.fsf@cricket.magic.csuhayward.edu>
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Trace: main.gmane.org 1031127451 24930 127.0.0.1 (4 Sep 2002 08:17:31 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Wed, 4 Sep 2002 08:17:31 +0000 (UTC)
Cc: bug-gnu-emacs@gnu.org, d.love@dl.ac.uk
Return-path:
Original-Received: from monty-python.gnu.org ([199.232.76.173])
	by main.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 17mVLt-0006Tx-00
	for ; Wed, 04 Sep 2002 10:17:30 +0200
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org)
	by monty-python.gnu.org with esmtp (Exim 4.10)
	id 17mVNV-0003XH-00; Wed, 04 Sep 2002 04:19:09 -0400
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10)
	id 17mVND-0003S3-00
	for bug-gnu-emacs@gnu.org; Wed, 04 Sep 2002 04:18:51 -0400
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10)
	id 17mVNA-0003Rk-00
	for bug-gnu-emacs@gnu.org; Wed, 04 Sep 2002 04:18:50 -0400
Original-Received: from tsukuba.m17n.org ([192.47.44.130])
	by monty-python.gnu.org with esmtp (Exim 4.10)
	id 17mVN9-0003RS-00
	for bug-gnu-emacs@gnu.org; Wed, 04 Sep 2002 04:18:48 -0400
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2])
	by tsukuba.m17n.org (8.11.6/3.7W-20010518204228) with ESMTP id g848IbK18885;
	Wed, 4 Sep 2002 17:18:37 +0900 (JST)
	(envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125])
	by fs.m17n.org (8.11.3/3.7W-20010823150639) with ESMTP id g848Iad07243;
	Wed, 4 Sep 2002 17:18:36 +0900 (JST)
Original-Received: (from handa@localhost)
	by etlken.m17n.org (8.8.8+Sun/3.7W-2001040620) id RAA12414;
	Wed, 4 Sep 2002 17:18:36 +0900 (JST)
Original-To: tlm@pocketmail.com
In-Reply-To: <87y9aii9hc.fsf@cricket.magic.csuhayward.edu> (message from
	Thomas Morgan on 04 Sep 2002 01:56:15 -0400)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/21.1.30 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
Errors-To: bug-gnu-emacs-admin@gnu.org
X-BeenThere: bug-gnu-emacs@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Bug reports for GNU Emacs, the Swiss army knife of text editors
List-Unsubscribe: ,
List-Archive:
Xref: main.gmane.org gmane.emacs.bugs:3404
X-Report-Spam: http://spam.gmane.org/gmane.emacs.bugs:3404

Thomas Morgan writes:
> decode-char does not honor utf-8-fragment-on-decoding.

> I tried this code in
> GNU Emacs 21.3.50.2 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
>  of 2002-09-03 on cricket
> run with options -q and --no-site-file.

> (let ((utf-8-fragment-on-decoding nil)
>       (c ?Γ))
>   (= c (decode-char 'ucs (encode-char c 'ucs))))

> encode-char returns 915, decode-char returns 2883, and the entire sexp
> evaluates to nil.  The Unicode code point is translated into
> greek-iso8859-7 by decode-char even though utf-8-fragment-on-decoding
> is not enabled.

> Is this a bug?

The documentation of decode-char says that a character is translated
by utf-8-translation-table-for-decode (regardless of
utf-8-fragment-on-decoding).  So, strictly speaking, it's not a bug.

But I agree that this behavior is very confusing and not good.  I've
also recently found that utf-8 can't encode cyrillic-iso8859-5 and
greek-iso8859-7 correctly because of this behavior.

> The following change makes decode-char act as I expected.

Thank you.  It seems to be the right fix.  I'll install it soon.

Dave, do you see any problem with that?

---
Ken'ichi HANDA
handa@etl.go.jp

> *** /src/emacs/lisp/international/mule.el.~1.159.~  Sat Aug 24 03:46:25 2002
> --- /src/emacs/lisp/international/mule.el  Wed Sep  4 01:30:54 2002
> ***************
> *** 331,337 ****
>               (setq code-point (- code-point #xe000))
>               (make-char 'mule-unicode-e000-ffff
>                          (+ (/ code-point 96) 32) (+ (% code-point 96) 32))))))
> !       (if (and c (aref utf-8-translation-table-for-decode c))
>             (aref utf-8-translation-table-for-decode c)
>           c)))))
>
> --- 331,339 ----
>               (setq code-point (- code-point #xe000))
>               (make-char 'mule-unicode-e000-ffff
>                          (+ (/ code-point 96) 32) (+ (% code-point 96) 32))))))
> !       (if (and c
> !                utf-8-fragment-on-decoding
> !                (aref utf-8-translation-table-for-decode c))
>             (aref utf-8-translation-table-for-decode c)
>           c)))))
>
> Diff finished at Wed Sep  4 01:31:04
>
> _______________________________________________
> Bug-gnu-emacs mailing list
> Bug-gnu-emacs@gnu.org
> http://mail.gnu.org/mailman/listinfo/bug-gnu-emacs
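
For readers following the patch quoted above, here is a minimal, self-contained
sketch of the one condition it changes.  It is a toy model, not the code in
mule.el: every `sketch-' name is hypothetical, the hash table stands in for
utf-8-translation-table-for-decode (which in Emacs 21 is a char-table indexed
by the internal character rather than by the code point), and the single entry
915 -> 2883 only reuses the two values reported in the thread.

```elisp
;; Toy model of the patched check; all `sketch-' names are hypothetical.

(defvar sketch-fragment-on-decoding nil
  "Stand-in for `utf-8-fragment-on-decoding'.")

(defvar sketch-translation-table
  (let ((table (make-hash-table :test 'eql)))
    ;; The two values from the report: 915 in, 2883 out.
    (puthash 915 2883 table)
    table)
  "Stand-in for `utf-8-translation-table-for-decode'.")

(defun sketch-translate-old (c)
  "Pre-patch check: translate C whenever the table has an entry for it."
  (if (and c (gethash c sketch-translation-table))
      (gethash c sketch-translation-table)
    c))

(defun sketch-translate-new (c)
  "Patched check: translate C only if `sketch-fragment-on-decoding' is non-nil."
  (if (and c
           sketch-fragment-on-decoding
           (gethash c sketch-translation-table))
      (gethash c sketch-translation-table)
    c))

;; With the flag off, the old check still translates; the new one does not.
(let ((sketch-fragment-on-decoding nil))
  (list (sketch-translate-old 915)
        (sketch-translate-new 915)))
```

Evaluating the final form should give (2883 915).  Binding
sketch-fragment-on-decoding to t instead makes both functions return 2883,
which matches the intent of the patch: the translation table is consulted only
when fragmentation on decoding is enabled.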