From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: please consider emacs-unicode for pervasive changes
Date: Fri, 6 Sep 2002 13:29:41 +0900 (JST)
Sender: emacs-devel-admin@gnu.org
Message-ID: <200209060429.NAA15734@etlken.m17n.org>
References:
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: main.gmane.org 1031286587 16319 127.0.0.1 (6 Sep 2002 04:29:47 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Fri, 6 Sep 2002 04:29:47 +0000 (UTC)
Cc: eliz@is.elta.co.il, emacs-devel@gnu.org, d.love@dl.ac.uk
Return-path:
Original-Received: from quimby.gnus.org ([80.91.224.244]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 17nAkb-0004F5-00 for ; Fri, 06 Sep 2002 06:29:45 +0200
Original-Received: from monty-python.gnu.org ([199.232.76.173]) by quimby.gnus.org with esmtp (Exim 3.12 #1 (Debian)) id 17nBKb-0004XD-00 for ; Fri, 06 Sep 2002 07:06:57 +0200
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org) by monty-python.gnu.org with esmtp (Exim 4.10) id 17nAmF-00089p-00; Fri, 06 Sep 2002 00:31:27 -0400
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10) id 17nAkh-00088n-00 for emacs-devel@gnu.org; Fri, 06 Sep 2002 00:29:51 -0400
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10) id 17nAkf-00088b-00 for emacs-devel@gnu.org; Fri, 06 Sep 2002 00:29:50 -0400
Original-Received: from tsukuba.m17n.org ([192.47.44.130]) by monty-python.gnu.org with esmtp (Exim 4.10) id 17nAke-00088V-00; Fri, 06 Sep 2002 00:29:48 -0400
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2]) by tsukuba.m17n.org (8.11.6/3.7W-20010518204228) with ESMTP id g864TfK03252; Fri, 6 Sep 2002 13:29:41 +0900 (JST) (envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125]) by fs.m17n.org (8.11.3/3.7W-20010823150639) with ESMTP id g864Tfd20107; Fri, 6 Sep 2002 13:29:41 +0900 (JST)
Original-Received: (from handa@localhost) by etlken.m17n.org (8.8.8+Sun/3.7W-2001040620) id NAA15734; Fri, 6 Sep 2002 13:29:41 +0900 (JST)
Original-To: rms@gnu.org
In-Reply-To: (message from Richard Stallman on Fri, 06 Sep 2002 00:01:46 -0400)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.1.30 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
Errors-To: emacs-devel-admin@gnu.org
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Emacs development discussions.
List-Unsubscribe: ,
List-Archive:
Xref: main.gmane.org gmane.emacs.devel:7609
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:7609

In article , Richard Stallman writes:
> In other words, running `diff' on
> the original and the edited files will not show any changes in those lines
> you didn't modify. However, unification means that `diff' _will_
> sometimes show differences in unedited portions of the file, because C2
> was recoded into a different codepoint.

> Given that we are using unicode and unification,
> isn't this inevitable?

Not necessarily. If we put a `charset' text property (whose value is
a charset) on the text when decoding, and check it when encoding, we
can preserve the original byte sequence.

Putting that text property on all the text would be too much. We
should put it only on those characters that would be encoded
differently without that information.

The new code conversion routine of emacs-unicode already has a basic
mechanism to handle such a thing (currently used only for
compositions). It is only that I don't have time to write the
concrete code.

---
Ken'ichi HANDA
handa@etl.go.jp
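
To illustrate the idea of recording a `charset' property only where it is needed, here is a rough sketch in Python rather than in Emacs code. It is not the emacs-unicode conversion routine. The segment format (a list of charset/bytes pairs), the PREFERRED order, and the names default_charset, decode and encode are all assumptions made up for this example; only the Python codecs themselves are real.

```python
# Hypothetical order in which the encoder tries charsets when a character
# carries no `charset' property.  This is an assumption for the sketch.
PREFERRED = ["ascii", "euc_jp", "gb2312"]


def default_charset(ch):
    """Return the first preferred charset that can encode CH."""
    for cs in PREFERRED:
        try:
            ch.encode(cs)
            return cs
        except UnicodeEncodeError:
            continue
    raise ValueError("no charset can encode %r" % ch)


def decode(segments):
    """Decode a list of (charset, bytes) pairs into unified text.

    Returns (text, props).  PROPS maps a character index to its source
    charset, but only for characters that the default encoder would
    otherwise write out in a different charset.
    """
    chars = []
    props = {}
    for cs, raw in segments:
        for ch in raw.decode(cs):
            if default_charset(ch) != cs:
                props[len(chars)] = cs
            chars.append(ch)
    return "".join(chars), props


def encode(text, props):
    """Encode TEXT back into (charset, bytes) pairs, honoring PROPS."""
    segments = []
    for i, ch in enumerate(text):
        cs = props.get(i) or default_charset(ch)
        raw = ch.encode(cs)
        if segments and segments[-1][0] == cs:
            # Extend the current run of the same charset.
            segments[-1] = (cs, segments[-1][1] + raw)
        else:
            segments.append((cs, raw))
    return segments


# U+4E2D exists in both GB 2312 and JIS X 0208, so after unification
# the two occurrences below become the same character.
original = [
    ("ascii", b"x="),
    ("gb2312", "\u4e2d".encode("gb2312")),
    ("euc_jp", "\u4e2d".encode("euc_jp")),
]
text, props = decode(original)
roundtrip = encode(text, props)
without_props = encode(text, {})
```

In this sketch only the GB 2312 occurrence gets a property, because the default encoder would already choose euc_jp for the other one. With the property the untouched text encodes back to the same bytes; without it the GB 2312 character is silently recoded, which is the `diff' problem discussed above.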