From: "Stephen J. Turnbull"
To: David Kastrup
Cc: emacs-pretest-bug@gnu.org, Patrick Drechsler, Miles Bader
Newsgroups: gmane.emacs.devel,gmane.emacs.pretest.bugs
Subject: Re: 23.0.60; [nxml] BOM and utf-8
Date: Tue, 20 May 2008 08:36:11 +0900
Message-ID: <87wslpeuj8.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <85ej7yqafj.fsf@lola.goethe.zz>

David Kastrup writes:

> I am not interested in the "goal of Unicode" but in that of Emacs.
> Unicode is about text files.
> But Emacs communicates via byte streams
> and those are not necessarily text, or necessarily all text.

Some Emacs files *are* text, and getting them to behave correctly will
require understanding "the goals of Unicode".  Since Unicode is now
the underlying representation of multibyte buffers, you don't have a
choice about this.  Cf. Thomas Morgan's recent post on "disappearing
cursor".

> > Sure, and Emacs must provide coding systems that preserve them, and
> > generally use those coding systems by default.  Did anybody say
> > otherwise?
>
> So what was your point supposed to be?

That Miles could use a BOM-swallowing encoding on input and a
non-BOM-producing encoding on output to enforce his view of Microsoft
conventions on others.  I told Patrick what I thought *Emacs* should
do, but apparently it doesn't do that yet.

> So forward-char and replace-string should be made to work as
> expected on non-normalized texts.

Good luck.  I don't know how to do that, and doubt that it is
possible.  I do not think that "as expected" can be well defined,
because for purposes like computing storage requirements composing
characters should be considered characters, while for others like
computing the number of columns occupied by a line they should not.

> > Binary faithfulness may be incompatible with other user demands, for
> > example if a user introduces Latin-2 characters into a Latin-9 text.
>
> Why do you think we switched to utf-8 internally and got rid of latin
> unification?

David, don't you realize that is not a response to what I wrote?

I think it's time to stop this thread until you address the issues
instead of me.
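
Two of the points above can be made concrete with a minimal sketch.
It is written in Python rather than Emacs Lisp, and it is an analogy
only: Python's `utf-8` and `utf-8-sig` codecs stand in for a
BOM-preserving and a BOM-swallowing coding system, and nothing here
describes what Emacs actually does.  The helpers `storage_length` and
`column_estimate` are hypothetical names introduced for illustration;
the column estimate only discounts combining marks and ignores
double-width characters.

```python
import codecs
import unicodedata

# Part 1: a BOM-swallowing decoder plus a non-BOM-producing encoder.
# Assumption: Python codecs are used as stand-ins for Emacs coding systems.
raw = codecs.BOM_UTF8 + "<a/>".encode("utf-8")

kept = raw.decode("utf-8")           # the BOM survives as U+FEFF
swallowed = raw.decode("utf-8-sig")  # the BOM is dropped on input

# Writing the swallowed text back with plain utf-8 emits no BOM,
# so the bytes on disk change even though the "text" did not.
round_trip = swallowed.encode("utf-8")
faithful = kept.encode("utf-8")      # preserving the BOM round-trips exactly


# Part 2: two defensible answers to "how many characters is this?"
def storage_length(text: str) -> int:
    """Count code points, as one would when sizing storage."""
    return len(text)


def column_estimate(text: str) -> int:
    """Count code points that are not combining marks (rough column count)."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(round_trip == raw, faithful == raw)
print(storage_length(decomposed), column_estimate(decomposed))
print(storage_length(composed), column_estimate(composed))
```

The decomposed and composed strings display identically yet disagree
on length under one measure and agree under the other, which is the
sense in which "as expected" depends on what the count is for.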