From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Unibyte characters
Date: Fri, 31 Oct 2008 10:41:47 -0400
To: Eli Zaretskii
Cc: handa@m17n.org, emacs-devel@gnu.org, Miles Bader
In-Reply-To: (Eli Zaretskii's message of "Fri, 31 Oct 2008 13:27:13 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

>> Text in a unibyte buffer is simply a bunch of binary characters
>> 0-255

> Here you are saying what I was saying: that these are just raw 8-bit
> bytes.

>> you can interpret them however you want, of course, but that's
>> not how emacs sees it.

> I don't mind saying that displaying such a buffer or string or
> movement by characters _interprets_ each byte as a single character.
> But interpretation and essence are two different things, and the
> manual does not make a point of telling that what it describes is the
> Emacs interpretation of such buffers, not what is actually held there.

> Thanks for the feedback, I will try to rephrase that text to make this
> distinction more clear.

IIUC, this part of the manual dates back to the introduction of Mule,
when many people were using Emacs in unibyte mode. Nowadays unibyte
mode is not recommended (I'd even be all happy to remove it altogether)
and unibyte buffers should only be used for binary, undecoded data
(i.e. for bytes, not for chars).
So I agree with Eli that we should update this text to insist that
a unibyte buffer only contains bytes, and then explain that if the
buffer is displayed, those bytes will be interpreted in a particular
way.

BTW IIRC the non-ASCII part will just be displayed as \NNN nowadays,
rather than in some locale-dependent charset (such as latin-1).


        Stefan
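
To make the bytes-versus-interpretation distinction concrete, here is a small Python sketch. It is only a model of the idea, not Emacs code: the helper name `render_unibyte` is hypothetical, and it assumes the simplest possible display rule, namely ASCII bytes shown as-is and every other byte shown as a backslash plus three octal digits (the \NNN form mentioned above). Real Emacs display has more cases (control characters, display tables) that this ignores.

```python
def render_unibyte(data: bytes) -> str:
    """Hypothetical model of displaying raw bytes: ASCII bytes are shown
    as-is, every byte >= 0x80 as a backslash plus three octal digits."""
    parts = []
    for byte in data:
        if byte < 0x80:
            parts.append(chr(byte))
        else:
            parts.append("\\%03o" % byte)
    return "".join(parts)


raw = b"caf\xe9"               # four bytes; the last one is 0xE9 (octal 351)

# The data itself is just four bytes, whatever we do with it afterwards.
print(len(raw))

# Interpretation 1: show the non-ASCII byte as an octal escape.
print(render_unibyte(raw))

# Interpretation 2: decode through a charset, here latin-1.
print(raw.decode("latin-1"))
```

Both printed forms come from the same four bytes. Only the interpretation step differs, which is the point the manual text should make explicit.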