From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Fwd: Re: Inadequate documentation of silly characters on screen.
Date: Thu, 19 Nov 2009 14:55:12 -0500
To: Aidan Kehoe
Cc: Alan Mackenzie, Jason Rumney, Andreas Schwab, emacs-devel@gnu.org
In-Reply-To: <19205.30349.786007.611623@parhasard.net> (Aidan Kehoe's message of "Thu, 19 Nov 2009 16:47:09 +0000")
References: <20091118191258.GA2676@muc.de> <20091119082040.GA1720@muc.de> <874ooq8xay.fsf@wanchan.jasonrumney.net> <20091119141852.GC1720@muc.de> <20091119155848.GB1314@muc.de> <19205.30349.786007.611623@parhasard.net>
Xref: news.gmane.org gmane.emacs.devel:117291

>> I'm thinking from the lisp viewpoint.  The string is a data structure
>> I really don't want to have to think about
>> the difference between "chars" and "bytes" when I'm hacking lisp.  If I
>> do, then the abstraction "string" is broken.

> For some context on this, that’s how it works in XEmacs; we’ve never had
> problems with it, we seem to avoid an entire class of programming errors
> that GNU Emacs developers deal with on a regular basis.

Indeed XEmacs does not represent chars as integers, and that can
eliminate several sources of problems.  Note that this problem is new in
Emacs-23, since in Emacs-22 (and in XEmacs, IIUC), there was no
character whose integer value was between 127 and 256, so there was no
ambiguity.

AFAIK most of the programming errors we've had to deal with over the
years (i.e.
in Emacs-20, 21, 22) had to do with incorrect (or missing)
encoding/decoding and most of those errors existed just as much on
XEmacs because there's no way to fix them right in the infrastructure
code (tho XEmacs may have managed to hide them better by detecting the
lack of encoding/decoding and guessing an appropriate coding-system
instead).

> Tangentially, for those that like the unibyte/multibyte distinction, to my
> knowledge the editor does not have any way of representing “an octet with
> numeric value < #x7f to be treated with byte semantics, not character
> semantics”, which seems arbitrary to me.  For example:

Indeed.  It hasn't bitten us hard yet, mostly because (luckily) there are
very few coding-systems which use chars 0-127 in ways incompatible
with ascii.


        Stefan
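
[Editorial illustration, not part of the original message.]  The last
point, about coding systems that use octets 0-127 in ways incompatible
with ASCII, can be sketched outside Emacs.  The snippet below is only
an analogy in Python, not Emacs internals: it assumes Python's standard
"iso2022_jp" codec as the example of such a 7-bit coding system, and
the variable names are made up for illustration.  It shows that every
encoded octet is below 128, that reading those octets as ASCII succeeds
silently but gives the wrong text (the "missing decoding" class of
error), and that decoding with the right coding system round-trips.

```python
# Analogy only: Python bytes/str stand in for octets vs. characters.
text = "日本語"

# ISO-2022-JP is a 7-bit coding system: every octet it emits is below
# 128, yet those octets do not carry their ASCII meaning.
encoded = text.encode("iso2022_jp")
all_seven_bit = all(b < 128 for b in encoded)

# Treating the octets as ASCII raises no error but yields the wrong
# characters; nothing in the octet values flags the mistake.
misread = encoded.decode("ascii")

# Decoding with the matching coding system recovers the original text.
roundtrip = encoded.decode("iso2022_jp")

print(all_seven_bit, misread != text, roundtrip == text)
```

Because the octet values alone cannot say whether they are "bytes" or
"characters", the program has to track that distinction itself, which
is the situation the quoted paragraph describes.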