From: ken
Newsgroups: gmane.emacs.devel
Subject: Re: utf8 char display in buffer
Date: Fri, 12 Jun 2009 19:38:30 -0400
Message-ID: <4A32E6F6.5080501@mousecar.com>
References: <7I2dndeTy7sqkLLXnZ2dnUVZ_gmdnZ2d@sysmatrix.net> <4A32D54D.1040405@mousecar.com>
Reply-To: gebser@mousecar.com
To: Lennart Borgman
Cc: Emacs-Devel devel

On 06/12/2009 06:27 PM Lennart Borgman wrote:
> Ken, I think this is a good idea so I have sent this along to Emacs devel.
>
> On Sat, Jun 13, 2009 at 12:23 AM, ken wrote:
>> Yet emacs puts a little box in the place of a character it cannot find
>> (or, per your explanation) possibly confused about. The fact remains
>> that the little box is not a correct rendering of the code. It is an
>> error... at least it is for me, because that's not what I typed in. So
>> it is an error. As an error, there should be a corresponding error
>> message, hopefully one (or more) which would help diagnose the problem.
>> It seems obvious that, given the long thread on this issue with no
>> resolution, we could use some help-- like an error message-- which would
>> help in diagnosis.

Thank you, Lennart!

To give the people at emacs-devel some context for the issue, the salient portion of the previous post is pasted below:

0) Some others and I want to include some non-English characters in a file being edited in emacs.
Problems arise, however:

1) In a buffer which is already utf-8 encoded, I set the appropriate input method and type in the desired characters. They display just peachy and there is happiness in EmacsLand.

2) I save the buffer to a file, then close the buffer.

3) I visit the same file (i.e., load it again into emacs). Because it has [...] as the first line, it opens utf-8 encoded. This is confirmed by the presence of a 'u' as the second character in the status bar (the mode line).

4) The text in the buffer displays fine, except that in place of each of those non-English characters is a little empty box. With the cursor on one of those boxes (where an 'a' with a horizontal bar above it should be), "C-x =" returns "Char: ā (01210041, 331809, 0x51021, file ...)". (In emacs the character after "Char:" is a little box, but if I load this same file into Firefox, that character appears as it should, as an 'a' with a horizontal bar above it. How it appears in your email client will depend upon your email client.)

A) The fact that, as described in (4), the characters display correctly in Firefox but not in emacs indicates that emacs is not drawing on the needed character set. Yet the fact that in (1) the characters display correctly when first input indicates that the needed character set is present on the system, and that emacs can find it and has permission to access it. Further, we would expect emacs to give an error message if either of these conditions were not met... and it doesn't. We can only assume that, when visiting a file, decoding it, and pulling it into a buffer for display, emacs is not even asking for the proper character set when it encounters a non-English character. This is where I would start to look for the error.
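As a side check on the numbers quoted in (4), here is a minimal Python sketch (Python is not part of the original discussion). It confirms that the octal, decimal, and hex forms printed by "C-x =" are one and the same value, and compares that value with the Unicode code point of the character. It assumes the intended character is U+0101, which matches the description "an 'a' with a horizontal bar above it" but is not stated in the message. The mismatch it reports may mean the number is an emacs-internal character code rather than a Unicode code point; that reading is a guess, not something established above.

```python
import unicodedata

# The three numbers reported by "C-x =" in step (4), copied as written.
octal_text, decimal_text, hex_text = "01210041", "331809", "0x51021"

reported = int(decimal_text)
# True when the octal, decimal, and hex forms all denote the same integer.
same_value = int(octal_text, 8) == reported == int(hex_text, 16)

# Assumption: the character the author typed is U+0101.
char = "\u0101"
codepoint = ord(char)
utf8_bytes = char.encode("utf-8")  # the bytes a utf-8 file would hold for it

print("all three forms agree:", same_value)
print("reported value:", reported, hex(reported))
print("Unicode name:", unicodedata.name(char))
print("Unicode code point:", codepoint, hex(codepoint))
print("UTF-8 bytes:", " ".join(f"{b:02x}" for b in utf8_bytes))
print("reported value equals code point:", reported == codepoint)
```

If the file on disk really contains the two bytes printed on the "UTF-8 bytes" line at that position, the saved data is fine and the problem lies on the display side rather than in the file.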
B) It would be helpful if the code which decodes a file and renders it into the buffer display would give an error message when it encounters a character it doesn't know how to display, i.e., whenever a little box is displayed. After all, isn't it an error when a little box is displayed in lieu of the correct character? Possible error messages would be something like:

"decoding process can't find /path/to/charset.file"

or

"decoding process doesn't have requisite permission to read /path/to/charset.file"

or

"invalid character: [hex/decimal value]"

or other.

###

Thanks much,
ken