From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lewis Perin
Newsgroups: gmane.emacs.help
Subject: Re: utf8 char display in buffer
Date: 12 Jun 2009 13:45:51 -0400
Organization: Software Prostheses
References: <7I2dndeTy7sqkLLXnZ2dnUVZ_gmdnZ2d@sysmatrix.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: help-gnu-emacs@gnu.org
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
List-Id: Users list for the GNU Emacs text editor

"B. T. Raven" writes:

> Lewis Perin wrote:
> > ken writes:
> >
> >> [...]
> >> Lewis,
> >>
> >> Thanks for posting.  It's lonely out there when you're the only one with
> >> a particular problem.
> > The few, the proud...
> >
> >> To make sure we're suffering the same cyber-indignity, here's the
> >> scenario as I see it (from an older version of emacs running on
> >> Linux):
> >>
> >> 0) Some others and myself want to include some non-English characters in
> >> a file being edited in emacs.  Problems arise, however:
> >>
> >> 1) In a buffer which is already utf-8 encoded, I set the appropriate
> >> input method, type in the desired characters.  They display just peachy
> >> and there is happiness in EmacsLand.
> >>
> >> 2) I save the buffer to a file, then close the buffer.
> >>
> >> 3) I visit the same file (i.e., load it again into emacs).  Because it
> >> has <!-- -*- coding: utf-8; -*- --> as the first line, it opens
> >> utf-8 encoded.  This is confirmed by the presence of a 'u' as the second
> >> character in the status bar.
> > I haven't been inserting that special first line.
> >
> >> 4) The text in the buffer displays fine, except that in place of each of
> >> those non-English characters is a little empty box.  With the cursor on
> >> one of those boxes, an 'a' with a horizontal bar above it, doing "C-x
> >> =", emacs returns "Char: ā (01210041, 331809, 0x51021, file ...)".
> >> (While, in emacs the character after "Char:" is a little box, if I load
> >> this same file into Firefox, that same character appears as it should,
> >> as an 'a' with a horizontal bar above it.  How it appears in your email
> >> client will depend upon your email client.)
> > My situation differs in that most of the non-ASCII characters (Chinese
> > in my case) come through just fine.
> > But the ones that don't have
> > those irritating boxes in place of the correct glyphs.
>
> I wouldn't be surprised if the gaps and overlaps in the CJK ranges of
> glyphs weren't so complicated that many characters from the following
> encodings may not be included in utf-8,

Sorry, I'm not sure what you mean by "may not be included in utf-8": do
you mean utf-8 the standard, or do you mean Emacs's implementation of
it?  The characters I'm talking about are definitely in Unicode.

> especially if they are not precomposed.

This I don't really understand, either, I'm afraid.  Might this
explain why I can see the glyph for ni3 when I'm composing Chinese in
Emacs using the chinese-tonepy-punct input method but can't see it when
the saved file is read by Emacs?

> Try some of these encodings to see if some of the empty boxes are
> resolved into characters:
> [...]
> cn-gb-2312

I created a little file with my bête noire character using that
encoding and saved it.  Reverting the file with that encoding, I did
see all the characters.

> Also it might help to install a fontset rather than depending on a
> single font to represent all these characters.  Unfortunately I can't
> help with that.  I am on w32 and I don't even know whether fontsets can
> be used in Emacs on that build.

Windows R Us, too.

/Lew
---
Lew Perin / perin@acm.org
http://www.panix.com/~perin/babelcarp.html
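
For anyone who wants to check, outside Emacs, that a character is
ordinary Unicode and survives both encodings discussed in this thread,
here is a minimal Python sketch.  It is not part of the original
exchange.  The helper names describe and round_trips are made up for
illustration, and the choice of 你 (U+4F60) for "ni3" is an assumption,
since the message does not say which character is meant.  If a
character round-trips cleanly here, the empty boxes are more likely a
display or decoding issue on the Emacs side than a gap in utf-8 itself,
though this check alone cannot prove that.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, Unicode name, and UTF-8 bytes (hex) of one character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf8": ch.encode("utf-8").hex(),
    }


def round_trips(ch: str, encoding: str) -> bool:
    """True if ch can be encoded with `encoding` and decoded back unchanged."""
    try:
        return ch.encode(encoding).decode(encoding) == ch
    except UnicodeEncodeError:
        # The character has no representation in this encoding.
        return False


if __name__ == "__main__":
    # "ā" is the character from ken's report; "你" is an assumed stand-in for ni3.
    for ch in ("ā", "你"):
        print(describe(ch))
        for enc in ("utf-8", "gb2312"):  # Python's codec names, not Emacs coding systems
            print(f"  {enc}: round-trips = {round_trips(ch, enc)}")
```

Note that "gb2312" here is Python's codec name; it is only assumed to
correspond to the cn-gb-2312 coding system tried above.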