From: "B. T. Raven"
Newsgroups: gmane.emacs.help
Subject: Re: utf8 char display in buffer
Date: Fri, 12 Jun 2009 11:48:55 -0500
References: <7I2dndeTy7sqkLLXnZ2dnUVZ_gmdnZ2d@sysmatrix.net>
To: help-gnu-emacs@gnu.org

Lewis Perin wrote:
> ken writes:
>
>> [...]
>> Lewis,
>>
>> Thanks for posting. It's lonely out there when you're the only one with
>> a particular problem.
>
> The few, the proud...
>
>> To make sure we're suffering the same cyber-indignity, here's the
>> scenario as I see it (from an older version of emacs running on
>> Linux):
>>
>> 0) Some others and myself want to include some non-English characters in
>> a file being edited in emacs. Problems arise, however:
>>
>> 1) In a buffer which is already utf-8 encoded, I set the appropriate
>> input method, type in the desired characters. They display just peachy
>> and there is happiness in EmacsLand.
>>
>> 2) I save the buffer to a file, then close the buffer.
>>
>> 3) I visit the same file (i.e., load it again into emacs). Because it
>> has <!-- -*- coding: utf-8; -*- --> as the first line, it opens
>> utf-8 encoded. This is confirmed by the presence of a 'u' as the second
>> character in the status bar.
>
> I haven't been inserting that special first line.
>
>> 4) The text in the buffer displays fine, except that in place of each of
>> those non-English characters is a little empty box. With the cursor on
>> one of those boxes, an 'a' with a horizontal bar above it, doing "C-x
>> =", emacs returns "Char: ā (01210041, 331809, 0x51021, file ...)".
>> (While, in emacs the character after "Char:" is a little box, if I load
>> this same file into Firefox, that same character appears as it should,
>> as an 'a' with a horizontal bar above it. How it appears in your email
>> client will depend upon your email client.)
>
> My situation differs in that most of the non-ASCII characters (Chinese
> in my case) come through just fine. But the ones that don't have
> those irritating boxes in place of the correct glyphs.
>
> /Lew
> ---
> Lew Perin / perin@acm.org
> http://www.panix.com/~perin/babelcarp.html

I wouldn't be surprised if the gaps and overlaps in the CJK glyph
ranges were complicated enough that many characters from the following
encodings are not included in utf-8, especially if they are not
precomposed. Try some of these encodings to see whether any of the
empty boxes resolve into characters:

chinese-big5
chinese-hz
chinese-iso-7bit
chinese-iso-8bit
chinese-iso-8bit-with-esc
cn-big5
cn-gb
cn-gb-2312
iso-2022-cjk
iso-2022-cn
iso-2022-cn-ext

Also, it might help to install a fontset rather than depending on a
single font to represent all these characters. Unfortunately I can't
help with that: I am on w32 and don't even know whether fontsets can be
used in Emacs on that build.

Ed
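
For anyone who wants to try the "test several encodings" idea outside
Emacs, here is a minimal sketch in Python. It is only an illustration:
the helper name try_decodings is made up, and the codec names are
Python's, picked as rough counterparts of a few of the Emacs coding
systems listed above (an assumption, not an exact mapping). It reports
which codecs can decode a byte sequence at all. A clean decode says
nothing about whether the display font has a glyph for the character.

```python
# Hypothetical helper: try candidate codecs on raw bytes and keep the
# ones that decode without error. Codec names are Python's, assumed
# here to roughly correspond to some of the Emacs coding systems.
CANDIDATES = ["utf-8", "big5", "gb2312", "hz"]

def try_decodings(raw, codecs=CANDIDATES):
    """Return {codec: decoded text} for each codec that decodes raw cleanly."""
    results = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # this codec cannot interpret the bytes; skip it
    return results

# U+0101 is the 'a' with a horizontal bar from the quoted report.
sample = "\u0101".encode("utf-8")  # two bytes: C4 81
decoded = try_decodings(sample)
for name, text in decoded.items():
    # Print the code points each successful codec produced.
    print(name, [hex(ord(ch)) for ch in text])
```

If utf-8 decodes the bytes to the expected code point, the file itself
is probably fine and the boxes are more likely a display (font) issue
than an encoding one.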