Newsgroups: gmane.emacs.help
Subject: Re: Character with codepoint #o223 is displayed as \223, do I have a font problem?
Date: Fri, 18 Mar 2016 16:02:50 +0100
Message-ID: <20160318150250.GA27859@tuxteam.de>
References: <87twk3izmh.fsf@gmail.com>
In-Reply-To: <87twk3izmh.fsf@gmail.com>
To: help-gnu-emacs@gnu.org
User-Agent: Mutt/1.5.21 (2010-09-15)

On Fri, Mar 18, 2016 at 12:17:58PM -0300, N. Jackson wrote:
> I'm finding that the name "Oscar" (with an accent on the initial letter)
> is displayed as
>
>     Ã\223scar
>
> That first character (the accented A) is:
>
>   position: 258 of 309 (83%), column: 41
>   character: Ã (displayed as Ã) (codepoint 195, #o303, #xc3)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0xC3
>   script: latin
>   syntax: w which means: word
>   category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
>   to input: type "C-x 8 RET c3" or "C-x 8 RET LATIN CAPITAL LETTER A WITH TILDE"
>   buffer code: #xC3 #x83
>   file code: #xC3 #x83 (encoded by coding system utf-8-unix)
>   display: by this font (glyph code)
>     xft:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-12-*-*-*-m-0-iso10646-1 (#x85)
>
> Character code properties: customize what to show
>   name: LATIN CAPITAL LETTER A WITH TILDE
>   old-name: LATIN CAPITAL LETTER A TILDE
>   general-category: Lu (Letter, Uppercase)
>   decomposition: (65 771) ('A' '̃')
>
> There are text properties here:
>   fontified t
>
> and the second character is:
>
>   position: 259 of 309 (83%), column: 42
>   character: \223 (displayed as \223) (codepoint 147, #o223, #x93)
>   preferred charset: unicode (Unicode (ISO10646))
>   code point in charset: 0x93
>   syntax: w which means: word
>   category: l:Latin
>   to input: type "C-x 8 RET 93" or "C-x 8 RET SET TRANSMIT STATE"
>   buffer code: #xC2 #x93
>   file code: #xC2 #x93 (encoded by coding system utf-8-unix)
>   display: by this font (glyph code)
>     xft:-PfEd-Unifont-normal-normal-normal-*-12-*-*-*-d-0-iso10646-1 (#x96)
>
> Character code properties: customize what to show
>   old-name: SET TRANSMIT STATE
>   general-category: Cc (Other, Control)
>   decomposition: (147) ('\223')
>
> There are text properties here:
>   fontified t
>
> Does this mean that the character with codepoint #o223 is missing from
> DejaVu Sans Mono (my default font), or that something else is wrong? (I
> tried setting my default font to several other faces but didn't see any
> change.)

No, that's not a font problem. That's a UTF-8 character being
"interpreted" in an 8-bit character set (most probably ISO-8859-1,
aka Latin-1, or some near cousin).

In UTF-8, Ó (capital letter O with acute) is represented by the byte
sequence 0xc3 0x93, which corresponds to the two codepoints you have
above (#xc3 and #x93). But in ISO-8859-1, 0xc3 is a capital A with
tilde, and 0x93 falls in the non-printable control range, so it tends
to look funny (here, the octal escape \223).

> I'm guessing it's some sort of composition problem, which takes me a
> long way beyond anything I know anything about!

I guess that your Emacs is trying the wrong encoding. There is a
heuristic to decide on that (the files themselves don't "know" their
encoding, so Emacs has to guess).

How did you get at the text?

regards
-- t
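
The byte-level explanation above can be checked outside Emacs. What follows is only a minimal sketch in Python (not anything Emacs does internally), and the variable names are made up for illustration. It assumes the wrong charset really was ISO-8859-1, which is a guess; a near cousin such as Windows-1252 would treat 0x93 differently.

```python
# Sketch: reproduce the "Ã\223scar" effect by decoding UTF-8 bytes
# as ISO-8859-1 (assumed here to be the wrongly guessed charset).

name = "Óscar"

# UTF-8 encodes Ó (U+00D3) as the two bytes 0xc3 0x93.
raw = name.encode("utf-8")

# Reading those bytes as ISO-8859-1 maps each byte to one character:
# 0xc3 becomes Ã (U+00C3), 0x93 becomes the control character U+0093.
misread = raw.decode("iso-8859-1")
codepoints = [hex(ord(ch)) for ch in misread[:2]]
octal_of_second = oct(ord(misread[1]))

# Saving the misread text as UTF-8 yields the "buffer code" bytes
# reported by describe-char: C3 83 for Ã and C2 93 for U+0093.
double_encoded = misread.encode("utf-8")

# Undoing the damage: reverse the wrong decoding step.
repaired = double_encoded.decode("utf-8").encode("iso-8859-1").decode("utf-8")

print(raw, codepoints, octal_of_second, double_encoded, repaired)
```

The second character's octal value, 223, is the number Emacs shows in the \223 escape, and the double-encoded bytes match the "buffer code" lines in the quoted report.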