From: Peter Dyballa
Newsgroups: gmane.emacs.help
Subject: Re: File Encoding Issue on Windows
Date: Tue, 12 Mar 2013 11:50:40 +0100
To: Tech Stuff
Cc: "help-gnu-emacs@gnu.org"

On 12.03.2013 at 04:08, Tech Stuff wrote:

> Â¿En quÃ© fecha llegaron
>
> when I should see:
>
> ¿En qué fecha llegaron

The first line is the text of the last line encoded in UTF-8, but it is displayed to you in a different, 8-bit encoding. UTF-8 uses more than one byte, i.e. more than 8 bits, to encode most characters. Only the characters of the US-ASCII range (U+0000 - U+007F), i.e. the digits, non-accented letters, punctuation, and control characters, are encoded as a single byte.

The character ¿, INVERTED QUESTION MARK, U+00BF, is encoded in UTF-8 as two bytes: C2 BF. Notepad interprets these two bytes in some Latin or MS Windows encoding, i.e. as two different characters, Â and ¿, and displays them as such.

The character é, LATIN SMALL LETTER E WITH ACUTE, U+00E9, is encoded in UTF-8 as two bytes: C3 A9. Notepad interprets these two bytes in some Latin or MS Windows encoding, i.e. as two different characters, and displays them as Ã and ©.
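The effect described above can be reproduced outside Notepad. The following is only a minimal sketch in Python, assuming the 8-bit encoding in question is CP1252; the helper name misread_utf8_as_cp1252 is made up for illustration, and the sample text is the line from the original question.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were CP1252.

    Hypothetical helper: it mimics a program that opens a UTF-8 file
    while assuming the CP1252 code page.
    """
    return text.encode("utf-8").decode("cp1252")


original = "¿En qué fecha llegaron"

# Each of the two non-ASCII characters becomes two bytes in UTF-8.
print("¿".encode("utf-8").hex())  # c2bf
print("é".encode("utf-8").hex())  # c3a9

# Reading those bytes as CP1252 turns every byte into its own character.
garbled = misread_utf8_as_cp1252(original)
print(garbled)  # Â¿En quÃ© fecha llegaron

# For this sample the damage is reversible: the bytes themselves are intact.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```

The last step works here because all four bytes involved are defined in CP1252; it is not guaranteed for arbitrary text, since a few byte values have no CP1252 character assigned.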
In MS Windows, code page CP1252 uses these encodings:

	A9 = ©, COPYRIGHT SIGN
	BF = ¿, INVERTED QUESTION MARK
	C2 = Â, LATIN CAPITAL LETTER A WITH CIRCUMFLEX
	C3 = Ã, LATIN CAPITAL LETTER A WITH TILDE

So Notepad is using this code page, CP1252, to display the UTF-8 encoded file. What you need to do is to tell Notepad to use UTF-8.

--
Greetings

Pete

Give a man a fish, and you've fed him for a day. Teach him to fish, and you've depleted the lake.