From: Tech Stuff
Newsgroups: gmane.emacs.help
Subject: Re: File Encoding Issue on Windows
Date: Tue, 12 Mar 2013 07:57:39 -0700 (PDT)
Message-ID: <1363100259.90696.YahooMailNeo@web165001.mail.bf1.yahoo.com>
References: <1363057726.11242.YahooMailNeo@web165001.mail.bf1.yahoo.com>
To: Peter Dyballa
Cc: "help-gnu-emacs@gnu.org"

Hi Peter,

Thanks for taking the time to reply. Though it was useful, I'm still confused about how to resolve this issue. To be clear, when I posted yesterday, it was in Emacs that I was seeing the extraneous characters, not in Notepad. However, I just opened the file again in Notepad to check on the encoding, and now I'm seeing the extra characters there as well. So something must have changed when, as part of trying to figure out what was going on, I saved the file in Emacs. Emacs seems to be the culprit. Is there something that I can put in my .emacs to tell it to save automatically in UTF-8? Or am I maybe still not understanding things?

Thanks again.

-ts1971 



From: Peter Dyballa <Peter_Dyballa@Web.DE>
To: Tech Stuff <techstuff1971@yahoo.com>
Cc: "help-gnu-emacs@gnu.org" <help-gnu-emacs@gnu.org>
Sent: Tuesday, March 12, 2013 3:50 AM
Subject: Re: File Encoding Issue on Windows

On 12.03.2013 at 04:08, Tech Stuff wrote:

> Â¿En quÃ© fecha llegaron
>
> when I should see:
>
> ¿En qué fecha llegaron

The first line is the text of the last line encoded in UTF-8, but displayed to you in a different, 8-bit encoding. In UTF-8, characters outside the US-ASCII range take more than one byte, i.e. more than 8 bits. Only the characters of the US-ASCII range (U+0000 - U+007F), i.e. the digits, non-accented letters, and punctuation, are encoded in one byte.
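
The byte counts described above can be checked with a minimal sketch, assuming Python 3. The helper name `utf8_len` and the sample characters are illustrative only (the euro sign is an extra example, not taken from the original file).

```python
# Sketch: count how many bytes UTF-8 needs for a single character.
# utf8_len is a hypothetical helper name chosen for this example.

def utf8_len(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))

# ASCII characters first, then characters beyond U+007F.
samples = ["A", "7", "?", "¿", "é", "€"]

for ch in samples:
    print(f"U+{ord(ch):04X} {ch!r}: {utf8_len(ch)} byte(s)")
```

Characters up to U+007F come out as one byte each; everything above that needs two or more.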

The character ¿, INVERTED QUESTION MARK, U+00BF, is encoded in UTF-8 as two bytes: C2 BF. Notepad interprets these two bytes in some Latin or MS Windows encoding, i.e. as two different characters, Â and ¿, which are then displayed as such.

The character é, LATIN SMALL LETTER E WITH ACUTE, U+00E9, is encoded in UTF-8 as two bytes: C3 A9. Notepad interprets these two bytes in some Latin or MS Windows encoding, i.e. as two different characters, and displays them as Ã and ©.

The MS Windows code page CP1252 maps these bytes as follows:

    A9 = ©, COPYRIGHT SIGN
    BF = ¿, INVERTED QUESTION MARK
    C2 = Â, LATIN CAPITAL LETTER A WITH CIRCUMFLEX
    C3 = Ã, LATIN CAPITAL LETTER A WITH TILDE
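
To see where such byte values come from, here is a rough sketch, assuming Python 3 and its standard `cp1252` codec. It encodes a character as UTF-8 and then looks up each resulting byte in CP1252 on its own. The function name `cp1252_view` is made up for this illustration.

```python
import unicodedata

def cp1252_view(text: str) -> list:
    """Encode text as UTF-8, then read every byte back as a CP1252 character.

    Returns (hex byte, displayed character, Unicode name) tuples.
    """
    rows = []
    for byte in text.encode("utf-8"):
        shown = bytes([byte]).decode("cp1252")
        rows.append((f"{byte:02X}", shown, unicodedata.name(shown)))
    return rows

for original in ("¿", "é"):
    for hex_byte, shown, name in cp1252_view(original):
        print(f"{original} -> {hex_byte} = {shown}, {name}")
```

Each input character yields two rows, one per UTF-8 byte, which is why one character in the file shows up as two on screen.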

So Notepad is using this code page, CP1252, to display the UTF-8 encoded file. What you need to do is to tell Notepad to use UTF-8.
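
The difference between the two readings of the same bytes can be sketched as follows, assuming Python 3. The name `fix_mojibake` is hypothetical, and the repair only works when the garbled text really is UTF-8 that was decoded as CP1252 exactly once.

```python
def fix_mojibake(garbled: str) -> str:
    """Undo UTF-8 bytes that were mistakenly decoded as CP1252."""
    return garbled.encode("cp1252").decode("utf-8")

# The bytes as they would be stored in a UTF-8 encoded file.
raw = "¿En qué fecha llegaron".encode("utf-8")

wrong = raw.decode("cp1252")  # what a viewer assuming CP1252 displays
right = raw.decode("utf-8")   # what a viewer assuming UTF-8 displays

print(wrong)
print(right)
print(fix_mojibake(wrong) == right)
```

The file contents never change here; only the encoding the viewer assumes does.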

--
Greetings

  Pete

Give a man a fish, and you've fed him for a day. Teach him to fish, and you've depleted the lake.


