From: LENNART BORGMAN
Newsgroups: gmane.emacs.devel
Subject: Re: Problem with national characters in XHTML
Date: Wed, 28 Sep 2005 13:08:03 +0200
Message-ID: <150b4fd1509d3a.1509d3a150b4fd@net.lu.se>
To: "emacs-devel@gnu.org"

OK, thanks to everyone who replied for the help. I tried to learn a bit ;-)

Putting iso-8859-1 in the header instead of utf-8, as Tomas Zerolo
suggested, solved the problem.

----- Original Message -----
From: Juanma Barranquero <lekktu@gmail.com>
Date: Wednesday, September 28, 2005 12:44 pm
Subject: Re: Problem with national characters in XHTML

> On 9/28/05, LENNART BORGMAN <lennart.borgman.073@student.lu.se> wrote:
>
> > I have run into a problem with Swedish national characters in an
> > XHTML document. The header of the document is like this:
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
> > "http://www.w3.org/TR/REC-html40/loose.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
> >
> > The Swedish character ä looks like \344 in CVS Emacs (2005-09-23).
>
> Hmm. An XHTML document with encoding="utf-8" should not have "swedish
> national characters" in it, should it? Upon reading the file, Emacs
> will set its coding system to mule-utf-8, so it's no surprise that
> high-bit, non-valid utf8 byte sequences appear as \xxx...
>
> I've created a document with your header, and put an "É" in it with
> notepad. Emacs shows this char as \311. I would not consider this an
> error :)
>
> --
> /L/e/k/t/u
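
For anyone who wants to check the byte values mentioned above, here is a minimal Python sketch. It is not from the thread: the helper name `describe_bytes` is made up for illustration, and it only assumes that the file was saved as ISO-8859-1 while declaring utf-8. It shows that the single byte for "ä" (octal 344) and for "É" (octal 311) is not a valid UTF-8 sequence on its own, which would explain why Emacs displays it as an octal escape.

```python
def describe_bytes(raw: bytes) -> dict:
    """Report how a byte string reads as UTF-8 and as ISO-8859-1.

    Hypothetical helper for illustration only.
    """
    try:
        as_utf8 = raw.decode("utf-8")
    except UnicodeDecodeError:
        as_utf8 = None  # the bytes are not a valid UTF-8 sequence
    return {
        # Octal escapes in the style Emacs uses for raw bytes, e.g. \344
        "octal": "".join(f"\\{b:03o}" for b in raw),
        "utf-8": as_utf8,
        "iso-8859-1": raw.decode("iso-8859-1"),
    }


# "ä" (U+00E4) and "É" (U+00C9) as a Latin-1 editor would write them to disk
a_umlaut_latin1 = "\u00e4".encode("iso-8859-1")
e_acute_latin1 = "\u00c9".encode("iso-8859-1")

print(describe_bytes(a_umlaut_latin1))
print(describe_bytes(e_acute_latin1))

# The same "ä" saved as real UTF-8 takes two bytes and decodes cleanly
print(describe_bytes("\u00e4".encode("utf-8")))
```

Under that assumption there are two consistent fixes: declare iso-8859-1 in the XML header (what was done here), or re-save the file as UTF-8 so the declaration matches the bytes.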