From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Mathias Dahl
Newsgroups: gmane.emacs.devel
Subject: Re: Problem with national characters in XHTML
Date: Thu, 29 Sep 2005 13:11:49 +0200
Message-ID:
References: <14e4cba14e7621.14e762114e4cba@net.lu.se>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: sea.gmane.org 1127995619 26194 80.91.229.2 (29 Sep 2005 12:06:59 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Thu, 29 Sep 2005 12:06:59 +0000 (UTC)
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu Sep 29 14:06:56 2005
Return-path:
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43) id 1EKx9l-0004nu-1M
	for ged-emacs-devel@m.gmane.org; Thu, 29 Sep 2005 14:04:57 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1EKx9j-0003rs-Fx
	for ged-emacs-devel@m.gmane.org; Thu, 29 Sep 2005 08:04:55 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1EKwqG-00066G-Dg
	for emacs-devel@gnu.org; Thu, 29 Sep 2005 07:44:48 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1EKwqD-00065F-Bi
	for emacs-devel@gnu.org; Thu, 29 Sep 2005 07:44:47 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1EKwmG-0004wm-4b
	for emacs-devel@gnu.org; Thu, 29 Sep 2005 07:40:40 -0400
Original-Received: from [80.91.229.2] (helo=ciao.gmane.org)
	by monty-python.gnu.org with esmtp (TLS-1.0:RSA_AES_128_CBC_SHA:16)
	(Exim 4.34) id 1EKwNX-0005bT-Ha
	for emacs-devel@gnu.org; Thu, 29 Sep 2005 07:15:07 -0400
Original-Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1EKwLf-0005se-9n
	for emacs-devel@gnu.org; Thu, 29 Sep 2005 13:13:11 +0200
Original-Received: from user.ifsab.se ([193.41.170.225])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for ; Thu, 29 Sep 2005 13:13:11 +0200
Original-Received: from brakjoller by user.ifsab.se with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for ; Thu, 29 Sep 2005 13:13:11 +0200
X-Injected-Via-Gmane: http://gmane.org/
Original-To: emacs-devel@gnu.org
Original-Lines: 22
Original-X-Complaints-To: usenet@sea.gmane.org
X-Gmane-NNTP-Posting-Host: user.ifsab.se
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (windows-nt)
Cancel-Lock: sha1:UUjWbrYXhepDrwg8OLCUirfGt1k=
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:43349
Archived-At:

Juanma Barranquero writes:

> On 9/28/05, LENNART BORGMAN wrote:
>
>> I have run into a problem with swedish national characters in an
>> XHTML document. The header of the document is like this:
>>
>>
>> > "http://www.w3.org/TR/REC-html40/loose.dtd">
>>
>>
>> The swedish character ä looks like \344 in CVS Emacs (2005-09-23).
>
> Hmm. An XHTML document with encoding="utf-8" should not have
> "swedish national characters" in it, should it? Upon reading the
> file, Emacs will set its coding system to mule-utf-8, so it's no
> surprise than high-bit, non-valid utf8 byte sequences appear as
> \xxx...

I might be wrong here, but doesn't UTF-8 encode all characters in
Latin-1 (ISO 8859-1) exactly as they are *in* Latin-1 encoding?
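
The question can be checked directly by comparing the bytes each
encoding produces for ä. The following is only a minimal sketch in
Python (not something from the thread, and the variable names are made
up for illustration); it suggests that only the ASCII range is
byte-identical between the two encodings, while ä is one byte in
Latin-1 and two bytes in UTF-8:

```python
# Compare how ISO 8859-1 (Latin-1) and UTF-8 encode the same character.
ch = "\u00e4"  # ä, LATIN SMALL LETTER A WITH DIAERESIS

latin1_bytes = ch.encode("iso-8859-1")  # single byte 0xE4 (octal 344)
utf8_bytes = ch.encode("utf-8")         # two bytes 0xC3 0xA4

print("Latin-1:", latin1_bytes)
print("UTF-8:  ", utf8_bytes)

# A lone 0xE4 byte is the start of a multi-byte UTF-8 sequence, so on
# its own it does not decode as UTF-8.
try:
    latin1_bytes.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False

print("0xE4 alone is valid UTF-8:", lone_byte_is_valid_utf8)

# Plain ASCII characters, by contrast, encode identically in both.
ascii_same = "a".encode("iso-8859-1") == "a".encode("utf-8")
print("ASCII 'a' identical in both:", ascii_same)
```

If that holds, a file declared as utf-8 but saved as Latin-1 would
contain a bare 0xE4 byte, which would fit the \344 display described
above.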