From: Lennart Borgman
Newsgroups: gmane.emacs.devel
Subject: Re: Problem with national characters in XHTML
Date: Thu, 29 Sep 2005 15:52:17 +0200
Message-ID: <433BF191.50909@student.lu.se>
References: <14e4cba14e7621.14e762114e4cba@net.lu.se>
Cc: emacs-devel@gnu.org

Piet van Oostrum wrote:

>>>>>>Mathias Dahl (MD) wrote:
>
>>MD> I might be wrong here, but doesn't UTF-8 encode all characters in
>>MD> Latin-1 (ISO 8859-1) exactly as they are *in* Latin-1 encoding?
>
>No. Iso 8859-1 uses 1 byte for all characters, while UTF-8 uses two bytes
>for those characters that are in iso-8859-1. What you probably mean is that
>the Unicode value (code point) for each iso-8859-1 character is the same as
>its encoding in iso-8859-1.

This is not easy. What you say makes it even more interesting why
C-q 3 4 4 RET is stored as 2276 (or whatever it was) in the XHTML files.
How can that be? (For the context see my earlier mails.)
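
To make the numbers concrete, here is a small Python sketch of the arithmetic. It assumes that the digits typed after C-q are read as octal, and the variable names are only illustrative. It compares the code point with the ISO 8859-1 and UTF-8 byte sequences for that character, and checks the remark about two bytes against a plain ASCII letter as well.

```python
# Assumption: the "3 4 4" typed after C-q is interpreted as an octal number.
code_point = int("344", 8)          # octal 344 -> decimal 228 (hex E4)
char = chr(code_point)              # the character at that code point

# ISO 8859-1 stores the character as a single byte equal to the code point.
latin1_bytes = char.encode("iso-8859-1")

# UTF-8 needs two bytes for code points in the range 0x80..0xFF.
utf8_bytes = char.encode("utf-8")

# ASCII characters (below 0x80) still take a single byte in UTF-8.
ascii_utf8 = "a".encode("utf-8")

print("code point:", code_point, hex(code_point))
print("ISO 8859-1:", list(latin1_bytes))
print("UTF-8:     ", list(utf8_bytes))
print("UTF-8 'a': ", list(ascii_utf8))
```

None of these values is 2276, so under that assumption the number in the XHTML files is neither the code point nor the Latin-1 or UTF-8 encoding of the character.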