From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lennart Borgman
Newsgroups: gmane.emacs.devel
Subject: Re: Problem with national characters in XHTML
Date: Thu, 29 Sep 2005 16:02:39 +0200
Message-ID: <433BF3FF.1070602@student.lu.se>
References: <14e4cba14e7621.14e762114e4cba@net.lu.se> <433AA30F.8050203@student.lu.se> <433AEB2D.7070906@student.lu.se> <20050929084322.GA16219@www.trapp.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Cc: emacs-devel@gnu.org
Original-To: Piet van Oostrum

Piet van Oostrum wrote:

>>>>>>tomas@tuxteam.de (Tomas Zerolo) (TZ) wrote:
>
>>TZ> Ah. You have to distinguish between Emacs's internal representation
>>TZ> (that's possibly the 2276 you mention), which doesn't change (at least
>>TZ> unless you try hard ;) and what is in the file (how Emacs writes or
>>TZ> interprets what it reads). You can change those things changing the
>>TZ> coding system (look for something like `multilingual environment').
>
>By default Emacs uses different internal representations for the "same"
>character in different coding systems. So an iso-8859-1 "ä" is a different
>thing than a utf-8 "ä". This difference will disappear when Emacs switches
>to Unicode internally. For the time being the OP could use Unicode
>unification, if his Emacs version is young enough. I have used this for
>some years now without any problems. Maybe it solves the original problem.
>
>(require 'ucs-tables)
>(unify-8859-on-encoding-mode 1)
>(unify-8859-on-decoding-mode 1)

The values I have in CVS emacs.exe -Q are

  (featurep 'ucs-tables) = t
  unify-8859-on-encoding-mode = t
  unify-8859-on-decoding-mode = nil

Though I do not understand what it means right now ;-)

Evaling (unify-8859-on-decoding-mode 1) does not change the behaviour of
C-q 3 4 4 RET.
It still enters a character that (following-char) reports as 2276
(04344, 0x8e4).

I did not notice before that there seems to be only one bit that differs
(see the second figure) - if that in some way matters.
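
The one-bit observation can be checked with plain integer arithmetic. The sketch below is Python rather than Emacs Lisp and models nothing inside Emacs. It assumes that C-q reads the digits 3 4 4 as octal (the default radix), and the names `typed`, `reported` and `differing_bits` are made up for the illustration.

```python
# Plain integer arithmetic only; no Emacs internals are modelled here.

typed = 0o344      # assumption: C-q 3 4 4 RET is read as octal 344
reported = 2276    # the value (following-char) reports in the message


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# The octal and hex forms of the reported value, as quoted in the message.
print(oct(reported), hex(reported))

# The typed code in decimal and hex.
print(typed, hex(typed))

# Positions where the two values differ.
print(differing_bits(typed, reported))
```

Under that assumption the typed code and the reported code differ in a single bit position, which matches the observation above. What that bit means in Emacs's internal representation is not something the arithmetic can tell.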