From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Andreas Schwab
Newsgroups: gmane.emacs.devel
Subject: Re: Reporting UTF-8 related problems?
Date: Tue, 30 Jul 2002 09:57:09 +0200
Sender: emacs-devel-admin@gnu.org
Message-ID:
References: <2110-Sun28Jul2002212621+0300-eliz@is.elta.co.il> <200207290518.OAA04004@etlken.m17n.org> <200207300522.OAA05828@etlken.m17n.org> <200207300711.QAA05993@etlken.m17n.org>
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Trace: main.gmane.org 1028015875 17878 127.0.0.1 (30 Jul 2002 07:57:55 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Tue, 30 Jul 2002 07:57:55 +0000 (UTC)
Cc: keichwa@gmx.net, eliz@is.elta.co.il, emacs-devel@gnu.org
Return-path:
Original-Received: from quimby.gnus.org ([80.91.224.244]) by main.gmane.org with esmtp (Exim 3.33 #1 (Debian)) id 17ZRtC-0004eF-00 for ; Tue, 30 Jul 2002 09:57:54 +0200
Original-Received: from fencepost.gnu.org ([199.232.76.164]) by quimby.gnus.org with esmtp (Exim 3.12 #1 (Debian)) id 17ZSAo-0006HG-00 for ; Tue, 30 Jul 2002 10:16:06 +0200
Original-Received: from localhost ([127.0.0.1] helo=fencepost.gnu.org) by fencepost.gnu.org with esmtp (Exim 3.35 #1 (Debian)) id 17ZRtY-000300-00; Tue, 30 Jul 2002 03:58:16 -0400
Original-Received: from ns.suse.de ([213.95.15.193] helo=Cantor.suse.de) by fencepost.gnu.org with esmtp (Exim 3.35 #1 (Debian)) id 17ZRsZ-0002wi-00 for ; Tue, 30 Jul 2002 03:57:15 -0400
Original-Received: from Hermes.suse.de (Charybdis.suse.de [213.95.15.201]) by Cantor.suse.de (Postfix) with ESMTP id 57D1914788; Tue, 30 Jul 2002 09:57:14 +0200 (MEST)
X-Authentication-Warning: sykes.suse.de: schwab set sender to schwab@suse.de using -f
Original-To: Kenichi Handa
X-Yow: RELATIVES!!
In-Reply-To: <200207300711.QAA05993@etlken.m17n.org> (Kenichi Handa's message of "Tue, 30 Jul 2002 16:11:18 +0900 (JST)")
Original-Lines: 37
User-Agent: Gnus/5.090006 (Oort Gnus v0.06) Emacs/21.3.50 (ia64-suse-linux)
Errors-To: emacs-devel-admin@gnu.org
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Emacs development discussions.
List-Unsubscribe: ,
List-Archive:
Xref: main.gmane.org gmane.emacs.devel:6169
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:6169

Kenichi Handa writes:

|> In article , Karl Eichwalder writes:
|> > Kenichi Handa writes:
|> >>> Char: “ (0150310, 53448, 0xd0c8) point=309 of 321 (96%) column 12
|> >>
|> >> This is because Emacs received this byte sequence:
|> >>   ESC $ ( B ! H
|> >> "ESC $ ( B" is a designation sequence for jisx0208,
|> >> and the following two bytes "! H" specifies the above
|> >> Japanese symbol.
|>
|> > Originally, it was the "right double quote raising" and not meant to be
|> > a special Japanese symbol ;)
|>
|> I checked the contents of the html file itself and found this:
|>
|> &#132;Die Familie Schroffenstein&#147;
|>
|> I thought that the notation &#NUMBER is for transmitting
|> Unicode character of code NUMBER.  But, 132 and 147 are
|> control codes in Unicode, not any kind of quotings.  Do you
|> know a proper web page describing the meaning of them?

The numbers are supposed to be ISO 8859-1 character codes.  I'd guess
the page has been written with some broken (a.k.a. W*nd*ws) software
(the use of *.htm makes this apparent).  There is no hope of it being
compliant with any standard.  I tried to validate it through the W3.org
validator, but no document type matches.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
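
To illustrate the point about 132 and 147, here is a minimal Python sketch that reads each number two ways: as a Unicode code point, and as a single byte in the Windows code page 1252. That the page was written with cp1252 in mind is only an assumption, based on the guess above about the authoring software; the helper names are made up for this example.

```python
import unicodedata


def as_unicode(number: int) -> str:
    """Read the number as a Unicode code point, as &#NUMBER; is meant to be."""
    return chr(number)


def as_cp1252(number: int) -> str:
    """Read the number as one byte in Windows code page 1252 (an assumption)."""
    return bytes([number]).decode("cp1252")


for number in (132, 147):
    # Unicode general category: "Cc" means control character.
    category = unicodedata.category(as_unicode(number))
    # Official Unicode name of the character cp1252 assigns to this byte.
    name = unicodedata.name(as_cp1252(number))
    print(f"&#{number};  Unicode category: {category}  cp1252: {name}")
```

Read as Unicode, both numbers fall in the C1 control range, which matches what Handa observed. Read as cp1252 bytes, they come out as a low opening and a raised closing double quote, which is consistent with the German quotation around the title.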