From: Kenichi Handa
To: Stefan Monnier
Cc: mituharu@math.s.chiba-u.ac.jp, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: [davidsmith@acm.org: [patch] url-hexify-string does not follow W3C spec]
Date: Tue, 01 Aug 2006 16:14:30 +0900
References: <44CDDF7A.8060404@gnu.org> <87lkq9ivgf.fsf@acm.org>
In-reply-to: (message from Stefan Monnier on Tue, 01 Aug 2006 02:50:41 -0400)

Stefan Monnier writes:

>>> What incompatibility? If the string only contains ASCII and
>>> eight-bit-*, then encoding it with utf-8 will return the same string
>>> of bytes (except in a unibyte string rather than multibyte string).

>> Here's an example:

>> (encode-coding-string "\x80" 'utf-8)
>> => "\302\200"

> Duh! Looks like a serious bug to me.
> Handa-san, what's up with that?

??? \x80 == U+0080 is a valid Unicode character in "C1 Controls"
block.

However, I agree that the following is very questionable behaviour:

>> (encode-coding-string (string-as-unibyte "\x80") 'utf-8)
>> => "\302\200"

But, that is a long standing problem, and should be fixed (if
necessary) after the release.

---
Kenichi Handa
handa@m17n.org
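
For readers checking the byte values quoted above: the short Python sketch below (Python is used only for illustration, it says nothing about Emacs internals) shows why the character U+0080 comes out as "\302\200" under UTF-8, and how that differs from a lone raw byte 0x80. The helper name `emacs_octal` is hypothetical and only mimics the backslash-octal notation that appears in the quoted results.

```python
def emacs_octal(data: bytes) -> str:
    """Render bytes as backslash-octal escapes, e.g. b'\\xc2\\x80' -> \\302\\200.

    Hypothetical helper: it only imitates the notation used in the thread.
    """
    return "".join(f"\\{byte:o}" for byte in data)


# U+0080 is a valid code point (a C1 control), so UTF-8 needs two bytes for it.
char = "\u0080"
encoded = char.encode("utf-8")
octal = emacs_octal(encoded)
print(octal)  # \302\200

# A lone raw byte 0x80 is a different thing: it is a UTF-8 continuation byte
# and is not valid UTF-8 on its own.
raw = bytes([0x80])
try:
    raw.decode("utf-8")
    raw_is_valid_utf8 = True
except UnicodeDecodeError:
    raw_is_valid_utf8 = False
print(raw_is_valid_utf8)  # False
```

The two-byte result is what one would expect for the character U+0080. The questionable case in the message is the second one, where the input is meant to be a raw byte rather than a character and yet the same two bytes come back.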