From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: [william.xwl@gmail.com: webjump-url-encode and non-ascii characters]
Date: Tue, 24 Jul 2007 10:52:28 +0900
To: rms@gnu.org
Cc: william.xwl@gmail.com, emacs-devel@gnu.org
In-reply-to: (message from Richard Stallman on Mon, 23 Jul 2007 00:29:21 -0400)

Sorry for the late response.

In article , Richard Stallman writes:

> [I sent this message a week ago but did not get a response.
> Is this in your area?  I would expect it is, since it deals
> with non-ASCII characters, but maybe it isn't.  If it isn't,
> please say so.

> Please respond!]

> Is this patch correct?  Most particularly, is it correct to use
> buffer-file-coding-system for a URL?  I have doubts about that.

I have doubts too.  I'm not an expert on URL (or URI) encoding, but,
as far as I remember, non-ASCII characters in a URL must first be
encoded in UTF-8 and then percent-encoded.  So, for instance,
à (U+00E0) must be encoded as "%C3%A0".

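As a rough illustration of that two-step rule (UTF-8 first, then %-encoding), here is a minimal sketch. The function name `my-url-encode-utf8` is hypothetical and is not part of webjump.el; the choice to pass only ASCII letters and digits through unchanged and to turn a space into "+" simply mirrors the existing webjump-url-encode, and is an assumption rather than a statement of what the URL standards require.

```elisp
;; Illustrative sketch only; `my-url-encode-utf8' is a made-up name.
(defun my-url-encode-utf8 (str)
  "Encode STR as UTF-8, then %-encode every byte that is not alphanumeric.
A space becomes \"+\", as in the existing webjump-url-encode."
  (mapconcat (lambda (byte)
               (cond ((= byte ?\s) "+")
                     ;; ASCII letters and digits are passed through as-is.
                     ((or (and (>= byte ?a) (<= byte ?z))
                          (and (>= byte ?A) (<= byte ?Z))
                          (and (>= byte ?0) (<= byte ?9)))
                      (char-to-string byte))
                     ;; Everything else, including each UTF-8 byte of a
                     ;; non-ASCII character, becomes %XX in upper-case hex.
                     (t (format "%%%02X" byte))))
             ;; A fixed coding system instead of buffer-file-coding-system,
             ;; so the result does not depend on the current buffer.
             (encode-coding-string str 'utf-8)
             ""))
```

With this sketch, (my-url-encode-utf8 "\u00E0") gives "%C3%A0", and the result is the same whatever coding system the current buffer uses.
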
---
Kenichi Handa
handa@m17n.org

> ------- Start of forwarded message -------
> To: emacs-devel@gnu.org
> From: William Xu
> Date: Wed, 04 Jul 2007 18:34:51 +0800
> Organization: the Church of Emacs
> Subject: webjump-url-encode and non-ascii characters

> webjump-url-encode fails to encode non-ascii characters correctly.
> Here's a patch:

> --- webjump.el 2007-06-03 14:54:53.000000000 +0800
> +++ webjump.el.new 2007-07-04 18:29:41.000000000 +0800
> @@ -451,14 +451,13 @@
>
>  (defun webjump-url-encode (str)
>    (mapconcat '(lambda (c)
> -                (cond ((= c 32) "+")
> -                      ((or (and (>= c ?a) (<= c ?z))
> -                           (and (>= c ?A) (<= c ?Z))
> -                           (and (>= c ?0) (<= c ?9)))
> -                       (char-to-string c))
> -                      (t (upcase (format "%%%02x" c)))))
> -             str
> -             ""))
> +                (let ((s (char-to-string c)))
> +                  (cond ((string= s " ") "+")
> +                        ((string-match "[a-zA-Z_.-/]" s) s)
> +                        (t (upcase (format "%%%02x" c))))))
> +             (string-to-list
> +              (encode-coding-string str buffer-file-coding-system))
> +             ""))
>
>  (defun webjump-url-fix (url)
>    (if (webjump-null-or-blank-string-p url)

> --
> William
> ------- End of forwarded message -------