From mboxrd@z Thu Jan 1 00:00:00 1970
From: William Xu
Newsgroups: gmane.emacs.help
Subject: Re: url-retrieve and utf-8
Date: Thu, 07 Feb 2008 17:05:31 +0900
Organization: the Church of Emacs
References: <200802041702.27763.andreas.roehler@online.de>
To: help-gnu-emacs@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.50 (darwin)
Xref: news.gmane.org gmane.emacs.help:51285

Stefan Monnier writes:

> I can't remember exactly, but I think it doesn't (it just returns the
> raw undecoded bytes). url-insert-file-contents should try and obey
> "Content-Type"'s charset info, tho.

Hmm, url-insert-file-contents' implementation appears to obey
"Content-Type":

,----
| ;;;###autoload
| (defun url-insert-file-contents (url &optional visit beg end replace)
|   (let ((buffer (url-retrieve-synchronously url)))
|     (if (not buffer)
|         (error "Opening input file: No such file or directory, %s" url))
|     (if visit (setq buffer-file-name url))
|     (save-excursion
|       (let* ((start (point))
|              (size-and-charset (url-insert buffer beg end)))
|         (kill-buffer buffer)
|         (when replace
|           (delete-region (point-min) start)
|           (delete-region (point) (point-max)))
|         (unless (cadr size-and-charset)
|           ;; If the headers don't specify any particular charset, use the
|           ;; usual heuristic/rules that we apply to files.
|           (decode-coding-inserted-region start (point) url visit beg end replace))
|         (list url (car size-and-charset))))))
`----

only it never succeeds. For example, with a header like

,----
|
`----

it finds only "text/html" and misses the "charset" value entirely.

It looks like the final header detection falls to mm-decode.el. Maybe
this is mm-decode.el's fault?

-- 
William

http://williamxu.net9.org
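
For reference, the charset lookup discussed above can be tried by hand
on a header value, independent of url.el. This is only a minimal
sketch: the two `my-...' helpers are hypothetical names (not part of
url.el or mm-decode.el), the header string is a made-up example, and
mapping the charset name to a coding system by downcasing and interning
it is a simplifying assumption that will not cover every charset alias.

```elisp
(require 'mail-parse)

;; Hypothetical helper, not part of url.el or mm-decode.el.
(defun my-content-type-charset (header-value)
  "Return the charset parameter of HEADER-VALUE as a string, or nil."
  (let ((parsed (mail-header-parse-content-type header-value)))
    (mail-content-type-get parsed 'charset)))

;; Hypothetical helper: decode raw bytes in the current buffer
;; according to the charset named in HEADER-VALUE.
(defun my-decode-region-by-header (start end header-value)
  "Decode START..END using the charset in HEADER-VALUE, if usable.
Do nothing when no charset is given or Emacs does not know it."
  (let* ((charset (my-content-type-charset header-value))
         ;; Assumption: the charset name matches a coding system name.
         (coding (and charset (intern (downcase charset)))))
    ;; `coding-system-p' also accepts nil, so check CODING first.
    (when (and coding (coding-system-p coding))
      (decode-coding-region start end coding))))

;; Example value only; evaluate to see what the parser extracts.
(my-content-type-charset "text/html; charset=utf-8")
```

If the first helper returns nil for a response that visibly carries a
charset, the parsing step is where to look; if it returns the charset
but the text is still undecoded, the problem is more likely in the
caller.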