From: Kenjiro NAKAYAMA
Newsgroups: gmane.emacs.bugs
Subject: bug#16225: 24.3.50; [PATCH] eww: machinery to set character encoding.
Date: Wed, 25 Dec 2013 16:32:51 +0900
Message-ID: <87ioudtmfw.fsf@dhcp-193-97.nrt.redhat.com>
References: <87eh53h928.fsf@dhcp-193-97.nrt.redhat.com> <87ob46y9bg.fsf@building.gnus.org>
In-reply-to: <87ob46y9bg.fsf@building.gnus.org>
To: Lars Ingebrigtsen
Cc: Kenjiro NAKAYAMA, 16225@debbugs.gnu.org
User-agent: mu4e 0.9.9.6pre2; emacs 24.3.50.2

> However, the implementation doesn't seem ideal. The best way to handle
> this would be to have the `E' command prompt for the charset, and then
> re-render the page immediately without setting any global variables.

Thank you, Lars. I fixed it. Could you please review it again?
I am sending the updated patch below.

Signed-off-by: Kenjiro NAKAYAMA

* net/eww.el (eww-display-html, eww-display-raw): Change to allow
setting the encoding type.
(eww-mode-map): New key binding and easy-menu entry to set the
encoding type.
(eww-set-character-encoding): New function to set the encoding type.
---
 lisp/net/eww.el | 43 ++++++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index 02c93a0..8fbf94d 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -172,7 +172,7 @@ word(s) will be searched for via `eww-search-prefix'."
                        "/")
                (expand-file-name file))))
 
-(defun eww-render (status url &optional point)
+(defun eww-render (status url &optional point encode)
   (let ((redirect (plist-get status :redirect)))
     (when redirect
       (setq url redirect)))
@@ -192,7 +192,7 @@ word(s) will be searched for via `eww-search-prefix'."
                    (or (cdr (assq 'charset (cdr content-type)))
                        (eww-detect-charset (equal (car content-type)
                                                   "text/html"))
-                       "utf8"))))
+                       "utf-8"))))
            (data-buffer (current-buffer)))
       (unwind-protect
           (progn
@@ -203,12 +203,12 @@ word(s) will be searched for via `eww-search-prefix'."
                                      (car content-type)))
               (eww-browse-with-external-browser url))
              ((equal (car content-type) "text/html")
-              (eww-display-html charset url nil point))
+              (eww-display-html charset url nil point encode))
              ((string-match-p "\\`image/" (car content-type))
               (eww-display-image)
               (eww-update-header-line-format))
              (t
-              (eww-display-raw)
+              (eww-display-raw encode)
               (eww-update-header-line-format)))
             (setq eww-current-url url
                   eww-history-position 0))
@@ -243,12 +243,12 @@ word(s) will be searched for via `eww-search-prefix'."
 (declare-function libxml-parse-html-region "xml.c"
                   (start end &optional base-url))
 
-(defun eww-display-html (charset url &optional document point)
+(defun eww-display-html (charset url &optional document point encode)
   (or (fboundp 'libxml-parse-html-region)
       (error "This function requires Emacs to be compiled with libxml2"))
-  (unless (eq charset 'utf8)
+  (unless (eq charset encode)
     (condition-case nil
-        (decode-coding-region (point) (point-max) charset)
+        (decode-coding-region (point) (point-max) encode)
       (coding-system-error nil)))
   (let ((document
          (or document
@@ -363,11 +363,16 @@ word(s) will be searched for via `eww-search-prefix'."
                       (list :background (car new-colors))
                       t))))))
 
-(defun eww-display-raw ()
+(defun eww-display-raw (&optional encode)
   (let ((data (buffer-substring (point) (point-max))))
     (eww-setup-buffer)
     (let ((inhibit-read-only t))
-      (insert data))
+      (insert data)
+      (unless (eq encode 'utf-8)
+        (encode-coding-region (point-min) (1+ (length data)) 'utf-8)
+        (condition-case nil
+            (decode-coding-region (point-min) (1+ (length data)) encode)
+          (coding-system-error nil))))
     (goto-char (point-min))))
 
 (defun eww-display-image ()
@@ -420,6 +425,7 @@ word(s) will be searched for via `eww-search-prefix'."
     (define-key map "C" 'url-cookie-list)
     (define-key map "v" 'eww-view-source)
     (define-key map "H" 'eww-list-histories)
+    (define-key map "E" 'eww-set-character-encoding)
 
     (define-key map "b" 'eww-add-bookmark)
     (define-key map "B" 'eww-list-bookmarks)
@@ -442,7 +448,8 @@ word(s) will be searched for via `eww-search-prefix'."
         ["List histories" eww-list-histories t]
         ["Add bookmark" eww-add-bookmark t]
         ["List bookmarks" eww-list-bookmarks t]
-        ["List cookies" url-cookie-list t]))
+        ["List cookies" url-cookie-list t]
+        ["Character Encoding" eww-set-character-encoding]))
     map))
 
 (defvar eww-tool-bar-map
@@ -552,11 +559,11 @@ appears in a <link> or <a> tag."
         (eww-browse-url (shr-expand-url best-url eww-current-url))
       (user-error "No `top' for this page"))))
 
-(defun eww-reload ()
+(defun eww-reload (&optional encode)
   "Reload the current page."
   (interactive)
   (url-retrieve eww-current-url 'eww-render
-                (list eww-current-url (point))))
+                (list eww-current-url (point) encode)))
 
 ;; Form support.
 
@@ -1032,7 +1039,7 @@ If EXTERNAL, browse the URL using `shr-external-browser'."
      ((and (url-target (url-generic-parse-url url))
            (eww-same-page-p url eww-current-url))
       (eww-save-history)
-      (eww-display-html 'utf8 url eww-current-dom))
+      (eww-display-html 'utf-8 url eww-current-dom))
      (t
       (eww-browse-url url)))))
 
@@ -1083,6 +1090,16 @@ Differences in #targets are ignored."
         (setq count (1+ count)))
       (expand-file-name file directory)))
 
+(defun eww-set-character-encoding (encode)
+  "Set character encoding."
+  (interactive "sSet Character Encoding (default utf-8): ")
+  (cond ((zerop (length encode))
+         (eww-reload 'utf-8))
+        (t
+         (if (not (coding-system-p (intern encode)))
+             (user-error "Invalid encoding type.")
+           (eww-reload (intern encode))))))
+
 ;;; Bookmarks code
 
 (defvar eww-bookmarks nil)
-- 
1.8.3.1

Regards,
Kenjiro

larsi@gnus.org writes:

> Kenjiro NAKAYAMA writes:
>
>> This report includes a patch to the bug. Please, review and install it
>> to the official tree if appreciated.
>>
>> The user can't change encoding type. This patch is to fix it.
>
> Allowing the user to alter the encoding seems like a good idea.
> However, the implementation doesn't seem ideal. The best way to handle
> this would be to have the `E' command prompt for the charset, and then
> re-render the page immediately without setting any global variables.
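
For illustration only: the `E' command described in the review could also prompt with completion over the known coding systems instead of reading a free-form string. The following is a minimal sketch, not part of the patch. It assumes an `eww-reload' that accepts an optional ENCODE argument, as the patch proposes, and the name `my-eww-set-encoding' is hypothetical. `read-coding-system' is a standard Emacs function; it returns a coding-system symbol, or its second argument when the input is empty, so a separate `coding-system-p' check should not be needed.

```elisp
;; Hypothetical variant of the command.  Assumes `eww-reload' takes an
;; optional ENCODE argument, as proposed in the patch.
(defun my-eww-set-encoding (coding)
  "Prompt for CODING and reload the current page using it."
  (interactive
   ;; `read-coding-system' completes over coding-system names and
   ;; returns the default (`utf-8' here) on empty input.
   (list (read-coding-system "Charset (default utf-8): " 'utf-8)))
  (eww-reload coding))
```

Because the chosen coding system is passed straight through as an argument, this keeps the behaviour asked for in the review: no global variable is set.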