From: Kenichi Handa <handa@gnu.org>
To: rms@gnu.org
Cc: 12296@debbugs.gnu.org
Subject: bug#12296: 24.1.50; Slow decoding in Rmail
Date: Wed, 29 Aug 2012 13:35:17 +0900
Message-ID: <87k3wiuxh6.fsf@gnu.org>
In-Reply-To: <E1T6SMo-0007BW-E5@fencepost.gnu.org> (message from Richard Stallman on Tue, 28 Aug 2012 16:26:30 -0400)

In article <E1T6SMo-0007BW-E5@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:

> Mime-decoding in Rmail the message included below
> takes 10 seconds on my machine (which is rather slow).
> I am pretty sure it is due to the character code,
> because in general messages in Russian are slow
> and others are not.  I include this so you get an example.

I think the slowness is due to
quoted-printable-decode-region (in lisp/gnus/qp.el).  It is
not well tuned for speed, probably because the
quoted-printable encoding is not intended for
such mostly non-ASCII text.  RFC 2045 says:

------------------------------------------------------------
6.7. Quoted-Printable Content-Transfer-Encoding

   The Quoted-Printable encoding is intended to represent data that
   largely consists of octets that correspond to printable characters in
   the US-ASCII character set. 
------------------------------------------------------------
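
To illustrate the point, here is a rough sketch of the size overhead.
It assumes qp.el is loaded; `my-qp-sizes' is a hypothetical helper
name used only for this example.  Every non-ASCII octet turns into a
three-character "=XX" sequence, so mostly Cyrillic text roughly
triples in size, while plain ASCII passes through unchanged.

------------------------------------------------------------
(require 'qp)  ; ships with Emacs; provides quoted-printable-encode-string

(defun my-qp-sizes (text)
  "Hypothetical helper: return (RAW-BYTES QP-CHARS) for TEXT as UTF-8."
  (let* ((bytes (encode-coding-string text 'utf-8))
         (encoded (quoted-printable-encode-string bytes)))
    (list (length bytes) (length encoded))))

;; The Russian word "Privet" in Cyrillic, written with \u escapes so
;; that this snippet itself stays ASCII-only.
(my-qp-sizes "\u041f\u0440\u0438\u0432\u0435\u0442")
;; A plain ASCII word for comparison.
(my-qp-sizes "Hello")
------------------------------------------------------------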

Anyway, here is a slightly tuned version.  Could you please
try it?
------------------------------------------------------------
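(defun qp-decode-hex (n1 n2)
  "Return the byte denoted by the hex digit characters N1 and N2."
  ;; Upcase: the caller's regexp also matches lowercase a-f.
  (let ((n1 (upcase n1)) (n2 (upcase n2)))
    (+ (* (if (<= n1 ?9) (- n1 ?0) (+ (- n1 ?A) 10)) 16)
       (if (<= n2 ?9) (- n2 ?0) (+ (- n2 ?A) 10)))))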

(defun quoted-printable-decode-region (from to &optional coding-system)
  "Decode quoted-printable in the region between FROM and TO, per RFC 2045.
If CODING-SYSTEM is non-nil, decode bytes into characters with that
coding-system.

Interactively, you can supply the CODING-SYSTEM argument
with \\[universal-coding-system-argument].

The CODING-SYSTEM argument is a historical hangover and is deprecated.
QP encodes raw bytes and should be decoded into raw bytes.  Decoding
them into characters should be done separately."
  (interactive
   ;; Let the user determine the coding system with "C-x RET c".
   (list (region-beginning) (region-end) coding-system-for-read))
  (unless (mm-coding-system-p coding-system) ; e.g. `ascii' from Gnus
    (setq coding-system nil))
  (save-excursion
    (save-restriction
      ;; RFC 2045:  ``An "=" followed by two hexadecimal digits, one
      ;; or both of which are lowercase letters in "abcdef", is
      ;; formally illegal. A robust implementation might choose to
      ;; recognize them as the corresponding uppercase letters.''
      (let ((case-fold-search t))
	(narrow-to-region from to)
	;; Do this in case we're called from Gnus, say, in a buffer
	;; which already contains non-ASCII characters which would
	;; then get doubly-decoded below.
	(if coding-system
	    (mm-encode-coding-region (point-min) (point-max) coding-system))
	(goto-char (point-min))
	(while (and (skip-chars-forward "^=")
		    (not (eobp)))
	  (cond ((eq (char-after (1+ (point))) ?\n)
		 (delete-char 2))
		((looking-at "\\(=[0-9A-F][0-9A-F]\\)+")
		 (let* ((n (/ (- (match-end 0) (point)) 3))
			(str (make-string n 0))
			(i 0))
		   (while (< i n)
		     (aset str i (qp-decode-hex (char-after (1+ (point)))
						(char-after (+ 2 (point)))))
		     (setq i (1+ i))
		     (forward-char 3))
		   (delete-region (match-beginning 0) (match-end 0))
		   (insert str)))
		(t
		 (message "Malformed quoted-printable text")
		 (forward-char)))))
      (if coding-system
	  (mm-decode-coding-region (point-min) (point-max) coding-system)))))
------------------------------------------------------------
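
One rough way to measure the effect is to time the function on a
synthetic buffer, once with the stock definition and once after
evaluating the one above.  This is only a sketch: `my-qp-time-decode'
is a hypothetical helper, the repeated sample word is an arbitrary
stand-in for a real Russian message, and qp.el should be loaded
before the tuned definitions are evaluated so that they are not
overwritten.

------------------------------------------------------------
(require 'qp)         ; stock quoted-printable-decode-region
(require 'benchmark)  ; provides benchmark-run

(defun my-qp-time-decode (repeat)
  "Hypothetical helper: decode REPEAT copies of a QP-encoded word.
Return (ELAPSED-SECONDS . BUFFER-SIZE-AFTER-DECODING)."
  (with-temp-buffer
    (set-buffer-multibyte nil)          ; QP decodes to raw bytes
    ;; A six-letter Cyrillic word as UTF-8 octets in QP form, plus a space.
    (dotimes (_ repeat)
      (insert "=D0=9F=D1=80=D0=B8=D0=B2=D0=B5=D1=82 "))
    (let ((elapsed (car (benchmark-run 1
                          (quoted-printable-decode-region
                           (point-min) (point-max))))))
      (cons elapsed (buffer-size)))))

;; Each copy decodes to 12 octets plus the space, so the second
;; element should be 13 times REPEAT whichever definition is active.
(my-qp-time-decode 20000)
------------------------------------------------------------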

---
Kenichi Handa
handa@gnu.org




