From: Kenichi Handa <handa@m17n.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: pmr@pajato.com, eliz@gnu.org, ueno@unixuser.org,
stephen@xemacs.org, emacs-devel@gnu.org
Subject: Re: Need some help with Rmail/mbox
Date: Mon, 22 Sep 2008 13:31:56 +0900
Message-ID: <E1Khd5Y-00042x-Hc@etlken.m17n.org>
In-Reply-To: <jwvzlm117p7.fsf-monnier+emacs@gnu.org> (message from Stefan Monnier on Sun, 21 Sep 2008 18:07:10 -0400)

In pre-unicode-merge Emacs (more exactly, before
2008-03-12), the automatic unibyte -> multibyte conversion
sometimes caused headaches for Emacs Lisp developers
because its behaviour differed in each language environment.
But in the current Emacs, that conversion is more
developer-friendly; i.e. all bytes with the MSB set are
converted to the corresponding eight-bit characters of the
multibyte representation (* see the attached note).

So, now we have these four ways to get a multibyte buffer
decoded from a unibyte buffer, and they should all work
equally safely.

(1) Do decode-coding-region while specifying a multibyte
    buffer as TARGET.

(2) Insert the contents of the unibyte buffer into a multibyte
    buffer, and then perform decode-coding-region in that
    multibyte buffer.

(3) Get a unibyte string from a unibyte buffer, and then
    decode it while specifying a multibyte buffer as TARGET.

(4) Decode a unibyte buffer into a multibyte string, and
    then insert it into a multibyte buffer.

(Please note that using decode-coding-region directly in a
unibyte buffer is not reliable, because if a coding system
has a post-read-conversion function, that function (usually)
works only in a multibyte buffer.)

In order of efficiency: (1) > (2) > (3) > (4).
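
The four approaches listed above can be sketched roughly as follows. This is only an illustration, not code from Rmail: the `my-` function names are hypothetical, and the TARGET mentioned above is assumed to correspond to the DESTINATION argument of decode-coding-region and the BUFFER argument of decode-coding-string.

```elisp
;; Illustrative sketch; all `my-' names are hypothetical helpers.

(defun my-make-unibyte-buffer (bytes)
  "Return a new unibyte buffer holding the unibyte string BYTES."
  (let ((buf (generate-new-buffer " *my-raw*")))
    (with-current-buffer buf
      (set-buffer-multibyte nil)
      (insert bytes))
    buf))

(defun my-decode-way-1 (raw target coding)
  "(1) Decode unibyte buffer RAW directly into multibyte buffer TARGET."
  (with-current-buffer raw
    (decode-coding-region (point-min) (point-max) coding target)))

(defun my-decode-way-2 (raw target coding)
  "(2) Copy RAW into TARGET, then decode in place inside TARGET."
  (with-current-buffer target
    (let ((start (point)))
      ;; Bytes with the MSB set arrive here as eight-bit characters.
      (insert-buffer-substring raw)
      (decode-coding-region start (point) coding))))

(defun my-decode-way-3 (raw target coding)
  "(3) Take a unibyte string from RAW and decode it into TARGET."
  (let ((bytes (with-current-buffer raw (buffer-string))))
    (decode-coding-string bytes coding nil target)))

(defun my-decode-way-4 (raw target coding)
  "(4) Decode RAW into a multibyte string, then insert it into TARGET."
  (let ((text (with-current-buffer raw
                ;; DESTINATION t means: return the decoded text.
                (decode-coding-region (point-min) (point-max) coding t))))
    (with-current-buffer target
      (insert text))))
```

Each function takes the unibyte source buffer, the multibyte buffer to fill, and a coding system such as `utf-8`; in every case the source buffer itself is left untouched.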

And, for the case of Rmail/mbox, before decoding, we may
have to perform base64 or qp decoding, and those functions
can't specify a different buffer/string as the target. And I
don't know if they work for a multibyte buffer/string.

So, at the moment, I think the following strategy is good.

Copy the contents of the RMAIL buffer to a temporary unibyte
buffer, perform base64/qp decoding in that buffer, then do
decode-coding-region while specifying the view buffer as
TARGET.
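
A minimal sketch of that strategy might look like the following. The function name `my-decode-body-into` and its argument list are hypothetical, it assumes the copied region holds only raw bytes (a unibyte buffer or pure ASCII transfer-encoded text), and it uses base64-decode-region and quoted-printable-decode-region (from the qp library) for the in-place transfer decoding.

```elisp
(require 'qp)  ; provides `quoted-printable-decode-region'

(defun my-decode-body-into (source beg end encoding coding view)
  "Decode the text of SOURCE between BEG and END into buffer VIEW.
ENCODING is the symbol `base64', `quoted-printable', or nil.
CODING is the coding system of the transfer-decoded bytes.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    ;; Work on raw bytes in a temporary unibyte buffer.
    (set-buffer-multibyte nil)
    (insert-buffer-substring source beg end)
    ;; base64/qp decoding can only operate in place.
    (cond ((eq encoding 'base64)
           (base64-decode-region (point-min) (point-max)))
          ((eq encoding 'quoted-printable)
           (quoted-printable-decode-region (point-min) (point-max))))
    ;; Finally decode the bytes straight into the view buffer.
    (decode-coding-region (point-min) (point-max) coding view)))
```

The temporary buffer is discarded when the function returns, so only the decoded text in VIEW survives.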
---
Kenichi Handa
handa@ni.aist.go.jp

* Note: Those eight-bit characters have values
#x3FFF80..#x3FFFFF, and, for instance, char-after and aref
return one of those values. To get the original byte value,
one needs (encode-char EIGHT-BIT-CHAR 'eight-bit) or
(multibyte-char-to-unibyte EIGHT-BIT-CHAR). Perhaps we
have to provide some API for directly getting the byte value
of an EIGHT-BIT-CHAR, but we have not yet decided what to do.
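
As a small illustration of the note, the round trip from a raw byte to its eight-bit character and back could be checked like this. The helper name `my-byte-round-trip` is hypothetical, and string-to-multibyte is used here merely as one explicit way to obtain the eight-bit character.

```elisp
(defun my-byte-round-trip (byte)
  "Return (AREF-CHAR BUFFER-CHAR VIA-ENCODE-CHAR VIA-M2U) for raw BYTE.
BYTE should be in the range #x80..#xFF.  Hypothetical helper."
  (let* ((multi (string-to-multibyte (unibyte-string byte)))
         ;; What `aref' sees in the multibyte string.
         (ch (aref multi 0))
         ;; What `char-after' sees in a multibyte buffer.
         (in-buffer (with-temp-buffer
                      (insert multi)
                      (char-after (point-min)))))
    (list ch
          in-buffer
          (encode-char ch 'eight-bit)
          (multibyte-char-to-unibyte ch))))
```

For a byte such as #x80, the first two elements should be the eight-bit character #x3FFF80 and the last two the original byte value.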