* 23.0.60; rmail initial misdecoding of iso-8859-15 mail (quoted-printable at least)
From: David Golden @ 2008-01-19 0:02 UTC
To: emacs-pretest-bug
[-- Attachment #1: Type: text/plain, Size: 3386 bytes --]
In rmail, fetching iso-8859-15 mail doesn't work correctly, at least for
mail of this form:
Content-Type: text/plain;
charset="iso-8859-15"
Content-Transfer-Encoding: quoted-printable
...
To reproduce (only the emacs-unicode-2 branch was tested):
Take an email declared as iso-8859-15 that contains iso-8859-15-specific
characters (see the included example unix mailbox mbox.iso-8859-15) and
have rmail fetch it (C-u g mbox.iso-8859-15).
It is misdecoded: the euro currency character (byte #A4) becomes the
international currency symbol (as it would be in iso-8859-1).
However, the coding system recorded for the mail is still iso-8859-15,
so the decoded mail now has a character embedded in it that is invalid
for iso-8859-15. If you then run e.g. rmail-redecode-body, the
international currency symbol is neither preserved nor fixed but lost,
quite probably because re-encoding back to iso-8859-15 (rightly) can't
handle the international currency symbol.
Note that starting from iso-8859-1 doesn't exhibit this behaviour:
you can fetch an iso-8859-1 mail with the international currency
character, rmail-redecode-body to iso-8859-15, and it (correctly)
becomes the euro sign; rmail-redecode-body back to iso-8859-1, and it
(correctly) becomes the international currency sign again.
(Try C-u g on the also attached mbox.iso-8859-1.)
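For illustration only, the asymmetry described above can be reproduced
outside Emacs. The sketch below uses Python's standard codecs rather than
rmail; the helper redecode is hypothetical and merely assumes that
redecoding amounts to "encode with the old charset, decode with the new
one", which is an assumption about rmail-redecode-body, not a description
of its source.

```python
RAW = b"\xa4"  # the single byte at issue

as_latin1 = RAW.decode("iso-8859-1")   # international currency sign, U+00A4
as_latin9 = RAW.decode("iso-8859-15")  # euro sign, U+20AC


def redecode(text, old_charset, new_charset):
    """Hypothetical model of a redecode step: re-encode, then decode anew."""
    return text.encode(old_charset).decode(new_charset)


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Starting from a correctly decoded iso-8859-1 mail, the round trip works.
euro = redecode(as_latin1, "iso-8859-1", "iso-8859-15")
back = redecode(euro, "iso-8859-15", "iso-8859-1")

# Starting from the misdecoded mail (currency sign, but labelled
# iso-8859-15), the character cannot be re-encoded at all.
misdecoded_is_encodable = can_encode(as_latin1, "iso-8859-15")

print(hex(ord(as_latin1)), hex(ord(as_latin9)))
print(euro == as_latin9, back == as_latin1, misdecoded_is_encodable)
```

The last check is the interesting one: the international currency sign
has no representation in iso-8859-15, so a redecode that starts by
encoding to iso-8859-15 has nothing to map it to.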
...
In GNU Emacs 23.0.60.2 (i686-pc-linux-gnu, GTK+ Version 2.12.3)
of 2008-01-18 on golden1
Windowing system distributor `The X.Org Foundation', version
11.0.10400000
configured using
`configure '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--with-freetype' '--with-xft' '--with-gtk' '--enable-font-backend' '--prefix=/home/david/emacs''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_IE.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: RMAIL
Minor modes in effect:
tooltip-mode: t
tool-bar-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-compression-mode: t
line-number-mode: t
Recent input:
<help-echo> M-x r m a i l <return> C-u g m b o x .
i s o - 8 8 <tab> <return> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <down> M-x
r m a i l - r e d <tab> <return> i s o - 8 8 5 9 -
1 5 <return> M-x r m a i l - r e d e c o <tab> <return>
i s o - 8 8 5 9 - 1 <return> C-x k <return> y e s <return>
<help-echo> <help-echo> <help-echo> <help-echo> M-x
r m n <backspace> a i l <return> C-u g m b o x . i
s o <tab> 5 <return> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> M-x r m a
i l - r e d e <tab> <return> i s o - 8 8 5 9 - 1 5
<return> M-x r e o i <backspace> <backspace> p o <tab>
r <tab> <return>
Recent messages:
Counting new messages...done (1)
Wrote /home/david/RMAIL
1 new message read
Counting messages...done
(No new mail has arrived)
Getting mail from /home/david/mbox.iso-8859-15...
Counting new messages...done (1)
Wrote /home/david/RMAIL
1 new message read
Making completion list...
-------------------------------------------------------
[-- Attachment #2: 23.0.60; rmail initial decoding of iso-8859-15 [quoted-printable at least] mail --]
[-- Type: text/plain, Size: 2945 bytes --]
[-- Attachment #3: mbox.iso-8859-1 --]
[-- Type: application/mbox, Size: 907 bytes --]
[-- Attachment #4: mbox.iso-8859-15 --]
[-- Type: application/mbox, Size: 911 bytes --]
* Re: 23.0.60; rmail initial misdecoding of iso-8859-15 mail (quoted-printable at least)
From: David Golden @ 2008-01-23 2:38 UTC
To: emacs-pretest-bug
[Apologies for doubling up in that last message; I somehow managed
to send an old version of the message body as an additional attachment.]
I have made some progress on debugging this, I think:
rmail-convert-to-babyl-format calls mail-unquote-printable-region [1]
with the "unibyte" argument set to t. That is supposed to make it insert
the decoded characters as unibyte, probably mostly for the benefit of
the decoding rmail does after unquoting (see the docstring of
mail-unquote-printable-region). However, mail-unquote-printable-region
apparently expects insert-char to do this for it [2]. These days it
apparently doesn't: insert-char only inserts a unibyte char if the
buffer is in unibyte mode...
insert-byte exists, so would the immediate fix be to change insert-char
to insert-byte in [2]?
[1] line 1990 of rmail.el
...
(if quoted-printable-header-field-end
(save-excursion
(unless
===> (mail-unquote-printable-region header-end (point) nil t t)
(message "Malformed MIME quoted-printable message")
...
[2] line 147 of mail-utils.el
...
(if unibyte
(progn
(replace-match "")
===> ;; insert-char will insert this as unibyte,
(insert-char char 1))
(replace-match (make-string 1 char) t t))))
....
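As a rough model of the suspected failure (written in Python, not Emacs
Lisp, and not taken from rmail.el or mail-utils.el), the difference
between keeping unquoted bytes raw and promoting each byte to a character
of the same code point can be sketched as follows. The sample body and
the variable names are made up for illustration; treating the insert-char
behaviour as "byte value becomes the character with that code point" is
an assumption based on the symptom reported above.

```python
import quopri

# Made-up quoted-printable body; =A4 is the byte in question.
body_qp = b"Price: =A4100"

# Step 1: undo quoted-printable. The result is raw bytes, no charset yet.
raw = quopri.decodestring(body_qp)

# Intended pipeline: keep the bytes raw, then decode with the declared
# charset (iso-8859-15 in the reported mail).
intended = raw.decode("iso-8859-15")

# Suspected faulty pipeline: each byte is turned into the character with
# the same code point before any charset decoding happens, which is
# equivalent to decoding as iso-8859-1.
promoted = "".join(chr(byte) for byte in raw)

print(intended)
print(promoted)
print(promoted == raw.decode("iso-8859-1"))
```

In this model the fix corresponds to never letting the intermediate
result leave the byte domain, which is what switching to insert-byte
would aim for.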