* 23.0.60; rmail initial misdecoding of iso-8859-15 mail (quoted-printable at least)
@ 2008-01-19  0:02 David Golden
  2008-01-23  2:38 ` David Golden
From: David Golden @ 2008-01-19  0:02 UTC
  To: emacs-pretest-bug

[-- Attachment #1: Type: text/plain, Size: 3386 bytes --]

In rmail, fetching iso-8859-15 mail isn't working right, at least for
mail of this format:
Content-Type: text/plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: quoted-printable

...

To reproduce (only emacs-unicode-2 branch tested):

Take an email declared as iso-8859-15 that contains iso-8859-15
characters (see the included example unix mailbox mbox.iso-8859-15) and
have rmail fetch it (C-u g mbox.iso-8859-15).

The mail is misdecoded: the euro currency character (byte #xA4) becomes
the international currency symbol, as it would in iso-8859-1.

However, the coding system recorded for the mail is still iso-8859-15,
so the decoded mail now contains a character that is invalid in
iso-8859-15. If you then run e.g. rmail-redecode-body, the international
currency symbol is neither preserved nor fixed but lost, quite probably
because re-encoding back to iso-8859-15 (rightly) can't handle the
international currency symbol.
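
As a rough illustration of the two behaviours described above, here is a
sketch in Python rather than Emacs Lisp. It uses only Python's standard
codecs and does not exercise any rmail code; the variable names are made
up for the example.

```python
# Byte 0xA4 is the euro sign in ISO-8859-15 but the international
# currency sign in ISO-8859-1.
raw = b"\xa4"

as_latin9 = raw.decode("iso-8859-15")   # the intended decoding
as_latin1 = raw.decode("iso-8859-1")    # what the misdecoding amounts to

# U+00A4 has no encoding in ISO-8859-15, so re-encoding the misdecoded
# text raises an error (or loses the character if errors are replaced).
try:
    as_latin1.encode("iso-8859-15")
    reencode_ok = True
except UnicodeEncodeError:
    reencode_ok = False

lossy = as_latin1.encode("iso-8859-15", errors="replace")
```

The last line mirrors the "lost" outcome: with replacement enabled, the
currency sign comes out as a placeholder byte rather than as #xA4.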

Note that starting from iso-8859-1 doesn't exhibit this behaviour:
you can fetch an iso-8859-1 mail with the international currency
character, rmail-redecode-body to iso-8859-15, and it (correctly)
becomes the euro sign; rmail-redecode-body back to iso-8859-1, and it
(correctly) becomes the international currency sign again.
(Try C-u g on the also attached mbox.iso-8859-1.)
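
The working round trip from iso-8859-1 can be modelled in a few lines of
Python. This assumes (not verified against the rmail source) that a
redecode amounts to encoding the text with the old coding system and
decoding the resulting bytes with the new one. `redecode` is a
hypothetical helper for this sketch, not an rmail function.

```python
def redecode(text: str, old: str, new: str) -> str:
    """Hypothetical model of a redecode: encode with OLD, decode with NEW."""
    return text.encode(old).decode(new)

# Start from a correctly decoded iso-8859-1 mail containing byte 0xA4.
start = b"\xa4".decode("iso-8859-1")                 # currency sign
euro = redecode(start, "iso-8859-1", "iso-8859-15")  # becomes the euro sign
back = redecode(euro, "iso-8859-15", "iso-8859-1")   # currency sign again
```

Every step here is lossless because each character is valid in the
coding system it is encoded with, which is exactly what fails in the
iso-8859-15 case.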

...
In GNU Emacs 23.0.60.2 (i686-pc-linux-gnu, GTK+ Version 2.12.3)
 of 2008-01-18 on golden1
Windowing system distributor `The X.Org Foundation', version 
11.0.10400000
configured using 
`configure  '--with-xpm' '--with-jpeg' '--with-tiff' '--with-gif' '--with-png' '--with-freetype' '--with-xft' '--with-gtk' '--enable-font-backend' '--prefix=/home/david/emacs''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_IE.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: RMAIL

Minor modes in effect:
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-compression-mode: t
  line-number-mode: t

Recent input:
<help-echo> M-x r m a i l <return> C-u g m b o x . 
i s o - 8 8 <tab> <return> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> M-x 
r m a i l - r e d <tab> <return> i s o - 8 8 5 9 - 
1 5 <return> M-x r m a i l - r e d e c o <tab> <return> 
i s o - 8 8 5 9 - 1 <return> C-x k <return> y e s <return> 
<help-echo> <help-echo> <help-echo> <help-echo> M-x 
r m n <backspace> a i l <return> C-u g m b o x . i 
s o <tab> 5 <return> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> M-x r m a 
i l - r e d e <tab> <return> i s o - 8 8 5 9 - 1 5 
<return> M-x r e o i <backspace> <backspace> p o <tab> 
r <tab> <return>

Recent messages:
Counting new messages...done (1)
Wrote /home/david/RMAIL
1 new message read
Counting messages...done
(No new mail has arrived)
Getting mail from /home/david/mbox.iso-8859-15...
Counting new messages...done (1)
Wrote /home/david/RMAIL
1 new message read
Making completion list...


-------------------------------------------------------

[-- Attachment #2: 23.0.60; rmail initial decoding of iso-8859-15 [quoted-printable at least] mail --]
[-- Type: text/plain, Size: 2945 bytes --]

Take an iso-8859-15 email with iso-8859-15 characters
(see the included example unix mailbox mbox.iso-8859-15) and
have rmail fetch it (C-u g mbox.iso-8859-15).

It is misdecoded: the euro currency character becomes the
international currency symbol (as it would in iso-8859-1).
However, the coding system for the mail is still iso-8859-15,
so the decoded mail now has a character invalid for
iso-8859-15 in it. If you then run rmail-redecode-body,
the international currency symbol is neither preserved nor
fixed but lost.

Note that starting from iso-8859-1 doesn't exhibit this behaviour:
you can fetch an iso-8859-1 mail with the international currency
character, rmail-redecode-body to iso-8859-15, and it (correctly) becomes
the euro sign; rmail-redecode-body back to iso-8859-1, and it (correctly)
becomes the international currency sign again.
(Try C-u g on the also attached mbox.iso-8859-1.)



[-- Attachment #3: mbox.iso-8859-1 --]
[-- Type: application/mbox, Size: 907 bytes --]

[-- Attachment #4: mbox.iso-8859-15 --]
[-- Type: application/mbox, Size: 911 bytes --]

[-- Attachment #5: Type: text/plain, Size: 142 bytes --]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel


* Re: 23.0.60; rmail initial misdecoding of iso-8859-15 mail (quoted-printable at least)
From: David Golden @ 2008-01-23  2:38 UTC
  To: emacs-pretest-bug

[apologies for doubling up in that last message; somehow managed
to send an old version of the message body as an additional attachment]

I have made some progress on debugging this, I think:

rmail-convert-to-babyl-format calls mail-unquote-printable-region [1]
with the "unibyte" argument set to t. That is supposed to cause the
decoded characters to be inserted as unibyte, probably mostly for the
benefit of the decoding rmail does after unquoting (see the docstring of
mail-unquote-printable-region). However, mail-unquote-printable-region
apparently assumes insert-char will do this for it [2]. These days it
apparently doesn't: insert-char only inserts a unibyte char if the
buffer is in unibyte mode...

insert-byte exists, so would the immediate fix be to change insert-char
to insert-byte in [2]?



[1] line 1990 of rmail.el

...
(if quoted-printable-header-field-end
		       (save-excursion
			 (unless
===>			     (mail-unquote-printable-region header-end (point) nil t t)
			   (message "Malformed MIME quoted-printable message")
...


[2] line 147 of mail-utils.el
...
		     (if unibyte
			 (progn
			   (replace-match "")
===>			   ;; insert-char will insert this as unibyte,
			   (insert-char char 1))
		       (replace-match (make-string 1 char) t t))))
....
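
A loose analogy for the suspected failure, sketched in Python rather
than Emacs Lisp. The assumption here is that inserting character code
164 into a multibyte buffer behaves like treating the byte value as a
code point (U+00A4), whereas inserting a raw byte leaves it for the
later charset decoding. The sample body and variable names are
illustrative only.

```python
import quopri

# A quoted-printable body in which =A4 stands for byte 0xA4.
body = b"Price: 10 =A4"
raw = quopri.decodestring(body)

# Intended path: keep the unquoted data as bytes, then decode it with
# the charset the mail declares.
intended = raw.decode("iso-8859-15")

# Analogue of the suspected bug: each byte value is turned into the
# character with that code point before the charset is ever consulted.
suspected = "".join(chr(b) for b in raw)
```

In this model the second path yields the international currency sign
regardless of the declared charset, which matches the symptom reported
in the first message.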
