unofficial mirror of notmuch@notmuchmail.org
From: Ethan Glasser-Camp <ethan.glasser.camp@gmail.com>
To: Tomi Ollila <tomi.ollila@iki.fi>,
	Michael Stapelberg <michael+nm@stapelberg.de>,
	notmuch@notmuchmail.org
Subject: Re: [BUG] Saving attachments containing UTF-8 chars
Date: Sun, 18 Nov 2012 00:29:31 -0500
Message-ID: <87zk2f1nt0.fsf@betacantrips.com>
In-Reply-To: <m2pq416xzl.fsf@guru.guru-group.fi>

Tomi Ollila <tomi.ollila@iki.fi> writes:

> I can verify this bug: I copied 'rawmail' to my mail store and attempted
> to 'w' the attachment and got the same result (after notmuch new).
>
> The saving code first does
> notmuch show --format=raw id:"508953E6.70006@gmail.com"
> which decodes OK on command line, and to the buffer when
> kill-buffer is commented out in the (with-current-notmuch-show-message ...)
> macro.

I was able to see this behavior, and Tomi did a good job tracking down
where it was :)

I even see the bytes as presented in the file. When moving point to the
problematic character, and doing M-x describe-char, it says:

          buffer code: #xE2 #x80 #x99
            file code: #xE2 #x80 #x99 (encoded by coding system utf-8)

buffer-file-coding-system is, of course, utf-8.

Writing this buffer using C-x C-w encodes it correctly too. So I think
this is an emacs MIME problem. We call mm-save-part, which calls
mm-save-part-to-file, which calls mm-with-unibyte-buffer. Hmm..

Indeed, it seems that inserting this character into a buffer that's been
marked "unibyte" using (set-buffer-multibyte nil) turns it into the ^Y
character (ASCII code 0x19 -- the character that comes out in the patch
file). There's probably a technical reason for this, but I can't think
of what it would be.
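
One guess, not confirmed anywhere in this thread: the bytes E2 80 99 are
the UTF-8 encoding of U+2019, and 0x19 is exactly the low 8 bits of that
code point, so the unibyte buffer may simply be keeping the low byte of
the character code. A minimal Python sketch of that arithmetic (plain
Python standing in for the idea, not Emacs code; the variable names are
made up for illustration):

```python
# Illustration only: this mimics the suspected truncation, it does not
# call into Emacs or reproduce its actual implementation.

# The three bytes that describe-char reported for the character.
utf8_bytes = b"\xe2\x80\x99"

# Decode them as UTF-8 to recover the character and its code point.
ch = utf8_bytes.decode("utf-8")
codepoint = ord(ch)

# Assumption: a unibyte buffer keeps only the low 8 bits of the code point.
low_byte = codepoint & 0xFF

print(f"U+{codepoint:04X}")   # U+2019
print(f"{low_byte:#04x}")     # 0x19, i.e. ^Y
```

If that is what happens, any character above U+00FF would be mangled the
same way, which would fit the idea of keeping the buffer undecoded from
the start.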

> I attempted a set of trial-&-error tricks to get the attachment
> saved "correctly", and at least this seems to do the trick:
>
> diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el
> index f273eb4..a6a85c0 100644
> --- a/emacs/notmuch-show.el
> +++ b/emacs/notmuch-show.el
> @@ -203,9 +203,11 @@ For example, if you wanted to remove an \"unread\" tag and add a
>       (let ((id (notmuch-show-get-message-id)))
>         (let ((buf (generate-new-buffer (concat "*notmuch-msg-" id "*"))))
>           (with-current-buffer buf
> -	    (call-process notmuch-command nil t nil "show" "--format=raw" id)
> -           ,@body)
> -	 (kill-buffer buf)))))
> +	   (let ((coding-system-for-read 'no-conversion)
> +		 (coding-system-for-write 'no-conversion))
> +	     (call-process notmuch-command nil t nil "show" "--format=raw" id)
> +	     ,@body))))))
> +%%	 (kill-buffer buf)))))
[snip]
> (kill-buffer is outcommented above for testing purposes)
>
> To test this, the macro needs to be evaluated, and then the functions
> using this macro (notmuch-show-save-attachments in this case)
>
> Smart suggestions for proper fix ?

Well, we could limit it just to saving attachments (putting the let
around the with-current-notmuch-show-message). That feels like it could
be right, because intuitively saving an attachment should be done
without any conversions. Or even the above doesn't seem so bad. My vague
feeling is that messages should always be ASCII, or at least mm-* will
interpret it that way, so decoding them into any other character set
might cause problems. Anyone understand character sets?

Ethan
