From: Jani Nikula
To: David Bremner, Franz Fellner,
 notmuch@notmuchmail.org
Subject: Re: UTF-8 in mail headers (namely FROM) sent by bugzilla
In-Reply-To: <87y58xv71x.fsf@zancas.localnet>
References: <08cb1dcd-c5db-4e33-8b09-7730cb3d59a2@gmail.com>
 <871u6psjwr.fsf@ericabrahamsen.net>
 <5712cc41-d0ce-4ed3-af1c-37cf639dd9c0@gmail.com>
 <87y58xv71x.fsf@zancas.localnet>
User-Agent: Notmuch/0.15.2+177~gb1ba76c (http://notmuchmail.org) Emacs/23.2.1 (x86_64-pc-linux-gnu)
Date: Fri, 26 Jul 2013 12:16:21 +0200
Message-ID: <87d2q5wrre.fsf@nikula.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: "Use and development of the notmuch mail system."

On Tue, 23 Jul 2013, David Bremner wrote:
> Franz Fellner writes:
>
>>
>> OK, thx. So every app needs to get patched to display those strings
>> properly? Any chance this could be done directly in libnotmuch? I
>> grepped for "2047" inside te "emacs" subtree, but found nothing (had
>> the hope for a comment for the workaround). Would be interesting to
>> see how this is done, so I can at least try to create a patch (though
>> my ruby is quite basic).
>
> In general notmuch relies on libgmime for rfc2047 parsing. I'm not sure
> of all the details now, but some of the filtering does happen in the
> CLI, not the lib. You could start by looking at
> gmime-filter-headers.[ch] in the top directory.

I'm experiencing a similar problem with the Subject: headers in bugzilla
mail. Per RFC 2047,

   Ordinary ASCII text and 'encoded-word's may appear together in the
   same header field.  However, an 'encoded-word' that appears in a
   header field defined as '*text' MUST be separated from any adjacent
   'encoded-word' or 'text' by 'linear-white-space'.

In the problematic mails, the encoded-word begins immediately after
preceding text, i.e. without linear-white-space. Manually adding that
space in the message file makes the subject display as expected.

The decoding is done in the cli using g_mime_message_get_subject(). I'm
not sure if there's much that can be done about it within notmuch.

BR,
Jani.
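
P.S. For illustration only, here is a rough Python sketch of the manual fix described above: it looks for an RFC 2047 encoded-word that is glued to the text before it and inserts the missing space. The function name, the regular expression and the sample subject are all made up for this example; none of this is notmuch or GMime code, and it only handles the leading side of the encoded-word.

```python
import re

# Approximate shape of an RFC 2047 encoded-word:
#   =?charset?encoding?encoded-text?=
# (hypothetical pattern for this sketch, not taken from notmuch or GMime)
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def separate_encoded_words(value):
    """Insert a space before any encoded-word that directly follows text."""
    out = []
    pos = 0
    for m in ENCODED_WORD.finditer(value):
        out.append(value[pos:m.start()])
        # Add a space only when something precedes the encoded-word
        # and that something does not already end in whitespace.
        if m.start() > 0 and not value[m.start() - 1].isspace():
            out.append(" ")
        out.append(m.group(0))
        pos = m.end()
    out.append(value[pos:])
    return "".join(out)


# Made-up subject in the style of the problematic mails.
broken = "Bug 123:=?UTF-8?Q?caf=C3=A9?= crash"
fixed = separate_encoded_words(broken)
```

The fixed value could then be handed to whatever RFC 2047 decoder is in use. Headers that already follow the rule pass through unchanged.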