From: Sebastian Poeplau
To: David Bremner, notmuch@notmuchmail.org
Subject: Re: Handling mislabeled emails encoded with Windows-1252
In-Reply-To: <8736w91jz0.fsf@tethera.net>
References: <87lgaeat37.fsf@eurecom.fr> <8736w91jz0.fsf@tethera.net>
Date: Tue, 24 Jul 2018 10:00:55 +0200
Message-ID: <87o9exyseg.fsf@eurecom.fr>

Hi David,

> Everyone's mail situation is unique, but I haven't noticed this
> problem. Do you have a mechanical (e.g. scripted) way of detecting such
> mails? I suppose it could just look for characters in the range 0x80 to
> 0x95 in allegedly ISO_8859-1 messages. A census of the situation in my
> own mail would help me think about this problem, I think.

Yes, I guess that should be a good enough heuristic for detecting
affected mail. I'll try to come up with a simple script and post it
here.

Cheers,
Sebastian
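
For illustration only, here is a minimal sketch of the heuristic discussed above. It is not the script promised in this message. It flags text parts that declare ISO-8859-1 but contain bytes in the quoted range 0x80 to 0x95, where ISO-8859-1 has only C1 control codes and Windows-1252 has printable characters. The names `is_probably_mislabeled`, `SUSPECT_LOW`, and `SUSPECT_HIGH` are hypothetical, and the set of accepted charset spellings is an assumption.

```python
import email

# Byte range quoted in the thread. Assumption: Windows-1252 assigns printable
# characters up to 0x9F, so SUSPECT_HIGH could be widened to 0x9F.
SUSPECT_LOW, SUSPECT_HIGH = 0x80, 0x95

# Assumed spellings of the declared charset, after lowercasing and
# replacing "_" with "-".
LATIN1_NAMES = {"iso-8859-1", "latin-1", "latin1"}


def is_probably_mislabeled(raw_message: bytes) -> bool:
    """Return True if a text part labeled ISO-8859-1 contains suspect bytes."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        charset = (part.get_content_charset() or "").lower().replace("_", "-")
        if charset not in LATIN1_NAMES:
            continue
        # decode=True undoes quoted-printable/base64 and returns raw bytes.
        payload = part.get_payload(decode=True)
        if payload and any(SUSPECT_LOW <= b <= SUSPECT_HIGH for b in payload):
            return True
    return False
```

To take a census, one could call this on the bytes of each file in a mail store and count the hits. Message files that fail to parse are not handled in this sketch.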