From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 3F069431FBC for ; Thu, 23 Feb 2012 20:34:09 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: 1.146 X-Spam-Level: * X-Spam-Status: No, score=1.146 tagged_above=-999 required=5 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_BL_SPAMCOP_NET=1.246, RCVD_IN_DNSWL_NONE=-0.0001] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VmHE3kXSc61e for ; Thu, 23 Feb 2012 20:34:08 -0800 (PST) X-Greylist: delayed 440 seconds by postgrey-1.32 at olra; Thu, 23 Feb 2012 20:34:08 PST Received: from forward12.mail.yandex.net (forward12.mail.yandex.net [95.108.130.94]) by olra.theworths.org (Postfix) with ESMTP id 7946B431FAE for ; Thu, 23 Feb 2012 20:34:08 -0800 (PST) Received: from smtp14.mail.yandex.net (smtp14.mail.yandex.net [95.108.131.192]) by forward12.mail.yandex.net (Yandex) with ESMTP id 38CAAC223F4 for ; Fri, 24 Feb 2012 08:26:42 +0400 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1330057602; bh=iOGfT2kHNwEjlBccPL3bXGnX/59XyH6lboabtYqnOxA=; h=Content-Type:MIME-Version:Content-Transfer-Encoding:From:To: References:In-Reply-To:Message-ID:Subject:Date; b=ri4ZfO+orp+XSGBBN18MTlTAwCWECLkyrjwCv4/FMT47Ft2+TiE+gF0pJE24DChK4 lgVmeLVYouJ4xoATzPoNGmXaDDQYBcIZ0pOWTG+Z1eISCt9QcZMCkNZ/Qb4aYXKB3Z dzPMg6wYnPcnE9FASNQuLAt+j90+E81gC7yLv0Ik= Received: from smtp14.mail.yandex.net (localhost [127.0.0.1]) by smtp14.mail.yandex.net (Yandex) with ESMTP id 1BA6C1B60018 for ; Fri, 24 Feb 2012 08:26:42 +0400 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1330057602; bh=iOGfT2kHNwEjlBccPL3bXGnX/59XyH6lboabtYqnOxA=; h=Content-Type:MIME-Version:Content-Transfer-Encoding:From:To: 
References:In-Reply-To:Message-ID:Subject:Date; b=ri4ZfO+orp+XSGBBN18MTlTAwCWECLkyrjwCv4/FMT47Ft2+TiE+gF0pJE24DChK4 lgVmeLVYouJ4xoATzPoNGmXaDDQYBcIZ0pOWTG+Z1eISCt9QcZMCkNZ/Qb4aYXKB3Z dzPMg6wYnPcnE9FASNQuLAt+j90+E81gC7yLv0Ik= Received: from host-158-152-66-217.spbmts.ru (host-158-152-66-217.spbmts.ru [217.66.152.158]) by smtp14.mail.yandex.net (nwsmtp/Yandex) with ESMTP id QcLWeVa9-QeLigpvl; Fri, 24 Feb 2012 08:26:40 +0400 X-Yandex-Spam: 1 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable From: Serge Z User-Agent: alot/0.21+ To: notmuch@notmuchmail.org References: <877gzd5axk.fsf@steelpick.2x.cz> <1330043595-22054-1-git-send-email-sojkam1@fel.cvut.cz> In-Reply-To: <1330043595-22054-1-git-send-email-sojkam1@fel.cvut.cz> Message-ID: <20120224042925.2870.87924@localhost> Subject: Re: [PATCH] test: Add test for searching of uncommonly encoded messages Date: Fri, 24 Feb 2012 08:29:25 +0400 X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Feb 2012 04:34:09 -0000

Quoting Michal Sojka (2012-02-24 04:33:15)
>Emails that are encoded differently than as ASCII or UTF-8 are not
>indexed properly by notmuch. It is not possible to search for non-ASCII
>words within those messages.

Ok. But we can preprocess each incoming message right after 'getmail' to convert it from HTML to text and to UTF-8 encoding. One solution is to create a separate script for this and make getmail pipe all messages to this script, and then to notmuch.

But it would be better if the maildir contained original messages only, so the question is: can we make the notmuch indexing engine index a preprocessed message while the maildir contains the original message, as it was obtained?
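
To make the preprocessing idea concrete, here is a rough sketch of the charset part of such a script, assuming Python 3 and only its standard email package. The function name to_utf8 is made up for illustration, HTML-to-text conversion is left out, and a message declaring a charset Python does not know would raise an error. It does not answer the question of how notmuch could index the converted copy while the maildir keeps the original.

```python
from email import message_from_bytes, policy


def to_utf8(raw: bytes) -> bytes:
    """Return a copy of the message with inline text/* parts re-encoded as UTF-8.

    Hypothetical helper, not part of notmuch or getmail.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        # Leave multipart containers, binary parts and attachments untouched.
        if part.get_content_maintype() != "text" or part.is_attachment():
            continue
        subtype = part.get_content_subtype()
        # get_content() decodes the body using the charset the part declares.
        text = part.get_content()
        # set_content() replaces the Content-* headers and the body of this
        # part; the other headers (From, Subject, ...) are kept.
        part.set_content(text, subtype=subtype, charset="utf-8")
    return msg.as_bytes()
```

A filter built on this would read the raw message from sys.stdin.buffer, pass it through to_utf8, and write the result to sys.stdout.buffer, so that getmail could pipe each message through it before delivery.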