From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michal Sojka
To: Serge Z, notmuch@notmuchmail.org
Subject: Re: [PATCH] test: Add test for searching of uncommonly encoded messages
In-Reply-To: <20120224075700.13214.28221@localhost>
References: <877gzd5axk.fsf@steelpick.2x.cz>
 <1330043595-22054-1-git-send-email-sojkam1@fel.cvut.cz>
 <20120224042925.2870.87924@localhost>
 <874nug67il.fsf@steelpick.2x.cz>
 <20120224075700.13214.28221@localhost>
User-Agent: Notmuch/0.11.1+239~g2e86bb7 (http://notmuchmail.org) Emacs/23.3.1 (x86_64-pc-linux-gnu)
Date: Fri, 24 Feb 2012 09:38:31 +0100
Message-ID: <874nugiq2g.fsf@steelpick.2x.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: "Use and development of the notmuch mail system."

On Fri, 24 Feb 2012, Serge Z wrote:
>
> Quoting Michal Sojka (2012-02-24 11:00:02)
> >I'm not a big fan of adding a "preprocessor". First, I think that both
> >reasons you mention are actually bugs and it would be better to fix them
> >for everybody than to require each user to configure some preprocessor.
> >Second, depending on what your preprocessor does and how, the
> >initial mail indexing could be way slower, which is also nothing that
> >people want.
> >
> >Do you have any other use case for the preprocessor besides utf8 and
> >html->text conversions?
> >
> >Cheers,
> >-Michal
>
> Well, I don't want to add any external preprocessor either.
>
> This may be considered an architectural decision: the search engine should
> not access messages directly, but through some preprocessing layer which
> would handle the case of different encodings in body and headers,
> RFC2047-encoded headers (if this is not handled yet) etc.
>
> Anyway, imho it would be nice for this solution to be contained in a
> separate library

Yes, this library is called gmime and notmuch already makes use of it.

-Michal
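
For illustration only, here is a minimal sketch of the kind of decoding
layer discussed above: it decodes an RFC 2047-encoded Subject header and
converts a body in a non-UTF-8 charset to Unicode text. It uses Python's
standard "email" package, not gmime, and it is not notmuch code; the
function names build_sample() and decode_message() and the sample message
are assumptions made up for this example. It assumes a single-part
text/plain message.

```python
from email import message_from_bytes
from email.header import Header, decode_header, make_header
from typing import Tuple


def build_sample() -> bytes:
    """Build a hypothetical ISO-8859-2 message (Czech sample text)."""
    # RFC 2047 encoded-word for the Subject header.
    subject = Header("P\u0159\u00edli\u0161", "iso-8859-2").encode()
    # Raw 8-bit body in the same legacy charset.
    body = "k\u016f\u0148".encode("iso-8859-2")
    headers = (
        "Subject: " + subject + "\r\n"
        "MIME-Version: 1.0\r\n"
        "Content-Type: text/plain; charset=iso-8859-2\r\n"
        "Content-Transfer-Encoding: 8bit\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body


def decode_message(raw: bytes) -> Tuple[str, str]:
    """Return (subject, body) as Unicode strings.

    Assumes a non-multipart text message; a real indexer would also
    have to walk MIME parts and cope with unknown charsets.
    """
    msg = message_from_bytes(raw)
    # Decode RFC 2047 encoded-words in the header.
    subject = str(make_header(decode_header(msg.get("Subject", ""))))
    # Undo the transfer encoding, then decode using the declared charset.
    payload = msg.get_payload(decode=True) or b""
    charset = msg.get_content_charset() or "us-ascii"
    body = payload.decode(charset, errors="replace")
    return subject, body


if __name__ == "__main__":
    subject, body = decode_message(build_sample())
    # Print as UTF-8 bytes so the output does not depend on the locale.
    print(subject.encode("utf-8"), body.encode("utf-8"))
```

An indexer built this way only ever sees Unicode text, whatever charset
the sender used; that is the role gmime plays in the reply above.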