From: Jani Nikula
To: Austin Clements, Andrei Popescu
Subject: Re: Partial words on notmuch search?
In-Reply-To: <20120117023431.GF16740@mit.edu>
References: <20120115220600.GO7037@think.nuvreauspam> <877h0sa207.fsf@fester.com> <20120116202103.GA14329@think.nuvreauspam> <20120117023431.GF16740@mit.edu>
User-Agent: Notmuch/0.11+76~g1de742d (http://notmuchmail.org) Emacs/23.3.1 (i686-pc-linux-gnu)
Date: Tue, 17 Jan 2012 19:43:54 +0200
Message-ID: <87aa5mkyw5.fsf@nikula.org>
Cc: notmuch@notmuchmail.org

On Mon, 16 Jan 2012 21:34:31 -0500, Austin Clements wrote:
> Quoth Andrei Popescu on Jan 16 at 10:21 pm:
> > This is also interesting:
> >
> > $ notmuch count 'debian'
> > 65888
> > $ notmuch count 'dEbian'
> > 65888
> > $ notmuch count 'Debian'
> > 65887
>
> The first two will match stemmed versions of "debian" such as
> "debian's" and "debianed".  However, starting a term with a capital
> letter suppresses stemming (because it suggests that it's a name,
> which you wouldn't want to modify), so your last query matches only
> the term "debian".  This is probably documented somewhere, though I
> don't know where.

Interesting. Is this done when adding the terms to the database, or when
searching? I presume the latter.

How much control does notmuch have over this? The assumption that one
wouldn't want to have stemming for names is very much language
dependent. [1]

BR,
Jani.

[1] http://en.wikipedia.org/wiki/Finnish_noun_cases (the same works for
names as well as nouns)
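
For illustration, the rule Austin describes (a leading capital letter
suppresses stemming, while matching stays case-insensitive) can be
sketched roughly as below. This is only a toy under stated assumptions,
not notmuch's or Xapian's actual code: toy_stem and query_term are
hypothetical names, and the suffix list is invented for the example.

```python
from typing import Tuple


def toy_stem(word: str) -> str:
    """Hypothetical stemmer: strips a few English suffixes (assumption)."""
    for suffix in ("'s", "ed", "s"):
        # Only strip when a reasonably long stem would remain.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def query_term(word: str) -> Tuple[str, bool]:
    """Return (term, was_stemmed) for a single query word.

    A word starting with a capital letter is treated as a name and is
    not stemmed; in both cases the term itself is lowercased.
    """
    if word[:1].isupper():
        return word.lower(), False
    return toy_stem(word.lower()), True


if __name__ == "__main__":
    for w in ("debian", "dEbian", "Debian", "debianed", "Debianed"):
        print(w, query_term(w))
```

Under this toy rule 'debian' and 'dEbian' take the same stemmed path,
while 'Debian' is looked up as the exact term only, which would be
consistent with the slightly smaller count above.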