From: Austin Clements <amdragon@MIT.EDU>
To: Jani Nikula <jani@nikula.org>
Cc: notmuch@notmuchmail.org, Andrei Popescu <andreimpopescu@gmail.com>
Subject: Re: Partial words on notmuch search?
Date: Tue, 17 Jan 2012 14:47:15 -0500
Message-ID: <20120117194715.GO16740@mit.edu>
In-Reply-To: <87aa5mkyw5.fsf@nikula.org>

Quoth Jani Nikula on Jan 17 at  7:43 pm:
> On Mon, 16 Jan 2012 21:34:31 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > Quoth Andrei Popescu on Jan 16 at 10:21 pm:
> > > This is also interesting:
> > > $ notmuch count 'debian'
> > > 65888
> > > $ notmuch count 'dEbian'
> > > 65888
> > > $ notmuch count 'Debian'
> > > 65887
> > 
> > The first two will match stemmed versions of "debian" such as
> > "debian's" and "debianed".  However, starting a term with a capital
> > letter suppresses stemming (because it suggests that it's a name,
> > which you wouldn't want to modify), so your last query matches only
> > the term "debian".  This is probably documented somewhere, though I
> > don't know where.
> 
> Interesting. Is this done when adding the terms to the database, or when
> searching? I presume the latter. How much control does notmuch have over
> this?

This is getting a bit out of my depth, but I believe indexing is done
with both stemmed and unstemmed versions of all terms (if stemming is
enabled) so that search can use either.
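
A toy sketch of that indexing scheme, in case it helps make it concrete.
Each word is stored twice: once unstemmed (lowercased) and once stemmed.
The "Z" prefix on stemmed terms follows what I understand Xapian's
convention to be. Everything else is an assumption for illustration:
`toy_stem` and `index_terms` are made-up names, and the crude suffix
stripping stands in for a real Snowball stemmer.

```python
import re

# Hypothetical suffix list; a real stemmer is far more careful.
SUFFIXES = ("'s", "ing", "ed", "s")


def toy_stem(word):
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text, stemming=True):
    """Return the set of index terms generated for a piece of text."""
    terms = set()
    for word in re.findall(r"[A-Za-z']+", text):
        word = word.lower().strip("'")
        if not word:
            continue
        terms.add(word)  # unstemmed form, always indexed
        if stemming:
            terms.add("Z" + toy_stem(word))  # stemmed form, prefixed
    return terms


print(sorted(index_terms("Debian's packages")))
```

With stemming disabled, only the unstemmed forms are stored, so a later
search has nothing stemmed to match against.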

For indexing, Notmuch can set the stemmer (or no stemmer).  Xapian
provides stemmers for a variety of languages:
  http://xapian.org/docs/apidoc/html/classXapian_1_1Stem.html#6c46cedf2047b159a7e4c9d4468242b1

For query parsing, Notmuch can set both the stemmer and a "stemming
strategy" that controls when it stems or doesn't stem terms:
  http://xapian.org/docs/apidoc/html/classXapian_1_1QueryParser.html#c7dc3b55b6083bd3ff98fc8b2726c8fd
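
To illustrate what a stemming strategy does on the query side, here is a
toy model, not Xapian's actual code. In the real API the relevant calls
are `Xapian::QueryParser::set_stemmer` and `set_stemming_strategy` (with
`STEM_NONE`, `STEM_SOME`, `STEM_ALL`). The sketch below only imitates
the "capitalized term suppresses stemming" rule discussed earlier in the
thread. `toy_stem`, `query_term`, `count`, the strategy strings, and the
three hand-written documents are all assumptions for illustration.

```python
# Hypothetical suffix list; a stand-in for a real stemmer.
SUFFIXES = ("'s", "ing", "ed", "s")


def toy_stem(word):
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def query_term(word, strategy="some"):
    """Map one query word to the index term that will be looked up."""
    if strategy == "none":
        return word.lower()
    if strategy == "some" and word[:1].isupper():
        # A leading capital suggests a name, so leave it unstemmed.
        return word.lower()
    return "Z" + toy_stem(word.lower())


def count(docs, word, strategy="some"):
    """Count documents whose term set contains the looked-up term."""
    term = query_term(word, strategy)
    return sum(1 for terms in docs if term in terms)


# Three pretend messages, each indexed with unstemmed and "Z"-stemmed terms.
docs = [
    {"debian", "Zdebian"},
    {"debian's", "Zdebian"},
    {"debianed", "Zdebian"},
]

print(count(docs, "debian"))  # 3
print(count(docs, "dEbian"))  # 3
print(count(docs, "Debian"))  # 1
```

Only the first character matters in this model, which is why 'dEbian'
behaves like 'debian' while 'Debian' matches fewer messages.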
