From: Austin Clements <aclements@csail.mit.edu>
To: Carl Worth <cworth@cworth.org>
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH 0/5] lib: make folder: prefix literal
Date: Wed, 29 Jan 2014 15:46:09 -0500
Message-ID: <20140129204608.GE4375@mit.edu>
In-Reply-To: <87k3dir3ci.fsf@yoom.home.cworth.org>

Quoth Carl Worth on Jan 29 at 11:32 am:
> Jani Nikula <jani@nikula.org> writes:
> > Unfortunately, I haven't had the time to experiment with this. But it
> > bugs me that the probabilistic folder: prefix has stemming and it's case
> > insensitive. It's possible to work around the stemming with the anchors
> > you suggest or by quoting, but is there a way to have case sensitive
> > probabilistic prefixes?
> 
> The stemming and case insensitivity just has to do with which terms are
> shoved into the database, (you have to add extra terms to get these
> features). If we're getting those features for folder now, (and I agree
> that we don't want them), it's because we're calling some Xapian
> convenience function along the lines of "create a bunch of terms for
> this chunk of text".
> 
> The fix for that is to do the simple thing and simply break the path at
> each '/' and add a term for each component. Then these problems all go
> away.

I think you're assuming we have much more control over this than we
do.  It's true that we're using Xapian::TermGenerator for this, which
is what folds case and stems terms (and removes any punctuation like
$ or ^), but Xapian's current query parser only gives us two options
for a prefix: either don't parse its terms at all (boolean terms), or
parse them using TermGenerator (probabilistic terms).  We can index
these terms however we want, but there's simply no hook into the query
parser that would let us split the query at each '/' at search time.
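
To make the index-side half of that concrete, here is a minimal sketch
of "break the path at each '/' and add a term for each component".  It
is plain C++ with no Xapian calls; the folder_terms name and the
"XFOLDER:" prefix string are assumptions made up for illustration, not
what notmuch does today.  It only shows that the index side is easy to
control, since case and punctuation survive untouched.

```cpp
#include <string>
#include <vector>

// Hypothetical term prefix, chosen for illustration only.
static const std::string kFolderPrefix = "XFOLDER:";

// Split `path` at each '/' and return one prefixed term per non-empty
// component.  Case and punctuation are preserved; nothing is stemmed.
std::vector<std::string> folder_terms(const std::string &path)
{
    std::vector<std::string> terms;
    std::string::size_type start = 0;
    while (start <= path.size()) {
        std::string::size_type end = path.find('/', start);
        if (end == std::string::npos)
            end = path.size();
        // Skip empty components from leading, trailing or doubled '/'.
        if (end > start)
            terms.push_back(kFolderPrefix + path.substr(start, end - start));
        start = end + 1;
    }
    return terms;
}
```

Calling folder_terms("Lists/Notmuch-Dev") would yield the two terms
"XFOLDER:Lists" and "XFOLDER:Notmuch-Dev".  The search side is the
missing piece: nothing comparable can be plugged into the query parser
to split the user's query string the same way.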

> So fixes for this should not require switching from a probabilistic to a
> Boolean prefix.