From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <amdragon@mit.edu>
Received: from localhost (localhost [127.0.0.1])
	by olra.theworths.org (Postfix) with ESMTP id 5FA8F431FB6
	for <notmuch@notmuchmail.org>; Thu, 30 Jan 2014 14:02:47 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.7
X-Spam-Level: 
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
	tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
	by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id gykcxSB7ztQl for <notmuch@notmuchmail.org>;
	Thu, 30 Jan 2014 14:02:41 -0800 (PST)
Received: from dmz-mailsec-scanner-8.mit.edu (dmz-mailsec-scanner-8.mit.edu
	[18.7.68.37])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by olra.theworths.org (Postfix) with ESMTPS id 66F21431FBD
	for <notmuch@notmuchmail.org>; Thu, 30 Jan 2014 14:02:41 -0800 (PST)
X-AuditID: 12074425-f79906d000000cf9-1b-52eacc005076
Received: from mailhub-auth-1.mit.edu ( [18.9.21.35])
	(using TLS with cipher AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by dmz-mailsec-scanner-8.mit.edu (Symantec Messaging Gateway) with SMTP
	id 9A.BD.03321.00CCAE25; Thu, 30 Jan 2014 17:02:40 -0500 (EST)
Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11])
	by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id s0UM2cxg030481; 
	Thu, 30 Jan 2014 17:02:39 -0500
Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
	(authenticated bits=0)
	(User authenticated as amdragon@ATHENA.MIT.EDU)
	by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id s0UM2ZtW008034
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT);
	Thu, 30 Jan 2014 17:02:37 -0500
Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80)
	(envelope-from <amdragon@mit.edu>)
	id 1W8zgx-0007d3-7K; Thu, 30 Jan 2014 17:02:35 -0500
Date: Thu, 30 Jan 2014 17:02:34 -0500
From: Austin Clements <amdragon@MIT.EDU>
To: Jani Nikula <jani@nikula.org>
Subject: Re: [PATCH 0/5] lib: make folder: prefix literal
Message-ID: <20140130220234.GI4375@mit.edu>
References: <cover.1389304779.git.jani@nikula.org>
	<87y525m649.fsf@awakening.csail.mit.edu>
	<87r47wfltb.fsf@nikula.org> <87iot8f4vg.fsf@nikula.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87iot8f4vg.fsf@nikula.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFmpmleLIzCtJLcpLzFFi42IR4hRV1mU48yrIYN9eCYum6c4W12/OZHZg
	8rh1/zW7x7NVt5gDmKK4bFJSczLLUov07RK4Mh7t3M1WsFSyYvpb3wbGbSJdjJwcEgImEid/
	T2eHsMUkLtxbz9bFyMUhJDCbSeLihzcsEM5GRolF77awQjinmSRm/9rCDuEsYZRY23eNCaSf
	RUBVYlnbHVYQm01AQ2Lb/uWMILaIgKLE5pP7wWxmAWmJb7+bweqFBSwlpt6ZAmbzCmhLLLw5
	H2roTEaJGR0XmCESghInZz5hgWjWkrjx7yVQAwfYoOX/OEDCnEC7Vu1vBysXFVCRmHJyG9sE
	RqFZSLpnIemehdC9gJF5FaNsSm6Vbm5iZk5xarJucXJiXl5qka6FXm5miV5qSukmRlBYs7uo
	7mCccEjpEKMAB6MSD++MtFdBQqyJZcWVuYcYJTmYlER53+0CCvEl5adUZiQWZ8QXleakFh9i
	lOBgVhLhfd8PlONNSaysSi3Kh0lJc7AoifPe4rAPEhJITyxJzU5NLUgtgsnKcHAoSfB+OQXU
	KFiUmp5akZaZU4KQZuLgBBnOAzSc6zTI8OKCxNzizHSI/ClGRSlx3h0gzQIgiYzSPLheWNp5
	xSgO9Iow7z+QKh5gyoLrfgU0mAlosFY52OCSRISUVANjoNWDfbO5DLtOLZMPXGwWwn5MTkv9
	7Zy92u8Uzi59kxcn/ueOx/ptW2Y9L394qXan+afM2XzH7zGzKU+4euPWnSMLj77IWyHK6uFw
	7GVpUP3Dh+++v/vmKn7wa97lt3GRO63W7NV/c9NXW2/+LdXLRceM5H43cE1c3cFWeVZmtidz
	bKCDYPxsLSWW4oxEQy3mouJEAGfMYYIWAwAA
Cc: notmuch@notmuchmail.org
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
	<notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
	<mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
	<mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Jan 2014 22:02:47 -0000

Quoth Jani Nikula on Jan 25 at  5:38 pm:
> On Sat, 25 Jan 2014, Jani Nikula <jani@nikula.org> wrote:
> > Perhaps we need to have two prefixes, one of which is the literal
> > filesystem folder and another which hides the implementation details,
> > like I mentioned in my mail to Peter [1]. But consider this: my proposed
> > implementation does cover *all* use cases.
> 
> Here's a thought. With boolean prefix folder:, we can devise a scheme
> where the folder: query defines what is to be matched.
> 
> For example:
> 
> folder:foo	match files in foo, foo/new, and foo/cur.
> folder:foo/	match all files in all subdirectories under foo (this
> 		would handle Tomi's use case), including foo/new and foo/cur.
> folder:foo/.	match in foo only, and specifically not in foo/cur or foo/new.
> folder:foo/new  match in foo/new, and specifically not in foo/cur (this
> 		allows distinguishing between messages in cur and new).
> folder:/	match everything.
> folder:/.	match in top level maildir only.
> folder:""	match in top level maildir, including cur/new.
> 
> This requires indexing all the path components with suitable
> suffixes. For example, a file "foo/new/baz" would get terms "/", "foo",
> "foo/", "foo/new", and "foo/new/.". A file foo/bar would get terms "/",
> "foo", "foo/", and "foo/.".
> 
> It's obviously a concern this increases the database size; not sure how
> it would compare with the current stemmed probabilistic prefix.
> 
> Opinions on this? This would really cover all use cases, and address
> Austin's interface and backward compatibility concerns.

I like this idea in general, though I agree with others that the
specific syntax seems a little wanting.  The concept of adding several
boolean terms seems powerful, and I would be surprised if the extra
terms had any substantive effect on database size.

However, it seems like this is overloading one prefix for two
meanings.  And I think that's because people want two similar but
distinct things.  Several of us want a simple, natural Maildir-aware
folder search (the Maildir folder of "a/b/cur/x:2," is "a/b").  Others
want file system search.  It's easy to conflate these because Maildir
represents folders as directory paths, but maybe they need to be
treated as distinct things.
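
To make the Maildir-aware notion concrete, here is a rough Python
sketch of how a folder could be derived from a message's path.  This
is only an illustration, not notmuch code: the function name
maildir_folder is made up, and it assumes paths are relative to the
top-level mail directory and use "/" as the separator.

```python
import posixpath


def maildir_folder(relpath):
    """Return the Maildir folder for a message file path.

    Hypothetical helper: assumes relpath is relative to the mail root.
    """
    # Directory that contains the message file, e.g. "a/b/cur".
    directory = posixpath.dirname(relpath)
    # Strip a trailing Maildir "cur" or "new" component, if present.
    head, tail = posixpath.split(directory)
    if tail in ("cur", "new"):
        return head
    return directory
```

With this, maildir_folder("a/b/cur/x:2,") gives "a/b", and a file that
is not inside cur or new simply maps to its containing directory.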

What if we introduce two prefixes, say folder: and path: (maybe dir:?)
to address both use cases, each as naturally as possible?  Both would
be boolean prefixes because of the limitations of probabilistic
prefixes, but we could take advantage of Jani's idea of generating
several boolean terms.

folder: could work the way I suggested (simply the path to the file,
with {cur,new} stripped off).  path: would support file system search
uses.  These seem more varied, but I think they fall into exact match and
recursive match.  Since I don't have this use case, I can't have any
strong opinions about syntax, but I'll throw out an idea: many shells
support "**" for recursive path matching and people are already quite
familiar with glob patterns for paths, so why not simply adopt this?
In other words, when adding the path "a/b/cur/x:2," add path: terms
"a/b/cur" and "a/b/**" and "a/**" and "**".
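
As a rough illustration of that term generation, here is a Python
sketch.  It is not notmuch code: the function name path_terms is made
up, paths are assumed to be relative to the top-level mail directory,
and it follows the list above literally (one exact term for the
containing directory, plus a "**" term for each directory above it).
Whether "a/b/cur/**" should also be added is a question this sketch
does not try to settle.

```python
import posixpath


def path_terms(relpath):
    """Return the path: terms for a message file path.

    Hypothetical helper: assumes relpath is relative to the mail root.
    """
    # Drop any leading "/" so the ancestor walk below always terminates.
    relpath = relpath.lstrip("/")
    # Exact-match term: the directory containing the file.
    directory = posixpath.dirname(relpath)
    terms = [directory]
    # Recursive-match terms: one "**" term per directory above it.
    ancestor = posixpath.dirname(directory)
    while ancestor:
        terms.append(ancestor + "/**")
        ancestor = posixpath.dirname(ancestor)
    # Every file matches a recursive search from the top level.
    terms.append("**")
    return terms
```

For "a/b/cur/x:2," this yields "a/b/cur", "a/b/**", "a/**", and "**",
in that order.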

> BR,
> Jani.