From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 780CA429E21 for ; Wed, 11 Jan 2012 10:48:35 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NvGxMEIV7LhK for ; Wed, 11 Jan 2012 10:48:34 -0800 (PST) Received: from dmz-mailsec-scanner-6.mit.edu (DMZ-MAILSEC-SCANNER-6.MIT.EDU [18.7.68.35]) by olra.theworths.org (Postfix) with ESMTP id 73D8D431FB6 for ; Wed, 11 Jan 2012 10:48:34 -0800 (PST) X-AuditID: 12074423-b7f9c6d0000008c3-a7-4f0dd9824e6c Received: from mailhub-auth-4.mit.edu ( [18.7.62.39]) by dmz-mailsec-scanner-6.mit.edu (Symantec Messaging Gateway) with SMTP id 42.EC.02243.289DD0F4; Wed, 11 Jan 2012 13:48:34 -0500 (EST) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id q0BImXt0013787; Wed, 11 Jan 2012 13:48:34 -0500 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q0BImWho011482 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Wed, 11 Jan 2012 13:48:33 -0500 (EST) Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77) (envelope-from ) id 1Rl3Dy-0004ET-Gz; Wed, 11 Jan 2012 13:48:38 -0500 Date: Wed, 11 Jan 2012 13:48:38 -0500 From: Austin Clements To: Jani Nikula Subject: Re: [PATCH 2/3] lib: Add support for automatically excluding tags from queries Message-ID: <20120111184838.GT20796@mit.edu> References: <20120109043101.GH20796@mit.edu> <1326258173-21163-1-git-send-email-amdragon@mit.edu> <1326258173-21163-3-git-send-email-amdragon@mit.edu> <87ty42v99m.fsf@nikula.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87ty42v99m.fsf@nikula.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFmphleLIzCtJLcpLzFFi42IRYrdT1226yetv8G0Fu0XTdGeL6zdnMjsw edy6/5rd49mqW8wBTFFcNimpOZllqUX6dglcGTsWnGcqWGBYcfVdJ1MD41m1LkZODgkBE4n1 7/pYIGwxiQv31rN1MXJxCAnsY5RY19vICOFsYJR49vUlVOYkk8SRd49ZIJwljBJzXj0D62cR UJXYPekuG4jNJqAhsW3/ckYQW0RAUWLzyf1gNrOAtMS3381MILawQKTEh01/2UFsXgEdidub dkGt28EoceXKVWaIhKDEyZlPWCCatSRu/HsJ1MwBNmj5Pw6QMCfQrr2TF4LNERVQkZhychvb BEahWUi6ZyHpnoXQvYCReRWjbEpulW5uYmZOcWqybnFyYl5eapGumV5uZoleakrpJkZwYLso 72D8c1DpEKMAB6MSD++Ovbz+QqyJZcWVuYcYJTmYlER5V90ACvEl5adUZiQWZ8QXleakFh9i lOBgVhLhdaoByvGmJFZWpRblw6SkOViUxHk1tN75CQmkJ5akZqemFqQWwWRlODiUJHg3ggwV LEpNT61Iy8wpQUgzcXCCDOcBGr4cpIa3uCAxtzgzHSJ/ilFRSpx3EkhCACSRUZoH1wtLPK8Y xYFeEebNB6niASYtuO5XQIOZgAZvWccDMrgkESEl1cAo8e7tdu6TH2te1ITEXm/eF/uAKyQ3 gOPh1puFVy/n3NyzJnnZ0fcf6wt0zW5GrP7SqdrT4L//+f/glCXPJCOC3i8yfJC/96VjY2JK xt+sOWd2njt7RU5Rg/PGEg1GtZtxMStidoQt8bb9UVFb0bDz2vfuR+vWOrPILWzw12r9wrdk gVbM64W/lFiKMxINtZiLihMBpsQZ9BcDAAA= Cc: notmuch@notmuchmail.org X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Jan 2012 18:48:35 -0000 Quoth Jani Nikula on Jan 11 at 10:11 am: > On Wed, 11 Jan 2012 00:02:52 -0500, Austin Clements wrote: > > This is useful for tags like "deleted" and "spam" that people > > generally want to exclude from query results. These exclusions will > > be overridden if a tag is explicitly mentioned in a query. > > --- > > lib/notmuch.h | 6 ++++++ > > lib/query.cc | 33 +++++++++++++++++++++++++++++++++ > > 2 files changed, 39 insertions(+), 0 deletions(-) > > > > diff --git a/lib/notmuch.h b/lib/notmuch.h > > index 9f23a10..0a3ae2b 100644 > > --- a/lib/notmuch.h > > +++ b/lib/notmuch.h > > @@ -457,6 +457,12 @@ notmuch_query_set_sort (notmuch_query_t *query, notmuch_sort_t sort); > > notmuch_sort_t > > notmuch_query_get_sort (notmuch_query_t *query); > > > > +/* Add a tag that will be excluded by default from the query results. > > + * This exclusion will be overridden if this tag appears explicitly in > > + * the query. */ > > +void > > +notmuch_query_add_tag_exclude (notmuch_query_t *query, const char *tag); > > + > > /* Execute a query for threads, returning a notmuch_threads_t object > > * which can be used to iterate over the results. The returned threads > > * object is owned by the query and as such, will only be valid until > > diff --git a/lib/query.cc b/lib/query.cc > > index b6c0f12..716db1c 100644 > > --- a/lib/query.cc > > +++ b/lib/query.cc > > @@ -27,6 +27,7 @@ struct _notmuch_query { > > notmuch_database_t *notmuch; > > const char *query_string; > > notmuch_sort_t sort; > > + notmuch_string_list_t *exclude_terms; > > }; > > > > typedef struct _notmuch_mset_messages { > > @@ -76,6 +77,8 @@ notmuch_query_create (notmuch_database_t *notmuch, > > > > query->sort = NOTMUCH_SORT_NEWEST_FIRST; > > > > + query->exclude_terms = _notmuch_string_list_create (query); > > + > > return query; > > } > > > > @@ -97,6 +100,13 @@ notmuch_query_get_sort (notmuch_query_t *query) > > return query->sort; > > } > > > > +void > > +notmuch_query_add_tag_exclude (notmuch_query_t *query, const char *tag) > > +{ > > + char *term = talloc_asprintf (query, "%s%s", _find_prefix ("tag"), tag); > > + _notmuch_string_list_append (query->exclude_terms, term); > > +} > > + > > /* We end up having to call the destructors explicitly because we had > > * to use "placement new" in order to initialize C++ objects within a > > * block that we allocated with talloc. So C++ is making talloc > > @@ -112,6 +122,25 @@ _notmuch_messages_destructor (notmuch_mset_messages_t *messages) > > return 0; > > } > > > > I'd like to have a comment here, or inline in the code, explaining the > following function a little bit. Done. > > +static Xapian::Query > > +_notmuch_exclude_tags (notmuch_query_t *query, Xapian::Query xquery) > > +{ > > + Xapian::TermIterator end = xquery.get_terms_end (); > > + > > + for (notmuch_string_node_t *term = query->exclude_terms->head; term; > > + term = term->next) { > > + Xapian::TermIterator it = xquery.get_terms_begin (); > > + for (; it != end; it++) { > > + if (*it == term->string) > > [This is a double reminder to me why I'm not that enthusiastic about > operator overloading in C++.] Actually, that's a good point. I had originally done this to prevent std::string from performing any internal allocation or copying (which calling c_str could do), but since it took me some digging to figure out if I had *actually* avoided this (turns out there's an overloaded string==char*, so I had), I've switched to using string.compare, which I think makes the behavior and performance more obvious. > > + break; > > + } > > + if (it == end) > > + xquery = Xapian::Query (Xapian::Query::OP_AND_NOT, > > + xquery, Xapian::Query (term->string)); > > I briefly dug into Xapian documentation and source code, and became none > the wiser whether this copies the queries passed to it or not, i.e. does > this leak memory or not. I just presume you know what you're doing. ;) Since my code isn't doing any memory allocation, it can't leak memory. It could do a lot of copying, but it turns out that isn't a problem either because Xapian objects are internally reference-counted handles. So, in other words, if you write the obvious thing, it will do the right thing, which is totally contrary to what C++ has conditioned us to expect. ]:--8) > I think the function fails if someone is stupid enough to exclude the > same tag twice. I'm not sure if you should care. If you think so, you > could just check double add in notmuch_query_add_tag_exclude(). It handles this correctly for two reasons. Suppose tag x is excluded twice. At worst, if tag:x doesn't appear in the query, the returned query will get two AND NOT tag:x's, but that doesn't affect the results. It turns out it doesn't even get doubled up for a slightly subtle reason: when the loop reaches the second x in the exclude list, it will iterate over the new query, which already has the AND NOT tag:x in it, so it will see the tag:x as if it were in the original query and not further modify the query. However, this did point out a bug. I was using the end iterator from the original query even when I started iterating over the modified query. I'll send out an updated series after someone looks at the third patch. > Otherwise, looks good. > > > + } > > + return xquery; > > +} > > + > > notmuch_messages_t * > > notmuch_query_search_messages (notmuch_query_t *query) > > { > > @@ -157,6 +186,8 @@ notmuch_query_search_messages (notmuch_query_t *query) > > mail_query, string_query); > > } > > > > + final_query = _notmuch_exclude_tags (query, final_query); > > + > > enquire.set_weighting_scheme (Xapian::BoolWeight()); > > > > switch (query->sort) { > > @@ -436,6 +467,8 @@ notmuch_query_count_messages (notmuch_query_t *query) > > mail_query, string_query); > > } > > > > + final_query = _notmuch_exclude_tags (query, final_query); > > + > > enquire.set_weighting_scheme(Xapian::BoolWeight()); > > enquire.set_docid_order(Xapian::Enquire::ASCENDING); > > >