From: Austin Clements <amdragon@MIT.EDU>
To: Jani Nikula <jani@nikula.org>
Cc: notmuch@notmuchmail.org
Subject: Re: [Patch v3 04/15] lib: make folder: prefix literal
Date: Sun, 9 Mar 2014 12:15:48 -0400
Message-ID: <20140309161548.GO4709@mit.edu>
In-Reply-To: <87pplv69q8.fsf@nikula.org>
Quoth Jani Nikula on Mar 09 at 10:45 am:
> On Sun, 09 Mar 2014, Austin Clements <amdragon@MIT.EDU> wrote:
> > Quoth David Bremner on Mar 08 at 5:19 pm:
> >> From: Jani Nikula <jani@nikula.org>
> >>
> >> In xapian terms, convert folder: prefix from probabilistic to boolean
> >> prefix, matching the paths, relative form the maildir root, of the
> >
> > s/form/from/
> >
> >> message files, ignoring the maildir new and cur leaf directories.
> >>
> >> folder:foo matches all message files in foo, foo/new, and foo/cur.
> >>
> >> folder:foo/new does *not* match message files in foo/new.
> >>
> >> folder:"" matches all message files in the top level maildir and its
> >> new and cur subdirectories.
> >>
> >> This change constitutes a database change: bump the database version
> >> and add database upgrade support for folder: terms. The upgrade also
> >> adds path: terms.
> >> ---
> >> lib/database.cc | 38 ++++++++++++++++++++++--
> >> lib/message.cc | 80 ++++++++++++++++++++++++++++++++++++++++++++-------
> >> lib/notmuch-private.h | 3 ++
> >> 3 files changed, 108 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/lib/database.cc b/lib/database.cc
> >> index 93cc7f5..186e3a7 100644
> >> --- a/lib/database.cc
> >> +++ b/lib/database.cc
> >> @@ -42,7 +42,7 @@ typedef struct {
> >> const char *prefix;
> >> } prefix_t;
> >>
> >> -#define NOTMUCH_DATABASE_VERSION 1
> >> +#define NOTMUCH_DATABASE_VERSION 2
> >>
> >> #define STRINGIFY(s) _SUB_STRINGIFY(s)
> >> #define _SUB_STRINGIFY(s) #s
> >> @@ -210,6 +210,7 @@ static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
> >> { "is", "K" },
> >> { "id", "Q" },
> >> { "path", "P" },
> >> + { "folder", "XFOLDER:" },
> >
> > It took me a while to figure out that the ":" here means that Xapian
> > will unconditionally use a ":" after the prefix, instead of only using
> > a ":" when the first letter following the prefix is upper-case ASCII.
> > Maybe I was only confused by this because I simultaneously knew too
> > much and not enough about Xapian, but it might be worth a comment.
> > Something like,
> >
> > /* Without the ":", since this is a multi-letter prefix, Xapian
> > * will add a colon itself if the first letter of the path is
> > * upper-case ASCII. Including the ":" forces there to always be
> > * a colon, which keeps our own logic simpler. */
>
> Do you mean "... first letter of the _prefix_ is ..."?
I did mean the path. If the folder prefix were just "XFOLDER", then
Xapian::QueryParser would translate the query folder:foo into the term
XFOLDERfoo like you'd expect, but it would translate the query
folder:Foo into the term XFOLDER:Foo. We'd have to account for this
when constructing terms and (arguably) when removing terms. But
"XFOLDER:" suppresses the colon-adding logic, so these two queries
simply map to XFOLDER:foo and XFOLDER:Foo.
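
To make the rule concrete, here is a minimal standalone sketch of the
behavior described above. It does not call Xapian and is not Xapian's
code; make_term and is_ascii_upper are hypothetical names used only to
model the colon rule as I understand it.

```cpp
#include <string>

// Hypothetical model of the colon rule described above. This is NOT
// Xapian's implementation, only an illustration of the behavior.
static bool is_ascii_upper (char c)
{
    return c >= 'A' && c <= 'Z';
}

// Build the term for a boolean prefix query such as folder:VALUE,
// given the term prefix registered for that field.
std::string make_term (const std::string &prefix, const std::string &value)
{
    bool multi_letter = prefix.size () > 1;
    bool ends_in_colon = ! prefix.empty () &&
			 prefix[prefix.size () - 1] == ':';
    bool value_upper = ! value.empty () && is_ascii_upper (value[0]);

    std::string term = prefix;
    // A colon is inserted only for a multi-letter prefix that does not
    // already end in ':' and only when the value starts with an
    // upper-case ASCII letter.
    if (multi_letter && ! ends_in_colon && value_upper)
	term += ':';
    return term + value;
}
```

Under this model, make_term ("XFOLDER", "foo") gives XFOLDERfoo and
make_term ("XFOLDER", "Foo") gives XFOLDER:Foo, while with "XFOLDER:"
both values get exactly one colon.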
> Jani.
>
> >
> >> };
> >>
> >> static prefix_t PROBABILISTIC_PREFIX[]= {
> >> @@ -217,7 +218,6 @@ static prefix_t PROBABILISTIC_PREFIX[]= {
> >> { "to", "XTO" },
> >> { "attachment", "XATTACHMENT" },
> >> { "subject", "XSUBJECT"},
> >> - { "folder", "XFOLDER"}
> >> };
> >>
> >> const char *
> >> @@ -1168,6 +1168,40 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >> }
> >> }
> >>
> >> + /*
> >> + * Prior to version 2, the "folder:" prefix was probabilistic and
> >> + * stemmed. Change it to the current boolean prefix. Add "path:"
> >> + * prefixes while at it.
> >> + */
> >> + if (version < 2) {
> >> + notmuch_query_t *query = notmuch_query_create (notmuch, "");
> >
> > Three space indentation and no tabs? (It looks like this was in
> > Jani's v2, also. I'm guessing at some point there was a copy-paste
> > from a diff with tabs converted to spaces?)
> >
> >> + notmuch_messages_t *messages;
> >> + notmuch_message_t *message;
> >> +
> >> + count = 0;
> >> + total = notmuch_query_count_messages (query);
> >> +
> >> + for (messages = notmuch_query_search_messages (query);
> >> + notmuch_messages_valid (messages);
> >> + notmuch_messages_move_to_next (messages)) {
> >> + if (do_progress_notify) {
> >> + progress_notify (closure, (double) count / total);
> >> + do_progress_notify = 0;
> >> + }
> >> +
> >> + message = notmuch_messages_get (messages);
> >> +
> >> + _notmuch_message_upgrade_folder (message);
> >> + _notmuch_message_sync (message);
> >> +
> >> + notmuch_message_destroy (message);
> >> +
> >> + count++;
> >> + }
> >> +
> >> + notmuch_query_destroy (query);
> >> + }
> >> +
> >> db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> >> db->flush ();
> >>
> >> diff --git a/lib/message.cc b/lib/message.cc
> >> index 21abe8e..31cb9f1 100644
> >> --- a/lib/message.cc
> >> +++ b/lib/message.cc
> >> @@ -504,6 +504,56 @@ _notmuch_message_remove_terms (notmuch_message_t *message, const char *prefix)
> >> }
> >> }
> >>
> >> +/* Return true if p points at "new" or "cur". */
> >> +static bool is_maildir (const char *p)
> >> +{
> >> + return strcmp (p, "cur") == 0 || strcmp (p, "new") == 0;
> >> +}
> >> +
> >> +/* Add "folder:" term for directory. */
> >> +static notmuch_status_t
> >> +_notmuch_message_add_folder_terms (notmuch_message_t *message,
> >> + const char *directory)
> >> +{
> >> + char *folder, *last;
> >> +
> >> + folder = talloc_strdup (NULL, directory);
> >> + if (! folder)
> >> + return NOTMUCH_STATUS_OUT_OF_MEMORY;
> >
> > Same formatting problem in this chunk.
> >
> >> +
> >> + /*
> >> + * If the message file is in a leaf directory named "new" or
> >> + * "cur", presume maildir and index the parent directory. Thus a
> >> + * "folder:" prefix search matches messages in the specified
> >> + * maildir folder, i.e. in the specified directory and its "new"
> >> + * and "cur" subdirectories.
> >> + *
> >> + * Note that this means the "folder:" prefix can't be used for
> >> + * distinguishing between message files in "new" or "cur". The
> >> + * "path:" prefix needs to be used for that.
> >> + *
> >> + * Note the deliberate difference to _filename_is_in_maildir(). We
> >> + * don't want to index different things depending on the existence
> >> + * or non-existence of all maildir sibling directories "new",
> >> + * "cur", and "tmp". Doing so would be surprising, and difficult
> >> + * for the user to fix in case all subdirectories were not in
> >> + * place during indexing.
> >> + */
> >> + last = strrchr (folder, '/');
> >> + if (last) {
> >> + if (is_maildir (last + 1))
> >> + *last = '\0';
> >> + } else if (is_maildir (folder)) {
> >> + *folder = '\0';
> >> + }
> >> +
> >> + _notmuch_message_add_term (message, "folder", folder);
> >> +
> >> + talloc_free (folder);
> >> +
> >> + return NOTMUCH_STATUS_SUCCESS;
> >> +}
> >> +
> >> #define RECURSIVE_SUFFIX "/**"
> >>
> >> /* Add "path:" terms for directory. */
> >> @@ -570,9 +620,8 @@ _notmuch_message_add_directory_terms (void *ctx, notmuch_message_t *message)
> >> directory = _notmuch_database_get_directory_path (ctx,
> >> message->notmuch,
> >> directory_id);
> >> - if (strlen (directory))
> >> - _notmuch_message_gen_terms (message, "folder", directory);
> >>
> >> + _notmuch_message_add_folder_terms (message, directory);
> >> _notmuch_message_add_path_terms (message, directory);
> >> }
> >>
> >> @@ -610,9 +659,7 @@ _notmuch_message_add_filename (notmuch_message_t *message,
> >> * notmuch_directory_get_child_files() . */
> >> _notmuch_message_add_term (message, "file-direntry", direntry);
> >>
> >> - /* New terms allow user to search with folder: specification. */
> >> - _notmuch_message_gen_terms (message, "folder", directory);
> >> -
> >> + _notmuch_message_add_folder_terms (message, directory);
> >> _notmuch_message_add_path_terms (message, directory);
> >>
> >> talloc_free (local);
> >> @@ -637,8 +684,6 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
> >> const char *filename)
> >> {
> >> void *local = talloc_new (message);
> >> - const char *folder_prefix = _find_prefix ("folder");
> >> - char *zfolder_prefix = talloc_asprintf(local, "Z%s", folder_prefix);
> >> char *direntry;
> >> notmuch_private_status_t private_status;
> >> notmuch_status_t status;
> >> @@ -659,10 +704,7 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
> >> /* Re-synchronize "folder:" and "path:" terms for this message. */
> >>
> >> /* Remove all "folder:" terms. */
> >> - _notmuch_message_remove_terms (message, folder_prefix);
> >> -
> >> - /* Remove all "folder:" stemmed terms. */
> >> - _notmuch_message_remove_terms (message, zfolder_prefix);
> >> + _notmuch_message_remove_terms (message, _find_prefix ("folder"));
> >>
> >> /* Remove all "path:" terms. */
> >> _notmuch_message_remove_terms (message, _find_prefix ("path"));
> >> @@ -675,6 +717,22 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
> >> return status;
> >> }
> >>
> >> +/* Upgrade the "folder:" prefix from V1 to V2. */
> >> +#define FOLDER_PREFIX_V1 "XFOLDER"
> >> +#define ZFOLDER_PREFIX_V1 "Z" FOLDER_PREFIX_V1
> >> +void
> >> +_notmuch_message_upgrade_folder (notmuch_message_t *message)
> >> +{
> >> + /* Remove all old "folder:" terms. */
> >> + _notmuch_message_remove_terms (message, FOLDER_PREFIX_V1);
> >> +
> >> + /* Remove all old "folder:" stemmed terms. */
> >> + _notmuch_message_remove_terms (message, ZFOLDER_PREFIX_V1);
> >> +
> >> + /* Add new boolean "folder:" and "path:" terms. */
> >> + _notmuch_message_add_directory_terms (message, message);
> >> +}
> >> +
> >> char *
> >> _notmuch_message_talloc_copy_data (notmuch_message_t *message)
> >> {
> >> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> >> index af185c7..59eb2bc 100644
> >> --- a/lib/notmuch-private.h
> >> +++ b/lib/notmuch-private.h
> >> @@ -263,6 +263,9 @@ _notmuch_message_gen_terms (notmuch_message_t *message,
> >> void
> >> _notmuch_message_upgrade_filename_storage (notmuch_message_t *message);
> >>
> >> +void
> >> +_notmuch_message_upgrade_folder (notmuch_message_t *message);
> >> +
> >> notmuch_status_t
> >> _notmuch_message_add_filename (notmuch_message_t *message,
> >> const char *filename);
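
For reference, the maildir handling in the quoted
_notmuch_message_add_folder_terms can be modeled standalone roughly as
below. This is only a sketch: it uses std::string rather than talloc,
and folder_for_directory and is_maildir_leaf are hypothetical names,
not functions in the patch.

```cpp
#include <string>

// True if name is one of the maildir leaf directories that the patch
// ignores when computing the folder: term.
static bool is_maildir_leaf (const std::string &name)
{
    return name == "new" || name == "cur";
}

// Map a directory path (relative to the maildir root) to the value
// that would be indexed under folder:, following the logic in the
// quoted patch: strip a final "new" or "cur" component, if any.
std::string folder_for_directory (const std::string &directory)
{
    std::string::size_type slash = directory.rfind ('/');
    if (slash != std::string::npos) {
	if (is_maildir_leaf (directory.substr (slash + 1)))
	    return directory.substr (0, slash);
    } else if (is_maildir_leaf (directory)) {
	// Top-level "new" or "cur": the folder is the root itself.
	return "";
    }
    return directory;
}
```

So files in foo, foo/new, and foo/cur all map to the folder value
"foo", which is why folder:foo/new matches nothing there, and files in
the top-level new and cur map to the empty folder.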