unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] notmuch: Add "maildir:" search option
@ 2013-11-12 15:56 Peter Zijlstra
  2013-11-12 19:31 ` Austin Clements
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-12 15:56 UTC (permalink / raw)
  To: notmuch

Subject: notmuch: Add "maildir:" search option

The current "folder:" search terms are near useless when you have
recursive folders, introduce a boolean "maildir:" search term to
exactly match the maildir folder.

Given a Maildir++ layout like:

	~/Maildir/
	~/Maildir/cur
	~/Maildir/new
	~/Maildir/tmp
	~/Maildir/.Sent
	~/Maildir/.Sent/cur
	~/Maildir/.Sent/new
	~/Maildir/.Sent/tmp
	~/Maildir/.INBOX.LKML
	~/Maildir/.INBOX.LKML/cur
	~/Maildir/.INBOX.LKML/new
	~/Maildir/.INBOX.LKML/tmp
	~/Maildir/.INBOX.LKML.netdev
	~/Maildir/.INBOX.LKML.netdev/cur
	~/Maildir/.INBOX.LKML.netdev/new
	~/Maildir/.INBOX.LKML.netdev/tmp
	~/Maildir/.INBOX.LKML.arch
	~/Maildir/.INBOX.LKML.arch/cur
	~/Maildir/.INBOX.LKML.arch/new
	~/Maildir/.INBOX.LKML.arch/tmp

This patch generates the following search index:

	$ delve -a Maildir/.notmuch/xapian/ | ~/s XMAILDIR
	XMAILDIR:INBOX
	XMAILDIR:INBOX/LKML
	XMAILDIR:INBOX/LKML/arch
	XMAILDIR:INBOX/LKML/netdev
	XMAILDIR:Sent

Which allows one (me!!1) to pose queries like:

	maildir:INBOX and not tag:list

to more easily find offlist mail (from people like my family who don't
actually send their stuff over LKML :-).

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---

XXX: now I need to go figure out how to do searches like:

  subject:PATCH/0

which would mandate that PATCH is the first word occurring in the
subject. I think the position index holds enough information but I
need to look into that and obviously the query parser needs work for
this.


 lib/database.cc |  7 ++++---
 lib/message.cc  | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index a021bf17253c..53aeaa68954d 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -208,15 +208,16 @@ static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
     { "thread",			"G" },
     { "tag",			"K" },
     { "is",			"K" },
-    { "id",			"Q" }
+    { "id",			"Q" },
+    { "maildir",		"XMAILDIR:" },
 };
 
 static prefix_t PROBABILISTIC_PREFIX[]= {
     { "from",			"XFROM" },
     { "to",			"XTO" },
     { "attachment",		"XATTACHMENT" },
-    { "subject",		"XSUBJECT"},
-    { "folder",			"XFOLDER"}
+    { "subject",		"XSUBJECT" },
+    { "folder",			"XFOLDER" },
 };
 
 const char *
diff --git a/lib/message.cc b/lib/message.cc
index 1b4637950f8e..45a727a6208f 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -22,6 +22,7 @@
 #include "database-private.h"
 
 #include <stdint.h>
+#include <string.h>
 
 #include <gmime/gmime.h>
 
@@ -485,6 +486,8 @@ _notmuch_message_add_filename (notmuch_message_t *message,
     notmuch_status_t status;
     void *local = talloc_new (message);
     char *direntry;
+    char *maildir;
+    int i;
 
     if (filename == NULL)
 	INTERNAL_ERROR ("Message filename cannot be NULL.");
@@ -507,6 +510,43 @@ _notmuch_message_add_filename (notmuch_message_t *message,
     /* New terms allow user to search with folder: specification. */
     _notmuch_message_gen_terms (message, "folder", directory);
 
+    /* Convert the directory into a maildir path */
+    maildir = talloc_strdup(local, directory);
+
+    /* Strip the maildir "cur", "new" directory entries. */
+    i = strlen(maildir);
+    if (strncmp(maildir + i - 3, "cur", 3) == 0 ||
+	strncmp(maildir + i - 3, "new", 3) == 0) {
+	    maildir[i - 3] = '\0';
+	    i -= 3;
+    }
+
+    /* Strip trailing '/' */
+    while (maildir[i-1] == '/') {
+	    maildir[i-1] = '\0';
+	    i--;
+    }
+
+    /* Strip leading '/' */
+    while (maildir[0] == '/')
+	    maildir++;
+
+    /* Strip leading '.' */
+    while (maildir[0] == '.')
+	    maildir++;
+
+    /* Replace all remaining '.' with '/' */
+    for (i = 0; maildir[i]; i++) {
+	    if (maildir[i] == '.')
+		    maildir[i] = '/';
+    }
+
+    /* If there's no string left, we're the "INBOX" */
+    if (maildir[0] == '\0')
+	    maildir = talloc_strdup(local, "INBOX");
+
+    _notmuch_message_add_term (message, "maildir", maildir);
+
     talloc_free (local);
 
     return NOTMUCH_STATUS_SUCCESS;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 15:56 [PATCH] notmuch: Add "maildir:" search option Peter Zijlstra
@ 2013-11-12 19:31 ` Austin Clements
  2013-11-12 19:39   ` Austin Clements
  2013-11-12 19:47   ` Peter Zijlstra
  2014-01-12 15:53 ` Jani Nikula
  2014-03-14 19:37 ` Jani Nikula
  2 siblings, 2 replies; 11+ messages in thread
From: Austin Clements @ 2013-11-12 19:31 UTC (permalink / raw)
  To: Peter Zijlstra, notmuch

I think this is a great idea.  Personally I think this is how folder:
should work.  I find the semantics of folder: to be useless except where
they happen to coincide with the boolean semantics used here.
Unfortunately, changing folder: would require versioning the database,
which we have only primordial support for right now.

Various comments below, though nothing major.  Of course, we'd also need
some tests and man page updates for this.

On Tue, 12 Nov 2013, Peter Zijlstra <peterz@infradead.org> wrote:
> Subject: notmuch: Add "maildir:" search option
>
> The current "folder:" search terms are near useless when you have
> recursive folders, introduce a boolean "maildir:" search term to
> exactly match the maildir folder.
>
> Given a Maildir++ layout like:
>
> 	~/Maildir/
> 	~/Maildir/cur
> 	~/Maildir/new
> 	~/Maildir/tmp
> 	~/Maildir/.Sent
> 	~/Maildir/.Sent/cur
> 	~/Maildir/.Sent/new
> 	~/Maildir/.Sent/tmp
> 	~/Maildir/.INBOX.LKML
> 	~/Maildir/.INBOX.LKML/cur
> 	~/Maildir/.INBOX.LKML/new
> 	~/Maildir/.INBOX.LKML/tmp
> 	~/Maildir/.INBOX.LKML.netdev
> 	~/Maildir/.INBOX.LKML.netdev/cur
> 	~/Maildir/.INBOX.LKML.netdev/new
> 	~/Maildir/.INBOX.LKML.netdev/tmp
> 	~/Maildir/.INBOX.LKML.arch
> 	~/Maildir/.INBOX.LKML.arch/cur
> 	~/Maildir/.INBOX.LKML.arch/new
> 	~/Maildir/.INBOX.LKML.arch/tmp
>
> This patch generates the following search index:
>
> 	$ delve -a Maildir/.notmuch/xapian/ | ~/s XMAILDIR
> 	XMAILDIR:INBOX
> 	XMAILDIR:INBOX/LKML
> 	XMAILDIR:INBOX/LKML/arch
> 	XMAILDIR:INBOX/LKML/netdev
> 	XMAILDIR:Sent
>
> Which allows one (me!!1) to pose queries like:
>
> 	maildir:INBOX and not tag:list
>
> to more easily find offlist mail (from people like my family who don't
> actually send their stuff over LKML :-).
>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> ---
>
> XXX: now I need to go figure out how to do searches like:
>
>   subject:PATCH/0
>
> which would mandate that PATCH is the first word occurring in the
> subject. I think the position index holds enough information but I
> need to look into that and obviously the query parser needs work for
> this.

This is a frequently requested feature.  Unfortunately, there are two
technical hurdles.  One is that the position information actually
*isn't* enough as it is both because the subject will start at some
arbitrary term offset that depends on the from and to (and maybe other
things) and because the Xapian Query structure doesn't expose a way to
search for a specific term offset (only relative offsets).  The other is
that we currently use Xapian's query parser, which isn't extensible,
though I took a stab at a custom query parser long ago and swear that
one of these days I'll return to it.

>
>
>  lib/database.cc |  7 ++++---
>  lib/message.cc  | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index a021bf17253c..53aeaa68954d 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -208,15 +208,16 @@ static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
>      { "thread",			"G" },
>      { "tag",			"K" },
>      { "is",			"K" },
> -    { "id",			"Q" }
> +    { "id",			"Q" },
> +    { "maildir",		"XMAILDIR:" },

No colon.  That is, the term prefix should just be "XMAILDIR".

>  };
>  
>  static prefix_t PROBABILISTIC_PREFIX[]= {
>      { "from",			"XFROM" },
>      { "to",			"XTO" },
>      { "attachment",		"XATTACHMENT" },
> -    { "subject",		"XSUBJECT"},
> -    { "folder",			"XFOLDER"}
> +    { "subject",		"XSUBJECT" },
> +    { "folder",			"XFOLDER" },

Unintentional whitespace change?

>  };
>  
>  const char *
> diff --git a/lib/message.cc b/lib/message.cc
> index 1b4637950f8e..45a727a6208f 100644
> --- a/lib/message.cc
> +++ b/lib/message.cc
> @@ -22,6 +22,7 @@
>  #include "database-private.h"
>  
>  #include <stdint.h>
> +#include <string.h>
>  
>  #include <gmime/gmime.h>
>  
> @@ -485,6 +486,8 @@ _notmuch_message_add_filename (notmuch_message_t *message,
>      notmuch_status_t status;
>      void *local = talloc_new (message);
>      char *direntry;
> +    char *maildir;
> +    int i;
>  
>      if (filename == NULL)
>  	INTERNAL_ERROR ("Message filename cannot be NULL.");
> @@ -507,6 +510,43 @@ _notmuch_message_add_filename (notmuch_message_t *message,
>      /* New terms allow user to search with folder: specification. */
>      _notmuch_message_gen_terms (message, "folder", directory);
>  
> +    /* Convert the directory into a maildir path */
> +    maildir = talloc_strdup(local, directory);

Add a space before the "(".  (Same thing throughout the rest of the
patch.)

> +
> +    /* Strip the maildir "cur", "new" directory entries. */
> +    i = strlen(maildir);
> +    if (strncmp(maildir + i - 3, "cur", 3) == 0 ||
> +	strncmp(maildir + i - 3, "new", 3) == 0) {

This is unsafe if directory is less than three characters, which I
believe could happen if the message is in the root mail directory (which
shouldn't happen with a well-formed maildir, but notmuch doesn't require
maildir, and, regardless, we should be defensive).

Also, we have a STRNCMP_LITERAL macro that we often use for comparisons
with string literals, but I'm good with this, too.

> +	    maildir[i - 3] = '\0';
> +	    i -= 3;
> +    }
> +
> +    /* Strip trailing '/' */
> +    while (maildir[i-1] == '/') {

This is also unsafe if maildir is the empty string.

Also, add spaces around the "-" (likewise on the next line).

> +	    maildir[i-1] = '\0';
> +	    i--;
> +    }
> +
> +    /* Strip leading '/' */
> +    while (maildir[0] == '/')
> +	    maildir++;
> +
> +    /* Strip leading '.' */
> +    while (maildir[0] == '.')

Why strip multiple dots?  (I'm not saying you shouldn't, I'm just
curious; and it may be worth explaining in the comment.)

> +	    maildir++;
> +
> +    /* Replace all remaining '.' with '/' */

I think this should only happen if there was a leading '.', indicating a
Maildir++ folder hierarchy.  A lot of people use the "file system"
Maildir layout, which just consists of nested directories where the
leaves are Maildirs (e.g., Dovecot's LAYOUT=fs,
http://wiki2.dovecot.org/MailboxFormat/Maildir#Directory_Structure).  In
this case, it's perfectly legitimate to have '.'s in folder names and it
would be confusing and unexpected to translate them to '/'s.  This would
also make this compatible with MH folders (which notmuch supports,
though admittedly it's unclear if anyone actually uses them).

> +    for (i = 0; maildir[i]; i++) {
> +	    if (maildir[i] == '.')
> +		    maildir[i] = '/';
> +    }
> +
> +    /* If there's no string left, we're the "INBOX" */
> +    if (maildir[0] == '\0')
> +	    maildir = talloc_strdup(local, "INBOX");
> +
> +    _notmuch_message_add_term (message, "maildir", maildir);
> +
>      talloc_free (local);
>  
>      return NOTMUCH_STATUS_SUCCESS;
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 19:31 ` Austin Clements
@ 2013-11-12 19:39   ` Austin Clements
  2013-11-13 20:08     ` Peter Zijlstra
  2013-11-12 19:47   ` Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: Austin Clements @ 2013-11-12 19:39 UTC (permalink / raw)
  To: Peter Zijlstra, notmuch

On Tue, 12 Nov 2013, Austin Clements <aclements@csail.mit.edu> wrote:
> I think this is a great idea.  Personally I think this is how folder:
> should work.  I find the semantics of folder: to be useless except where
> they happen to coincide with the boolean semantics used here.
> Unfortunately, changing folder: would require versioning the database,
> which we have only primordial support for right now.
>
> Various comments below, though nothing major.  Of course, we'd also need
> some tests and man page updates for this.

Sorry, one important thing I missed: this doesn't correctly handle when
file names are removed from a message
(_notmuch_message_remove_filename).  Probably the simplest thing would
be to follow the template for how folder: works by first removing *all*
folder terms and then adding back the still-valid ones.  (Unfortunately,
just removing the term for the removed filename's directory won't work
because the message could have other filenames in the same directory,
though maybe you could just scan for that possibility?)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 19:31 ` Austin Clements
  2013-11-12 19:39   ` Austin Clements
@ 2013-11-12 19:47   ` Peter Zijlstra
  2013-11-12 21:32     ` Jani Nikula
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-12 19:47 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch

On Tue, Nov 12, 2013 at 02:31:25PM -0500, Austin Clements wrote:
> > XXX: now I need to go figure out how to do searches like:
> >
> >   subject:PATCH/0
> >
> > which would mandate that PATCH is the first word occurring in the
> > subject. I think the position index holds enough information but I
> > need to look into that and obviously the query parser needs work for
> > this.
> 
> This is a frequently requested feature.  Unfortunately, there are two
> technical hurdles.  One is that the position information actually
> *isn't* enough as it is both because the subject will start at some
> arbitrary term offset that depends on the from and to (and maybe other
> things) and because the Xapian Query structure doesn't expose a way to
> search for a specific term offset (only relative offsets).  The other is
> that we currently use Xapian's query parser, which isn't extensible,
> though I took a stab at a custom query parser long ago and swear that
> one of these days I'll return to it.

Bah, I knew that would end up being more complex :-)

> >  lib/database.cc |  7 ++++---
> >  lib/message.cc  | 40 ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 44 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index a021bf17253c..53aeaa68954d 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -208,15 +208,16 @@ static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
> >      { "thread",			"G" },
> >      { "tag",			"K" },
> >      { "is",			"K" },
> > -    { "id",			"Q" }
> > +    { "id",			"Q" },
> > +    { "maildir",		"XMAILDIR:" },
> 
> No colon.  That is, the term prefix should just be "XMAILDIR".

No, that colon is crucial; see http://xapian.org/docs/omega/termprefixes.html

"X starts a multi-capital letter user-defined prefix. If you want a
prefix for something without a standard prefix, you create your own
starting with an X (e.g. XSHOESIZE). The prefix ends with the first
non-capital. If the term you're prefixing starts with a capital, add a
":" between prefix and term to resolve ambiguity about where the prefix
ends and the term begins."

Since maildir folder names typically start with a capital we need that
':' in between the prefix and value. I tried, my initial versions didn't
have the ':' and reliably didn't work.

> >  };
> >  
> >  static prefix_t PROBABILISTIC_PREFIX[]= {
> >      { "from",			"XFROM" },
> >      { "to",			"XTO" },
> >      { "attachment",		"XATTACHMENT" },
> > -    { "subject",		"XSUBJECT"},
> > -    { "folder",			"XFOLDER"}
> > +    { "subject",		"XSUBJECT" },
> > +    { "folder",			"XFOLDER" },
> 
> Unintentional whitespace change?

Oops yes.

> > +    /* Convert the directory into a maildir path */
> > +    maildir = talloc_strdup(local, directory);
> 
> Add a space before the "(".  (Same thing throughout the rest of the
> patch.)

Urgh, weird coding style but yes you're right, I should've kept it
consistent with the rest of the file. Will fix.

> > +
> > +    /* Strip the maildir "cur", "new" directory entries. */
> > +    i = strlen(maildir);
> > +    if (strncmp(maildir + i - 3, "cur", 3) == 0 ||
> > +	strncmp(maildir + i - 3, "new", 3) == 0) {
> 
> This is unsafe if directory is less than three characters, which I
> believe could happen if the message is in the root mail directory (which
> shouldn't happen with a well-formed maildir, but notmuch doesn't require
> maildir, and, regardless, we should be defensive).
> 
> Also, we have a STRNCMP_LITERAL macro that we often use for comparisons
> with string literals, but I'm good with this, too.

Quite so, I haven't actually seen that, but you're quite right.

> > +	    maildir[i - 3] = '\0';
> > +	    i -= 3;
> > +    }
> > +
> > +    /* Strip trailing '/' */
> > +    while (maildir[i-1] == '/') {
> 
> This is also unsafe if maildir is the empty string.
> 
> Also, add spaces around the "-" (likewise on the next line).

Will fix.

> > +	    maildir[i-1] = '\0';
> > +	    i--;
> > +    }
> > +
> > +    /* Strip leading '/' */
> > +    while (maildir[0] == '/')
> > +	    maildir++;
> > +
> > +    /* Strip leading '.' */
> > +    while (maildir[0] == '.')
> 
> Why strip multiple dots?  (I'm not saying you shouldn't, I'm just
> curious; and it may be worth explaining in the comment.)

No reason, copy paste damage from above mostly.

> > +	    maildir++;
> > +
> > +    /* Replace all remaining '.' with '/' */
> 
> I think this should only happen if there was a leading '.', indicating a
> Maildir++ folder hierarchy.  A lot of people use the "file system"
> Maildir layout, which just consists of nested directories where the
> leaves are Maildirs (e.g., Dovecot's LAYOUT=fs,
> http://wiki2.dovecot.org/MailboxFormat/Maildir#Directory_Structure).  In
> this case, it's perfectly legitimate to have '.'s in folder names and it
> would be confusing and unexpected to translate them to '/'s.  This would
> also make this compatible with MH folders (which notmuch supports,
> though admittedly it's unclear if anyone actually uses them).

Ah, I wasn't aware Dovecot actually supported this Maildir variant --
although I've recently come across this variant someplace and thought it
was odd.

OK, I can make it conditional on the initial leading dot.

I'll respin the patch and try to come up with manpage and test additions
to cover this new functionality.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 19:47   ` Peter Zijlstra
@ 2013-11-12 21:32     ` Jani Nikula
  0 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2013-11-12 21:32 UTC (permalink / raw)
  To: Peter Zijlstra, Austin Clements; +Cc: notmuch

On Tue, 12 Nov 2013, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Nov 12, 2013 at 02:31:25PM -0500, Austin Clements wrote:
>> > +    /* Strip the maildir "cur", "new" directory entries. */
>> > +    i = strlen(maildir);
>> > +    if (strncmp(maildir + i - 3, "cur", 3) == 0 ||
>> > +	strncmp(maildir + i - 3, "new", 3) == 0) {
>> 
>> This is unsafe if directory is less than three characters, which I
>> believe could happen if the message is in the root mail directory (which
>> shouldn't happen with a well-formed maildir, but notmuch doesn't require
>> maildir, and, regardless, we should be defensive).
>> 
>> Also, we have a STRNCMP_LITERAL macro that we often use for comparisons
>> with string literals, but I'm good with this, too.
>
> Quite so, I haven't actually seen that, but you're quite right.

FWIW, in this particular case you can just strcmp because you are
looking at the end of maildir.

BR,
Jani.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 19:39   ` Austin Clements
@ 2013-11-13 20:08     ` Peter Zijlstra
  2013-11-13 20:14       ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-13 20:08 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch

On Tue, Nov 12, 2013 at 02:39:52PM -0500, Austin Clements wrote:
> On Tue, 12 Nov 2013, Austin Clements <aclements@csail.mit.edu> wrote:
> > I think this is a great idea.  Personally I think this is how folder:
> > should work.  I find the semantics of folder: to be useless except where
> > they happen to coincide with the boolean semantics used here.
> > Unfortunately, changing folder: would require versioning the database,
> > which we have only primordial support for right now.
> >
> > Various comments below, though nothing major.  Of course, we'd also need
> > some tests and man page updates for this.
> 
> Sorry, one important thing I missed: this doesn't correctly handle when
> file names are removed from a message
> (_notmuch_message_remove_filename).  Probably the simplest thing would
> be to follow the template for how folder: works by first removing *all*
> folder terms and then adding back the still-valid ones.  (Unfortunately,
> just removing the term for the removed filename's directory won't work
> because the message could have other filenames in the same directory,
> though maybe you could just scan for that possibility?)

Oh, right you are. A little something like the below? Its compile tested
only and I've not yet had time to look at how the test infrastructure
works.


---
 lib/database.cc |   3 +-
 lib/message.cc  | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 104 insertions(+), 10 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index a021bf17253c..e43e17dffcd0 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -208,7 +208,8 @@ static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
     { "thread",			"G" },
     { "tag",			"K" },
     { "is",			"K" },
-    { "id",			"Q" }
+    { "id",			"Q" },
+    { "maildir",		"XMAILDIR:" },
 };
 
 static prefix_t PROBABILISTIC_PREFIX[]= {
diff --git a/lib/message.cc b/lib/message.cc
index 1b4637950f8e..73d3bb65ab67 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -22,6 +22,7 @@
 #include "database-private.h"
 
 #include <stdint.h>
+#include <string.h>
 
 #include <gmime/gmime.h>
 
@@ -473,6 +474,71 @@ notmuch_message_get_replies (notmuch_message_t *message)
     return _notmuch_messages_create (message->replies);
 }
 
+/* Construct a proper 'maildir' from 'directory'
+ *
+ * Takes the relative directory component inside the maildir pathname and
+ * construct a maildir path from it.
+ *
+ * For filesystem layout Maildir we use the regular filesystem path except the
+ * trailing "cur"/"new" component.
+ *
+ * For Maildir++ we strip the leading '.' and replace subsequent '.'s with '/'s
+ */
+static char *
+_notmuch_message_maildir (void *ctx, const char *directory)
+{
+    char *maildir;
+    int i;
+
+    maildir = talloc_strdup (ctx, directory);
+    i = strlen (maildir);
+
+    /* Strip trailing '/' */
+    while (i && maildir[i - 1] == '/') {
+	maildir[i - 1] = '\0';
+	i--;
+    }
+
+    /* Strip leading '/' */
+    while (maildir[0] == '/') {
+	maildir++;
+	i--;
+    }
+
+    if (i >= 3) {
+	/* Consume trailing maildir directory entries */
+	if (STRNCMP_LITERAL (maildir, "cur") == 0 ||
+	    STRNCMP_LITERAL (maildir, "new") == 0)
+	{
+	    maildir[i - 3] = '\0';
+	    i -= 3;
+	}
+
+	/* Strip trailing '/' */
+	while (i && maildir[i - 1] == '/') {
+	    maildir[i-1] = '\0';
+	    i--;
+	}
+    }
+
+    /* Maildir++ */
+    if (maildir[0] == '.') {
+	maildir++;
+
+	/* Replace all remaining '.' with '/' */
+	for (i = 0; maildir[i]; i++) {
+	    if (maildir[i] == '.')
+		maildir[i] = '/';
+	}
+    }
+
+    /* If there's no string left, we're the "INBOX" */
+    if (maildir[0] == '\0')
+	    maildir = talloc_strdup (ctx, "INBOX");
+
+    return maildir;
+}
+
 /* Add an additional 'filename' for 'message'.
  *
  * This change will not be reflected in the database until the next
@@ -485,6 +551,7 @@ _notmuch_message_add_filename (notmuch_message_t *message,
     notmuch_status_t status;
     void *local = talloc_new (message);
     char *direntry;
+    char *maildir;
 
     if (filename == NULL)
 	INTERNAL_ERROR ("Message filename cannot be NULL.");
@@ -507,6 +574,10 @@ _notmuch_message_add_filename (notmuch_message_t *message,
     /* New terms allow user to search with folder: specification. */
     _notmuch_message_gen_terms (message, "folder", directory);
 
+    /* New terms allow user to serarch with maildir: specification. */
+    maildir = _notmuch_message_maildir (local, directory);
+    _notmuch_message_add_term (message, "maildir", maildir);
+
     talloc_free (local);
 
     return NOTMUCH_STATUS_SUCCESS;
@@ -535,11 +606,18 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
     void *local = talloc_new (message);
     char *zfolder_prefix = talloc_asprintf(local, "Z%s", folder_prefix);
     int zfolder_prefix_len = strlen (zfolder_prefix);
-    char *direntry;
+    const char *relative, *directory;
+    char *direntry, *maildir;
     notmuch_private_status_t private_status;
     notmuch_status_t status;
     Xapian::TermIterator i, last;
 
+    relative = _notmuch_database_relative_path (message->notmuch, filename);
+
+    status = _notmuch_database_split_path (local, relative, &directory, NULL);
+    if (status)
+	return status;
+
     status = _notmuch_database_filename_to_direntry (
 	local, message->notmuch, filename, NOTMUCH_FIND_LOOKUP, &direntry);
     if (status || !direntry)
@@ -553,12 +631,21 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
     if (status)
 	return status;
 
-    /* Re-synchronize "folder:" terms for this message. This requires:
-     *  1. removing all "folder:" terms
-     *  2. removing all "folder:" stemmed terms
-     *  3. adding back terms for all remaining filenames of the message. */
-
-    /* 1. removing all "folder:" terms */
+    /* Re-synchronize "folder:" and "maildir:" terms for this message. This
+     * requires:
+     *  1. removing "maildir:" for this filename
+     *  2. removing all "folder:" terms
+     *  3. removing all "folder:" stemmed terms
+     *
+     * For all remaining filenames of the message:
+     *  4. adding back "folder:" terms
+     *  5. adding back "maildir:" */
+
+    /* 1. remove "maildir:" for this message */
+    maildir = _notmuch_message_maildir (local, directory);
+    _notmuch_message_remove_term (message, "maildir", maildir);
+
+    /* 2. removing all "folder:" terms */
     while (1) {
 	i = message->doc.termlist_begin ();
 	i.skip_to (folder_prefix);
@@ -577,7 +664,7 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
 	}
     }
 
-    /* 2. removing all "folder:" stemmed terms */
+    /* 3. removing all "folder:" stemmed terms */
     while (1) {
 	i = message->doc.termlist_begin ();
 	i.skip_to (zfolder_prefix);
@@ -596,7 +683,7 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
 	}
     }
 
-    /* 3. adding back terms for all remaining filenames of the message. */
+    /* for all remaining filenames of the message */
     i = message->doc.termlist_begin ();
     i.skip_to (direntry_prefix);
 
@@ -623,8 +710,14 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
 	directory = _notmuch_database_get_directory_path (local,
 							  message->notmuch,
 							  directory_id);
+
+	/* 4. adding back "folder:" terms */
 	if (strlen (directory))
 	    _notmuch_message_gen_terms (message, "folder", directory);
+
+	/* 5. adding back "maildir:" */
+	maildir = _notmuch_message_maildir (local, directory);
+	_notmuch_message_add_term (message, "maildir", maildir);
     }
 
     talloc_free (local);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-13 20:08     ` Peter Zijlstra
@ 2013-11-13 20:14       ` Peter Zijlstra
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-13 20:14 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch

On Wed, Nov 13, 2013 at 09:08:52PM +0100, Peter Zijlstra wrote:
> +    if (i >= 3) {
> +	/* Consume trailing maildir directory entries */
> +	if (STRNCMP_LITERAL (maildir, "cur") == 0 ||
> +	    STRNCMP_LITERAL (maildir, "new") == 0)

maildir + i - 3

obviously..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 15:56 [PATCH] notmuch: Add "maildir:" search option Peter Zijlstra
  2013-11-12 19:31 ` Austin Clements
@ 2014-01-12 15:53 ` Jani Nikula
  2014-01-12 16:06   ` Peter Zijlstra
  2014-03-14 19:37 ` Jani Nikula
  2 siblings, 1 reply; 11+ messages in thread
From: Jani Nikula @ 2014-01-12 15:53 UTC (permalink / raw)
  To: Peter Zijlstra, notmuch

On Tue, 12 Nov 2013, Peter Zijlstra <peterz@infradead.org> wrote:
> Subject: notmuch: Add "maildir:" search option
>
> The current "folder:" search terms are near useless when you have
> recursive folders, introduce a boolean "maildir:" search term to
> exactly match the maildir folder.

Hi Peter -

Per some discussion on IRC about the usefulness of the current "folder:"
prefix, I went ahead and sent patches to make "folder:" a boolean prefix
[1]. It's not the same as your "maildir:" implementation, but I believe
you could use it to achieve the things you want with that. In
particular, my proposed "folder:" prefix is the literal path from
maildir root, and does not make any assumptions about maildir - i.e. it
requires the final cur/new too. (That can be useful for some mutt users
who need the cur/new distinction.)

I am not opposed to adding another prefix like "maildir:" that does make
assumptions on the mail storage and transformations on the path, but I
do think it would be less important with the boolean "folder:" in place.

BR,
Jani.


[1] id:cover.1389304779.git.jani@nikula.org



>
> Given a Maildir++ layout like:
>
> 	~/Maildir/
> 	~/Maildir/cur
> 	~/Maildir/new
> 	~/Maildir/tmp
> 	~/Maildir/.Sent
> 	~/Maildir/.Sent/cur
> 	~/Maildir/.Sent/new
> 	~/Maildir/.Sent/tmp
> 	~/Maildir/.INBOX.LKML
> 	~/Maildir/.INBOX.LKML/cur
> 	~/Maildir/.INBOX.LKML/new
> 	~/Maildir/.INBOX.LKML/tmp
> 	~/Maildir/.INBOX.LKML.netdev
> 	~/Maildir/.INBOX.LKML.netdev/cur
> 	~/Maildir/.INBOX.LKML.netdev/new
> 	~/Maildir/.INBOX.LKML.netdev/tmp
> 	~/Maildir/.INBOX.LKML.arch
> 	~/Maildir/.INBOX.LKML.arch/cur
> 	~/Maildir/.INBOX.LKML.arch/new
> 	~/Maildir/.INBOX.LKML.arch/tmp
>
> This patch generates the following search index:
>
> 	$ delve -a Maildir/.notmuch/xapian/ | ~/s XMAILDIR
> 	XMAILDIR:INBOX
> 	XMAILDIR:INBOX/LKML
> 	XMAILDIR:INBOX/LKML/arch
> 	XMAILDIR:INBOX/LKML/netdev
> 	XMAILDIR:Sent
>
> Which allows one (me!!1) to pose queries like:
>
> 	maildir:INBOX and not tag:list
>
> to more easily find offlist mail (from people like my family who don't
> actually send their stuff over LKML :-).
>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> ---
>
> XXX: now I need to go figure out how to do searches like:
>
>   subject:PATCH/0
>
> which would mandate that PATCH is the first word occurring in the
> subject. I think the position index holds enough information but I
> need to look into that and obviously the query parser needs work for
> this.
>
>
>  lib/database.cc |  7 ++++---
>  lib/message.cc  | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index a021bf17253c..53aeaa68954d 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -208,15 +208,16 @@ static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
>      { "thread",			"G" },
>      { "tag",			"K" },
>      { "is",			"K" },
> -    { "id",			"Q" }
> +    { "id",			"Q" },
> +    { "maildir",		"XMAILDIR:" },
>  };
>  
>  static prefix_t PROBABILISTIC_PREFIX[]= {
>      { "from",			"XFROM" },
>      { "to",			"XTO" },
>      { "attachment",		"XATTACHMENT" },
> -    { "subject",		"XSUBJECT"},
> -    { "folder",			"XFOLDER"}
> +    { "subject",		"XSUBJECT" },
> +    { "folder",			"XFOLDER" },
>  };
>  
>  const char *
> diff --git a/lib/message.cc b/lib/message.cc
> index 1b4637950f8e..45a727a6208f 100644
> --- a/lib/message.cc
> +++ b/lib/message.cc
> @@ -22,6 +22,7 @@
>  #include "database-private.h"
>  
>  #include <stdint.h>
> +#include <string.h>
>  
>  #include <gmime/gmime.h>
>  
> @@ -485,6 +486,8 @@ _notmuch_message_add_filename (notmuch_message_t *message,
>      notmuch_status_t status;
>      void *local = talloc_new (message);
>      char *direntry;
> +    char *maildir;
> +    int i;
>  
>      if (filename == NULL)
>  	INTERNAL_ERROR ("Message filename cannot be NULL.");
> @@ -507,6 +510,43 @@ _notmuch_message_add_filename (notmuch_message_t *message,
>      /* New terms allow user to search with folder: specification. */
>      _notmuch_message_gen_terms (message, "folder", directory);
>  
> +    /* Convert the directory into a maildir path */
> +    maildir = talloc_strdup(local, directory);
> +
> +    /* Strip the maildir "cur", "new" directory entries. */
> +    i = strlen(maildir);
> +    if (strncmp(maildir + i - 3, "cur", 3) == 0 ||
> +	strncmp(maildir + i - 3, "new", 3) == 0) {
> +	    maildir[i - 3] = '\0';
> +	    i -= 3;
> +    }
> +
> +    /* Strip trailing '/' */
> +    while (maildir[i-1] == '/') {
> +	    maildir[i-1] = '\0';
> +	    i--;
> +    }
> +
> +    /* Strip leading '/' */
> +    while (maildir[0] == '/')
> +	    maildir++;
> +
> +    /* Strip leading '.' */
> +    while (maildir[0] == '.')
> +	    maildir++;
> +
> +    /* Replace all remaining '.' with '/' */
> +    for (i = 0; maildir[i]; i++) {
> +	    if (maildir[i] == '.')
> +		    maildir[i] = '/';
> +    }
> +
> +    /* If there's no string left, we're the "INBOX" */
> +    if (maildir[0] == '\0')
> +	    maildir = talloc_strdup(local, "INBOX");
> +
> +    _notmuch_message_add_term (message, "maildir", maildir);
> +
>      talloc_free (local);
>  
>      return NOTMUCH_STATUS_SUCCESS;
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2014-01-12 15:53 ` Jani Nikula
@ 2014-01-12 16:06   ` Peter Zijlstra
  2014-01-12 17:05     ` Jani Nikula
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2014-01-12 16:06 UTC (permalink / raw)
  To: Jani Nikula; +Cc: notmuch

On Sun, Jan 12, 2014 at 05:53:52PM +0200, Jani Nikula wrote:
> On Tue, 12 Nov 2013, Peter Zijlstra <peterz@infradead.org> wrote:
> > Subject: notmuch: Add "maildir:" search option
> >
> > The current "folder:" search terms are near useless when you have
> > recursive folders, introduce a boolean "maildir:" search term to
> > exactly match the maildir folder.
> 
> Hi Peter -
> 
> Per some discussion on IRC about the usefulness of the current "folder:"
> prefix, I went ahead and sent patches to make "folder:" a boolean prefix
> [1]. It's not the same as your "maildir:" implementation, but I believe
> you could use it to achieve the things you want with that. In
> particular, my proposed "folder:" prefix is the literal path from
> maildir root, and does not make any assumptions about maildir - i.e. it
> requires the final cur/new too. (That can be useful for some mutt users
> who need the cur/new distinction.)
> 
> I am not opposed to adding another prefix like "maildir:" that does make
> assumptions on the mail storage and transformations on the path, but I
> do think it would be less important with the boolean "folder:" in place.

Right, sorry for disappearing; I failed to find time to finish the
patch.

I did actually modify it do loose the /new,/cur bits and I think I
addressed all other comments, including the file removal part.

Where I got 'stuck' was writing test cases.

Anyway, I'll try and have a look at the recent notmuch.git and see if it
does what I'd like it to.

Thanks for getting back to me.

~ Peter

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2014-01-12 16:06   ` Peter Zijlstra
@ 2014-01-12 17:05     ` Jani Nikula
  0 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2014-01-12 17:05 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: notmuch

On Sun, 12 Jan 2014, Peter Zijlstra <peterz@infradead.org> wrote:
> Right, sorry for disappearing; I failed to find time to finish the
> patch.

That's what I figured, but didn't want to pester you...

> I did actually modify it do loose the /new,/cur bits and I think I
> addressed all other comments, including the file removal part.
>
> Where I got 'stuck' was writing test cases.

Yeah, that can be a pain, but pretty much a mandatory one. Me mucking
with the "folder:" prefix also requires doing the database upgrade.

> Anyway, I'll try and have a look at the recent notmuch.git and see if it
> does what I'd like it to.

It's currently just patches, awaiting review, but the idea of modifying
the "folder:" prefix was met with approval, at least on IRC.

BR,
Jani.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] notmuch: Add "maildir:" search option
  2013-11-12 15:56 [PATCH] notmuch: Add "maildir:" search option Peter Zijlstra
  2013-11-12 19:31 ` Austin Clements
  2014-01-12 15:53 ` Jani Nikula
@ 2014-03-14 19:37 ` Jani Nikula
  2 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2014-03-14 19:37 UTC (permalink / raw)
  To: Peter Zijlstra, notmuch

On Tue, 12 Nov 2013, Peter Zijlstra <peterz@infradead.org> wrote:
> Subject: notmuch: Add "maildir:" search option
>
> The current "folder:" search terms are near useless when you have
> recursive folders, introduce a boolean "maildir:" search term to
> exactly match the maildir folder.

I've now marked this as "obsolete" in our patch tracking system, as the
patches to add boolean "folder:" and "path:" support have been merged.

BR,
Jani.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-03-14 19:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-12 15:56 [PATCH] notmuch: Add "maildir:" search option Peter Zijlstra
2013-11-12 19:31 ` Austin Clements
2013-11-12 19:39   ` Austin Clements
2013-11-13 20:08     ` Peter Zijlstra
2013-11-13 20:14       ` Peter Zijlstra
2013-11-12 19:47   ` Peter Zijlstra
2013-11-12 21:32     ` Jani Nikula
2014-01-12 15:53 ` Jani Nikula
2014-01-12 16:06   ` Peter Zijlstra
2014-01-12 17:05     ` Jani Nikula
2014-03-14 19:37 ` Jani Nikula

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).