unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] test:Folder tags shouldn't match after removal of file in given folder
@ 2011-06-27 17:12 Mark Anderson
  2011-06-27 20:58 ` Austin Clements
  2011-06-28  6:43 ` Pieter Praet
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Anderson @ 2011-06-27 17:12 UTC (permalink / raw)
  To: notmuch


Test for bug.  Current stemming support for notmuch adds extra terms
to the DB which aren't removed when the file renames are detected.

When folder tags are added to a message, Xapian terms for both XFOLDER
and ZXFOLDER are generated.  When one of the filenames are
renamed/removed, only the XFOLDER tags are removed, leaving it possible
for a match on a folder: tag that was previously but is no longer a
match in the maildir.
---

I found this bug last week.  Essentially when the folder:spam tag is
added and puts the XFOLDERspam, it also inserts the ZXFOLDERspam.  Then
if the mail is removed from the folder, it neglects to remove
ZXFOLDERspam.

This was detected with my offlineimap usage with gmail, still haven't
polished my personal folder/label transition.  As I was looking into it,
I saw some unusual things flagged as spam, and investigated.

 test/notmuch-test            |    1 +
 test/search-folder-coherence |   48 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 0 deletions(-)
 create mode 100755 test/search-folder-coherence

diff --git a/test/notmuch-test b/test/notmuch-test
index fe85c6a..79e6267 100755
--- a/test/notmuch-test
+++ b/test/notmuch-test
@@ -41,6 +41,7 @@ TESTS="
   maildir-sync
   crypto
   symbol-hiding
+  search-folder-coherence
 "
 TESTS=${NOTMUCH_TESTS:=$TESTS}
 
diff --git a/test/search-folder-coherence b/test/search-folder-coherence
new file mode 100755
index 0000000..cf3ba40
--- /dev/null
+++ b/test/search-folder-coherence
@@ -0,0 +1,48 @@
+#!/usr/bin/env bash
+test_description='folder tags removed and added through file renames remain consistent'
+. ./test-lib.sh
+
+test_begin_subtest "No new messages"
+output=$(NOTMUCH_NEW)
+test_expect_equal "$output" "No new mail."
+
+
+test_begin_subtest "Single new message"
+generate_message
+file_x=$gen_msg_filename
+id_x=$gen_msg_id
+output=$(NOTMUCH_NEW)
+test_expect_equal "$output" "Added 1 new message to the database."
+
+test_begin_subtest "Add second folder for same message"
+dir=$(dirname $file_x)
+mkdir $dir/spam
+cp $file_x $dir/spam
+output=$(NOTMUCH_NEW)
+test_expect_equal "$output" "No new mail."
+
+
+test_begin_subtest "Multiple files for same message"
+cat <<EOF >EXPECTED
+MAIL_DIR/msg-001
+MAIL_DIR/spam/msg-001
+EOF
+notmuch search --output=files id:$id_x | sed -e "s,$MAIL_DIR,MAIL_DIR," >OUTPUT
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "Test matches folder:spam"
+output=$(notmuch search folder:spam)
+test_expect_equal "$output" "thread:0000000000000001   2001-01-05 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)"
+
+sleep 1;
+
+test_begin_subtest "Remove folder:spam copy of email"
+rm $dir/spam/$(basename $file_x)
+output=$(NOTMUCH_NEW)
+test_expect_equal "$output" "No new mail. Detected 1 file rename."
+
+test_begin_subtest "No mails match the folder:spam search"
+output=$(notmuch search folder:spam)
+test_expect_equal "$output" ""
+
+test_done
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] test:Folder tags shouldn't match after removal of file in given folder
  2011-06-27 17:12 [PATCH] test:Folder tags shouldn't match after removal of file in given folder Mark Anderson
@ 2011-06-27 20:58 ` Austin Clements
  2011-06-27 22:57   ` Mark Anderson
  2011-06-28  6:43 ` Pieter Praet
  1 sibling, 1 reply; 6+ messages in thread
From: Austin Clements @ 2011-06-27 20:58 UTC (permalink / raw)
  To: Mark Anderson; +Cc: notmuch

On Mon, Jun 27, 2011 at 1:12 PM, Mark Anderson <ma.skies@gmail.com> wrote:
> +test_begin_subtest "Test matches folder:spam"
> +output=$(notmuch search folder:spam)
> +test_expect_equal "$output" "thread:0000000000000001   2001-01-05 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)"
> +
> +sleep 1;

I assume this has to do with directory mtime's.  Is it sufficient to
instead increment_mtime $dir/spam after the rm below?

> +
> +test_begin_subtest "Remove folder:spam copy of email"
> +rm $dir/spam/$(basename $file_x)
> +output=$(NOTMUCH_NEW)
> +test_expect_equal "$output" "No new mail. Detected 1 file rename."

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] test:Folder tags shouldn't match after removal of file in given folder
  2011-06-27 20:58 ` Austin Clements
@ 2011-06-27 22:57   ` Mark Anderson
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Anderson @ 2011-06-27 22:57 UTC (permalink / raw)
  To: Austin Clements, Mark Anderson; +Cc: notmuch@notmuchmail.org

On Mon, 27 Jun 2011 15:58:00 -0500, Austin Clements <amdragon@mit.edu> wrote:
> On Mon, Jun 27, 2011 at 1:12 PM, Mark Anderson <ma.skies@gmail.com> wrote:
> > +test_begin_subtest "Test matches folder:spam"
> > +output=$(notmuch search folder:spam)
> > +test_expect_equal "$output" "thread:0000000000000001   2001-01-05 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)"
> > +
> > +sleep 1;
> 
> I assume this has to do with directory mtime's.  Is it sufficient to
> instead increment_mtime $dir/spam after the rm below?

I believe that would be sufficient.  I just had one of my runs fail to
even remove the filename, and I remember some earlier discussion along
these lines, but I just used the fastest way, I can certainly modify it
to use "increment_mtime"

> 
> > +
> > +test_begin_subtest "Remove folder:spam copy of email"
> > +rm $dir/spam/$(basename $file_x)
> > +output=$(NOTMUCH_NEW)
> > +test_expect_equal "$output" "No new mail. Detected 1 file rename."
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] test:Folder tags shouldn't match after removal of file in given folder
  2011-06-27 17:12 [PATCH] test:Folder tags shouldn't match after removal of file in given folder Mark Anderson
  2011-06-27 20:58 ` Austin Clements
@ 2011-06-28  6:43 ` Pieter Praet
  2011-06-29 20:04   ` [PATCH] Fix folder: coherence issue Mark Anderson
  1 sibling, 1 reply; 6+ messages in thread
From: Pieter Praet @ 2011-06-28  6:43 UTC (permalink / raw)
  To: Mark Anderson, notmuch

On Mon, 27 Jun 2011 11:12:24 -0600, Mark Anderson <ma.skies@gmail.com> wrote:
> 
> Test for bug.  Current stemming support for notmuch adds extra terms
> to the DB which aren't removed when the file renames are detected.
> 
> When folder tags are added to a message, Xapian terms for both XFOLDER
> and ZXFOLDER are generated.  When one of the filenames are
> renamed/removed, only the XFOLDER tags are removed, leaving it possible
> for a match on a folder: tag that was previously but is no longer a
> match in the maildir.
> ---
> 
> I found this bug last week.  Essentially when the folder:spam tag is
> added and puts the XFOLDERspam, it also inserts the ZXFOLDERspam.  Then
> if the mail is removed from the folder, it neglects to remove
> ZXFOLDERspam.
> 
> This was detected with my offlineimap usage with gmail, still haven't
> polished my personal folder/label transition.  As I was looking into it,
> I saw some unusual things flagged as spam, and investigated.

+1

This bug was reported by Sebastian back in January [1], and I
submitted a test [2], but yours is much more thorough, both
regarding the actual test as well as the commit message.

I'd forgotten all about this issue myself, so it's very nice
to see it brought up again.

Thanks!

> 
>  test/notmuch-test            |    1 +
>  test/search-folder-coherence |   48 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 49 insertions(+), 0 deletions(-)
>  create mode 100755 test/search-folder-coherence
> 
> diff --git a/test/notmuch-test b/test/notmuch-test
> index fe85c6a..79e6267 100755
> --- a/test/notmuch-test
> +++ b/test/notmuch-test
> @@ -41,6 +41,7 @@ TESTS="
>    maildir-sync
>    crypto
>    symbol-hiding
> +  search-folder-coherence
>  "
>  TESTS=${NOTMUCH_TESTS:=$TESTS}
>  
> diff --git a/test/search-folder-coherence b/test/search-folder-coherence
> new file mode 100755
> index 0000000..cf3ba40
> --- /dev/null
> +++ b/test/search-folder-coherence
> @@ -0,0 +1,48 @@
> +#!/usr/bin/env bash
> +test_description='folder tags removed and added through file renames remain consistent'
> +. ./test-lib.sh
> +
> +test_begin_subtest "No new messages"
> +output=$(NOTMUCH_NEW)
> +test_expect_equal "$output" "No new mail."
> +
> +
> +test_begin_subtest "Single new message"
> +generate_message
> +file_x=$gen_msg_filename
> +id_x=$gen_msg_id
> +output=$(NOTMUCH_NEW)
> +test_expect_equal "$output" "Added 1 new message to the database."
> +
> +test_begin_subtest "Add second folder for same message"
> +dir=$(dirname $file_x)
> +mkdir $dir/spam
> +cp $file_x $dir/spam
> +output=$(NOTMUCH_NEW)
> +test_expect_equal "$output" "No new mail."
> +
> +
> +test_begin_subtest "Multiple files for same message"
> +cat <<EOF >EXPECTED
> +MAIL_DIR/msg-001
> +MAIL_DIR/spam/msg-001
> +EOF
> +notmuch search --output=files id:$id_x | sed -e "s,$MAIL_DIR,MAIL_DIR," >OUTPUT
> +test_expect_equal_file OUTPUT EXPECTED
> +
> +test_begin_subtest "Test matches folder:spam"
> +output=$(notmuch search folder:spam)
> +test_expect_equal "$output" "thread:0000000000000001   2001-01-05 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)"
> +
> +sleep 1;
> +
> +test_begin_subtest "Remove folder:spam copy of email"
> +rm $dir/spam/$(basename $file_x)
> +output=$(NOTMUCH_NEW)
> +test_expect_equal "$output" "No new mail. Detected 1 file rename."
> +
> +test_begin_subtest "No mails match the folder:spam search"
> +output=$(notmuch search folder:spam)
> +test_expect_equal "$output" ""
> +
> +test_done
> -- 
> 1.7.4.1
> 
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


Peace

-- 
Pieter

[1] id:"87vd1n4rd2.fsf@SSpaeth.de"
[2] id:"1305270308-30660-1-git-send-email-pieter@praet.org"

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Fix folder: coherence issue
  2011-06-28  6:43 ` Pieter Praet
@ 2011-06-29 20:04   ` Mark Anderson
  2011-06-29 21:16     ` Carl Worth
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Anderson @ 2011-06-29 20:04 UTC (permalink / raw)
  To: Pieter Praet, notmuch, cworth

Add removal of all ZXFOLDER terms to removal of all XFOLDER terms for
each message filename removal.

The existing filename-list reindexing will put all the needed terms
back in.  Test search-folder-coherence now passes.

Signed-off-by:Mark Anderson <ma.skies@gmail.com>
---

Once I fixed the removal instead of the addition side, things went
smoothly.

 lib/message.cc |   31 ++++++++++++++++++++++++++++---
 1 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 8b9c84f..d993cde 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -514,6 +514,8 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
     const char *folder_prefix = _find_prefix ("folder");
     int folder_prefix_len = strlen (folder_prefix);
     void *local = talloc_new (message);
+    char *zfolder_prefix = talloc_asprintf(local, "Z%s", folder_prefix);
+    int zfolder_prefix_len = strlen (zfolder_prefix);
     char *direntry;
     notmuch_private_status_t private_status;
     notmuch_status_t status;
@@ -530,9 +532,12 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
     status = COERCE_STATUS (private_status,
 			    "Unexpected error from _notmuch_message_remove_term");
 
-    /* Re-synchronize "folder:" terms for this message. This requires
-     * first removing all "folder:" terms, then adding back terms for
-     * all remaining filenames of the message. */
+    /* Re-synchronize "folder:" terms for this message. This requires:
+     *  1. removing all "folder:" terms
+     *  2. removing all "folder:" stemmed terms
+     *  3. adding back terms for all remaining filenames of the message. */
+
+    /* 1. removing all "folder:" terms */
     while (1) {
 	i = message->doc.termlist_begin ();
 	i.skip_to (folder_prefix);
@@ -551,6 +556,26 @@ _notmuch_message_remove_filename (notmuch_message_t *message,
 	}
     }
 
+    /* 2. removing all "folder:" stemmed terms */
+    while (1) {
+	i = message->doc.termlist_begin ();
+	i.skip_to (zfolder_prefix);
+
+	/* Terminate loop when no terms remain with desired prefix. */
+	if (i == message->doc.termlist_end () ||
+	    strncmp ((*i).c_str (), zfolder_prefix, zfolder_prefix_len))
+	{
+	    break;
+	}
+
+	try {
+	    message->doc.remove_term ((*i));
+	} catch (const Xapian::InvalidArgumentError) {
+	    /* Ignore failure to remove non-existent term. */
+	}
+    }
+
+    /* 3. adding back terms for all remaining filenames of the message. */
     i = message->doc.termlist_begin ();
     i.skip_to (direntry_prefix);
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Fix folder: coherence issue
  2011-06-29 20:04   ` [PATCH] Fix folder: coherence issue Mark Anderson
@ 2011-06-29 21:16     ` Carl Worth
  0 siblings, 0 replies; 6+ messages in thread
From: Carl Worth @ 2011-06-29 21:16 UTC (permalink / raw)
  To: Mark Anderson, Pieter Praet, notmuch; +Cc: David Bremner

[-- Attachment #1: Type: text/plain, Size: 637 bytes --]

On Wed, 29 Jun 2011 14:04:45 -0600, Mark Anderson <ma.skies@gmail.com> wrote:
> Add removal of all ZXFOLDER terms to removal of all XFOLDER terms for
> each message filename removal.
> 
> The existing filename-list reindexing will put all the needed terms
> back in.  Test search-folder-coherence now passes.

Thanks so much, Mark!

I've now pushed the new test (followed by a quick fix to turn sleep into
increment_mtime) and the fix.

David, I recommend these three commits for the release branch. This is
an important fix before we first make a release with the folder:
feature.

-Carl

-- 
carl.d.worth@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-06-29 21:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-27 17:12 [PATCH] test:Folder tags shouldn't match after removal of file in given folder Mark Anderson
2011-06-27 20:58 ` Austin Clements
2011-06-27 22:57   ` Mark Anderson
2011-06-28  6:43 ` Pieter Praet
2011-06-29 20:04   ` [PATCH] Fix folder: coherence issue Mark Anderson
2011-06-29 21:16     ` Carl Worth

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).