unofficial mirror of notmuch@notmuchmail.org
* [PATCH] Store "from" and "subject" headers in the database.
@ 2011-11-06 17:17 Austin Clements
  2011-11-06 21:07 ` Jani Nikula
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Austin Clements @ 2011-11-06 17:17 UTC (permalink / raw)
  To: notmuch; +Cc: notmuch

This is a rebase and cleanup of Istvan Marko's patch from
id:m3pqnj2j7a.fsf@zsu.kismala.com

Search retrieves these headers for every message in the search
results.  Previously, this required opening and parsing every message
file.  Storing them directly in the database significantly reduces IO
and computation, speeding up search by between 50% and 10X.

Taking full advantage of this requires a database rebuild, but it will
fall back to the old behavior for messages that do not have headers
stored in the database.
---
 lib/database.cc       |    2 +-
 lib/message.cc        |   23 +++++++++++++++++++++--
 lib/notmuch-private.h |   11 +++++++----
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index fa632f8..e4ef14e 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1725,7 +1725,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 		goto DONE;
 
 	    date = notmuch_message_file_get_header (message_file, "date");
-	    _notmuch_message_set_date (message, date);
+	    _notmuch_message_set_header_values (message, date, from, subject);
 
 	    _notmuch_message_index_file (message, filename);
 	} else {
diff --git a/lib/message.cc b/lib/message.cc
index 8f22e02..ca7fbf2 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -412,6 +412,21 @@ _notmuch_message_ensure_message_file (notmuch_message_t *message)
 const char *
 notmuch_message_get_header (notmuch_message_t *message, const char *header)
 {
+    std::string value;
+
+    /* Fetch header from the appropriate xapian value field if
+     * available */
+    if (strcasecmp (header, "from") == 0)
+	value = message->doc.get_value (NOTMUCH_VALUE_FROM);
+    else if (strcasecmp (header, "subject") == 0)
+	value = message->doc.get_value (NOTMUCH_VALUE_SUBJECT);
+    else if (strcasecmp (header, "message-id") == 0)
+	value = message->doc.get_value (NOTMUCH_VALUE_MESSAGE_ID);
+
+    if (!value.empty())
+	return talloc_strdup (message, value.c_str ());
+
+    /* Otherwise fall back to parsing the file */
     _notmuch_message_ensure_message_file (message);
     if (message->message_file == NULL)
 	return NULL;
@@ -795,8 +810,10 @@ notmuch_message_set_author (notmuch_message_t *message,
 }
 
 void
-_notmuch_message_set_date (notmuch_message_t *message,
-			   const char *date)
+_notmuch_message_set_header_values (notmuch_message_t *message,
+				    const char *date,
+				    const char *from,
+				    const char *subject)
 {
     time_t time_value;
 
@@ -809,6 +826,8 @@ _notmuch_message_set_date (notmuch_message_t *message,
 
     message->doc.add_value (NOTMUCH_VALUE_TIMESTAMP,
 			    Xapian::sortable_serialise (time_value));
+    message->doc.add_value (NOTMUCH_VALUE_FROM, from);
+    message->doc.add_value (NOTMUCH_VALUE_SUBJECT, subject);
 }
 
 /* Synchronize changes made to message->doc out into the database. */
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 0d3cc27..60a932f 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -93,7 +93,9 @@ NOTMUCH_BEGIN_DECLS
 
 typedef enum {
     NOTMUCH_VALUE_TIMESTAMP = 0,
-    NOTMUCH_VALUE_MESSAGE_ID
+    NOTMUCH_VALUE_MESSAGE_ID,
+    NOTMUCH_VALUE_FROM,
+    NOTMUCH_VALUE_SUBJECT
 } notmuch_value_t;
 
 /* Xapian (with flint backend) complains if we provide a term longer
@@ -269,9 +271,10 @@ void
 _notmuch_message_ensure_thread_id (notmuch_message_t *message);
 
 void
-_notmuch_message_set_date (notmuch_message_t *message,
-			   const char *date);
-
+_notmuch_message_set_header_values (notmuch_message_t *message,
+				    const char *date,
+				    const char *from,
+				    const char *subject);
 void
 _notmuch_message_sync (notmuch_message_t *message);
 
-- 
1.7.2.3
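
A minimal sketch of the lookup order this patch gives
notmuch_message_get_header: stored value first, message file as the
fallback. It is a self-contained model, not notmuch code. ToyMessage,
get_header and the two maps are hypothetical stand-ins for the Xapian
document and the parsed message file; only the three header names and
the slot order are taken from the patch.

```cpp
#include <cctype>
#include <map>
#include <string>

// Slot order as in the patched notmuch_value_t; the names are shortened here.
enum ValueSlot { VALUE_TIMESTAMP = 0, VALUE_MESSAGE_ID, VALUE_FROM, VALUE_SUBJECT };

// Hypothetical model of a message: values stored in the database plus the
// headers that would be obtained by opening and parsing the message file.
struct ToyMessage {
    std::map<int, std::string> stored_values;         // slot number -> stored value
    std::map<std::string, std::string> file_headers;  // lowercase header name -> value
    int file_parses;                                  // how often we fell back to the file
    ToyMessage() : file_parses(0) {}
};

static std::string lowercase(std::string s)
{
    for (char &c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Returns the header value, or an empty string if it is found nowhere.
std::string get_header(ToyMessage &msg, const std::string &header)
{
    const std::string name = lowercase(header);

    // Map the header to a value slot if it is one of the stored ones.
    int slot = -1;
    if (name == "from")
        slot = VALUE_FROM;
    else if (name == "subject")
        slot = VALUE_SUBJECT;
    else if (name == "message-id")
        slot = VALUE_MESSAGE_ID;

    // Fast path: a non-empty stored value needs no file access.
    if (slot >= 0) {
        std::map<int, std::string>::const_iterator it = msg.stored_values.find(slot);
        if (it != msg.stored_values.end() && !it->second.empty())
            return it->second;
    }

    // Slow path: the header is not stored (old database, or a header
    // that is never stored), so "parse the file".
    ++msg.file_parses;
    std::map<std::string, std::string>::const_iterator it = msg.file_headers.find(name);
    return it == msg.file_headers.end() ? std::string() : it->second;
}
```

Calling get_header (msg, "From") on a message whose stored_values has
VALUE_FROM set returns that value and leaves file_parses untouched.  A
header with no stored value increments file_parses, which models the
per-message cost that a database rebuild removes.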


* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 17:17 [PATCH] Store "from" and "subject" headers in the database Austin Clements
@ 2011-11-06 21:07 ` Jani Nikula
  2011-11-06 21:59   ` Daniel Schoepe
  2011-11-06 22:01   ` Austin Clements
  2011-11-06 21:41 ` Daniel Schoepe
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Jani Nikula @ 2011-11-06 21:07 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: notmuch

On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> This is a rebase and cleanup of Istvan Marko's patch from
> id:m3pqnj2j7a.fsf@zsu.kismala.com
> 
> Search retrieves these headers for every message in the search
> results.  Previously, this required opening and parsing every message
> file.  Storing them directly in the database significantly reduces IO
> and computation, speeding up search by between 50% and 10X.

Hi, sounds good, but...

> Taking full advantage of this requires a database rebuild, but it will
> fall back to the old behavior for messages that do not have headers
> stored in the database.

...what's the most convenient way of rebuilding the database while
preserving my tags etc.? If this was merged, would an older version of
notmuch choke on the rebuilt database with these headers? (To me it
looks like it would be fine.)

BR,
Jani.



* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 17:17 [PATCH] Store "from" and "subject" headers in the database Austin Clements
  2011-11-06 21:07 ` Jani Nikula
@ 2011-11-06 21:41 ` Daniel Schoepe
  2011-11-11  1:33 ` Pieter Praet
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Daniel Schoepe @ 2011-11-06 21:41 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: notmuch

On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> Search retrieves these headers for every message in the search
> results.  Previously, this required opening and parsing every message
> file.  Storing them directly in the database significantly reduces IO
> and computation, speeding up search by between 50% and 10X.

Just tried the patch and I can confirm that, after rebuilding the
database, it makes searches a lot faster.

Cheers,
Daniel



* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 21:07 ` Jani Nikula
@ 2011-11-06 21:59   ` Daniel Schoepe
  2011-11-06 22:01   ` Austin Clements
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Schoepe @ 2011-11-06 21:59 UTC (permalink / raw)
  To: Jani Nikula, Austin Clements, notmuch; +Cc: notmuch

On Sun, 06 Nov 2011 23:07:51 +0200, Jani Nikula <jani@nikula.org> wrote:
> ...what's the most convenient way of rebuilding the database while
> preserving my tags etc.? If this was merged, would an older version of
> notmuch choke on the rebuilt database with these headers? (To me it
> looks like it would be fine.)

Here's what I did:

notmuch dump > tags.db
rm -rf ~/Maildir/.notmuch
notmuch new
notmuch restore < tags.db

Cheers,
Daniel



* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 21:07 ` Jani Nikula
  2011-11-06 21:59   ` Daniel Schoepe
@ 2011-11-06 22:01   ` Austin Clements
  2011-11-06 22:30     ` Jani Nikula
  1 sibling, 1 reply; 12+ messages in thread
From: Austin Clements @ 2011-11-06 22:01 UTC (permalink / raw)
  To: Jani Nikula; +Cc: notmuch, notmuch

On Sun, Nov 6, 2011 at 4:07 PM, Jani Nikula <jani@nikula.org> wrote:
> On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
>> Taking full advantage of this requires a database rebuild, but it will
>> fall back to the old behavior for messages that do not have headers
>> stored in the database.
>
> ...what's the most convenient way of rebuilding the database while
> preserving my tags etc.? If this was merged, would an older version of
> notmuch choke on the rebuilt database with these headers? (To me it
> looks like it would be fine.)

The standard way to rebuild the database is to do a notmuch dump, move
.notmuch out of the way, notmuch new, then notmuch restore.  Some day
this process should be made automatic.

Old versions of notmuch will be blissfully unaware of the new headers
stored in the database.  They can even safely add messages to an
upgraded database without breaking new versions of notmuch.
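
One way to see why this holds, sketched under the assumption that each
header lives in the Xapian value slot numbered by notmuch_value_t as
in the patch: the two new slots are appended after the existing ones,
so the slot numbers an old binary knows about do not move.  Document
and read_slot below are hypothetical stand-ins for a Xapian document,
not real notmuch code.

```cpp
#include <map>
#include <string>

// Slot numbering before the patch.
enum OldValue { OLD_VALUE_TIMESTAMP = 0, OLD_VALUE_MESSAGE_ID };

// Slot numbering after the patch: the new entries are appended.
enum NewValue {
    NEW_VALUE_TIMESTAMP = 0,
    NEW_VALUE_MESSAGE_ID,
    NEW_VALUE_FROM,
    NEW_VALUE_SUBJECT
};

// The slots shared by both versions keep their numbers.
static_assert(static_cast<int>(OLD_VALUE_TIMESTAMP) == static_cast<int>(NEW_VALUE_TIMESTAMP),
              "timestamp slot must not move");
static_assert(static_cast<int>(OLD_VALUE_MESSAGE_ID) == static_cast<int>(NEW_VALUE_MESSAGE_ID),
              "message-id slot must not move");

// Hypothetical model of a document: slot number -> stored value.
typedef std::map<int, std::string> Document;

// Reading a slot that was never written yields an empty string, which is
// the condition the patch uses to decide on falling back to the file.
std::string read_slot(const Document &doc, int slot)
{
    Document::const_iterator it = doc.find(slot);
    return it == doc.end() ? std::string() : it->second;
}
```

In this model an old reader only ever asks for slots 0 and 1 and so
never notices slots 2 and 3.  A document written by an old version has
no slots 2 and 3, so a new reader gets an empty string from read_slot
and takes the file-parsing fallback.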


* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 22:01   ` Austin Clements
@ 2011-11-06 22:30     ` Jani Nikula
  0 siblings, 0 replies; 12+ messages in thread
From: Jani Nikula @ 2011-11-06 22:30 UTC (permalink / raw)
  To: Austin Clements; +Cc: notmuch, notmuch

On Sun, 6 Nov 2011 17:01:14 -0500, Austin Clements <amdragon@mit.edu> wrote:
> On Sun, Nov 6, 2011 at 4:07 PM, Jani Nikula <jani@nikula.org> wrote:
> > On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> >> Taking full advantage of this requires a database rebuild, but it will
> >> fall back to the old behavior for messages that do not have headers
> >> stored in the database.
> >
> > ...what's the most convenient way of rebuilding the database while
> > preserving my tags etc.? If this was merged, would an older version of
> > notmuch choke on the rebuilt database with these headers? (To me it
> > looks like it would be fine.)
> 
> The standard way to rebuild the database is to do a notmuch dump, move
> .notmuch out of the way, notmuch new, then notmuch restore.  Some day
> this process should be made automatic.
> 
> Old versions of notmuch will be blissfully unaware of the new headers
> stored in the database.  They can even safely add messages to an
> upgraded database without breaking new versions of notmuch.

Hi, I ran a quick test with and without the patch. I don't have much
mail, but on my aging laptop the performance increase is significant;
see the timings below. The size of the .notmuch dir as reported by
'du -h' grew from 82M to 83M with the patch, which IMHO is well worth it.


BR,
Jani.


WITHOUT THE PATCH:

$ sudo bash -c "/bin/sync; /bin/echo 3 > /proc/sys/vm/drop_caches"
$ time notmuch search "*" | wc -l
8167

real	0m43.216s
user	0m3.860s
sys	0m2.268s
$ time notmuch search "*" | wc -l
8167

real	0m2.762s
user	0m2.196s
sys	0m0.564s

WITH THE PATCH:

$ sudo bash -c "/bin/sync; /bin/echo 3 > /proc/sys/vm/drop_caches"
$ time notmuch search "*" | wc -l
8167

real	0m8.019s
user	0m2.088s
sys	0m0.720s
$ time notmuch search "*" | wc -l
8167

real	0m2.033s
user	0m1.592s
sys	0m0.440s
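
To put a number on these timings, here is a small sketch that converts
the 'real' values printed by time into seconds and divides them.
parse_time and speedup are hypothetical helper names; the inputs are
the figures quoted above.

```cpp
#include <stdexcept>
#include <string>

// Convert a duration as printed by the shell's time builtin,
// for example "0m43.216s", into seconds.
double parse_time(const std::string &text)
{
    const std::string::size_type m = text.find('m');
    if (m == std::string::npos || text.empty() || text[text.size() - 1] != 's')
        throw std::invalid_argument("expected <minutes>m<seconds>s: " + text);

    const double minutes = std::stod(text.substr(0, m));
    // Everything between the 'm' and the trailing 's' is the seconds part.
    const double seconds = std::stod(text.substr(m + 1, text.size() - m - 2));
    return minutes * 60.0 + seconds;
}

// Ratio of the time without the patch to the time with it.
double speedup(const std::string &before, const std::string &after)
{
    return parse_time(before) / parse_time(after);
}
```

With the figures above, speedup ("0m43.216s", "0m8.019s") comes out at
roughly 5.4 for the cold-cache run, and speedup ("0m2.762s",
"0m2.033s") at roughly 1.36 for the warm-cache run.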


* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 17:17 [PATCH] Store "from" and "subject" headers in the database Austin Clements
  2011-11-06 21:07 ` Jani Nikula
  2011-11-06 21:41 ` Daniel Schoepe
@ 2011-11-11  1:33 ` Pieter Praet
  2011-11-11  1:38   ` Pieter Praet
  2011-11-14  6:34 ` Jameson Graef Rollins
  2011-11-14 23:19 ` David Bremner
  4 siblings, 1 reply; 12+ messages in thread
From: Pieter Praet @ 2011-11-11  1:33 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: notmuch

On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> This is a rebase and cleanup of Istvan Marko's patch from
> id:m3pqnj2j7a.fsf@zsu.kismala.com
> 

Fantastic performance improvement Austin!  This should be merged in ASAP.

BTW, compacting the db from time to time also has a significant impact:

Running:
  $ du -h .notmuch
  $ sync && sudo /sbin/sysctl vm.drop_caches=3
  $ time notmuch search "*" | wc -l

On:
  1 - original database, compacted some time ago
  2 - fresh database generated before patching, non-compacted
  3 - fresh database generated after patching, non-compacted
  4 - fresh database generated after patching, compacted with
      $ mv .notmuch/xapian .notmuch/xapian-fat
      $ xapian-compact --no-renumber .notmuch/xapian-fat .notmuch/xapian

Results:
  | db      | 1         | 2        | 3         | 4         |
  |---------+-----------+----------+-----------+-----------|
  | db size | 272M      | 289M     | 291M      | 172M      |
  | results | 9536      | 9540     | 9540      | 9540      |
  |---------+-----------+----------+-----------+-----------|
  | real    | 1m42.221s | 2m3.193s | 0m30.762s | 0m10.505s |
  | user    | 0m8.379s  | 0m8.133s | 0m4.043s  | 0m3.353s  |
  | sys     | 0m5.216s  | 0m4.933s | 0m1.530s  | 0m1.000s  |


> Search retrieves these headers for every message in the search
> results.  Previously, this required opening and parsing every message
> file.  Storing them directly in the database significantly reduces IO
> and computation, speeding up search by between 50% and 10X.
> 
> Taking full advantage of this requires a database rebuild, but it will
> fall back to the old behavior for messages that do not have headers
> stored in the database.


Peace

-- 
Pieter


* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-11  1:33 ` Pieter Praet
@ 2011-11-11  1:38   ` Pieter Praet
  2011-11-11  3:00     ` Austin Clements
  0 siblings, 1 reply; 12+ messages in thread
From: Pieter Praet @ 2011-11-11  1:38 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: notmuch

On Fri, 11 Nov 2011 02:33:38 +0100, Pieter Praet <pieter@praet.org> wrote:
> On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > This is a rebase and cleanup of Istvan Marko's patch from
> > id:m3pqnj2j7a.fsf@zsu.kismala.com
> > 
> 
> Fantastic performance improvement Austin!  [...]

... and Istvan Marko, of course! Thanks!


Peace

-- 
Pieter


* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-11  1:38   ` Pieter Praet
@ 2011-11-11  3:00     ` Austin Clements
  0 siblings, 0 replies; 12+ messages in thread
From: Austin Clements @ 2011-11-11  3:00 UTC (permalink / raw)
  To: Pieter Praet; +Cc: notmuch, notmuch

Quoth Pieter Praet on Nov 11 at  2:38 am:
> On Fri, 11 Nov 2011 02:33:38 +0100, Pieter Praet <pieter@praet.org> wrote:
> > On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > > This is a rebase and cleanup of Istvan Marko's patch from
> > > id:m3pqnj2j7a.fsf@zsu.kismala.com
> > > 
> > 
> > Fantastic performance improvement Austin!  [...]
> 
> ... and Istvan Marko, of course! Thanks!

Yes.  This is really Istvan's patch.  I just dug it out of the
archives and cleaned up some whitespace.


* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 17:17 [PATCH] Store "from" and "subject" headers in the database Austin Clements
                   ` (2 preceding siblings ...)
  2011-11-11  1:33 ` Pieter Praet
@ 2011-11-14  6:34 ` Jameson Graef Rollins
  2011-11-14 23:19 ` David Bremner
  4 siblings, 0 replies; 12+ messages in thread
From: Jameson Graef Rollins @ 2011-11-14  6:34 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: notmuch

On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> This is a rebase and cleanup of Istvan Marko's patch from
> id:m3pqnj2j7a.fsf@zsu.kismala.com
> 
> Search retrieves these headers for every message in the search
> results.  Previously, this required opening and parsing every message
> file.  Storing them directly in the database significantly reduces IO
> and computation, speeding up search by between 50% and 10X.

Hey, Austin.  This is a very nice patch.  Short and sweet, a really nice
performance improvement, and a nice gentle fallback.

I just rebuilt my database and I can definitely see the improvements.
Search results are incredibly snappy, and the resultant database is only
about 8% bigger.

I fully endorse this being pushed.

jamie.



* Re: [PATCH] Store "from" and "subject" headers in the database.
  2011-11-06 17:17 [PATCH] Store "from" and "subject" headers in the database Austin Clements
                   ` (3 preceding siblings ...)
  2011-11-14  6:34 ` Jameson Graef Rollins
@ 2011-11-14 23:19 ` David Bremner
  2011-11-15  1:15   ` [PATCH] news: " Austin Clements
  4 siblings, 1 reply; 12+ messages in thread
From: David Bremner @ 2011-11-14 23:19 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: notmuch

On Sun,  6 Nov 2011 12:17:36 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> This is a rebase and cleanup of Istvan Marko's patch from
> id:m3pqnj2j7a.fsf@zsu.kismala.com
> 

Pushed. Would you mind making a NEWS patch?

d


* [PATCH] news: Store "from" and "subject" headers in the database.
  2011-11-14 23:19 ` David Bremner
@ 2011-11-15  1:15   ` Austin Clements
  0 siblings, 0 replies; 12+ messages in thread
From: Austin Clements @ 2011-11-15  1:15 UTC (permalink / raw)
  To: notmuch

---
 NEWS |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/NEWS b/NEWS
index 71c7c9a..88f7b20 100644
--- a/NEWS
+++ b/NEWS
@@ -23,6 +23,21 @@ Add search terms to  "notmuch dump"
   search/show/tag. The output file argument of dump is deprecated in
   favour of using stdout.
 
+Optimizations
+-------------
+
+Search avoids opening and parsing message files
+
+  We now store more information in the database so search no longer
+  has to open every message file to get basic headers.  This can
+  improve search speed by as much as 10X, but taking advantage of this
+  requires a database rebuild:
+
+	notmuch dump > notmuch.dump
+	# Backup, then remove notmuch database ($MAIL/.notmuch)
+	notmuch new
+	notmuch restore notmuch.dump
+
 Notmuch 0.9 (2011-10-01)
 ========================
 
-- 
1.7.7.1

