unofficial mirror of notmuch@notmuchmail.org
From: Tom Prince <tom.prince@ualberta.net>
To: Notmuch Mail <notmuch@notmuchmail.org>
Cc: Thomas Schwinge <thomas@schwinge.name>
Subject: [PATCH] dump: Don't sort the output by message id.
Date: Sun, 27 Nov 2011 13:40:53 -0500
Message-ID: <1322419253-9071-1-git-send-email-tom.prince@ualberta.net>
In-Reply-To: <1319884657-5574-1-git-send-email-thomas@schwinge.name>

From: Thomas Schwinge <thomas@schwinge.name>

Asking Xapian to sort the messages for us causes suboptimal I/O patterns.
Xapian's sorted retrieval returns the first results quickly, which would be
useful if we only wanted the first few results; since we want everything
anyway, it is a pessimization.

On 2011-10-29, a measurement on an instance with 372981 messages showed that
wall time can be reduced from 28 minutes (sorted by Message-ID) to 15 minutes
(unsorted).

Timings on 189605 messages:

$ time notmuch.old dump
19.48user 5.83system 12:10.42elapsed 3%CPU (0avgtext+0avgdata 110656maxresident)k
3629584inputs+22720outputs (33major+7073minor)pagefaults 0swaps
$ echo 3 > /proc/sys/vm/drop_caches
$ time notmuch.new dump
14.89user 1.20system 3:23.58elapsed 7%CPU (0avgtext+0avgdata 46032maxresident)k
1256264inputs+22464outputs (43major+1990minor)pagefaults 0swaps
---
 This just moves the motivation to the commit message, and adds more detailed timing information.
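
 To put the timings above in proportion, here is a minimal sketch in C that
 turns the elapsed fields reported by time(1) into seconds and computes the
 ratios. The helper names (elapsed_seconds, speedup, print_dump_comparison)
 are made up for this illustration and are not part of notmuch; the figures
 are copied from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Convert the "M:SS.ss" elapsed field printed by time(1) into seconds. */
static double elapsed_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Ratio of the old measurement to the new one (how many times smaller
 * the new value is). */
static double speedup(double old_value, double new_value)
{
    assert(new_value > 0.0);
    return old_value / new_value;
}

/* Print the comparison for the runs quoted in the commit message. */
static void print_dump_comparison(void)
{
    double old_s = elapsed_seconds(12, 10.42);  /* notmuch.old: 12:10.42 */
    double new_s = elapsed_seconds(3, 23.58);   /* notmuch.new:  3:23.58 */

    printf("wall time: %.2fs -> %.2fs (%.2fx faster)\n",
           old_s, new_s, speedup(old_s, new_s));
    /* "inputs" as reported by time(1): 3629584 before, 1256264 after. */
    printf("inputs: %.2fx fewer\n", speedup(3629584.0, 1256264.0));
    /* The 372981-message instance: 28 minutes sorted, 15 unsorted. */
    printf("larger instance: %.2fx faster\n", speedup(28.0, 15.0));
}
```

 Calling print_dump_comparison() from a main() prints the three ratios. The
 two instances were measured on different machines and mailstores, so the
 ratios are not directly comparable with each other.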

 notmuch-dump.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/notmuch-dump.c b/notmuch-dump.c
index 126593d..0475eb9 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -73,7 +73,10 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
 	fprintf (stderr, "Out of memory\n");
 	return 1;
     }
-    notmuch_query_set_sort (query, NOTMUCH_SORT_MESSAGE_ID);
+    /* Don't ask Xapian to sort by Message-ID. Xapian optimizes for returning
+     * the first results quickly at the expense of total time.
+     */
+    notmuch_query_set_sort (query, NOTMUCH_SORT_UNSORTED);
 
     for (messages = notmuch_query_search_messages (query);
 	 notmuch_messages_valid (messages);
-- 
1.7.6.1
