unofficial mirror of notmuch@notmuchmail.org
From: Austin Clements <amdragon@MIT.EDU>
To: Tom Bulli <mrbulli@yahoo.com>
Cc: "notmuch@notmuchmail.org" <notmuch@notmuchmail.org>
Subject: Re: Notmuch indexing 21 million emails
Date: Tue, 22 Nov 2011 22:20:03 -0500
Message-ID: <20111123032002.GK9351@mit.edu>
In-Reply-To: <1321930927.73603.YahooMailNeo@web36506.mail.mud.yahoo.com>

Quoth Tom Bulli on Nov 21 at  7:02 pm:
> I have a project where I need to search about 21 million emails - and
> decided to use "notmuch" for it.  The system is a Debian Squeeze,
> the notmuch version is "0.8-1~bpo60+1" from "kyria's" private
> repository.
> 
> I have been running "notmuch new" for approx. 4 days now - and
> according to "notmuch count" it has indexed about 4.5 million
> emails.
> 
> Is this expected performance?  Is there any way to speed that up?

Currently, notmuch is much more optimized for search than it is for
indexing.  This is unfortunate for the initial indexing process, and
it seems to be becoming more of a problem over time.
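
For a sense of scale, here is a back-of-the-envelope extrapolation
from the figures you quoted (4.5 million of 21 million messages after
4 days).  It is only a sketch: the function name is made up for
illustration, and it assumes the indexing rate stays constant, which
is probably optimistic once the index no longer fits in the buffer
cache.

```python
# Rough estimate from the figures quoted above.
# Assumption: the indexing rate stays constant (likely optimistic).
SECONDS_PER_DAY = 86_400


def indexing_estimate(indexed, total, elapsed_days):
    """Return (messages per second, days remaining) at a constant rate."""
    rate_per_day = indexed / elapsed_days
    remaining_days = (total - indexed) / rate_per_day
    return rate_per_day / SECONDS_PER_DAY, remaining_days


rate, remaining = indexing_estimate(4_500_000, 21_000_000, 4)
print(f"{rate:.1f} msgs/s, about {remaining:.1f} more days")
# prints: 13.0 msgs/s, about 14.7 more days
```

So even in the best case you are looking at roughly two more weeks at
the current pace.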

There are some things you can try.  One is to use an SSD if you aren't
already, since constructing the index requires a lot of random IO.
You can also try libeatmydata to disable fsync calls, which may improve
your IO performance, with the obvious crash-safety caveats.  However,
unless you have a lot of RAM, I suspect your index has long outgrown
your buffer cache, so this may have limited impact.
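
If you want to try libeatmydata, one way is the "eatmydata" wrapper
command that the Debian package installs, which preloads the library
for whatever command it runs.  Below is a minimal Python sketch of
that; the helper names are hypothetical, and I am assuming the wrapper
is installed and on your PATH.  A crash or power loss during the run
can leave the index corrupt, so only do this if you are prepared to
re-index from scratch.

```python
import shutil
import subprocess


def eatmydata_command(argv):
    """Prefix argv with the eatmydata wrapper (hypothetical helper)."""
    return ["eatmydata", *argv]


def run_without_fsync(argv):
    """Run argv under eatmydata, so its fsync calls become no-ops."""
    # Assumption: the wrapper script is installed and on PATH.
    if shutil.which("eatmydata") is None:
        raise FileNotFoundError("eatmydata wrapper not found on PATH")
    return subprocess.run(eatmydata_command(argv), check=True)


# Usage (not executed here): run_without_fsync(["notmuch", "new"])
```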

Since you're going to the trouble of indexing 21 million emails, you
might want to try 0.10 (under freeze right now, to be released very,
very soon).  It won't improve your indexing time, but if you're doing
searches with non-trivial numbers of results, emails indexed with 0.10
will search much faster.

Sorry I don't have better news, but I hope this helps.

