* Notmuch new speed degradation
From: Dmitry Bogatov @ 2014-07-24 8:19 UTC
To: notmuch

Hello!

I have ~ 3 000 000 mails. I wanted to index them.
First 1 000 000 took several hours, next 200 000 took several days.

And now, even with libeatmydata, it takes ~ 4 sec for a file.

Is there any way I can improve performance?

PS. Please keep me in CC.

--
Best regards, Dmitry Bogatov <KAction@gnu.org>,
Free Software supporter, esperantisto and netiquette guardian.
GPG: 54B7F00D
* Re: Notmuch new speed degradation
From: Austin Clements @ 2014-07-24 14:32 UTC
To: Dmitry Bogatov; +Cc: notmuch

Quoth Dmitry Bogatov on Jul 24 at 12:19 pm:
> Hello!
>
> I have ~ 3 000 000 mails. I wanted to index them.
> First 1 000 000 took several hours, next 200 000 took several days.
>
> And now, even with libeatmydata, it takes ~ 4 sec for a file.
>
> Is there any way I can improve performance?
>
> PS. Please keep me in CC.

Hi Dmitry. My guess is that you've exceeded your OS buffer cache
size by enough that most B-tree reads are going to disk at least once.
How big is your database (du -h $MAIL/.notmuch/xapian) and what does
free -h report on that computer? Also, is this on an SSD or an HDD?

You could try running notmuch compact. That should shrink the
database and, more importantly, pack more into the active page set.
I think it also linearizes the database.
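The comparison Austin is asking for (index size versus memory available for caching) can be sketched in a few lines of shell. This is only an illustration: the function names are hypothetical, it assumes the index lives in $MAIL/.notmuch/xapian as in the message above, and it assumes a Linux-style /proc/meminfo. Total RAM is used as a crude upper bound on what the page cache could ever hold.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names are part of notmuch or Xapian.

# db_kib DIR: disk usage of DIR in KiB, via POSIX du -sk.
db_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# mem_total_kib [FILE]: MemTotal in KiB from a Linux meminfo-format file.
# Defaults to /proc/meminfo (assumption: Linux).
mem_total_kib() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# db_exceeds_ram DB_KIB MEM_KIB: succeeds when the database is larger than
# RAM, in which case the page cache cannot hold the whole index.
db_exceeds_ram() {
    [ "$1" -gt "$2" ]
}
```

A possible invocation, after sourcing the functions:

    db_exceeds_ram "$(db_kib "$MAIL/.notmuch/xapian")" "$(mem_total_kib)" \
        && echo "index is larger than RAM"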
* Re: Notmuch new speed degradation
From: Dmitry Bogatov @ 2014-07-24 19:49 UTC
To: Austin Clements; +Cc: notmuch

* Austin Clements <amdragon@MIT.EDU> [2014-07-24 10:32:14-0400]
> Hi Dmitry. My guess is that you've exceeded your OS buffer cache
> size by enough that most B-tree reads are going to disk at least once.
> How big is your database (du -h $MAIL/.notmuch/xapian) and what does
> free -h report on that computer? Also, is this on an SSD or an HDD?

13 GB on HDD, 9 GB after compact. Compact did not improve indexing
speed, unfortunately. Maybe it is possible to somehow merge databases?

             total       used       free     shared    buffers     cached
Mem:          7,7G       6,5G       1,2G       240M       826M       3,6G
-/+ buffers/cache:       2,1G       5,6G
Swap:         1,9G        66M       1,8G
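For readers unfamiliar with this older free layout: the "-/+ buffers/cache" line is derived from the "Mem:" line by moving buffers and cached memory from "used" to "free". A small sketch of that arithmetic follows. The function name is hypothetical, and the inputs are the rounded figures from the table, with 826M taken as roughly 0.8 GiB.

```shell
#!/bin/sh
# buffers_cache_line USED FREE BUFFERS CACHED
# All four arguments must be in the same unit (GiB here). Prints the "used"
# and "free" figures of the "-/+ buffers/cache" line:
#   used - buffers - cached   and   free + buffers + cached
# LC_ALL=C keeps awk printing a decimal point regardless of the user's locale.
buffers_cache_line() {
    LC_ALL=C awk -v u="$1" -v f="$2" -v b="$3" -v c="$4" \
        'BEGIN { printf "%.1f %.1f\n", u - b - c, f + b + c }'
}

# Values from the table above; 826M is approximated as 0.8 GiB.
buffers_cache_line 6.5 1.2 0.8 3.6   # prints: 2.1 5.6
```

The result matches the 2,1G and 5,6G in the table. Only about 3.6 GB is actually serving as file cache at the time of the snapshot, against a 9 GB index.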
* Re: Notmuch new speed degradation
From: Austin Clements @ 2014-07-24 22:31 UTC
To: Dmitry Bogatov; +Cc: notmuch

Quoth Dmitry Bogatov on Jul 24 at 11:49 pm:
> * Austin Clements <amdragon@MIT.EDU> [2014-07-24 10:32:14-0400]
> > Hi Dmitry. My guess is that you've exceeded your OS buffer cache
> > size by enough that most B-tree reads are going to disk at least once.
> > How big is your database (du -h $MAIL/.notmuch/xapian) and what does
> > free -h report on that computer? Also, is this on an SSD or an HDD?
>
> 13 GB on HDD, 9 GB after compact. Compact did not improve indexing
> speed, unfortunately. Maybe it is possible to somehow merge databases?

Unfortunately, there's no support for merging databases. Beyond
technical difficulties like identifying messages that should belong
to the same thread during a merge, the schema wasn't designed with
this in mind and uses various features that are incompatible with
merging.

There are some known problems with Xapian slowing down as the
database gets larger, but four seconds per message still sounds
extreme.

Another thing to try is to raise Xapian's flush threshold by setting
the environment variable XAPIAN_FLUSH_THRESHOLD. The default is
10000. Try increasing it by, say, an order of magnitude (you can
probably go much higher than that, though you don't want to go too
high or you'll start eating into the memory for your page cache).

>              total       used       free     shared    buffers     cached
> Mem:          7,7G       6,5G       1,2G       240M       826M       3,6G
> -/+ buffers/cache:       2,1G       5,6G
> Swap:         1,9G        66M       1,8G

Hmm. Was this after the compact or after notmuch new had run for a
while? 1.2 GB of free memory suggests that it's not a page cache
problem, but that would only apply if you took this snapshot after
notmuch new, not after compact.

We should confirm that this is an IO problem. If you run
/usr/bin/time notmuch new for a few minutes, is the %CPU
significantly below 100%? If it's above 90%ish, then this is a CPU
problem and we might be able to track it down using CPU profiling.
If it is an IO problem (which it almost certainly is), I'm afraid
it's much harder to track down.

Also, what file system are you using?
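The two experiments in this message (a higher flush threshold, and reading %CPU from time) can be sketched as shell helpers. The function names are hypothetical. The 90% cut-off is the rough figure from the message, not a hard rule. The %CPU formula is the one GNU time uses: user plus system time divided by elapsed time.

```shell
#!/bin/sh
# Hypothetical helpers for the two experiments described above.

# with_flush_threshold N CMD [ARGS...]: run CMD with XAPIAN_FLUSH_THRESHOLD=N
# in its environment, without exporting the variable in the calling shell.
with_flush_threshold() {
    threshold=$1
    shift
    env XAPIAN_FLUSH_THRESHOLD="$threshold" "$@"
}

# cpu_percent USER SYS ELAPSED: CPU share in percent,
# 100 * (user + sys) / elapsed, with all three values in seconds.
# Fails (prints nothing) if ELAPSED is not positive.
cpu_percent() {
    LC_ALL=C awk -v u="$1" -v s="$2" -v e="$3" \
        'BEGIN { if (e <= 0) exit 1; printf "%.0f\n", 100 * (u + s) / e }'
}

# bottleneck PERCENT: apply the rough 90% cut-off suggested in the message.
bottleneck() {
    if [ "$1" -ge 90 ]; then echo "cpu"; else echo "io"; fi
}
```

For example, to try ten times the stated default of 10000 while timing the run:

    with_flush_threshold 100000 /usr/bin/time notmuch new

A run that spent 10 s in user mode and 2 s in system mode over 60 s of wall-clock time gives cpu_percent 10 2 60 = 20, which bottleneck classifies as "io".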