unofficial mirror of notmuch@notmuchmail.org
* Notmuch performance (literally, in my case)
@ 2010-03-15  5:59 Ben Gamari
  2010-03-15  9:04 ` Hans Dieter Pearcey
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Gamari @ 2010-03-15  5:59 UTC (permalink / raw)
  To: notmuch

Out of curiosity: How does notmuch perform for you?

I just started using it on a daily basis last week. I love the power that
notmuch provides and the emacs interface is surprisingly usable after you've
gotten used to it, but that being said, I've been quite surprised by how poorly
the kernel responds to the workload presented by notmuch.

Database updates in particular are awful; notmuch new will easily take 3
minutes to index 10 messages and for much of that time the machine will be
close to unusable: input device events are handled sporadically, music will
stutter, applications will go unresponsive for tens of seconds at a time; the
traits usually associated with pagefile thrashing. Even something as simple as
saving a file or starting a terminal can take tens of seconds. Considering I
have notmuch new run in my crontab, this gets old quite quickly. It's really
quite awful.

As far as I can tell, this is a result of the horrendous behavior fsync()
invokes in the kernel. I find that performance also suffers in similar ways when
doing backups with rsync, which also seems to use fsync(). During these slow
periods, I/O wait time dominates top while disk throughput hovers at less than
1MByte/second. I have 4GB of memory and a fairly fast hard drive (for a laptop),
yet somehow the system is still barely usable. Meanwhile, latencytop shows large
amounts of time (sometimes 30 seconds or more) spent handling page faults.
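
For anyone who wants to compare numbers, here is a minimal sketch of a way to
time individual fsync() calls from user space. It is not what notmuch or Xapian
do internally; the function name time_fsyncs and its parameters are made up for
illustration. Point it at a directory on the filesystem in question, since the
default temporary directory may live on tmpfs, where fsync() is nearly free.

```python
import os
import tempfile
import time


def time_fsyncs(directory, rounds=10, block=4096):
    """Append `block` bytes and fsync, `rounds` times.

    Returns the list of per-call fsync latencies in seconds.
    (Hypothetical helper, not part of notmuch or Xapian.)
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        payload = b"x" * block
        for _ in range(rounds):
            os.write(fd, payload)
            start = time.monotonic()
            os.fsync(fd)  # blocks until the kernel reports the data as flushed
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    results = time_fsyncs(tempfile.gettempdir())
    print("max fsync latency:  %.3f s" % max(results))
    print("mean fsync latency: %.3f s" % (sum(results) / len(results)))
```

Running this once on an idle machine and once while notmuch new is indexing
should give a rough idea of how much other writers are being held up.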

Has anyone else observed similarly poor behavior? I am currently using
btrfs on this machine, although ext4 doesn't seem to be any better. Notmuch is
using xapian 1.08-1.99karmic from the Xapian backports PPA, which I believe
includes the recent database update optimizations.

I would really like to get to the bottom of this behavior. There have been many
attempts[1-8] in the past, but to this day the kernel still seems to
suffer under these sorts of workloads. Anyway, I'm just wondering if you all are
seeing similar issues. I've never had so reliable a means of reproducing these
latencies, but I think I might bring the issue to the LKML again if I get some
responses. Any feedback in either direction would be greatly appreciated.

Thanks!

- Ben


[1] http://bugzilla.kernel.org/show_bug.cgi?id=5900
[2] http://bugzilla.kernel.org/show_bug.cgi?id=7372
[3] http://lkml.org/lkml/2009/5/16/225
[4] http://lkml.org/lkml/2009/4/28/24
[5] http://lkml.org/lkml/2007/7/21/219
[6] http://lkml.org/lkml/2009/3/26/72
[7] http://lwn.net/Articles/328363/
[8] http://lkml.org/lkml/2009/4/6/114

* Re: Notmuch performance (literally, in my case)
  2010-03-15  5:59 Notmuch performance (literally, in my case) Ben Gamari
@ 2010-03-15  9:04 ` Hans Dieter Pearcey
  2010-03-15  9:29   ` Olly Betts
  0 siblings, 1 reply; 11+ messages in thread
From: Hans Dieter Pearcey @ 2010-03-15  9:04 UTC (permalink / raw)
  To: Ben Gamari, notmuch

On Sun, 14 Mar 2010 22:59:28 -0700 (PDT), Ben Gamari <bgamari.foss@gmail.com> wrote:
> Notmuch is using xapian 1.08-1.99karmic from the Xapian backports PPA, which
> I believe includes the recent database update optimizations.

As far as I know, it doesn't.  1.0.18 is the stable version in which it was
fixed.

http://trac.xapian.org/wiki/ReleaseOverview/1.0.18

hdp.

* Re: Notmuch performance (literally, in my case)
  2010-03-15  9:04 ` Hans Dieter Pearcey
@ 2010-03-15  9:29   ` Olly Betts
  2010-03-15 17:29     ` Ben Gamari
  0 siblings, 1 reply; 11+ messages in thread
From: Olly Betts @ 2010-03-15  9:29 UTC (permalink / raw)
  To: notmuch

On 2010-03-15, Hans Dieter Pearcey wrote:
> On Sun, 14 Mar 2010 22:59:28 -0700 (PDT), Ben Gamari wrote:
>> Notmuch is using xapian 1.08-1.99karmic from the Xapian backports PPA, which
>> I believe includes the recent database update optimizations.
>
> As far as I know, it doesn't.  1.0.18 is the stable version in which it was
> fixed.

1.0.18 is also the version that's in the PPA - 1.08 has to be a typo as the
PPA tracks current releases closely, and 1.0.8 is 18 months old.

I've seen a similar issue reported with apt-xapian-index in Ubuntu (it uses
Xapian to maintain a database of packages).  But I've never seen anything
like this myself, despite running Ubuntu on my laptop and spending a lot of
my time building Xapian databases.

Xapian's commit operation currently writes data and then calls fdatasync(),
on several files one after another.  That sounds a lot like a bad case in
one of the mails you linked to.

Can you try this patch (you'll need to rebuild Xapian from source, and
depending where you install it, perhaps set LD_LIBRARY_PATH to ensure the new
build gets used):

http://oligarchy.co.uk/xapian/patches/xapian-1.0.18-flint-group-fsyncs.patch

What this does is at least pair up the calls to fdatasync().  It's
possible to move them all together, but requires more effort, so it'd be
nice to know if this is actually going to help.
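
To make the difference concrete, here is a rough sketch of the two orderings.
This is not Xapian's code and not the patch itself (which only pairs up the
calls); the function names and file handling are assumptions chosen to show
the general idea of writing everything first and syncing afterwards.

```python
import os
import tempfile

# os.fdatasync is not available on every platform; fall back to os.fsync.
_datasync = getattr(os, "fdatasync", os.fsync)

_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_APPEND


def commit_interleaved(paths, payload):
    """Write and sync each file in turn: write A, sync A, write B, sync B."""
    for path in paths:
        fd = os.open(path, _FLAGS, 0o644)
        try:
            os.write(fd, payload)
            _datasync(fd)
        finally:
            os.close(fd)


def commit_grouped(paths, payload):
    """Write every file first, then issue the syncs back to back."""
    fds = []
    try:
        for path in paths:
            fd = os.open(path, _FLAGS, 0o644)
            fds.append(fd)
            os.write(fd, payload)
        for fd in fds:
            _datasync(fd)
    finally:
        for fd in fds:
            os.close(fd)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    tables = [os.path.join(workdir, name) for name in ("a", "b", "c")]
    commit_interleaved(tables, b"first batch\n")
    commit_grouped(tables, b"second batch\n")
```

Both functions leave the same bytes on disk; only the order in which the
kernel is asked to flush differs.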

Cheers,
    Olly

* Re: Notmuch performance (literally, in my case)
  2010-03-15  9:29   ` Olly Betts
@ 2010-03-15 17:29     ` Ben Gamari
  2010-03-16 11:08       ` Olly Betts
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Gamari @ 2010-03-15 17:29 UTC (permalink / raw)
  To: Olly Betts, notmuch

On Mon, 15 Mar 2010 09:29:35 +0000 (UTC), Olly Betts <olly@survex.com> wrote:
> On 2010-03-15, Hans Dieter Pearcey wrote:
> > On Sun, 14 Mar 2010 22:59:28 -0700 (PDT), Ben Gamari wrote:
> >> Notmuch is using xapian 1.08-1.99karmic from the Xapian backports PPA, which
> >> I believe includes the recent database update optimizations.
> >
> > As far as I know, it doesn't.  1.0.18 is the stable version in which it was
> > fixed.
> 
> 1.0.18 is also the version that's in the PPA - 1.08 has to be a typo as the
> PPA tracks current releases closely, and 1.0.8 is 18 months old.

Yep, my bad. That was a typo.

> 
> I've seen a similar issue reported with apt-xapian-index in Ubuntu (it uses
> Xapian to maintain a database of packages).  But I've never seen anything
> like this myself, despite running Ubuntu on my laptop and spending a lot of
> my time building Xapian databases.
>
> Can you try this patch (you'll need to rebuild Xapian from source, and
> depending where you install it, perhaps set LD_LIBRARY_PATH to ensure the new
> build gets used):
> 
> http://oligarchy.co.uk/xapian/patches/xapian-1.0.18-flint-group-fsyncs.patch
> 
> What this does is at least pair up the calls to fdatasync().  It's
> possible to move them all together, but requires more effort, so it'd be
> nice to know if this is actually going to help.
> 

This does seem to help. Of course, latency is a difficult thing to measure, but
notmuch does _feel_ faster. That being said, iostat still only shows
700kByte/second read and 300kByte/second write, so things haven't changed on
the throughput side of things.

* Re: Notmuch performance (literally, in my case)
  2010-03-15 17:29     ` Ben Gamari
@ 2010-03-16 11:08       ` Olly Betts
  2010-03-16 15:37         ` Ben Gamari
  0 siblings, 1 reply; 11+ messages in thread
From: Olly Betts @ 2010-03-16 11:08 UTC (permalink / raw)
  To: Ben Gamari; +Cc: notmuch

On Mon, Mar 15, 2010 at 10:29:36AM -0700, Ben Gamari wrote:
> On Mon, 15 Mar 2010 09:29:35 +0000 (UTC), Olly Betts <olly@survex.com> wrote:
> > http://oligarchy.co.uk/xapian/patches/xapian-1.0.18-flint-group-fsyncs.patch
> > 
> > What this does is at least pair up the calls to fdatasync().  It's
> > possible to move them all together, but requires more effort, so it'd be
> > nice to know if this is actually going to help.
> 
> This does seem to help. Of course, latency is a difficult thing to measure,
> but notmuch does _feel_ faster. That being said, iostat still only shows
> 700kByte/second read and 300kByte/second write, so things haven't changed on
> the throughput side of things.

For the issue of a background task interfering with interactive use, the feel
arguably matters more than the throughput.

I'll probably put that patch in 1.0.19, and look at moving all the fdatasync()
calls together.  This is http://trac.xapian.org/ticket/426 BTW.

The kernel should be able to handle this workload better though, so I would
say it was worthwhile to bring up on LKML if you have the energy.  It certainly
isn't just you, as apt-xapian-index seems to trigger it for some Ubuntu users,
and madduck mentioned it on #notmuch a week or so ago.

Cheers,
    Olly

* Re: Notmuch performance (literally, in my case)
  2010-03-16 11:08       ` Olly Betts
@ 2010-03-16 15:37         ` Ben Gamari
  2010-03-16 17:10           ` Aneesh Kumar K. V
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Gamari @ 2010-03-16 15:37 UTC (permalink / raw)
  To: Olly Betts, martin f krafft; +Cc: notmuch

On Tue, 16 Mar 2010 11:08:47 +0000, Olly Betts <olly@survex.com> wrote:
> For the issue of a background task interfering with interactive use, the feel
> arguably matters more than the throughput.
> 
> I'll probably put that patch in 1.0.19, and look at moving all the fdatasync()
> calls together.  This is http://trac.xapian.org/ticket/426 BTW.
> 
> The kernel should be able to handle this workload better though, so I would
> say it was worthwhile to bring up on LKML if you have the energy.  It certainly
> isn't just you, as apt-xapian-index seems to trigger it for some Ubuntu users,
> and madduck mentioned it on #notmuch a week or so ago.

Alright. This issue has been bothering me for a very long time and it's frankly
pretty pathetic how badly the kernel falls apart under this sort of workload.
I just wrote up a message (4b9fa440.12135e0a.7fc8.ffffe745@mx.google.com), so
we'll see what happens. In the past kernel developers have been very eager to
write this issue off as not reproducible enough (perhaps wisely), so if anyone
has anything to say, please contribute it to the thread.

Thanks!

- Ben

* Re: Notmuch performance (literally, in my case)
  2010-03-16 15:37         ` Ben Gamari
@ 2010-03-16 17:10           ` Aneesh Kumar K. V
  2010-03-16 17:22             ` Ben Gamari
  2010-03-16 18:00             ` martin f krafft
  0 siblings, 2 replies; 11+ messages in thread
From: Aneesh Kumar K. V @ 2010-03-16 17:10 UTC (permalink / raw)
  To: Ben Gamari, Olly Betts, martin f krafft; +Cc: notmuch

On Tue, 16 Mar 2010 08:37:54 -0700 (PDT), Ben Gamari <bgamari.foss@gmail.com> wrote:
> On Tue, 16 Mar 2010 11:08:47 +0000, Olly Betts <olly@survex.com> wrote:
> > For the issue of a background task interfering with interactive use, the feel
> > arguably matters more than the throughput.
> > 
> > I'll probably put that patch in 1.0.19, and look at moving all the fdatasync()
> > calls together.  This is http://trac.xapian.org/ticket/426 BTW.
> > 
> > The kernel should be able to handle this workload better though, so I would
> > say it was worthwhile to bring up on LKML if you have the energy.  It certainly
> > isn't just you, as apt-xapian-index seems to trigger it for some Ubuntu users,
> > and madduck mentioned it on #notmuch a week or so ago.
> 
> Alright. This issue has been bothering me for a very long time and it's frankly
> pretty pathetic how badly the kernel falls apart under this sort of workload.
> I just wrote up a message (4b9fa440.12135e0a.7fc8.ffffe745@mx.google.com), so
> we'll see what happens. In the past kernel developers have been very eager to
> write this issue off as not reproducible enough (perhaps wisely), so if anyone
> has anything to say, please contribute it to the thread.
> 

The ext3 fsync-related issue is a known problem due to the way journalling is
handled in ext3. The solution for that would be data=writeback (with
its loss of data integrity) or the not-yet-upstreamed data=guarded. Another
option would be to try ext4, which should not be impacted that badly by
the data=ordered journalling mode.
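
To see which journalling mode a filesystem is mounted with, one option is to
look for a data= entry among its mount options. The sketch below parses text
in the /proc/mounts format; the helper name data_mode is made up for
illustration. As a caveat, the kernel may leave data= out of the listing when
the default mode is in use, so a result of None does not by itself mean that
no journalling mode applies.

```python
def data_mode(mounts_text, mount_point):
    """Return the data= mount option for mount_point, or None if absent.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    If a mount point is listed more than once, the last entry wins.
    """
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            mode = None
            for option in fields[3].split(","):
                if option.startswith("data="):
                    mode = option[len("data="):]
    return mode


if __name__ == "__main__":
    with open("/proc/mounts") as mounts:
        print(data_mode(mounts.read(), "/"))
```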

-aneesh

* Re: Notmuch performance (literally, in my case)
  2010-03-16 17:10           ` Aneesh Kumar K. V
@ 2010-03-16 17:22             ` Ben Gamari
  2010-03-16 18:00             ` martin f krafft
  1 sibling, 0 replies; 11+ messages in thread
From: Ben Gamari @ 2010-03-16 17:22 UTC (permalink / raw)
  To: Aneesh Kumar K. V, Olly Betts, martin f krafft; +Cc: notmuch

On Tue, 16 Mar 2010 22:40:17 +0530, "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote:
> The ext3 fsync-related issue is a known problem due to the way journalling is
> handled in ext3. The solution for that would be data=writeback (with
> its loss of data integrity) or the not-yet-upstreamed data=guarded. Another
> option would be to try ext4, which should not be impacted that badly by
> the data=ordered journalling mode.
> 
This problem occurred for me not only with ext4 but also btrfs.

- Ben

* Re: Notmuch performance (literally, in my case)
  2010-03-16 17:10           ` Aneesh Kumar K. V
  2010-03-16 17:22             ` Ben Gamari
@ 2010-03-16 18:00             ` martin f krafft
  2010-03-16 19:18               ` Ben Gamari
  2011-07-28 17:29               ` martin f krafft
  1 sibling, 2 replies; 11+ messages in thread
From: martin f krafft @ 2010-03-16 18:00 UTC (permalink / raw)
  To: Aneesh Kumar K. V; +Cc: Olly Betts, notmuch

also sprach Aneesh Kumar K. V <aneesh.kumar@linux.vnet.ibm.com> [2010.03.16.1810 +0100]:
> The ext3 fsync-related issue is a known problem due to the way journalling is
> handled in ext3. The solution for that would be data=writeback (with
> its loss of data integrity) or the not-yet-upstreamed data=guarded. Another
> option would be to try ext4, which should not be impacted that badly by
> the data=ordered journalling mode.

I use ext4 with data=ordered, and while notmuch is writing the
Xapian database, most I/O stalls on the machine:

  - Firefox does not get any mouse events
  - Vim blocks writing the viminfo file
  - All disk operations queue for multiple seconds.

So no, ext4 is not a solution. Is it just me, or should no
filesystem of this world be able to hog a system this badly? I think
the culprit is the IO-scheduler.

-- 
martin | http://madduck.net/ | http://two.sentenc.es/
 
"vulgarity is simply the conduct of other people."
                                                        -- oscar wilde
 
spamtraps: madduck.bogus@madduck.net

* Re: Notmuch performance (literally, in my case)
  2010-03-16 18:00             ` martin f krafft
@ 2010-03-16 19:18               ` Ben Gamari
  2011-07-28 17:29               ` martin f krafft
  1 sibling, 0 replies; 11+ messages in thread
From: Ben Gamari @ 2010-03-16 19:18 UTC (permalink / raw)
  To: martin f krafft, Aneesh Kumar K. V; +Cc: Olly Betts, notmuch

On Tue, 16 Mar 2010 19:00:52 +0100, martin f krafft <madduck@madduck.net> wrote:
> I use ext4 with data=ordered, and while notmuch is writing the
> Xapian database, most I/O stalls on the machine:
> 
>   - Firefox does not get any mouse events
>   - Vim blocks writing the viminfo file
>   - All disk operations queue for multiple seconds.
> 
> So no, ext4 is not a solution. Is it just me, or should no
> filesystem of this world be able to hog a system this badly? I think
> the culprit is the IO-scheduler.
> 
In my uninformed opinion, I think it more likely that the dominant factor is
the page reclaim code. Even noop scheduling is pretty bad.
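
For anyone repeating the scheduler comparison: the active scheduler for a
block device is the bracketed entry in /sys/block/<dev>/queue/scheduler. A
small sketch for reading it follows; the helper name active_scheduler and the
device name "sda" are assumptions, so substitute your own disk.

```python
def active_scheduler(text):
    """Return the bracketed scheduler name from sysfs text, or None.

    Example input: "noop deadline [cfq]"
    """
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


if __name__ == "__main__":
    # "sda" is a placeholder device name.
    with open("/sys/block/sda/queue/scheduler") as handle:
        print(active_scheduler(handle.read()))
```

Switching schedulers is done by writing one of the listed names to the same
file as root.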

- Ben

* Re: Notmuch performance (literally, in my case)
  2010-03-16 18:00             ` martin f krafft
  2010-03-16 19:18               ` Ben Gamari
@ 2011-07-28 17:29               ` martin f krafft
  1 sibling, 0 replies; 11+ messages in thread
From: martin f krafft @ 2011-07-28 17:29 UTC (permalink / raw)
  To: Aneesh Kumar K. V, Ben Gamari, Olly Betts, notmuch

also sprach martin f krafft <madduck@madduck.net> [2010.03.16.1900 +0100]:
> I use ext4 with data=ordered, and while notmuch is writing the
> Xapian database, most I/O stalls on the machine:
> 
>   - Firefox does not get any mouse events
>   - Vim blocks writing the viminfo file
>   - All disk operations queue for multiple seconds.
> 
> So no, ext4 is not a solution. Is it just me, or should no
> filesystem of this world be able to hog a system this badly? I think
> the culprit is the IO-scheduler.

I just wanted to send a little update on this. Even though the Linux I/O
scheduler performs abysmally during the Xapian database updates,
I can report two improvements, at least to my situation:

  1. The 3.0 kernel seems to be better, but I did not quantify this
     in any way, and I might just as well be wrong.

  2. http://bugs.debian.org/635768 explains the (also I/O-related)
     lockups we've seen. Micah offered the tip that the actual fault
     lies with the awesome WM.

Cheers,

-- 
martin | http://madduck.net/ | http://two.sentenc.es/
 
"writing a book about debian
 is like hitting a moving target
 with a champagne bottle cork."
                                                             -- arky
 
spamtraps: madduck.bogus@madduck.net
