unofficial mirror of notmuch@notmuchmail.org
* Notmuch success: Xapian database corrupt
@ 2010-04-18 14:18 John Fremlin
  2010-04-19  6:40 ` Sebastian Spaeth
  2010-04-22  0:17 ` Carl Worth
  0 siblings, 2 replies; 6+ messages in thread
From: John Fremlin @ 2010-04-18 14:18 UTC (permalink / raw)
  To: notmuch

First off, thanks for making notmuch, it's a really good idea and it
works generally very well.

When I run notmuch new, it processes mail nicely for a while:

Processed 58 files (19 files/sec.)

then, after crunching through only a few more emails, it aborts:

terminate called after throwing an instance of 'Xapian::DatabaseCorruptError'
Aborted (core dumped)

Is there any way to recover the database? Notmuch search works well and
it takes absolutely ages (one or two days) to add my mail to it; and I
would suspect that it might happen again . . . I'm on Ubuntu lucid with
an Intel SSD.


* Re: Notmuch success: Xapian database corrupt
  2010-04-18 14:18 Notmuch success: Xapian database corrupt John Fremlin
@ 2010-04-19  6:40 ` Sebastian Spaeth
  2010-04-22  0:17 ` Carl Worth
  1 sibling, 0 replies; 6+ messages in thread
From: Sebastian Spaeth @ 2010-04-19  6:40 UTC (permalink / raw)
  To: John Fremlin, notmuch

On 2010-04-18, John Fremlin wrote:
> Processed 58 files (19 files/sec.)

That seems exceptionally low. I get about 60-70 files/sec on a laptop
hard disk.
> 
> Is there any way to recover the database?

I am no expert on Xapian databases, and this might seem obvious, but
you did run notmuch dump to save your tags, didn't you? If that works, you
can nuke your database directory and, after a notmuch new (1-2 days?!!),
you can notmuch restore your tags. That way, at least, you won't lose
any data.
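
A minimal sketch of that dump / delete / re-index / restore cycle, for
illustration only. It assumes a reasonably current notmuch in which
`notmuch dump --output=FILE` and `notmuch restore --input=FILE` are
available, and that the index lives in `.notmuch/xapian` under the mail
root (as described later in this thread). The function names, the `run`
parameter and the paths passed in are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def rebuild_commands(tag_dump: Path) -> list[list[str]]:
    """Return the notmuch invocations for the dump / re-index / restore cycle."""
    return [
        ["notmuch", "dump", f"--output={tag_dump}"],    # save tags first
        ["notmuch", "new"],                             # re-index all mail (slow)
        ["notmuch", "restore", f"--input={tag_dump}"],  # re-apply saved tags
    ]


def rebuild(mail_root: Path, tag_dump: Path, run=subprocess.run) -> None:
    """Dump tags, delete the Xapian index, re-index, then restore tags."""
    dump, reindex, restore = rebuild_commands(tag_dump)
    # If the dump fails (the database may be too damaged to read),
    # check=True raises here, before anything has been deleted.
    run(dump, check=True)
    # Only the Xapian index is removed; the mail files are left untouched.
    shutil.rmtree(mail_root / ".notmuch" / "xapian")
    run(reindex, check=True)
    run(restore, check=True)
```

Something like `rebuild(Path.home() / "mail", Path.home() / "tags.dump")`
would run the whole cycle; keeping the tag dump outside the database
directory means it survives the deletion step.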

Sebastian


* Re: Notmuch success: Xapian database corrupt
  2010-04-18 14:18 Notmuch success: Xapian database corrupt John Fremlin
  2010-04-19  6:40 ` Sebastian Spaeth
@ 2010-04-22  0:17 ` Carl Worth
  2010-04-22  2:37   ` Ben Gamari
  2010-04-22  7:19   ` John Fremlin
  1 sibling, 2 replies; 6+ messages in thread
From: Carl Worth @ 2010-04-22  0:17 UTC (permalink / raw)
  To: John Fremlin, notmuch


On Sun, 18 Apr 2010 14:18:09 +0000, John Fremlin <john@fremlin.org> wrote:
> terminate called after throwing an instance of 'Xapian::DatabaseCorruptError'
> Aborted (core dumped)
> 
> Is there any way to recover the database? Notmuch search works well and
> it takes absolutely ages (one or two days) to add my mail to it; and I
> would suspect that it might happen again . . . I'm on Ubuntu lucid with
> an Intel SSD.

Hi John,

Welcome to notmuch, and I'm so sorry to hear that your initial attempt
to use it was so frustrating.

I'm not aware of any bugs in notmuch that can result in a corrupt Xapian
database. In fact, this can't be a bug in notmuch alone (since Xapian is
detecting the corruption). There must at least be a bug in Xapian or
else some lower-level failure is occurring (disk full?) that Xapian
can't deal with.

I've not yet encountered a corrupt Xapian database, so I'm afraid I
don't have any tips to help you with that.

But I'm also surprised to hear that it takes you days to incorporate
your mail into a notmuch database. I have over 600 thousand messages
myself, and it takes a few hours (maybe 4?) to incorporate all of these
messages, but not days, (also with an Intel SSD).

So there's some performance problem that you're having in addition to
the database corruption. Hopefully we can figure that out. What kernel
and filesystem are you using? Are you using an encrypted partition?

-Carl



* Re: Notmuch success: Xapian database corrupt
  2010-04-22  0:17 ` Carl Worth
@ 2010-04-22  2:37   ` Ben Gamari
  2010-04-22  7:19   ` John Fremlin
  1 sibling, 0 replies; 6+ messages in thread
From: Ben Gamari @ 2010-04-22  2:37 UTC (permalink / raw)
  To: Carl Worth, John Fremlin, notmuch

On Wed, 21 Apr 2010 17:17:16 -0700, Carl Worth <cworth@cworth.org> wrote:
>
> I'm not aware of any bugs in notmuch that can result in a corrupt Xapian
> database. In fact, this can't be a bug in notmuch alone (since Xapian is
> detecting the corruption). There must at least be a bug in Xapian or
> else some lower-level failure is occurring (disk full?) that Xapian
> can't deal with.
> 
> I've not yet encountered a corrupt Xapian database, so I'm afraid I
> don't have any tips to help you with that.
> 
Nor have I experienced any corruption issues. I'd say just hope that it
was an isolated incident.

> But I'm also surprised to hear that it takes you days to incorporate
> your mail into a notmuch database. I have over 600 thousand messages
> myself, and it takes a few hours (maybe 4?) to incorporate all of these
> messages, but not days, (also with an Intel SSD).
> 
6e5 messages / 4 hours = ~40 messages/s. I don't believe I have ever
seen more than 0 messages per second average on my box (granted, with a
spinning disk, but I'm generally getting 0.05 messages/second or so), so
you are not the only one experiencing such abysmal performance. I sent a
message[1] to the list about this a few weeks ago, and Olly and others
had some productive input, but nothing that seemed too promising as far
as fixing the issue. I then took the issue to the LKML[2], although this
hasn't resulted in much progress. I recently switched from ext4 to btrfs
and both are quite poor when it comes to notmuch performance, so I'm
honestly not entirely convinced the problem can be placed exclusively on
the file system.

I know that the disk is capable of 20MByte/second sustained (peak of
60MByte/second); however, I'm lucky to see a throughput of several
hundred kByte/second under the workload presented by notmuch. I have
plenty of perf/blktrace data of notmuch new sessions if anyone is
interested, but there were unfortunately no takers on the LKML.

I am under the impression that Xapian is doing some really
knuckle-headed things when it comes to fsync()ing and the like, but I
really have a difficult time believing that is the sole issue while
others are getting perfectly acceptable performance with spinning disks.
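
To illustrate why fsync() frequency can matter at all, here is a small
micro-benchmark sketch. It does not reproduce what Xapian actually does;
it only compares writing the same small records with an fsync after
every record against a single fsync at the end. The function name, the
record size and the counts are arbitrary assumptions.

```python
import os
import tempfile
import time


def timed_writes(path: str, records: int, sync_every: int) -> float:
    """Write `records` small records, calling fsync after every `sync_every`."""
    payload = b"x" * 512
    start = time.perf_counter()
    with open(path, "wb") as f:
        for i in range(1, records + 1):
            f.write(payload)
            if i % sync_every == 0:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the device
        f.flush()
        os.fsync(f.fileno())          # final sync so both variants end durable
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        often = timed_writes(os.path.join(d, "often"), 200, 1)
        rarely = timed_writes(os.path.join(d, "rarely"), 200, 200)
        print(f"fsync per record: {often:.3f}s, single fsync: {rarely:.3f}s")
```

How large the gap is depends heavily on the disk, the file system and
its mount options, which is consistent with different people in this
thread seeing very different numbers.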

I would love to get this issue solved, but my experience is definitely
quite limited in the file system/block I/O department and the semester
is definitely severely limiting the amount of time I am able to invest
in the problem, so I find myself pretty much at the mercy of whoever has
time to parse the data. If you are that person, I would be elated to
provide you with whatever data you might want/need.

- Ben


[1] id:20100315090401.GA29891@glaive.weftsoar.net
[2] id:4b9fa440.12135e0a.7fc8.ffffe745@mx.google.com


* Re: Notmuch success: Xapian database corrupt
  2010-04-22  0:17 ` Carl Worth
  2010-04-22  2:37   ` Ben Gamari
@ 2010-04-22  7:19   ` John Fremlin
  2010-04-23 19:04     ` Carl Worth
  1 sibling, 1 reply; 6+ messages in thread
From: John Fremlin @ 2010-04-22  7:19 UTC (permalink / raw)
  To: notmuch

After the encouraging message from Sebastian, I deleted the
.notmuch/xapian dir and started again.

It went off at a good rate (300+ files/sec) and here was the final score:

Processed 494764 total files in 2h 54m 41s (47 files/sec.).                 
Added 226817 new messages to the database.
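
As a quick sanity check of the rates quoted in this thread, the
arithmetic can be sketched as below. The helper name is hypothetical.
Note that the first figure counts files processed while the second is
Carl's rough estimate in messages ("over 600 thousand" in "maybe 4"
hours), so the two are only loosely comparable.

```python
def files_per_second(files: int, hours: int = 0, minutes: int = 0,
                     seconds: int = 0) -> float:
    """Average throughput for `files` items over the given wall-clock time."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return files / elapsed


# This run: 494764 files in 2h 54m 41s.
this_run = files_per_second(494764, hours=2, minutes=54, seconds=41)
# Carl's estimate: about 600 thousand messages in about 4 hours.
carl_estimate = files_per_second(600_000, hours=4)

print(f"{this_run:.1f} files/sec")          # 47.2 files/sec
print(f"{carl_estimate:.1f} messages/sec")  # 41.7 messages/sec
```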

This is much faster than before. As I haven't changed the storage or the
filesystem (ext4,data=ordered over encrypted aes-xts-plain), I just
don't know what made the difference. My kernel is now 2.6.32-21-generic
#32-Ubuntu and I had an older one the first try a month or so ago.

Carl Worth <cworth@cworth.org> writes:
[...]
> Welcome to notmuch, and I'm so sorry to hear that your initial attempt
> to use it was so frustrating.

Thanks for the welcome! I was initially impressed by it but rather
worried about relying on it after the database corruption.

> I'm not aware of any bugs in notmuch that can result in a corrupt Xapian
> database. In fact, this can't be a bug in notmuch alone (since Xapian is
> detecting the corruption). There must at least be a bug in Xapian or
> else some lower-level failure is occurring (disk full?) that Xapian
> can't deal with.

Disk full is quite likely. I'll try to avoid that in future.
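
One way to avoid that could be a free-space check before each indexing
run. This is only a sketch: the helper name and the threshold are my own
assumptions, and how much headroom the index really needs while it is
being built is not something I have measured.

```python
import shutil


def enough_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the file system holding `path` has enough free space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Hypothetical threshold: require 5 GiB free before indexing.
    if enough_free_space(".", 5 * 1024**3):
        print("enough space, safe to run notmuch new")
    else:
        print("low on disk space, not starting notmuch new")
```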

[...]
> So there's some performance problem that you're having in addition to
> the database corruption. Hopefully we can figure that out. What kernel
> and filesystem are you using? Are you using an encrypted partition?

Happy to say (though frustrating for you), this time it's much
faster. Maybe because I had more disk free this time round so the Xapian
database became less fragmented? (Speculation, no evidence.) 


* Re: Notmuch success: Xapian database corrupt
  2010-04-22  7:19   ` John Fremlin
@ 2010-04-23 19:04     ` Carl Worth
  0 siblings, 0 replies; 6+ messages in thread
From: Carl Worth @ 2010-04-23 19:04 UTC (permalink / raw)
  To: John Fremlin, notmuch


On Thu, 22 Apr 2010 07:19:58 +0000, John Fremlin <john@fremlin.org> wrote:
> After the encouraging message from Sebastian, I deleted the
> .notmuch/xapian dir and started again.
> 
> It went off at a good rate (300+ files/sec) and here was the final score:
> 
> Processed 494764 total files in 2h 54m 41s (47 files/sec.).                 
> Added 226817 new messages to the database.

Nice. That's much more like what I'm accustomed to getting.

> This is much faster than before. As I haven't changed the storage or the
> filesystem (ext4,data=ordered over encrypted aes-xts-plain), I just
> don't know what made the difference. My kernel is now 2.6.32-21-generic
> #32-Ubuntu and I had an older one the first try a month or so ago.

Thanks for the details at least. Maybe other people having performance
problems can start finding correlations.

-Carl



