unofficial mirror of notmuch@notmuchmail.org
* How to recover from this permanent fatal error?
@ 2021-06-05  0:44 Felipe Contreras
  2021-06-05  1:37 ` David Bremner
  0 siblings, 1 reply; 11+ messages in thread
From: Felipe Contreras @ 2021-06-05  0:44 UTC (permalink / raw)
  To: notmuch@notmuchmail.org

Hello,

I can't use notmuch anymore, I get this error:

A Xapian exception occurred opening database: The revision being read
has been discarded - you should call Xapian::Database::reopen() and
retry the operation

Context: to investigate a bug in mbsync, I moved the folder
~/mail/.notmuch out of the way. I have a timer that calls notmuch new
after mbsync, so I paused that timer.

At first I used notmuch, only to see everything empty. Then I
remembered what I had done, removed all the files, and moved the
.notmuch directory back.

IIRC I was able to use notmuch without problems once, and then I got the issue.

All I can find about the issue is that somebody reported the exact
same message to the mailing list (id:87txmb4xyz.fsf@mcs.anl.gov), but
did not receive any feedback.

Ideas?

-- 
Felipe Contreras


* Re: How to recover from this permanent fatal error?
  2021-06-05  0:44 How to recover from this permanent fatal error? Felipe Contreras
@ 2021-06-05  1:37 ` David Bremner
  2021-06-05  1:40   ` Felipe Contreras
  0 siblings, 1 reply; 11+ messages in thread
From: David Bremner @ 2021-06-05  1:37 UTC (permalink / raw)
  To: Felipe Contreras, notmuch@notmuchmail.org; +Cc: xapian-discuss

Felipe Contreras <felipe.contreras@gmail.com> writes:

> Hello,
>
> I can't use notmuch anymore, I get this error:
>
> A Xapian exception occurred opening database: The revision being read
> has been discarded - you should call Xapian::Database::reopen() and
> retry the operation
>
> Context. In order to investigate a bug about mbsync I moved away the
> folder ~/mail/.notmuch. I have a timer that calls notmuch new after
> mbsync, so I paused that timer.
>
> Initially I used notmuch, only to see everything empty. Then I
> recalled what I did, removed all the files, and moved back the .notmuch
> directory.
>
> IIRC I was able to use notmuch without problems once, and then I got the issue.

Maybe the Xapian folk will have a more concrete suggestion, but I would
start by running xapian-check on the database. In your case I guess that
should be "xapian-check ~/mail/.notmuch".

You might have to install an extra package to get xapian-check. On
Debian it's part of xapian-tools.

d


* Re: How to recover from this permanent fatal error?
  2021-06-05  1:37 ` David Bremner
@ 2021-06-05  1:40   ` Felipe Contreras
  2021-06-05  2:43     ` Olly Betts
  0 siblings, 1 reply; 11+ messages in thread
From: Felipe Contreras @ 2021-06-05  1:40 UTC (permalink / raw)
  To: David Bremner; +Cc: notmuch@notmuchmail.org, xapian-discuss

On Fri, Jun 4, 2021 at 8:37 PM David Bremner <david@tethera.net> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:

> > I can't use notmuch anymore, I get this error:
> >
> > A Xapian exception occurred opening database: The revision being read
> > has been discarded - you should call Xapian::Database::reopen() and
> > retry the operation
> >
> > Context. In order to investigate a bug about mbsync I moved away the
> > folder ~/mail/.notmuch. I have a timer that calls notmuch new after
> > mbsync, so I paused that timer.
> >
> > Initially I used notmuch, only to see everything empty. Then I
> > recalled what I did, removed all the files, and moved back the .notmuch
> > directory.
> >
> > IIRC I was able to use notmuch without problems once, and then I got the issue.
>
> Maybe the Xapian folk will have a more concrete suggestion, but I would
> start by running xapian-check on the database. In your case I guess that
> should be "xapian-check ~/mail/.notmuch".

Actually `xapian-check ~/mail/.notmuch/xapian`, but I already did that:

Database couldn't be opened for reading: DatabaseModifiedError: The
revision being read has been discarded - you should call
Xapian::Database::reopen() and retry the operation
Continuing check anyway
docdata:
xapian-check: DatabaseCorruptError: Db block overwritten - are there
multiple writers?

`xapian-check ~/mail/.notmuch/xapian F` doesn't seem to change anything.
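
When the whole-database check stops at the first damaged table, it can
help to check each table file on its own to see how far the damage
extends. What follows is only a sketch: the function name check_tables
and the XAPIAN_CHECK override are made up for illustration, and it
assumes a glass database whose tables are the *.glass files in the
database directory.

```shell
# Run xapian-check on each glass table file separately and report the result.
# XAPIAN_CHECK is a hypothetical override so the loop can be tried with a stub.
check_tables() {
    db_dir=$1
    checker=${XAPIAN_CHECK:-xapian-check}
    failed=0
    for table in "$db_dir"/*.glass; do
        [ -e "$table" ] || continue      # the glob matched nothing
        if "$checker" "$table" >/dev/null 2>&1; then
            echo "ok $(basename "$table")"
        else
            echo "damaged $(basename "$table")"
            failed=1
        fi
    done
    # Succeed only if every table passed.
    [ "$failed" -eq 0 ]
}
```

Usage would be along the lines of `check_tables ~/mail/.notmuch/xapian`.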

Thanks for the prompt response though.

Cheers.

-- 
Felipe Contreras


* Re: How to recover from this permanent fatal error?
  2021-06-05  1:40   ` Felipe Contreras
@ 2021-06-05  2:43     ` Olly Betts
  2021-06-05 14:39       ` Felipe Contreras
  0 siblings, 1 reply; 11+ messages in thread
From: Olly Betts @ 2021-06-05  2:43 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: notmuch@notmuchmail.org, xapian-discuss

On Fri, Jun 04, 2021 at 08:40:56PM -0500, Felipe Contreras wrote:
> On Fri, Jun 4, 2021 at 8:37 PM David Bremner <david@tethera.net> wrote:
> > Felipe Contreras <felipe.contreras@gmail.com> writes:
> 
> > > I can't use notmuch anymore, I get this error:
> > >
> > > A Xapian exception occurred opening database: The revision being read
> > > has been discarded - you should call Xapian::Database::reopen() and
> > > retry the operation
> > >
> > > Context. In order to investigate a bug about mbsync I moved away the
> > > folder ~/mail/.notmuch. I have a timer that calls notmuch new after
> > > mbsync, so I paused that timer.
> > >
> > > Initially I used notmuch, only to see everything empty. Then I
> > > recalled what I did, removed all the files, and moved back the .notmuch
> > > directory.

Perhaps a process had the database or the empty replacement open for
writing over the moving aside or the moving back?  That could result
in a broken database.
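
One way to lower that risk is to refuse to move the directory while any
process still has a file under it open. This is a best-effort sketch,
not something notmuch or Xapian provides: move_db_aside and the LSOF
override are hypothetical names, and it relies on `lsof +D DIR` exiting
with status 0 only when it finds an open file. There is still a window
between the check and the move, so stopping timers and other writers
first remains necessary.

```shell
# Move a database directory aside only if lsof reports no open files under it.
# LSOF is a hypothetical override that makes the check easy to stub out.
move_db_aside() {
    db_dir=$1
    backup=$2
    lsof_cmd=${LSOF:-lsof}
    if ! command -v "$lsof_cmd" >/dev/null 2>&1; then
        echo "cannot check for open files: $lsof_cmd not found" >&2
        return 2
    fi
    # Exit status 0 from "lsof +D" means at least one open file was found.
    # A non-zero status can also mean lsof itself failed, so this is best effort.
    if "$lsof_cmd" +D "$db_dir" >/dev/null 2>&1; then
        echo "refusing to move $db_dir: a process still has it open" >&2
        return 1
    fi
    mv -- "$db_dir" "$backup"
}
```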

> `xapian-check ~/mail/.notmuch/xapian F` doesn't seem to change anything.

With some filing systems and older format (chert) Xapian databases, a
system crash or power failure could truncate to zero size the files
which tracked which blocks were in use and where the root of a
particular revision of the tree was; xapian-check's "fix" mode was added
to recreate those files by scanning the whole database to work out what
they should contain.

In newer format databases (glass) we eliminated these files.

The plan was to teach xapian-check how to recreate the `iamglass` file,
but that doesn't seem to suffer from the truncation problem and so it
hasn't actually been implemented yet and so "F" currently does nothing
for glass databases.
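
To tell which format a database uses before reaching for "F", one can
look for the version file in the database directory. A small sketch,
assuming the glass marker is the `iamglass` file mentioned above and
that chert uses a correspondingly named `iamchert` file; the function
name xapian_backend is made up for illustration.

```shell
# Guess the Xapian backend of a database directory from its version file.
xapian_backend() {
    db_dir=$1
    if [ -f "$db_dir/iamglass" ]; then
        echo glass
    elif [ -f "$db_dir/iamchert" ]; then
        echo chert
    else
        echo unknown
        return 1
    fi
}
```

For the database in this thread that would be called as
`xapian_backend ~/mail/.notmuch/xapian`.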

> > > IIRC I was able to use notmuch without problems once, and then I got the issue.
> >
> > Maybe the Xapian folk will have a more concrete suggestion, but I would
> > start by running xapian-check on the database. In your case I guess that
> > should be "xapian-check ~/mail/.notmuch".

I'd suggest trying this simple tool I wrote that can probably rescue the
tags from a broken notmuch database (the tags are the part notmuch can't
just recreate by reindexing):

https://git.xapian.org/?p=xapian;a=blob;f=README.notmuch;hb=refs/heads/notmuch-tag-rescue-hack

Once you have those, you can reindex your mail and then restore the
tags.
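
The reindex-and-restore step could look roughly like the sketch below.
It makes several assumptions: the rescued tags are in a file that
`notmuch restore` accepts, the mail root given matches the configured
database path, and `notmuch new` creates a fresh database when none
exists. The function name reindex_and_restore and the NOTMUCH override
are hypothetical. The broken database is moved outside the mail root so
that `notmuch new` does not scan it, and any hooks kept in the old
.notmuch directory would need copying back by hand.

```shell
# Move the broken database out of the way, reindex, then reapply saved tags.
# NOTMUCH is a hypothetical override so the sequence can be dry-run with a stub.
reindex_and_restore() {
    mail_root=$1     # e.g. ~/mail
    tag_dump=$2      # file holding the rescued tags
    backup=$3        # new location for the broken .notmuch, outside mail_root
    notmuch_cmd=${NOTMUCH:-notmuch}
    [ -f "$tag_dump" ] || { echo "tag dump not found: $tag_dump" >&2; return 1; }
    [ ! -e "$backup" ] || { echo "backup path already exists: $backup" >&2; return 1; }
    if [ -d "$mail_root/.notmuch" ]; then
        mv -- "$mail_root/.notmuch" "$backup" || return 1
    fi
    "$notmuch_cmd" new || return 1
    "$notmuch_cmd" restore --input="$tag_dump"
}
```

Running it first with NOTMUCH=echo prints the notmuch commands instead
of executing them; note that the move itself still happens.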

Cheers,
    Olly


* Re: How to recover from this permanent fatal error?
  2021-06-05  2:43     ` Olly Betts
@ 2021-06-05 14:39       ` Felipe Contreras
  2021-06-06  1:45         ` Olly Betts
  0 siblings, 1 reply; 11+ messages in thread
From: Felipe Contreras @ 2021-06-05 14:39 UTC (permalink / raw)
  To: Xapian Discussion, Felipe Contreras, David Bremner,
	notmuch@notmuchmail.org

On Fri, Jun 4, 2021 at 9:43 PM Olly Betts <olly@survex.com> wrote:
>
> On Fri, Jun 04, 2021 at 08:40:56PM -0500, Felipe Contreras wrote:
> > On Fri, Jun 4, 2021 at 8:37 PM David Bremner <david@tethera.net> wrote:
> > > Felipe Contreras <felipe.contreras@gmail.com> writes:
> >
> > > > I can't use notmuch anymore, I get this error:
> > > >
> > > > A Xapian exception occurred opening database: The revision being read
> > > > has been discarded - you should call Xapian::Database::reopen() and
> > > > retry the operation
> > > >
> > > > Context. In order to investigate a bug about mbsync I moved away the
> > > > folder ~/mail/.notmuch. I have a timer that calls notmuch new after
> > > > mbsync, so I paused that timer.
> > > >
> > > > Initially I used notmuch, only to see everything empty. Then I
> > > > recalled what I did, removed all the files, and moved back the .notmuch
> > > > directory.
>
> Perhaps a process had the database or the empty replacement open for
> writing over the moving aside or the moving back?  That could result
> in a broken database.

Perhaps.

> > `xapian-check ~/mail/.notmuch/xapian F` doesn't seem to change anything.

> In newer format databases (glass) we eliminated these files and
> currently the "fix" mode doesn't actually do anything for glass.
>
> The plan was to teach xapian-check how to recreate the `iamglass` file,
> but that doesn't seem to suffer from the truncation problem and so it
> hasn't actually been implemented yet and so "F" currently does nothing
> for glass databases.

Well, my databases seem to be glass.

> > > > IIRC I was able to use notmuch without problems once, and then I got the issue.
> > >
> > > Maybe the Xapian folk will have a more concrete suggestion, but I would
> > > start by running xapian-check on the database. In your case I guess that
> > > should be "xapian-check ~/mail/.notmuch".
>
> I'd suggest trying this simple tool I wrote that can probably rescue the
> tags from a broken notmuch database (the tags are the part notmuch can't
> just recreate by reindexing):
>
> https://git.xapian.org/?p=xapian;a=blob;f=README.notmuch;hb=refs/heads/notmuch-tag-rescue-hack

I can't seem to build it:

In file included from matcher/valuestreamdocument.h:24,
                 from matcher/postlisttree.h:26,
                 from matcher/andmaybepostlist.h:24,
                 from matcher/andmaybepostlist.cc:23:
./backends/documentinternal.h: In member function
‘Xapian::Document::Internal::remove_posting_result
Xapian::Document::Internal::remove_postings(const string&,
Xapian::termpos, Xapian::termpos, Xapian::termcount,
Xapian::termpos&)’:
./backends/documentinternal.h:339:29: error: ‘numeric_limits’ was not
declared in this scope
  339 |                 wdf_delta = numeric_limits<Xapian::termcount>::max();
      |                             ^~~~~~~~~~~~~~

I think I can live with starting from scratch again. However, I thought
perhaps there was an easy fix.

Cheers.

-- 
Felipe Contreras


* Re: How to recover from this permanent fatal error?
  2021-06-05 14:39       ` Felipe Contreras
@ 2021-06-06  1:45         ` Olly Betts
  2021-06-06  4:40           ` Felipe Contreras
  0 siblings, 1 reply; 11+ messages in thread
From: Olly Betts @ 2021-06-06  1:45 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Xapian Discussion, notmuch@notmuchmail.org

On Sat, Jun 05, 2021 at 09:39:28AM -0500, Felipe Contreras wrote:
> On Fri, Jun 4, 2021 at 9:43 PM Olly Betts <olly@survex.com> wrote:
> > I'd suggest trying this simple tool I wrote that can probably rescue the
> > tags from a broken notmuch database (the tags are the part notmuch can't
> > just recreate by reindexing):
> >
> > https://git.xapian.org/?p=xapian;a=blob;f=README.notmuch;hb=refs/heads/notmuch-tag-rescue-hack
> 
> I can't seem to build it:
[...]
> ./backends/documentinternal.h:339:29: error: ‘numeric_limits’ was not
> declared in this scope
>   339 |                 wdf_delta = numeric_limits<Xapian::termcount>::max();
>       |                             ^~~~~~~~~~~~~~

Oh, that's a missing header include which older compilers seemed to
not complain about - it was fixed on master a few months ago, and I've
just merged master to the branch to pick up the fix so it should build
now.

Cheers,
    Olly


* Re: How to recover from this permanent fatal error?
  2021-06-06  1:45         ` Olly Betts
@ 2021-06-06  4:40           ` Felipe Contreras
  2021-06-06 10:08             ` Olly Betts
  0 siblings, 1 reply; 11+ messages in thread
From: Felipe Contreras @ 2021-06-06  4:40 UTC (permalink / raw)
  To: Xapian Discussion, Felipe Contreras, David Bremner,
	notmuch@notmuchmail.org

On Sat, Jun 5, 2021 at 8:45 PM Olly Betts <olly@survex.com> wrote:
>
> On Sat, Jun 05, 2021 at 09:39:28AM -0500, Felipe Contreras wrote:
> > On Fri, Jun 4, 2021 at 9:43 PM Olly Betts <olly@survex.com> wrote:
> > > I'd suggest trying this simple tool I wrote that can probably rescue the
> > > tags from a broken notmuch database (the tags are the part notmuch can't
> > > just recreate by reindexing):
> > >
> > > https://git.xapian.org/?p=xapian;a=blob;f=README.notmuch;hb=refs/heads/notmuch-tag-rescue-hack
> >
> > I can't seem to build it:
> [...]
> > ./backends/documentinternal.h:339:29: error: ‘numeric_limits’ was not
> > declared in this scope
> >   339 |                 wdf_delta = numeric_limits<Xapian::termcount>::max();
> >       |                             ^~~~~~~~~~~~~~
>
> Oh, that's a missing header include which older compilers seemed to
> not complain about - it was fixed on master a few months ago, and I've
> just merged master to the branch to pick up the fix so it should build
> now.

This is what I get:

% xapian-core/bin/xapian-check ~/mail/.notmuch/xapian/termlist.glass
termlist:
xapian-core/bin/.libs/lt-xapian-check: DatabaseCorruptError: Db block
overwritten - are there multiple writers?

-- 
Felipe Contreras


* Re: How to recover from this permanent fatal error?
  2021-06-06  4:40           ` Felipe Contreras
@ 2021-06-06 10:08             ` Olly Betts
  2021-06-06 12:48               ` Felipe Contreras
  0 siblings, 1 reply; 11+ messages in thread
From: Olly Betts @ 2021-06-06 10:08 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Xapian Discussion, notmuch@notmuchmail.org

On Sat, Jun 05, 2021 at 11:40:58PM -0500, Felipe Contreras wrote:
> % xapian-core/bin/xapian-check ~/mail/.notmuch/xapian/termlist.glass
> termlist:
> xapian-core/bin/.libs/lt-xapian-check: DatabaseCorruptError: Db block
> overwritten - are there multiple writers?

Ah - this tool currently requires the termlist table to be undamaged
enough to at least scan through.

You could try commenting out the body of GlassTable::set_overwritten()
in xapian-core/backends/glass/glass_table.cc so it keeps going instead
of throwing this exception, which might allow it to usefully recover
some or all tags.  If you (or anyone) try that and it works let me know
and I can patch the branch to emit a warning message and continue there.

If the postlist table is readable it'd be possible to rescue the tag
data from there instead, but that's more complicated to do because
the tags would need collating for each message.
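
To illustrate what the collating step involves: reading the postlist
table yields one (tag, message) pair at a time, and those pairs have to
be grouped so that each message ends up with its full tag list. The
sketch below assumes a hypothetical intermediate format of one
"tag<TAB>message-id" pair per line (nothing in the thread produces
this) and emits lines shaped like notmuch's batch-tag dump format,
"+tag1 +tag2 -- id:message-id". It does not apply the encoding that
format requires for tags or ids containing special characters.

```shell
# Group "tag<TAB>message-id" lines from stdin into one line per message.
collate_tags() {
    # Sort by message-id, then tag, so each message's pairs are adjacent.
    LC_ALL=C sort -t "$(printf '\t')" -k2,2 -k1,1 | awk -F '\t' '
        $2 != id {
            # A new message starts: flush the previous one, if any.
            if (id != "") print line " -- id:" id
            id = $2
            line = ""
        }
        { line = line (line == "" ? "" : " ") "+" $1 }
        END { if (id != "") print line " -- id:" id }
    '
}
```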

Cheers,
    Olly


* Re: How to recover from this permanent fatal error?
  2021-06-06 10:08             ` Olly Betts
@ 2021-06-06 12:48               ` Felipe Contreras
  2021-06-07  3:45                 ` Olly Betts
  0 siblings, 1 reply; 11+ messages in thread
From: Felipe Contreras @ 2021-06-06 12:48 UTC (permalink / raw)
  To: Xapian Discussion, Felipe Contreras, David Bremner,
	notmuch@notmuchmail.org

On Sun, Jun 6, 2021 at 5:08 AM Olly Betts <olly@survex.com> wrote:

> You could try commenting out the body of GlassTable::set_overwritten()
> in xapian-core/backends/glass/glass_table.cc so it keeps going instead
> of throwing this exception, which might allow it to usefully recover
> some or all tags.  If you (or anyone) try that and it works let me know
> and I can patch the branch to emit a warning message and continue there.

Now I get this:

termlist:
blocksize=8K items=687440 firstunused=152676 revision=2 levels=2 root=749
/home/felipec/contrib/xapian/xapian-core/bin/.libs/lt-xapian-check:
DatabaseError: Block 152676: used more than once in the Btree

-- 
Felipe Contreras


* Re: How to recover from this permanent fatal error?
  2021-06-06 12:48               ` Felipe Contreras
@ 2021-06-07  3:45                 ` Olly Betts
  2021-06-07  7:07                   ` Felipe Contreras
  0 siblings, 1 reply; 11+ messages in thread
From: Olly Betts @ 2021-06-07  3:45 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Xapian Discussion, notmuch@notmuchmail.org

On Sun, Jun 06, 2021 at 07:48:39AM -0500, Felipe Contreras wrote:
> On Sun, Jun 6, 2021 at 5:08 AM Olly Betts <olly@survex.com> wrote:
> 
> > You could try commenting out the body of GlassTable::set_overwritten()
> > in xapian-core/backends/glass/glass_table.cc so it keeps going instead
> > of throwing this exception, which might allow it to usefully recover
> > some or all tags.  If you (or anyone) try that and it works let me know
> > and I can patch the branch to emit a warning message and continue there.
> 
> Now I get this:
> 
> termlist:
> blocksize=8K items=687440 firstunused=152676 revision=2 levels=2 root=749
> /home/felipec/contrib/xapian/xapian-core/bin/.libs/lt-xapian-check:
> DatabaseError: Block 152676: used more than once in the Btree

I've pushed a change to skip the low level table consistency checking on
the branch since that's where this report is from.  The whole point of
this branch is to rescue tags from a broken database, so the user
presumably already ran the real xapian-check and it's not useful to be
repeating those checks here.  Hopefully that'll get us to actually
rescuing some tags!

Cheers,
    Olly


* Re: How to recover from this permanent fatal error?
  2021-06-07  3:45                 ` Olly Betts
@ 2021-06-07  7:07                   ` Felipe Contreras
  0 siblings, 0 replies; 11+ messages in thread
From: Felipe Contreras @ 2021-06-07  7:07 UTC (permalink / raw)
  To: Olly Betts, Felipe Contreras; +Cc: Xapian Discussion, notmuch@notmuchmail.org

Olly Betts wrote:
> On Sun, Jun 06, 2021 at 07:48:39AM -0500, Felipe Contreras wrote:
> > On Sun, Jun 6, 2021 at 5:08 AM Olly Betts <olly@survex.com> wrote:
> > 
> > > You could try commenting out the body of GlassTable::set_overwritten()
> > > in xapian-core/backends/glass/glass_table.cc so it keeps going instead
> > > of throwing this exception, which might allow it to usefully recover
> > > some or all tags.  If you (or anyone) try that and it works let me know
> > > and I can patch the branch to emit a warning message and continue there.
> > 
> > Now I get this:
> > 
> > termlist:
> > blocksize=8K items=687440 firstunused=152676 revision=2 levels=2 root=749
> > /home/felipec/contrib/xapian/xapian-core/bin/.libs/lt-xapian-check:
> > DatabaseError: Block 152676: used more than once in the Btree
> 
> I've pushed a change to skip the low level table consistency checking on
> the branch since that's where this report is from.  The whole point of
> this branch is to rescue tags from a broken database, so the user
> presumably already ran the real xapian-check and it's not useful to be
> repeating those checks here.  Hopefully that'll get us to actually
> rescuing some tags!

Yep, I was able to rescue some tags... only for 339296 mails ;)

I'm back to using notmuch.

This is the error I got:

  termlist:
  blocksize=8K items=687440 firstunused=152676 revision=2 levels=2 root=749
  doclen 168339 > upper bound 168335
  termlist table errors found: 1

  Total errors found: 1

And I still had to disable GlassTable::set_overwritten on top of your
patch.

Thanks!

-- 
Felipe Contreras

