unofficial mirror of notmuch@notmuchmail.org
From: David Bremner <david@tethera.net>
To: Gregor Zattler <telegraph@gmx.net>, notmuch <notmuch@notmuchmail.org>
Subject: Re: out of memory on idle machine (was: Re: consistent database corruption with notmuch new)
Date: Sat, 30 Jan 2021 08:58:55 -0400
Message-ID: <87bld6shrk.fsf@tethera.net>
In-Reply-To: <20210130085432.GA14025@no.workgroup>

Gregor Zattler <telegraph@gmx.net> writes:

> Hi notmuch developers,
> * Gregor Zattler <telegraph@gmx.net> [14. Dez. 2020]:
>> notmuch new still corrupts the database, the second notmuch new
>> invocation finds emails the first did not find.
>
> I'm still searching for the reason notmuch chokes on my mails.
>
> I assembled a HP MicroServer, installed basic debian buster and
> notmuch from the debian buster repo, rsynced my mail to a
> separate file system symlinked to the same location as on my
> laptop.
>
> There are now
> grfz@mic:~/Mail$ find -type f | wc -l
> 1209419
> files on this file system.  No other process touches this
> file system; actually the machine is otherwise idle.
>
> I did notmuch new several times in a row:
>
> grfz@mic:~/Mail/.notmuch$ rm -rf xapian
> grfz@mic:~/Mail/.notmuch$ notmuch new
> Welcome to a new version of notmuch! Your database will now be upgraded.
> This process is safe to interrupt.
> Backing up tags to /home/grfz/Mail/.notmuch/dump-20210127T114210.gz...
> Your notmuch database has now been upgraded.
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607947606.8134_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607940473.9509_1.no:2,S
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607969276.21046_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607987211.1395_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607979988.4942_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607972847.4857_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607943993.24776_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607976389.23296_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607983586.19063_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/drafts.mbox
> Note: Ignoring non-mail file: /home/grfz/Mail/postponed.mbox
> Processed 1183682 total files in 16h 43m 27s (19 files/sec.).
> Added 1091038 new messages to the database.
> grfz@mic:~/Mail/.notmuch$ notmuch new
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607947606.8134_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607940473.9509_1.no:2,S
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607969276.21046_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607987211.1395_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607979988.4942_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607972847.4857_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607943993.24776_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607976389.23296_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607983586.19063_1.no:2,
> Processed 1169095 total files in 16h 52m 48s (19 files/sec.).
> Added 1077686 new messages to the database.

Idea #1
-------

There are several mysteries here, but maybe we should begin at the
beginning. Something is wrong if notmuch scans your entire mail tree the
second time you run notmuch new.

Notmuch checks the mtime of directories against the time stored in the
database. As a sanity check, maybe you can do that for one of your
directories with many messages. This needs "quest" and "xapian-delve",
from the package xapian-tools.

Unfortunately this should probably be done right after the first notmuch
new. I have another idea (below) to try in the state after several runs
of notmuch new, where you are getting OOM.

I'll use real paths for my system; you'll need to update them.

This gives the directory's mtime in seconds since the epoch:

% stat --format "%Y" ~/Maildir/tethera/cur
1612008734

Now let us find the database document for that directory:

% quest -bdir:XDIRECTORY -d ~/Maildir/.notmuch/xapian/ dir:tethera/cur

Parsed Query: Query(0 * XDIRECTORYtethera/cur)
Exactly 1 matches
MSet:
431067: [0]
tethera/cur

Grabbing the record number from the output of quest:

% xapian-delve -r 431067 -VS0 ~/Maildir/.notmuch/xapian

Value 0 for record #431067: 1.61201e+09
Term List for record #431067: XDDIRENTRY387045:cur XDIRECTORYtethera/cur

You can see the value matches the mtime to six significant digits
(1612008734 is printed as 1.61201e+09).
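
To make that comparison less error-prone, the two values can be brought
into the same format first. The following is only a sketch: the function
names mtime_as_g and check_dir_mtime are made up for illustration, and it
assumes xapian-delve prints the value with six significant digits, as in
the output above.

```shell
# Format an integer mtime with six significant digits (printf's %g).
# env LC_ALL=C runs the external printf with a "." decimal point,
# whatever the current locale is.
mtime_as_g() {
    env LC_ALL=C printf '%g\n' "$1"
}

# Compare a directory's mtime with the value copied from xapian-delve.
# $1: directory on disk
# $2: value as printed by delve, e.g. 1.61201e+09
check_dir_mtime() {
    fs=$(mtime_as_g "$(stat --format '%Y' "$1")")
    if [ "$fs" = "$2" ]; then
        echo "match: $fs"
    else
        echo "MISMATCH: filesystem $fs, database $2"
    fi
}
```

It would be used like this, with the paths and value replaced by yours:

% check_dir_mtime ~/Maildir/tethera/cur 1.61201e+09

At this magnitude, six significant digits only resolve the mtime to
within roughly 10^4 seconds, so a match is a coarse check; a mismatch is
the informative result.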

Idea #2
-------

Try to figure out if some specific file is causing the OOM, by running
notmuch new in gdb.

There is a check for NOTMUCH_STATUS_OUT_OF_MEMORY around line 419/420 of
notmuch-new.c. If I understand correctly, that is where things are
failing. The following is untested; you will need the package
notmuch-dbgsym installed [1].

% gdb --args notmuch new
(gdb) b notmuch-new.c:420
(gdb) run
(gdb) p filename
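
Since a full run takes many hours, it may be more convenient to hand the
same steps to gdb non-interactively. This is equally untested.
gdb_oom_cmd is a hypothetical helper that only prints the command so it
can be inspected before running; the default line number 420 is the one
mentioned above and may differ in your notmuch version.

```shell
# Print a gdb command line that sets the breakpoint, runs notmuch new,
# and prints the variable "filename" once the breakpoint is reached.
# $1: line number in notmuch-new.c (defaults to 420, as quoted above)
gdb_oom_cmd() {
    line="${1:-420}"
    printf '%s\n' "gdb -batch -ex 'break notmuch-new.c:${line}' -ex run -ex 'print filename' --args notmuch new"
}
```

To actually run it:

% eval "$(gdb_oom_cmd)"

If the breakpoint is never reached, the print step has no stack frame to
look in and will report an error instead of a file name.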



[1]: https://wiki.debian.org/AutomaticDebugPackages

