From: David Bremner
To: Gregor Zattler, notmuch
Subject: Re: out of memory on idle machine (was: Re: consistent database corruption with notmuch new)
In-Reply-To: <20210130085432.GA14025@no.workgroup>
References: <20201213131909.GD21521@no.workgroup> <87zh2hhk15.fsf@tethera.net> <20201213141543.GE21521@no.workgroup> <20201213151336.GF21521@no.workgroup> <20201213212252.GH21521@no.workgroup> <20201214192251.GA7858@no.workgroup> <20210130085432.GA14025@no.workgroup>
Date: Sat, 30 Jan 2021 08:58:55 -0400
Message-ID: <87bld6shrk.fsf@tethera.net>
List-Id: "Use and development of the notmuch mail system."

Gregor Zattler writes:

> Hi notmuch developers,,
> * Gregor Zattler [14. Dez. 2020]:
>> notmuch new still corrupts the database, the second notmuch new
>> invocation finds emails the first did not find.
>
> I'm still searching for the reason notmuch chokes on my mails.
>
> I assembled a HP MicroServer, installed basic debian buster and
> notmuch from the debian buster repo, rsynced my mail to a
> separate file system symlinked to the same location as on my
> laptop.
>
> There are now
> grfz@mic:~/Mail$ find -type f | wc -l
> 1209419
> files on this file system. no other process touches this
> file system, actually the machine is otherwise ilde.
>
> I did notmuch new several times in a row:
>
> grfz@mic:~/Mail/.notmuch$ rm -rf xapian
> grfz@mic:~/Mail/.notmuch$ notmuch new
> Welcome to a new version of notmuch! Your database will now be upgraded.
> This process is safe to interrupt.
> Backing up tags to /home/grfz/Mail/.notmuch/dump-20210127T114210.gz...
> Your notmuch database has now been upgraded.
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607947606.8134_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607940473.9509_1.no:2,S
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607969276.21046_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607987211.1395_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607979988.4942_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607972847.4857_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607943993.24776_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607976389.23296_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607983586.19063_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/drafts.mbox
> Note: Ignoring non-mail file: /home/grfz/Mail/postponed.mbox
> Processed 1183682 total files in 16h 43m 27s (19 files/sec.).
> Added 1091038 new messages to the database.
> grfz@mic:~/Mail/.notmuch$ notmuch new
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607947606.8134_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607940473.9509_1.no:2,S
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607969276.21046_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607987211.1395_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607979988.4942_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607972847.4857_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607943993.24776_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607976389.23296_1.no:2,
> Note: Ignoring non-mail file: /home/grfz/Mail/spam-old/cur/1607983586.19063_1.no:2,
> Processed 1169095 total files in 16h 52m 48s (19 files/sec.).
> Added 1077686 new messages to the database.

Idea #1
-------

There are several mysteries here, but maybe we should begin at the
beginning. Something is wrong if notmuch scans your entire mail tree
the second time you run notmuch new. Notmuch checks the mtime of
directories against the time stored in the database. As a sanity
check, maybe you can do that for one of your directories with many
messages. This needs "quest" and "xapian-delve", from the package
xapian-tools. Unfortunately this should probably be done after the
first notmuch new. I have another idea to try (below) in the state
after several runs of notmuch new where you are getting OOM.

I'll use real paths for my system; you'll need to update them.

This gives a time in seconds

% stat --format "%Y" ~/Maildir/tethera/cur
1612008734

Now let us find the database document for that directory

% quest -bdir:XDIRECTORY -d ~/Maildir/.notmuch/xapian/ dir:tethera/cur
Parsed Query: Query(0 * XDIRECTORYtethera/cur)
Exactly 1 matches
MSet:
431067: [0]
tethera/cur

Grabbing the record number from the output of quest:

% xapian-delve -r 431067 -VS0 ~/Maildir/.notmuch/xapian
Value 0 for record #431067: 1.61201e+09
Term List for record #431067: XDDIRENTRY387045:cur XDIRECTORYtethera/cur

You can see the value matches the mtime to the 6 significant digits
that are displayed.

Idea #2
-------

Try to figure out if some specific file is causing the OOM.

Run notmuch-new in gdb. There is a check for
NOTMUCH_STATUS_OUT_OF_MEMORY around line 419/420 of notmuch-new.c. If
I understand correctly, that is where things are failing. The
following is untested; you will need the package notmuch-dbgsym
installed [1]

% gdb --args notmuch new
(gdb) b notmuch-new.c:420
(gdb) run
(gdb) p filename

[1]: https://wiki.debian.org/AutomaticDebugPackages
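
A note on reading the Idea #1 comparison: xapian-delve prints the stored
value as 1.61201e+09, which is only about six significant digits, so for
current timestamps the comparison is good to roughly 10^4 seconds (a few
hours) and cannot confirm an exact match. The following is a minimal
Python sketch of that comparison. The function names are made up for
illustration, and it does not talk to notmuch or Xapian at all; you paste
in the number that xapian-delve printed.

```python
import os


def round_sig(x, digits=6):
    """Round x to `digits` significant digits, like a %g-style display."""
    # Formatting in scientific notation with digits-1 decimals keeps
    # exactly `digits` significant digits.
    return float(f"{x:.{digits - 1}e}")


def dir_mtime_seconds(path):
    """Directory mtime in whole seconds (same quantity as stat's %Y)."""
    return int(os.stat(path).st_mtime)


def mtime_matches_displayed(mtime, displayed, digits=6):
    """True if mtime agrees with the displayed value at that precision.

    A True result only means "consistent at the displayed precision";
    the two values could still differ by a few thousand seconds.
    """
    return round_sig(mtime, digits) == round_sig(displayed, digits)


if __name__ == "__main__":
    # Values taken from the transcript above.
    print(mtime_matches_displayed(1612008734, 1.61201e+09))
```

If this reports a mismatch for a directory that has not changed since the
previous notmuch new, that would be worth reporting back. A match is only
weak evidence, for the precision reason given above.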