Date: Sat, 12 Nov 2016 20:51:07 -0500
From: Austin Clements
To: David Bremner
Cc: notmuch@notmuchmail.org, 843127@bugs.debian.org, Paul Wise
Subject: Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
Message-ID: <20161113015107.GC5670@csail.mit.edu>
References: <87a8dfl8em.fsf@tesseract.cs.unb.ca> <87a8df9pp2.fsf@tesseract.cs.unb.ca>
In-Reply-To: <87a8df9pp2.fsf@tesseract.cs.unb.ca>
List-Id: "Use and development of the notmuch mail system."

Quoth David Bremner on Nov 04 at 1:26 pm:
>
> Paul Wise wrote:
>
> > Last night I got this error from my `notmuch new --quiet` cron job. The
> > file that the error message complains about is now in the cur directory
> > of the maildir at the following path.
> >
> > /path/to/mail/cur/1478190211.H80553P18378.chianamo:2,
> >
> > I wonder if this some kind of race condition in `notmuch new` processing.
> > Perhaps it should be using inotify to find out about file movements?
> >
> > Unexpected error with file /path/to/mail/new/1478190211.H80553P18378.chianamo
> > add_file: Something went wrong trying to read or write a file
> > Error opening /path/to/mail/new/1478190211.H80553P18378.chianamo: No such file or directory
> > Note: A fatal error was encountered: Something went wrong trying to read or write a file
>
> I agree it looks like a race condition. inotify sounds a bit
> overcomplicated and perhaps non-portable? It should probably just
> tolerate disappearing files better, consider that a warning.

Inotify really *is* the solution. This is a symptom of a much bigger
problem: scandir makes no guarantees in the presence of concurrent
directory modification. If you delete or rename a file while notmuch
new is running, it may think *completely unrelated* files in the same
directory were also deleted. Even if scandir were atomic, if you move
a mail from one directory to another between notmuch scanning the
destination directory and notmuch scanning the source directory, it'll
think the mail has been deleted and potentially remove it from the DB.

The "recommended" solution to scandir's lack of guarantees is to start
an inotify watch before the scan and redo (or update) the scan if
there are any changes. For notmuch, it would make sense to extend that
to watching all directories, to make sure it can catch renames during
the scan.
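
To make the watch-before-scan pattern concrete, here is a minimal Python
sketch (Linux only). It is not notmuch code: the names `open_watch`,
`changed_since_watch` and `stable_scan`, the `list_dir` hook and the
retry limit are all hypothetical. It reaches the libc functions
`inotify_init1` and `inotify_add_watch` through `ctypes`, since Python's
standard library has no inotify wrapper, and it takes the simple route
of redoing the whole scan on any change rather than updating it.

```python
import ctypes
import os

# Event mask bits as defined in <sys/inotify.h>.
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
WATCH_MASK = IN_MOVED_FROM | IN_MOVED_TO | IN_CREATE | IN_DELETE

# Assumption: running on Linux, where the C library exports the inotify calls.
_libc = ctypes.CDLL(None, use_errno=True)


def open_watch(directories):
    """Start watching every directory; return the inotify file descriptor."""
    fd = _libc.inotify_init1(os.O_NONBLOCK)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init1 failed")
    for d in directories:
        if _libc.inotify_add_watch(fd, os.fsencode(d), WATCH_MASK) < 0:
            err = ctypes.get_errno()
            os.close(fd)
            raise OSError(err, "inotify_add_watch failed", d)
    return fd


def changed_since_watch(fd):
    """Drain queued events; return True if there were any."""
    changed = False
    while True:
        try:
            if not os.read(fd, 65536):
                break
        except BlockingIOError:
            break  # queue is empty
        changed = True
    return changed


def stable_scan(directories, list_dir=os.listdir, max_attempts=5):
    """Scan all directories, redoing the scan if anything changed meanwhile."""
    for _ in range(max_attempts):
        fd = open_watch(directories)  # the watch starts *before* the scan
        try:
            snapshot = {d: sorted(list_dir(d)) for d in directories}
            if not changed_since_watch(fd):
                return snapshot
        finally:
            os.close(fd)
    raise RuntimeError("directories kept changing during the scan")
```

Because every directory is watched for the whole scan, a rename from
new/ to cur/ that lands after cur/ was listed but before new/ is listed
shows up as an event and forces a rescan, instead of looking like a
deletion.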

A possible alternative, though I haven't worked out the details, might
be to keep a close eye on the directory mtimes. Roughly, for each
directory, check the mtime before scanning, wait if necessary until
the mtime != the current time (so that a later change can't land on
the same timestamp and go unnoticed), do the scan and process the
files optimistically. Once all directories are processed, re-check all
of the mtimes and if any have changed, do something like starting over,
but hopefully more intelligently.
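
A rough Python sketch of that mtime idea, under loud assumptions: the
names `settled_mtime_ns` and `optimistic_scan`, the `list_dir` hook, the
polling interval and the retry limit are all hypothetical, "the current
time" is compared at one-second granularity, and "starting over" is
implemented naively as a full rescan. It is meant only to show the
control flow, not how notmuch would do it.

```python
import os
import time


def settled_mtime_ns(directory, poll=0.1):
    """Return the directory mtime once it no longer equals the current second."""
    while True:
        st = os.stat(directory)
        # If the mtime falls in the current second, a change made right now
        # could carry the same coarse timestamp, so wait it out.
        if int(st.st_mtime) != int(time.time()):
            return st.st_mtime_ns
        time.sleep(poll)


def optimistic_scan(directories, list_dir=os.listdir, max_attempts=5):
    """Scan optimistically, then re-check every mtime and retry on change."""
    for _ in range(max_attempts):
        before = {}
        snapshot = {}
        for d in directories:
            before[d] = settled_mtime_ns(d)      # check mtime before scanning
            snapshot[d] = sorted(list_dir(d))    # scan and "process" the files
        # Once all directories are processed, re-check all of the mtimes.
        after = {d: os.stat(d).st_mtime_ns for d in directories}
        if after == before:
            return snapshot
    raise RuntimeError("directories kept changing during the scan")
```

A rename between two scanned directories updates the mtime of both, so
the final re-check catches the case where a message moved into a
directory that had already been listed.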