unofficial mirror of notmuch@notmuchmail.org
* [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
@ 2016-11-04 12:46 David Bremner
  2016-11-04 16:26 ` David Bremner
  2016-11-04 18:47 ` Jani Nikula
  0 siblings, 2 replies; 25+ messages in thread
From: David Bremner @ 2016-11-04 12:46 UTC (permalink / raw)
  To: notmuch


[-- Attachment #0: Type: message/rfc822, Size: 8862 bytes --]

[-- Attachment #1.1: Type: text/plain, Size: 1911 bytes --]

Package: notmuch
Version: 0.23.1-1
Severity: normal

Last night I got this error from my `notmuch new --quiet` cron job. The
file that the error message complains about is now in the cur directory
of the maildir at the following path.

/path/to/mail/cur/1478190211.H80553P18378.chianamo:2,

I wonder if this is some kind of race condition in `notmuch new` processing.
Perhaps it should be using inotify to find out about file movements?

Unexpected error with file /path/to/mail/new/1478190211.H80553P18378.chianamo
add_file: Something went wrong trying to read or write a file
Error opening /path/to/mail/new/1478190211.H80553P18378.chianamo: No such file or directory
Note: A fatal error was encountered: Something went wrong trying to read or write a file

-- System Information:
Debian Release: stretch/sid
  APT prefers testing-debug
  APT policy: (900, 'testing-debug'), (900, 'testing'), (800, 'unstable-debug'), (800, 'unstable'), (790, 'buildd-unstable'), (700, 'experimental-debug'), (700, 'experimental'), (690, 'buildd-experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.7.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_AU.utf8, LC_CTYPE=en_AU.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages notmuch depends on:
ii  libc6           2.24-5
ii  libglib2.0-0    2.50.1-1
ii  libgmime-2.6-0  2.6.20-8
ii  libnotmuch4     0.23.1-1
ii  libtalloc2      2.1.8-1
ii  zlib1g          1:1.2.8.dfsg-2+b3

Versions of packages notmuch recommends:
ii  alot           0.3.6-1
ii  gnupg-agent    2.1.15-4
pn  gpgsm          <none>
ii  notmuch-emacs  0.23.1-1
ii  notmuch-mutt   0.23.1-1

notmuch suggests no packages.

-- no debconf information

-- 
bye,
pabs

https://wiki.debian.org/PaulWise

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
  2016-11-04 12:46 [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`? David Bremner
@ 2016-11-04 16:26 ` David Bremner
  2016-11-13  1:51   ` Austin Clements
  2016-11-04 18:47 ` Jani Nikula
  1 sibling, 1 reply; 25+ messages in thread
From: David Bremner @ 2016-11-04 16:26 UTC (permalink / raw)
  To: notmuch; +Cc: Paul Wise, 843127


Paul Wise wrote:

> Last night I got this error from my `notmuch new --quiet` cron job. The
> file that the error message complains about is now in the cur directory
> of the maildir at the following path.
>
> /path/to/mail/cur/1478190211.H80553P18378.chianamo:2,
>
> I wonder if this is some kind of race condition in `notmuch new` processing.
> Perhaps it should be using inotify to find out about file movements?
>
> Unexpected error with file /path/to/mail/new/1478190211.H80553P18378.chianamo
> add_file: Something went wrong trying to read or write a file
> Error opening /path/to/mail/new/1478190211.H80553P18378.chianamo: No such file or directory
> Note: A fatal error was encountered: Something went wrong trying to read or write a file

I agree it looks like a race condition. inotify sounds a bit
overcomplicated and perhaps non-portable? It should probably just
tolerate disappearing files better, consider that a warning.

As a workaround, if you can replace background use of notmuch-new with
notmuch-insert (and I understand this doesn't work for everyone), you
will eliminate this kind of race condition.

d

* Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
  2016-11-04 12:46 [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`? David Bremner
  2016-11-04 16:26 ` David Bremner
@ 2016-11-04 18:47 ` Jani Nikula
  2016-11-05  2:15   ` Paul Wise
  1 sibling, 1 reply; 25+ messages in thread
From: Jani Nikula @ 2016-11-04 18:47 UTC (permalink / raw)
  To: David Bremner, notmuch

On Fri, 04 Nov 2016, David Bremner <david@tethera.net> wrote:
> I wonder if this is some kind of race condition in `notmuch new`
> processing.

Do you have some other software modifying your mail store while you're
running notmuch new?

BR,
Jani.

* Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
  2016-11-04 18:47 ` Jani Nikula
@ 2016-11-05  2:15   ` Paul Wise
  2016-11-05 12:57     ` [PATCH] cli: consider files vanishing during notmuch new non-fatal Jani Nikula
  0 siblings, 1 reply; 25+ messages in thread
From: Paul Wise @ 2016-11-05  2:15 UTC (permalink / raw)
  To: Jani Nikula, David Bremner, notmuch

[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]

On Fri, 2016-11-04 at 20:47 +0200, Jani Nikula wrote:

> Do you have some other software modifying your mail store while
> you're running notmuch new?

The folder in question has my laptop's exim4 service writing to it when
my cron jobs generate email.

On Fri, 2016-11-04 at 13:26 -0300, David Bremner wrote:

> inotify sounds a bit overcomplicated and perhaps non-portable?

There are similar APIs on non-Linux OSes but there is indeed no common
API or library to abstract away this sort of feature AFAIK.

> It should probably just tolerate disappearing files better, consider
> that a warning.

That sounds like the correct solution indeed. Probably if the code
notices that a file disappeared, it should also rescan the nearby
folders to see if the file was moved instead of deleted.

> As a workaround, if you can replace background use of notmuch-new
> with notmuch-insert (and I understand this doesn't work for
> everyone), you will eliminate this kind of race condition.

Hmm, I don't think that will work for me.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

* [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-05  2:15   ` Paul Wise
@ 2016-11-05 12:57     ` Jani Nikula
  2016-11-05 13:22       ` Paul Wise
  2016-11-16 11:43       ` David Bremner
  0 siblings, 2 replies; 25+ messages in thread
From: Jani Nikula @ 2016-11-05 12:57 UTC (permalink / raw)
  To: Paul Wise, David Bremner, notmuch

If some software other than notmuch new renames or removes files
during the notmuch new scan (specifically after scandir but before
indexing the file), keep going instead of bailing out. Failing to
index the file is just a race condition between notmuch and the other
software; the rename could happen after the notmuch new scan
anyway. It's not fatal, and we'll catch the renamed files on the next
scan.

Add a new exit code for when files vanished, so the caller has a
chance to detect the race and re-run notmuch new to recover.

Reported by Paul Wise <pabs@debian.org> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843127

---

Having notmuch new re-run (parts of) the scan automatically seems a
rather more involved change. So does inotify support. This simple
change both finishes the scan and lets the user recover, IMO a
reasonable first step.

Please suggest a better alternative to "vanish" in code...
---
 notmuch-client.h |  8 ++++++++
 notmuch-new.c    | 15 ++++++++++++---
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/notmuch-client.h b/notmuch-client.h
index 9ce2aef17431..d2057e26c5cd 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -114,6 +114,14 @@ chomp_newline (char *str)
 	str[strlen(str)-1] = '\0';
 }
 
+/* Exit status code indicating that file(s) in the mail store were
+ * removed or renamed after notmuch new scanned the directories but
+ * before indexing the file(s). If the file was renamed, the indexing
+ * might not be complete, and the user is advised to re-run notmuch
+ * new.
+ */
+#define NOTMUCH_EXIT_VANISHED_FILES 10
+
 /* Exit status code indicating the requested format version is too old
  * (support for that version has been dropped).  CLI code should use
  * notmuch_exit_if_unsupported_format rather than directly exiting
diff --git a/notmuch-new.c b/notmuch-new.c
index c55dea7bc1b7..e694a6adcee1 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -53,6 +53,7 @@ typedef struct {
     int total_files;
     int processed_files;
     int added_messages, removed_messages, renamed_messages;
+    int vanished_files;
     struct timeval tv_start;
 
     _filename_list_t *removed_files;
@@ -280,11 +281,13 @@ add_file (notmuch_database_t *notmuch, const char *filename,
     case NOTMUCH_STATUS_FILE_NOT_EMAIL:
 	fprintf (stderr, "Note: Ignoring non-mail file: %s\n", filename);
 	break;
-    /* Fatal issues. Don't process anymore. */
     case NOTMUCH_STATUS_FILE_ERROR:
+	/* Someone renamed/removed the file between scandir and now. */
+	state->vanished_files++;
 	fprintf (stderr, "Unexpected error with file %s\n", filename);
 	(void) print_status_database ("add_file", notmuch, status);
-	goto DONE;
+	break;
+    /* Fatal issues. Don't process anymore. */
     case NOTMUCH_STATUS_READ_ONLY_DATABASE:
     case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
     case NOTMUCH_STATUS_OUT_OF_MEMORY:
@@ -1151,5 +1154,11 @@ notmuch_new_command (notmuch_config_t *config, int argc, char *argv[])
     if (!no_hooks && !ret && !interrupted)
 	ret = notmuch_run_hook (db_path, "post-new");
 
-    return ret || interrupted ? EXIT_FAILURE : EXIT_SUCCESS;
+    if (ret || interrupted)
+	return EXIT_FAILURE;
+
+    if (add_files_state.vanished_files)
+	return NOTMUCH_EXIT_VANISHED_FILES;
+
+    return EXIT_SUCCESS;
 }
-- 
2.1.4

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-05 12:57     ` [PATCH] cli: consider files vanishing during notmuch new non-fatal Jani Nikula
@ 2016-11-05 13:22       ` Paul Wise
  2016-11-12 15:39         ` David Bremner
  2016-11-16 11:43       ` David Bremner
  1 sibling, 1 reply; 25+ messages in thread
From: Paul Wise @ 2016-11-05 13:22 UTC (permalink / raw)
  To: Jani Nikula, David Bremner, notmuch

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

On Sat, 2016-11-05 at 14:57 +0200, Jani Nikula wrote:

> Add a new exit code for when files vanished, so the caller has a
> chance to detect the race and re-run notmuch new to recover.

I don't think this is the right approach for two reasons:

The exit code you have chosen is still a failure so I will still get
notified for a minor issue. I use chronic to detect fail scenarios.

This is a pretty normal scenario when you have a mail program open and
are auto-running `notmuch new` on a scheduled basis or when new mail
arrives. notmuch should just ignore the error and continue as normal.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-05 13:22       ` Paul Wise
@ 2016-11-12 15:39         ` David Bremner
  2016-11-12 16:04           ` Brian Sniffen
  0 siblings, 1 reply; 25+ messages in thread
From: David Bremner @ 2016-11-12 15:39 UTC (permalink / raw)
  To: Paul Wise, Jani Nikula, notmuch

Paul Wise <pabs@debian.org> writes:

> On Sat, 2016-11-05 at 14:57 +0200, Jani Nikula wrote:
>
>> Add a new exit code for when files vanished, so the caller has a
>> chance to detect the race and re-run notmuch new to recover.
>
> I don't think this is the right approach for two reasons:
>
> The exit code you have chosen is still a failure so I will still get
> notified for a minor issue. I use chronic to detect fail scenarios.
>
> This is a pretty normal scenario when you have a mail program open and
> are auto-running `notmuch new` on a scheduled basis or when new mail
> arrives. notmuch should just ignore the error and continue as normal.
>

OK, but the patch proposed works both for people who want to be notified
of this problem, and those that don't (with appropriate shell wrapping
checking the return code).  That seems better than hiding it for
everyone. And certainly an improvement on the status quo. A possible
future enhancement would be a flag, like the one notmuch insert has, to
control the treatment of these errors.

d

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-12 15:39         ` David Bremner
@ 2016-11-12 16:04           ` Brian Sniffen
  2016-11-12 16:10             ` David Bremner
  2016-11-12 20:35             ` Jani Nikula
  0 siblings, 2 replies; 25+ messages in thread
From: Brian Sniffen @ 2016-11-12 16:04 UTC (permalink / raw)
  To: David Bremner; +Cc: Paul Wise, Jani Nikula, notmuch


> 
> OK, but the patch proposed works both for people who want to be notified
> of this problem, and those that don't (with appropriate shell wrapping
> checking the return code).  

I think it will loop; how do I guarantee termination and indexing of all present messages if deletions cause errors?

-Brian

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-12 16:04           ` Brian Sniffen
@ 2016-11-12 16:10             ` David Bremner
  2016-11-12 16:15               ` David Bremner
  2016-11-12 21:08               ` Brian Sniffen
  2016-11-12 20:35             ` Jani Nikula
  1 sibling, 2 replies; 25+ messages in thread
From: David Bremner @ 2016-11-12 16:10 UTC (permalink / raw)
  To: Brian Sniffen; +Cc: Paul Wise, Jani Nikula, notmuch

Brian Sniffen <bts@evenmere.org> writes:

>> 
>> OK, but the patch proposed works both for people who want to be notified
>> of this problem, and those that don't (with appropriate shell wrapping
>> checking the return code).  
>
> I think it will loop; how do I guarantee termination and indexing of all present messages if deletions cause errors?
>
> -Brian

stop deleting things? You can't guarantee termination and indexing of
all present messages by ignoring deletions either.

d

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-12 16:10             ` David Bremner
@ 2016-11-12 16:15               ` David Bremner
  2016-11-12 21:08               ` Brian Sniffen
  1 sibling, 0 replies; 25+ messages in thread
From: David Bremner @ 2016-11-12 16:15 UTC (permalink / raw)
  To: Brian Sniffen; +Cc: Paul Wise, Jani Nikula, notmuch

David Bremner <david@tethera.net> writes:

> Brian Sniffen <bts@evenmere.org> writes:
>
>>> 
>>> OK, but the patch proposed works both for people who want to be notified
>>> of this problem, and those that don't (with appropriate shell wrapping
>>> checking the return code).  
>>
>> I think it will loop; how do I guarantee termination and indexing of all present messages if deletions cause errors?
>>
>> -Brian
>
> stop deleting things? You can't guarantee termination and indexing of
> all present messages by ignoring deletions either.
>
> d

Sorry, that was written in haste. Of course, if that's your goal,
ignoring deletions is OK, but renames will still get you, and we have no
way of telling the two apart.  In any case, I was thinking more that
people who want to ignore deletions could check for the specific error
code and consider that not-an-error.
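
A minimal sketch of such a check, assuming the exit status stays at 10
as in the patch under review. The function name and the VANISHED_STATUS
variable are made up for illustration; nothing like them ships with
notmuch.

```sh
#!/bin/sh
# Exit status that signals vanished files. 10 is the value used in the
# patch under review (an assumption that it stays that way); override
# VANISHED_STATUS in the environment if it ends up different.
: "${VANISHED_STATUS:=10}"

# ignore_vanished CMD...: run CMD and report success if it exits with
# VANISHED_STATUS. Every other exit status is passed through unchanged.
ignore_vanished () {
    "$@"
    status=$?
    if [ "$status" -eq "$VANISHED_STATUS" ]; then
        return 0
    fi
    return "$status"
}
```

A cron job could then run `ignore_vanished notmuch new --quiet`, so
only the remaining failures would produce a non-zero exit status.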

d

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-12 16:04           ` Brian Sniffen
  2016-11-12 16:10             ` David Bremner
@ 2016-11-12 20:35             ` Jani Nikula
  1 sibling, 0 replies; 25+ messages in thread
From: Jani Nikula @ 2016-11-12 20:35 UTC (permalink / raw)
  To: Brian Sniffen, David Bremner; +Cc: Paul Wise, notmuch

On Sat, 12 Nov 2016, Brian Sniffen <bts@evenmere.org> wrote:
>> 
>> OK, but the patch proposed works both for people who want to be notified
>> of this problem, and those that don't (with appropriate shell wrapping
>> checking the return code).  
>
> I think it will loop; how do I guarantee termination and indexing of
> all present messages if deletions cause errors?

Please note that we're talking about deletions and renames *between* the
scandir(3) call and going through the results it returns, during a
single invocation of 'notmuch new'. On the next run, scandir(3) won't
return the entry, and we'll think it's gone.

(Of course, if you keep deleting/renaming files and running 'notmuch
new' simultaneously all the time, you'll hit this on some other files on
the subsequent runs, but then you asked for it...)

BR,
Jani.

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-12 16:10             ` David Bremner
  2016-11-12 16:15               ` David Bremner
@ 2016-11-12 21:08               ` Brian Sniffen
  2016-11-12 21:36                 ` David Bremner
  1 sibling, 1 reply; 25+ messages in thread
From: Brian Sniffen @ 2016-11-12 21:08 UTC (permalink / raw)
  To: David Bremner; +Cc: Paul Wise, Jani Nikula, notmuch


> On Nov 12, 2016, at 11:10 AM, David Bremner <david@tethera.net> wrote:
> 
> Brian Sniffen <bts@evenmere.org> writes:
> 
>>> 
>>> OK, but the patch proposed works both for people who want to be notified
>>> of this problem, and those that don't (with appropriate shell wrapping
>>> checking the return code).  
>> 
>> I think it will loop; how do I guarantee termination and indexing of all present messages if deletions cause errors?
> 
> stop deleting things? You can't guarantee termination and indexing of
> all present messages by ignoring deletions either.

That's hard, given dovecot pointed at the same maildir: it quickly moves files from new to cur. That makes notmuch insert pretty useless, and I rely on notmuch new to approach correctness. 

But maybe I misunderstand: is the idea that it will return an error but keep processing?  Or stop on that error?

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-12 21:08               ` Brian Sniffen
@ 2016-11-12 21:36                 ` David Bremner
  0 siblings, 0 replies; 25+ messages in thread
From: David Bremner @ 2016-11-12 21:36 UTC (permalink / raw)
  To: Brian Sniffen; +Cc: Paul Wise, Jani Nikula, notmuch

Brian Sniffen <bts@evenmere.org> writes:

> That's hard, given dovecot pointed at the same maildir: it quickly
> moves files from new to cur. That makes notmuch insert pretty useless,
> and I rely on notmuch new to approach correctness.

I don't think this discussion is related to notmuch insert at all. If
you have found a race condition (or some other concurrency issue) in
notmuch-insert please report that separately.

>
> But maybe I misunderstand: is the idea that it will return an error
> but keep processing?  Or stop on that error?

The whole discussion started because under certain circumstances it will
stop processing. The proposed patch makes it continue processing, but
report an error at the end.

* Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
  2016-11-04 16:26 ` David Bremner
@ 2016-11-13  1:51   ` Austin Clements
  2016-11-14 18:44     ` J. Lewis Muir
  0 siblings, 1 reply; 25+ messages in thread
From: Austin Clements @ 2016-11-13  1:51 UTC (permalink / raw)
  To: David Bremner; +Cc: notmuch, 843127, Paul Wise

Quoth David Bremner on Nov 04 at  1:26 pm:
> 
> Paul Wise wrote:
> 
> > Last night I got this error from my `notmuch new --quiet` cron job. The
> > file that the error message complains about is now in the cur directory
> > of the maildir at the following path.
> >
> > /path/to/mail/cur/1478190211.H80553P18378.chianamo:2,
> >
> > I wonder if this is some kind of race condition in `notmuch new` processing.
> > Perhaps it should be using inotify to find out about file movements?
> >
> > Unexpected error with file /path/to/mail/new/1478190211.H80553P18378.chianamo
> > add_file: Something went wrong trying to read or write a file
> > Error opening /path/to/mail/new/1478190211.H80553P18378.chianamo: No such file or directory
> > Note: A fatal error was encountered: Something went wrong trying to read or write a file
> 
> I agree it looks like a race condition. inotify sounds a bit
> overcomplicated and perhaps non-portable? It should probably just
> tolerate disappearing files better, consider that a warning.

Inotify really *is* the solution. This is a symptom of a much bigger
problem: scandir makes no guarantees in the presence of concurrent
directory modification. If you delete or rename a file while notmuch
new is running, it may think *completely unrelated* files in the same
directory were also deleted. Even if scandir were atomic, if you move
a mail from one directory to another between notmuch scanning the
destination directory and notmuch scanning the source directory, it'll
think the mail has been deleted and potentially remove it from the DB.

The "recommended" solution with scandir is to start an inotify watch
before the scan and redo (or update) the scan if there are any
changes. For notmuch, it would make sense to extend that to watching
all directories to make sure it can catch renames during the scan.

A possible alternative, though I haven't worked out the details, might
be to keep a close eye on the directory mtimes. Roughly, for each
directory, check the mtime before scanning, wait if necessary until
the mtime != the current time, do the scan and process the files
optimistically. Once all directories are processed, re-check all of
the mtimes and if any have changed, do something like starting over
but hopefully more intelligent.
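
To make the re-check step a bit more concrete, here's a rough sketch.
It assumes GNU find and stat (for `stat -c`), the function names are
invented for illustration, and it skips the "wait until the mtime !=
the current time" step entirely, so it can miss changes on file systems
with coarse timestamps.

```sh
#!/bin/sh
# snapshot_mtimes ROOT: print one "mtime path" line for every directory
# under ROOT, sorted. Assumes GNU stat; BSD stat would need -f instead
# of -c. Directory names containing newlines are not handled.
snapshot_mtimes () {
    find "$1" -type d -exec stat -c '%y %n' {} + | sort
}

# scan_was_stable ROOT CMD...: run CMD, then report whether the set of
# directories under ROOT and their mtimes are the same as before CMD
# ran. The exit status of CMD itself is discarded in this sketch.
scan_was_stable () {
    root=$1
    shift
    before=$(snapshot_mtimes "$root")
    "$@"
    after=$(snapshot_mtimes "$root")
    [ "$before" = "$after" ]
}
```

With ~/mail standing in for the mail root, something like
`scan_was_stable ~/mail notmuch new || echo "rescan needed"` would flag
a scan that raced with another program. Deciding what to redo is the
"more intelligent" part that this doesn't attempt.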

* Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
  2016-11-13  1:51   ` Austin Clements
@ 2016-11-14 18:44     ` J. Lewis Muir
  2016-11-14 18:59       ` David Bremner
  0 siblings, 1 reply; 25+ messages in thread
From: J. Lewis Muir @ 2016-11-14 18:44 UTC (permalink / raw)
  To: Austin Clements; +Cc: David Bremner, 843127, notmuch, Paul Wise

On 11/12, Austin Clements wrote:
> Quoth David Bremner on Nov 04 at  1:26 pm:
> > I agree it looks like a race condition. inotify sounds a bit
> > overcomplicated and perhaps non-portable? It should probably just
> > tolerate disappearing files better, consider that a warning.
> 
> Inotify really *is* the solution.

I don't see how inotify can be the solution unless the idea is to make
Notmuch run on Linux only.  Inotify is a Linux kernel API.  Some other
OSes have their own native file event notification facilities, but not
all of them have it, and most (if not all) only support file event
notifications for certain file systems and not for others (e.g., not for
NFS).

Regards,

Lewis

* Re: [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`?
  2016-11-14 18:44     ` J. Lewis Muir
@ 2016-11-14 18:59       ` David Bremner
  0 siblings, 0 replies; 25+ messages in thread
From: David Bremner @ 2016-11-14 18:59 UTC (permalink / raw)
  To: J. Lewis Muir, Austin Clements; +Cc: notmuch, Paul Wise

"J. Lewis Muir" <jlmuir@imca-cat.org> writes:

> On 11/12, Austin Clements wrote:
>> Quoth David Bremner on Nov 04 at  1:26 pm:
>> > I agree it looks like a race condition. inotify sounds a bit
>> > overcomplicated and perhaps non-portable? It should probably just
>> > tolerate disappearing files better, consider that a warning.
>> 
>> Inotify really *is* the solution.
>
> I don't see how inotify can be the solution unless the idea is to make
> Notmuch run on Linux only.  Inotify is a Linux kernel API.  Some other
> OSes have their own native file event notification facilities, but not
> all of them have it, and most (if not all) only support file event
> notifications for certain file systems and not for others (e.g., not for
> NFS).

Yeah, it's worth saying that, even if I think Austin knows. I was
thinking that an alternative approach might be to have either notmuch
new or notmuch insert take a file name on the command line, and let the
user call it via whatever kind of directory watcher utility works on
their system.

d

* Re: [PATCH] cli: consider files vanishing during notmuch new non-fatal
  2016-11-05 12:57     ` [PATCH] cli: consider files vanishing during notmuch new non-fatal Jani Nikula
  2016-11-05 13:22       ` Paul Wise
@ 2016-11-16 11:43       ` David Bremner
  2016-11-21 20:14         ` [PATCH v2] " Jani Nikula
  1 sibling, 1 reply; 25+ messages in thread
From: David Bremner @ 2016-11-16 11:43 UTC (permalink / raw)
  To: Jani Nikula, Paul Wise, notmuch

Jani Nikula <jani@nikula.org> writes:

> +/* Exit status code indicating that file(s) in the mail store were
> + * removed or renamed after notmuch new scanned the directories but
> + * before indexing the file(s). If the file was renamed, the indexing
> + * might not be complete, and the user is advised to re-run notmuch
> + * new.
> + */
> +#define NOTMUCH_EXIT_VANISHED_FILES 10
> +

What do you think about defining something like NOTMUCH_EXIT_TEMPFAIL 75
(to match EX_TEMPFAIL) and using that?  There is also a stalled patch
around for insert to use EX_TEMPFAIL (although in that case part of the
reason it has stalled is that I'm not convinced the error is temporary).

I think such an exit code would also make sense for locking failures;
but that is a different discussion.

Other than that, the patch looks like an incremental improvement.

* [PATCH v2] cli: consider files vanishing during notmuch new non-fatal
  2016-11-16 11:43       ` David Bremner
@ 2016-11-21 20:14         ` Jani Nikula
  2016-11-26  2:44           ` [PATCH] cli/new: document new exit code David Bremner
  0 siblings, 1 reply; 25+ messages in thread
From: Jani Nikula @ 2016-11-21 20:14 UTC (permalink / raw)
  To: David Bremner, Jani Nikula, Paul Wise, notmuch

If some software other than notmuch new renames or removes files
during the notmuch new scan (specifically after scandir but before
indexing the file), keep going instead of bailing out. Failing to
index the file is just a race condition between notmuch and the other
software; the rename could happen after the notmuch new scan
anyway. It's not fatal, and we'll catch the renamed files on the next
scan.

Add a new exit code for when files vanished, so the caller has a
chance to detect the race and re-run notmuch new to recover.

Reported by Paul Wise <pabs@debian.org> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843127

---

v2: use EX_TEMPFAIL status code
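
For the "re-run notmuch new to recover" part, a caller might do
something like the following. This is only a sketch: the function name
and the attempt limit are made up, and 75 is the value of EX_TEMPFAIL
from <sysexits.h>.

```sh
#!/bin/sh
# retry_tempfail MAX CMD...: run CMD up to MAX times, re-running it only
# while it exits with 75 (EX_TEMPFAIL). Returns the last exit status, so
# a run that keeps hitting the race still terminates and reports 75.
retry_tempfail () {
    max=$1
    shift
    attempt=1
    while :; do
        "$@"
        status=$?
        if [ "$status" -ne 75 ] || [ "$attempt" -ge "$max" ]; then
            return "$status"
        fi
        attempt=$((attempt + 1))
    done
}
```

For example, `retry_tempfail 3 notmuch new --quiet`. The attempt limit
is there so that a mail store which is being modified continuously
can't keep the caller looping forever.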
---
 notmuch-client.h | 11 +++++++++++
 notmuch-new.c    | 15 ++++++++++++---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/notmuch-client.h b/notmuch-client.h
index 9ce2aef17431..793f32ecc55a 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -25,6 +25,7 @@
 #define _GNU_SOURCE /* for getline */
 #endif
 #include <stdio.h>
+#include <sysexits.h>
 
 #include "compat.h"
 
@@ -114,6 +115,16 @@ chomp_newline (char *str)
 	str[strlen(str)-1] = '\0';
 }
 
+/* Exit status code indicating temporary failure; user is invited to
+ * retry.
+ *
+ * For example, file(s) in the mail store were removed or renamed
+ * after notmuch new scanned the directories but before indexing the
+ * file(s). If the file was renamed, the indexing might not be
+ * complete, and the user is advised to re-run notmuch new.
+ */
+#define NOTMUCH_EXIT_TEMPFAIL EX_TEMPFAIL
+
 /* Exit status code indicating the requested format version is too old
  * (support for that version has been dropped).  CLI code should use
  * notmuch_exit_if_unsupported_format rather than directly exiting
diff --git a/notmuch-new.c b/notmuch-new.c
index c55dea7bc1b7..cc680b412a45 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -53,6 +53,7 @@ typedef struct {
     int total_files;
     int processed_files;
     int added_messages, removed_messages, renamed_messages;
+    int vanished_files;
     struct timeval tv_start;
 
     _filename_list_t *removed_files;
@@ -280,11 +281,13 @@ add_file (notmuch_database_t *notmuch, const char *filename,
     case NOTMUCH_STATUS_FILE_NOT_EMAIL:
 	fprintf (stderr, "Note: Ignoring non-mail file: %s\n", filename);
 	break;
-    /* Fatal issues. Don't process anymore. */
     case NOTMUCH_STATUS_FILE_ERROR:
+	/* Someone renamed/removed the file between scandir and now. */
+	state->vanished_files++;
 	fprintf (stderr, "Unexpected error with file %s\n", filename);
 	(void) print_status_database ("add_file", notmuch, status);
-	goto DONE;
+	break;
+    /* Fatal issues. Don't process anymore. */
     case NOTMUCH_STATUS_READ_ONLY_DATABASE:
     case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
     case NOTMUCH_STATUS_OUT_OF_MEMORY:
@@ -1151,5 +1154,11 @@ notmuch_new_command (notmuch_config_t *config, int argc, char *argv[])
     if (!no_hooks && !ret && !interrupted)
 	ret = notmuch_run_hook (db_path, "post-new");
 
-    return ret || interrupted ? EXIT_FAILURE : EXIT_SUCCESS;
+    if (ret || interrupted)
+	return EXIT_FAILURE;
+
+    if (add_files_state.vanished_files)
+	return NOTMUCH_EXIT_TEMPFAIL;
+
+    return EXIT_SUCCESS;
 }
-- 
2.1.4

* [PATCH] cli/new: document new exit code
  2016-11-21 20:14         ` [PATCH v2] " Jani Nikula
@ 2016-11-26  2:44           ` David Bremner
  2016-11-26  9:17             ` Jani Nikula
  0 siblings, 1 reply; 25+ messages in thread
From: David Bremner @ 2016-11-26  2:44 UTC (permalink / raw)
  To: Jani Nikula, David Bremner, notmuch

It seems important to give the numeric return code for people writing
scripts. Hopefully deviations from this convention are rare.
---
 doc/man1/notmuch-new.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/man1/notmuch-new.rst b/doc/man1/notmuch-new.rst
index 787ed78..7f0b223 100644
--- a/doc/man1/notmuch-new.rst
+++ b/doc/man1/notmuch-new.rst
@@ -43,6 +43,14 @@ Supported options for **new** include
     ``--quiet``
         Do not print progress or results.
 
+EXIT STATUS
+===========
+
+This command supports the following special exit status code
+
+``75 (EX_TEMPFAIL)``
+    A temporary failure occurred; the user is invited to retry.
+
 SEE ALSO
 ========
 
-- 
2.10.2

* Re: [PATCH] cli/new: document new exit code
  2016-11-26  2:44           ` [PATCH] cli/new: document new exit code David Bremner
@ 2016-11-26  9:17             ` Jani Nikula
  2016-11-26  9:18               ` [PATCH] test: check the handling of files vanishing between scandir and indexing Jani Nikula
  0 siblings, 1 reply; 25+ messages in thread
From: Jani Nikula @ 2016-11-26  9:17 UTC (permalink / raw)
  To: David Bremner, David Bremner, notmuch

On Sat, 26 Nov 2016, David Bremner <david@tethera.net> wrote:
> It seems important to give the numeric return code for people writing
> scripts. Hopefully deviations from this convention are rare.

*blush*

As a token of my gratitude, a test for the change follows. I'm not sure
if it's quite ready for merging, but the groundwork is there.

Thanks,
Jani.

* [PATCH] test: check the handling of files vanishing between scandir and indexing
  2016-11-26  9:17             ` Jani Nikula
@ 2016-11-26  9:18               ` Jani Nikula
  2016-11-27  9:59                 ` [PATCH v2] " Jani Nikula
  0 siblings, 1 reply; 25+ messages in thread
From: Jani Nikula @ 2016-11-26  9:18 UTC (permalink / raw)
  To: David Bremner, notmuch

Add a file for scandir to find, but use gdb to remove it before it
gets indexed.

---

The ugly part is that this should require gdb as external dep... but
we shouldn't skip all of T050-new.sh if gdb isn't there. I'm in a
hurry, any good ideas?
---
 test/T050-new.sh | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index beeb574a3b30..072b63a148cc 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -298,4 +298,34 @@ output=$(NOTMUCH_NEW --debug 2>&1 | sed 's/: .*$//' )
 chmod u+w  ${MAIL_DIR}/.notmuch/xapian/*.${db_ending}
 test_expect_equal "$output" "A Xapian exception occurred opening database"
 
+
+test_begin_subtest "Handle files vanishing between scandir and add_file"
+
+# A file for scandir to find. It won't get indexed, so can be empty.
+touch ${MAIL_DIR}/vanish
+
+# Breakpoint to remove the file before indexing
+cat <<EOF > notmuch-new-vanish.gdb
+set breakpoint pending on
+set logging file notmuch-new-vanish-gdb.log
+set logging on
+break add_file
+commands
+shell rm -f ${MAIL_DIR}/vanish
+continue
+end
+run
+EOF
+
+gdb --batch-silent --return-child-result -x notmuch-new-vanish.gdb \
+    --args notmuch new 2>OUTPUT 1>/dev/null
+echo "exit status: $?" >> OUTPUT
+cat <<EOF > EXPECTED
+Unexpected error with file ${MAIL_DIR}/vanish
+add_file: Something went wrong trying to read or write a file
+Error opening ${MAIL_DIR}/vanish: No such file or directory
+exit status: 75
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
 test_done
-- 
2.1.4

* [PATCH v2] test: check the handling of files vanishing between scandir and indexing
  2016-11-26  9:18               ` [PATCH] test: check the handling of files vanishing between scandir and indexing Jani Nikula
@ 2016-11-27  9:59                 ` Jani Nikula
  2016-11-29  2:16                   ` David Bremner
  2016-12-03 11:24                   ` David Bremner
  0 siblings, 2 replies; 25+ messages in thread
From: Jani Nikula @ 2016-11-27  9:59 UTC (permalink / raw)
  To: David Bremner, notmuch

Add a file for scandir to find, but use gdb to remove it before it
gets indexed.

---

v2: Apparently our test setup is clever enough to gracefully handle
missing prerequisites, and ignore subtest results. Just make sure we
remove the test file also in case gdb isn't there, to not leave
garbage behind.
---
 test/T050-new.sh | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index beeb574a3b30..2bc799d2e2bc 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -298,4 +298,38 @@ output=$(NOTMUCH_NEW --debug 2>&1 | sed 's/: .*$//' )
 chmod u+w  ${MAIL_DIR}/.notmuch/xapian/*.${db_ending}
 test_expect_equal "$output" "A Xapian exception occurred opening database"
 
+
+test_begin_subtest "Handle files vanishing between scandir and add_file"
+
+# A file for scandir to find. It won't get indexed, so can be empty.
+touch ${MAIL_DIR}/vanish
+
+# Breakpoint to remove the file before indexing
+cat <<EOF > notmuch-new-vanish.gdb
+set breakpoint pending on
+set logging file notmuch-new-vanish-gdb.log
+set logging on
+break add_file
+commands
+shell rm -f ${MAIL_DIR}/vanish
+continue
+end
+run
+EOF
+
+gdb --batch-silent --return-child-result -x notmuch-new-vanish.gdb \
+    --args notmuch new 2>OUTPUT 1>/dev/null
+echo "exit status: $?" >> OUTPUT
+
+# Clean up the file in case gdb isn't available.
+rm -f ${MAIL_DIR}/vanish
+
+cat <<EOF > EXPECTED
+Unexpected error with file ${MAIL_DIR}/vanish
+add_file: Something went wrong trying to read or write a file
+Error opening ${MAIL_DIR}/vanish: No such file or directory
+exit status: 75
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
 test_done
-- 
2.1.4
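
The window this test exercises can also be pictured without gdb. The sketch below is not notmuch code: `scan_then_read` and its optional hook argument are hypothetical stand-ins for the scandir step, a concurrent mail client, and the later open of each file. It assumes file names without whitespace.

```shell
#!/bin/sh
# Hypothetical sketch, not notmuch code: scan a directory first, read the
# files later, and report a file that vanished in between as EX_TEMPFAIL.
EX_TEMPFAIL=75

# scan_then_read DIR [HOOK...]
# HOOK is an optional command run between the scan and the reads; it stands
# in for another process moving or deleting mail.
scan_then_read() {
    dir=$1; shift
    files=$(ls "$dir")              # "scandir" step (no whitespace in names)
    if [ "$#" -gt 0 ]; then
        "$@" || true                # the maildir changes under us here
    fi
    result=0
    for name in $files; do
        if ! cat "$dir/$name" >/dev/null 2>&1; then
            echo "Error opening $dir/$name: vanished after scan" >&2
            result=$EX_TEMPFAIL
        fi
    done
    return "$result"
}

# Example: the file is listed by the scan, then removed before it is read.
#   d=$(mktemp -d); touch "$d/vanish"
#   scan_then_read "$d" rm -f "$d/vanish"; echo "exit status: $?"
```

The gdb breakpoint on add_file in the patch plays the role of the hook here: it guarantees the removal happens after the scan but before the file is opened.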

* Re: [PATCH v2] test: check the handling of files vanishing between scandir and indexing
  2016-11-27  9:59                 ` [PATCH v2] " Jani Nikula
@ 2016-11-29  2:16                   ` David Bremner
  2016-11-29  7:31                     ` Tomi Ollila
  2016-12-03 11:24                   ` David Bremner
  1 sibling, 1 reply; 25+ messages in thread
From: David Bremner @ 2016-11-29  2:16 UTC (permalink / raw)
  To: Jani Nikula, notmuch; +Cc: Tomi Ollila

Jani Nikula <jani@nikula.org> writes:

> +gdb --batch-silent --return-child-result -x notmuch-new-vanish.gdb \
> +    --args notmuch new 2>OUTPUT 1>/dev/null

I wonder if Tomi's suggestion of

  id:20161128221231.25528-2-david@tethera.net

applies here as well. In this case it is redirecting output, rather than
input, but I guess the same principle applies?

d

* Re: [PATCH v2] test: check the handling of files vanishing between scandir and indexing
  2016-11-29  2:16                   ` David Bremner
@ 2016-11-29  7:31                     ` Tomi Ollila
  0 siblings, 0 replies; 25+ messages in thread
From: Tomi Ollila @ 2016-11-29  7:31 UTC (permalink / raw)
  To: David Bremner, notmuch

On Tue, Nov 29 2016, David Bremner <david@tethera.net> wrote:

> Jani Nikula <jani@nikula.org> writes:
>
>> +gdb --batch-silent --return-child-result -x notmuch-new-vanish.gdb \
>> +    --args notmuch new 2>OUTPUT 1>/dev/null
>
> I wonder if Tomi's suggestion of
>
>   id:20161128221231.25528-2-david@tethera.net
>
> applies here as well. In this case it is redirecting output, rather than
> input, but I guess the same principle applies?

With input, it may matter which process gets the chance to consume it. With
output, as here, the output of every process is dumped to /dev/null, so I'd
go with this simpler approach in this case (and probably in all other cases;
if we wanted to (debug) log things, we would use `--batch` and redirect to
log files instead of /dev/null).

Tomi


>
> d

* Re: [PATCH v2] test: check the handling of files vanishing between scandir and indexing
  2016-11-27  9:59                 ` [PATCH v2] " Jani Nikula
  2016-11-29  2:16                   ` David Bremner
@ 2016-12-03 11:24                   ` David Bremner
  1 sibling, 0 replies; 25+ messages in thread
From: David Bremner @ 2016-12-03 11:24 UTC (permalink / raw)
  To: Jani Nikula, notmuch

Jani Nikula <jani@nikula.org> writes:

> Add a file for scandir to find, but use gdb to remove it before it
> gets indexed.
>

pushed to master,

d

Thread overview: 25+ messages
2016-11-04 12:46 [Paul Wise] Bug#843127: notmuch: race condition in `notmuch new`? David Bremner
2016-11-04 16:26 ` David Bremner
2016-11-13  1:51   ` Austin Clements
2016-11-14 18:44     ` J. Lewis Muir
2016-11-14 18:59       ` David Bremner
2016-11-04 18:47 ` Jani Nikula
2016-11-05  2:15   ` Paul Wise
2016-11-05 12:57     ` [PATCH] cli: consider files vanishing during notmuch new non-fatal Jani Nikula
2016-11-05 13:22       ` Paul Wise
2016-11-12 15:39         ` David Bremner
2016-11-12 16:04           ` Brian Sniffen
2016-11-12 16:10             ` David Bremner
2016-11-12 16:15               ` David Bremner
2016-11-12 21:08               ` Brian Sniffen
2016-11-12 21:36                 ` David Bremner
2016-11-12 20:35             ` Jani Nikula
2016-11-16 11:43       ` David Bremner
2016-11-21 20:14         ` [PATCH v2] " Jani Nikula
2016-11-26  2:44           ` [PATCH] cli/new: document new exit code David Bremner
2016-11-26  9:17             ` Jani Nikula
2016-11-26  9:18               ` [PATCH] test: check the handling of files vanishing between scandir and indexing Jani Nikula
2016-11-27  9:59                 ` [PATCH v2] " Jani Nikula
2016-11-29  2:16                   ` David Bremner
2016-11-29  7:31                     ` Tomi Ollila
2016-12-03 11:24                   ` David Bremner
