unofficial mirror of notmuch@notmuchmail.org
* Lost updates to Notmuch database
@ 2016-02-17 20:44 Eric J
  2016-02-18  1:03 ` David Bremner
  2016-02-21 12:57 ` Eric J
  0 siblings, 2 replies; 7+ messages in thread
From: Eric J @ 2016-02-17 20:44 UTC
  To: notmuch

Using the API, I am adding single mail files, already in the maildir, to
the Notmuch database and tagging them. It works, every time, as long as
I run it one file at a time. 

However, if I do it twice, in different processes, at the same time, one
file is added and tagged properly, the other is not (totally unfindable
by notmuch search). Neither process reports any error, and they both log
their actions normally. Actually a third simultaneous process also fails
to leave any result in the database.

This is in spite of using begin_atomic/end_atomic. I would not have been
surprised to get Xapian lock errors, but the database_open returns
success, as does the database_begin_atomic. 

The wrapper around the API for Tcl is very simple, and I can not see any
way for that or Tcl itself to cause this sort of problem. Beyond this,
I haven't thought of any way to decide if this could be a Notmuch problem
or a Xapian problem.

The API sequence is:

    database_open
    database_begin_atomic
    database_add_message
    (next 4 are a for loop)
    message_get_tags
    tags_valid
    tags_move_to_next
    tags_get
    message_freeze
    message_add_tag
    message_thaw
    message_maildir_flags_to_tags
    message_get_filename
    message_get_message_id
    database_end_atomic
    message_destroy
    database_close
    database_destroy

I didn't realise till it was mostly written, but it is pretty much like
add_new() in notmuch-new.c .

Eric
-- 
ms fnd in a lbry


* Re: Lost updates to Notmuch database
  2016-02-17 20:44 Lost updates to Notmuch database Eric J
@ 2016-02-18  1:03 ` David Bremner
  2016-02-18 12:59   ` Eric J
  2016-02-18 14:30   ` Tomi Ollila
  2016-02-21 12:57 ` Eric J
  1 sibling, 2 replies; 7+ messages in thread
From: David Bremner @ 2016-02-18  1:03 UTC
  To: Eric J, notmuch

Eric J <eric@deptj.eu> writes:

> However, if I do it twice, in different processes, at the same time, one
> file is added and tagged properly, the other is not (totally unfindable
> by notmuch search). Neither process reports any error, and they both log
> their actions normally. Actually a third simultaneous process also fails
> to leave any result in the database.

It should be impossible for more than one process to open a Xapian
database for writing at the same time. So if the processes are really
running in parallel, you should be getting error codes from the later
calls to notmuch_database_open{_verbose}. You claim that's not
happening, which is puzzling. Maybe you can try to duplicate your
problem with a tiny C program.
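
A tiny check of this kind, independent of both Notmuch and Xapian, could
look roughly like the sketch below. It assumes the writer lock is a POSIX
fcntl() advisory write lock on a lock file; that is an assumption about
the locking mechanism, not something established in this thread. The
function names and the file path are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take a non-blocking exclusive fcntl() lock on the whole file.
 * Returns 0 on success, or the errno value on failure. */
int try_write_lock(int fd)
{
    assert(fd >= 0);
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 = up to end of file, i.e. the whole file */
    if (fcntl(fd, F_SETLK, &fl) == -1)
        return errno;
    return 0;
}

/* The parent locks `path`, then forks a child that opens the same file and
 * tries to lock it as well.
 * Returns 1 if the child was refused (locking works),
 *         0 if the child also got the lock (the lost-update situation),
 *        -1 on a setup error. */
int second_writer_is_refused(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0666);
    if (fd == -1)
        return -1;
    if (try_write_lock(fd) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: fcntl() locks are not inherited across fork(), so this
         * is a genuinely competing process. */
        int cfd = open(path, O_WRONLY);
        if (cfd == -1)
            _exit(2);
        int err = try_write_lock(cfd);
        _exit((err == EACCES || err == EAGAIN) ? 0 : 1);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    close(fd);                  /* closing the descriptor drops the parent's lock */
    if (waited == -1 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;
    return (WEXITSTATUS(status) == 1) ? 0 : -1;
}
```

Calling second_writer_is_refused() on a scratch file from a small main()
gives a baseline for what a working lock looks like on the machine in
question. The same two functions can then be called from whatever host
process is under suspicion.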


* Re: Lost updates to Notmuch database
  2016-02-18  1:03 ` David Bremner
@ 2016-02-18 12:59   ` Eric J
  2016-02-18 14:30   ` Tomi Ollila
  1 sibling, 0 replies; 7+ messages in thread
From: Eric J @ 2016-02-18 12:59 UTC
  To: notmuch

On Wed, 17 Feb 2016 21:03:13 -0400, David Bremner <david@tethera.net> wrote:
> Eric J <eric@deptj.eu> writes:
> 
> > However, if I do it twice, in different processes, at the same time, one
> > file is added and tagged properly, the other is not (totally unfindable
> > by notmuch search). Neither process reports any error, and they both log
> > their actions normally. Actually a third simultaneous process also fails
> > to leave any result in the database.
> 
> It should be impossible for more than one process to open a Xapian
> database for writing at the same time. So if the processes are really
> running in parallel, you should be getting error codes from the later
> calls to notmuch_database_open{_verbose}. You claim that's not
> happening, which is puzzling. Maybe you can try to duplicate your
> problem with a tiny C program.

Thanks David. Impossible? - yes, but if I do just the open in two
interactive sessions (the Tcl interface makes this easy), I get the
following from "lsof flintlock":

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     21408 eric    5w   REG    8,9        0 667773 flintlock
cat     21418 eric    5w   REG    8,9        0 667773 flintlock

So, open for writing, but not locked (the processes have the right
parents).

I managed to catch a run of "notmuch new" with "lsof -r5 flintlock":

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     20763 eric    3ww  REG    8,9        0 667773 flintlock

which is open for writing and (partially) locked, so it must be doing
something that I'm not. I obviously need to go carefully through the
code to see what that is (and experiment in C if I can't find it!).
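
The lock state that lsof reports can also be queried programmatically. The
following is a minimal sketch, assuming the lock in question is a POSIX
fcntl() record lock; lock_holder_pid is a hypothetical helper name, not
part of Notmuch or Xapian.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reports which process, if any, holds a POSIX
 * fcntl() lock that would block a whole-file write lock on `path`.
 * Returns the holder's pid, 0 if there is no conflicting lock, and
 * -1 on error.
 *
 * Caveat: do not call this from a process that itself holds a lock on
 * `path`. F_GETLK never reports the caller's own locks, and the close()
 * below would release them. */
pid_t lock_holder_pid(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;        /* ask: "could a write lock be placed?" */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 = the whole file */

    int rc = fcntl(fd, F_GETLK, &fl);
    close(fd);
    if (rc == -1)
        return -1;

    /* F_GETLK sets l_type to F_UNLCK when nothing conflicts; otherwise
     * it fills in the description of one conflicting lock. */
    return (fl.l_type == F_UNLCK) ? 0 : fl.l_pid;
}
```

Run against the flintlock file while a writer has the database open, a
result of 0 would correspond to the unlocked case in the first lsof
listing above, and a positive pid to the locked case in the second.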

I don't have the _verbose functions BTW, still on 0.18.1 - I thought I
would get a proof-of-concept before upgrading notmuch, but...

Thanks again,

Eric
-- 
ms fnd in a lbry


* Re: Lost updates to Notmuch database
  2016-02-18  1:03 ` David Bremner
  2016-02-18 12:59   ` Eric J
@ 2016-02-18 14:30   ` Tomi Ollila
  2016-02-18 21:26     ` Eric J
  1 sibling, 1 reply; 7+ messages in thread
From: Tomi Ollila @ 2016-02-18 14:30 UTC
  To: David Bremner, Eric J, notmuch

On Thu, Feb 18 2016, David Bremner <david@tethera.net> wrote:

> Eric J <eric@deptj.eu> writes:
>
>> However, if I do it twice, in different processes, at the same time, one
>> file is added and tagged properly, the other is not (totally unfindable
>> by notmuch search). Neither process reports any error, and they both log
>> their actions normally. Actually a third simultaneous process also fails
>> to leave any result in the database.
>
> It should be impossible for more than one process to open a Xapian
> database for writing at the same time. So if the processes are really
> running in parallel, you should be getting error codes from the later
> calls to notmuch_database_open{_verbose}. You claim that's not
> happening, which is puzzling. Maybe you can try to duplicate your
> problem with a tiny C program.

In addition to that (or even before), you could

1) make sure you can reproduce the problem
2) try to reproduce it with the command prefixed by ltrace -tt
3) carefully examine the ltrace logs to figure out where the problem lies

Tomi

Hmm, interestingly, when I run

LD_LIBRARY_PATH=~/vc/ext/notmuch/lib ltrace -f -tt ~/vc/ext/notmuch/notmuch-shared new

I did not see any Xapian references, but when I did

ltrace -f -tt ~/vc/ext/notmuch/notmuch new

I did. Interestingly, when using libnotmuch.so.4 the Xapian interface
is hidden (is it baked into ~/vc/ext/notmuch/lib/libnotmuch.so.4.3.0? :O)


* Re: Lost updates to Notmuch database
  2016-02-18 14:30   ` Tomi Ollila
@ 2016-02-18 21:26     ` Eric J
  0 siblings, 0 replies; 7+ messages in thread
From: Eric J @ 2016-02-18 21:26 UTC
  To: notmuch

On Thu, 18 Feb 2016 16:30:34 +0200, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Thu, Feb 18 2016, David Bremner <david@tethera.net> wrote:
> 
> > Eric J <eric@deptj.eu> writes:
> >
> >> However, if I do it twice, in different processes, at the same time, one
> >> file is added and tagged properly, the other is not (totally unfindable
> >> by notmuch search). Neither process reports any error, and they both log
> >> their actions normally. Actually a third simultaneous process also fails
> >> to leave any result in the database.
> >
> > It should be impossible for more than one process to open a Xapian
> > database for writing at the same time. So if the processes are really
> > running in parallel, you should be getting error codes from the later
> > calls to notmuch_database_open{_verbose}. You claim that's not
> > happening, which is puzzling. Maybe you can try to duplicate your
> > problem with a tiny C program.
> 
> In addition to that (or even before), you could
> 
> 1) be able to reproduce the problem

As I said in my response to David, I can do just the open, and see that
there is no lock, and do the same open in another session without error.

> 2) try to reproduce it prefixing the command with ltrace -tt

Well, as you say below, no Xapian references. Even worse, ltracing the
Tcl version sees no Notmuch references either.

> 3) examine carefully the ltrace logs to figure out where the problem lies

So all I can see is that the notmuch_database_open from notmuch new
looks just like I think mine would look - if I could see it.

(Adding -S to ltrace was not particularly informative either.)

> Tomi
> 
> Hmm, Interestingly when I run
> 
> LD_LIBRARY_PATH=~/vc/ext/notmuch/lib ltrace -f -tt ~/vc/ext/notmuch/notmuch-shared new
> 
> I did not see any Xapian references, but when I did
> 
> ltrace -f -tt ~/vc/ext/notmuch/notmuch new
> 
> I did. Interestingly when using libnotmuch.so.4 the xapian interface
> is hidden (is it baked inside ~/vc/ext/notmuch/lib/libnotmuch.so.4.3.0 :O)

My latest piece of evidence suggests that this may not be a Notmuch
problem, because opening a Xapian database via the Xapian-Tcl bindings
results in:

COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     2803 eric    5w   REG    8,9        0 526601 /home/eric/omega/data/newsgroups/flintlock

No lock here either (and this is not even a Notmuch database). 

Investigations will continue.

Eric
-- 
ms fnd in a lbry


* Re: Lost updates to Notmuch database
  2016-02-17 20:44 Lost updates to Notmuch database Eric J
  2016-02-18  1:03 ` David Bremner
@ 2016-02-21 12:57 ` Eric J
  2016-02-27 20:04   ` Eric J
  1 sibling, 1 reply; 7+ messages in thread
From: Eric J @ 2016-02-21 12:57 UTC
  To: notmuch

On Wed, 17 Feb 2016 21:44:23 +0100 (CET), Eric J <eric@deptj.eu> wrote:
> Using the API, I am adding single mail files, already in the maildir, to
> the Notmuch database and tagging them. It works, every time, as long as
> I run it one file at a time. 
> 
> However, if I do it twice, in different processes, at the same time, one
> file is added and tagged properly, the other is not (totally unfindable
> by notmuch search). Neither process reports any error, and they both log
> their actions normally. Actually a third simultaneous process also fails
> to leave any result in the database.
8>< --------
> 
> The wrapper around the API for Tcl is very simple, and I can not see any
> way for that or Tcl itself to cause this sort of problem. Beyond this,
> I haven't thought of any way to decide if this could be a Notmuch problem
> or a Xapian problem.
8>< --------

Well, after some experimenting, this is not specific to Notmuch at all.
Xapian itself has Tcl bindings, and they also silently fail to lock the
file. So does putting a minimally changed copy of Xapian's locking code
directly in a Tcl extension, though that code works when called from a
tiny C main program instead.

So thanks for looking, I will report here if I find out why.

Eric
-- 
ms fnd in a lbry


* Re: Lost updates to Notmuch database
  2016-02-21 12:57 ` Eric J
@ 2016-02-27 20:04   ` Eric J
  0 siblings, 0 replies; 7+ messages in thread
From: Eric J @ 2016-02-27 20:04 UTC
  To: notmuch

On Sun, 21 Feb 2016 13:57:30 +0100 (CET), Eric J <eric@deptj.eu> wrote:
> On Wed, 17 Feb 2016 21:44:23 +0100 (CET), Eric J <eric@deptj.eu> wrote:
> > Using the API, I am adding single mail files, already in the maildir, to
> > the Notmuch database and tagging them. It works, every time, as long as
> > I run it one file at a time. 
> > 
> > However, if I do it twice, in different processes, at the same time, one
> > file is added and tagged properly, the other is not (totally unfindable
> > by notmuch search). Neither process reports any error, and they both log
> > their actions normally. Actually a third simultaneous process also fails
> > to leave any result in the database.
> 8>< --------
> > 
> > The wrapper around the API for Tcl is very simple, and I can not see any
> > way for that or Tcl itself to cause this sort of problem. Beyond this,
> > I haven't thought of any way to decide if this could be a Notmuch problem
> > or a Xapian problem.
> 8>< --------
> 
> Well, after some experimenting, this is not specific to Notmuch at all.
> Xapian itself has Tcl bindings, and they also silently fail to lock the
> file. So does putting a minimally changed copy of Xapian's locking code
> directly in a Tcl extension, though that code works when called from a
> tiny C main program instead.
> 
> So thanks for looking, I will report here if I find out why.

Well, this seems to be a Tcl problem, not present in 8.5, but present in
8.6, up until the very new 8.6.5rc2, where it seems to work. Sadly, I
still have no explanation of the problem.

Eric
-- 
ms fnd in a lbry

