notmuch's idea of concurrency / failing an invocation

unofficial mirror of notmuch@notmuchmail.org

* notmuch's idea of concurrency / failing an invocation
@ 2011-01-27 18:20 Thomas Schwinge
  2011-01-27 18:40 ` micah anderson
  2011-01-28  5:07 ` Carl Worth
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Schwinge @ 2011-01-27 18:20 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 1935 bytes --]

Hello!

Stepping away from the current code base -- what is notmuch's original
idea of concurrency?  That is, all of us probably know that one:

    A Xapian exception occurred opening database: Unable to get write
      lock on /home/thomas/Mail-schwinge.name-thomas/.notmuch/xapian:
      already locked

I recently saw that one while using the Emacs UI (which was trying to
remove an unread tag or similar) while, in parallel, a delivery to the
notmuch DB was going on.

Apparently the DB we're using doesn't allow for simultaneous writing
(even though it can't even possibly have been dangerous in this case).

Which is the original idea here?  Is it that...

  * each and every client should catch these kinds of errors, and retry,
    or eventually give up at some point, and report the status to the
    user; or is it that...

  * notmuch internally should catch these concurrency cases, and retry,
    or eventually give up at some point (``notmuch --maximum-wait=30s tag
    [...]''), and fail as seen above?

This one is an obvious temporary error due to a concurrency situation.
Wouldn't the latter suggestion be preferable here?  I guess that in most
cases the DB isn't locked for long periods of time, and thus the
contention would resolve quickly.
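
To make the second option concrete, here is a minimal sketch of the
retry-until-deadline behaviour as a shell function.  The name retry_until
and the fixed one-second pause are assumptions made for illustration;
notmuch itself has no --maximum-wait option, and this helper retries on
any failure, not only on a locked database.

```shell
# retry_until MAX_WAIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of notmuch: run COMMAND until it succeeds,
# pausing one second between attempts, and give up once the next attempt
# would start after MAX_WAIT_SECONDS have elapsed.
retry_until() {
    local max_wait="$1" deadline status
    shift
    deadline=$(( $(date +%s) + max_wait ))
    while :; do
        "$@" && return 0
        status=$?                       # exit status of the failed attempt
        if [ $(( $(date +%s) + 1 )) -gt "$deadline" ]; then
            echo "retry_until: giving up after ${max_wait}s" >&2
            return "$status"
        fi
        sleep 1
    done
}
```

Called as "retry_until 30 notmuch tag [...]", this would keep trying for
up to 30 seconds and then report the exit status of the last attempt.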

One difficulty I see is judging which errors are temporary and which are
permanent -- which is obvious in a lot of cases (concurrent DB access,
memory starved or any other OS resource), but may not be, for example in
case of I/O errors (is ``disk full'' a permanent error?).  And then, for
some of these cases, waiting does make sense (concurrent DB access, as
suggested above), and for other (temporary?) errors it doesn't make (a
lot of) sense (out of memory: only sensible thing is to abort, and have
the caller re-try, or disk full: waiting for some free space may be worth
it, or it may be not).
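
One way to picture this distinction is a small policy function that maps
an error message to a decision.  This is only a sketch: classify_failure
and its three answers are hypothetical, the "already locked" text is taken
from the error quoted above, and the other two strings are the usual Linux
messages for a full disk and for memory exhaustion, which notmuch is merely
assumed to pass through.  Matching on message text is fragile.

```shell
# classify_failure ERROR_TEXT
# Hypothetical policy function: prints "retry", "ask" or "abort" depending
# on the error text captured from a failed command.
classify_failure() {
    case "$1" in
        *"already locked"*)          echo retry ;;  # concurrent writer, usually short-lived
        *"No space left on device"*) echo ask ;;    # disk full: waiting may or may not help
        *"Cannot allocate memory"*)  echo abort ;;  # out of memory: let the caller re-try
        *)                           echo abort ;;  # unknown errors are treated as permanent
    esac
}
```

A wrapper would capture the failing command's stderr, pass it to
classify_failure, and only sleep and retry when the answer is "retry".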

Regards,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 18:20 notmuch's idea of concurrency / failing an invocation Thomas Schwinge
@ 2011-01-27 18:40 ` micah anderson
  2011-01-27 20:35   ` Jameson Rollins
  2011-01-29  1:05   ` Stewart Smith
  2011-01-28  5:07 ` Carl Worth
  1 sibling, 2 replies; 17+ messages in thread
From: micah anderson @ 2011-01-27 18:40 UTC (permalink / raw)
  To: Thomas Schwinge, notmuch

[-- Attachment #1: Type: text/plain, Size: 1691 bytes --]

On Thu, 27 Jan 2011 19:20:00 +0100, Thomas Schwinge <thomas@schwinge.name> wrote:
> Stepping away from the current code base -- what is notmuch's original
> idea of concurrency?  That is, all of us probably know that one:
> 
>     A Xapian exception occurred opening database: Unable to get write
>       lock on /home/thomas/Mail-schwinge.name-thomas/.notmuch/xapian:
>       already locked
> 
> I recently saw that one while using the Emacs UI (that one tried to
> remove an unread tag or similar), and in parallel a delivery to the
> notmuch DB was going on.

Due to my harddisk in my laptop being slow (5400RPM), my notmuch
database growing, and perhaps some fragmentation somewhere, this has
become *incredibly* annoying for me. I am checking email every 30
minutes, and I'm nicing and ionicing the processes so I can use my
machine, but while those processes are running, I'm effectively locked
out of a good portion of my email. 

Usually, I switch to another task until my disk light has ceased being
solid, because the update time is too slow for me to wait. 

Now that folders are making it in, the two remaining features that are
driving me nuts with notmuch are this one and the
verification/decryption/encryption process (replying to an encrypted
message is 12 distinct steps for me, which is discouraging me from doing
that at all). 

I really don't want to complain, because I have no time to help in these
areas; rather, I'm interested to know if anyone has any pointers for
making this less annoying, and I'm hoping that at some point I can free
up time to help. Perhaps I need to dump/restore my notmuch DB? Or index
less mail?

micah

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 18:40 ` micah anderson
@ 2011-01-27 20:35   ` Jameson Rollins
  2011-01-27 22:20     ` Austin Clements
  2011-02-03 16:28     ` micah anderson
  2011-01-29  1:05   ` Stewart Smith
  1 sibling, 2 replies; 17+ messages in thread
From: Jameson Rollins @ 2011-01-27 20:35 UTC (permalink / raw)
  To: micah anderson, Thomas Schwinge, notmuch

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

On Thu, 27 Jan 2011 13:40:25 -0500, micah anderson <micah@riseup.net> wrote:
> Due to my harddisk in my laptop being slow (5400RPM), my notmuch
> database growing, and perhaps some fragmentation somewhere, this has
> become *incredibly* annoying for me. I am checking email every 30
> minutes, and I'm nicing and ionicing the processes so I can use my
> machine, but while those processes are running, I'm effectively locked
> out of a good portion of my email. 

I also have a very slow disk, but this is very rarely a problem for me.
I retrieve mail every 10 minutes, and the corresponding notmuch new
usually takes a minute or so.  I really haven't found it to be much of a
bother to just wait it out.

One of the suggested ways to develop around this problem would be a
notmuch daemon that would queue database modification requests.  I don't
think anyone has been working on this yet, but if this is a big problem
for you guys, you might start looking into putting one together.

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 20:35   ` Jameson Rollins
@ 2011-01-27 22:20     ` Austin Clements
  2011-01-28  5:10       ` Carl Worth
  2011-02-24  6:59       ` Austin Clements
  2011-02-03 16:28     ` micah anderson
  1 sibling, 2 replies; 17+ messages in thread
From: Austin Clements @ 2011-01-27 22:20 UTC (permalink / raw)
  To: Jameson Rollins; +Cc: notmuch, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 1596 bytes --]

I'm looking into breaking notmuch new up into small transactions.  It
wouldn't be much of a leap from there to simply close and reopen the database
between transactions if another task wants to use it, which would release
the lock and let the queued notmuch task have the database for a bit.  It
seems silly to have a daemon when all of notmuch's state is already on disk
and a queue on a lock is as good as a queue in a daemon, but without the
accompanying architectural shenanigans.
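
The effect of small transactions on a waiting process can be sketched
with flock(1) and xargs.  Everything here is an assumption made for
illustration: process_in_batches is a hypothetical helper, the lock is a
side file that all cooperating processes would have to use (it is not
Xapian's own write lock), and COMMAND stands in for whatever indexes one
batch of messages.

```shell
# process_in_batches LOCKFILE BATCH_SIZE COMMAND [ARGS...]
# Reads whitespace-separated items from stdin and runs COMMAND on at most
# BATCH_SIZE items at a time.  Each batch runs under its own exclusive
# flock(1) lock on LOCKFILE, so the lock is released between batches and
# a process waiting on the same file gets a chance to acquire it.
process_in_batches() {
    local lockfile="$1" batch_size="$2"
    shift 2
    xargs -n "$batch_size" flock -x "$lockfile" "$@"
}
```

The smaller the batch, the sooner a waiting interactive command gets its
turn, at the price of more lock acquisitions overall.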

On Thu, Jan 27, 2011 at 3:35 PM, Jameson Rollins <jrollins@finestructure.net> wrote:

> On Thu, 27 Jan 2011 13:40:25 -0500, micah anderson <micah@riseup.net> wrote:
> > Due to my harddisk in my laptop being slow (5400RPM), my notmuch
> > database growing, and perhaps some fragmentation somewhere, this has
> > become *incredibly* annoying for me. I am checking email every 30
> > minutes, and I'm nicing and ionicing the processes so I can use my
> > machine, but while those processes are running, I'm effectively locked
> > out of a good portion of my email.
>
> I also have a very slow disk, but this is very rarely a problem for me.
> I retrieve mail every 10 minutes, and the corresponding notmuch new
> usually takes a minute or so.  I really haven't found it to be much of a
> bother to just wait it out.
>
> One of the suggested ways to develop around this problem would be a
> notmuch daemon that would queue database modification requests.  I don't
> think anyone has been working on this yet, but if this is a big problem
> for you guys, you might start looking into putting one together.
>
> jamie.

[-- Attachment #2: Type: text/html, Size: 1997 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 18:20 notmuch's idea of concurrency / failing an invocation Thomas Schwinge
  2011-01-27 18:40 ` micah anderson
@ 2011-01-28  5:07 ` Carl Worth
  1 sibling, 0 replies; 17+ messages in thread
From: Carl Worth @ 2011-01-28  5:07 UTC (permalink / raw)
  To: Thomas Schwinge, notmuch

[-- Attachment #1: Type: text/plain, Size: 2961 bytes --]

On Thu, 27 Jan 2011 19:20:00 +0100, Thomas Schwinge <thomas@schwinge.name> wrote:
> Which is the original idea here?  Is it that...

There's no original idea yet. It's essentially an unsolved problem right now.

>   * each and every client should catch these kinds of errors, and retry,
>     or eventually give up at some point, and report the status to the
>     user; or is it that...
> 
>   * notmuch internally should catch these concurrency cases, and retry,
>     or eventually give up at some point (``notmuch --maximum-wait=30s tag
>     [...]''), and fail as seen above?

Some people have actually already done work on solutions in one way or
another. Here are a few of the messages I found in my "outstanding
notmuch mail to read"[*] queue:

    James Vasile patched the emacs interface to call notmuch
    asynchronously and to repeatedly call it if it fails (he also
    wonders if it should have some sort of timeout):

	id:"87vddnlxos.wl%james@hackervisions.org"

    James also wrote a shell script that repeatedly calls the notmuch
    binary as necessary (and he wonders if this retrying should happen
    inside notmuch itself):

	id:"87pr3sw43a.fsf@hackervisions.org"

    "servilio" wrote a new "notmuch repl" command which can accept
    notmuch operations expressed in text form on stdin, and then
    interpret and execute them. That's a good start on a notmuch daemon:

	id:"AANLkTi=7eCt0=NqUiJFrGDcaZ17LOd3qNNqN1-ASwYzr@mail.gmail.com"

I'm not sure yet which approach (or approaches) we want. But I would
love to see some of the limitations described in the messages above
addressed. That would definitely make some of the patches more
acceptable.

-Carl

[*] And yes, my queue really does span a year(!) or so. That's
embarrassing. I'm committed to making progress on this queue and staying
up-to-date with new patches, so I've made a couple of recent changes:

1. I'm now processing the queue largely in reverse-chronological
   order. The idea here is that I can stay on top of new posts, while
   also making progress on previously-sent items.

   This does mean that you can hack my workflow by replying to an old
   thread, (and thereby bringing it back to my attention). Please feel
   free to do that---ideally by mentioning any new information such as
   "these patches are now rebased <here>" or "I've tested these patches
   in daily use for X months and they still apply fine to master" or
   similar.

2. I've date-limited my saved search for my notmuch queue to show a
   small number of messages. This is a cheap psychological hack. If the
   number on the queue is too large it makes me hesitant to even look at
   it. But with a small number, it's easier to make progress since the
   end is apparently in sight.

   Of course, once I reduce my date-limited queue to 0, I'll extend the
   date back into the past and try to keep working through things.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 22:20     ` Austin Clements
@ 2011-01-28  5:10       ` Carl Worth
  2011-01-28  9:45         ` Thomas Schwinge
  2011-02-24  6:59       ` Austin Clements
  1 sibling, 1 reply; 17+ messages in thread
From: Carl Worth @ 2011-01-28  5:10 UTC (permalink / raw)
  To: Austin Clements, Jameson Rollins; +Cc: notmuch, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 1115 bytes --]

On Thu, 27 Jan 2011 17:20:21 -0500, Austin Clements <amdragon@mit.edu> wrote:
> I'm looking into breaking notmuch new up into small transactions.  It
> wouldn't be much of a leap from there to simply close and reopen the database
> between transactions if another task wants to use it, which would release
> the lock and let the queued notmuch task have the database for a bit.

That sounds like something very useful to pursue. Please continue!

> It seems silly to have a daemon when all of notmuch's state is already on disk
> and queue on a lock is as good as a queue in a daemon, but without the
> accompanying architectural shenanigans.

It would definitely be nice to avoid the complexity inherent in having a
daemon, but how do you imagine "queue on a lock" to work? We don't have
anything like that in place now.

Another advantage that can happen with queueing (wherever it occurs) is
to allow a client to be very responsive without waiting for an operation
to complete. Though that can of course be bad if the operation isn't
reliably committed.

-Carl

-- 
carl.d.worth@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-28  5:10       ` Carl Worth
@ 2011-01-28  9:45         ` Thomas Schwinge
  2011-01-28 15:36           ` Mike Kelly
  2011-01-28 16:50           ` Austin Clements
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Schwinge @ 2011-01-28  9:45 UTC (permalink / raw)
  To: Carl Worth, Austin Clements, Jameson Rollins; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 4232 bytes --]

Hello!

On Fri, 28 Jan 2011 15:10:01 +1000, Carl Worth <cworth@cworth.org> wrote:
> On Thu, 27 Jan 2011 17:20:21 -0500, Austin Clements <amdragon@mit.edu> wrote:
> > I'm looking into breaking notmuch new up into small transactions.  It
> > wouldn't be much of a leap from there to simply close and reopen the database
> > between transactions if another task wants to use it, which would release
> > the lock and let the queued notmuch task have the database for a bit.
> 
> That sounds like something very useful to pursue. Please continue!

Ack!  And actually -- I just wondered about that: what happens if
``notmuch new'' has executed notmuch_database_add_message for a new
message M, but then is killed before adding any tags and finishing up
(and supposing that the DB isn't in an invalid state now).  This process
of adding M to the DB and applying any tags isn't one single transaction
currently, right?  (And this is exactly what you're working on
changing?)  Am I right that what currently happens is that upon the next
``notmuch new'' run, notmuch will not reconsider M (given that it already
is present in the DB), but continue with the next messages -- thus
leaving M without any tags?  This isn't a very likely scenario, but still
a possible one.

> > It seems silly to have a daemon when all of notmuch's state is already on disk
> > and queue on a lock is as good as a queue in a daemon, but without the
> > accompanying architectural shenanigans.

Ack, too.  A daemon seems one abstraction layer too much.  (But I'm not
actively opposed either, if someone has a valid use for such a scheme.)

> It would definitely be nice to avoid the complexity inherent in having a
> daemon, but how do you imagine "queue on a lock" to work? We don't have
> anything like that in place now.

I suppose what he means is trying to get the lock, and if that fails wait
a bit / wait until it is available again.

Actually, as a next step, wouldn't it also be possible to add some
heuristic to avoid ``notmuch new'' (being a low-priority task) blocking
some interactive user (UI; high-priority task)?  But we can pursue such
schemes as soon as the basic infrastructure is in place.

> Another advantage that can happen with queueing (wherever it occurs) is
> to allow a client to be very responsive without waiting for an operation
> to complete. Though that can of course be bad if the operation isn't
> reliably committed.

(Obviously this can only work as long as we don't immediately need the
operation's result; think ``notmuch show''.)

So, if the DB has the functionality to internally queue and immediately
acknowledge transactions, and only later (reliably) commit them, wouldn't
that be fine indeed?  For example, ``notmuch tag'' then wouldn't have to
wait for the DB to be writable.  (And note that I have no idea whether
Xapian supports such things.)  But on the other hand we would like to
immediately display the requested change in the UI, right?

What notmuch-show.el:notmuch-show-remove-tag currently does is *not*
re-asking the DB for a message's current tags after having removed a
specific one, but instead it interprets the tag removal command itself --
which is easy enough in this case, and rather unlikely to ever yield
different results, at least unless there's another process operating on
the DB concurrently.

Otherwise, the other way round, the client could maintain a list of to-do
items, to which actions are added if the DB is currently busy, and this
list is periodically worked on in order to get it empty.  For example,
tag changes that are in this list, but not yet committed in the DB could
be displayed in another color in the UI.  But doing so would shift
responsibility to the UI that, in my humble opinion, belongs in the DB.
(Actually, this issue feels similar to the question of who should do the
retrying when the DB is busy, the UI or the notmuch process itself, which
we're discussing in another thread.)
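
The client-side to-do list could look roughly like the following sketch.
The names queue_add and queue_drain, the one-item-per-line queue file and
the RUNNER callback are all hypothetical, and a single client is assumed
(an item added while the queue is being drained could be lost).

```shell
# queue_add QUEUE_FILE ITEM
# Append one pending operation (a single line) to the queue.
queue_add() {
    printf '%s\n' "$2" >> "$1"
}

# queue_drain QUEUE_FILE RUNNER [ARGS...]
# Runs "RUNNER ARGS... ITEM" for each queued item, oldest first.  Processing
# stops at the first failure so that later operations are not applied out
# of order; the failed item and everything after it stay in the queue.
# Returns 0 if the queue was emptied, 1 if items remain.
queue_drain() {
    local queue="$1" item failed=0
    shift
    [ -s "$queue" ] || return 0
    : > "$queue.remaining"
    while IFS= read -r item; do
        if [ "$failed" -eq 0 ] && "$@" "$item" </dev/null; then
            continue
        fi
        failed=1
        printf '%s\n' "$item" >> "$queue.remaining"
    done < "$queue"
    mv "$queue.remaining" "$queue"
    return "$failed"
}
```

Whatever is still listed in the queue file after a drain is exactly the
set of changes a UI would show as not yet committed.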

As you can guess I'm not very much into DBs, and neither too much into
concurrent systems, so if my ideas don't make sense, please feel free to
refer me to literature.

Regards,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-28  9:45         ` Thomas Schwinge
@ 2011-01-28 15:36           ` Mike Kelly
  2011-01-28 16:57             ` Austin Clements
  2011-01-28 16:50           ` Austin Clements
  1 sibling, 1 reply; 17+ messages in thread
From: Mike Kelly @ 2011-01-28 15:36 UTC (permalink / raw)
  To: notmuch

On Fri, 28 Jan 2011 10:45:19 +0100, Thomas Schwinge <thomas@schwinge.name> wrote:
> > It would definitely be nice to avoid the complexity inherent in having a
> > daemon, but how do you imagine "queue on a lock" to work? We don't have
> > anything like that in place now.
>
> I suppose what he means is trying to get the lock, and if that fails wait
> a bit / wait until it is available again.
>
> Actually, as a next step, wouldn't it also be possible to add some
> heuristic to avoid ``notmuch new'' (being a low-priority task) blocking
> some interactive user (UI; high-priority task)?  But we can pursue such
> schemes as soon as the basic infrastructure is in place.

Couldn't we pretty much get the desired behavior by using flock(2)?
Basically, take out a LOCK_EX when we need to write, and a LOCK_SH when
we only need to read. Using the blocking form, things should pretty much
just queue up and take their turn, right?

I'm not familiar with Xapian, but if it doesn't give us something we
could use this sort of locking on, couldn't we just add some
/path/to/mail/.notmuch.lock file that we open to hold a lock on?

We already have to specify if we want a read-only or read-write database
handle in notmuch_database_open, so it seems like it'd be easy enough to
hook in there.

-- 
Mike Kelly


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-28  9:45         ` Thomas Schwinge
  2011-01-28 15:36           ` Mike Kelly
@ 2011-01-28 16:50           ` Austin Clements
  1 sibling, 0 replies; 17+ messages in thread
From: Austin Clements @ 2011-01-28 16:50 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 3314 bytes --]

On Fri, Jan 28, 2011 at 4:45 AM, Thomas Schwinge <thomas@schwinge.name> wrote:

> On Fri, 28 Jan 2011 15:10:01 +1000, Carl Worth <cworth@cworth.org> wrote:
> > On Thu, 27 Jan 2011 17:20:21 -0500, Austin Clements <amdragon@mit.edu> wrote:
> > > I'm looking into breaking notmuch new up into small transactions.  It
> > > wouldn't be much of a leap from there to simply close and reopen the database
> > > between transactions if another task wants to use it, which would release
> > > the lock and let the queued notmuch task have the database for a bit.
> >
> > That sounds like something very useful to pursue. Please continue!
>
> Ack!  And actually -- I just wondered about that: what happens if
> ``notmuch new'' has executed notmuch_database_add_message for a new
> message M, but then is killed before adding any tags and finishing up
> (and supposing that the DB isn't in an invalid state now).  This process
> of adding M to the DB and applying any tags isn't one single transaction
> currently, right?  (And this is exactly what you're working on
> changing?)  Am I right that what currently happens is that upon the next
> ``notmuch new'' run, notmuch will not reconsider M (given that it already
> is present in the DB), but continue with the next messages -- thus
> leaving M without any tags?  This isn't a very likely scenario, but still
> a possible one.

There are quite a few bugs like this.  In fact, last night I added a test
that interrupts notmuch new (for real, not SIGINT) after every database
write, and on each interrupted database snapshot, re-runs notmuch new to
completion, then checks that the database winds up in the correct state.
There are dozens of interruption points where it doesn't, and many of the
resulting inconsistencies are permanent, even if you force notmuch new to
rescan the maildir.

> > Another advantage that can happen with queueing (wherever it occurs) is
> > to allow a client to be very responsive without waiting for an operation
> > to complete. Though that can of course be bad if the operation isn't
> > reliably committed.
>
> (Obviously this can only work as long as we don't immediately need the
> operation's result; think ``notmuch show''.)
>
> So, if the DB has the functionality to internally queue and immediately
> acknowledge transactions, and only later (reliably) commit them, wouldn't
> that be fine indeed?  For example, ``notmuch tag'' then wouldn't have to
> wait for the DB to be writable.  (And note that I have no idea whether
> Xapian supports such things.)  But on the other hand we would like to
> immediately display the requested change in the UI, right?
>

This would be fantastic, if the client could indicate the difference between
a "pending" change and a "committed" change as you suggest below.  I don't
think having the database lie about its commit state is the right way to do
this, though (nor should the client lie about this, thus the "pending"
display).  A better way would be for the client to update the display to
"pending", start the notmuch operation asynchronously, have the notmuch
operation block and queue up on the database lock, then have the client
update the display to "committed" when the asynchronous operation returns.
No weird database operations or transactional semantics are needed, and the
client side is fairly straightforward.
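
The pending/committed flow described above can be sketched in a few lines
of shell.  This shows only the control flow: apply_async and the state
file are hypothetical, a real client such as the Emacs UI would update its
display instead of a file, and COMMAND is assumed to be a notmuch
invocation that blocks on the database lock.

```shell
# apply_async STATE_FILE COMMAND [ARGS...]
# Records "pending" in STATE_FILE, starts COMMAND in the background, and
# rewrites STATE_FILE to "committed" or "failed" once COMMAND has finished.
apply_async() {
    local state_file="$1"
    shift
    echo pending > "$state_file"
    {
        if "$@"; then
            echo committed > "$state_file"
        else
            echo failed > "$state_file"
        fi
    } &
}
```

The caller returns immediately and can show the change as pending; the
state only becomes "committed" after the operation has really completed,
so neither the database nor the client has to lie about it.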

[-- Attachment #2: Type: text/html, Size: 4222 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-28 15:36           ` Mike Kelly
@ 2011-01-28 16:57             ` Austin Clements
  2011-01-28 18:17               ` Austin Clements
  2011-01-29 16:10               ` Mike Kelly
  0 siblings, 2 replies; 17+ messages in thread
From: Austin Clements @ 2011-01-28 16:57 UTC (permalink / raw)
  To: Mike Kelly; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]

On Fri, Jan 28, 2011 at 10:36 AM, Mike Kelly <pioto@pioto.org> wrote:

> On Fri, 28 Jan 2011 10:45:19 +0100, Thomas Schwinge <thomas@schwinge.name> wrote:
> > > It would definitely be nice to avoid the complexity inherent in having a
> > > daemon, but how do you imagine "queue on a lock" to work? We don't have
> > > anything like that in place now.
> >
> > I suppose what he means is trying to get the lock, and if that fails wait
> > a bit / wait until it is available again.
> >
> > Actually, as a next step, wouldn't it also be possible to add some
> > heuristic to avoid ``notmuch new'' (being a low-priority task) blocking
> > some interactive user (UI; high-priority task)?  But we can pursue such
> > schemes as soon as the basic infrastructure is in place.
>
> Couldn't we pretty much get the desired behavior by using flock(2)?
> Basically, take out a LOCK_EX when we need to write, and a LOCK_SH when
> we only need to read. Using the blocking form, things should pretty much
> just queue up and take their turn, right?
>
> I'm not familiar with Xapian, but if it doesn't give us something we
> could use this sort of locking on, couldn't we just add some
> /path/to/mail/.notmuch.lock file that we open to hold a lock on?
>

Yes, exactly.  All of this.  Unfortunately, Xapian doesn't expose the
ability to block on the lock (see the fcntl call in backends/flint_lock.cc,
which is hard-coded to the non-blocking F_SETLK instead of F_SETLKW), so
we'd either need a new Xapian option, or we would just have to wrap our own
flock/fcntl lock around things as you suggest.
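
The difference between the two fcntl commands can be tried out from the
shell with flock(1), which offers the same choice: with -n it fails at
once when the lock is held, without it the call sleeps until the lock is
free.  A minimal sketch follows; run_nowait and run_blocking are
hypothetical names, and flock(1) takes flock(2) locks on a separate file,
so this illustrates the behaviour only and does not touch Xapian's own
lock.

```shell
# run_nowait LOCKFILE COMMAND [ARGS...]
# Non-blocking, like F_SETLK: fail immediately (flock's default conflict
# exit status is 1) if another process holds the lock on LOCKFILE.
run_nowait() {
    local lockfile="$1"
    shift
    flock -n -x "$lockfile" "$@"
}

# run_blocking LOCKFILE COMMAND [ARGS...]
# Blocking, like F_SETLKW: sleep until the lock on LOCKFILE is free, then
# run COMMAND while holding it.
run_blocking() {
    local lockfile="$1"
    shift
    flock -x "$lockfile" "$@"
}
```

With the blocking form, several invocations simply line up behind the
current holder, which is the "queue on a lock" behaviour discussed above.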

[-- Attachment #2: Type: text/html, Size: 2063 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-28 16:57             ` Austin Clements
@ 2011-01-28 18:17               ` Austin Clements
  2011-01-29 16:10               ` Mike Kelly
  1 sibling, 0 replies; 17+ messages in thread
From: Austin Clements @ 2011-01-28 18:17 UTC (permalink / raw)
  To: Mike Kelly; +Cc: notmuch

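Actually, this is trivial to play with.  Here's a stop-gap wrapper
script for people having trouble with Xapian locking:

#!/bin/bash

# Adjust these two paths to your installation.
NOTMUCH_BIN="/path/to/notmuch"
MAIL_DIR="/path/to/mailroot"

# The subshell holds file descriptor 200 open on the lock file; the lock
# is released when the subshell exits.
(
    case "$1" in
        setup|help)
            # No database access, so no lock is needed.
            ;;
        search|show|count|reply|dump|search-tags|part)
            # Read-only commands share the lock.
            flock -s 200;;
        *)
            # Everything else may write, so wait for an exclusive lock.
            flock -x 200;;
    esac
    # Close fd 200 in the child so notmuch does not inherit the lock fd.
    "$NOTMUCH_BIN" "$@" 200>&-
) 200>"$MAIL_DIR"/.notmuch/lock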

On Fri, Jan 28, 2011 at 11:57 AM, Austin Clements <amdragon@mit.edu> wrote:
>
> On Fri, Jan 28, 2011 at 10:36 AM, Mike Kelly <pioto@pioto.org> wrote:
>>
>> On Fri, 28 Jan 2011 10:45:19 +0100, Thomas Schwinge <thomas@schwinge.name> wrote:
>> > > It would definitely be nice to avoid the complexity inherent in having a
>> > > daemon, but how do you imagine "queue on a lock" to work? We don't have
>> > > anything like that in place now.
>> >
>> > I suppose what he means is trying to get the lock, and if that fails wait
>> > a bit / wait until it is available again.
>> >
>> > Actually, as a next step, wouldn't it also be possible to add some
>> > heuristic to avoid ``notmuch new'' (being a low-priority task) blocking
>> > some interactive user (UI; high-priority task)?  But we can pursue such
>> > schemes as soon as the basic infrastructure is in place.
>>
>> Couldn't we pretty much get the desired behavior by using flock(2)?
>> Basically, take out a LOCK_EX when we need to write, and a LOCK_SH when
>> we only need to read. Using the blocking form, things should pretty much
>> just queue up and take their turn, right?
>>
>> I'm not familiar with Xapian, but if it doesn't give us something we
>> could use this sort of locking on, couldn't we just add some
>> /path/to/mail/.notmuch.lock file that we open to hold a lock on?
>
> Yes, exactly.  All of this.  Unfortunately, Xapian doesn't expose the ability to block on the lock (see the fcntl call in backends/flint_lock.cc, which is hard-coded to the non-blocking F_SETLK instead of F_SETLKW), so we'd either need a new Xapian option, or we would just have to wrap our own flock/fcntl lock around things as you suggest.


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 18:40 ` micah anderson
  2011-01-27 20:35   ` Jameson Rollins
@ 2011-01-29  1:05   ` Stewart Smith
  2011-01-30  0:14     ` Daniel Kahn Gillmor
  1 sibling, 1 reply; 17+ messages in thread
From: Stewart Smith @ 2011-01-29  1:05 UTC (permalink / raw)
  To: micah anderson, Thomas Schwinge, notmuch

On Thu, 27 Jan 2011 13:40:25 -0500, micah anderson <micah@riseup.net> wrote:
> Due to my harddisk in my laptop being slow (5400RPM), my notmuch
> database growing, and perhaps some fragmentation somewhere, this has
> become *incredibly* annoying for me. I am checking email every 30
> minutes, and I'm nicing and ionicing the processes so I can use my
> machine, but while those processes are running, I'm effectively locked
> out of a good portion of my email. 

I used to use spinning rust and also noticed things were slow. This
is in fact mostly not xapian - but rather crawling the Maildir. I
improved this early on in notmuch history by reducing the number of
seeks needed when traversing the Maildir hierarchy (e.g. stat in
i-node order, which is roughly on-disk order).
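
The inode-ordering idea can be illustrated in the shell.  This is only a
sketch of the principle, not the code that went into notmuch:
list_in_inode_order is a hypothetical helper, file names are assumed to
contain no newlines, and whether inode order matches on-disk order
depends on the filesystem.

```shell
# list_in_inode_order DIR
# Prints the regular files below DIR, one per line, sorted by inode number.
# "ls -di" prefixes each name with its inode number; the list is sorted
# numerically on that prefix, which is then stripped again.
list_in_inode_order() {
    find "$1" -type f -exec ls -di {} + | sort -n | sed 's/^ *[0-9][0-9]* //'
}
```

Reading or stat-ing the files in the order this prints should need fewer
long seeks on a rotating disk than visiting them in directory order.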

I'm about at the point where I'm going to take my git mail store
experiments and get them really to work (and everyone will have to use
'notmuch cat' or the like to access the messages) which should provide
both great storage efficiency, much faster backups of your Maildir as
well as having way fewer paths to traverse checking for new mail.

-- 
Stewart Smith


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-28 16:57             ` Austin Clements
  2011-01-28 18:17               ` Austin Clements
@ 2011-01-29 16:10               ` Mike Kelly
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Kelly @ 2011-01-29 16:10 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 619 bytes --]

On Fri, 28 Jan 2011 11:57:34 -0500
Austin Clements <amdragon@mit.edu> wrote:

> Yes, exactly.  All of this.  Unfortunately, Xapian doesn't expose the
> ability to block on the lock (see the fcntl call in
> backends/flint_lock.cc, which is hard-coded to the non-blocking
> F_SETLK instead of F_SETLKW), so we'd either need a new Xapian
> option, or we would just have to wrap our own flock/fcntl lock around
> things as you suggest.

Hrm. Do you know if Xapian upstream would be open to a patch to support
optional blocking locks? We can't be the only ones hitting these sorts
of issues.

-- 
Mike Kelly

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-29  1:05   ` Stewart Smith
@ 2011-01-30  0:14     ` Daniel Kahn Gillmor
  2011-02-02  0:40       ` Stewart Smith
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Kahn Gillmor @ 2011-01-30  0:14 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

On 01/28/2011 08:05 PM, Stewart Smith wrote:
> I'm about at the point where I'm going to take my git mail store
> experiments and get them really to work (and everyone will have to use
> 'notmuch cat' or the like to access the messages)

Would this hypothetical git-based mail store retain the atomicity and
lockless concurrent-access of a maildir?  That is, could it be used in a
server environment?

> which should provide
> both great storage efficiency, much faster backups of your Maildir as
> well as having way fewer paths to traverse checking for new mail.

when you say "backups of your Maildir" do you mean "backups of your
git-based mail store" ?  or is this somehow a literal Maildir stored in git?

Intrigued,

	--dkg


* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-30  0:14     ` Daniel Kahn Gillmor
@ 2011-02-02  0:40       ` Stewart Smith
  0 siblings, 0 replies; 17+ messages in thread
From: Stewart Smith @ 2011-02-02  0:40 UTC (permalink / raw)
  To: Daniel Kahn Gillmor, notmuch

On Sat, 29 Jan 2011 19:14:27 -0500, Daniel Kahn Gillmor <dkg@fifthhorseman.net> wrote:
> On 01/28/2011 08:05 PM, Stewart Smith wrote:
> > I'm about at the point where I'm going to take my git mail store
> > experiments and get them really to work (and everyone will have to use
> > 'notmuch cat' or the like to access the messages)
> 
> Would this hypothetical git-based mail store retain the atomicity and
> lockless concurrent-access of a maildir?  That is, could it be used in a
> server environment?

My idea is that it would be... at least with the experiments conducted
so far.

> > which should provide
> > both great storage efficiency, much faster backups of your Maildir as
> > well as having way fewer paths to traverse checking for new mail.
> 
> when you say "backups of your Maildir" do you mean "backups of your
> git-based mail store" ?  or is this somehow a literal Maildir stored in git?

I'll write more "soon" when there is more code behind it... and I figure
out a good upgrade path to something that is also self-consistently sane.

-- 
Stewart Smith

* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 20:35   ` Jameson Rollins
  2011-01-27 22:20     ` Austin Clements
@ 2011-02-03 16:28     ` micah anderson
  1 sibling, 0 replies; 17+ messages in thread
From: micah anderson @ 2011-02-03 16:28 UTC (permalink / raw)
  To: Jameson Rollins, Thomas Schwinge, notmuch

On Thu, 27 Jan 2011 12:35:16 -0800, Jameson Rollins <jrollins@finestructure.net> wrote:
> On Thu, 27 Jan 2011 13:40:25 -0500, micah anderson <micah@riseup.net> wrote:
> > Due to my harddisk in my laptop being slow (5400RPM), my notmuch
> > database growing, and perhaps some fragmentation somewhere, this has
> > become *incredibly* annoying for me. I am checking email every 30
> > minutes, and I'm nicing and ionicing the processes so I can use my
> > machine, but while those processes are running, I'm effectively locked
> > out of a good portion of my email. 
> 
> I also have a very slow disk, but this is very rarely a problem for me.
> I retrieve mail every 10 minutes, and the corresponding notmuch new
> usually takes a minute or so.  I really haven't found it to be much of a
> bother to just wait it out.

Sometimes I can have several thousand messages to add/remove from the
database. I know this is probably unusual, but for me it's not due to
system emails. I suppose if I checked my email more frequently, I'd have
less to process on each run, but that's more side-stepping the
concurrency issue.

micah

* Re: notmuch's idea of concurrency / failing an invocation
  2011-01-27 22:20     ` Austin Clements
  2011-01-28  5:10       ` Carl Worth
@ 2011-02-24  6:59       ` Austin Clements
  1 sibling, 0 replies; 17+ messages in thread
From: Austin Clements @ 2011-02-24  6:59 UTC (permalink / raw)
  To: notmuch; +Cc: Thomas Schwinge

Now that I've split notmuch new up into lots of small transactions, I
think the database locking issue is quite approachable.  Here's a
proposed locking protocol where a notmuch operation that wants to
modify the database blocks if there's another operation in progress
(instead of immediately failing like now), but indicates to the
in-progress operation that, when convenient, it should temporarily
abdicate the database.

Add a file to the .notmuch directory, say "lock", which we'll use for
fcntl locks (fcntl locks have nice properties, like automatic cleanup
on process exit and NFS compatibility).

To open the database for write,
1. Acquire an exclusive lock on byte 0 of the lock file (in blocking mode)
2. Acquire an exclusive lock on byte 1 of the lock file
3. Release the lock on byte 0
4. Open the Xapian database

When it's convenient to abdicate the lock, test if there are pending
operations by testing for a lock on byte 0 of the lock file using
F_GETLK.  If there's no lock on byte 0, just continue without
releasing the database.  Otherwise,
1. Close the Xapian database
2. Release the lock on byte 1
3. Re-lock and re-open the database.

In effect, this acts like one lock, since byte 1 is only ever acquired
while byte 0 is held, but splitting it across two locks like this lets
us "peek" at the waiter queue and see if someone is waiting.
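
A minimal sketch of the two-byte protocol above, in Python rather than
notmuch's C. All function names are hypothetical, and opening or closing the
Xapian database is left to caller-supplied callbacks. One deliberate
substitution: instead of F_GETLK, the peek tries a non-blocking lock on byte 0
and releases it again at once, because F_GETLK from Python needs a
platform-specific struct flock layout.

```python
import errno
import fcntl
import os

WAITER_BYTE = 0  # held while queuing for the database
OWNER_BYTE = 1   # held while the database is open for writing


def _lock(fd, byte, blocking=True):
    flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
    fcntl.lockf(fd, flags, 1, byte, os.SEEK_SET)


def _unlock(fd, byte):
    fcntl.lockf(fd, fcntl.LOCK_UN, 1, byte, os.SEEK_SET)


def acquire_write(lock_path):
    """Steps 1-3 of "open for write"; returns the lock file descriptor."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    _lock(fd, WAITER_BYTE)    # 1. join the queue (blocking)
    _lock(fd, OWNER_BYTE)     # 2. wait for the current owner to leave
    _unlock(fd, WAITER_BYTE)  # 3. let the next waiter queue up
    return fd                 # 4. the caller now opens the Xapian database


def waiter_pending(fd):
    """Peek at byte 0: True if another process is queued on it."""
    try:
        _lock(fd, WAITER_BYTE, blocking=False)
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return True
        raise
    _unlock(fd, WAITER_BYTE)
    return False


def abdicate_if_wanted(fd, close_db, open_db):
    """Hand the database to a waiter, then queue up again behind it."""
    if not waiter_pending(fd):
        return False
    close_db()                # 1. close the Xapian database
    _unlock(fd, OWNER_BYTE)   # 2. release byte 1; the waiter proceeds
    _lock(fd, WAITER_BYTE)    # 3. re-lock in the same order as before
    _lock(fd, OWNER_BYTE)
    _unlock(fd, WAITER_BYTE)
    open_db()
    return True
```

A long-running writer would call abdicate_if_wanted() between transactions.
Closing the descriptor (or exiting) releases whatever locks the process still
holds, which is the automatic cleanup mentioned above.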

Some possible extensions: We may want a timeout for how long to wait
for the lock (in case the lock holder gets wedged).  We could work
around DatabaseModified exceptions by having readers do essentially
the same thing as writers, but take the locks in shared mode.  Readers
wouldn't proceed in parallel with writers, but long-running writers
would relinquish the lock, so this isn't so bad.  Finally, concurrent
notmuch new's should probably be serialized (instead of repeatedly
abdicating to each other), so it may make sense to have an additional
"notmuch new lock".
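
A rough sketch of the first two extensions, again in Python and with
hypothetical names. POSIX fcntl locks have no built-in timeout, so this
version polls with a non-blocking request until a deadline; that is an
assumption of the sketch, not part of the proposal, and a polling waiter is
not visibly queued on the byte while it sleeps. The shared flag selects a
read lock for readers.

```python
import errno
import fcntl
import os
import time


def lock_byte_with_timeout(fd, byte, timeout, shared=False, poll=0.05):
    """Try to lock one byte of the lock file for up to `timeout` seconds.

    Returns True once the lock is held, False if the deadline passes first.
    With shared=True the lock is taken in shared (reader) mode.
    """
    mode = fcntl.LOCK_SH if shared else fcntl.LOCK_EX
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.lockf(fd, mode | fcntl.LOCK_NB, 1, byte, os.SEEK_SET)
            return True
        except OSError as exc:
            # EACCES/EAGAIN mean "held by someone else"; anything else is real.
            if exc.errno not in (errno.EACCES, errno.EAGAIN):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

The descriptor has to be open for reading to take a shared lock and for
writing to take an exclusive one, so opening the lock file read-write covers
both cases.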

On Thu, Jan 27, 2011 at 5:20 PM, Austin Clements <amdragon@mit.edu> wrote:
> I'm looking into breaking notmuch new up into small transactions.  It
> wouldn't be much of a leap from there to simply close and reopen the database
> between transactions if another task wants to use it, which would release
> the lock and let the queued notmuch task have the database for a bit.  It
> seems silly to have a daemon when all of notmuch's state is already on disk
> and a queue on a lock is as good as a queue in a daemon, but without the
> accompanying architectural shenanigans.

end of thread, other threads:[~2011-02-24  6:59 UTC | newest]

Thread overview: 17+ messages
2011-01-27 18:20 notmuch's idea of concurrency / failing an invocation Thomas Schwinge
2011-01-27 18:40 ` micah anderson
2011-01-27 20:35   ` Jameson Rollins
2011-01-27 22:20     ` Austin Clements
2011-01-28  5:10       ` Carl Worth
2011-01-28  9:45         ` Thomas Schwinge
2011-01-28 15:36           ` Mike Kelly
2011-01-28 16:57             ` Austin Clements
2011-01-28 18:17               ` Austin Clements
2011-01-29 16:10               ` Mike Kelly
2011-01-28 16:50           ` Austin Clements
2011-02-24  6:59       ` Austin Clements
2011-02-03 16:28     ` micah anderson
2011-01-29  1:05   ` Stewart Smith
2011-01-30  0:14     ` Daniel Kahn Gillmor
2011-02-02  0:40       ` Stewart Smith
2011-01-28  5:07 ` Carl Worth
