Excerpts from Austin Clements's message of Thu May 26 22:43:02 +0100 2011:
> http://notmuch.198994.n3.nabble.com/notmuch-s-idea-of-concurrency-failing-an-invocation-tp2373468p2565731.html

ah, good old peterson :P thanks.

> > > Though, Patrick, that solution doesn't address your problem.  On the
> > > other hand, it's not clear to me what concurrent access semantics
> > > you're actually expecting.  I suspect you don't want the remaining
> > > iteration to reflect the changes, since your changes could equally
> > > well have affected earlier iteration results.
> > That's right.
> > > But if you want a
> > > consistent view of your query results, something's going to have to
> > > materialize that iterator, and it might as well be you (or Xapian
> > > would need more sophisticated concurrency control than it has).  But
> > > this shouldn't be expensive because all you need to materialize are
> > > the document ids; you shouldn't need to eagerly fetch the per-thread
> > > information.
> > I thought so, but it seems that Query.search_threads() already
> > caches more than the id of each item. Which is as expected
> > because it is designed to return thread objects, not their ids.
> > As you can see above, this _is_ too expensive for me.
>
> I'd forgotten that constructing threads on the C side was eager about
> the thread tags, author list and subject (which, without Istvan's
> proposed patch, even requires opening and parsing the message file).
> This is probably what's killing you.
>
> Out of curiosity, what is your situation that you won't wind up paying
> the cost of this iteration one way or the other and that the latency
> of doing these tag changes matters?

I'm trying to implement a terminal interface for notmuch in Python that
resembles sup. For the search results view, I read an initial portion
from a Threads iterator to fill my terminal window with
threadline-widgets. Obviously, for a large number of results I don't
want to go through all of them.
The problem arises if you toggle a tag on the selected threadline and
afterwards continue to scroll down.

> > > Have you tried simply calling list() on your thread
> > > iterator to see how expensive it is? My bet is that it's quite cheap,
> > > both memory-wise and CPU-wise.
> > Funny thing:
> >  q=Database().create_query('*')
> >  time tlist = list(q.search_threads())
> > raises a NotmuchError(STATUS.NOT_INITIALIZED) exception. For some reason
> > the list constructor must read more than once from the iterator.
> > So this is not an option, but even if it worked, it would show
> > the same behaviour as my above test..
>
> Interesting.  Looks like the Threads class implements __len__ and that
> its implementation exhausts the iterator.  Which isn't a great idea in
> itself, but it turns out that Python's implementation of list() calls
> __len__ if it's available (presumably to pre-size the list) before
> iterating over the object, so it exhausts the iterator before even
> using it.
>
> That said, if list(q.search_threads()) did work, it wouldn't give you
> better performance than your experiment above.
>
> > would it be very hard to implement a Query.search_thread_ids() ?
> > This name is a bit off because it had to be done on a lower level.
>
> Lazily fetching the thread metadata on the C side would probably
> address your problem automatically.  But what are you doing that
> doesn't require any information about the threads you're manipulating?

Agreed. Unfortunately, there seems to be no way to get a list of thread
ids or a reliable iterator thereof by using the current Python bindings.
It would be enough for me to have the ids because then I could
search for the few threads I actually need individually on demand.

Here is the branch in which I'm trying out these things. Sorry for the
messy code, it's late :P

https://github.com/pazz/notmuch-gui/tree/toggletags

/p
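
For reference, the list()/__len__ interaction discussed above can be reproduced without notmuch at all. The sketch below uses a hypothetical stand-in class, ExhaustingThreads, which is not the bindings' Threads class; it only assumes a one-shot iterator whose __len__ counts by consuming the underlying cursor. Under CPython, list() asks for a length hint before pulling any items, so the stand-in comes back empty (the real bindings raised NotmuchError instead), while a plain comprehension never asks for the length.

```python
class ExhaustingThreads:
    """Hypothetical stand-in for a one-shot iterator that also defines __len__.

    This is NOT the notmuch Threads class; it only mimics the behaviour
    described in the thread: computing the length consumes the iterator.
    """

    def __init__(self, ids):
        self._source = iter(ids)  # one-shot underlying cursor
        self.len_calls = 0        # how often __len__ was invoked

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._source)

    def __len__(self):
        # Counting walks the cursor to its end, so nothing is left afterwards.
        self.len_calls += 1
        return sum(1 for _ in self._source)


ids = ["t1", "t2", "t3"]

# list() requests a length hint first (CPython behaviour), which calls
# __len__ and drains the cursor before any item has been read.
probe = ExhaustingThreads(ids)
via_list = list(probe)

# A comprehension iterates directly and never calls __len__.
via_comprehension = [tid for tid in ExhaustingThreads(ids)]

print(via_list, probe.len_calls)
print(via_comprehension)
```

If the ids are all that is needed, collecting them with a comprehension or an explicit loop sidesteps the __len__ call, although it does not avoid whatever per-thread work the iterator itself performs.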