unofficial mirror of notmuch@notmuchmail.org
From: Tomi Ollila <tomi.ollila@iki.fi>
To: Mark Walters <markwalters1009@gmail.com>,
	Jani Nikula <jani@nikula.org>,
	notmuch@notmuchmail.org
Subject: Re: [PATCH 0/5] notmuch batch count
Date: Wed, 16 Jan 2013 07:02:05 +0200
Message-ID: <m27gndsotu.fsf@guru.guru-group.fi>
In-Reply-To: <8738y2ui4y.fsf@qmul.ac.uk>

On Wed, Jan 16 2013, Mark Walters <markwalters1009@gmail.com> wrote:

> On Tue, 15 Jan 2013, Jani Nikula <jani@nikula.org> wrote:
>> Hi all -
>>
>> Notmuch remote usage [1] is a pretty handy way of accessing a notmuch
>> database on a remote server. However, the more you have saved searches
>> and tags, the slower notmuch-hello becomes, and it ends up being by
>> far the biggest usability issue with remote notmuch. This is because
>> notmuch-hello issues a separate 'notmuch count' for each saved search
>> and tag.
>>
>> One could argue that notmuch-hello should be fixed somehow, but I chose
>> to try another route: batch support for notmuch count. This enables
>> notmuch-hello to get the counts for all the saved searches or tags in a
>> single call. The performance improvement is huge in remote usage, but
>> it's not limited to that. Regular local usage benefits from it too, but
>> it's not as obviously noticeable.
>
> This series looks good to me (that is the code looks fine).
>
> Two questions are:
>
> Do we want this functionality? I think it is useful even on local setups
> particularly if people have lots of tags (the section that shows all
> tags can be quite noticeably sped up). It is a substantial improvement
> on remote setups but I am not sure if that is sufficiently common to
> warrant the change. At least the code path is the same so it will get
> enough testing.

I do want the functionality. Especially where I am now: it takes about
0.4 sec for 'ssh remote echo foo' to get executed (using connection sharing).
Pipelining the count requests could make all the count requests emacs
does (in my current set) complete in less than 1 sec.
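
As a rough illustration of that arithmetic (a sketch, not a measurement):
the 0.4 sec figure is the one quoted above, while the query count of 36 is
borrowed from the tag count in Jani's example and is only an assumption
about a typical setup. The helper name 'estimate' is made up for this
example, and the model ignores the time notmuch itself spends counting.

```shell
#!/bin/sh
# Back-of-the-envelope latency model: every remote call costs one fixed
# round trip; time spent inside notmuch is ignored.
rtt=0.4       # seconds per 'ssh remote echo foo', as quoted above
queries=36    # assumed number of saved searches plus tags (hypothetical)

estimate() {
    # $1 = number of round trips, $2 = seconds per round trip
    LC_ALL=C awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n * t }'
}

echo "one-by-one: $(estimate "$queries" "$rtt") s"
echo "batched:    $(estimate 1 "$rtt") s"
```

Under these assumptions the one-by-one case is dominated by round trips,
which is why a single batched call should fit well under a second.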

> Secondly, if we do want the functionality, should it be more general so
> that it can do searches etc. too? I think this is less clear. Count is likely
> to be the most useful one since running several (simultaneous) counts is
> probably more common than running several simultaneous searches.

One could argue that we should send json "documents" to notmuch on
stdin and notmuch would output json(/sexp) "documents". That is just
SMOP. I bet Austin would like this solution, especially the part
that involves writing or integrating a json parser >;).
I'd be happy with this 'batch' approach.
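
For comparison, the 'batch' approach needs no parser at all: judging from
the script in Jani's cover letter, the input is simply one query per line
on stdin. A minimal sketch of that, where the function names 'tag_queries'
and 'batch_count_tags' are made up for this example and $remote is assumed
to be empty or something like "ssh example.com" as in Jani's script:

```shell
#!/bin/sh
# Turn tag names (one per line on stdin) into one query per line.
# Assumes tag names contain no double quotes.
tag_queries() {
    sed 's/.*/tag:"&"/'
}

# Feed every tag query to a single 'notmuch count --batch' call.
# Not run by default, since it needs a notmuch database.
batch_count_tags() {
    notmuch search --output=tags '*' | tag_queries | $remote notmuch count --batch
}

# Show the batch input that would be generated for two example tags.
printf 'inbox\nunread\n' | tag_queries
```

Calling batch_count_tags instead of the last line would run the real
thing; the sed expression uses '&' rather than '\0' only so that it does
not depend on GNU sed.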

I'll be testing this soon, but refrain from reviewing the code
until 0.15 is out.

>
> Best wishes
>
> Mark


Tomi


>
>
>>
>> Here's a script that demonstrates one-by-one count vs. batch count,
>> locally and over ssh (assuming ssh key authentication is set up), over
>> 10 iterations:
>>
>> #!/bin/bash
>>
>> echo "tag count:"
>> notmuch search --output=tags "*" | wc -l
>>
>> for remote in "" "ssh example.com"; do
>>     export remote
>>     echo "one-by-one count:"
>>     time sh -c 'for i in `seq 10`; do notmuch search --format=text0 --output=tags "*" | xargs -0 -n 1 -I "{}" $remote notmuch count tag:"{}" > /dev/null; done'
>>
>>     echo "batch count:"
>>     time sh -c 'for i in `seq 10`; do notmuch search --format=text --output=tags "*" | sed "s/.*/tag:\"\0\"/" | $remote notmuch count --batch > /dev/null; done'
>> done
>>
>> And here's the output of it in my setup:
>>
>> tag count:
>> 36
>> one-by-one count:
>>
>> real	0m2.349s
>> user	0m0.552s
>> sys	0m0.868s
>> batch count:
>>
>> real	0m0.179s
>> user	0m0.120s
>> sys	0m0.064s
>> one-by-one count:
>>
>> real	0m56.527s
>> user	0m1.424s
>> sys	0m1.164s
>> batch count:
>>
>> real	0m2.407s
>> user	0m0.068s
>> sys	0m0.040s
>>
>> As can be seen, in local usage (the first pair of results) the speedup
>> is more than 10x, although one-by-one notmuch count is usually
>> sufficiently fast. The difference is more noticeable in remote use (the
>> second pair of results), where the speedup is 20x here, and any
>> additional, occasional network latency is multiplied by tag count. (That
>> result is actually faster than usual for me, but it's still 5+ seconds
>> to display or refresh notmuch-hello.)
>>
>> Mark has written a patch that I've been using to switch notmuch-hello to
>> use batch count. That has made me switch from running notmuch in ssh to
>> using remote notmuch. The great thing is that we could switch to using
>> that in Emacs with no special casing for remote usage, and it would
>> speed things up also in local use. I'm expecting Mark to post his patch
>> in reply to this series.
>>
>> Mark actually wrote the elisp part based on the rough idea prior to any
>> of this cli plumbing, so I felt obliged to follow up. So thanks Mark!
>>
>>
>> BR,
>> Jani.
>>
>>
>> [1] http://notmuchmail.org/remoteusage/ (the page could use some
>> cleanup; it's really not nearly as complicated as the page suggests)
>>
>>
>> Jani Nikula (5):
>>   cli: remove useless strdup
>>   cli: extract count printing to a separate function in notmuch count
>>   cli: add --batch option to notmuch count
>>   man: document notmuch count --batch and --input options
>>   test: notmuch count --batch and --input options
>>
>>  man/man1/notmuch-count.1 |   20 +++++++++
>>  notmuch-count.c          |  111 +++++++++++++++++++++++++++++++++++-----------
>>  test/count               |   46 +++++++++++++++++++
>>  3 files changed, 150 insertions(+), 27 deletions(-)
>>
>> -- 
>> 1.7.10.4

Thread overview: 13+ messages
2013-01-15 21:54 [PATCH 0/5] notmuch batch count Jani Nikula
2013-01-15 21:54 ` [PATCH 1/5] cli: remove useless strdup Jani Nikula
2013-01-15 21:54 ` [PATCH 2/5] cli: extract count printing to a separate function in notmuch count Jani Nikula
2013-01-15 21:54 ` [PATCH 3/5] cli: add --batch option to " Jani Nikula
2013-01-23 13:36   ` Tomi Ollila
2013-01-15 21:54 ` [PATCH 4/5] man: document notmuch count --batch and --input options Jani Nikula
2013-01-15 21:54 ` [PATCH 5/5] test: " Jani Nikula
2013-01-15 23:24 ` [PATCH] emacs: hello: use batch count Mark Walters
2013-01-23 14:13   ` Tomi Ollila
2013-01-15 23:43 ` [PATCH 0/5] notmuch " Mark Walters
2013-01-16  5:02   ` Tomi Ollila [this message]
2013-01-21 17:21     ` Jani Nikula
2013-01-22 13:43       ` Tomi Ollila
