From: Jani Nikula <jani@nikula.org>
To: Jesse Rosenthal <jrosenthal@jhu.edu>,
Daniel Schoepe <daniel@schoepe.org>,
Justus Winter <4winter@informatik.uni-hamburg.de>,
Philippe LeCavalier <support@plecavalier.com>,
notmuch@notmuchmail.org
Subject: Re: nomuch_addresses.py
Date: Wed, 22 Feb 2012 13:07:35 +0000
Message-ID: <871upn6mp4.fsf@nikula.org>
In-Reply-To: <87boosjgd9.fsf@jhu.edu>
On Tue, 21 Feb 2012 11:33:38 -0500, Jesse Rosenthal <jrosenthal@jhu.edu> wrote:
> On Tue, 21 Feb 2012 14:53:06 +0100, Daniel Schoepe <daniel@schoepe.org> wrote:
> > On Tue, 21 Feb 2012 09:15:09 -0000, Justus Winter <4winter@informatik.uni-hamburg.de> wrote:
> > The reason I mentioned nottoomuch-addresses at all, is that completion
> > itself is _a lot_ faster (at least for me), compared to
> > addrlookup. According to the wiki, notmuch-addresses.py is even slower
> > than addrlookup, so I thought (and still think) that it was worth
> > mentioning. Of course, one could rewrite the database-generation part in
> > python using the bindings, but I personally don't think it's that
> > necessary.
>
> I'm not sure what speed comparisons were being used -- I think it was
> Sebastian comparing vala to python. In any case, using
> notmuch_addresses.py to look up a common prefix ("Jes") on a slowish
> computer takes 0.2 seconds. So I'm not sure if the speed is all that
> much of an issue. It might be a question of cache temperature, though --
> it'll probably take longer the first time you run it. Still, even trying
> something out on a cold cache, it seems to be about a second.
Speed comparisons between vanilla notmuch_addresses.py and
nottoomuch-addresses.sh are going to be flawed, because the two tools
do different things. It's comparing apples and oranges.
notmuch_addresses.py looks for matches in the recipients of mails the
user has sent. Nothing else. notmuch_addresses.py filters out multiple
names for one email address using a popularity contest.
AFAICT nottoomuch-addresses.sh scans all the addresses in all the
mails. It has no logic for filtering out multiple names for one email
address, and just returns all matches.
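To make the difference concrete, here is a rough Python sketch of the two
behaviours described above. It is not the code of either tool: the sample
data and the function names all_matches and most_popular_names are
made up for illustration, and the address harvesting itself (which is
where the two tools really differ) is left out.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (display name, email address) pairs, as they
# might be collected from message headers. Not taken from a real mailbox.
SAMPLE = [
    ("Jesse Rosenthal", "jrosenthal@jhu.edu"),
    ("Jesse", "jrosenthal@jhu.edu"),
    ("Jesse Rosenthal", "jrosenthal@jhu.edu"),
    ("Jani Nikula", "jani@nikula.org"),
]


def all_matches(pairs, prefix):
    """Return every pair whose name or address starts with the prefix.

    No deduplication: the same address can show up under several names.
    """
    p = prefix.lower()
    return [
        (name, email)
        for name, email in pairs
        if name.lower().startswith(p) or email.lower().startswith(p)
    ]


def most_popular_names(pairs):
    """Keep one name per address: the name seen most often with it."""
    counts = defaultdict(Counter)
    for name, email in pairs:
        counts[email.lower()][name] += 1
    return [
        (names.most_common(1)[0][0], email)
        for email, names in counts.items()
    ]


if __name__ == "__main__":
    hits = all_matches(SAMPLE, "Jes")
    print(hits)                      # every matching pair, duplicates included
    print(most_popular_names(hits))  # one entry per address
```

The popularity contest costs an extra pass over the matches, so it gets
more expensive as the set of scanned mails grows.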
Personally I would like to have the best of both worlds, and I'm using a
modified notmuch_addresses.py that matches against all the mails I have
and cleans up the duplicate results. Unfortunately that takes a toll on
performance: typical searches take about a second on my system with a
hot cache, while nottoomuch-addresses.sh takes less than a tenth of a
second. The difference is enough to be annoying, I'm afraid. Even so,
it's not a fair comparison, because notmuch_addresses.py wasn't designed
with this in mind, and nottoomuch-addresses.sh maintains its own database
and does less.
One just needs to pick the tool that best fits one's needs.
BR,
Jani.