unofficial mirror of notmuch@notmuchmail.org
From: Jani Nikula <jani@nikula.org>
To: Jesse Rosenthal <jrosenthal@jhu.edu>,
	Daniel Schoepe <daniel@schoepe.org>,
	Justus Winter <4winter@informatik.uni-hamburg.de>,
	Philippe LeCavalier <support@plecavalier.com>,
	notmuch@notmuchmail.org
Subject: Re: nomuch_addresses.py
Date: Wed, 22 Feb 2012 13:07:35 +0000
Message-ID: <871upn6mp4.fsf@nikula.org>
In-Reply-To: <87boosjgd9.fsf@jhu.edu>

On Tue, 21 Feb 2012 11:33:38 -0500, Jesse Rosenthal <jrosenthal@jhu.edu> wrote:
> On Tue, 21 Feb 2012 14:53:06 +0100, Daniel Schoepe <daniel@schoepe.org> wrote:
> > On Tue, 21 Feb 2012 09:15:09 -0000, Justus Winter <4winter@informatik.uni-hamburg.de> wrote:
> > The reason I mentioned nottoomuch-addresses at all, is that completion
> > itself is _a lot_ faster (at least for me), compared to
> > addrlookup. According to the wiki, notmuch-addresses.py is even slower
> > than addrlookup, so I thought (and still think) that it was worth
> > mentioning. Of course, one could rewrite the database-generation part in
> > python using the bindings, but I personally don't think it's that
> > necessary.
> 
> I'm not sure what speed comparisons were being used -- I think it was
> Sebastian comparing vala to python. In any case, using
> notmuch_addresses.py to look up a common prefix ("Jes") on a slowish
> computer takes 0.2 seconds. So I'm not sure if the speed is all that
> much of an issue. It might be a question of cache temperature, though --
> it'll probably take longer the first time you run it. Still, even trying
> something out on a cold cache, it seems to be about a second.

Speed comparisons between vanilla notmuch_addresses.py and
nottoomuch-addresses.sh are going to be flawed because the two do
different things. It's comparing apples and oranges.

notmuch_addresses.py looks for matches in the recipients of mails the
user has sent. Nothing else. notmuch_addresses.py filters out multiple
names for one email address using a popularity contest.
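
To illustrate what I mean by a popularity contest, here is a minimal
sketch. It is not the actual notmuch_addresses.py code; the function
name pick_popular_names, the prefix matching rule, and the example
addresses are all made up for illustration. It assumes you already have
the raw To/Cc header values as strings. For each address it counts how
often each display name occurs and keeps only the most frequent one.

```python
from collections import Counter, defaultdict
from email.utils import formataddr, getaddresses


def pick_popular_names(headers, prefix):
    """Return one formatted entry per address, using its most common name.

    `headers` is a list of raw To/Cc header strings (hypothetical input);
    `prefix` is matched case-insensitively against the winning name and
    against the address.
    """
    # address -> Counter of the display names seen with that address
    names_by_addr = defaultdict(Counter)
    for name, addr in getaddresses(headers):
        if addr:
            names_by_addr[addr.lower()][name] += 1

    prefix = prefix.lower()
    results = []
    for addr, names in names_by_addr.items():
        # The "popularity contest": the most frequent name wins.
        best_name, _count = names.most_common(1)[0]
        if best_name.lower().startswith(prefix) or addr.startswith(prefix):
            results.append(formataddr((best_name, addr)))
    return results


if __name__ == "__main__":
    sample = [
        "Alice Example <alice@example.org>, Bob <bob@example.org>",
        "Alice Example <alice@example.org>",
        "Alice E <alice@example.org>",
    ]
    for entry in pick_popular_names(sample, "ali"):
        print(entry)
```

A tool that skips this step would instead return both "Alice Example"
and "Alice E" for the same address.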

AFAICT nottoomuch-addresses.sh scans all the addresses in all the
mails. It has no logic for filtering out multiple names for one email
address, and just returns all matches.

Personally I would like to have the best of both worlds, so I'm using a
modified notmuch_addresses.py that matches against all the mails I have
and cleans up the duplicate results. Unfortunately that takes a toll on
performance: typical searches take about a second on my system with a
hot cache, while nottoomuch-addresses.sh takes less than a tenth of a
second. That is enough to be annoying, I'm afraid. Even so, it's not a
fair comparison, because notmuch_addresses.py wasn't designed with this
in mind, and nottoomuch-addresses.sh maintains its own database and does
less.

One just needs to pick the tool that best fits one's needs.


BR,
Jani.
