From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <steven@stebalien.com>
Received: from localhost (localhost [127.0.0.1])
 by arlo.cworth.org (Postfix) with ESMTP id 88E356DE1C8D
 for <notmuch@notmuchmail.org>; Mon,  6 Mar 2017 16:02:58 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at cworth.org
X-Spam-Flag: NO
X-Spam-Score: -2.265
X-Spam-Level: 
X-Spam-Status: No, score=-2.265 tagged_above=-999 required=5 tests=[AWL=0.036, 
 RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001] autolearn=disabled
Received: from arlo.cworth.org ([127.0.0.1])
 by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id HHO_gjTrwwqi for <notmuch@notmuchmail.org>;
 Mon,  6 Mar 2017 16:02:57 -0800 (PST)
Received: from outgoing-stata.csail.mit.edu (outgoing-stata.csail.mit.edu
 [128.30.2.210])
 by arlo.cworth.org (Postfix) with ESMTP id 073A36DE1C59
 for <notmuch@notmuchmail.org>; Mon,  6 Mar 2017 16:02:57 -0800 (PST)
Received: from 99-167-85-176.lightspeed.irvnca.sbcglobal.net ([99.167.85.176]
 helo=localhost)
 by outgoing-stata.csail.mit.edu with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <steven@stebalien.com>)
 id 1cl2ad-000EOt-3u; Mon, 06 Mar 2017 19:02:55 -0500
From: Steven Allen <steven@stebalien.com>
To: Jameson Graef Rollins <jrollins@finestructure.net>
Cc: notmuch@notmuchmail.org
Subject: Re: whitelisting
In-Reply-To: <87innmvvam.fsf@ligo.caltech.edu>
References: <87innmvvam.fsf@ligo.caltech.edu>
Date: Mon, 06 Mar 2017 16:02:50 -0800
Message-ID: <87fuiq0x91.fsf@bistromath>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha256; protocol="application/pgp-signature"
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Use and development of the notmuch mail system."
 <notmuch.notmuchmail.org>
List-Unsubscribe: <https://notmuchmail.org/mailman/options/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch/>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <https://notmuchmail.org/mailman/listinfo/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Mar 2017 00:02:58 -0000

--=-=-=
Content-Type: text/plain


Jameson,

> This works ok, but takes more than 20s to execute, which will slow down
> my inbox processing quite a bit.  I could try to write a python script
> to iterate over all tag:spam, extract addresses from those messages, and
> match against the whitelist, but I doubt that will be any faster.

Instead of iterating over all messages tagged spam, why not just iterate
over *new* messages (`tag:new`) in your pre hook? That is (pseudocode):

    for message in `notmuch search tag:new and tag:spam`:
        for author in message.headers["From"]:
            author = clean(author) # Extract the *actual* email address (name@domain).
            # There are probably faster ways to check this...
            if `notmuch count tag:sent and to:author` > 0:
                notmuch tag -spam -- message

That should be reasonably fast.
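
For what it's worth, here is a rough Python sketch of that loop. It
assumes the `notmuch` CLI is on your PATH and uses `notmuch address` to
pull the sender out of each message. The names `run_notmuch` and
`unspam_known_senders`, and the injectable `run` parameter, are made up
for illustration, and I haven't timed any of this:

```python
import subprocess
from email.utils import getaddresses


def run_notmuch(*args):
    """Run a notmuch subcommand and return its stdout as text."""
    result = subprocess.run(
        ["notmuch", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def unspam_known_senders(run=run_notmuch):
    """Remove the spam tag from new spam whose sender we have written to.

    `run` is a hypothetical hook so the loop can be exercised without a
    real notmuch database; by default it shells out to notmuch.
    """
    rescued = []
    # --output=messages prints one "id:<message-id>" search term per line.
    new_spam = run("search", "--output=messages", "tag:new and tag:spam")
    for msg in new_spam.splitlines():
        if not msg:
            continue
        # notmuch address prints each sender as a "Name <addr>" line.
        senders = run("address", "--output=sender", msg).splitlines()
        for _name, addr in getaddresses(senders):
            if not addr:
                continue
            # There are probably faster ways to check this...
            if int(run("count", f"tag:sent and to:{addr}")) > 0:
                run("tag", "-spam", "--", msg)
                rescued.append(msg)
                break  # one known sender is enough for this message
    return rescued
```

Calling `unspam_known_senders()` from the hook script would do the
tagging and return the ids it touched. It still runs one `notmuch count`
per sender, so the cost grows with the number of new spam messages
rather than with the size of the whole spam folder.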

Note: you will probably have to do this in python, because extracting the
From addresses any other way is a bit of a pain.
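
As a rough illustration of why python helps: the standard library's
`email.utils.getaddresses` already copes with display names, quoting,
and several addresses in one header. The `clean` name is borrowed from
the pseudocode above; taking the raw header value as a string is my own
assumption:

```python
from email.utils import getaddresses


def clean(from_header):
    """Return the bare addresses (name@domain) found in a From header value."""
    # getaddresses yields (display name, address) pairs; entries it cannot
    # parse come back with an empty address, so those are dropped.
    return [addr for _name, addr in getaddresses([from_header]) if addr]


print(clean("Jane Doe <jane@example.org>, bob@example.com"))
```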

- Steven

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEE2vt3MKJDbR/uMVY0wSXs/RcTlDwFAli9+KoACgkQwSXs/RcT
lDxr9BAAnT6WhQmym667HNCODOrz+m+XI5899tYakrZIH1qq0wDDyXZyqTBqxU9M
dI1ax7u5sSGQgAD9bCkDCYBdyaxl81SxQrlZu87BkDml8hBrfwwG0Amn6NdknYum
rgpvWh0a3X/tlySarGnCZdMr4wDoSI8X6C97QHY96z8IYBYoPKx1B4Qix+ohRBPh
XWQlEKxJG6uJZICnmwCuzuA+2KuM7NKNEuvNlikYpdFnK11VxV9HJR/ldisvC2P4
CjzU2X4mEfCVDQAZ5kD8BebJsC9a3nhrjpryz/0SaLHggpdngfZ9NtSU9m5FFAs7
SKSLK1BfMJGhmU7zyS+zxsjGjEuTkNkExwiJbTwD4tt7aQVqDS9tfPDkQC9tZNJm
K61I7sG5Qe9+1TIJHu1NL+851pzvZOmTtVZw/yOSk8lmSAdyCJiTG59QmDiutOqc
XiQAuAz1FF/ZCIsRTJygq5Cpc4MZMA/VDa2AikOd+gRpm/AB2W+R2oanF5K4Wv5C
ey1Sh+2jV/NpH/Mkfkc86qstd0p6E2C2SiJNmBA9bTxHMUzQRPcPvZ2CL6XEpUSD
UG7pKlssnmIJC2IokCSm9ppdvXfxio9HGEY+r6IBbK7D37pW+b35qCfVkFLW2NA4
OkgsTPXZqj4aF9sfLx6kAdGaos4epD4fGH7Cy+dy/iGxM8R2/CM=
=qUIJ
-----END PGP SIGNATURE-----
--=-=-=--