From: Steven Allen
To: Jameson Graef Rollins
Cc: notmuch@notmuchmail.org
Subject: Re: whitelisting
In-Reply-To: <87innmvvam.fsf@ligo.caltech.edu>
References: <87innmvvam.fsf@ligo.caltech.edu>
Date: Mon, 06 Mar 2017 16:02:50 -0800
Message-ID: <87fuiq0x91.fsf@bistromath>

Jameson,

> This works ok, but takes more than 20s to execute, which will slow down
> my inbox processing quite a bit.  I could try to write a python script
> to iterate over all tag:spam, extract addresses from those messages, and
> match against the whitelist, but I doubt that will be any faster.

Instead of iterating over all messages in spam, why not just iterate over
*new* messages (`tag:new`) in your pre hook? That is (pseudo code):

    for message in `notmuch search tag:new and tag:spam`:
        for author in message.headers["From"]:
            author = clean(author) # Extract the *actual* email address (name@domain).
            # There are probably faster ways to check this...
            if `notmuch count tag:sent and to:author` > 0:
                notmuch tag -spam -- message

That should be reasonably fast.

Note: you probably will have to do this in python because extracting the
from addresses otherwise is a bit of a pain.

- Steven
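
For reference, the pseudo code in this message could be fleshed out roughly as follows. This is only a sketch under several assumptions: the `notmuch` binary is on the PATH, `tag:sent` marks your outgoing mail, and `notmuch address --output=sender` prints one "Name <addr>" line per sender. The helper names (`run_notmuch`, `clean`, `whitelist_new_spam`) are made up for illustration and are not part of notmuch.

```python
import subprocess
from email.utils import parseaddr


def run_notmuch(*args):
    """Run a notmuch subcommand and return its stdout as text."""
    result = subprocess.run(
        ["notmuch", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def clean(author):
    """Reduce 'Name <user@domain>' to the bare address (name@domain)."""
    return parseaddr(author)[1]


def whitelist_new_spam(run=run_notmuch):
    """Untag new spam whose sender we have written to before.

    `run` is injectable so the logic can be exercised without notmuch.
    Returns the list of message ids that lost the spam tag.
    """
    untagged = []
    # --output=messages yields one "id:..." search term per line.
    found = run("search", "--output=messages", "tag:new and tag:spam")
    for msg_id in filter(None, found.splitlines()):
        # Assumption: one "Name <addr>" line per sender of this message.
        senders = run("address", "--output=sender", msg_id).splitlines()
        for sender in senders:
            address = clean(sender)
            if not address:
                continue
            # Any sent mail addressed to this sender counts as a hit.
            count = run("count", "tag:sent and to:" + address)
            if int(count.strip() or 0) > 0:
                run("tag", "-spam", "--", msg_id)
                untagged.append(msg_id)
                break  # one matching sender is enough for this message
    return untagged
```

Calling `whitelist_new_spam()` from the hook, after whatever step applies the spam tag, would then only touch messages that are still tagged `new`, so the cost scales with the size of the latest fetch rather than with the whole spam folder.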