unofficial mirror of notmuch@notmuchmail.org
From: Matt Armstrong <marmstrong@google.com>
To: Gaute Hope <eg@gaute.vetsj.com>,
	David Bremner <david@tethera.net>,
	Daniel Kahn Gillmor <dkg@fifthhorseman.net>,
	notmuch@notmuchmail.org, Xu Wang <xuwang762@gmail.com>
Subject: Re: find threads where I and Jian participated but not Dave
Date: Thu, 22 Jun 2017 17:00:58 -0700
Message-ID: <qf5wp83wp1h.fsf@google.com>
In-Reply-To: <1498112439.apimm1pnum.astroid@strange.none>

Gaute Hope <eg@gaute.vetsj.com> writes:

> Gaute Hope writes on juni 22, 2017 8:08:
>> Daniel Kahn Gillmor writes on juni 21, 2017 23:30:
>>> On Wed 2017-06-21 13:04:53 -0700, Matt Armstrong wrote:
>>>> For what it is worth, I've found this idea from Daniel intriguing and
>>>> pretty useful in practice:
>>>>
>>>>   "show me threads in which i've participated, where there are some
>>>>    messages flagged with 'inbox'"
>>>>
>>>> I implement it like this in my post-new hook:
>>>>
>>>>     # All messages in threads in which I participate get tag:participated
>>>>     notmuch search --output=threads from:marmstrong | \
>>>>       sed -e 's,^,+participated -- ,' | \
>>>>       notmuch tag --batch
>>> 
>>> cool, thx for the suggestion.
>>> 
>>> the "notmuch search" part of the pipeline alone takes ~19s (wall time,
>>> and actual CPU time) for me though :/  It returns 30504 threads!  how
>>> many threads do you get?
>> 
>> Is there any reason why you do not filter on a tag 'new' as well?
>> 
>>      notmuch search --output=threads from:marmstrong and tag:new | \
>>        sed -e 's,^,+participated -- ,' | \
>>        notmuch tag --batch
>> 
>
> Nevermind, I get it - it might be possible to add a temporary tag 
> new-tag to the whole thread first and not just new messages. That might 
> be faster. As long as all sent messages get the new tag as well.

Gaute, I took this as a challenge and came up with what I think is an
equivalent but more efficient approach.  The disadvantage is that it is
much more complex.  The advantage is that it runs in under 0.2 seconds
to process a day's worth of my "new" mail.

I now have this in my notmuch post-new hook.  I believe I could change the
"tag:new OR date:today" query to just "tag:new".  The "OR date:today"
helped during interactive development.

# All threads in which I participate get tag:participated
#  1) Find all threads with a message tagged new
#     (finding all 'today' messages helps during testing,
#     but isn't necessary)
#  2) Run through "xargs -s 2048 echo" to group the thread IDs
#     into lines of about 2K in size.
#  3) For each line (2) produces, narrow the threads to
#     those containing a message from me.
#  4) For each such thread, tag every message with +participated.
notmuch search --output=threads tag:new OR date:today | \
  xargs -s 2048 echo | \
  xargs -I '{}' notmuch search \
  --output=threads from:marmstrong AND \( '{}' \) | \
  sed -e 's,^,+participated -- ,' | \
  notmuch tag --batch
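
To see what step 2 does on its own, here is a small sketch that needs
no notmuch at all.  The thread IDs are made up (the fake_threads helper
is hypothetical and only stands in for the output of "notmuch search
--output=threads"); the xargs invocation is the same one used above.

```shell
# Hypothetical stand-in for "notmuch search --output=threads":
# prints 500 made-up thread IDs, one per line.
fake_threads() {
  i=1
  while [ "$i" -le 500 ]; do
    printf 'thread:%016x\n' "$i"
    i=$((i + 1))
  done
}

# Step 2: pack the IDs into lines, each built from a command line
# that xargs keeps within 2048 bytes.
grouped=$(fake_threads | xargs -s 2048 echo)

# How many lines came out, and how long is the longest one?
printf '%s\n' "$grouped" | wc -l
printf '%s\n' "$grouped" |
  awk '{ if (length($0) > max) max = length($0) } END { print max }'
```

Each output line then becomes one "notmuch search" query in step 3, so
the number of notmuch invocations grows with the number of 2K batches
rather than with the number of threads.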


The basic idea is that each run of the notmuch post-new hook will
incorporate relatively little mail, so the number of unique threads will
be relatively small.  So, we just list them all by thread ID.

Then, among the threads with new messages, we pick out those that have
a message from:marmstrong (it need not be the new message).

We then tag all messages in each of those threads with +participated.
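
For illustration, this is roughly what the sed step hands to "notmuch
tag --batch": one line per thread, with the tag operation, a "--"
separator, and the thread ID as the query.  The two thread IDs below
are made up for the example.

```shell
# Hypothetical thread IDs, as the narrowing search might print them.
batch=$(printf '%s\n' thread:0000000000000a01 thread:0000000000000b02 |
  sed -e 's,^,+participated -- ,')

# Show the lines that would be fed to "notmuch tag --batch".
printf '%s\n' "$batch"
```

Because the query on each line is a thread ID, the tag lands on every
message in that thread, not only on the new ones.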

You said "it might be possible to add a temporary tag new-tag to the
whole thread first and not just new messages." -- Yes, and that is
implicitly what I am doing, except that each such thread is instead
tracked in an ephemeral way through the xargs based shell pipeline.

I did try an approach of explicitly labeling all messages in "new"
threads, temporarily, but that was slower.

You said "As long as all sent messages get the new tag as well." --
true, and I'm not sure about that.  My primary use for this is to
discover new activity from others *after* I've participated in a thread,
so I don't much care if a thread that is "participated in" is not tagged
that way until some mail from somebody else arrives.
