unofficial mirror of notmuch@notmuchmail.org
From: Carl Worth <cworth@cworth.org>
To: Jeff Templon <templon@nikhef.nl>,
	David Bremner <david@tethera.net>,
	Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Cc: notmuch@notmuchmail.org
Subject: Re: tell me how to do this right (mail sent to lists)
Date: Fri, 12 Oct 2018 11:02:28 -0700
Message-ID: <87pnwfqc0b.fsf@wondoo.home.cworth.org>
In-Reply-To: <m2d0sh32od.fsf@nikhef.nl>

On Wed, Oct 10 2018, Jeff Templon wrote:
>> The tag is not associated with the file in Sent, it is associated
>> with the message-id.
>
> I guess I didn't make myself clear enough, again.  I didn't mean that
> the tag is associated with the file.  What I am guessing is something
> like this:

Hi Jeff,

Thanks for persevering so that we can all try to understand what's
happening. I appreciate the patience on all sides. :-)

> for message in new_messages:
>    if message.id not in database:
>       process message and determine list of tags
>       apply those tags to the messageID

Well, there are actually a couple of different processing loops that you
might be describing with the above. Let me try to walk through things:

First, there's a loop where "notmuch new" finds previously-unseen files
and indexes the content, adding it to the database:

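for message_file in new_files:
  message_id = get_headers_message_id (message_file)
  # Returns an existing message object, or creates a new one
  (message, is_new) = database_lookup (message_id)
  # The content is indexed whether or not the message is new
  index_file (message, message_file)
  if is_new:
    # Tags from "new.tags" are applied only on first sight
    add_new_message_tags (message)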

The above pseudo-code is based on the loop in notmuch-new.c:add_files()
and add_file(), as well as lib/add-message.cc:notmuch_database_index_file(),
and it more or less tries to use naming consistent with the code.

Something to note in the above loop: the database_lookup above (which is
actually _notmuch_message_create_for_message_id) can either create a
new message object in the database or return an existing object. But,
either way, the content of the message file will be indexed. So, the
significant feature is that notmuch will always be able to search the
content it indexes for any message file, regardless of the order in
which duplicate files were encountered.

However, as can also be seen in the above loop, the tags for new
messages (as configured in the "new.tags" entry in ~/.notmuch-config)
will only be added if the message ID is new in this pass of notmuch-new.
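
As a minimal sketch of the behavior described above, here is a toy
in-memory model in Python. It is not notmuch's real API: the Message
class, run_new_pass(), and the dict used as a database are all
hypothetical stand-ins, and "indexing" is reduced to recording the file
path. It only illustrates that a second file for a known message ID is
indexed but does not receive the "new" tag again.

```python
# Toy in-memory model of the "notmuch new" loop; not notmuch's real API.
from dataclasses import dataclass, field

NEW_TAGS = ("new",)  # stands in for the "new.tags" configuration entry


@dataclass
class Message:
    message_id: str
    files: list = field(default_factory=list)
    tags: set = field(default_factory=set)


def run_new_pass(database, new_files):
    """Index each (path, message_id) pair; tag only first-seen message IDs."""
    for path, message_id in new_files:
        is_new = message_id not in database
        message = database.setdefault(message_id, Message(message_id))
        message.files.append(path)  # the content is indexed either way
        if is_new:
            message.tags.update(NEW_TAGS)
    return database


db = {}
# Pass 1: only the copy in Sent exists.
run_new_pass(db, [("Sent/cur/1", "<m1@example>")])
# Some external tagger processes the message and clears "new".
db["<m1@example>"].tags.discard("new")
# Pass 2: the mailing-list duplicate arrives with the same Message-ID.
run_new_pass(db, [("lists/new/2", "<m1@example>")])
msg = db["<m1@example>"]
```

After the second pass, msg.files holds both paths, but msg.tags does
not contain "new", so a tagger that only looks at "new" messages never
sees the mailing-list copy.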

I'm not looking into the code for afew right now, but I can guess a
couple of places where the undesired behavior could be happening:

1. It could be looping over all messages with the "new" tag. If your
   sent message gets tagged "new" in a pass before the mailing-list
   duplicate is present, then afew will not have access to the
   mailing-list version when it does its processing. Then, later, when a
   pass does have the mailing-list duplicate present, the message won't
   be considered "new", so it would not get picked up by a loop that
   only considers messages tagged "new".

2. It's possible that both message files are present at the time that
   afew does its processing, but that it only opens one of the files to
   go looking for the List-Id header, (which it must be doing
   somewhere---as David mentioned, the List-Id header is not ever
   indexed by notmuch itself).
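
To illustrate the second case, here is a hedged Python sketch that
inspects every file belonging to a message rather than just one. The
function name list_ids_for_files() is hypothetical, and this says
nothing about how afew is actually implemented. It assumes the caller
already has all the file paths for one message ID (for example from
"notmuch search --output=files id:..."); the demo writes two small
files to a temporary directory instead.

```python
import os
import tempfile
from email import policy
from email.parser import BytesParser


def list_ids_for_files(paths):
    """Return the set of List-Id values found across all given files."""
    found = set()
    parser = BytesParser(policy=policy.default)
    for path in paths:
        with open(path, "rb") as fh:
            headers = parser.parse(fh, headersonly=True)
        for value in headers.get_all("List-Id", []):
            found.add(str(value).strip())
    return found


# Demo: the same message exists as a Sent copy and a mailing-list copy.
with tempfile.TemporaryDirectory() as tmp:
    sent = os.path.join(tmp, "sent")
    listed = os.path.join(tmp, "list")
    with open(sent, "wb") as fh:
        fh.write(b"Message-ID: <m1@example>\nSubject: hi\n\nbody\n")
    with open(listed, "wb") as fh:
        fh.write(b"Message-ID: <m1@example>\n"
                 b"List-Id: <my-list.example.org>\n\nbody\n")
    only_sent = list_ids_for_files([sent])       # misses the header
    both = list_ids_for_files([sent, listed])    # finds it
```

Looking only at the Sent copy yields an empty set, while checking both
files finds the List-Id header.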

And in the above discussion, I'm assuming that it's even notmuch-new
that's doing the detection of new files. Some people use mail flows that
have some external mechanism for processing new incoming mail and then
calling "notmuch insert" for each one.

In conclusion, you have a few different options to get reliable
behavior:

One option is to use notmuch-based searches to find the mailing-list
mail that is of interest to you. To do this you would want to key off
of a header that is indexed by notmuch. For example, you could do
something like:

	notmuch tag +my-list-tag to:my-list-recipient-address
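
If this tagging is driven from a script, the same command can be built
and run from Python. This is only a sketch: the helper names and the
example address are hypothetical, and it assumes the notmuch binary is
on PATH. The "--" separates the tag changes from the search terms.

```python
import subprocess


def list_tag_command(tag, recipient):
    """Build the argv for: notmuch tag +TAG -- to:RECIPIENT"""
    return ["notmuch", "tag", f"+{tag}", "--", f"to:{recipient}"]


def apply_list_tag(tag, recipient):
    """Run the command; raises CalledProcessError if notmuch fails."""
    subprocess.run(list_tag_command(tag, recipient), check=True)


# Building the command does not require notmuch to be installed.
argv = list_tag_command("my-list-tag", "my-list@example.org")
```

Because the search runs against the whole database, it also matches
messages whose mailing-list copy arrived in a later pass.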

Another option is to continue to tag messages by inspecting the file
(outside of notmuch) to look for a header like List-Id (like you are
apparently doing now). To do this reliably, you would simply want to
ensure that the processing happens on every new file that is added. And
note that the "new" tag as added by "notmuch new" is not reliable for
that. That tag _is_ reliable for learning that a new message ID has
become available in the database, but is not reliable for knowing that
a new message file has appeared (for a message ID that was already
present).
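
One way to sketch "process every new file" without relying on the
"new" tag is to remember which files have already been handled. The
Python below is an illustration under assumptions, not a tested mail
flow: unprocessed_files() and touch() are hypothetical helpers, the set
of seen keys would need to be persisted between runs, and files are
keyed on the Maildir unique name (the part before ":2,FLAGS") so that a
flag change does not make a file look new.

```python
import os
import tempfile


def unprocessed_files(maildir_root, seen_keys):
    """Return message files whose Maildir unique name was not seen yet."""
    fresh = []
    for dirpath, _dirnames, filenames in os.walk(maildir_root):
        if os.path.basename(dirpath) not in ("cur", "new"):
            continue  # skip tmp/ and non-Maildir directories
        for name in filenames:
            key = name.split(":", 1)[0]  # drop the ":2,FLAGS" suffix
            if key not in seen_keys:
                seen_keys.add(key)
                fresh.append(os.path.join(dirpath, name))
    return sorted(fresh)


def touch(path):
    """Create an empty file, making parent directories as needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()


with tempfile.TemporaryDirectory() as root:
    seen = set()  # in practice this would be saved between runs
    touch(os.path.join(root, "Sent", "cur", "1001.sent.host:2,S"))
    first_pass = [os.path.basename(p)
                  for p in unprocessed_files(root, seen)]
    # Later, the mailing-list duplicate of the same message arrives.
    touch(os.path.join(root, "lists", "new", "1002.list.host"))
    second_pass = [os.path.basename(p)
                   for p in unprocessed_files(root, seen)]
```

The second pass returns only the mailing-list file, even though its
message ID is already known to the database, so a List-Id check can be
run on exactly the files that have not been inspected yet.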

Does that help explain things?

-Carl





