unofficial mirror of notmuch@notmuchmail.org
* Questions regarding headers that may occur multiple times
@ 2022-02-11  8:16 mbw+nm
  2022-02-12 12:48 ` David Bremner
  0 siblings, 1 reply; 2+ messages in thread
From: mbw+nm @ 2022-02-11  8:16 UTC
  To: notmuch

Hi there,

I have been using notmuch in conjunction with neomutt and some homemade
tagging scripts more or less successfully for about two years now.

However, I sometimes have difficulty uniquely identifying (and thus tagging
accordingly) the intended recipient of an email. To do so, I wanted to
try using the `Delivered-To:` and `Received:` headers.

These may occur an arbitrary number of times within a given email. But
from what I gather by first reindexing with

`$ notmuch config set index.header.Received Received`

`$ notmuch reindex '*'`

and then inspecting the result via

`$ notmuch config set show.extra_headers "received;delivered-to"`

`$ notmuch show --format=json date:today | jq`,

it appears that only the first occurrence of these header values is
taken into account?

I might be wrong, so I'll just try to describe what problem I am trying
to solve:

Consider an email that looks like this:

Return-Path:
        <0101017ee5317599-02f47ee2-4eda-4840-abbd-7d52b54ff13b-000000@us-west-2.amazonses.com>
Delivered-To: unknown
Received: from pop3.mailbox.org ([2001:67c:2050:106::143:199]:995) by
        legion.localdomain with POP3-SSL getmail6 msgid:UID12422-1430074968; 10 Feb
        2022 20:27:40 -0000
Delivered-To: mbw+nm@mailbox.org
Received: from director-05.heinlein-hosting.de ([80.241.60.215])
        (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
        by dobby23a.heinlein-hosting.de with LMTPS
        id 2GLdD/RsBWLLYwEAwSxZKQ
        (envelope-from
        <0101017ee5317599-02f47ee2-4eda-4840-abbd-7d52b54ff13b-000000@us-west-2.amazonses.com>)
        for <mbw+nm@mailbox.org>; Thu, 10 Feb 2022 20:52:20 +0100
Received: from mx2.mailbox.org ([80.241.60.215])
        (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
        by director-05.heinlein-hosting.de with LMTPS
        id +M7kDfRsBWJYZAEATazItQ
        (envelope-from
        <0101017ee5317599-02f47ee2-4eda-4840-abbd-7d52b54ff13b-000000@us-west-2.amazonses.com>)
        for <mbw+nm@mailbox.org>; Thu, 10 Feb 2022 20:52:20 +0100
X-Virus-Scanned: amavisd-new at heinlein-support.de
etc.

Here, the information visible with `notmuch show` includes `Received:
from pop3.mailbox.org` (good) and `Delivered-To: unknown` (bad).

What I would like to do is to somehow access the second (or maybe all)
occurrences of these header values. Is that possible?

This would allow me to identify the actual recipient of emails that were
sent `To: all@some-mailing-list.net`.
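
One way to get at every occurrence outside of notmuch is to parse the message file directly. The following is a minimal sketch using Python's standard `email` module, not anything notmuch provides: the function name `delivered_to_addresses` and the sample addresses are made up for illustration, and treating any value without an `@` (such as `unknown`) as a placeholder is an assumption about the mail setup.

```python
from email import message_from_string
from email.utils import parseaddr


def delivered_to_addresses(raw_message):
    """Return all Delivered-To values that look like addresses, in order.

    Hypothetical helper: placeholder values without an "@" (for example
    "unknown") are skipped, which is an assumption, not a standard.
    """
    msg = message_from_string(raw_message)
    addresses = []
    # get_all() returns every occurrence of the header, not just one.
    for value in msg.get_all("Delivered-To", []):
        _, addr = parseaddr(value)
        if "@" in addr:
            addresses.append(addr.lower())
    return addresses


# Illustrative message; the hosts and addresses are invented.
SAMPLE = """\
Delivered-To: unknown
Received: from pop3.example.org by localhost; 10 Feb 2022 20:27:40 -0000
Delivered-To: user+list@example.org
To: all@some-mailing-list.example
Subject: test

body
"""

recipients = delivered_to_addresses(SAMPLE)
print(recipients)
```

A tagging script could feed this the files reported by `notmuch search --output=files` and tag by message ID based on the result.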

It also appears that (with notmuch 0.35), the `extra_headers` only show
up with `--format=json`, by the way.

I'd be grateful for any suggestions on how to approach this.


Cheers,
Max

* Re: Questions regarding headers that may occur multiple times
  2022-02-11  8:16 Questions regarding headers that may occur multiple times mbw+nm
@ 2022-02-12 12:48 ` David Bremner
  0 siblings, 0 replies; 2+ messages in thread
From: David Bremner @ 2022-02-12 12:48 UTC
  To: mbw+nm, notmuch

mbw+nm@mailbox.org writes:

TL;DR: yes, the things you think are not supported are not supported.

>
> it appears that only the first occurrence of these header values is
> taken into account?

Yes, we use g_mime_object_get_header, which "Gets the value of the first
header with the specified name.".

> What I would like to do is to somehow access the second (or maybe all)
> occurrences of these header values. Is that possible?

Not with the current indexing implementation. In principle it would be
possible to change the indexer to scan all of the headers for each
user-defined header to be indexed. I don't know how bad the performance
impact would be; it would mean moving from a hash table lookup to a
linear scan of the headers, but perhaps that time is small relative to
the work of actually updating the database.
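
As a rough illustration of that trade-off (in Python, not notmuch's actual C code), the sketch below models a message's headers as a list of (name, value) pairs and makes one linear pass over them, collecting every value of each configured header. The function name and data layout are assumptions made for the example.

```python
from collections import defaultdict


def collect_header_values(headers, wanted_names):
    """Return {lowercased name: [values in order]} for the wanted headers."""
    wanted = {name.lower() for name in wanted_names}
    found = defaultdict(list)
    for name, value in headers:  # one linear pass over all headers
        key = name.lower()
        if key in wanted:  # set membership test per header
            found[key].append(value)
    return dict(found)


# Illustrative header list; the hosts and addresses are invented.
headers = [
    ("Delivered-To", "unknown"),
    ("Received", "from pop3.example.org by localhost"),
    ("Delivered-To", "user@example.org"),
    ("Received", "from mx.example.org by director.example.org"),
    ("Subject", "test"),
]
result = collect_header_values(headers, ["Delivered-To", "Received"])
print(result)
```

The cost of the pass grows with the number of headers in the message rather than with the number of configured headers, which is why it may well be small next to the database update.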

> It also appears that (with notmuch 0.35), the `extra_headers` only show
> up with `--format=json`, by the way.

Also for s-expression output (and raw, fwiw), but yes, the text format is
missing quite a few features. The problem is that it does not use the
same "structured output" code as the other formats, so supporting new
features in the text format means essentially double the code/bugs. For
that reason the format has been more or less frozen since the emacs
front-end (and vim front-end, iirc) stopped using it.
