From: Eric Wong <e@80x24.org>
To: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
Cc: meta@public-inbox.org
Subject: Re: About header filtering
Date: Tue, 22 Dec 2020 23:11:26 +0000
Message-ID: <20201222231126.GA14850@dcvr>
In-Reply-To: <20201222222118.i4bioeo7l6iuf3pk@pengutronix.de>

Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:
> Hello Konstantin,
>
> On Tue, Dec 22, 2020 at 11:28:28AM -0500, Konstantin Ryabitsev wrote:
> > On Tue, Dec 22, 2020 at 08:37:04AM +0100, Uwe Kleine-König wrote:
> > > I found that Konstantin Ryabitsev's tool to prepare an initial archive
> > > from an already existing mailing list[1] filters some of these out, but
> > > the instance on kernel.org has some of these details, too. (See for
> > > example
> > > https://lore.kernel.org/lkml/20201013082132.661993-1-u.kleine-koenig@pengutronix.de/raw;
> > > there are Return-Path: and also some Received: headers that I consider
> > > not-so-nice as they were added after the mail was processed by the
> > > mailing list tool on vger.kernel.org.)
> > >
> > > Is it considered bad to filter these out? Or is it just that nobody
> > > wanted this kind of cleanliness before in such a setup?
> >
> > The reason we don't do any filtering after receiving the mail on the archiver
> > system is two-fold:
> >
> > 1. we don't know if any of the Received: lines are part of any DKIM/ARC
> > signatures (they shouldn't be -- it's wrong to include them, but I've seen
> > this happen).
>
> Note I don't intend to throw away all Received lines, only the ones
> concerning the hops after the mailing list server. These cannot be
> signed using DKIM unless the mailing list subscription goes to an
> address that is forwarded and the forwarding server signs the Received
> lines.

Fwiw, you should be able to use either Email::MIME or
PublicInbox::Eml to shift off the latest (topmost) Received
header:

----8<----
#!/usr/bin/perl -w
use strict;
use PublicInbox::Eml;
my $eml = PublicInbox::Eml->new(do { local $/; <STDIN> });
my @rcvd = $eml->header_raw('Received'); # array context for all instances
shift @rcvd; # remove topmost
$eml->header_set('Received', @rcvd); # set to keep remaining
print $eml->as_string;
----8<----

s/PublicInbox::Eml/Email::MIME/ works, too, but PublicInbox::Eml
won't recurse endlessly into multipart mails the way Email::MIME does.
Otherwise the header_raw, header_set, and as_string APIs should
behave the same.
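
To drop every hop added after the list server rather than only the
topmost one, something along these lines might work. This is only a
sketch: trim_rcvd is a hypothetical helper (it is not part of
PublicInbox::Eml or Email::MIME), and it assumes the list server names
itself in the "by" clause of the Received header it adds, e.g.
vger.kernel.org:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of any library.
# Takes the list server's host name and the Received values in message
# order (topmost, i.e. newest, first).  Returns the values starting at
# the first one whose "by" clause names $host, dropping the newer hops
# above it.
sub trim_rcvd {
	my ($host, @rcvd) = @_;
	for my $i (0..$#rcvd) {
		# \s+ also matches the newline + whitespace of a folded header
		if ($rcvd[$i] =~ /\bby\s+\Q$host\E(?![\w.-])/i) {
			return @rcvd[$i..$#rcvd];
		}
	}
	return @rcvd; # no match: leave everything untouched
}
```

In the script above, the "shift @rcvd;" line would then become:

	@rcvd = trim_rcvd('vger.kernel.org', @rcvd);

If no Received header matches, nothing is removed, so a wrong host name
fails safe rather than stripping all trace headers.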