From: "Eric Wong (Contractor, The Linux Foundation)" <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 11/13] v2writable: clarify header cleanups
Date: Thu, 22 Mar 2018 09:40:13 +0000
Message-ID: <20180322094015.14422-12-e@80x24.org>
In-Reply-To: <20180322094015.14422-1-e@80x24.org>
We want to make it clear, both to readers of the code and to
DEBUG_DIFF users, that we do not introduce messages with
unsuitable headers into public archives.
---
lib/PublicInbox/Import.pm | 12 +++++++++---
lib/PublicInbox/V2Writable.pm | 7 +++++++
2 files changed, 16 insertions(+), 3 deletions(-)
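
The helper factored out below relies on header_set() removing a
header when it is called with no values. A minimal standalone sketch
of that behavior, assuming Email::MIME from CPAN is installed; the
sub name and the @bad_headers list are hypothetical stand-ins, not
the real contents of @PublicInbox::MDA::BAD_HEADERS:

```perl
use strict;
use warnings;
use Email::MIME; # CPAN module, assumed to be installed

# Hypothetical stand-in for @PublicInbox::MDA::BAD_HEADERS; the real
# list is defined in lib/PublicInbox/MDA.pm and is not reproduced here.
my @bad_headers = qw(x-status approved);

# Illustrative copy of the helper, renamed to avoid implying it is
# the code from this patch.
sub drop_unwanted_headers_sketch {
	my ($mime) = @_;

	# header_set with a field name and no values removes that header
	$mime->header_set($_) for qw(bytes lines content-length status);
	$mime->header_set($_) for @bad_headers;
}

my $mime = Email::MIME->new(<<'EOM');
From: a@example.com
Subject: hello
Status: RO
Content-Length: 6
Lines: 1

hello
EOM

drop_unwanted_headers_sketch($mime);

# Only From: and Subject: should remain ahead of the body
print $mime->as_string;
```

Running the same cleanup on both sides before a comparison is what
keeps a DEBUG_DIFF-style diff free of noise from headers that are
never stored.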
diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index d69934b..5d116a1 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -288,6 +288,14 @@ sub extract_author_info ($) {
($name, $email);
}
+# kill potentially confusing/misleading headers
+sub drop_unwanted_headers ($) {
+ my ($mime) = @_;
+
+ $mime->header_set($_) for qw(bytes lines content-length status);
+ $mime->header_set($_) for @PublicInbox::MDA::BAD_HEADERS;
+}
+
# returns undef on duplicate
# returns the :MARK of the most recent commit
sub add {
@@ -321,9 +329,7 @@ sub add {
_check_path($r, $w, $tip, $path) and return;
}
- # kill potentially confusing/misleading headers
- $mime->header_set($_) for qw(bytes lines content-length status);
- $mime->header_set($_) for @PublicInbox::MDA::BAD_HEADERS;
+ drop_unwanted_headers($mime);
# spam check:
if ($check_cb) {
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 605f688..44b5528 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -223,6 +223,12 @@ sub remove {
my $mm = $skel->{mm};
my $removed;
my $mids = mids($mime->header_obj);
+
+ # We avoid introducing new blobs into git since the raw content
+ # can be slightly different, so we do not need the user-supplied
+ # message now that we have the mids and content_id
+ $mime = undef;
+
foreach my $mid (@$mids) {
$srch->reopen->each_smsg_by_mid($mid, sub {
my ($smsg) = @_;
@@ -430,6 +436,7 @@ sub diff ($$$) {
print $ah $cur->as_string or die "print: $!";
close $ah or die "close: $!";
my ($bh, $bn) = tempfile('email-new-XXXXXXXX');
+ PublicInbox::Import::drop_unwanted_headers($new);
print $bh $new->as_string or die "print: $!";
close $bh or die "close: $!";
my $cmd = [ qw(diff -u), $an, $bn ];
--
EW
Thread overview: 15+ messages
2018-03-22 9:40 [PATCH 00/13] reindexing, feeds, date fixes Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 01/13] content_id: do not take Message-Id into account Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 02/13] introduce InboxWritable class Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 03/13] import: discard all the same headers as MDA Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 04/13] InboxWritable: add mbox/maildir parsing + import logic Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 05/13] use both Date: and Received: times Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 06/13] msgmap: add tmp_clone to create an anonymous copy Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 07/13] fix syntax warnings Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 08/13] v2writable: support reindexing Xapian Eric Wong (Contractor, The Linux Foundation)
2018-03-26 20:08 ` Eric Wong
2018-03-22 9:40 ` [PATCH 09/13] t/altid.t: extra tests for mid_set Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 10/13] v2writable: add NNTP article number regeneration support Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` Eric Wong (Contractor, The Linux Foundation) [this message]
2018-03-22 9:40 ` [PATCH 12/13] v2writable: DEBUG_DIFF respects $TMPDIR Eric Wong (Contractor, The Linux Foundation)
2018-03-22 9:40 ` [PATCH 13/13] feed: $INBOX/new.atom endpoint supports v2 inboxes Eric Wong (Contractor, The Linux Foundation)