From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 547061F461;
	Mon, 24 Jun 2019 23:38:09 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Cc: "Eric W. Biederman"
Subject: [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates
Date: Mon, 24 Jun 2019 23:38:09 +0000
Message-Id: <20190624233809.1721-1-e@80x24.org>
In-Reply-To: <878strvusz.fsf@xmission.com>
References: <878strvusz.fsf@xmission.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

"INSERT OR IGNORE" still bumps the auto-increment counter in SQLite,
which causes gaps to appear in NNTP article numbering.

This bug appeared in v2 repos where V2Writable may call ->add
repeatedly on the same message.  This bug is apparent with
public-inbox-watch and work-in-progress IMAP watchers which may
rescan and (attempt to) reinsert the same message on mailbox
changes.

Most uses of public-inbox-mda were not affected, unless the same
message is actually delivered multiple times to the mda.

v1 is not affected, either, since deduplication is only based on
Message-ID and msgmap never sees the duplicate.

Reported-by: "Eric W. Biederman"
---
 lib/PublicInbox/Msgmap.pm | 4 ++--
 t/msgmap.t                | 3 +++
 t/v2writable.t            | 2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 0035c9e3..5a89b85a 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -126,9 +126,9 @@ sub mid_insert {
 	my ($self, $mid) = @_;
 	my $dbh = $self->{dbh};
 	my $sth = $dbh->prepare_cached(<<'');
-INSERT OR IGNORE INTO msgmap (mid) VALUES (?)
+INSERT INTO msgmap (mid) VALUES (?)
 
-	return if $sth->execute($mid) == 0;
+	return unless eval { $sth->execute($mid) };
 	my $num = $dbh->last_insert_id(undef, undef, 'msgmap', 'num');
 	$self->num_highwater($num) if defined($num);
 	$num;
diff --git a/t/msgmap.t b/t/msgmap.t
index 4dddd0a8..2d018219 100644
--- a/t/msgmap.t
+++ b/t/msgmap.t
@@ -30,6 +30,9 @@ $@ = undef;
 my $ret = $d->mid_insert('a@b');
 is($ret, undef, 'duplicate mid_insert in undef result');
 is($d->num_for('a@b'), $mid2num{'a@b'}, 'existing number not clobbered');
+my $max = (sort(keys %num2mid))[-1];
+is($d->mid_insert('ok@unique'), $max + 1,
+	'got expected num after failing mid_insert');
 
 foreach my $n (keys %num2mid) {
 	is($d->mid_for($n), $num2mid{$n}, "num:$n maps correctly");
diff --git a/t/v2writable.t b/t/v2writable.t
index 88df2d64..8f32fbe5 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -118,6 +118,8 @@ if ('ensure git configs are correct') {
 	$mime->header_set('References', '');
 	ok($im->add($mime), 'message with multiple Message-ID');
 	$im->done;
+	my ($total, undef) = $ibx->over->recent;
+	is($ibx->mm->num_highwater, $total, 'got expected highwater value');
 	my $srch = $ibx->search;
 	my $mset1 = $srch->reopen->query('m:abcde@1', { mset => 1 });
 	is($mset1->size, 1, 'message found by first MID');
-- 
EW
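
The duplicate handling in the mid_insert hunk above can be tried outside of public-inbox with a minimal standalone sketch. It assumes DBI and DBD::SQLite are installed. The table definition is a simplified stand-in for the real msgmap schema, and try_insert is a hypothetical helper, not a function from the patch. RaiseError is enabled here so that a UNIQUE violation on a plain "INSERT" throws and the eval turns it into an empty return.

```perl
#!/usr/bin/perl
# Sketch only: assumes DBI and DBD::SQLite are available.
use strict;
use warnings;
use DBI;

# In-memory database; RaiseError makes constraint violations die.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', {
	RaiseError => 1,
	PrintError => 0,
	AutoCommit => 1,
});

# Assumed, simplified schema: auto-incrementing num, unique mid.
$dbh->do(q{
	CREATE TABLE msgmap (
		num INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
		mid VARCHAR(1000) NOT NULL,
		UNIQUE (mid)
	)
});

# Hypothetical helper mirroring the patched mid_insert logic:
# returns the new num, or nothing if the mid already exists.
sub try_insert {
	my ($dbh, $mid) = @_;
	my $sth = $dbh->prepare_cached('INSERT INTO msgmap (mid) VALUES (?)');
	return unless eval { $sth->execute($mid) };
	$dbh->last_insert_id(undef, undef, 'msgmap', 'num');
}

my $first = try_insert($dbh, 'a@b');       # new row
my $dup   = try_insert($dbh, 'a@b');       # duplicate, expected to be undef
my $next  = try_insert($dbh, 'ok@unique'); # new row after the failed insert

printf "first=%d dup=%s next=%d\n",
	$first, defined($dup) ? $dup : 'undef', $next;
```

Because the failed statement is rolled back, the number assigned to 'ok@unique' should directly follow the one assigned to 'a@b', which is the same property the new check in t/msgmap.t asserts.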