unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
* Q: Did you do something to message number recently?
@ 2019-06-24 15:59 Eric W. Biederman
  2019-06-24 16:34 ` Eric Wong
  2019-06-24 23:38 ` [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates Eric Wong
  0 siblings, 2 replies; 9+ messages in thread
From: Eric W. Biederman @ 2019-06-24 15:59 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta


Eric,

I am just starting to dig into this, I just noticed that I have several
inboxes that are seeing huge skips in message numbers assigned in
msgmap.  Do you have any idea why this would be?

If not I will dig in and figure this out.  I just figured I would ask
in case you have any handy canidates.

I thought I had a regression test in public-inbox to catch this.

Eric


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Q: Did you do something to message number recently?
  2019-06-24 15:59 Q: Did you do something to message number recently? Eric W. Biederman
@ 2019-06-24 16:34 ` Eric Wong
  2019-06-24 17:33   ` Eric Wong
  2019-06-24 23:38 ` [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates Eric Wong
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Wong @ 2019-06-24 16:34 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: meta

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> 
> Eric,
> 
> I am just starting to dig into this, I just noticed that I have several
> inboxes that are seeing huge skips in message numbers assigned in
> msgmap.  Do you have any idea why this would be?

I've hit this in the past using the altid stuff to keep NNTP
article numbers in check.

There's a lot of changes w.r.t. progress reporting, no-op
speedups,

> If not I will dig in and figure this out.  I just figured I would ask
> in case you have any handy canidates.

Are you hitting this with "git fetch && public-inbox-index" or
-watch/-mda?

> I thought I had a regression test in public-inbox to catch this.

Me too.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Q: Did you do something to message number recently?
  2019-06-24 16:34 ` Eric Wong
@ 2019-06-24 17:33   ` Eric Wong
  2019-06-24 22:56     ` Eric W. Biederman
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2019-06-24 17:33 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
> > 
> > Eric,
> > 
> > I am just starting to dig into this, I just noticed that I have several
> > inboxes that are seeing huge skips in message numbers assigned in
> > msgmap.  Do you have any idea why this would be?

How big are the gaps you're hitting?

> I've hit this in the past using the altid stuff to keep NNTP
> article numbers in check.
> 
> There's a lot of changes w.r.t. progress reporting, no-op
> speedups,
> 
> > If not I will dig in and figure this out.  I just figured I would ask
> > in case you have any handy canidates.
> 
> Are you hitting this with "git fetch && public-inbox-index" or
> -watch/-mda?

OK, I see this in my libc-alpha v2 mirror via -watch,
but not my v1 -watch mirrors.  Likely to affect -mda, too.
But the number gaps are small, however (1-2 messages), and.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Q: Did you do something to message number recently?
  2019-06-24 17:33   ` Eric Wong
@ 2019-06-24 22:56     ` Eric W. Biederman
  2019-06-24 23:42       ` Eric Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Eric W. Biederman @ 2019-06-24 22:56 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> Eric Wong <e@80x24.org> wrote:
>> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
>> > 
>> > Eric,
>> > 
>> > I am just starting to dig into this, I just noticed that I have several
>> > inboxes that are seeing huge skips in message numbers assigned in
>> > msgmap.  Do you have any idea why this would be?
>
> How big are the gaps you're hitting?

This is from my mailbox where I confirmed there was an issue.

> sqlite> select max(num) from msgmap
>    ...> ;
> 110106
> sqlite> select * from msgmap where num = 110106 ;
> 110106|84814e0e219b33ade641718dc76b3225@swift.generated
> sqlite> select * from msgmap where num = 110105 ;
> sqlite> select * from msgmap where num = 110104 ;
> sqlite> select * from msgmap where num = 110103 ;
> sqlite> select * from msgmap where num = 110102 ;
> sqlite> select * from msgmap where num = 110101 ;
> sqlite> select * from msgmap where num = 110100 ;
> sqlite> select * from msgmap where num = 110100 ;
> sqlite> select * from msgmap where num = 110099 ;
> sqlite> select * from msgmap where num = 110098 ;
> sqlite> select * from msgmap where num = 110097 ;
> sqlite> select * from msgmap where num = 110096 ;
> sqlite> select * from msgmap where num = 110095 ;
> sqlite> select * from msgmap where num = 110094 ;
> sqlite> select * from msgmap where num = 110093 ;
> sqlite> select * from msgmap where num = 110092 ;
> sqlite> select * from msgmap where num = 110091 ;
> sqlite> select * from msgmap where num = 110090 ;
> sqlite> select * from msgmap where num = 110089 ;
> 110089|be2d2a0d036c01dba1fcbe3c001d3f0e@rhempresarial.com

Hmm.  With a smarter querey I am seeing:

> sqlite> select num, mid from msgmap order by num ;
> 110361|CAC8N3VSSLZ+ANaTXb0b3-K2NNrSRxYkuozAvJrPa4TZBL70ESg@mail.gmail.com
> 110365|947beed847840a7b278d820d00158ff9@rhempresarial.com
> 110412|388652ac24bbd42b565255e60032ac38@training-programs.net
> 110534|0.0.ED.ACB.1D52978A96689C0.0@uspmta120030.emarsys.net
> 110547|20190623052647.GA9838@gmail.com
> 110555|OF1F4E7E7A.F48DBC61-ON47258422.0008F563@LocalDomain
> 110560|832660e183e06b2c917d925c438b74b1@gruyeresfvxneypf.onion.pet
> 110571|1561260206528_102468_20455_1833.sc-10_9_12_206-inbound8$ebiederm@xmission.com
> 110577|166408121.377898.1561282483109@mail.yahoo.com
> 110578|20190623024642.3CADECD60542E@mail.casanare.gov.co
> 110580|308185e7c16c72785bfa7ca1a11bebb8@contato.ufsc.br
> 110592|207A6EDACFB04E7299FCD018E0A4BFA4.MAI@srv.newbilisim.com
> 110603|20190623084720.0180DCDAF68EC@mail.casanare.gov.co
> 110612|20190623103511.B05FF259256F2@mail.07d02.mspz7.gob.ec
> 110662|20190610090733.0CF77DB98C18400A@al-reefy.com
> 110689|9d692655ffc312f5899ee60d045055df@swift.generated
> 110701|25mseb$-t-8$-$$03-v$33@1nabo
> 110744|E1hfHhs-0003HR-Qo@mx02.mta.xmission.com
> 110753|20190624052459.6136AA4A29@poczta.zarz.agh.edu.pl
> 110755|SG2PR04MB260278655E8FC19804B95C76AEE00@SG2PR04MB2602.apcprd04.prod.outlook.com
> 110756|1459657803.17189221561359906087.JavaMail.app@rapp21.atlis1
> 110759|0e2c946c07c8941ce9b5b4d8df655c22@swift.generated
> 110765|SG2PR01MB2983FD90065F6FDAEB52F3B2CCE00@SG2PR01MB2983.apcprd01.prod.exchangelabs.com
> 110772|613cadb920e6c2bd6cecc7a5e2d7af17@swift.generated
> 110774|156136730324625@kroah.com
> 110776|f67aacd0534c471aaf9f531fd7d76514@SRV-MBX13-02.chu-toulouse.fr
> 110782|CAC=vM_Z8Pr4NVsatk3ROn1W8fJjkO3B_CDs8s+2UXm56cg3scA@mail.gmail.com
> 110783|20190624092326.475497997@linuxfoundation.org
> 110784|cb83bd8a41985599698df9225cbcd322@swift.generated
> 110787|6c4e4b4ecf374e0f92f6553e96368c38@SRV-MBX13-02.chu-toulouse.fr
> 110810|20190624132129.4c772nkjbrilxtcy@willie-the-truck
> 110826|1ada2c2ef303f2e6742747bb2218a9a4@fasobf.com
> 110827|20190624105927.84B07703461ACB3F@srv155.main-hosting.eu
> 110843|20190624163442.xhk6drl7ptnq7i5o@dcvr
> 110848|CF9A396C287FC34DABB6CB7E8D92D8FD0C9C6850@V1-EXCH-01.hccsc.k12.in.us
> 110852|20190624173319.3bb4t3zrieb4k5w2@dcvr
> 110856|822002_597274_254187_778544@cvanguardia.net
> 110865|20190624184534.209896-1-joel@joelfernandes.org
> 110866|20190624185214.GA211230@google.com
> 110869|CAG48ez3maGsRbN3qr8YVb6ZCw0FDq-7GqqiTiA4yEa1mebkubw@mail.gmail.com
> 110879|20190624201023.Vlx4D%akpm@linux-foundation.org
> 110884|CAEXW_YS0YR8Au+1f-sW_BT3xVONXKo9zrcSJMBwGJizyMw0xag@mail.gmail.com

So it is pretty bad.  I am using my import_imap_mailbox script that
I am trying to clean up and so I can merge it into at leat contrib.
This is a v2 mailbox and I am using
"PublicInbox::InboxWritable($ibx)->importer(1)->add(...)"

There may have been some crashes in the past, and I am delivering to
multiple mailboxes at once.

Let's see. InboxWritable->importer->add resolves to V2Writeable->add

The add method computes the number using num_for, which uses
Msgmpa::mid_insert.

Short of the sequence number for msgmap getting scrambled I don't
see how that can go wrong.  Sigh.

I will have to instrument my client, and see if that will give
me a clue what is going on.

I definitely don't have altid's in play here.

>> I've hit this in the past using the altid stuff to keep NNTP
>> article numbers in check.
>> 
>> There's a lot of changes w.r.t. progress reporting, no-op
>> speedups,
>> 
>> > If not I will dig in and figure this out.  I just figured I would ask
>> > in case you have any handy canidates.
>> 
>> Are you hitting this with "git fetch && public-inbox-index" or
>> -watch/-mda?
>
> OK, I see this in my libc-alpha v2 mirror via -watch,
> but not my v1 -watch mirrors.  Likely to affect -mda, too.
> But the number gaps are small, however (1-2 messages), and.

Good.  I am glad you can see it as well.  One of use might be able
to spot what strange thing is going on here.

I think most of my previous work was all in the index rebuild path.
Which may be why we don't have a regression test catching this.

Grr.

I don't see a good candiate for this failure!

Eric






^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates
  2019-06-24 15:59 Q: Did you do something to message number recently? Eric W. Biederman
  2019-06-24 16:34 ` Eric Wong
@ 2019-06-24 23:38 ` Eric Wong
  2019-06-25  3:48   ` Eric Wong
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Wong @ 2019-06-24 23:38 UTC (permalink / raw)
  To: meta; +Cc: Eric W. Biederman

"INSERT OR IGNORE" still bumps the auto-increment counter in
SQLite, which causes gaps to appear in NNTP article numbering.

This bug appeared in v2 repos where V2Writable may call ->add
repeatedly on the same message.  This bug is apparent with
public-inbox-watch and work-in-progress IMAP watchers which may
rescan and (attempt to) reinsert the same message on mailbox
changes.

Most uses of public-inbox-mda were not affected, unless the
same message is actually delivered multiple times to the mda.
v1 is not affected, either, since deduplication is only based
on Message-ID and msgmap never sees the duplicate.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 lib/PublicInbox/Msgmap.pm | 4 ++--
 t/msgmap.t                | 3 +++
 t/v2writable.t            | 2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 0035c9e3..5a89b85a 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -126,9 +126,9 @@ sub mid_insert {
 	my ($self, $mid) = @_;
 	my $dbh = $self->{dbh};
 	my $sth = $dbh->prepare_cached(<<'');
-INSERT OR IGNORE INTO msgmap (mid) VALUES (?)
+INSERT INTO msgmap (mid) VALUES (?)
 
-	return if $sth->execute($mid) == 0;
+	return unless eval { $sth->execute($mid) };
 	my $num = $dbh->last_insert_id(undef, undef, 'msgmap', 'num');
 	$self->num_highwater($num) if defined($num);
 	$num;
diff --git a/t/msgmap.t b/t/msgmap.t
index 4dddd0a8..2d018219 100644
--- a/t/msgmap.t
+++ b/t/msgmap.t
@@ -30,6 +30,9 @@ $@ = undef;
 my $ret = $d->mid_insert('a@b');
 is($ret, undef, 'duplicate mid_insert in undef result');
 is($d->num_for('a@b'), $mid2num{'a@b'}, 'existing number not clobbered');
+my $max = (sort(keys %num2mid))[-1];
+is($d->mid_insert('ok@unique'), $max + 1,
+	'got expected num after failing mid_insert');
 
 foreach my $n (keys %num2mid) {
 	is($d->mid_for($n), $num2mid{$n}, "num:$n maps correctly");
diff --git a/t/v2writable.t b/t/v2writable.t
index 88df2d64..8f32fbe5 100644
--- a/t/v2writable.t
+++ b/t/v2writable.t
@@ -118,6 +118,8 @@ if ('ensure git configs are correct') {
 	$mime->header_set('References', '<zz-mid@b>');
 	ok($im->add($mime), 'message with multiple Message-ID');
 	$im->done;
+	my ($total, undef) = $ibx->over->recent;
+	is($ibx->mm->num_highwater, $total, 'got expected highwater value');
 	my $srch = $ibx->search;
 	my $mset1 = $srch->reopen->query('m:abcde@1', { mset => 1 });
 	is($mset1->size, 1, 'message found by first MID');
-- 
EW


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Q: Did you do something to message number recently?
  2019-06-24 22:56     ` Eric W. Biederman
@ 2019-06-24 23:42       ` Eric Wong
  2019-06-25 12:01         ` Eric W. Biederman
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2019-06-24 23:42 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: meta

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> The add method computes the number using num_for, which uses
> Msgmpa::mid_insert.
> 
> Short of the sequence number for msgmap getting scrambled I don't
> see how that can go wrong.  Sigh.

That seems to be what went wrong with "INSERT OR IGNORE"

cf. https://public-inbox.org/meta/20190624233809.1721-1-e@80x24.org/

> I will have to instrument my client, and see if that will give
> me a clue what is going on.

I guess your client ends up seeing the same message
over-and-over again like -watch does?

> I think most of my previous work was all in the index rebuild path.
> Which may be why we don't have a regression test catching this.

Yeah, this isn't a new bug at all.  I didn't notice since I
mainly use w3m or (neo)mutt for NNTP and didn't look too closely
at NNTP article numbers.  lynx makes it apparent messages are
missing: lynx news.public-inbox.org/inbox.comp.lib.glibc.alpha

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates
  2019-06-24 23:38 ` [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates Eric Wong
@ 2019-06-25  3:48   ` Eric Wong
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-06-25  3:48 UTC (permalink / raw)
  To: meta; +Cc: Eric W. Biederman

Pushed to `master' with this warning fix squashed in since I
used `$max' elsewhere:

diff --git a/t/msgmap.t b/t/msgmap.t
index 2d018219..20985ce8 100644
--- a/t/msgmap.t
+++ b/t/msgmap.t
@@ -30,8 +30,8 @@ $@ = undef;
 my $ret = $d->mid_insert('a@b');
 is($ret, undef, 'duplicate mid_insert in undef result');
 is($d->num_for('a@b'), $mid2num{'a@b'}, 'existing number not clobbered');
-my $max = (sort(keys %num2mid))[-1];
-is($d->mid_insert('ok@unique'), $max + 1,
+my $next = (sort(keys %num2mid))[-1];
+is($d->mid_insert('ok@unique'), $next + 1,
 	'got expected num after failing mid_insert');
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Q: Did you do something to message number recently?
  2019-06-24 23:42       ` Eric Wong
@ 2019-06-25 12:01         ` Eric W. Biederman
  2019-06-25 17:51           ` Eric Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Eric W. Biederman @ 2019-06-25 12:01 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong <e@80x24.org> writes:

> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
>> The add method computes the number using num_for, which uses
>> Msgmpa::mid_insert.
>> 
>> Short of the sequence number for msgmap getting scrambled I don't
>> see how that can go wrong.  Sigh.
>
> That seems to be what went wrong with "INSERT OR IGNORE"
>
> cf. https://public-inbox.org/meta/20190624233809.1721-1-e@80x24.org/

Is it INSERT OR IGNORE or is it the autoincrement?  As I read
the sqlite docs autoincrement is allowed to skip message numbers.
Especially in cases where the insert fails.

This fix should come with a test case shouldn't it?  Such a test
case should be pretty straight forward to write.  Insert a message
multiple times and then insert a new one.

I don't recall the transaction constraints on sqlite but I am wondering
if instead of INSERT OR IGNORE we should read and the increment
num_highwater in the public-inbox code and then we would not need to
depend upon autoincrement only incrementing once.  Even when the sqlite
docs don't promise it will.

>> I will have to instrument my client, and see if that will give
>> me a clue what is going on.
>
> I guess your client ends up seeing the same message
> over-and-over again like -watch does?

It usually doesn't but for dealing with a different imap server I
tweaked the code and one of the servers feeding me the last message in
the mail box every time I checked the mailbox for new messages.

I have also narrowed it down to duplication insertions.  I was just
about to start looking at what the fix would be.

>> I think most of my previous work was all in the index rebuild path.
>> Which may be why we don't have a regression test catching this.
>
> Yeah, this isn't a new bug at all.  I didn't notice since I
> mainly use w3m or (neo)mutt for NNTP and didn't look too closely
> at NNTP article numbers.  lynx makes it apparent messages are
> missing: lynx news.public-inbox.org/inbox.comp.lib.glibc.alpha

It is a little obscure in gnus but I start getting suspcious when
I tell it to give me the last 10 messages and I get only one.

Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Q: Did you do something to message number recently?
  2019-06-25 12:01         ` Eric W. Biederman
@ 2019-06-25 17:51           ` Eric Wong
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-06-25 17:51 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: meta

"Eric W. Biederman" <ebiederm@xmission.com> wrote:
> Eric Wong <e@80x24.org> writes:
> 
> > "Eric W. Biederman" <ebiederm@xmission.com> wrote:
> >> The add method computes the number using num_for, which uses
> >> Msgmpa::mid_insert.
> >> 
> >> Short of the sequence number for msgmap getting scrambled I don't
> >> see how that can go wrong.  Sigh.
> >
> > That seems to be what went wrong with "INSERT OR IGNORE"
> >
> > cf. https://public-inbox.org/meta/20190624233809.1721-1-e@80x24.org/
> 
> Is it INSERT OR IGNORE or is it the autoincrement?  As I read
> the sqlite docs autoincrement is allowed to skip message numbers.
> Especially in cases where the insert fails.

Changing it to plain INSERT made the autoincrement stop skipping.
That's been my experience with other DBs in the past, too;
but I haven't used "OR IGNORE" much (and it's been a
long time since I've looked at SQL outside of SQLite).

> This fix should come with a test case shouldn't it?  Such a test
> case should be pretty straight forward to write.  Insert a message
> multiple times and then insert a new one.

Did you see my changes to t/msgmap.t and t/v2writable.t?

I appended the test to existing dupe test in t/msgmap.t and
added the num_highwater check to t/v2writable.t which hit
duplicates.  I don't think there needs to be a separate test,
and the test suite is already slow enough to impact productivity
on my laptop.

> I don't recall the transaction constraints on sqlite but I am wondering
> if instead of INSERT OR IGNORE we should read and the increment
> num_highwater in the public-inbox code and then we would not need to
> depend upon autoincrement only incrementing once.  Even when the sqlite
> docs don't promise it will.

Yeah, that's probably the right step going forward.  I don't
think there's forward/backwards-compatibility things to worry
about, since the indexing code for mirrors already ignores the
autoincrement.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-06-25 17:51 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-06-24 15:59 Q: Did you do something to message number recently? Eric W. Biederman
2019-06-24 16:34 ` Eric Wong
2019-06-24 17:33   ` Eric Wong
2019-06-24 22:56     ` Eric W. Biederman
2019-06-24 23:42       ` Eric Wong
2019-06-25 12:01         ` Eric W. Biederman
2019-06-25 17:51           ` Eric Wong
2019-06-24 23:38 ` [PATCH] msgmap: mid_insert: use plain "INSERT" to detect duplicates Eric Wong
2019-06-25  3:48   ` Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).