* V2 shard roll-over
From: Konstantin Ryabitsev @ 2019-02-12 19:11 UTC
To: meta

Eric:

I noticed today that the LKML shard 6 has grown over 1.1 GB, which is the
size of other shards (0-5). I'm wondering if it will roll over to shard 7
automatically, or if there are other steps that need to be undertaken.

Best,
-K
* Re: V2 shard roll-over
From: Eric Wong @ 2019-02-12 19:27 UTC
To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Eric:
>
> I noticed today that the LKML shard 6 has grown over 1.1 GB, which is the
> size of other shards (0-5). I'm wondering if it will roll over to shard 7
> automatically, or if there are other steps that need to be undertaken.

It only counts bytes in *.pack files; so you might need to repack
(or wait for gc to run via --auto).

You can monitor the rollover via stderr with the following to be
sure:

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 1f17fe2..0fd8c60 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -588,6 +588,9 @@ sub importer {
 	if (defined $latest) {
 		my $git = PublicInbox::Git->new($latest);
 		my $packed_bytes = $git->packed_bytes;
+
+		print STDERR "packed_bytes=$packed_bytes ",
+			"rotate_bytes=$self->{rotate_bytes}\n";
 		if ($packed_bytes >= $self->{rotate_bytes}) {
 			$epoch = $max + 1;
 		} else {
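The point that only bytes in *.pack files are counted can be illustrated with a small Perl sketch. This is not public-inbox's `packed_bytes` implementation, which is not shown in this thread; the helper name `pack_bytes_total` is hypothetical, and the only thing assumed is git's standard `objects/pack/*.pack` layout inside a git directory.

```perl
use strict;
use warnings;
use File::Glob ':bsd_glob';    # bsd_glob() does not split patterns on whitespace

# Hypothetical helper (not part of public-inbox): sum the sizes, in bytes,
# of every *.pack file under <git_dir>/objects/pack. Loose objects and
# *.idx files are deliberately ignored, so a repository that has not been
# repacked yet reports a smaller number than its on-disk size.
sub pack_bytes_total {
    my ($git_dir) = @_;
    my $total = 0;
    for my $pack (bsd_glob("$git_dir/objects/pack/*.pack")) {
        $total += (-s $pack) // 0;    # -s yields the file size in bytes
    }
    return $total;
}

# Usage: perl pack_bytes.pl /path/to/epoch.git
print pack_bytes_total($ARGV[0]), "\n" if @ARGV;
```

A freshly imported epoch with many loose objects would report few or no packed bytes here until `git repack` or `git gc` has run, which is why repacking comes first.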
* Re: V2 shard roll-over
From: Eric Wong @ 2019-02-27 0:22 UTC
To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > Eric:
> >
> > I noticed today that the LKML shard 6 has grown over 1.1 GB, which is the
> > size of other shards (0-5). I'm wondering if it will roll over to shard 7
> > automatically, or if there are other steps that need to be undertaken.
>
> It only counts bytes in *.pack files; so you might need to repack
> (or wait for gc to run via --auto).

Btw, have you checked this? I've been wondering if 7 will show up, too.

> You can monitor the rollover via stderr with the following to be
> sure:
>
> diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
> index 1f17fe2..0fd8c60 100644
> --- a/lib/PublicInbox/V2Writable.pm
> +++ b/lib/PublicInbox/V2Writable.pm
> @@ -588,6 +588,9 @@ sub importer {
>  	if (defined $latest) {
>  		my $git = PublicInbox::Git->new($latest);
>  		my $packed_bytes = $git->packed_bytes;
> +
> +		print STDERR "packed_bytes=$packed_bytes ",
> +			"rotate_bytes=$self->{rotate_bytes}\n";
>  		if ($packed_bytes >= $self->{rotate_bytes}) {
>  			$epoch = $max + 1;
>  		} else {
* Re: V2 shard roll-over
From: Konstantin Ryabitsev @ 2019-02-27 13:26 UTC
To: Eric Wong; +Cc: meta

On Wed, Feb 27, 2019 at 12:22:04AM +0000, Eric Wong wrote:
> Eric Wong <e@80x24.org> wrote:
> > Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > > Eric:
> > >
> > > I noticed today that the LKML shard 6 has grown over 1.1 GB, which is the
> > > size of other shards (0-5). I'm wondering if it will roll over to shard 7
> > > automatically, or if there are other steps that need to be undertaken.
> >
> > It only counts bytes in *.pack files; so you might need to repack
> > (or wait for gc to run via --auto).
>
> Btw, have you checked this? I've been wondering if 7 will show up, too.

Yeah, I've repacked it, but we still haven't rolled over to 7. This is
the latest on the server:

$ git count-objects -v
count: 18
size: 88
in-pack: 1036749
packs: 1
size-pack: 1169104
prune-packable: 0
garbage: 0
size-garbage: 0
$ ls -al objects/pack/pack-7d2041260250f79f5d2396f38959560e013c8d26.pack
-r--r--r--. 1 archiver archiver 1168133236 Feb 27 13:10 objects/pack/pack-7d2041260250f79f5d2396f38959560e013c8d26.pack

I'm looking at the code and I'm not entirely sure what PACKING_FACTOR
is:

    my $PACKING_FACTOR = 0.4;
    ...
    rotate_bytes => int((1024 * 1024 * 1024) / $PACKING_FACTOR),

Wouldn't that give us 2.7GB?
(1024*1024*1024)/0.4 = 2,684,354,560

It's possible I'm not following the logic right. It looks to be the same
code that properly sharded things on the initial import, so I'm not
sure.

-K
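The arithmetic in the message above can be reproduced directly. The following is a minimal Perl sketch that uses only the `$PACKING_FACTOR` value and the `rotate_bytes` expression quoted from V2Writable.pm, plus the pack size from the `ls -al` output; the variable `$pack_bytes` and the unit conversions are illustrative additions.

```perl
use strict;
use warnings;

# Constants as quoted from V2Writable.pm in the message above.
my $PACKING_FACTOR = 0.4;
my $rotate_bytes   = int((1024 * 1024 * 1024) / $PACKING_FACTOR);

# Size of the single pack file reported by `ls -al` (illustrative name).
my $pack_bytes = 1168133236;

# Print both values in bytes, decimal gigabytes and binary gibibytes,
# since "GB" is used loosely in the thread.
printf "rotate_bytes = %.0f bytes (%.2f GB, %.2f GiB)\n",
    $rotate_bytes, $rotate_bytes / 1e9, $rotate_bytes / 2**30;
printf "pack file    = %.0f bytes (%.2f GB, %.2f GiB)\n",
    $pack_bytes, $pack_bytes / 1e9, $pack_bytes / 2**30;
```

The threshold works out to 2684354560 bytes, which is about 2.68 GB in decimal units or 2.5 GiB, so a 1.1 GB pack is well below it when the two numbers are compared directly.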
* [PATCH] v2writable: fix epoch rollover on incremental imports
From: Eric Wong @ 2019-02-27 20:25 UTC
To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Wed, Feb 27, 2019 at 12:22:04AM +0000, Eric Wong wrote:
> > Eric Wong <e@80x24.org> wrote:
> > > Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > > > Eric:
> > > >
> > > > I noticed today that the LKML shard 6 has grown over 1.1 GB, which is the
> > > > size of other shards (0-5). I'm wondering if it will roll over to shard 7
> > > > automatically, or if there are other steps that need to be undertaken.
> > >
> > > It only counts bytes in *.pack files; so you might need to repack
> > > (or wait for gc to run via --auto).
> >
> > Btw, have you checked this? I've been wondering if 7 will show up, too.
>
> Yeah, I've repacked it, but we still haven't rolled over to 7. This is
> the latest on the server:
>
> $ git count-objects -v
> count: 18
> size: 88
> in-pack: 1036749
> packs: 1
> size-pack: 1169104
> prune-packable: 0
> garbage: 0
> size-garbage: 0
> $ ls -al objects/pack/pack-7d2041260250f79f5d2396f38959560e013c8d26.pack
> -r--r--r--. 1 archiver archiver 1168133236 Feb 27 13:10 objects/pack/pack-7d2041260250f79f5d2396f38959560e013c8d26.pack
>
> I'm looking at the code and I'm not entirely sure what PACKING_FACTOR
> is:
>
>     my $PACKING_FACTOR = 0.4;
>     ...
>     rotate_bytes => int((1024 * 1024 * 1024) / $PACKING_FACTOR),
>
> Wouldn't that give us 2.7GB?
> (1024*1024*1024)/0.4 = 2,684,354,560

Yes, we do all the calculations using the estimated unpacked size.
So the estimate is that 2.7GB unpacked (and uncompressed) is roughly
1GB packed.

> It's possible I'm not following the logic right. It looks to be the same
> code that properly sharded things on the initial import, so I'm not
> sure.

Almost; the problem was that the initial import never saw an existing
git repo with data in it. The incremental -mda/-watch path failed to
take into account the unpacked size of the existing data.

This fixes it:

---------8<-----------
Subject: [PATCH] v2writable: fix epoch rollover on incremental imports

All of our internal epoch rollover calculations are done using
the estimated unpacked (and uncompressed) size of the repo. The
importer instance needs to check that unpacked size before
selecting an epoch when an epoch already has packed data.

This bug did not impact the initial mass imports since we only
initialize the Import instance once-per-epoch and did not need
to take existing epochs into account.

Tested manually with -mda on a local clone of LKML

Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
---
 lib/PublicInbox/V2Writable.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 1f17fe2..b1d8095 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -588,7 +588,9 @@ sub importer {
 	if (defined $latest) {
 		my $git = PublicInbox::Git->new($latest);
 		my $packed_bytes = $git->packed_bytes;
-		if ($packed_bytes >= $self->{rotate_bytes}) {
+		my $unpacked_bytes = $packed_bytes / $PACKING_FACTOR;
+
+		if ($unpacked_bytes >= $self->{rotate_bytes}) {
 			$epoch = $max + 1;
 		} else {
 			$self->{epoch_max} = $max;
-- 
EW
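The effect of the patch can be checked against the numbers reported earlier in the thread. This is a standalone Perl sketch, not code from V2Writable.pm: the subs `rotate_old` and `rotate_new` are hypothetical names that mirror the comparison before and after the change, and `$rotate_bytes` is a plain variable standing in for `$self->{rotate_bytes}`.

```perl
use strict;
use warnings;

my $PACKING_FACTOR = 0.4;
# Stand-in for $self->{rotate_bytes}: an estimated *unpacked* size limit.
my $rotate_bytes = int((1024 * 1024 * 1024) / $PACKING_FACTOR);

# Hypothetical: the check before the patch compared the packed size
# directly against a threshold expressed in unpacked bytes.
sub rotate_old {
    my ($packed_bytes) = @_;
    return $packed_bytes >= $rotate_bytes ? 1 : 0;
}

# Hypothetical: the check after the patch first scales the packed size
# up to an estimated unpacked size, so both sides use the same unit.
sub rotate_new {
    my ($packed_bytes) = @_;
    my $unpacked_bytes = $packed_bytes / $PACKING_FACTOR;
    return $unpacked_bytes >= $rotate_bytes ? 1 : 0;
}

# Pack size of LKML shard 6 as reported in the thread.
my $packed = 1168133236;
printf "old check rotates: %d, new check rotates: %d\n",
    rotate_old($packed), rotate_new($packed);
```

With the 1168133236-byte pack, the old comparison stays false until the pack itself reaches about 2.68 GB, while the new one crosses the threshold once the pack exceeds roughly 1 GiB, which matches the size of shards 0-5.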
* Re: [PATCH] v2writable: fix epoch rollover on incremental imports
From: Konstantin Ryabitsev @ 2019-02-27 23:34 UTC
To: Eric Wong; +Cc: meta

On Wed, Feb 27, 2019 at 08:25:36PM +0000, Eric Wong wrote:
>All of our internal epoch rollover calculations are done using
>the estimated unpacked (and uncompressed) size of the repo. The
>importer instance needs to check that unpacked size before
>selecting an epoch when an epoch already has packed data.
>
>This bug did not impact the initial mass imports since we only
>initialize the Import instance once-per-epoch and did not need
>to take existing epochs into account.
>
>Tested manually with -mda on a local clone of LKML

Ding, this got us shard 7. :)

Thanks, Eric!

-K