From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 534141FC11 for ; Fri, 27 Nov 2020 09:52:55 +0000 (UTC) From: Eric Wong To: meta@public-inbox.org Subject: [PATCH 05/12] nntp: NEWNEWS: speed up filtering Date: Fri, 27 Nov 2020 09:52:47 +0000 Message-Id: <20201127095254.21624-6-e@80x24.org> In-Reply-To: <20201127095254.21624-1-e@80x24.org> References: <20201127095254.21624-1-e@80x24.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit List-Id: With 50K newsgroups, the filtering phase goes from ~2000 seconds to ~90 MILLISECONDS by relying on the grep perlop. This moves ->over checking out of the main dispatch and amortizes the cost via long_response. (Fairly scheduled) long_response time in newnews_i now takes ~360 seconds as opposed to ~30 seconds before this change, however; but the initial filtering speedup eliminating 2000s is more than worth it. --- lib/PublicInbox/NNTP.pm | 49 ++++++++++++++++++++--------------------- 1 file changed, 24 insertions(+), 25 deletions(-) diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm index 68222196..5cbf5a16 100644 --- a/lib/PublicInbox/NNTP.pm +++ b/lib/PublicInbox/NNTP.pm @@ -294,23 +294,28 @@ sub ngpat2re (;$) { } sub newnews_i { - my ($self, $overs, $ts, $prev) = @_; - my $over = $overs->[0]; - my $msgs = $over->query_ts($ts, $$prev); - if (scalar @$msgs) { - more($self, '<' . - join(">\r\n<", map { $_->{mid} } @$msgs ). - '>'); - $$prev = $msgs->[-1]->{num}; - } else { - shift @$overs; - if (@$overs) { # continue onto next newsgroup - $$prev = 0; - return 1; - } else { # break out of the long response. - return; + my ($self, $names, $ts, $prev) = @_; + my $ngname = $names->[0]; + if (my $ibx = $self->{nntpd}->{groups}->{$ngname}) { + if (my $over = $ibx->over) { + my $msgs = $over->query_ts($ts, $$prev); + if (scalar @$msgs) { + more($self, '<' . + join(">\r\n<", + map { $_->{mid} } @$msgs ) . + '>'); + $$prev = $msgs->[-1]->{num}; + return 1; # continue on current group + } } } + shift @$names; + if (@$names) { # continue onto next newsgroup + $$prev = 0; + 1; + } else { # all done, break out of the long_response + undef; + } } sub cmd_newnews ($$$$;$$) { @@ -321,17 +326,11 @@ sub cmd_newnews ($$$$;$$) { my ($keep, $skip) = split('!', $newsgroups, 2); ngpat2re($keep); ngpat2re($skip); - my @overs; - foreach my $ng (@{$self->{nntpd}->{grouplist}}) { - $ng->{newsgroup} =~ $keep or next; - $ng->{newsgroup} =~ $skip and next; - my $over = $ng->over or next; - push @overs, $over; - }; - return '.' unless @overs; - + my @names = grep(!/$skip/, grep(/$keep/, + @{$self->{nntpd}->{groupnames}})); + return '.' unless scalar(@names); my $prev = 0; - long_response($self, \&newnews_i, \@overs, $ts, \$prev); + long_response($self, \&newnews_i, \@names, $ts, \$prev); } sub cmd_group ($$) {