From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 915961FA0D
	for ; Fri, 7 Aug 2020 10:52:19 +0000 (UTC)
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 3/5] index: max out XAPIAN_FLUSH_THRESHOLD if using --batch-size
Date: Fri, 7 Aug 2020 10:52:16 +0000
Message-Id: <20200807105218.16843-4-e@yhbt.net>
In-Reply-To: <20200807105218.16843-1-e@yhbt.net>
References: <20200807105218.16843-1-e@yhbt.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

If XAPIAN_FLUSH_THRESHOLD is unset, Xapian will default to
10000.  That limits the effectiveness of users specifying
extremely large values of --batch-size.

While we're at it, localize the changes to globals since
-index may be eval-ed in tests (and perhaps production code in
the future).
---
 script/public-inbox-index | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/script/public-inbox-index b/script/public-inbox-index
index 56df5bfe..e2bca16e 100755
--- a/script/public-inbox-index
+++ b/script/public-inbox-index
@@ -42,11 +42,16 @@ if (defined $max_size) {
 		die "`publicInbox.indexMaxSize=$max_size' not parsed\n";
 }
 
-if (my $bs = $opt->{batchsize} // $cfg->{lc('publicInbox.indexBatchSize')}) {
+my $bs = $opt->{batchsize} // $cfg->{lc('publicInbox.indexBatchSize')};
+if (defined $bs) {
 	PublicInbox::Admin::parse_unsigned(\$bs) or
 		die "`publicInbox.indexBatchSize=$bs' not parsed\n";
-	$PublicInbox::SearchIdx::BATCH_BYTES = $bs;
 }
+local $PublicInbox::SearchIdx::BATCH_BYTES = $bs if defined($bs);
+
+# out-of-the-box builds of Xapian 1.4.x are still limited to 32-bit
+# https://getting-started-with-xapian.readthedocs.io/en/latest/concepts/indexing/limitations.html
+local $ENV{XAPIAN_FLUSH_THRESHOLD} ||= '4294967295' if defined($bs);
 
 my $s = $opt->{sequentialshard} //
 		$cfg->{lc('publicInbox.indexSequentialShard')};