From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.2 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF shortcircuit=no
 autolearn=ham autolearn_force=no version=3.4.6
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
 by dcvr.yhbt.net (Postfix) with ESMTP id 3C7141F534
 for ; Tue, 21 Mar 2023 23:07:47 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=80x24.org;
 s=selector1; t=1679440067;
 bh=SmDNo6UXzCwWgCvUFrU9FksCKh6Q0rXJvuYI63/pOvk=;
 h=From:To:Subject:Date:In-Reply-To:References:From;
 b=V9NOqiAnT25RMgiPteS2NJ6OkKhkoenHOvcTnxeMSXg+8b5ilm/NGsubsdu+CPkbM
 kka924bMpWoW9eGDyFhaxjRxtqqrMoZ8kg0U8EJU/onBmk354GsPOsbMxZZHkt8kZm
 G/SxLMnAKfKmaf4AUsFAeUQIj/a5xlwbwSRIxCW0=
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 18/28] cindex: check for checkpoint before giant messages
Date: Tue, 21 Mar 2023 23:07:33 +0000
Message-Id: <20230321230743.3020032-18-e@80x24.org>
In-Reply-To: <20230321230743.3020032-1-e@80x24.org>
References: <20230321230701.3019936-1-e@80x24.org>
 <20230321230743.3020032-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Giant messages may put us far over the batch limit if we're close to it.
---
 lib/PublicInbox/CodeSearchIdx.pm | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/CodeSearchIdx.pm b/lib/PublicInbox/CodeSearchIdx.pm
index b185731d..829fe28e 100644
--- a/lib/PublicInbox/CodeSearchIdx.pm
+++ b/lib/PublicInbox/CodeSearchIdx.pm
@@ -151,6 +151,14 @@ sub store_repo { # wq_do - returns docid
 	}
 }
 
+sub cidx_ckpoint ($$) {
+	my ($self, $msg) = @_;
+	progress($self, $msg);
+	return if $PublicInbox::Search::X{CLOEXEC_UNSET};
+	$self->{xdb}->commit_transaction;
+	$self->{xdb}->begin_transaction;
+}
+
 # sharded reader for `git log --pretty=format: --stdin'
 sub shard_index { # via wq_io_do
 	my ($self, $git, $n, $roots) = @_;
@@ -184,16 +192,18 @@ sub shard_index { # via wq_io_do
 			next;
 		}
 		$TXN_BYTES -= length($buf);
+		if ($TXN_BYTES <= 0) {
+			cidx_ckpoint($self, "[$n] $nr");
+			$TXN_BYTES = $batch_bytes - length($buf);
+		}
 		@$cmt{@FMT} = split(/\n/, $buf, scalar(@FMT));
 		$/ = "\n";
 		add_commit($self, $cmt);
 		last if $DO_QUIT;
 		++$nr;
-		if ($TXN_BYTES <= 0 && !$PublicInbox::Search::X{CLOEXEC_UNSET}) {
-			progress($self, "[$n] $nr");
-			$self->{xdb}->commit_transaction;
+		if ($TXN_BYTES <= 0) {
+			cidx_ckpoint($self, "[$n] $nr");
 			$TXN_BYTES = $batch_bytes;
-			$self->{xdb}->begin_transaction;
 		}
 		$/ = $FS;
 	}