From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] lei/store: auto-commit for long-running imports
Date: Fri, 15 Nov 2024 22:23:15 +0000
Message-ID: <20241115222315.2761178-1-e@80x24.org>

DBD::SQLite (not SQLite itself) sets a 30s busy_timeout which we
currently do not override.  This means readers can wait up to 30s
for a writer to finish.  For long imports exceeding 30s, SQLite
readers (for deduplication during import) can die with a
"database is locked" message while the lei/store process holds a
long write transaction open.

Forcing commits every 5s ought to fix the problem in most cases,
assuming commits themselves happen in under 25s (which isn't
always true on slow devices).  5 seconds was chosen since it
matches the default commit interval on ext* filesystems and the
default of the vm.dirty_writeback_centisecs sysctl.

This should fix many (but not all) failures around long-running
`lei import' processes.
---
 lib/PublicInbox/LeiStore.pm | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 9551da5f..3ae9f38f 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -241,6 +241,12 @@ sub sto_export_kw ($$$) {
 	}
 }
 
+# commit every 5s to get under the default DBD::SQLite timeout of 30s
+sub _schedule_checkpoint ($) {
+	my ($self) = @_;
+	add_uniq_timer("$self-checkpoint", 5, \&_commit, $self, 'barrier');
+}
+
 # vmd = { kw => [ qw(seen ...) ], L => [ qw(inbox ...) ] }
 sub set_eml_vmd {
 	my ($self, $eml, $vmd, $docids) = @_;
@@ -250,6 +256,7 @@ sub set_eml_vmd {
 		$eidx->idx_shard($docid)->ipc_do('set_vmd', $docid, $vmd);
 		sto_export_kw($self, $docid, $vmd);
 	}
+	_schedule_checkpoint $self;
 	$docids;
 }
 
@@ -260,6 +267,7 @@ sub add_eml_vmd {
 	for my $docid (@docids) {
 		$eidx->idx_shard($docid)->ipc_do('add_vmd', $docid, $vmd);
 	}
+	_schedule_checkpoint $self;
 	\@docids;
 }
 
@@ -270,6 +278,7 @@ sub remove_eml_vmd { # remove just the VMD
 	for my $docid (@docids) {
 		$eidx->idx_shard($docid)->ipc_do('remove_vmd', $docid, $vmd);
 	}
+	_schedule_checkpoint $self;
 	\@docids;
 }
 
@@ -319,6 +328,7 @@ sub remove_eml {
 	}
 	$git->async_wait_all;
 	remove_docids($self, @docids);
+	_schedule_checkpoint $self;
 	\@docids;
 }
 
@@ -343,6 +353,7 @@ sub _add_vmd ($$$$) {
 
 sub _docids_and_maybe_kw ($$) {
 	my ($self, $docids) = @_;
+	_schedule_checkpoint $self;
 	return $docids unless wantarray;
 	my (@kw, $idx, @tmp);
 	for my $num (@$docids) { # likely only 1, unless ContentHash changes
@@ -376,6 +387,7 @@ sub _reindex_1 { # git->cat_async callback
 	} else {
 		warn("E: $type $hex\n");
 	}
+	_schedule_checkpoint $self;
 }
 
 sub reindex_art {
@@ -469,6 +481,7 @@ sub add_eml {
 		my $idx = $eidx->idx_shard($smsg->{num});
 		$idx->index_eml($eml, $smsg);
 		_add_vmd($self, $idx, $smsg->{num}, $vmd) if $vmd;
+		_schedule_checkpoint $self;
 		wantarray ? ($smsg, []) : $smsg;
 	}
 }
@@ -514,6 +527,7 @@ sub update_xvmd {
 	my ($eidx, $tl) = eidx_init($self);
 	my $oidx = $eidx->{oidx};
 	my %seen;
+	_schedule_checkpoint $self;
 	for my $oid (keys %$xoids) {
 		my $docid = oid2docid($self, $oid) // next;
 		delete $xoids->{$oid};
@@ -551,6 +565,8 @@ sub set_xvmd {
 	my $oidx = $eidx->{oidx};
 	my %seen;
 
+	_schedule_checkpoint $self;
+
 	# see if we can just update existing docs
 	for my $oid (keys %$xoids) {
 		my $docid = oid2docid($self, $oid) // next;
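
For readers unfamiliar with the timer pattern used by
_schedule_checkpoint, here is a minimal standalone sketch of the
idea.  It is not the LeiStore code: uniq_timer_sketch, advance_to
and commit_sketch are hypothetical names, the clock is simulated,
and it assumes a named timer ignores repeat requests while one is
already pending, so that many writes in a row yield one commit
per 5s window rather than one per write.

```perl
#!/usr/bin/perl
# Sketch only: a simulated clock stands in for a real event loop.
use strict;
use warnings;

my %pending;	# timer name => [ deadline, callback, @args ]
my $now = 0;	# simulated time in seconds
my @commits;	# simulated times at which a commit ran

# hypothetical stand-in for a unique timer: the first request for a
# name wins; later requests are ignored until that timer has fired
sub uniq_timer_sketch {
	my ($name, $secs, $cb, @args) = @_;
	$pending{$name} //= [ $now + $secs, $cb, @args ];
}

# advance the simulated clock and fire any timers which are due
sub advance_to {
	($now) = @_;
	for my $name (sort keys %pending) {
		my ($deadline, $cb, @args) = @{$pending{$name}};
		next if $deadline > $now;
		delete $pending{$name};
		$cb->(@args);
	}
}

# stands in for the real commit; only records when it ran
sub commit_sketch { push @commits, $now }

# an "import" touching the store once per second for 12 seconds
for my $t (0..11) {
	advance_to($t);
	uniq_timer_sketch('checkpoint', 5, \&commit_sketch);
}
advance_to(15); # let the last pending timer fire
print "commits at: @commits\n"; # prints "commits at: 5 10 15"
```

Twelve writes produce three commits, so under these assumptions a
write transaction stays open for roughly 5s plus the duration of
the commit itself, which is where the 25s headroom against the
30s busy_timeout comes from.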