unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
* [PATCH] v2writable: more debug output for `lei import' failures
@ 2024-10-08  5:18 Eric Wong
  0 siblings, 0 replies; only message in thread
From: Eric Wong @ 2024-10-08  5:18 UTC (permalink / raw)
  To: meta

A difficult-to-reproduce bug in `lei import' introduced in 4ff8e8d21ab5
(lei/store: stop shard workers + cat-file on idle, 2024-04-16)
causes {idx_shards} to not be recreated properly due to a shard
being locked while attempting to get a write lock on it:

  Exception: Unable to get write lock on /path/to/shard0: already locked

I'm not sure if the bug is from unnecessarily holding onto a
shard too long, or incorrectly attempting to open an
already-open shard.  In either case, hopefully a more complete
backtrace can be obtained since setting PERL5OPT=-MCarp=verbose
in the shared lei/store worker process isn't straightforward[*].

AFAIK, this doesn't affect normal v2 and -extindex activity,
only lei users who import mail[*]

[*] lei attempts to ensure read-after-write consistency
    across parallel instances to satisfy users who simultaneously
    import multiple IMAP mailboxes at once over high-latency networks.
    However, Xapian, SQLite, and multi-epoch git only work well with
    one writer-at-a-time so lei jumps through hoops to avoid
    introducing suprising local delays and waits.
---
 lib/PublicInbox/V2Writable.pm | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index dab5e64d..19bb66ad 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -23,6 +23,7 @@ use PublicInbox::Search;
 use PublicInbox::SearchIdx qw(log2stack is_ancestor check_size is_bad_blob);
 use IO::Handle; # ->autoflush
 use POSIX ();
+use Carp qw(confess);
 
 my $OID = qr/[a-f0-9]{40,}/;
 # an estimate of the post-packed size to the raw uncompressed size
@@ -90,9 +91,9 @@ sub init_inbox {
 
 sub idx_shard ($$) {
 	my ($self, $num) = @_;
-	use Carp qw(confess); # FIXME: lei_store bug somewhere..
+	# FIXME: lei_store bug somewhere..
 	confess 'BUG: {idx_shards} unset' if !$self->{idx_shards};
-	confess 'BUG: {idx_shards} empty'  if !@{$self->{idx_shards}};
+	confess 'BUG: {idx_shards} empty' if !@{$self->{idx_shards}};
 	$self->{idx_shards}->[$num % scalar(@{$self->{idx_shards}})];
 }
 
@@ -227,11 +228,18 @@ sub _idx_init { # with_umask callback
 
 	# need to create all shards before initializing msgmap FD
 	# idx_shards must be visible to all forked processes
-	my $max = $self->{shards} - 1;
-	my $idx = $self->{idx_shards} = [];
-	push @$idx, PublicInbox::SearchIdxShard->new($self, $_) for (0..$max);
-	$self->{-need_xapian} = $idx->[0]->need_xapian;
-
+	eval {
+		my $max = $self->{shards} - 1;
+		my $idx = $self->{idx_shards} = [];
+		for (0..$max) {
+			push @$idx, PublicInbox::SearchIdxShard->new($self, $_)
+		}
+		$self->{-need_xapian} = $idx->[0]->need_xapian;
+	};
+	if ($@) {
+		delete $self->{idx_shards};
+		confess $@;
+	}
 	# SearchIdxShard may do their own flushing, so don't scale
 	# until after forking
 	$self->{batch_bytes} *= $self->{shards} if $self->{parallel};

^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2024-10-08  5:18 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-08  5:18 [PATCH] v2writable: more debug output for `lei import' failures Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).