unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
* [PATCH 0/4] miscidx: lazy transactions to fix tests
@ 2021-01-25  6:41 Eric Wong
  2021-01-25  6:41 ` [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Eric Wong @ 2021-01-25  6:41 UTC (permalink / raw)
  To: meta

1/4 fixes a theoretical problem while search of a fix for 2/4
2/4 fixes a real problem with t/lei.t on a weak machine
The rest are some yak shaves I noticed while chasing 2/4

Eric Wong (4):
  lei: use Time::HiRes stat for nanosecond resolution
  miscidx: switch to lazy transactions
  spawn: split() on regexp, not a literal string
  use defined-or in a few more places

 lib/PublicInbox/ExtSearchIdx.pm |  3 +--
 lib/PublicInbox/LEI.pm          |  1 +
 lib/PublicInbox/MiscIdx.pm      | 19 ++++++++++++-------
 lib/PublicInbox/Spawn.pm        |  9 ++++-----
 lib/PublicInbox/TestCommon.pm   | 10 ++++------
 lib/PublicInbox/V2Writable.pm   |  4 ----
 lib/PublicInbox/Watch.pm        |  2 +-
 t/gcf2.t                        |  2 +-
 t/imap_tracker.t                |  2 +-
 t/miscsearch.t                  |  1 -
 t/run.perl                      |  2 +-
 xt/create-many-inboxes.t        |  3 +--
 12 files changed, 27 insertions(+), 31 deletions(-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution
  2021-01-25  6:41 [PATCH 0/4] miscidx: lazy transactions to fix tests Eric Wong
@ 2021-01-25  6:41 ` Eric Wong
  2021-01-25  6:41 ` [PATCH 2/4] miscidx: switch to lazy transactions Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2021-01-25  6:41 UTC (permalink / raw)
  To: meta

The default stat() lacks subsecond granularity and may
lead to config updates being ignored.
---
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 09eac58c..effc6c52 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -24,6 +24,7 @@ use PublicInbox::DS qw(now dwaitpid);
 use PublicInbox::Spawn qw(spawn popen_rd);
 use PublicInbox::OnDestroy;
 use Text::Wrap qw(wrap);
+use Time::HiRes qw(stat); # ctime comparisons for config cache
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/4] miscidx: switch to lazy transactions
  2021-01-25  6:41 [PATCH 0/4] miscidx: lazy transactions to fix tests Eric Wong
  2021-01-25  6:41 ` [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution Eric Wong
@ 2021-01-25  6:41 ` Eric Wong
  2021-01-25  6:41 ` [PATCH 3/4] spawn: split() on regexp, not a literal string Eric Wong
  2021-01-25  6:41 ` [PATCH 4/4] use defined-or in a few more places Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2021-01-25  6:41 UTC (permalink / raw)
  To: meta

This fixes a sporadic failure on a 1/2 core VM where
"git cat-file --batch" hasn't started up by the time
$cleanup->() destroys the ALL.git directory in t/lei.t
(but not t/lei-oneshot.t).

This happens because dwaitpid() runs inside the event loop
asynchronously and we were able to return to the client before
the cat-file process could even start.

I could not reproduce this failure on my usual 4-core
workstation via "schedtool -a 0x1" to force the entire
test to use a single core.

Lazy transactions matches OverIdx and SearchIdx behavior, and
I've verified this lets us avoid problems with old Xapian
versions (on CentOS 7.x) which failed to set FD_CLOEXEC.
---
 lib/PublicInbox/ExtSearchIdx.pm |  3 +--
 lib/PublicInbox/MiscIdx.pm      | 19 ++++++++++++-------
 lib/PublicInbox/V2Writable.pm   |  4 ----
 t/miscsearch.t                  |  1 -
 4 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index c782a62a..9b7340df 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -1003,8 +1003,7 @@ sub idx_init { # similar to V2Writable
 	$self->with_umask(\&_idx_init, $self, $opt);
 	$self->{oidx}->begin_lazy;
 	$self->{oidx}->eidx_prep;
-	$self->git->batch_prepare;
-	$self->{midx}->begin_txn;
+	$self->{midx}->create_xdb if @new;
 }
 
 sub _watch_commit { # PublicInbox::DS::add_timer callback
diff --git a/lib/PublicInbox/MiscIdx.pm b/lib/PublicInbox/MiscIdx.pm
index ab5e029a..f5a374b2 100644
--- a/lib/PublicInbox/MiscIdx.pm
+++ b/lib/PublicInbox/MiscIdx.pm
@@ -39,25 +39,30 @@ sub new {
 	}, $class;
 }
 
-sub begin_txn {
+sub _begin_txn ($) {
 	my ($self) = @_;
-	croak 'BUG: already in txn' if $self->{xdb}; # XXX make lazy?
 	my $wdb = $PublicInbox::Search::X{WritableDatabase};
 	my $xdb = eval { $wdb->new($self->{mi_dir}, $self->{flags}) };
 	croak "Failed opening $self->{mi_dir}: $@" if $@;
-	$self->{xdb} = $xdb;
 	$xdb->begin_transaction;
+	$xdb;
 }
 
 sub commit_txn {
 	my ($self) = @_;
-	croak 'BUG: not in txn' unless $self->{xdb}; # XXX make lazy?
-	delete($self->{xdb})->commit_transaction;
+	my $xdb = delete $self->{xdb} or return;
+	$xdb->commit_transaction;
+}
+
+sub create_xdb {
+	my ($self) = @_;
+	$self->{xdb} //= _begin_txn($self);
+	commit_txn($self);
 }
 
 sub remove_eidx_key {
 	my ($self, $eidx_key) = @_;
-	my $xdb = $self->{xdb};
+	my $xdb = $self->{xdb} //= _begin_txn($self);
 	my $head = $xdb->postlist_begin('Q'.$eidx_key);
 	my $tail = $xdb->postlist_end('Q'.$eidx_key);
 	my @docids; # only one, unless we had bugs
@@ -74,7 +79,7 @@ sub remove_eidx_key {
 sub index_ibx {
 	my ($self, $ibx) = @_;
 	my $eidx_key = $ibx->eidx_key;
-	my $xdb = $self->{xdb};
+	my $xdb = $self->{xdb} //= _begin_txn($self);
 	# Q = uniQue in Xapian terminology
 	my $head = $xdb->postlist_begin('Q'.$eidx_key);
 	my $tail = $xdb->postlist_end('Q'.$eidx_key);
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 0104f87a..7f9342b0 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -623,10 +623,6 @@ shard[$i] bad echo:$echo != $i waiting for txn commit
 			$dbh->commit;
 			$dbh->begin_work;
 		}
-		if ($midx) {
-			$self->git->batch_prepare;
-			$midx->begin_txn;
-		}
 	}
 	$self->{total_bytes} += $self->{transact_bytes};
 	$self->{transact_bytes} = 0;
diff --git a/t/miscsearch.t b/t/miscsearch.t
index 0424328d..413579fb 100644
--- a/t/miscsearch.t
+++ b/t/miscsearch.t
@@ -27,7 +27,6 @@ my $eidx = { xpfx => "$tmp/eidx", -no_fsync => 1 }; # mock ExtSearchIdx
 	});
 	$v1->init_inbox;
 	my $mi = PublicInbox::MiscIdx->new($eidx);
-	$mi->begin_txn;
 	$mi->index_ibx($v1);
 	$mi->commit_txn;
 }

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/4] spawn: split() on regexp, not a literal string
  2021-01-25  6:41 [PATCH 0/4] miscidx: lazy transactions to fix tests Eric Wong
  2021-01-25  6:41 ` [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution Eric Wong
  2021-01-25  6:41 ` [PATCH 2/4] miscidx: switch to lazy transactions Eric Wong
@ 2021-01-25  6:41 ` Eric Wong
  2021-01-25  6:41 ` [PATCH 4/4] use defined-or in a few more places Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2021-01-25  6:41 UTC (permalink / raw)
  To: meta

It doesn't appear Perl (as of 5.32.x) has any internal
optimization for splitting on a single-byte, so give it
a regexp instead of letting it compile and discard a
new one every single time.
---
 lib/PublicInbox/Spawn.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 376d2190..86f66605 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -343,7 +343,7 @@ undef $fdpass;
 sub which ($) {
 	my ($file) = @_;
 	return $file if index($file, '/') >= 0;
-	foreach my $p (split(':', $ENV{PATH})) {
+	for my $p (split(/:/, $ENV{PATH})) {
 		$p .= "/$file";
 		return $p if -x $p;
 	}

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 4/4] use defined-or in a few more places
  2021-01-25  6:41 [PATCH 0/4] miscidx: lazy transactions to fix tests Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-25  6:41 ` [PATCH 3/4] spawn: split() on regexp, not a literal string Eric Wong
@ 2021-01-25  6:41 ` Eric Wong
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2021-01-25  6:41 UTC (permalink / raw)
  To: meta

Mainly around fork() calls, but some nearby places as well.
---
 lib/PublicInbox/Spawn.pm      |  7 +++----
 lib/PublicInbox/TestCommon.pm | 10 ++++------
 lib/PublicInbox/Watch.pm      |  2 +-
 t/gcf2.t                      |  2 +-
 t/imap_tracker.t              |  2 +-
 t/run.perl                    |  2 +-
 xt/create-many-inboxes.t      |  3 +--
 7 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 86f66605..ef4885c1 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -352,8 +352,7 @@ sub which ($) {
 
 sub spawn ($;$$) {
 	my ($cmd, $env, $opts) = @_;
-	my $f = which($cmd->[0]);
-	defined $f or die "$cmd->[0]: command not found\n";
+	my $f = which($cmd->[0]) // die "$cmd->[0]: command not found\n";
 	my @env;
 	$opts ||= {};
 
@@ -365,7 +364,7 @@ sub spawn ($;$$) {
 	for my $child_fd (0..2) {
 		my $parent_fd = $opts->{$child_fd};
 		if (defined($parent_fd) && $parent_fd !~ /\A[0-9]+\z/) {
-			defined(my $fd = fileno($parent_fd)) or
+			my $fd = fileno($parent_fd) //
 					die "$parent_fd not an IO GLOB? $!";
 			$parent_fd = $fd;
 		}
@@ -374,7 +373,7 @@ sub spawn ($;$$) {
 	my $rlim = [];
 
 	foreach my $l (@RLIMITS) {
-		defined(my $v = $opts->{$l}) or next;
+		my $v = $opts->{$l} // next;
 		my $r = eval "require BSD::Resource; BSD::Resource::$l();";
 		unless (defined $r) {
 			warn "$l undefined by BSD::Resource: $@\n";
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 16ae2650..40c2dc9e 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -254,8 +254,7 @@ sub run_script ($;$$) {
 		my $cmd = [ key2script($key), @argv ];
 		my $pid = PublicInbox::Spawn::spawn($cmd, $env, $spawn_opt);
 		if (defined $pid) {
-			my $r = waitpid($pid, 0);
-			defined($r) or die "waitpid: $!";
+			my $r = waitpid($pid, 0) // die "waitpid: $!";
 			$r == $pid or die "waitpid: expected $pid, got $r";
 		}
 	} else { # localize and run everything in the same process:
@@ -367,7 +366,7 @@ sub start_script {
 			}
 		}
 		if (@paths) {
-			defined($tail_pid = fork) or die "fork: $!\n";
+			$tail_pid = fork // die "fork: $!";
 			if ($tail_pid == 0) {
 				# make sure files exist, first
 				open my $fh, '>>', $_ for @paths;
@@ -378,7 +377,7 @@ sub start_script {
 			wait_for_tail($tail_pid, scalar @paths);
 		}
 	}
-	defined(my $pid = fork) or die "fork: $!\n";
+	my $pid = fork // die "fork: $!\n";
 	if ($pid == 0) {
 		eval { PublicInbox::DS->Reset };
 		# pretend to be systemd (cf. sd_listen_fds(3))
@@ -440,8 +439,7 @@ sub join {
 	my ($self, $sig) = @_;
 	my $pid = delete $self->{pid} or return;
 	CORE::kill($sig, $pid) if defined $sig;
-	my $ret = waitpid($pid, 0);
-	defined($ret) or die "waitpid($pid): $!";
+	my $ret = waitpid($pid, 0) // die "waitpid($pid): $!";
 	$ret == $pid or die "waitpid($pid) != $ret";
 }
 
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index 1de5018d..2b44ba43 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -626,7 +626,7 @@ sub imap_idle_fork ($$) {
 	my ($url, $intvl) = @$url_intvl;
 	pipe(my ($r, $w)) or die "pipe: $!";
 	my $seed = rand(0xffffffff);
-	defined(my $pid = fork) or die "fork: $!";
+	my $pid = fork // die "fork: $!";
 	if ($pid == 0) {
 		srand($seed);
 		eval { Net::SSLeay::randomize() };
diff --git a/t/gcf2.t b/t/gcf2.t
index fa907c8b..d12a4420 100644
--- a/t/gcf2.t
+++ b/t/gcf2.t
@@ -114,7 +114,7 @@ SKIP: {
 		$w->blocking($blk);
 		seek($fh, 0, SEEK_SET) or BAIL_OUT "seek: $!";
 		truncate($fh, 0) or BAIL_OUT "truncate: $!";
-		defined(my $pid = fork) or BAIL_OUT "fork: $!";
+		my $pid = fork // BAIL_OUT "fork: $!";
 		if ($pid == 0) {
 			close $w;
 			tick; # wait for parent to block on writev
diff --git a/t/imap_tracker.t b/t/imap_tracker.t
index be7c6e65..90dea99f 100644
--- a/t/imap_tracker.t
+++ b/t/imap_tracker.t
@@ -29,7 +29,7 @@ SKIP: {
 	diag "TEST_STRESS_NPROC=$nproc TEST_STRESS_NR=$nr";
 	require POSIX;
 	for my $n (1..$nproc) {
-		defined(my $pid = fork) or BAIL_OUT "fork: $!";
+		my $pid = fork // BAIL_OUT "fork: $!";
 		if ($pid == 0) {
 			my $url = "imap://example.com/INBOX.$$";
 			my $uidval = time;
diff --git a/t/run.perl b/t/run.perl
index b7cb988b..96db3045 100755
--- a/t/run.perl
+++ b/t/run.perl
@@ -128,7 +128,7 @@ my $eof; # we stop respawning if true
 
 my $start_worker = sub {
 	my ($i, $j, $rd, $todo) = @_;
-	defined(my $pid = fork) or DIE "fork: $!";
+	my $pid = fork // DIE "fork: $!";
 	if ($pid == 0) {
 		$worker = $$;
 		while (1) {
diff --git a/xt/create-many-inboxes.t b/xt/create-many-inboxes.t
index 0c2de40d..f44334cc 100644
--- a/xt/create-many-inboxes.t
+++ b/xt/create-many-inboxes.t
@@ -68,7 +68,7 @@ my @children;
 for my $i (1..$nproc) {
 	my ($r, $w);
 	pipe($r, $w) or BAIL_OUT $!;
-	my $pid = fork;
+	my $pid = fork // BAIL_OUT "fork: $!";
 	if ($pid == 0) {
 		close $w;
 		while (my $i = <$r>) {
@@ -77,7 +77,6 @@ for my $i (1..$nproc) {
 		}
 		_exit(0);
 	}
-	defined $pid or BAIL_OUT "fork: $!";
 	close $r or BAIL_OUT $!;
 	push @children, [ $w, $pid ];
 	$w->autoflush(1);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-01-25  6:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-25  6:41 [PATCH 0/4] miscidx: lazy transactions to fix tests Eric Wong
2021-01-25  6:41 ` [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution Eric Wong
2021-01-25  6:41 ` [PATCH 2/4] miscidx: switch to lazy transactions Eric Wong
2021-01-25  6:41 ` [PATCH 3/4] spawn: split() on regexp, not a literal string Eric Wong
2021-01-25  6:41 ` [PATCH 4/4] use defined-or in a few more places Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).