* [PATCH 0/6] qspawn-related cleanups, comments, and minor fix
@ 2019-09-17 8:31 Eric Wong
2019-09-17 8:31 ` [PATCH 1/6] qspawn: remove return value from ->finish Eric Wong
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
Still haven't found anything which is causing problems on korg,
but found a bunch of worthwhile cleanups, clarifications
and perhaps minor efficiency improvements that I'm comfortable
putting on 'master'.
Eric Wong (6):
qspawn: remove return value from ->finish
qspawn: log errors for generic PSGI server users
qspawn: shorten lifetime of circular references
qspawn: improve variable naming and commenting
http: drop unused `$env' variable after delete
http: remove unnecessary delete
lib/PublicInbox/GitHTTPBackend.pm | 2 +-
lib/PublicInbox/HTTP.pm | 6 ++--
lib/PublicInbox/Qspawn.pm | 48 ++++++++++++-------------------
t/qspawn.t | 17 +++++++----
4 files changed, 33 insertions(+), 40 deletions(-)
--
EW
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/6] qspawn: remove return value from ->finish
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
@ 2019-09-17 8:31 ` Eric Wong
2019-09-17 8:31 ` [PATCH 2/6] qspawn: log errors for generic PSGI server users Eric Wong
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
We don't use the return value in real code since we do waitpid
asynchronously, now. So simplify our runtime code at the cost
of making our test slighly more complex.
---
lib/PublicInbox/Qspawn.pm | 1 -
t/qspawn.t | 17 +++++++++++------
2 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 54976b0..6b350f0 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -141,7 +141,6 @@ sub finish ($;$) {
_do_spawn(@$next);
}
}
- $self->{err}; # may be meaningless if non-blocking
}
sub start {
diff --git a/t/qspawn.t b/t/qspawn.t
index ab6e375..58c6feb 100644
--- a/t/qspawn.t
+++ b/t/qspawn.t
@@ -11,6 +11,12 @@ use_ok 'PublicInbox::Qspawn';
is($res, "err\nout\n", 'captured stderr and stdout');
}
+sub finish_err ($) {
+ my ($qsp) = @_;
+ $qsp->finish;
+ $qsp->{err};
+}
+
my $limiter = PublicInbox::Qspawn::Limiter->new(1);
{
my $x = PublicInbox::Qspawn->new([qw(true)]);
@@ -18,7 +24,7 @@ my $limiter = PublicInbox::Qspawn::Limiter->new(1);
$x->start($limiter, sub {
my ($rpipe) = @_;
is(0, sysread($rpipe, my $buf, 1), 'read zero bytes');
- ok(!$x->finish, 'no error on finish');
+ ok(!finish_err($x), 'no error on finish');
$run = 1;
});
is($run, 1, 'callback ran alright');
@@ -30,8 +36,7 @@ my $limiter = PublicInbox::Qspawn::Limiter->new(1);
$x->start($limiter, sub {
my ($rpipe) = @_;
is(0, sysread($rpipe, my $buf, 1), 'read zero bytes from false');
- my $err = $x->finish;
- ok($err, 'error on finish');
+ ok(finish_err($x), 'error on finish');
$run = 1;
});
is($run, 1, 'callback ran alright');
@@ -57,11 +62,11 @@ foreach my $cmd ([qw(sleep 1)], [qw(sh -c), 'sleep 1; false']) {
} (0..2);
if ($cmd->[-1] =~ /false\z/) {
- ok($s->finish, 'got error on false after sleep');
+ ok(finish_err($s), 'got error on false after sleep');
} else {
- ok(!$s->finish, 'no error on sleep');
+ ok(!finish_err($s), 'no error on sleep');
}
- ok(!$_->[0]->finish, "true $_->[1] succeeded") foreach @t;
+ ok(!finish_err($_->[0]), "true $_->[1] succeeded") foreach @t;
is_deeply([qw(sleep 0 1 2)], \@run, 'ran in order');
}
--
EW
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/6] qspawn: log errors for generic PSGI server users
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
2019-09-17 8:31 ` [PATCH 1/6] qspawn: remove return value from ->finish Eric Wong
@ 2019-09-17 8:31 ` Eric Wong
2019-09-17 8:31 ` [PATCH 3/6] qspawn: shorten lifetime of circular references Eric Wong
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
Generic PSGI servers have $env->{'psgi.errors'}, too,
so ensure they can log errors.
---
lib/PublicInbox/Qspawn.pm | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 6b350f0..844d50f 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -116,10 +116,9 @@ sub waitpid_err ($$) {
sub do_waitpid ($;$) {
my ($self, $env) = @_;
my $pid = $self->{pid};
- eval { # PublicInbox::DS may not be loaded
- PublicInbox::DS::dwaitpid($pid, \&waitpid_err, $self);
- $self->{env} = $env;
- };
+ $self->{env} = $env;
+ # PublicInbox::DS may not be loaded
+ eval { PublicInbox::DS::dwaitpid($pid, \&waitpid_err, $self) };
# done if we're running in PublicInbox::DS::EventLoop
if ($@) {
# non public-inbox-{httpd,nntpd} callers may block:
--
EW
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/6] qspawn: shorten lifetime of circular references
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
2019-09-17 8:31 ` [PATCH 1/6] qspawn: remove return value from ->finish Eric Wong
2019-09-17 8:31 ` [PATCH 2/6] qspawn: log errors for generic PSGI server users Eric Wong
@ 2019-09-17 8:31 ` Eric Wong
2019-09-17 8:31 ` [PATCH 4/6] qspawn: improve variable naming and commenting Eric Wong
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
All of these circular references are designed to clear
themselves, but these will make actual errors from Devel::Cycle
easier-to-spot.
The circular reference in the limiter {run_queue} is not a real
problem, but we can avoid storing the circular reference until
we actually need to spawn the child, reducing the size of the
Qspawn object while it's in the queue, slightly.
We also do not need to have redundant checks to spawn new
processes, we should only spawn new processes when they're
->start-ed or after waitpid reaps them.
---
lib/PublicInbox/Qspawn.pm | 28 +++++++++-------------------
1 file changed, 9 insertions(+), 19 deletions(-)
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 844d50f..10fe534 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -44,11 +44,11 @@ sub new ($$$;) {
}
sub _do_spawn {
- my ($self, $cb) = @_;
+ my ($self, $cb, $limiter) = @_;
my $err;
my ($cmd, $env, $opts) = @{$self->{args}};
my %opts = %{$opts || {}};
- my $limiter = $self->{limiter};
+ $self->{limiter} = $limiter;
foreach my $k (PublicInbox::Spawn::RLIMITS()) {
if (defined(my $rlimit = $limiter->{$k})) {
$opts{$k} = $rlimit;
@@ -94,21 +94,21 @@ sub waitpid_err ($$) {
$err = "W: waitpid($xpid, 0) => $pid: $!";
} # else should not be called with pid == 0
+ my $env = delete $self->{env};
+
# done, spawn whatever's in the queue
my $limiter = $self->{limiter};
my $running = --$limiter->{running};
- # limiter->{max} may change dynamically
- if (($running || $limiter->{running}) < $limiter->{max}) {
- if (my $next = shift @{$limiter->{run_queue}}) {
- _do_spawn(@$next);
+ if ($running < $limiter->{max}) {
+ if (my $next = shift(@{$limiter->{run_queue}})) {
+ _do_spawn(@$next, $limiter);
}
}
return unless $err;
$self->{err} = $err;
- my $env = $self->{env} or return;
- if (!$env->{'qspawn.quiet'}) {
+ if ($env && !$env->{'qspawn.quiet'}) {
log_err($env, join(' ', @{$self->{args}}) . ": $err");
}
}
@@ -132,22 +132,12 @@ sub finish ($;$) {
if (delete $self->{rpipe}) {
do_waitpid($self, $env);
}
-
- # limiter->{max} may change dynamically
- my $limiter = $self->{limiter};
- if ($limiter->{running} < $limiter->{max}) {
- if (my $next = shift @{$limiter->{run_queue}}) {
- _do_spawn(@$next);
- }
- }
}
sub start {
my ($self, $limiter, $cb) = @_;
- $self->{limiter} = $limiter;
-
if ($limiter->{running} < $limiter->{max}) {
- _do_spawn($self, $cb);
+ _do_spawn($self, $cb, $limiter);
} else {
push @{$limiter->{run_queue}}, [ $self, $cb ];
}
--
EW
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/6] qspawn: improve variable naming and commenting
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
` (2 preceding siblings ...)
2019-09-17 8:31 ` [PATCH 3/6] qspawn: shorten lifetime of circular references Eric Wong
@ 2019-09-17 8:31 ` Eric Wong
2019-09-17 8:31 ` [PATCH 5/6] http: drop unused `$env' variable after delete Eric Wong
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
Naming $start_cb consistently helps avoid confusing new readers,
and some comments will help with understanding flow
---
lib/PublicInbox/GitHTTPBackend.pm | 2 +-
lib/PublicInbox/Qspawn.pm | 16 +++++++++-------
2 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/lib/PublicInbox/GitHTTPBackend.pm b/lib/PublicInbox/GitHTTPBackend.pm
index c9a7cff..ec8e651 100644
--- a/lib/PublicInbox/GitHTTPBackend.pm
+++ b/lib/PublicInbox/GitHTTPBackend.pm
@@ -204,7 +204,7 @@ sub serve_smart {
$env{PATH_TRANSLATED} = "$git->{git_dir}/$path";
my $rdr = input_prepare($env) or return r(500);
my $qsp = PublicInbox::Qspawn->new([qw(git http-backend)], \%env, $rdr);
- $qsp->psgi_return($env, $limiter, sub {
+ $qsp->psgi_return($env, $limiter, sub { # parse_hdr
my ($r, $bref) = @_;
my $res = parse_cgi_headers($r, $bref) or return; # incomplete
$res->[0] == 403 ? serve_dumb($env, $git, $path) : $res;
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 10fe534..5a30064 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -44,7 +44,7 @@ sub new ($$$;) {
}
sub _do_spawn {
- my ($self, $cb, $limiter) = @_;
+ my ($self, $start_cb, $limiter) = @_;
my $err;
my ($cmd, $env, $opts) = @{$self->{args}};
my %opts = %{$opts || {}};
@@ -66,7 +66,7 @@ sub _do_spawn {
} else {
$self->{err} = $!;
}
- $cb->($self->{rpipe});
+ $start_cb->($self->{rpipe});
}
sub child_err ($) {
@@ -135,11 +135,11 @@ sub finish ($;$) {
}
sub start {
- my ($self, $limiter, $cb) = @_;
+ my ($self, $limiter, $start_cb) = @_;
if ($limiter->{running} < $limiter->{max}) {
- _do_spawn($self, $cb, $limiter);
+ _do_spawn($self, $start_cb, $limiter);
} else {
- push @{$limiter->{run_queue}}, [ $self, $cb ];
+ push @{$limiter->{run_queue}}, [ $self, $start_cb ];
}
}
@@ -175,11 +175,12 @@ reread:
}
};
$limiter ||= $def_limiter ||= PublicInbox::Qspawn::Limiter->new(32);
- $self->start($limiter, sub { # may run later, much later...
+ $self->start($limiter, sub { # start_cb, may run later, much later...
($rpipe) = @_; # popen_rd result
if ($async) {
# PublicInbox::HTTPD::Async->new($rpipe, $cb, $end)
$async = $async->($rpipe, $cb, $end);
+ # $cb will call ->async_pass or ->close
} else { # generic PSGI
$cb->() while $qx;
}
@@ -254,7 +255,7 @@ sub psgi_return {
$ret;
};
- my $wcb = delete $env->{'qspawn.wcb'};
+ my $wcb = delete $env->{'qspawn.wcb'}; # or PSGI server supplies it
my $async = $env->{'pi-httpd.async'};
my $cb = sub {
@@ -287,6 +288,7 @@ sub psgi_return {
if ($async) {
# PublicInbox::HTTPD::Async->new($rpipe, $cb, $end)
$async = $async->($rpipe, $cb, $end);
+ # $cb will call ->async_pass or ->close
} else { # generic PSGI
$cb->() while $rd_hdr;
}
--
EW
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 5/6] http: drop unused `$env' variable after delete
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
` (3 preceding siblings ...)
2019-09-17 8:31 ` [PATCH 4/6] qspawn: improve variable naming and commenting Eric Wong
@ 2019-09-17 8:31 ` Eric Wong
2019-09-17 8:31 ` [PATCH 6/6] http: remove unnecessary delete Eric Wong
2019-09-17 9:19 ` [MAYBE] kill rpipe early Eric Wong
6 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
And explain why we need to do that delete in a comment.
---
lib/PublicInbox/HTTP.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm
index b43ef87..cd95f4a 100644
--- a/lib/PublicInbox/HTTP.pm
+++ b/lib/PublicInbox/HTTP.pm
@@ -255,7 +255,7 @@ sub next_request ($) {
sub response_done_cb ($$) {
my ($self, $alive) = @_;
sub {
- my $env = delete $self->{env};
+ delete $self->{env}; # we're no longer busy
$self->write(\"0\r\n\r\n") if $alive == 2;
$self->write($alive ? \&next_request : \&close);
}
--
EW
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 6/6] http: remove unnecessary delete
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
` (4 preceding siblings ...)
2019-09-17 8:31 ` [PATCH 5/6] http: drop unused `$env' variable after delete Eric Wong
@ 2019-09-17 8:31 ` Eric Wong
2019-09-17 9:19 ` [MAYBE] kill rpipe early Eric Wong
6 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 8:31 UTC (permalink / raw)
To: meta
Only removing $http->{env} is needed to prevent circular
references. $env->{'psgix.io'} does not need to be deleted
since $env will no longer have any references to it when
->close returns.
---
lib/PublicInbox/HTTP.pm | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm
index cd95f4a..009b5ff 100644
--- a/lib/PublicInbox/HTTP.pm
+++ b/lib/PublicInbox/HTTP.pm
@@ -453,9 +453,7 @@ sub quit {
sub close {
my $self = $_[0];
- if (my $env = delete $self->{env}) {
- delete $env->{'psgix.io'}; # prevent circular references
- }
+ delete $self->{env}; # prevent circular references
if (my $forward = delete $self->{forward}) {
eval { $forward->close };
err($self, "forward ->close error: $@") if $@;
--
EW
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [MAYBE] kill rpipe early
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
` (5 preceding siblings ...)
2019-09-17 8:31 ` [PATCH 6/6] http: remove unnecessary delete Eric Wong
@ 2019-09-17 9:19 ` Eric Wong
2019-09-17 9:40 ` Eric Wong
6 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2019-09-17 9:19 UTC (permalink / raw)
To: meta
Eric Wong <e@80x24.org> wrote:
> Still haven't found anything which is causing problems on korg,
> but found a bunch of worthwhile cleanups, clarifications
> and perhaps minor efficiency improvements that I'm comfortable
> putting on 'master'.
This might help narrow down where the FD leak is if it stops
pipes (but not regular files) from leaking. However, if it
fixes things, I also don't understand why it'd work, but I'm
mostly running on fumes at this point...
diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm
index 5a30064..3bf5c03 100644
--- a/lib/PublicInbox/Qspawn.pm
+++ b/lib/PublicInbox/Qspawn.pm
@@ -224,7 +224,7 @@ sub filter_fh ($$) {
# immediately (or streamed via ->getline (pull-based)).
sub psgi_return {
my ($self, $env, $limiter, $parse_hdr) = @_;
- my ($fh, $rpipe);
+ my $fh;
my $end = sub {
my $err = $_[0]; # $!
log_err($env, "psgi_return: $err") if defined($err);
@@ -233,6 +233,7 @@ sub psgi_return {
};
my $buf = '';
+ my $rpipe;
my $rd_hdr = sub {
# typically used for reading CGI headers
# we must loop until EAGAIN for EPOLLET in HTTPD/Async.pm
@@ -275,6 +276,7 @@ sub psgi_return {
$fh = $wcb->($r); # scalar @$r == 2
$fh = filter_fh($fh, $filter) if $filter;
$async->async_pass($env->{'psgix.io'}, $fh, \$buf);
+ $rpipe = undef;
} else { # for synchronous PSGI servers
require PublicInbox::GetlineBody;
$r->[2] = PublicInbox::GetlineBody->new($rpipe, $end,
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [MAYBE] kill rpipe early
2019-09-17 9:19 ` [MAYBE] kill rpipe early Eric Wong
@ 2019-09-17 9:40 ` Eric Wong
0 siblings, 0 replies; 9+ messages in thread
From: Eric Wong @ 2019-09-17 9:40 UTC (permalink / raw)
To: meta
Eric Wong <e@80x24.org> wrote:
> index 5a30064..3bf5c03 100644
> --- a/lib/PublicInbox/Qspawn.pm
> +++ b/lib/PublicInbox/Qspawn.pm
> @@ -224,7 +224,7 @@ sub filter_fh ($$) {
> # immediately (or streamed via ->getline (pull-based)).
> sub psgi_return {
> my ($self, $env, $limiter, $parse_hdr) = @_;
> - my ($fh, $rpipe);
> + my $fh;
> my $end = sub {
> my $err = $_[0]; # $!
> log_err($env, "psgi_return: $err") if defined($err);
> @@ -233,6 +233,7 @@ sub psgi_return {
> };
>
> my $buf = '';
> + my $rpipe;
> my $rd_hdr = sub {
> # typically used for reading CGI headers
> # we must loop until EAGAIN for EPOLLET in HTTPD/Async.pm
> @@ -275,6 +276,7 @@ sub psgi_return {
> $fh = $wcb->($r); # scalar @$r == 2
> $fh = filter_fh($fh, $filter) if $filter;
> $async->async_pass($env->{'psgix.io'}, $fh, \$buf);
> + $rpipe = undef;
# if the above "$rpipe = undef" fixes the pipe leak,
# the following ought to fix the regular file leak.
# HOWEVER, there is still a likely memory leak
# even if all the FD leaks are fixed, so keeping the
# FD leaks would help find and fix the memory leak
delete $env->{'psgi.input'};
> } else { # for synchronous PSGI servers
> require PublicInbox::GetlineBody;
> $r->[2] = PublicInbox::GetlineBody->new($rpipe, $end,
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2019-09-17 9:40 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-09-17 8:31 [PATCH 0/6] qspawn-related cleanups, comments, and minor fix Eric Wong
2019-09-17 8:31 ` [PATCH 1/6] qspawn: remove return value from ->finish Eric Wong
2019-09-17 8:31 ` [PATCH 2/6] qspawn: log errors for generic PSGI server users Eric Wong
2019-09-17 8:31 ` [PATCH 3/6] qspawn: shorten lifetime of circular references Eric Wong
2019-09-17 8:31 ` [PATCH 4/6] qspawn: improve variable naming and commenting Eric Wong
2019-09-17 8:31 ` [PATCH 5/6] http: drop unused `$env' variable after delete Eric Wong
2019-09-17 8:31 ` [PATCH 6/6] http: remove unnecessary delete Eric Wong
2019-09-17 9:19 ` [MAYBE] kill rpipe early Eric Wong
2019-09-17 9:40 ` Eric Wong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).