* [PATCH 0/4] http + mbox: tiny optimizations
@ 2016-06-25 0:45 Eric Wong
2016-06-25 0:45 ` [PATCH 1/4] http: always yield on getline/body Eric Wong
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Eric Wong @ 2016-06-25 0:45 UTC (permalink / raw)
To: meta
For the gigantic $INBOX/all.mbox.gz response, this series seems to
improve throughput slightly, from roughly 290K/s to roughly 330K/s
(about 14%), when fetching from a ~750MB aggressively-packed inbox.
Eric Wong (4):
  http: always yield on getline/body
  evcleanup: micro-optimize asap function
  mbox: reduce small packets for gzipped mboxes
  http: cork chunked responses for small savings

 lib/PublicInbox/EvCleanup.pm | 42 +++++++++++++++++++++++++++++++++---------
 lib/PublicInbox/HTTP.pm      | 14 ++++++--------
 lib/PublicInbox/Mbox.pm      | 23 ++++++++++-------------
 3 files changed, 49 insertions(+), 30 deletions(-)
* [PATCH 1/4] http: always yield on getline/body
From: Eric Wong @ 2016-06-25 0:45 UTC
To: meta
We want to maximize fairness for large responses which may
download the entire mbox.  Yield to the event loop after every
getline call instead of looping until a 0.1s deadline expires.
---
lib/PublicInbox/HTTP.pm | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm
index 800b240..c141fc8 100644
--- a/lib/PublicInbox/HTTP.pm
+++ b/lib/PublicInbox/HTTP.pm
@@ -16,7 +16,6 @@ use Fcntl qw(:seek);
use Plack::HTTPParser qw(parse_http_request); # XS or pure Perl
use HTTP::Status qw(status_message);
use HTTP::Date qw(time2str);
-use Time::HiRes qw(clock_gettime CLOCK_MONOTONIC);
use Scalar::Util qw(weaken);
use IO::File;
use constant {
@@ -26,8 +25,6 @@ use constant {
CHUNK_MAX_HDR => 256,
};
-sub now () { clock_gettime(CLOCK_MONOTONIC) }
-
# FIXME: duplicated code with NNTP.pm, layering violation
my $WEAKEN = {}; # string(inbox) -> inbox
my $weakt;
@@ -270,17 +267,15 @@ sub getline_response {
my $forward = $self->{forward};
# limit our own running time for fairness with other
# clients and to avoid buffering too much:
- my $end = now() + 0.1;
while ($forward && defined(my $buf = $forward->getline)) {
$write->($buf);
last if $self->{closed};
if ($self->{write_buf_size}) {
$self->write($self->{pull});
- return;
- } elsif (now() > $end) {
+ } else {
PublicInbox::EvCleanup::asap($self->{pull});
- return;
}
+ return;
}
$self->{forward} = $self->{pull} = undef;
$forward->close if $forward; # avoid recursion
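The effect of yielding after every getline call can be sketched without
an event loop.  This is only an illustration, not the HTTP.pm code: the
@runq array stands in for the EvCleanup "asap" queue, and stream() and
run_all() are hypothetical names invented for this example.

```perl
use strict;
use warnings;

my @runq;    # stand-in for the event loop's "asap" queue (assumption)
sub asap { push @runq, $_[0] }

# Hypothetical response body: emit one line, then yield by
# re-queueing ourselves instead of looping until a deadline.
sub stream {
    my ($name, $lines, $out) = @_;
    my $pull;
    $pull = sub {
        my $line = shift @$lines;
        if (defined $line) {
            push @$out, "$name:$line";
            asap($pull);    # always yield after a single line
        } else {
            undef $pull;    # done: break the self-reference
        }
    };
    asap($pull);
}

# Drain the queue in FIFO order, as an event loop iteration would.
sub run_all { while (my $cb = shift @runq) { $cb->() } }
```

With two responses queued, their lines interleave one at a time, so a
client fetching a huge mbox cannot monopolize the process.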
* [PATCH 2/4] evcleanup: micro-optimize asap function
From: Eric Wong @ 2016-06-25 0:45 UTC
To: meta
Instead of relying on a timer with an immediate callback,
arm a pipe to watch for writability.  Nothing is ever written
to the pipe, so its write end is always writable and the
callback always fires.
---
lib/PublicInbox/EvCleanup.pm | 42 +++++++++++++++++++++++++++++++++---------
1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/lib/PublicInbox/EvCleanup.pm b/lib/PublicInbox/EvCleanup.pm
index 5efb093..61837b8 100644
--- a/lib/PublicInbox/EvCleanup.pm
+++ b/lib/PublicInbox/EvCleanup.pm
@@ -5,32 +5,56 @@
package PublicInbox::EvCleanup;
use strict;
use warnings;
+use base qw(Danga::Socket);
+use fields qw(rd);
+my $singleton;
+my $asapq = [ [], undef ];
+my $laterq = [ [], undef ];
-my $asapq = { queue => [], timer => undef };
-my $laterq = { queue => [], timer => undef };
+sub once_init () {
+ my $self = fields::new('PublicInbox::EvCleanup');
+ my ($r, $w);
+ pipe($r, $w) or die "pipe: $!";
+ $self->SUPER::new($w);
+ $self->{rd} = $r; # never read, since we never write..
+ $self;
+}
sub _run_all ($) {
my ($q) = @_;
- my $run = $q->{queue};
- $q->{queue} = [];
- $q->{timer} = undef;
+ my $run = $q->[0];
+ $q->[0] = [];
+ $q->[1] = undef;
$_->() foreach @$run;
}
sub _run_asap () { _run_all($asapq) }
sub _run_later () { _run_all($laterq) }
+# Called by Danga::Socket
+sub event_write {
+ my ($self) = @_;
+ $self->watch_write(0);
+ _run_asap();
+}
+
+sub _asap_timer () {
+ $singleton ||= once_init();
+ $singleton->watch_write(1);
+ 1;
+}
+
sub asap ($) {
my ($cb) = @_;
- push @{$asapq->{queue}}, $cb;
- $asapq->{timer} ||= Danga::Socket->AddTimer(0, *_run_asap);
+ push @{$asapq->[0]}, $cb;
+ $asapq->[1] ||= _asap_timer();
}
sub later ($) {
my ($cb) = @_;
- push @{$laterq->{queue}}, $cb;
- $laterq->{timer} ||= Danga::Socket->AddTimer(60, *_run_later);
+ push @{$laterq->[0]}, $cb;
+ $laterq->[1] ||= Danga::Socket->AddTimer(60, *_run_later);
}
END {
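The always-writable pipe trick can be tried outside Danga::Socket.  The
sketch below uses the core IO::Select module as a stand-in for the event
loop, which is an assumption made to keep the example self-contained;
run_once() is a hypothetical name, and asap() here is not the EvCleanup
implementation.

```perl
use strict;
use warnings;
use IO::Select;

# The read end stays open but is never read; nothing is ever written,
# so the write end always polls as writable.
pipe(my $r, my $w) or die "pipe: $!";
my $sel = IO::Select->new;    # stand-in for the event loop (assumption)
my @asapq;

# Queue a callback and arm the write watch, like watch_write(1).
sub asap {
    my ($cb) = @_;
    push @asapq, $cb;
    $sel->add($w) unless $sel->exists($w);
}

# One loop iteration: if the pipe is armed it is reported writable
# immediately, so disarm it (like watch_write(0)) and run the queue.
sub run_once {
    my ($timeout) = @_;
    my @ready = $sel->can_write($timeout);
    return 0 unless @ready;
    $sel->remove($w);
    my @run = @asapq;
    @asapq = ();
    $_->() for @run;
    scalar @run;
}
```

Because the write end is only watched while callbacks are pending, the
loop does not spin when the queue is empty.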
* [PATCH 3/4] mbox: reduce small packets for gzipped mboxes
From: Eric Wong @ 2016-06-25 0:45 UTC
To: meta
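We want to avoid sending 10 or 20-byte gzip headers as
separate TCP packets to reduce syscalls and avoid wasting
bandwidth.  Buffer the compressed output and only return it
from getline once at least 8192 bytes have accumulated.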
---
lib/PublicInbox/Mbox.pm | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm
index 63ec605..1c97f95 100644
--- a/lib/PublicInbox/Mbox.pm
+++ b/lib/PublicInbox/Mbox.pm
@@ -110,7 +110,7 @@ use warnings;
sub new {
my ($class, $ctx, $cb) = @_;
- my $buf;
+ my $buf = '';
bless {
buf => \$buf,
gz => IO::Compress::Gzip->new(\$buf, Time => 0),
@@ -121,19 +121,11 @@ sub new {
}, $class;
}
-sub _flush_buf {
- my ($self) = @_;
- my $ret = $self->{buf};
- $ret = $$ret;
- ${$self->{buf}} = undef;
- $ret;
-}
-
# called by Plack::Util::foreach or similar
sub getline {
my ($self) = @_;
+ my $ctx = $self->{ctx} or return;
my $res;
- my $ctx = $self->{ctx};
my $ibx = $ctx->{-inbox};
my $gz = $self->{gz};
do {
@@ -141,8 +133,12 @@ sub getline {
my $msg = eval { $ibx->msg_by_mid($smsg->mid) } or next;
$msg = Email::Simple->new($msg);
$gz->write(PublicInbox::Mbox::msg_str($ctx, $msg));
- my $ret = _flush_buf($self);
- return $ret if $ret;
+ my $bref = $self->{buf};
+ if (length($$bref) >= 8192) {
+ my $ret = $$bref; # copy :<
+ ${$self->{buf}} = '';
+ return $ret;
+ }
}
$res = $self->{cb}->($self->{opts});
$self->{msgs} = $res->{msgs};
@@ -150,7 +146,8 @@ sub getline {
$self->{opts}->{offset} += $res;
} while ($res);
$gz->close;
- _flush_buf($self);
+ delete $self->{ctx};
+ ${delete $self->{buf}};
}
sub close {} # noop
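The buffering rule in getline can be exercised in isolation.  The
following is a minimal sketch using only the core IO::Compress::Gzip and
IO::Uncompress::Gunzip modules; gzip_chunks() is a hypothetical helper
for illustration and not part of Mbox.pm.  It assumes, as the patch
does, that the compressor keeps appending to the scalar after it has
been reset to an empty string.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw($GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: gzip every string in @$msgs into one stream and
# hand back the output in chunks of at least $min bytes.  Only the
# final chunk (flushed by close) may be smaller.
sub gzip_chunks {
    my ($msgs, $min) = @_;
    $min = 8192 unless defined $min;
    my $buf = '';
    my $gz = IO::Compress::Gzip->new(\$buf, Time => 0)
        or die "gzip init failed: $GzipError";
    my @chunks;
    foreach my $msg (@$msgs) {
        $gz->write($msg);
        if (length($buf) >= $min) {
            push @chunks, $buf;    # copy, as in the patch
            $buf = '';             # compressor appends from here on
        }
    }
    $gz->close;                    # flushes remaining data and trailer
    push @chunks, $buf if length($buf);
    \@chunks;
}
```

Concatenating the returned chunks yields a single valid gzip stream, so
a caller can pass each chunk straight to the HTTP writer.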
* [PATCH 4/4] http: cork chunked responses for small savings
From: Eric Wong @ 2016-06-25 0:45 UTC
To: meta
This only affects Linux users with MSG_MORE support.
We can avoid extra TCP overhead for sub-optimal chunk sizes
by using MSG_MORE even with chunk trailers under Linux.
This breaks real-time apps which require <= 200ms latency for
streaming small packets (e.g. implementing "tail -F"), but the
public-inbox WWW code does not (and will never) do such things.
---
lib/PublicInbox/HTTP.pm | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm
index c141fc8..e19c592 100644
--- a/lib/PublicInbox/HTTP.pm
+++ b/lib/PublicInbox/HTTP.pm
@@ -223,7 +223,10 @@ sub chunked_wcb ($) {
return if $_[0] eq '';
more($self, sprintf("%x\r\n", bytes::length($_[0])));
more($self, $_[0]);
- $self->write("\r\n");
+
+# use $self->write("\r\n") if you care about real-time
+ # streaming responses, public-inbox WWW does not.
+ more($self, "\r\n");
}
}
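For reference, the three-part chunk framing that chunked_wcb emits can
be sketched on its own.  This is an illustration, not the HTTP.pm code:
send_chunk() and its $send callback are hypothetical, and the MSG_MORE
constant is defined by hand on the assumption that the flag is 0x8000
on Linux and should be a no-op elsewhere.

```perl
use strict;
use warnings;
use bytes ();

# Assumption: 0x8000 is Linux's MSG_MORE; use 0 (no flag) elsewhere.
use constant MSG_MORE => ($^O eq 'linux') ? 0x8000 : 0;

# Hypothetical helper: emit one HTTP/1.1 chunk as size line, payload
# and trailing CRLF.  All three parts carry MSG_MORE, so the kernel
# may coalesce the trailer with whatever is sent next.
sub send_chunk {
    my ($send, $data) = @_;
    return if $data eq '';    # a zero-length chunk would end the body
    $send->(sprintf("%x\r\n", bytes::length($data)), MSG_MORE);
    $send->($data, MSG_MORE);
    $send->("\r\n", MSG_MORE);
}
```

A real $send callback would pass the flag on to send(2).  Before this
patch the trailer was written without MSG_MORE, which pushed each chunk
out immediately.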