From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-3.9 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00 shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id C4570211B3; Thu, 6 Dec 2018 03:59:03 +0000 (UTC) Date: Thu, 6 Dec 2018 03:59:03 +0000 From: Eric Wong To: Jonathan Corbet Cc: meta@public-inbox.org Subject: [PATCH] nntp: prevent event_read from firing twice in a row Message-ID: <20181206035903.2k3qqorj7f2ckeg5@dcvr> References: <20181013124658.23b9f9d2@lwn.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20181013124658.23b9f9d2@lwn.net> List-Id: Jonathan Corbet wrote: > index 022bb80..017ad88 100644 > --- a/lib/PublicInbox/NNTP.pm > +++ b/lib/PublicInbox/NNTP.pm > @@ -951,6 +951,10 @@ sub event_read { > $self->{rbuf} .= $$buf; > } > my $r = 1; > + if ($self->{long_res}) { > + err($self, "long-res on event read"); > + $r = 0; > + } It's actually fine, too; and I considered returning early from event_read if long_res was set once I figured out the problem. I would at least have made it exit the function before calling $self->read a few lines above, however; otherwise rbuf might get too huge if a client kept sending. But my alternative patch below is fewer syscalls with epoll[*], since it avoids the needless registration and immediate wakeup ---------8<--------- Subject: [PATCH] nntp: prevent event_read from firing twice in a row When a client starts pipelining requests to us which trigger long responses, we need to keep socket readiness checks disabled and only enable them when our socket rbuf is drained. Failure to do this caused aborted clients with "BUG: nested long response" when Danga::Socket calls event_read for read-readiness after our "next_tick" sub fires in the same event loop iteration. Reported-by: Jonathan Corbet cf. https://public-inbox.org/meta/20181013124658.23b9f9d2@lwn.net/ --- lib/PublicInbox/NNTP.pm | 26 ++++++++++++++++++-------- t/nntpd.t | 21 +++++++++++++++++++++ 2 files changed, 39 insertions(+), 8 deletions(-) diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm index 022bb80..90a5a3a 100644 --- a/lib/PublicInbox/NNTP.pm +++ b/lib/PublicInbox/NNTP.pm @@ -51,8 +51,16 @@ () # before finishing reading: if (my $long_cb = $nntp->{long_res}) { $nntp->write($long_cb); - } elsif (&Danga::Socket::POLLIN & $nntp->{event_watch}) { + } else { + # pipelined request, we bypassed socket-readiness + # checks to get here: event_read($nntp); + + # maybe there's more pipelined data, or we'll have + # to register it for socket-readiness notifications + if (!$nntp->{long_res} && !$nntp->{closed}) { + check_read($nntp); + } } } } @@ -609,7 +617,7 @@ ($$) now() - $t0); } else { update_idle_time($self); - $self->watch_read(1); + check_read($self); } } elsif ($more) { # $self->{write_buf_size}: # no recursion, schedule another call ASAP @@ -620,7 +628,7 @@ ($$) $nextt ||= PublicInbox::EvCleanup::asap(*next_tick); } else { # all done! $self->{long_res} = undef; - $self->watch_read(1); + check_read($self); res($self, '.'); out($self, " deferred[$fd] done - %0.6f", now() - $t0); } @@ -968,10 +976,9 @@ sub event_read { update_idle_time($self); } -sub watch_read { - my ($self, $bool) = @_; - my $rv = $self->SUPER::watch_read($bool); - if ($bool && index($self->{rbuf}, "\n") >= 0) { +sub check_read { + my ($self) = @_; + if (index($self->{rbuf}, "\n") >= 0) { # Force another read if there is a pipelined request. # We don't know if the socket has anything for us to read, # and we must double-check again by the time the timer fires @@ -979,8 +986,11 @@ sub watch_read { # another long response. push @$nextq, $self; $nextt ||= PublicInbox::EvCleanup::asap(*next_tick); + } else { + # no pipelined requests available, let the kernel know + # to wake us up if there's more + $self->watch_read(1); # Danga::Socket::watch_read } - $rv; } sub not_idle_long ($$) { diff --git a/t/nntpd.t b/t/nntpd.t index 9c1d076..ffed437 100644 --- a/t/nntpd.t +++ b/t/nntpd.t @@ -252,6 +252,27 @@ EOF ok($date <= $t1, 'valid date before stop'); } + # pipelined requests: + { + my $nreq = 90; + syswrite($s, "GROUP $group\r\n"); + my $res = <$s>; + my $rdr = fork; + if ($rdr == 0) { + use POSIX qw(_exit); + for (1..$nreq) { + <$s> =~ /\A224 / or _exit(1); + <$s> =~ /\A1/ or _exit(2); + <$s> eq ".\r\n" or _exit(3); + } + _exit(0); + } + for (1..$nreq) { + syswrite($s, "XOVER 1\r\n"); + } + is($rdr, waitpid($rdr, 0), 'reader done'); + is($? >> 8, 0, 'no errors'); + } { setsockopt($s, IPPROTO_TCP, TCP_NODELAY, 1); syswrite($s, 'HDR List-id 1-'); [*] With my noisy system, the time it took to do all the git@vger XOVER one-by-one with the below script went from ~3:58 to ~3:52 on my system by avoiding unnecessary watch_read (epoll_ctl) calls and epoll_wait wakeups. Quick-and-dirty Ruby script I wrote to trigger the problem (Ruby since it's what I've been spending too much of my time on :x) -------------------------------- #!/usr/bin/ruby # frozen_string_literal: true require 'socket' require 'uri' usage = "usage: #$0 news://example.com/inbox.group.name" uri = ARGV.shift or abort usage uri = URI(uri) uri.port ||= 119 group = uri.path.sub(%r{\A/+}, '') # String#delete_prefix requires Ruby 2.5+ s = Socket.tcp(uri.host, uri.port) l = s.gets l =~ /\A2\d\d / or abort "bad greeting: #{l}" s.puts "GROUP #{group}" l = s.gets code, _, min, max = l.chomp!.split.map!(&:to_i) code == 211 or abort "bad GROUP response: #{l}" rdr = Thread.new do nres = 0 while l = s.gets l.start_with?('205 ') and break # cmd_quit l.start_with?('224 ') or abort "bad OVER response: #{l}" while l = s.gets if l == ".\r\n" nres += 1 break end end end nres end range = min..max range.each { |i| s.puts "XOVER #{i}" } puts "requests=#{range.size} #{Time.now}" s.puts "QUIT" puts "responses=#{rdr.value} #{Time.now}"