I started writing documentation and managed to cobble together a
p2q example in patch 7/9 that finds unapplied patches when combined
with "git log".  7/9 also makes p2q use less memory now.

Fixed a bunch of small bugs along the way and killed some
redundant code, too.

Eric Wong (9):
  doc: tuning: additional notes for many inboxes
  doc: lei-store-format: mail sync section, update IPC
  eml: keep body if no headers are found
  lei q: enable expensive Xapian flags
  lei inspect: fix atfork hook
  lei: add net getopt spec to various commands
  lei p2q: use LeiInput for multi-patch series
  lei rm|tag: drop redundant mbox+net callbacks
  input_pipe: account for undefined {sock}

 Documentation/lei-p2q.pod             |   6 ++
 Documentation/lei-store-format.pod    |  14 +++-
 Documentation/public-inbox-tuning.pod |  11 ++-
 lib/PublicInbox/Eml.pm                |   7 +-
 lib/PublicInbox/InputPipe.pm          |   2 +-
 lib/PublicInbox/LEI.pm                |  11 +--
 lib/PublicInbox/LeiInput.pm           |  30 +++++--
 lib/PublicInbox/LeiInspect.pm         |   2 +-
 lib/PublicInbox/LeiP2q.pm             | 115 +++++++++++++-------------
 lib/PublicInbox/LeiRm.pm              |  10 ---
 lib/PublicInbox/LeiSearch.pm          |   2 +
 lib/PublicInbox/LeiTag.pm             |  11 ---
 t/eml.t                               |  11 +++
 t/mbox_reader.t                       |   6 +-
 14 files changed, 134 insertions(+), 104 deletions(-)
-extindex is the most important piece for dealing with many
inboxes, so note it first.  Also, frequent use of "git gc" is
important for both loose object performance and reducing
memory mappings.
---
 Documentation/public-inbox-tuning.pod | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/Documentation/public-inbox-tuning.pod b/Documentation/public-inbox-tuning.pod
index 7b18b3bc4030..53668eccb7cb 100644
--- a/Documentation/public-inbox-tuning.pod
+++ b/Documentation/public-inbox-tuning.pod
@@ -165,12 +165,15 @@ Other OSes may have similar tuning knobs (patches appreciated).
 
 =head2 Scalability to many inboxes
 
+L<public-inbox-extindex(1)> allows any number of public-inboxes
+to share the same Xapian indices.
+
 git 2.33+ startup time is orders-of-magnitude faster and uses
 less memory when dealing with thousands of alternates required
-for thousands of inboxes.
+for thousands of inboxes with L<public-inbox-extindex(1)>.
 
-L<public-inbox-extindex(1)> allows any number of public-inboxes
-to share the same Xapian indices.
+Frequent packing (via L<git-gc(1)>) both improves performance
+and reduces the need to increase C<sys.vm.max_map_count>.
 
 =head1 CONTACT
 
@@ -184,6 +187,6 @@ L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>,
 
 =head1 COPYRIGHT
 
-Copyright 2020-2021 all contributors L<mailto:meta@public-inbox.org>
+Copyright all contributors L<mailto:meta@public-inbox.org>
 
 License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
mail_sync.sqlite3 needs to be documented, so add a section for
it and bring the IPC section up-to-date while we're in the area.
---
 Documentation/lei-store-format.pod | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/lei-store-format.pod b/Documentation/lei-store-format.pod
index 71aa72cb96e3..625c60f436a8 100644
--- a/Documentation/lei-store-format.pod
+++ b/Documentation/lei-store-format.pod
@@ -30,7 +30,6 @@ prevent them from being accidentally treated as a v2 inbox.
 	$SHARD - Integer starting with 0 based on parallelism
 
 	~/.local/share/lei/store
-	- ipc.lock # lock file for internal lei IPC
 	- local/$EPOCH.git # normal bare git repositories
 	- mail_sync.sqlite3 # sync state IMAP, Maildir, NNTP
 
@@ -66,11 +65,18 @@ stored in Xapian indices, volatile metadata is associated with
 the Xapian document, thus it is shared across different blobs of
 the "same" message.
 
+=head2 mail_sync.sqlite3
+
+This SQLite database maintained for bidirectional mapping of
+git blobs to IMAP UIDs, Maildir file names, and NNTP article numbers.
+
+It is also used for retrieving messages from Maildirs indexed by
+L<lei-index(1)>.
+
 =head1 IPC
 
-When L<lei(1)> is run in daemon mode, L<flock(2)> is used on
-C<ipc.lock> is used to serialize writes to C<lei/store> across
-multiple internal lei workers while minimizing commits.
+L<lei-daemon(8)> communicates with the C<lei/store> process using
+L<unix(7)> C<SOCK_SEQPACKET> sockets.
 
 =head1 CAVEATS
 
This easily allows us to treat "git diff" output as header-less
"messages" for commands such as "lei p2q".
---
 lib/PublicInbox/Eml.pm |  7 ++++---
 t/eml.t                | 11 +++++++++++
 t/mbox_reader.t        |  6 +++++-
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index 3c681ba5bc2e..485f637a3e7b 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -122,9 +122,10 @@ sub new {
 		my $hdr = substr($$ref, 0, $header_size_limit + 1);
 		hdr_truncate($hdr) if length($hdr) > $header_size_limit;
 		bless { hdr => \$hdr, crlf => $1 }, __PACKAGE__;
-	} else { # nothing useful
-		my $hdr = $$ref = '';
-		bless { hdr => \$hdr, crlf => "\n" }, __PACKAGE__;
+	} else { # just a body w/o header?
+		my $hdr = '';
+		my $eol = ($$ref =~ /(\r?\n)/) ? $1 : "\n";
+		bless { hdr => \$hdr, crlf => $eol, bdy => $ref }, __PACKAGE__;
 	}
 }
 
diff --git a/t/eml.t b/t/eml.t
index 0cf48f225a45..2d8993a51075 100644
--- a/t/eml.t
+++ b/t/eml.t
@@ -216,6 +216,17 @@ if ('one newline before headers') {
 	is($eml->body, "");
 }
 
+if ('body only') {
+	my $str = <<EOM;
+--- a/lib/PublicInbox/Eml.pm
++++ b/lib/PublicInbox/Eml.pm
+@@ -122,9 +122,10 @@ sub new {
+\x20
+EOM
+	my $eml = PublicInbox::Eml->new($str);
+	is($eml->body, $str, 'body-only accepted');
+}
+
 for my $cls (@classes) { # XXX: matching E::M, but not sure about this
 	my $s = <<EOF;
 Content-Type: multipart/mixed; boundary="b"
diff --git a/t/mbox_reader.t b/t/mbox_reader.t
index e5f57d7bcba3..87e8f397662c 100644
--- a/t/mbox_reader.t
+++ b/t/mbox_reader.t
@@ -138,7 +138,11 @@ EOM
 	PublicInbox::MboxReader->$m($fh, sub {
 		push @x, $_[0]->as_string
 	});
-	is_deeply(\@x, [], "messages in invalid $m");
+	if ($m =~ /\Amboxcl/) {
+		is_deeply(\@x, [], "messages in invalid $m");
+	} else {
+		is_deeply(\@x, [ "\n$html" ], "body-only $m");
+	}
 	is_deeply([grep(!/^W: leftover/, @w)], [],
 		"no extra warnings besides leftover ($m)");
 }
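
For illustration only, the new fallback branch can be sketched
standalone.  This is a minimal sketch and not the actual
PublicInbox::Eml code: body_only_parts is a hypothetical helper, and
only the line-ending detection and the hash keys (hdr, crlf, bdy)
mirror the patch above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PublicInbox::Eml): mimics the
# fallback branch taken when the input has no header block at all.
sub body_only_parts {
	my ($ref) = @_; # scalar ref to the raw buffer
	my $hdr = ''; # empty header, the buffer itself is left untouched
	# the first line ending seen decides CRLF vs LF; default to LF
	my $eol = ($$ref =~ /(\r?\n)/) ? $1 : "\n";
	return { hdr => \$hdr, crlf => $eol, bdy => $ref };
}

my $diff = "--- a/foo\r\n+++ b/foo\r\n";
my $parts = body_only_parts(\$diff);
printf "crlf=%s bdy_len=%d\n",
	($parts->{crlf} eq "\r\n" ? 'CRLF' : 'LF'),
	length(${$parts->{bdy}});
```

The point of keeping a reference in bdy (instead of clearing the
buffer as the old branch did) is that the body is no longer thrown
away.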
FLAG_PURE_NOT is too expensive for public-facing WWW use, but
lei isn't public-facing.  We'll also unconditionally enable
phrase search on old "chert" DBs since lei doesn't need to worry
about fairness across 10K users.
---
 lib/PublicInbox/LeiSearch.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index 1fb38da1d7aa..d0ca13f0aa11 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -175,6 +175,8 @@ sub all_terms {
 sub qparse_new {
 	my ($self) = @_;
 	my $qp = $self->SUPER::qparse_new; # PublicInbox::Search
+	$self->{qp_flags} |= PublicInbox::Search::FLAG_PHRASE() |
+		PublicInbox::Search::FLAG_PURE_NOT();
 	$qp->add_boolean_prefix('kw', 'K');
 	$qp->add_boolean_prefix('L', 'L');
 	$qp
The misnamed sub wasn't firing, but was unlikely to be
noticeable given the short lifetime of the process.

Fixes: 1f887bd51d92b0d4 ("lei inspect: add atfork hook")
---
 lib/PublicInbox/LeiInspect.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 5ea32ccb7e66..d7775d4b6162 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -294,7 +294,7 @@ sub _complete_inspect {
 	# TODO: message-ids?, blobs? could get expensive...
 }
 
-sub input_only_atfork_child {
+sub ipc_atfork_child {
 	my ($self) = @_;
 	$self->{lei}->_lei_atfork_child;
 	$self->SUPER::ipc_atfork_child;
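
Why the misnamed sub never fired can be shown with a toy class
hierarchy.  This is only a sketch under assumptions: Demo::Base,
Demo::Misnamed, Demo::Fixed and fork_child are hypothetical
stand-ins and say nothing about how PublicInbox::IPC is actually
implemented, beyond a base class invoking the hook by its method
name.

```perl
use strict;
use warnings;

package Demo::Base; # hypothetical stand-in for an IPC base class
sub new {
	my ($class) = @_;
	return bless({ log => [] }, $class);
}
sub ipc_atfork_child { push @{$_[0]->{log}}, 'base' }
# the "framework" only ever calls the hook by this exact name
sub fork_child { $_[0]->ipc_atfork_child }

package Demo::Misnamed;
our @ISA = ('Demo::Base');
# wrong name: method resolution never reaches this sub
sub input_only_atfork_child {
	my ($self) = @_;
	push @{$self->{log}}, 'subclass';
	$self->SUPER::ipc_atfork_child;
}

package Demo::Fixed;
our @ISA = ('Demo::Base');
# right name: overrides the base hook, then chains up to it
sub ipc_atfork_child {
	my ($self) = @_;
	push @{$self->{log}}, 'subclass';
	$self->SUPER::ipc_atfork_child;
}

package main;
for my $cls (qw(Demo::Misnamed Demo::Fixed)) {
	my $obj = $cls->new;
	$obj->fork_child;
	print "$cls: @{$obj->{log}}\n";
}
```

Perl gives no warning for the misnamed variant since it is a
perfectly valid (but unused) method, which fits the bug going
unnoticed.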
All of these commands should support --proxy, at least,
if not other curl options.
---
 lib/PublicInbox/LEI.pm | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 32d4b9f3b427..93a7b426de3c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -179,7 +179,7 @@ our %CMD = ( # sorted in order of importance/use:
 
 'up' => [ 'OUTPUT...|--all', 'update saved search',
 	qw(jobs|j=s lock=s@ alert=s@ mua=s verbose|v+ exclude=s@
-	remote-fudge-time=s all:s remote! local! external!), @c_opt ],
+	remote-fudge-time=s all:s remote! local! external!), @net_opt, @c_opt ],
 
 'lcat' => [ '--stdin|MSGID_OR_URL...', 'display local copy of message(s)',
 	'stdin|', # /|\z/ must be first for lone dash
@@ -215,7 +215,7 @@ our %CMD = ( # sorted in order of importance/use:
 'ls-mail-sync' => [ '[FILTER]', 'list mail sync folders',
 	qw(z|0 globoff|g invert-match|v local remote), @c_opt ],
 'ls-mail-source' => [ 'URL', 'list IMAP or NNTP mail source folders',
-	qw(z|0 ascii l pretty url), @c_opt ],
+	qw(z|0 ascii l pretty url), @net_opt, @c_opt ],
 'forget-external' => [ 'LOCATION...|--prune',
 	'exclude further results from a publicinbox|extindex',
 	qw(prune), @c_opt ],
@@ -265,7 +265,8 @@ our %CMD = ( # sorted in order of importance/use:
 'forget-mail-sync' => [ 'LOCATION...',
 	'forget sync information for a mail folder', @c_opt ],
 'refresh-mail-sync' => [ 'LOCATION...|--all',
-	'prune dangling sync data for a mail folder', 'all:s', @c_opt ],
+	'prune dangling sync data for a mail folder', 'all:s',
+	@net_opt, @c_opt ],
 'export-kw' => [ 'LOCATION...|--all',
 	'one-time export of keywords of sync sources',
 	qw(all:s mode=s), @net_opt, @c_opt ],
The LeiInput backend now allows p2q to work like any other
command which reads .eml, .patch, mbox*, Maildir, IMAP, and NNTP
input.  Running "git format-patch --stdout -1 $COMMIT" remains
supported.

This is intended to allow lower memory use while parsing
"git log --pretty=mboxrd -p" output.  Previously, the entire
output of "git log" would be slurped into memory at once.

The intended use is to allow easy(-ish :P) searching for
unapplied patches as documented in the new example in the
manpage.
---
 Documentation/lei-p2q.pod   |   6 ++
 lib/PublicInbox/LEI.pm      |   4 +-
 lib/PublicInbox/LeiInput.pm |  30 ++++++++--
 lib/PublicInbox/LeiP2q.pm   | 115 ++++++++++++++++++------------------
 4 files changed, 89 insertions(+), 66 deletions(-)

diff --git a/Documentation/lei-p2q.pod b/Documentation/lei-p2q.pod
index 2e0b1ab676ed..4bc5d25f8ef0 100644
--- a/Documentation/lei-p2q.pod
+++ b/Documentation/lei-p2q.pod
@@ -77,6 +77,12 @@ Suppress feedback messages.
 	# to view results on a remote HTTP(S) public-inbox instance
 	$BROWSER https://example.com/pub-inbox/?q=$(lei p2q --uri $COMMIT_OID)
 
+	# to view unapplied patches for a given $FILE from the past year:
+	echo \( rt:last.year.. AND dfn:$FILE \) AND NOT \( \
+		$(git log -p --pretty=mboxrd --since=last.year $FILE |
+			lei p2q -F mboxrd )
+		\) | lei q -o /tmp/unapplied
+
 =head1 CONTACT
 
 Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 93a7b426de3c..f9d24f29c87d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -274,9 +274,9 @@ our %CMD = ( # sorted in order of importance/use:
 	'one-time conversion from URL or filesystem to another format',
 	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s lock=s@ kw!),
 	@net_opt, @c_opt ],
-'p2q' => [ 'FILE|COMMIT_OID|--stdin',
+'p2q' => [ 'LOCATION_OR_COMMIT...|--stdin',
 	"use a patch to generate a query for `lei q --stdin'",
-	qw(stdin| want|w=s@ uri debug), @c_opt ],
+	qw(stdin| in-format|F=s want|w=s@ uri debug), @net_opt, @c_opt ],
 'config' => [ '[...]',
 	sub { 'git-config(1) wrapper for '._config_path($_[0]); },
 	qw(config-file|system|global|file|f=s), # for conflict detection
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 2621fc1f9d05..540681e3ff6b 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -64,6 +64,11 @@ sub input_mbox_cb { # base MboxReader callback
 	$self->input_eml_cb($eml);
 }
 
+sub input_net_cb { # imap_each, nntp_each cb
+	my ($url, $uid, $kw, $eml, $self) = @_;
+	$self->input_eml_cb($eml);
+}
+
 # import a single file handle of $name
 # Subclass must define ->input_eml_cb and ->input_mbox_cb
 sub input_fh {
@@ -108,10 +113,10 @@ sub handle_http_input ($$@) {
 	grep(/\A--compressed\z/, @$curl) or
 		$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
 	eval { $self->input_fh('mboxrd', $fh, $url, @args) };
-	my $err = $@;
+	my @err = ($@ ? $@ : ());
 	$ar->join;
-	$? || $err and
-		$lei->child_error($?, "@$cmd failed".$err ? " $err" : '');
+	push(@err, "\$?=$?") if $?;
+	$lei->child_error($?, "@$cmd failed: @err") if @err;
 }
 
 sub input_path_url {
@@ -184,7 +189,17 @@ EOM
 				$self, @args);
 		}
 	} elsif ($self->{missing_ok} && !-e $input) { # don't ->fail
-		$self->folder_missing("$ifmt:$input");
+		if ($lei->{cmd} eq 'p2q') {
+			my $fp = [ qw(git format-patch --stdout -1), $input ];
+			my $rdr = { 2 => $lei->{2} };
+			my $fh = popen_rd($fp, undef, $rdr);
+			eval { $self->input_fh('eml', $fh, $input, @args) };
+			my @err = ($@ ? $@ : ());
+			close($fh) or push @err, "\$?=$?";
+			$lei->child_error($?, "@$fp failed: @err") if @err;
+		} else {
+			$self->folder_missing("$ifmt:$input");
+		}
 	} else {
 		$lei->fail("$ifmt_pfx$input unsupported (TODO)");
 	}
@@ -330,9 +345,12 @@ $input is `eml', not --in-format=$in_fmt
 			}
 			push @md, $input;
 		} elsif ($self->{missing_ok} && !-e $input) {
-			# for lei rm-watch
-			$may_sync and $input = 'maildir:'.
+			if ($lei->{cmd} eq 'p2q') {
+				# will run "git format-patch"
+			} elsif ($may_sync) { # for lei rm-watch
+				$input = 'maildir:'.
 						$lei->abs_path($input);
+			}
 		} else {
 			return $lei->fail("Unable to handle $input")
 		}
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index 08ec81c5295e..09ec0a079bb9 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -1,16 +1,16 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # front-end for the "lei patch-to-query" sub-command
 package PublicInbox::LeiP2q;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::IPC);
+use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
 use PublicInbox::Eml;
 use PublicInbox::Smsg;
 use PublicInbox::MsgIter qw(msg_part_text);
 use PublicInbox::Git qw(git_unquote);
-use PublicInbox::Spawn qw(popen_rd);
+use PublicInbox::OnDestroy;
 use URI::Escape qw(uri_escape_utf8);
 
 my $FN = qr!((?:"?[^/\n]+/[^\r\n]+)|/dev/null)!;
@@ -28,8 +28,16 @@ sub xphrase ($) {
 	} ($s =~ m!(\w[\|=><,\./:\\\@\-\w\s]+)!g);
 }
 
+sub add_qterm ($$@) {
+	my ($self, $p, @v) = @_;
+	for (@v) {
+		$self->{qseen}->{"$p\0$_"} //=
+			push(@{$self->{qterms}->{$p}}, $_);
+	}
+}
+
 sub extract_terms { # eml->each_part callback
-	my ($p, $lei) = @_;
+	my ($p, $self) = @_;
 	my $part = $p->[0]; # ignore $depth and @idx;
 	my $ct = $part->content_type || 'text/plain';
 	my ($s, undef) = msg_part_text($part, $ct);
@@ -38,7 +46,7 @@ sub extract_terms { # eml->each_part callback
 	# TODO: b: nq: q:
 	for (split(/\n/, $s)) {
 		if ($in_diff && s/^ //) { # diff context
-			push @{$lei->{qterms}->{dfctx}}, xphrase($_);
+			add_qterm($self, 'dfctx', xphrase($_));
 		} elsif (/^-- $/) { # email signature begins
 			$in_diff = undef;
 		} elsif (m!^diff --git $FN $FN!) {
@@ -46,21 +54,21 @@ sub extract_terms { # eml->each_part callback
 			$in_diff = 1;
 		} elsif (/^index ([a-f0-9]+)\.\.([a-f0-9]+)\b/) {
 			my ($oa, $ob) = ($1, $2);
-			push @{$lei->{qterms}->{dfpre}}, $oa;
-			push @{$lei->{qterms}->{dfpost}}, $ob;
+			add_qterm($self, 'dfpre', $oa);
+			add_qterm($self, 'dfpost', $ob);
 			# who uses dfblob?
 		} elsif (m!^(?:---|\+{3}) ($FN)!) {
 			next if $1 eq '/dev/null';
 			my $fn = (split(m!/!, git_unquote($1.''), 2))[1];
-			push @{$lei->{qterms}->{dfn}}, xphrase($fn);
+			add_qterm($self, 'dfn', xphrase($fn));
 		} elsif ($in_diff && s/^\+//) { # diff added
-			push @{$lei->{qterms}->{dfb}}, xphrase($_);
+			add_qterm($self, 'dfb', xphrase($_));
 		} elsif ($in_diff && s/^-//) { # diff removed
-			push @{$lei->{qterms}->{dfa}}, xphrase($_);
+			add_qterm($self, 'dfa', xphrase($_));
 		} elsif (/^@@ (?:\S+) (?:\S+) @@\s*$/) {
 			# traditional diff w/o -p
 		} elsif (/^@@ (?:\S+) (?:\S+) @@\s*(\S+.*)/) {
-			push @{$lei->{qterms}->{dfhh}}, xphrase($1);
+			add_qterm($self, 'dfhh', xphrase($1));
 		} elsif (/^(?:dis)similarity index/ ||
 				/^(?:old|new) mode/ ||
 				/^(?:deleted|new) file mode/ ||
@@ -92,53 +100,43 @@ my %pfx2smsg = (
 	rt => [ qw(ts) ], # ditto...
 );
 
-sub do_p2q { # via wq_do
-	my ($self) = @_;
-	my $lei = $self->{lei};
-	my $want = $lei->{opt}->{want} // [ qw(dfpost7) ];
-	my @want = split(/[, ]+/, "@$want");
-	for (@want) {
-		/\A(?:(d|dt|rt):)?([0-9]+)(\.(?:day|weeks)s?)?\z/ or next;
-		my ($pfx, $n, $unit) = ($1, $2, $3);
-		$n *= 86400 * ($unit =~ /week/i ? 7 : 1);
-		$_ = [ $pfx, $n ];
-	}
-	my $smsg = bless {}, 'PublicInbox::Smsg';
-	my $in = $self->{0};
-	my @cmd;
-	unless ($in) {
-		my $input = $self->{input};
-		my $devfd = $lei->path_to_fd($input) // return;
-		if ($devfd >= 0) {
-			$in = $lei->{$devfd};
-		} elsif (-e $input) {
-			open($in, '<', $input) or
-				return $lei->fail("open < $input: $!");
-		} else {
-			@cmd = (qw(git format-patch --stdout -1), $input);
-			$in = popen_rd(\@cmd, undef, { 2 => $lei->{2} });
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
+	my ($self, $eml) = @_;
+	my $diff_want = $self->{diff_want} // do {
+		my $want = $self->{lei}->{opt}->{want} // [ qw(dfpost7) ];
+		my @want = split(/[, ]+/, "@$want");
+		for (@want) {
+			/\A(?:(d|dt|rt):)?([0-9]+)(\.(?:day|weeks)s?)?\z/
+				or next;
+			my ($pfx, $n, $unit) = ($1, $2, $3);
+			$n *= 86400 * ($unit =~ /week/i ? 7 : 1);
+			$_ = [ $pfx, $n ];
 		}
+		$self->{want_order} = \@want;
+		$self->{diff_want} = +{ map { $_ => 1 } @want };
 	};
-	my $str = do { local $/; <$in> };
-	@cmd && !close($in) and return $lei->fail("E: @cmd failed: $?");
-	my $eml = PublicInbox::Eml->new(\$str);
-	$lei->{diff_want} = +{ map { $_ => 1 } @want };
+	my $smsg = bless {}, 'PublicInbox::Smsg';
 	$smsg->populate($eml);
 	while (my ($pfx, $fields) = each %pfx2smsg) {
-		next unless $lei->{diff_want}->{$pfx};
+		next unless $diff_want->{$pfx};
 		for my $f (@$fields) {
 			my $v = $smsg->{$f} // next;
-			push @{$lei->{qterms}->{$pfx}}, xphrase($v);
+			add_qterm($self, $pfx, xphrase($v));
 		}
 	}
-	$eml->each_part(\&extract_terms, $lei, 1);
+	$eml->each_part(\&extract_terms, $self, 1);
+}
+
+sub emit_query {
+	my ($self) = @_;
+	my $lei = $self->{lei};
 	if ($lei->{opt}->{debug}) {
 		my $json = ref(PublicInbox::Config->json)->new;
 		$json->utf8->canonical->pretty;
-		print { $lei->{2} } $json->encode($lei->{qterms});
+		print { $lei->{2} } $json->encode($self->{qterms});
 	}
 	my (@q, %seen);
-	for my $pfx (@want) {
+	for my $pfx (@{$self->{want_order}}) {
 		if (ref($pfx) eq 'ARRAY') {
 			my ($p, $t_range) = @$pfx;
 			# TODO
@@ -148,7 +146,7 @@ sub do_p2q { # via wq_do
 		} else {
 			my $plusminus = ($pfx =~ s/\A([\+\-])//) ? $1 : '';
 			my $end = ($pfx =~ s/([0-9\*]+)\z//) ? $1 : '';
-			my $x = delete($lei->{qterms}->{$pfx}) or next;
+			my $x = delete($self->{qterms}->{$pfx}) or next;
 			my $star = $end =~ tr/*//d ? '*' : '';
 			my $min_len = ($end || 0) + 0;
 
@@ -181,24 +179,25 @@ sub do_p2q { # via wq_do
 }
 
 sub lei_p2q { # the "lei patch-to-query" entry point
-	my ($lei, $input) = @_;
-	my $self = bless {}, __PACKAGE__;
-	if ($lei->{opt}->{stdin}) {
-		$self->{0} = delete $lei->{0}; # guard from _lei_atfork_child
-	} else {
-		$self->{input} = $input;
-	}
-	my ($op_c, $ops) = $lei->workers_start($self, 1);
+	my ($lei, @inputs) = @_;
+	$lei->{opt}->{'in-format'} //= 'eml' if $lei->{opt}->{stdin};
+	my $self = bless { missing_ok => 1 }, __PACKAGE__;
+	$self->prepare_inputs($lei, \@inputs) or return;
+	my $ops = {};
+	$lei->{auth}->op_merge($ops, $self, $lei) if $lei->{auth};
+	(my $op_c, $ops) = $lei->workers_start($self, 1, $ops);
 	$lei->{wq1} = $self;
-	$self->wq_io_do('do_p2q', []);
-	$self->wq_close;
+	net_merge_all_done($self) unless $lei->{auth};
 	$lei->wait_wq_events($op_c, $ops);
 }
 
 sub ipc_atfork_child {
 	my ($self) = @_;
-	$self->{lei}->_lei_atfork_child;
-	$self->SUPER::ipc_atfork_child;
+	PublicInbox::LeiInput::input_only_atfork_child($self);
+	PublicInbox::OnDestroy->new($$, \&emit_query, $self);
 }
 
+no warnings 'once';
+*net_merge_all_done = \&PublicInbox::LeiInput::input_only_net_merge_all_done;
+
 1;
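
Since a multi-patch series can repeat the same term many times,
add_qterm deduplicates before appending.  A standalone sketch of
that idiom follows; the sub body mirrors the patch above (minus
the prototype), while the plain hash used for $self and the sample
file names are assumptions made purely for illustration.

```perl
use strict;
use warnings;

sub add_qterm {
	my ($self, $pfx, @vals) = @_;
	for my $v (@vals) {
		# push() returns the new array length (always defined), so
		# //= only evaluates it the first time a prefix/value pair
		# is seen; repeats are skipped without touching {qterms}
		$self->{qseen}->{"$pfx\0$v"} //=
			push(@{$self->{qterms}->{$pfx}}, $v);
	}
}

my $self = {}; # stand-in for the LeiP2q object
add_qterm($self, 'dfn', 'lib/Foo.pm', 'lib/Foo.pm', 't/foo.t');
add_qterm($self, 'dfn', 't/foo.t'); # e.g. seen again in a later patch
print join(',', @{$self->{qterms}->{dfn}}), "\n";
```

Joining prefix and value with "\0" keeps the {qseen} key
unambiguous, and insertion order of first occurrences is preserved
in {qterms}.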
These are supplied by the base LeiInput class
---
 lib/PublicInbox/LeiRm.pm  | 10 ----------
 lib/PublicInbox/LeiTag.pm | 11 -----------
 2 files changed, 21 deletions(-)

diff --git a/lib/PublicInbox/LeiRm.pm b/lib/PublicInbox/LeiRm.pm
index 97b1c5c15dc4..524c178e3b47 100644
--- a/lib/PublicInbox/LeiRm.pm
+++ b/lib/PublicInbox/LeiRm.pm
@@ -13,16 +13,6 @@ sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	$self->{lei}->{sto}->wq_do('remove_eml', $eml);
 }
 
-sub input_mbox_cb { # MboxReader callback
-	my ($eml, $self) = @_;
-	input_eml_cb($self, $eml);
-}
-
-sub input_net_cb { # callback for ->imap_each, ->nntp_each
-	my (undef, undef, $kw, $eml, $self) = @_; # @_[0,1]: url + uid ignored
-	input_eml_cb($self, $eml);
-}
-
 sub input_maildir_cb {
 	my (undef, $kw, $eml, $self) = @_; # $_[0] $filename ignored
 	input_eml_cb($self, $eml);
diff --git a/lib/PublicInbox/LeiTag.pm b/lib/PublicInbox/LeiTag.pm
index 77654d1a2f22..d64a9f86e05a 100644
--- a/lib/PublicInbox/LeiTag.pm
+++ b/lib/PublicInbox/LeiTag.pm
@@ -19,23 +19,12 @@ sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	}
 }
 
-sub input_mbox_cb {
-	my ($eml, $self) = @_;
-	$eml->header_set($_) for (qw(X-Status Status));
-	input_eml_cb($self, $eml);
-}
-
 sub pmdir_cb { # called via wq_io_do from LeiPmdir->each_mdir_fn
 	my ($self, $f) = @_;
 	my $eml = eml_from_path($f) or return;
 	input_eml_cb($self, $eml);
 }
 
-sub input_net_cb { # imap_each, nntp_each cb
-	my ($url, $uid, $kw, $eml, $self) = @_;
-	input_eml_cb($self, $eml);
-}
-
 sub lei_tag { # the "lei tag" method
 	my ($lei, @argv) = @_;
 	my $sto = $lei->_lei_store(1);
It's possible for ->event_step to fire twice due to ->requeue
with EPOLLET (but not EPOLLONESHOT).  So account for that and
avoid causing event loop errors as a result.
---
 lib/PublicInbox/InputPipe.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/InputPipe.pm b/lib/PublicInbox/InputPipe.pm
index 00813a0701b1..e1e26e20b9d2 100644
--- a/lib/PublicInbox/InputPipe.pm
+++ b/lib/PublicInbox/InputPipe.pm
@@ -18,7 +18,7 @@ sub consume {
 
 sub event_step {
 	my ($self) = @_;
-	my $r = sysread($self->{sock}, my $rbuf, 65536);
+	my $r = sysread($self->{sock} // return, my $rbuf, 65536);
 	if ($r) {
 		$self->{cb}->(@{$self->{args} // []}, $rbuf);
 		return $self->requeue; # may be regular file or pipe
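
The "// return" guard inside the sysread argument list can be tried
outside of the event loop.  The following is a rough sketch under
assumptions: read_step and the pipe-based setup are hypothetical and
only imitate a second ->event_step call arriving after {sock} has
been dropped; nothing here reflects the real PublicInbox::DS
machinery.

```perl
use strict;
use warnings;

# Hypothetical stand-in for event_step: one read attempt per call.
sub read_step {
	my ($self) = @_;
	# if {sock} is undefined, leave the sub before sysread can fail
	my $r = sysread($self->{sock} // return, my $rbuf, 65536);
	return $r ? $rbuf : '';
}

pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, "hello") // die "syswrite: $!";
close($wr) or die "close: $!";

my $self = { sock => $rd };
print "first: ", read_step($self), "\n";
delete $self->{sock}; # simulate the handle already being closed
my @ret = read_step($self); # spurious second step is now a no-op
print "second returned ", scalar(@ret), " values\n";
```

Without the guard, the second call would pass undef to sysread,
which is the kind of error the commit message wants to keep out of
the event loop.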