* [PATCH 0/5] lei: more input + worker-related stuff
@ 2021-03-23 11:48 Eric Wong
2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
To: meta
Drop a bunch of redundant code, yay!
Eric Wong (5):
net_reader: nntp_each: pass keywords as `undef'
test_common: check lei/errors.log
lei: persistent workers (lei_store) run in /
lei_input: more common code between <mark|convert|import>
lei: improve management around short-lived workers
lib/PublicInbox/LEI.pm | 2 +-
lib/PublicInbox/LeiConvert.pm | 50 +++++-------------
lib/PublicInbox/LeiExternal.pm | 3 +-
lib/PublicInbox/LeiImport.pm | 94 +++++++++-------------------------
lib/PublicInbox/LeiInput.pm | 45 ++++++++++++++--
lib/PublicInbox/LeiMark.pm | 59 ++++-----------------
lib/PublicInbox/LeiMirror.pm | 2 +-
lib/PublicInbox/LeiP2q.pm | 5 +-
lib/PublicInbox/LeiQuery.pm | 2 +-
lib/PublicInbox/NetReader.pm | 5 +-
lib/PublicInbox/TestCommon.pm | 13 +++--
11 files changed, 109 insertions(+), 171 deletions(-)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef'
2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
2021-03-23 11:48 ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
To: meta
We'll use `undef' to denote keywords are unknown/unsupported,
instead of an empty arrayref.
This will let callers use the same callback and args for
imap_each. Passing an empty arrayref to set_eml in LeiStore
causes keywords to be cleared completely, which is not desired
behavior when "lei import" is importing already-seen messages
from NNTP.
---
lib/PublicInbox/NetReader.pm | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index bc211029..6a52b479 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -554,11 +554,10 @@ sub _nntp_fetch_all ($$$) {
return if $l_art >= $end; # nothing to do
$beg = $l_art + 1;
}
- my ($err, $art);
+ my ($err, $art, $last_art, $kw); # kw stays undef, no keywords in NNTP
unless ($self->{quiet}) {
warn "# $uri fetching ARTICLE $beg..$end\n";
}
- my $last_art;
my $n = $self->{max_batch};
for ($beg..$end) {
last if $self->{quit};
@@ -582,7 +581,7 @@ sub _nntp_fetch_all ($$$) {
$raw = join('', @$raw);
$raw =~ s/\r\n/\n/sg;
my ($eml_cb, @args) = @{$self->{eml_each}};
- $eml_cb->($uri, $art, [], PublicInbox::Eml->new(\$raw), @args);
+ $eml_cb->($uri, $art, $kw, PublicInbox::Eml->new(\$raw), @args);
$last_art = $art;
}
run_commit_cb($self);
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/5] test_common: check lei/errors.log
2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
2021-03-23 11:48 ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
To: meta
This will make it easier to diagnose some large internal
rewrites.
---
lib/PublicInbox/TestCommon.pm | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index e67e94ea..d4117b6c 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -507,7 +507,7 @@ SKIP: {
Socket::MsgHdr missing or Inline::C is unconfigured/missing
EOM
$lei_opt = { 1 => \$lei_out, 2 => \$lei_err };
- my ($daemon_pid, $for_destroy);
+ my ($daemon_pid, $for_destroy, $daemon_xrd);
my $tmpdir = $test_opt->{tmpdir};
($tmpdir, $for_destroy) = tmpdir unless $tmpdir;
SKIP: {
@@ -515,9 +515,9 @@ EOM
my $home = "$tmpdir/lei-daemon";
mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
local $ENV{HOME} = $home;
- my $xrd = "$home/xdg_run";
- mkdir($xrd, 0700) or BAIL_OUT "mkdir: $!";
- local $ENV{XDG_RUNTIME_DIR} = $xrd;
+ $daemon_xrd = "$home/xdg_run";
+ mkdir($daemon_xrd, 0700) or BAIL_OUT "mkdir: $!";
+ local $ENV{XDG_RUNTIME_DIR} = $daemon_xrd;
$cb->();
lei_ok(qw(daemon-pid), \"daemon-pid after $t");
chomp($daemon_pid = $lei_out);
@@ -547,6 +547,11 @@ EOM
tick;
}
ok(!kill(0, $daemon_pid), "$t daemon stopped after oneshot");
+ my $f = "$daemon_xrd/lei/errors.log";
+ open my $fh, '<', $f or BAIL_OUT "$f: $!";
+ my @l = <$fh>;
+ is_deeply(\@l, [],
+ "$t daemon XDG_RUNTIME_DIR/lei/errors.log empty");
}
}; # SKIP if missing git 2.6+ || Xapian || SQLite || json
} # /test_lei
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 3/5] lei: persistent workers (lei_store) run in /
2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
2021-03-23 11:48 ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
2021-03-23 11:48 ` [PATCH 4/5] lei_input: more common code between <mark|convert|import> Eric Wong
2021-03-23 11:48 ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
To: meta
Since each lei->event_step can change the directory of
lei-daemon, we need to ensure the lei_store runs in a
directory that is stable.
---
lib/PublicInbox/LEI.pm | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17ca637e..d3ac19b2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -453,6 +453,7 @@ sub _lei_atfork_child {
my ($self, $persist) = @_;
# we need to explicitly close things which are on stack
if ($persist) {
+ chdir '/' or die "chdir(/): $!";
my @io = delete @$self{qw(0 1 2 sock)};
unless ($self->{oneshot}) {
close($_) for @io;
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 4/5] lei_input: more common code between <mark|convert|import>
2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
` (2 preceding siblings ...)
2021-03-23 11:48 ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
2021-03-23 11:48 ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
To: meta
"lei convert" is actually a bit of the odd one, since
it uses lei2mail for auth, unlike the others.
---
lib/PublicInbox/LeiConvert.pm | 47 ++++++----------------
lib/PublicInbox/LeiImport.pm | 74 +++++++++--------------------------
lib/PublicInbox/LeiInput.pm | 45 +++++++++++++++++++--
lib/PublicInbox/LeiMark.pm | 57 +++++----------------------
4 files changed, 80 insertions(+), 143 deletions(-)
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 49e2b7af..bc86fe25 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -6,64 +6,39 @@ package PublicInbox::LeiConvert;
use strict;
use v5.10.1;
use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::Eml;
-use PublicInbox::LeiStore;
use PublicInbox::LeiOverview;
-sub mbox_cb { # MboxReader callback used by PublicInbox::LeiInput::input_fh
+# /^input_/ subs are used by PublicInbox::LeiInput
+
+sub input_mbox_cb { # MboxReader callback
my ($eml, $self) = @_;
my $kw = PublicInbox::MboxReader::mbox_keywords($eml);
$eml->header_set($_) for qw(Status X-Status);
$self->{wcb}->(undef, { kw => $kw }, $eml);
}
-sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
my ($self, $eml) = @_;
- $self->{wcb}->(undef, { kw => [] }, $eml);
+ $self->{wcb}->(undef, {}, $eml);
}
-sub net_cb { # callback for ->imap_each, ->nntp_each
+sub input_net_cb { # callback for ->imap_each, ->nntp_each
my (undef, undef, $kw, $eml, $self) = @_; # @_[0,1]: url + uid ignored
$self->{wcb}->(undef, { kw => $kw }, $eml);
}
-sub mdir_cb {
- my ($f, $kw, $eml, $self) = @_;
+sub input_maildir_cb {
+ my (undef, $kw, $eml, $self) = @_; # $_[0] $filename ignored
$self->{wcb}->(undef, { kw => $kw }, $eml);
}
sub do_convert { # via wq_do
my ($self) = @_;
- my $lei = $self->{lei};
- my $ifmt = $lei->{opt}->{'in-format'};
- if (my $stdin = delete $self->{0}) {
- $self->input_fh($ifmt, $stdin, '<stdin>');
- }
+ $self->input_stdin;
for my $input (@{$self->{inputs}}) {
- my $ifmt = lc($ifmt // '');
- if ($input =~ m!\Aimaps?://!) {
- $lei->{net}->imap_each($input, \&net_cb, $self);
- next;
- } elsif ($input =~ m!\A(?:nntps?|s?news)://!) {
- $lei->{net}->nntp_each($input, \&net_cb, $self);
- next;
- } elsif ($input =~ s!\A([a-z0-9]+):!!i) {
- $ifmt = lc $1;
- }
- if (-f $input) {
- my $m = $lei->{opt}->{'lock'} //
- ($ifmt eq 'eml' ? ['none'] :
- PublicInbox::MboxLock->defaults);
- my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
- $self->input_fh($ifmt, $mbl->{fh}, $input);
- } elsif (-d _) {
- PublicInbox::MdirReader::maildir_each_eml($input,
- \&mdir_cb, $self);
- } else {
- die "BUG: $input unhandled"; # should've failed earlier
- }
+ $self->input_path_url($input);
}
- delete $lei->{1};
+ delete $self->{lei}->{1};
delete $self->{wcb}; # commit
}
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 21af28a3..991c84f2 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -6,23 +6,33 @@ package PublicInbox::LeiImport;
use strict;
use v5.10.1;
use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::Eml;
-use PublicInbox::PktOp qw(pkt_do);
-sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+# /^input_/ subs are used by (or override) PublicInbox::LeiInput superclass
+
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
my ($self, $eml, $vmd) = @_;
my $xoids = $self->{lei}->{ale}->xoids_for($eml);
$self->{lei}->{sto}->ipc_do('set_eml', $eml, $vmd, $xoids);
}
-sub mbox_cb { # MboxReader callback used by PublicInbox::LeiInput::input_fh
+sub input_mbox_cb { # MboxReader callback
my ($eml, $self) = @_;
my $vmd;
if ($self->{-import_kw}) {
my $kw = PublicInbox::MboxReader::mbox_keywords($eml);
$vmd = { kw => $kw } if scalar(@$kw);
}
- eml_cb($self, $eml, $vmd);
+ input_eml_cb($self, $eml, $vmd);
+}
+
+sub input_maildir_cb { # maildir_each_eml cb
+ my ($f, $kw, $eml, $self) = @_;
+ input_eml_cb($self, $eml, $self->{-import_kw} ? { kw => $kw } : undef);
+}
+
+sub input_net_cb { # imap_each, nntp_each cb
+ my ($url, $uid, $kw, $eml, $self) = @_;
+ input_eml_cb($self, $eml, $self->{-import_kw} ? { kw => $kw } : undef);
}
sub import_done_wait { # dwaitpid callback
@@ -43,7 +53,7 @@ sub import_done { # EOF callback for main daemon
sub net_merge_complete { # callback used by LeiAuth
my ($self) = @_;
for my $input (@{$self->{inputs}}) {
- $self->wq_io_do('import_path_url', [], $input);
+ $self->wq_io_do('input_path_url', [], $input);
}
$self->wq_close(1);
}
@@ -63,7 +73,8 @@ sub import_start {
$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
$self->{-wq_nr_workers} = $j // 1; # locked
my $op = $lei->workers_start($self, 'lei_import', undef, $ops);
- $self->wq_io_do('import_stdin', []) if $self->{0};
+ $lei->{imp} = $self;
+ $self->wq_io_do('input_stdin', []) if $self->{0};
net_merge_complete($self) unless $lei->{auth};
while ($op && $op->{sock}) { $op->event_step }
}
@@ -78,55 +89,6 @@ sub lei_import { # the main "lei import" method
import_start($lei);
}
-sub _import_maildir { # maildir_each_eml cb
- my ($f, $kw, $eml, $sto, $set_kw) = @_;
- $sto->ipc_do('set_eml', $eml, $set_kw ? { kw => $kw }: ());
-}
-
-sub _import_net { # imap_each, nntp_each cb
- my ($url, $uid, $kw, $eml, $sto, $set_kw) = @_;
- $sto->ipc_do('set_eml', $eml, $set_kw ? { kw => $kw } : ());
-}
-
-sub import_path_url {
- my ($self, $input) = @_;
- my $lei = $self->{lei};
- my $ifmt = lc($lei->{opt}->{'in-format'} // '');
- # TODO auto-detect?
- if ($input =~ m!\Aimaps?://!i) {
- $lei->{net}->imap_each($input, \&_import_net, $lei->{sto},
- $self->{-import_kw});
- return;
- } elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
- $lei->{net}->nntp_each($input, \&_import_net, $lei->{sto}, 0);
- return;
- } elsif ($input =~ s!\A([a-z0-9]+):!!i) {
- $ifmt = lc $1;
- }
- if (-f $input) {
- my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
- PublicInbox::MboxLock->defaults);
- my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
- $self->input_fh($ifmt, $mbl->{fh}, $input);
- } elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
- return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
-$input appears to a be a maildir, not $ifmt
-EOM
- PublicInbox::MdirReader::maildir_each_eml($input,
- \&_import_maildir,
- $lei->{sto}, $self->{-import_kw});
- } else {
- $lei->fail("$input unsupported (TODO)");
- }
-}
-
-sub import_stdin {
- my ($self) = @_;
- my $lei = $self->{lei};
- my $in = delete $self->{0};
- $self->input_fh($lei->{opt}->{'in-format'}, $in, '<stdin>');
-}
-
no warnings 'once';
*ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 2a4968d4..b059ecda 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -24,10 +24,11 @@ sub check_input_format ($;$) {
}
# import a single file handle of $name
-# Subclass must define ->eml_cb and ->mbox_cb
+# Subclass must define ->input_eml_cb and ->input_mbox_cb
sub input_fh {
my ($self, $ifmt, $fh, $name, @args) = @_;
if ($ifmt eq 'eml') {
+ require PublicInbox::Eml;
my $buf = do { local $/; <$fh> } //
return $self->{lei}->child_error(1 << 8, <<"");
error reading $name: $!
@@ -36,12 +37,50 @@ error reading $name: $!
# but no Content-Length or "From " escaping.
# "git format-patch" also generates such files by default.
$buf =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
- $self->eml_cb(PublicInbox::Eml->new(\$buf), @args);
+ $self->input_eml_cb(PublicInbox::Eml->new(\$buf), @args);
} else {
# prepare_inputs already validated $ifmt
my $cb = PublicInbox::MboxReader->reads($ifmt) //
die "BUG: bad fmt=$ifmt";
- $cb->(undef, $fh, $self->can('mbox_cb'), $self, @args);
+ $cb->(undef, $fh, $self->can('input_mbox_cb'), $self, @args);
+ }
+}
+
+sub input_stdin {
+ my ($self) = @_;
+ my $in = delete $self->{0} or return;
+ $self->input_fh($self->{lei}->{opt}->{'in-format'}, $in, '<stdin>');
+}
+
+sub input_path_url {
+ my ($self, $input, @args) = @_;
+ my $lei = $self->{lei};
+ my $ifmt = lc($lei->{opt}->{'in-format'} // '');
+ # TODO auto-detect?
+ if ($input =~ m!\Aimaps?://!i) {
+ $lei->{net}->imap_each($input, $self->can('input_net_cb'),
+ $self, @args);
+ return;
+ } elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
+ $lei->{net}->nntp_each($input, $self->can('input_net_cb'),
+ $self, @args);
+ return;
+ }
+ $input =~ s!\A([a-z0-9]+):!!i and $ifmt = lc($1);
+ if (-f $input) {
+ my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
+ PublicInbox::MboxLock->defaults);
+ my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
+ $self->input_fh($ifmt, $mbl->{fh}, $input, @args);
+ } elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
+ return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
+$input appears to a be a maildir, not $ifmt
+EOM
+ PublicInbox::MdirReader::maildir_each_eml($input,
+ $self->can('input_maildir_cb'),
+ $self, @args);
+ } else {
+ $lei->fail("$input unsupported (TODO)");
}
}
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
index 7b50aa51..3b5e6c2c 100644
--- a/lib/PublicInbox/LeiMark.pm
+++ b/lib/PublicInbox/LeiMark.pm
@@ -6,8 +6,6 @@ package PublicInbox::LeiMark;
use strict;
use v5.10.1;
use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::Eml;
-use PublicInbox::PktOp qw(pkt_do);
# JMAP RFC 8621 4.1.1
my @KW = (qw(seen answered flagged draft), # system
@@ -34,7 +32,6 @@ my %ERR = (
`$kw' is not one of: `seen', `flagged', `answered', `draft'
`junk', `notjunk', `phishing' or `forwarded'
EOM
-
}
);
@@ -60,7 +57,7 @@ sub vmd_mod_extract {
$vmd_mod;
}
-sub eml_cb { # used by PublicInbox::LeiInput::input_fh
+sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
my ($self, $eml) = @_;
if (my $xoids = $self->{lei}->{ale}->xoids_for($eml)) {
$self->{lei}->{sto}->ipc_do('update_xvmd', $xoids,
@@ -70,7 +67,7 @@ sub eml_cb { # used by PublicInbox::LeiInput::input_fh
}
}
-sub mbox_cb { eml_cb($_[1], $_[0]) } # used by PublicInbox::LeiInput::input_fh
+sub input_mbox_cb { input_eml_cb($_[1], $_[0]) }
sub mark_done_wait { # dwaitpid callback
my ($arg, $pid) = @_;
@@ -90,19 +87,19 @@ sub mark_done { # EOF callback for main daemon
sub net_merge_complete { # callback used by LeiAuth
my ($self) = @_;
for my $input (@{$self->{inputs}}) {
- $self->wq_io_do('mark_path_url', [], $input);
+ $self->wq_io_do('input_path_url', [], $input);
}
$self->wq_close(1);
}
-sub _mark_maildir { # maildir_each_eml cb
+sub input_maildir_cb { # maildir_each_eml cb
my ($f, $kw, $eml, $self) = @_;
- eml_cb($self, $eml);
+ input_eml_cb($self, $eml);
}
-sub _mark_net { # imap_each, nntp_each cb
+sub input_net_cb { # imap_each, nntp_each cb
my ($url, $uid, $kw, $eml, $self) = @_;
- eml_cb($self, $eml)
+ input_eml_cb($self, $eml);
}
sub lei_mark { # the "lei mark" method
@@ -120,48 +117,12 @@ sub lei_mark { # the "lei mark" method
$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
$self->{vmd_mod} = $vmd_mod;
my $op = $lei->workers_start($self, 'lei_mark', 1, $ops);
- $self->wq_io_do('mark_stdin', []) if $self->{0};
+ $lei->{mark} = $self;
+ $self->wq_io_do('input_stdin', []) if $self->{0};
net_merge_complete($self) unless $lei->{auth};
while ($op && $op->{sock}) { $op->event_step }
}
-sub mark_path_url {
- my ($self, $input) = @_;
- my $lei = $self->{lei};
- my $ifmt = lc($lei->{opt}->{'in-format'} // '');
- # TODO auto-detect?
- if ($input =~ m!\Aimaps?://!i) {
- $lei->{net}->imap_each($input, \&_mark_net, $self);
- return;
- } elsif ($input =~ m!\A(?:nntps?|s?news)://!i) {
- $lei->{net}->nntp_each($input, \&_mark_net, $self);
- return;
- } elsif ($input =~ s!\A([a-z0-9]+):!!i) {
- $ifmt = lc $1;
- }
- if (-f $input) {
- my $m = $lei->{opt}->{'lock'} // ($ifmt eq 'eml' ? ['none'] :
- PublicInbox::MboxLock->defaults);
- my $mbl = PublicInbox::MboxLock->acq($input, 0, $m);
- $self->input_fh($ifmt, $mbl->{fh}, $input);
- } elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
- return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
-$input appears to a be a maildir, not $ifmt
-EOM
- PublicInbox::MdirReader::maildir_each_eml($input,
- \&_mark_maildir, $self);
- } else {
- $lei->fail("$input unsupported (TODO)");
- }
-}
-
-sub mark_stdin {
- my ($self) = @_;
- my $lei = $self->{lei};
- my $in = delete $self->{0};
- $self->input_fh($lei->{opt}->{'in-format'}, $in, '<stdin>');
-}
-
sub note_missing {
my ($self) = @_;
$self->{lei}->child_error(1 << 8) if $self->{missing};
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 5/5] lei: improve management around short-lived workers
2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
` (3 preceding siblings ...)
2021-03-23 11:48 ` [PATCH 4/5] lei_input: more common code between <mark|convert|import> Eric Wong
@ 2021-03-23 11:48 ` Eric Wong
4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-23 11:48 UTC (permalink / raw)
To: meta
Instead of creating a short-lived circular reference,
ensure they don't exist in the first place.
Note the following changes to hold an extra ref to $sto:
- $self->_lei_store(1)->write_prepare($self);
+ my $sto = $self->_lei_store(1);
+ $sto->write_prepare($self);
I'm not a perlguts expert, but I actually wanted to switch
to the one-line version for LeiImport, but xt/lei-auth-fail.t
was getting stuck for some reason. It seems the extra ref
to the LeiStore ($sto) object is necessary.
---
lib/PublicInbox/LEI.pm | 1 -
lib/PublicInbox/LeiConvert.pm | 3 ++-
lib/PublicInbox/LeiExternal.pm | 3 ++-
lib/PublicInbox/LeiImport.pm | 20 +++++++-------------
lib/PublicInbox/LeiMark.pm | 2 +-
lib/PublicInbox/LeiMirror.pm | 2 +-
lib/PublicInbox/LeiP2q.pm | 5 +++--
lib/PublicInbox/LeiQuery.pm | 2 +-
8 files changed, 17 insertions(+), 21 deletions(-)
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index d3ac19b2..8cbaac01 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -462,7 +462,6 @@ sub _lei_atfork_child {
open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
delete $self->{0};
}
- delete @$self{qw(cnv mark imp)};
for (delete @$self{qw(3 old_1 au_done)}) {
close($_) if defined($_);
}
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index bc86fe25..0cc65108 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -46,13 +46,14 @@ sub lei_convert { # the main "lei convert" method
my ($lei, @inputs) = @_;
$lei->{opt}->{kw} //= 1;
$lei->{opt}->{dedupe} //= 'none';
- my $self = $lei->{cnv} = bless {}, __PACKAGE__;
+ my $self = bless {}, __PACKAGE__;
my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
$lei->{l2m} or return
$lei->fail("output not specified or is not a mail destination");
$lei->{opt}->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
$self->prepare_inputs($lei, \@inputs) or return;
my $op = $lei->workers_start($self, 'lei_convert', 1);
+ $lei->{cnv} = $self;
$self->wq_io_do('do_convert', []);
$self->wq_close(1);
while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 9a555831..56d6ef39 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -144,7 +144,8 @@ sub add_external_finish {
sub lei_add_external {
my ($self, $location) = @_;
- $self->_lei_store(1)->write_prepare($self);
+ my $sto = $self->_lei_store(1);
+ $sto->write_prepare($self);
my $opt = $self->{opt};
my $mirror = $opt->{mirror} // do {
my @fail;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 991c84f2..9da6b7f9 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -58,9 +58,13 @@ sub net_merge_complete { # callback used by LeiAuth
$self->wq_close(1);
}
-sub import_start {
- my ($lei) = @_;
- my $self = $lei->{imp};
+sub lei_import { # the main "lei import" method
+ my ($lei, @inputs) = @_;
+ my $sto = $lei->_lei_store(1);
+ $sto->write_prepare($lei);
+ my $self = bless {}, __PACKAGE__;
+ $self->{-import_kw} = $lei->{opt}->{kw} // 1;
+ $self->prepare_inputs($lei, \@inputs) or return;
$lei->ale; # initialize for workers to read
my $j = $lei->{opt}->{jobs} // scalar(@{$self->{inputs}}) || 1;
if (my $net = $lei->{net}) {
@@ -79,16 +83,6 @@ sub import_start {
while ($op && $op->{sock}) { $op->event_step }
}
-sub lei_import { # the main "lei import" method
- my ($lei, @inputs) = @_;
- my $sto = $lei->_lei_store(1);
- $sto->write_prepare($lei);
- my $self = $lei->{imp} = bless {}, __PACKAGE__;
- $self->{-import_kw} = $lei->{opt}->{kw} // 1;
- $self->prepare_inputs($lei, \@inputs) or return;
- import_start($lei);
-}
-
no warnings 'once';
*ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
diff --git a/lib/PublicInbox/LeiMark.pm b/lib/PublicInbox/LeiMark.pm
index 3b5e6c2c..9d77f4b4 100644
--- a/lib/PublicInbox/LeiMark.pm
+++ b/lib/PublicInbox/LeiMark.pm
@@ -105,8 +105,8 @@ sub input_net_cb { # imap_each, nntp_each cb
sub lei_mark { # the "lei mark" method
my ($lei, @argv) = @_;
my $sto = $lei->_lei_store(1);
- my $self = $lei->{mark} = bless { missing => 0 }, __PACKAGE__;
$sto->write_prepare($lei);
+ my $self = bless { missing => 0 }, __PACKAGE__;
$lei->ale; # refresh and prepare
my $vmd_mod = vmd_mod_extract(\@argv);
return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index c916f2d0..6e62625d 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -269,7 +269,6 @@ sub do_mirror { # via wq_io_do
sub start {
my ($cls, $lei, $src, $dst) = @_;
my $self = bless { lei => $lei, src => $src, dst => $dst }, $cls;
- $lei->{mrr} = $self;
if ($src =~ m!https?://!) {
require URI;
require PublicInbox::LeiCurl;
@@ -281,6 +280,7 @@ sub start {
my $op = $lei->workers_start($self, 'lei_mirror', 1, {
'' => [ \&mirror_done, $lei ]
});
+ $lei->{mrr} = $self;
$self->wq_io_do('do_mirror', []);
$self->wq_close(1);
while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiP2q.pm b/lib/PublicInbox/LeiP2q.pm
index 0f7ffb5f..fda055fe 100644
--- a/lib/PublicInbox/LeiP2q.pm
+++ b/lib/PublicInbox/LeiP2q.pm
@@ -176,13 +176,14 @@ sub do_p2q { # via wq_do
sub lei_p2q { # the "lei patch-to-query" entry point
my ($lei, $input) = @_;
- my $self = $lei->{p2q} = bless {}, __PACKAGE__;
+ my $self = bless {}, __PACKAGE__;
if ($lei->{opt}->{stdin}) {
$self->{0} = delete $lei->{0}; # guard from _lei_atfork_child
} else {
$self->{input} = $input;
}
- my $op = $lei->workers_start($self, 'lei patch2query', 1);
+ my $op = $lei->workers_start($self, 'lei_p2q', 1);
+ $lei->{p2q} = $self;
$self->wq_io_do('do_p2q', []);
$self->wq_close(1);
while ($op && $op->{sock}) { $op->event_step }
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 148e8524..84996e7e 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -50,11 +50,11 @@ sub lei_q {
# --local is enabled by default unless --only is used
# we'll allow "--only $LOCATION --local"
my $sto = $self->_lei_store(1);
- my $lse = $sto->search;
if (($opt->{'import-remote'} //= 1) |
(($opt->{'import-before'} //= \1) ? 1 : 0)) {
$sto->write_prepare($self);
}
+ my $lse = $sto->search;
if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
$lxs->prepare_external($lse);
}
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-03-23 11:48 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-23 11:48 [PATCH 0/5] lei: more input + worker-related stuff Eric Wong
2021-03-23 11:48 ` [PATCH 1/5] net_reader: nntp_each: pass keywords as `undef' Eric Wong
2021-03-23 11:48 ` [PATCH 2/5] test_common: check lei/errors.log Eric Wong
2021-03-23 11:48 ` [PATCH 3/5] lei: persistent workers (lei_store) run in / Eric Wong
2021-03-23 11:48 ` [PATCH 4/5] lei_input: more common code between <mark|convert|import> Eric Wong
2021-03-23 11:48 ` [PATCH 5/5] lei: improve management around short-lived workers Eric Wong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).