unofficial mirror of meta@public-inbox.org
* [PATCH/RFC 0/7] lei - Local Email Interface skeleton
@ 2020-12-15 11:47 63% Eric Wong
  2020-12-15 11:47 28% ` [RFC 3/7] lei: FD-passing and IPC basics Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2020-12-15 11:47 UTC (permalink / raw)
  To: meta

Patches 1 and 2 are boring cleanups.

The most important is 4/7 which features data structures
for a proposed command set.  Hopefully the command-names
and 1-line descriptions are helpful.

Comments from (potential) users appreciated, especially about 4/7.


I decided to take care of patch 3/7 (FD-passing) early on
because startup latency sucks.

I never used notmuch, but this will feature saved searches (aka
"named queries").  Otherwise, the query subcommand will probably
operate like mairix and dump the results to a
Maildir/mbox/etc...

Patch 5/7 adds read/write (but not query) support for
keywords (e.g. `seen', `draft', ...).

And a couple more cleanups.

lei will have its own writable git storage on top of extindex,
but will be able to do read-only queries against extinbox
(publicinbox || extindex) sources.

Eric Wong (7):
  daemon: support --daemonize without Net::Server::Daemonize
  daemon: simplify fork() failure checks
  lei: FD-passing and IPC basics
  lei: proposed command-listing and options
  lei_store: local storage for Local Email Interface
  tests: more common JSON module loading
  lei: use spawn (vfork + execve) for lazy start

 MANIFEST                          |   6 +
 lib/PublicInbox/Daemon.pm         |  26 +-
 lib/PublicInbox/ExtSearch.pm      |   4 +-
 lib/PublicInbox/ExtSearchIdx.pm   |  35 ++-
 lib/PublicInbox/Import.pm         |   4 +
 lib/PublicInbox/LeiDaemon.pm      | 449 ++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiSearch.pm      |  40 +++
 lib/PublicInbox/LeiStore.pm       | 197 +++++++++++++
 lib/PublicInbox/ManifestJsGz.pm   |   2 +-
 lib/PublicInbox/OverIdx.pm        |  10 +
 lib/PublicInbox/SearchIdx.pm      |  47 +++-
 lib/PublicInbox/SearchIdxShard.pm |  33 +++
 lib/PublicInbox/TestCommon.pm     |   4 +
 lib/PublicInbox/V2Writable.pm     |   2 +-
 script/lei                        |  64 +++++
 t/extsearch.t                     |   3 +-
 t/lei.t                           |  79 ++++++
 t/lei_store.t                     |  74 +++++
 t/www_listing.t                   |   8 +-
 19 files changed, 1055 insertions(+), 32 deletions(-)
 create mode 100644 lib/PublicInbox/LeiDaemon.pm
 create mode 100644 lib/PublicInbox/LeiSearch.pm
 create mode 100644 lib/PublicInbox/LeiStore.pm
 create mode 100755 script/lei
 create mode 100644 t/lei.t
 create mode 100644 t/lei_store.t


^ permalink raw reply	[relevance 63%]

* [RFC 7/7] lei: use spawn (vfork + execve) for lazy start
  2020-12-15 11:47 63% [PATCH/RFC 0/7] lei - Local Email Interface skeleton Eric Wong
  2020-12-15 11:47 28% ` [RFC 3/7] lei: FD-passing and IPC basics Eric Wong
  2020-12-15 11:47 43% ` [RFC 4/7] lei: proposed command-listing and options Eric Wong
@ 2020-12-15 11:47 61% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-15 11:47 UTC (permalink / raw)
  To: meta

This allows us to rely on FD_CLOEXEC being set on pipes
from prove(1), so forgetting `daemon-stop' won't cause
tests to hang.

Unfortunately, daemon tests will be slower with this.
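
A minimal sketch of the close-on-exec behavior this relies on, not
code from this series.  Perl sets FD_CLOEXEC on descriptors numbered
above $^F (2 by default), so pipes opened by a Perl parent are closed
across execve but survive a plain fork.  The helper name `is_cloexec'
is hypothetical:

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD FD_CLOEXEC);

# Return 1 if the close-on-exec flag is set on a filehandle, else 0.
sub is_cloexec {
	my ($fh) = @_;
	my $flags = fcntl($fh, F_GETFD, 0) // die "fcntl(F_GETFD): $!";
	($flags & FD_CLOEXEC) ? 1 : 0;
}

pipe(my ($r, $w)) or die "pipe: $!";
# A child created with fork() alone keeps both ends open; a child
# that goes through execve() only keeps descriptors without the flag.
printf("read end:  fd=%d cloexec=%d\n", fileno($r), is_cloexec($r));
printf("write end: fd=%d cloexec=%d\n", fileno($w), is_cloexec($w));
```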
---
 lib/PublicInbox/LeiDaemon.pm | 12 +++++-------
 script/lei                   | 14 ++++++++++----
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index 20ff0758..2f614ba4 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -335,29 +335,27 @@ sub accept_dispatch { # Listener {post_accept} callback
 sub noop {}
 
 # lei(1) calls this when it can't connect
-sub lazy_start ($$) {
+sub lazy_start {
 	my ($path, $err) = @_;
 	if ($err == ECONNREFUSED) {
 		unlink($path) or die "unlink($path): $!";
 	} elsif ($err != ENOENT) {
 		die "connect($path): $!";
 	}
+	require IO::FDPass;
 	my $umask = umask(077) // die("umask(077): $!");
 	my $l = IO::Socket::UNIX->new(Local => $path,
 					Listen => 1024,
 					Type => SOCK_STREAM) or
 		$err = $!;
 	umask($umask) or die("umask(restore): $!");
-	$l or return $err;
+	$l or return die "bind($path): $err";
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
 	my $oldset = PublicInbox::Sigfd::block_signals();
 	my $pid = fork // die "fork: $!";
-	if ($pid) {
-		PublicInbox::Sigfd::sig_setmask($oldset);
-		return; # client will connect to $path
-	}
+	return if $pid;
 	openlog($path, 'pid', 'user');
 	local $SIG{__DIE__} = sub {
 		syslog('crit', "@_");
@@ -371,7 +369,7 @@ sub lazy_start ($$) {
 	open STDERR, '>&STDIN' or die "redirect stderr failed: $!\n";
 	setsid();
 	$pid = fork // die "fork: $!";
-	exit if $pid;
+	return if $pid;
 	$0 = "lei-daemon $path";
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
diff --git a/script/lei b/script/lei
index 1b5af3a1..637c1951 100755
--- a/script/lei
+++ b/script/lei
@@ -21,13 +21,19 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 	};
 	my $sock = IO::Socket::UNIX->new(Peer => $path, Type => SOCK_STREAM);
 	unless ($sock) { # start the daemon if not started
-		my $err = $!;
-		require PublicInbox::LeiDaemon;
-		$err = PublicInbox::LeiDaemon::lazy_start($path, $err);
+		my $err = $! + 0;
+		my $env = { PERL5LIB => join(':', @INC) };
+		my $cmd = [ $^X, qw[-MPublicInbox::LeiDaemon
+			-E PublicInbox::LeiDaemon::lazy_start(@ARGV)],
+			$path, $err ];
+		require PublicInbox::Spawn;
+		waitpid(PublicInbox::Spawn::spawn($cmd, $env), 0);
+		warn "lei-daemon exited with \$?=$?\n" if $?;
+
 		# try connecting again anyways, unlink+bind may be racy
 		$sock = IO::Socket::UNIX->new(Peer => $path,
 						Type => SOCK_STREAM) // die
-			"connect($path): $! (bind($path): $err)";
+			"connect($path): $! (after attempted daemon start)";
 	}
 	my $pwd = $ENV{PWD};
 	my $cwd = cwd();

^ permalink raw reply related	[relevance 61%]

* [RFC 4/7] lei: proposed command-listing and options
  2020-12-15 11:47 63% [PATCH/RFC 0/7] lei - Local Email Interface skeleton Eric Wong
  2020-12-15 11:47 28% ` [RFC 3/7] lei: FD-passing and IPC basics Eric Wong
@ 2020-12-15 11:47 43% ` Eric Wong
  2020-12-26 11:26 71%   ` "extinbox" term - was: [RFC 4/7] lei: proposed command-listing Eric Wong
  2020-12-15 11:47 61% ` [RFC 7/7] lei: use spawn (vfork + execve) for lazy start Eric Wong
  2 siblings, 1 reply; 200+ results
From: Eric Wong @ 2020-12-15 11:47 UTC (permalink / raw)
  To: meta

In an attempt to ensure a coherent UI/UX, we'll try to document
all proposed commands and options in one place for easy reference.
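
As a rough illustration of why a single table helps, here is a sketch
that generates usage lines from a two-entry subset of the proposed
command table.  The `usage_lines' helper and the output layout are
assumptions for illustration only; this patch does not generate help
yet:

```perl
use strict;
use warnings;

# Hypothetical two-entry subset in the same shape as the proposed table:
# command => [ positional_args, 1-line description, option specs... ]
my %cmd = (
	'query' => [ 'SEARCH-TERMS...', 'search for messages matching terms',
		qw(limit|n=i reverse|r) ],
	'daemon-pid' => [ undef, 'show the PID of the lei-daemon' ],
);

# Build one usage entry per command from the table (illustration only).
sub usage_lines {
	my ($cmd) = @_;
	map {
		my ($proto, $desc, @spec) = @{$cmd->{$_}};
		my $line = "lei $_";
		$line .= ' [OPTIONS]' if @spec;
		$line .= " $proto" if defined $proto;
		"$line\n\t$desc\n";
	} sort keys %$cmd;
}

print usage_lines(\%cmd);
```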
---
 lib/PublicInbox/LeiDaemon.pm | 148 +++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index ae40b3a6..89434cb8 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -23,6 +23,149 @@ our $quit = sub { exit(shift // 0) };
 my $glp = Getopt::Long::Parser->new;
 $glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
 
+# TBD: this is a documentation mechanism to show a subcommand
+# (may) pass options through to another command:
+sub pass_through { () }
+
+# TODO: generate shell completion + help using %CMD and %OPTDESC
+# command => [ positional_args, 1-line description, Getopt::Long option spec ]
+our %CMD = ( # sorted in order of importance/use:
+'query' => [ 'SEARCH-TERMS...', 'search for messages matching terms', qw(
+	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
+	limit|n=i sort|s=s reverse|r offset=i remote local! extinbox!
+	since|after=s until|before=s) ],
+
+'show' => [ '{MID|OID}', 'show a given object (Message-ID or object ID)',
+	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
+	pass_through('git show') ],
+
+'add-extinbox' => [ 'URL-OR-PATHNAME',
+	'add/set priority of a publicinbox|extindex for extra matches',
+	qw(prio=i) ],
+'ls-extinbox' => [ '[FILTER]', 'list publicinbox|extindex sources',
+	qw(format|f=s z local remote) ],
+'forget-extinbox' => [ '{URL-OR-PATHNAME|--prune}',
+	'exclude further results from a publicinbox|extindex',
+	qw(prune) ],
+
+'ls-query' => [ '[FILTER]', 'list saved search queries',
+		qw(name-only format|f=s z) ],
+'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
+'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],
+
+'plonk' => [ '{--thread|--from=IDENT}',
+	'exclude mail matching From: or thread from non-Message-ID searches',
+	qw(thread|t from|f=s mid=s oid=s) ],
+'mark' => [ 'MESSAGE-FLAGS', 'set/unset flags on message(s) from stdin',
+	qw(stdin| oid=s exact by-mid|mid:s) ],
+'forget' => [ '--stdin', 'exclude message(s) on stdin from query results',
+	qw(stdin| oid=s  exact by-mid|mid:s) ],
+
+'purge-mailsource' => [ '{URL-OR-PATHNAME|--all}',
+	'remove imported messages from IMAP, Maildirs, and MH',
+	qw(exact! all jobs:i indexed) ],
+
+# code repos are used for `show' to solve blobs from patch mails
+'add-coderepo' => [ 'PATHNAME', 'add or set priority of a git code repo',
+	qw(prio=i) ],
+'ls-coderepo' => [ '[FILTER]', 'list known code repos', qw(format|f=s z) ],
+'forget-coderepo' => [ 'PATHNAME',
+	'stop using repo to solve blobs from patches',
+	qw(prune) ],
+
+'add-watch' => [ '[URL_OR_PATHNAME]',
+		'watch for new messages and flag changes',
+	qw(import! flags! interval=s recursive|r exclude=s include=s) ],
+'ls-watch' => [ '[FILTER]', 'list active watches with numbers and status',
+		qw(format|f=s z) ],
+'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
+'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
+'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
+	qw(prune) ],
+
+'import' => [ '{URL_OR_PATHNAME|--stdin}',
+	'one-shot import/update from URL or filesystem',
+	qw(stdin| limit|n=i offset=i recursive|r exclude=s include=s flags!),
+	],
+
+'config' => [ '[ANYTHING...]',
+		'git-config(1) wrapper for ~/.config/lei/config',
+		pass_through('git config') ],
+'init' => [ '[PATHNAME]',
+	'initialize storage, default: ~/.local/share/lei/store',
+	qw(quiet|q) ],
+'daemon-stop' => [ undef, 'stop the lei-daemon' ],
+'daemon-pid' => [ undef, 'show the PID of the lei-daemon' ],
+'help' => [ '[SUBCOMMAND]', 'show help' ],
+
+# XXX do we need this?
+# 'git' => [ '[ANYTHING...]', 'git(1) wrapper', pass_through('git') ],
+
+'reorder-local-store-and-break-history' => [ '[REFNAME]',
+	'rewrite git history in an attempt to improve compression',
+	'gc!' ]
+); # @CMD
+
+# switch descriptions, try to keep consistent across commands
+# $spec: Getopt::Long option specification
+# $spec => [@ALLOWED_VALUES (default is first), $description],
+# $spec => $description
+# "$SUB_COMMAND TAB $spec" => as above
+my $stdin_formats = [ qw(auto raw mboxrd mboxcl2 mboxcl mboxo),
+		'specify message input format' ];
+my $ls_format = [ qw(plain json null), 'listing output format' ];
+
+my %OPTDESC = (
+'quiet|q' => 'be quiet',
+'solve!' => 'do not attempt to reconstruct blobs from emails',
+'save-as=s' => ['NAME', 'save search terms under the given name'],
+
+'type=s' => [qw(any mid git), 'disambiguate type' ],
+
+'dedupe|d=s' => [qw(content oid mid), 'deduplication strategy'],
+'show	thread|t' => 'display entire thread a message belongs to',
+'query	thread|t' =>
+	'return message in the same thread as the actual match(es)',
+'augment|a' => 'augment --output destination instead of clobbering',
+
+'output|o=s' => "destination (e.g. `/path/to/Maildir', or `-' for stdout)",
+
+'show	format|f=s' => [ qw(plain raw html mboxrd mboxcl2 mboxcl),
+			'message/object output format' ],
+'mark	format|f=s' => $stdin_formats,
+'forget	format|f=s' => $stdin_formats,
+'query	format|f=s' => [qw(maildir mboxrd mboxcl2 mboxcl html oid),
+		'specify output format, default: depends on --output'],
+'ls-query	format|f=s' => $ls_format,
+'ls-extinbox	format|f=s' => $ls_format,
+
+'limit|n=i' => 'integer limit on number of matches (default: 10000)',
+'offset=i' => 'search result offset (default: 0)',
+
+'sort|s=s@' => [qw(internaldate date relevance docid),
+		"order of results (`--output'-dependent)"],
+
+'prio=i' => 'priority of query source',
+
+'local' => 'limit operations to the local filesystem',
+'local!' => 'exclude results from the local filesystem',
+'remote' => 'limit operations to those requiring network access',
+'remote!' => 'prevent operations requiring network access',
+
+'mid=s' => 'specify the Message-ID of a message',
+'oid=s' => 'specify the git object ID of a message',
+
+'recursive|r' => 'scan directories/mailboxes/newsgroups recursively',
+'exclude=s' => 'exclude mailboxes/newsgroups based on pattern',
+'include=s' => 'include mailboxes/newsgroups based on pattern',
+
+'exact' => 'operate on exact header matches only',
+'exact!' => 'rely on content match instead of exact header matches',
+
+'by-mid|mid:s' => 'match only by Message-ID, ignoring contents',
+'jobs:i' => 'set parallelism level',
+); # %OPTDESC
+
 sub x_it ($$) { # pronounced "exit"
 	my ($client, $code) = @_;
 	if (my $sig = ($code & 127)) {
@@ -100,6 +243,11 @@ sub dispatch {
 	}
 }
 
+sub lei_init {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and emit($client, 1, "hi\n");
+}
+
 sub lei_daemon_pid {
 	my ($client, $argv) = @_;
 	assert_args($client, $argv, '') and emit($client, 1, "$$\n");

^ permalink raw reply related	[relevance 43%]

* [RFC 3/7] lei: FD-passing and IPC basics
  2020-12-15 11:47 63% [PATCH/RFC 0/7] lei - Local Email Interface skeleton Eric Wong
@ 2020-12-15 11:47 28% ` Eric Wong
  2020-12-15 11:47 43% ` [RFC 4/7] lei: proposed command-listing and options Eric Wong
  2020-12-15 11:47 61% ` [RFC 7/7] lei: use spawn (vfork + execve) for lazy start Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-15 11:47 UTC (permalink / raw)
  To: meta

The start of lei, a Local Email Interface.  It'll support a
daemon via FD passing to avoid startup time penalties if
IO::FDPass is installed, but fall back to a slow one-shot mode
if not.
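
The daemon/one-shot choice boils down to probing for an optional
module.  A small sketch of that probe follows; `have_mod' is a
hypothetical helper name, and script/lei in this patch simply inlines
`eval { require IO::FDPass; 1 }':

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
sub have_mod {
	my ($mod) = @_;
	(my $file = "$mod.pm") =~ s!::!/!g; # Foo::Bar => Foo/Bar.pm
	eval { require $file; 1 } ? 1 : 0;
}

my $mode = have_mod('IO::FDPass') ? 'daemon' : 'oneshot';
print "lei would run in $mode mode\n";
```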

Compared to a traditional socket daemon, FD passing should allow
us to eventually do stuff like run "git show" and still have
proper terminal support for pager and color.
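
Once the three FDs are passed, the client sends its PID, argv and
environment as one NUL-delimited request.  Below is a standalone
round-trip sketch of that framing as it appears in this patch.
`encode_req' and `decode_req' are hypothetical names, and the sketch
assumes a non-empty environment and no NUL bytes in any value:

```perl
use strict;
use warnings;

# Client side: "$PID\0\0>$ARGV_STR\0\0>$ENV_STR\0\0" where every
# environment entry is itself NUL-terminated, giving 3 NULs at the end.
sub encode_req {
	my ($pid, $argv, $env) = @_;
	my $buf = "$pid\0\0>" . join("]\0[", @$argv) . "\0\0>";
	$buf .= "$_=$env->{$_}\0" for sort keys %$env;
	$buf . "\0\0";
}

# Daemon side: strip the terminator, then split the three fields apart.
sub decode_req {
	my ($line) = @_;
	$line =~ s/\0\0\0\z//s or die "request not terminated by 3 NULs";
	my ($pid, $argv, $env) = split(/\0\0>/, $line, 3);
	my %env = map { split(/=/, $_, 2) } split(/\0/, $env);
	($pid, [ split(/\]\0\[/, $argv) ], \%env);
}

my $req = encode_req($$, [ qw(daemon-pid --help) ],
			{ PWD => '/tmp', LANG => 'C' });
my ($pid, $argv, $env) = decode_req($req);
print "pid=$pid argv=@$argv PWD=$env->{PWD}\n";
```

With an empty environment the request would end in only two NULs, so
a real implementation may want an explicit terminator.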
---
 MANIFEST                     |   3 +
 lib/PublicInbox/Daemon.pm    |   6 +-
 lib/PublicInbox/LeiDaemon.pm | 303 +++++++++++++++++++++++++++++++++++
 script/lei                   |  58 +++++++
 t/lei.t                      |  80 +++++++++
 5 files changed, 448 insertions(+), 2 deletions(-)
 create mode 100644 lib/PublicInbox/LeiDaemon.pm
 create mode 100755 script/lei
 create mode 100644 t/lei.t

diff --git a/MANIFEST b/MANIFEST
index ac442606..7536b7c2 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -159,6 +159,7 @@ lib/PublicInbox/InboxIdle.pm
 lib/PublicInbox/InboxWritable.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
+lib/PublicInbox/LeiDaemon.pm
 lib/PublicInbox/Linkify.pm
 lib/PublicInbox/Listener.pm
 lib/PublicInbox/Lock.pm
@@ -226,6 +227,7 @@ sa_config/Makefile
 sa_config/README
 sa_config/root/etc/spamassassin/public-inbox.pre
 sa_config/user/.spamassassin/user_prefs
+script/lei
 script/public-inbox-compact
 script/public-inbox-convert
 script/public-inbox-edit
@@ -316,6 +318,7 @@ t/indexlevels-mirror.t
 t/init.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei.t
 t/linkify.t
 t/main-bin/spamc
 t/mda-mime.eml
diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index a2171535..6b92b60d 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -1,7 +1,9 @@
 # Copyright (C) 2015-2020 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-# contains common daemon code for the httpd, imapd, and nntpd servers.
-# This may be used for read-only IMAP server if we decide to implement it.
+#
+# Contains common daemon code for the httpd, imapd, and nntpd servers
+# and designed for handling thousands of untrusted clients over slow
+# and/or lossy connections.
 package PublicInbox::Daemon;
 use strict;
 use warnings;
diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
new file mode 100644
index 00000000..ae40b3a6
--- /dev/null
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -0,0 +1,303 @@
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Backend for `lei' (local email interface).  Unlike the C10K-oriented
+# PublicInbox::Daemon, this is designed exclusively to handle trusted
+# local clients with read/write access to the FS and use as many
+# system resources as the local user has access to.
+package PublicInbox::LeiDaemon;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::DS);
+use Getopt::Long ();
+use Errno qw(EAGAIN ECONNREFUSED ENOENT);
+use POSIX qw(setsid);
+use IO::Socket::UNIX;
+use IO::Handle ();
+use Sys::Syslog qw(syslog openlog);
+use PublicInbox::Syscall qw($SFD_NONBLOCK EPOLLIN EPOLLONESHOT);
+use PublicInbox::Sigfd;
+use PublicInbox::DS qw(now);
+use PublicInbox::Spawn qw(spawn);
+our $quit = sub { exit(shift // 0) };
+my $glp = Getopt::Long::Parser->new;
+$glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
+
+sub x_it ($$) { # pronounced "exit"
+	my ($client, $code) = @_;
+	if (my $sig = ($code & 127)) {
+		kill($sig, $client->{pid} // $$);
+	} else {
+		$code >>= 8;
+		if (my $sock = $client->{sock}) {
+			say $sock "exit=$code";
+		} else { # for oneshot
+			$quit->($code);
+		}
+	}
+}
+
+sub emit ($$$) {
+	my ($client, $channel, $buf) = @_;
+	print { $client->{$channel} } $buf or warn "print FD[$channel]: $!";
+}
+
+sub fail ($$;$) {
+	my ($client, $buf, $exit_code) = @_;
+	$buf .= "\n" unless $buf =~ /\n\z/s;
+	emit($client, 2, $buf);
+	x_it($client, ($exit_code // 1) << 8);
+	undef;
+}
+
+sub _help ($;$) {
+	my ($client, $channel) = @_;
+	emit($client, $channel //= 1, <<EOF);
+usage: lei COMMAND [OPTIONS]
+
+...
+EOF
+	x_it($client, $channel == 2 ? 1 << 8 : 0); # stderr => failure
+}
+
+sub assert_args ($$$;$@) {
+	my ($client, $argv, $proto, $opt, @spec) = @_;
+	$opt //= {};
+	push @spec, qw(help|h);
+	$glp->getoptionsfromarray($argv, $opt, @spec) or
+		return fail($client, 'bad arguments or options');
+	if ($opt->{help}) {
+		_help($client);
+		undef;
+	} else {
+		my ($nreq, $rest) = split(/;/, $proto);
+		$nreq = (($nreq // '') =~ tr/$/$/);
+		my $argc = scalar(@$argv);
+		my $tot = ($rest // '') eq '@' ? $argc : ($proto =~ tr/$/$/);
+		return 1 if $argc <= $tot && $argc >= $nreq;
+		_help($client, 2);
+		undef
+	}
+}
+
+sub dispatch {
+	my ($client, $cmd, @argv) = @_;
+	local $SIG{__WARN__} = sub { emit($client, 2, "@_") };
+	local $SIG{__DIE__} = 'DEFAULT';
+	if (defined $cmd) {
+		my $func = "lei_$cmd";
+		$func =~ tr/-/_/;
+		if (my $cb = __PACKAGE__->can($func)) {
+			$client->{cmd} = $cmd;
+			$cb->($client, \@argv);
+		} elsif (grep(/\A-/, $cmd, @argv)) {
+			assert_args($client, [ $cmd, @argv ], '');
+		} else {
+			fail($client, "`$cmd' is not an lei command");
+		}
+	} else {
+		_help($client, 2);
+	}
+}
+
+sub lei_daemon_pid {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and emit($client, 1, "$$\n");
+}
+
+sub lei_DBG_pwd {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and
+		emit($client, 1, "$client->{env}->{PWD}\n");
+}
+
+sub lei_DBG_cwd {
+	my ($client, $argv) = @_;
+	require Cwd;
+	assert_args($client, $argv, '') and emit($client, 1, Cwd::cwd()."\n");
+}
+
+sub lei_DBG_false { x_it($_[0], 1 << 8) }
+
+sub lei_daemon_stop {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and $quit->(0);
+}
+
+sub lei_help { _help($_[0]) }
+
+sub reap_exec { # dwaitpid callback
+	my ($client, $pid) = @_;
+	x_it($client, $?);
+}
+
+sub lei_git { # support passing through random git commands
+	my ($client, $argv) = @_;
+	my %opt = map { $_ => $client->{$_} } (0..2);
+	my $pid = spawn(['git', @$argv], $client->{env}, \%opt);
+	PublicInbox::DS::dwaitpid($pid, \&reap_exec, $client);
+}
+
+sub accept_dispatch { # Listener {post_accept} callback
+	my ($sock) = @_; # ignore other
+	$sock->blocking(1);
+	$sock->autoflush(1);
+	my $client = { sock => $sock };
+	vec(my $rin = '', fileno($sock), 1) = 1;
+	# `say $sock' triggers "die" in lei(1)
+	for my $i (0..2) {
+		if (select(my $rout = $rin, undef, undef, 1)) {
+			my $fd = IO::FDPass::recv(fileno($sock));
+			if ($fd >= 0) {
+				my $rdr = ($fd == 0 ? '<&=' : '>&=');
+				if (open(my $fh, $rdr, $fd)) {
+					$client->{$i} = $fh;
+				} else {
+					say $sock "open($rdr$fd) (FD=$i): $!";
+					return;
+				}
+			} else {
+				say $sock "recv FD=$i: $!";
+				return;
+			}
+		} else {
+			say $sock "timed out waiting to recv FD=$i";
+			return;
+		}
+	}
+	# $ARGV_STR = join("]\0[", @ARGV);
+	# $ENV_STR = join('', map { "$_=$ENV{$_}\0" } keys %ENV);
+	# $line = "$$\0\0>$ARGV_STR\0\0>$ENV_STR\0\0";
+	my ($client_pid, $argv, $env) = do {
+		local $/ = "\0\0\0"; # yes, 3 NULs at EOL, not 2
+		chomp(my $line = <$sock>);
+		split(/\0\0>/, $line, 3);
+	};
+	my %env = map { split(/=/, $_, 2) } split(/\0/, $env);
+	if (chdir($env{PWD})) {
+		$client->{env} = \%env;
+		$client->{pid} = $client_pid;
+		eval { dispatch($client, split(/\]\0\[/, $argv)) };
+		say $sock $@ if $@;
+	} else {
+		say $sock "chdir($env{PWD}): $!"; # implicit close
+	}
+}
+
+sub noop {}
+
+# lei(1) calls this when it can't connect
+sub lazy_start ($$) {
+	my ($path, $err) = @_;
+	if ($err == ECONNREFUSED) {
+		unlink($path) or die "unlink($path): $!";
+	} elsif ($err != ENOENT) {
+		die "connect($path): $!";
+	}
+	my $umask = umask(077) // die("umask(077): $!");
+	my $l = IO::Socket::UNIX->new(Local => $path,
+					Listen => 1024,
+					Type => SOCK_STREAM) or
+		$err = $!;
+	umask($umask) or die("umask(restore): $!");
+	$l or return $err;
+	my @st = stat($path) or die "stat($path): $!";
+	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
+	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
+	my $oldset = PublicInbox::Sigfd::block_signals();
+	my $pid = fork // die "fork: $!";
+	if ($pid) {
+		PublicInbox::Sigfd::sig_setmask($oldset);
+		return; # client will connect to $path
+	}
+	openlog($path, 'pid', 'user');
+	local $SIG{__DIE__} = sub {
+		syslog('crit', "@_");
+		exit $! if $!;
+		exit $? >> 8 if $? >> 8;
+		exit 255;
+	};
+	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
+	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!\n";
+	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!\n";
+	open STDERR, '>&STDIN' or die "redirect stderr failed: $!\n";
+	setsid();
+	$pid = fork // die "fork: $!";
+	exit if $pid;
+	$0 = "lei-daemon $path";
+	require PublicInbox::Listener;
+	require PublicInbox::EOFpipe;
+	$l->blocking(0);
+	$eof_w->blocking(0);
+	$eof_r->blocking(0);
+	my $listener = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
+	my $exit_code;
+	local $quit = sub {
+		$exit_code //= shift;
+		my $tmp = $listener or exit($exit_code);
+		unlink($path) if defined($path);
+		syswrite($eof_w, '.');
+		$l = $listener = $path = undef;
+		$tmp->close if $tmp; # DS::close
+		PublicInbox::DS->SetLoopTimeout(1000);
+	};
+	PublicInbox::EOFpipe->new($eof_r, sub {}, undef);
+	my $sig = {
+		CHLD => \&PublicInbox::DS::enqueue_reap,
+		QUIT => $quit,
+		INT => $quit,
+		TERM => $quit,
+		HUP => \&noop,
+		USR1 => \&noop,
+		USR2 => \&noop,
+	};
+	my $sigfd = PublicInbox::Sigfd->new($sig, $SFD_NONBLOCK);
+	local %SIG = (%SIG, %$sig) if !$sigfd;
+	if ($sigfd) { # TODO: use inotify/kqueue to detect unlinked sockets
+		PublicInbox::DS->SetLoopTimeout(5000);
+	} else {
+		# wake up every second to accept signals if we don't
+		# have signalfd or IO::KQueue:
+		PublicInbox::Sigfd::sig_setmask($oldset);
+		PublicInbox::DS->SetLoopTimeout(1000);
+	}
+	PublicInbox::DS->SetPostLoopCallback(sub {
+		my ($dmap, undef) = @_;
+		if (@st = defined($path) ? stat($path) : ()) {
+			if ($dev_ino_expect ne pack('dd', $st[0], $st[1])) {
+				warn "$path dev/ino changed, quitting\n";
+				$path = undef;
+			}
+		} elsif (defined($path)) {
+			warn "stat($path): $!, quitting ...\n";
+			undef $path; # don't unlink
+			$quit->();
+		}
+		return 1 if defined($path);
+		my $now = now();
+		my $n = 0;
+		for my $s (values %$dmap) {
+			$s->can('busy') or next;
+			if ($s->busy($now)) {
+				++$n;
+			} else {
+				$s->close;
+			}
+		}
+		$n; # true: continue, false: stop
+	});
+	PublicInbox::DS->EventLoop;
+	exit($exit_code // 0);
+}
+
+# for users w/o IO::FDPass
+sub oneshot {
+	dispatch({
+		0 => *STDIN{IO},
+		1 => *STDOUT{IO},
+		2 => *STDERR{IO},
+		env => \%ENV
+	}, @ARGV);
+}
+
+1;
diff --git a/script/lei b/script/lei
new file mode 100755
index 00000000..1b5af3a1
--- /dev/null
+++ b/script/lei
@@ -0,0 +1,58 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Cwd qw(cwd);
+use IO::Socket::UNIX;
+
+if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
+	my $path = do {
+		my $runtime_dir = ($ENV{XDG_RUNTIME_DIR} // '') . '/lei';
+		if ($runtime_dir eq '/lei') {
+			require File::Spec;
+			$runtime_dir = File::Spec->tmpdir."/lei-$<";
+		}
+		unless (-d $runtime_dir && -w _) {
+			require File::Path;
+			File::Path::mkpath($runtime_dir, 0, 0700);
+		}
+		"$runtime_dir/sock";
+	};
+	my $sock = IO::Socket::UNIX->new(Peer => $path, Type => SOCK_STREAM);
+	unless ($sock) { # start the daemon if not started
+		my $err = $!;
+		require PublicInbox::LeiDaemon;
+		$err = PublicInbox::LeiDaemon::lazy_start($path, $err);
+		# try connecting again anyways, unlink+bind may be racy
+		$sock = IO::Socket::UNIX->new(Peer => $path,
+						Type => SOCK_STREAM) // die
+			"connect($path): $! (bind($path): $err)";
+	}
+	my $pwd = $ENV{PWD};
+	my $cwd = cwd();
+	if ($pwd) { # prefer ENV{PWD} if it's a symlink to real cwd
+		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!\n";
+		my @st_pwd = stat($pwd);
+		# make sure st_dev/st_ino match for {PWD} to be valid
+		$pwd = $cwd if (!@st_pwd || $st_pwd[1] != $st_cwd[1] ||
+					$st_pwd[0] != $st_cwd[0]);
+	} else {
+		$pwd = $cwd;
+	}
+	local $ENV{PWD} = $pwd;
+	$sock->autoflush(1);
+	IO::FDPass::send(fileno($sock), $_) for (0..2);
+	my $buf = "$$\0\0>" . join("]\0[", @ARGV) . "\0\0>";
+	while (my ($k, $v) = each %ENV) { $buf .= "$k=$v\0" }
+	$buf .= "\0\0";
+	print $sock $buf or die "print(sock, buf): $!";
+	local $/ = "\n";
+	while (my $line = <$sock>) {
+		$line =~ /\Aexit=([0-9]+)\n\z/ and exit($1 + 0);
+		die $line;
+	}
+} else { # for systems lacking IO::FDPass
+	require PublicInbox::LeiDaemon;
+	PublicInbox::LeiDaemon::oneshot();
+}
diff --git a/t/lei.t b/t/lei.t
new file mode 100644
index 00000000..feee9270
--- /dev/null
+++ b/t/lei.t
@@ -0,0 +1,80 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use PublicInbox::Config;
+my $json = PublicInbox::Config::json() or plan skip_all => 'JSON missing';
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($home, $for_destroy) = tmpdir();
+my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
+
+SKIP: {
+	require_mods('IO::FDPass', 51);
+	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
+	mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
+
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	is($err, '', 'no error from daemon-pid');
+	like($out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
+	chomp(my $pid = $out);
+	ok(kill(0, $pid), 'pid is valid');
+	ok(-S $sock, 'sock created');
+
+	ok(!run_script([qw(lei)], undef, $opt), 'no args fails');
+	is($? >> 8, 1, '$? is 1');
+	is($out, '', 'nothing in stdout');
+	like($err, qr/^usage:/sm, 'usage in stderr');
+
+	for my $arg (['-h'], ['--help'], ['help'], [qw(daemon-pid --help)]) {
+		$out = $err = '';
+		ok(run_script(['lei', @$arg], undef, $opt), "lei @$arg");
+		like($out, qr/^usage:/sm, "usage in stdout (@$arg)");
+		is($err, '', "nothing in stderr (@$arg)");
+	}
+
+	ok(!run_script([qw(lei DBG-false)], undef, $opt), 'false(1) emulation');
+	is($? >> 8, 1, '$? set correctly');
+	is($err, '', 'no error from false(1) emulation');
+
+	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
+		$out = $err = '';
+		ok(!run_script(['lei', @$arg], undef, $opt), "lei @$arg");
+		is($? >> 8, 1, '$? set correctly');
+		isnt($err, '', 'something in stderr');
+		is($out, '', 'nothing in stdout');
+	}
+
+	$out = '';
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	chomp(my $pid_again = $out);
+	is($pid, $pid_again, 'daemon-pid idempotent');
+
+	ok(run_script([qw(lei daemon-stop)], undef, $opt), 'daemon-stop');
+	is($out, '', 'no output from daemon-stop');
+	is($err, '', 'no error from daemon-stop');
+	for (0..100) {
+		kill(0, $pid) or last;
+		tick();
+	}
+	ok(!-S $sock, 'sock gone');
+	ok(!kill(0, $pid), 'pid gone after stop');
+
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	chomp(my $new_pid = $out);
+	ok(kill(0, $new_pid), 'new pid is running');
+	ok(-S $sock, 'sock exists again');
+	unlink $sock or BAIL_OUT "unlink $!";
+	for (0..100) {
+		kill('CHLD', $new_pid) or last;
+		tick();
+	}
+	ok(!kill(0, $new_pid), 'daemon exits after unlink');
+};
+
+require_ok 'PublicInbox::LeiDaemon';
+
+done_testing;

^ permalink raw reply related	[relevance 28%]

* [PATCH 00/26] lei: basic UI + IPC work
@ 2020-12-18 12:09 55% Eric Wong
  2020-12-18 12:09 28% ` [PATCH 01/26] lei: FD-passing and IPC basics Eric Wong
                   ` (18 more replies)
  0 siblings, 19 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

Some work on the storage side, but MiscIdx still needs work to
handle existing publicinboxes, extinboxes (over HTTP(S)), and
other config things.

PATCH 22/26 - bash completion sorta works, but filename
completions get broken.  Not sure why and help would be
greatly appreciated (along with help for other shells).
I don't know bash-specific stuff well at all, and know even
less about other non-POSIX shells.

Somewhat nice UI things (at least to my delirious sleep-deprived
state):

* -$DIGIT option parsing works (e.g. "git log -10",
  "kill -9")

* help-based CLI arg/prototype checking seems to be working
  and hopefully cuts down on long-term maintenance work
  while promoting UI consistency

* having IO::FDPass hides startup time; 20-30ms isn't
  really noticeable for humans on interactive terminals,
  but it's still not ideal for loops.

* lei.sh + "make symlink-install"
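
One way the -$DIGIT handling mentioned above could be done is to
rewrite the switch before Getopt::Long sees it.  This is only a
sketch under that assumption, not necessarily what the series does;
`expand_digit_switch' and the mapping to --limit are hypothetical:

```perl
use strict;
use warnings;
use Getopt::Long ();

# Rewrite a bare "-NUM" switch into "--limit=NUM" before option parsing.
# Stops at "--" so positional arguments after it are left alone.
sub expand_digit_switch {
	my (@argv) = @_;
	for (@argv) {
		last if $_ eq '--';
		s/\A-([0-9]+)\z/--limit=$1/;
	}
	@argv;
}

my $glp = Getopt::Long::Parser->new;
$glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
my @argv = expand_digit_switch(qw(-10 -r s:foo));
$glp->getoptionsfromarray(\@argv, \my %opt, qw(limit|n=i reverse|r))
	or die "bad arguments or options\n";
print "limit=$opt{limit} reverse=$opt{reverse} rest=@argv\n";
```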

And some internal improvements:

* several simplifications to existing Search code,
  ->xdb_shards_flat will come in handy

* generic OnDestroy - long overdue
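
For readers unfamiliar with the idea, a generic OnDestroy is a scope
guard: an object whose destructor runs a callback.  The sketch below
uses a hypothetical My::OnDestroy package to show the technique and is
not the PublicInbox::OnDestroy code itself:

```perl
use strict;
use warnings;

# Hypothetical stand-in, not the actual PublicInbox::OnDestroy code.
package My::OnDestroy;

# new($cb, @args): remember a callback to run when the object is freed
sub new {
	my ($class, $cb, @args) = @_;
	bless [ $cb, @args ], $class;
}

sub DESTROY {
	my ($cb, @args) = @{$_[0]};
	$cb->(@args) if $cb;
}

package main;

my @log;
{
	my $guard = My::OnDestroy->new(sub { push @log, "cleanup(@_)" }, 'tmp');
	push @log, 'working';
} # $guard goes out of scope here and its callback fires
print join(', ', @log), "\n";
```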

Eric Wong (26):
  lei: FD-passing and IPC basics
  lei: proposed command-listing and options
  lei_store: local storage for Local Email Interface
  tests: more common JSON module loading
  lei: use spawn (vfork + execve) for lazy start
  lei: refine help/option parsing, implement "init"
  t/lei-oneshot: standalone oneshot (non-socket) test
  lei: ensure we run a restrictive umask
  lei: support `daemon-env' for modifying long-lived env
  lei_store: simplify git_epoch_max, slightly
  search: simplify initialization, add ->xdb_shards_flat
  rename LeiDaemon package to PublicInbox::LEI
  lei: support pass-through for `lei config'
  lei: help: show actual paths being operated on
  lei: rename $client => $self and bless
  lei: micro-optimize startup time
  lei_store: relax GIT_COMMITTER_IDENT check
  lei_store: keyword extraction from mbox and Maildir
  on_destroy: generic localized END
  lei: restore default __DIE__ handler for event loop
  lei: drop $SIG{__DIE__}, add oneshot fallbacks
  lei: start working on bash completion
  build: add lei.sh + "make symlink-install" target
  lei: support for -$DIGIT and -$SIG CLI switches
  lei: revise output routines
  lei: extinbox: start implementing in config file

 MANIFEST                               |  11 +
 Makefile.PL                            |  11 +
 contrib/completion/lei-completion.bash |  11 +
 lei.sh                                 |   7 +
 lib/PublicInbox/Daemon.pm              |   6 +-
 lib/PublicInbox/ExtSearch.pm           |  10 +-
 lib/PublicInbox/ExtSearchIdx.pm        |  35 +-
 lib/PublicInbox/Import.pm              |   4 +
 lib/PublicInbox/LEI.pm                 | 776 +++++++++++++++++++++++++
 lib/PublicInbox/LeiExtinbox.pm         |  52 ++
 lib/PublicInbox/LeiSearch.pm           |  39 ++
 lib/PublicInbox/LeiStore.pm            | 227 ++++++++
 lib/PublicInbox/ManifestJsGz.pm        |   2 +-
 lib/PublicInbox/OnDestroy.pm           |  16 +
 lib/PublicInbox/OverIdx.pm             |  10 +
 lib/PublicInbox/Search.pm              |  65 +--
 lib/PublicInbox/SearchIdx.pm           |  62 +-
 lib/PublicInbox/SearchIdxShard.pm      |  33 ++
 lib/PublicInbox/TestCommon.pm          |   7 +-
 lib/PublicInbox/V2Writable.pm          |  10 +-
 script/lei                             |  76 +++
 t/extsearch.t                          |   3 +-
 t/lei-oneshot.t                        |  25 +
 t/lei.t                                | 306 ++++++++++
 t/lei_store.t                          |  88 +++
 t/on_destroy.t                         |  25 +
 t/www_listing.t                        |   8 +-
 27 files changed, 1843 insertions(+), 82 deletions(-)
 create mode 100644 contrib/completion/lei-completion.bash
 create mode 100755 lei.sh
 create mode 100644 lib/PublicInbox/LEI.pm
 create mode 100644 lib/PublicInbox/LeiExtinbox.pm
 create mode 100644 lib/PublicInbox/LeiSearch.pm
 create mode 100644 lib/PublicInbox/LeiStore.pm
 create mode 100644 lib/PublicInbox/OnDestroy.pm
 create mode 100755 script/lei
 create mode 100644 t/lei-oneshot.t
 create mode 100644 t/lei.t
 create mode 100644 t/lei_store.t
 create mode 100644 t/on_destroy.t

^ permalink raw reply	[relevance 55%]

* [PATCH 02/26] lei: proposed command-listing and options
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
  2020-12-18 12:09 28% ` [PATCH 01/26] lei: FD-passing and IPC basics Eric Wong
@ 2020-12-18 12:09 45% ` Eric Wong
  2020-12-18 12:09 61% ` [PATCH 05/26] lei: use spawn (vfork + execve) for lazy start Eric Wong
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

In an attempt to ensure a coherent UI/UX, we'll try to document
all proposed commands and options in one place for easy reference.
---
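
A rough sketch of how a table shaped like %CMD could drive both option
parsing and a usage line.  The %cmd table and the usage_line/parse_opts
helpers below are hypothetical and only cover two entries; the parser
configuration is the one already used in LeiDaemon.pm.

```perl
use strict;
use warnings;
use Getopt::Long ();

# Hypothetical table in the same shape as %CMD:
# command => [ positional_args, 1-line description, option specs... ]
my %cmd = (
	'ls-query' => [ '[FILTER]', 'list saved search queries',
		qw(name-only format|f=s z) ],
	'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
);

my $glp = Getopt::Long::Parser->new;
$glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));

# one usage line per command, built only from the table
sub usage_line {
	my ($name) = @_;
	my ($proto, $desc) = @{$cmd{$name}};
	"lei $name" . (defined($proto) ? " $proto" : '') . " - $desc";
}

# parse @$argv using the specs stored for $name;
# returns a hashref of options, or undef on bad options
sub parse_opts {
	my ($name, $argv) = @_;
	my (undef, undef, @spec) = @{$cmd{$name}};
	my %opt;
	$glp->getoptionsfromarray($argv, \%opt, @spec, 'help|h')
		or return;
	\%opt;
}

my @argv = qw(--format=json -z foo);
my $opt = parse_opts('ls-query', \@argv);
print usage_line('ls-query'), "\n";
print "format=$opt->{format} z=$opt->{z} rest=@argv\n";
```

Keeping the specs next to the descriptions means help output and
parsing cannot drift apart, which is the point of the single table.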
 lib/PublicInbox/LeiDaemon.pm | 137 +++++++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index ae40b3a6..d0c53416 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -23,6 +23,143 @@ our $quit = sub { exit(shift // 0) };
 my $glp = Getopt::Long::Parser->new;
 $glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
 
+# TBD: this is a documentation mechanism to show a subcommand
+# (may) pass options through to another command:
+sub pass_through { () }
+
+# TODO: generate shell completion + help using %CMD and %OPTDESC
+# command => [ positional_args, 1-line description, Getopt::Long option spec ]
+our %CMD = ( # sorted in order of importance/use:
+'query' => [ 'SEARCH-TERMS...', 'search for messages matching terms', qw(
+	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
+	limit|n=i sort|s=s reverse|r offset=i remote local! extinbox!
+	since|after=s until|before=s) ],
+
+'show' => [ '{MID|OID}', 'show a given object (Message-ID or object ID)',
+	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
+	pass_through('git show') ],
+
+'add-extinbox' => [ 'URL-OR-PATHNAME',
+	'add/set priority of a publicinbox|extindex for extra matches',
+	qw(prio=i) ],
+'ls-extinbox' => [ '[FILTER]', 'list publicinbox|extindex sources',
+	qw(format|f=s z local remote) ],
+'forget-extinbox' => [ '{URL-OR-PATHNAME|--prune}',
+	'exclude further results from a publicinbox|extindex',
+	qw(prune) ],
+
+'ls-query' => [ '[FILTER]', 'list saved search queries',
+		qw(name-only format|f=s z) ],
+'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
+'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],
+
+'plonk' => [ '{--thread|--from=IDENT}',
+	'exclude mail matching From: or thread from non-Message-ID searches',
+	qw(thread|t from|f=s mid=s oid=s) ],
+'mark' => [ 'MESSAGE-FLAGS', 'set/unset flags on message(s) from stdin',
+	qw(stdin| oid=s exact by-mid|mid:s) ],
+'forget' => [ '--stdin', 'exclude message(s) on stdin from query results',
+	qw(stdin| oid=s  exact by-mid|mid:s) ],
+
+'purge-mailsource' => [ '{URL-OR-PATHNAME|--all}',
+	'remove imported messages from IMAP, Maildirs, and MH',
+	qw(exact! all jobs:i indexed) ],
+
+# code repos are used for `show' to solve blobs from patch mails
+'add-coderepo' => [ 'PATHNAME', 'add or set priority of a git code repo',
+	qw(prio=i) ],
+'ls-coderepo' => [ '[FILTER]', 'list known code repos', qw(format|f=s z) ],
+'forget-coderepo' => [ 'PATHNAME',
+	'stop using repo to solve blobs from patches',
+	qw(prune) ],
+
+'add-watch' => [ '[URL_OR_PATHNAME]',
+		'watch for new messages and flag changes',
+	qw(import! flags! interval=s recursive|r exclude=s include=s) ],
+'ls-watch' => [ '[FILTER]', 'list active watches with numbers and status',
+		qw(format|f=s z) ],
+'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
+'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
+'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
+	qw(prune) ],
+
+'import' => [ '{URL_OR_PATHNAME|--stdin}',
+	'one-shot import/update from URL or filesystem',
+	qw(stdin| limit|n=i offset=i recursive|r exclude=s include=s !flags),
+	],
+
+'config' => [ '[ANYTHING...]',
+		'git-config(1) wrapper for ~/.config/lei/config',
+		pass_through('git config') ],
+'daemon-stop' => [ undef, 'stop the lei-daemon' ],
+'daemon-pid' => [ undef, 'show the PID of the lei-daemon' ],
+'help' => [ '[SUBCOMMAND]', 'show help' ],
+
+# XXX do we need this?
+# 'git' => [ '[ANYTHING...]', 'git(1) wrapper', pass_through('git') ],
+
+'reorder-local-store-and-break-history' => [ '[REFNAME]',
+	'rewrite git history in an attempt to improve compression',
+	'gc!' ]
+); # %CMD
+
+# switch descriptions, try to keep consistent across commands
+# $spec: Getopt::Long option specification
+# $spec => [@ALLOWED_VALUES (default is first), $description],
+# $spec => $description
+# "$SUB_COMMAND TAB $spec" => as above
+my $stdin_formats = [ qw(auto raw mboxrd mboxcl2 mboxcl mboxo),
+		'specify message input format' ];
+my $ls_format = [ qw(plain json null), 'listing output format' ];
+my $show_format = [ qw(plain raw html mboxrd mboxcl2 mboxcl),
+		'message/object output format' ];
+
+my %OPTDESC = (
+'solve!' => 'do not attempt to reconstruct blobs from emails',
+'save-as=s' => 'save search terms under the given name',
+
+'type=s' => [qw(any mid git), 'disambiguate type' ],
+
+'dedupe|d=s' => [qw(content oid mid), 'deduplication strategy'],
+'thread|t' => 'every message in the same thread as the actual match(es)',
+'augment|a' => 'augment --output destination instead of clobbering',
+
+'output|o=s' => "destination (e.g. `/path/to/Maildir', or `-' for stdout)",
+
+'mark	format|f=s' => $stdin_formats,
+'forget	format|f=s' => $stdin_formats,
+'query	format|f=s' => [qw(maildir mboxrd mboxcl2 mboxcl html oid),
+		q[specify output format (default: determined by --output)]],
+'ls-query	format|f=s' => $ls_format,
+'ls-extinbox format|f=s' => $ls_format,
+
+'limit|n=i' => 'integer limit on number of matches (default: 10000)',
+'offset=i' => 'search result offset (default: 0)',
+
+'sort|s=s@' => [qw(internaldate date relevance docid),
+		"order of results (`--output'-dependent)"],
+
+'prio=i' => 'priority of query source',
+
+'local' => 'limit operations to the local filesystem',
+'local!' => 'exclude results from the local filesystem',
+'remote' => 'limit operations to those requiring network access',
+'remote!' => 'prevent operations requiring network access',
+
+'mid=s' => 'specify the Message-ID of a message',
+'oid=s' => 'specify the git object ID of a message',
+
+'recursive|r' => 'scan directories/mailboxes/newsgroups recursively',
+'exclude=s' => 'exclude mailboxes/newsgroups based on pattern',
+'include=s' => 'include mailboxes/newsgroups based on pattern',
+
+'exact' => 'operate on exact header matches only',
+'exact!' => 'rely on content match instead of exact header matches',
+
+'by-mid|mid:s' => 'match only by Message-ID, ignoring contents',
+'jobs:i' => 'set parallelism level',
+); # %OPTDESC
+
 sub x_it ($$) { # pronounced "exit"
 	my ($client, $code) = @_;
 	if (my $sig = ($code & 127)) {

^ permalink raw reply related	[relevance 45%]

* [PATCH 01/26] lei: FD-passing and IPC basics
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
@ 2020-12-18 12:09 28% ` Eric Wong
  2020-12-18 12:09 45% ` [PATCH 02/26] lei: proposed command-listing and options Eric Wong
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

The start of lei, a Local Email Interface.  It'll support a
daemon via FD passing to avoid startup time penalties if
IO::FDPass is installed, but fall back to a slow one-shot mode
if not.

Compared to a traditional socket daemon, FD passing should allow
us to eventually do stuff like run "git show" and still have
proper terminal support for pager and color.
---
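
After the three FDs are passed, script/lei sends one request made of
the PID, argv and environment, separated by NULs.  The round trip
below is a sketch of that framing; encode_req and decode_req are
hypothetical names, and unlike the daemon (which sets $/ and uses
chomp) this version strips the terminator with a regexp.  It assumes
a non-empty environment, which always holds for a real client.

```perl
use strict;
use warnings;

# "$PID\0\0>" . join("]\0[", @ARGV) . "\0\0>" . "$k=$v\0"... . "\0\0"
sub encode_req {
	my ($pid, $argv, $env) = @_;
	my $buf = "$pid\0\0>" . join("]\0[", @$argv) . "\0\0>";
	$buf .= "$_=$env->{$_}\0" for sort keys %$env;
	# the last "$k=$v\0" plus "\0\0" gives 3 NULs at end of request
	$buf . "\0\0";
}

sub decode_req {
	my ($buf) = @_;
	$buf =~ s/\0\0\0\z// or die "request not terminated by 3 NULs";
	my ($pid, $argv, $env) = split(/\0\0>/, $buf, 3);
	# limit of 2 keeps any `=' inside a value intact
	my %env = map { split(/=/, $_, 2) } split(/\0/, $env);
	($pid, [ split(/\]\0\[/, $argv) ], \%env);
}

my $wire = encode_req(1234, [ qw(daemon-pid --help) ],
			{ PWD => '/tmp', FOO => 'a=b' });
my ($pid, $argv, $env) = decode_req($wire);
print "pid=$pid argv=@$argv PWD=$env->{PWD} FOO=$env->{FOO}\n";
```

NUL cannot appear in argv or environment strings, which is why it is
safe to use as the delimiter here.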
 MANIFEST                     |   3 +
 lib/PublicInbox/Daemon.pm    |   6 +-
 lib/PublicInbox/LeiDaemon.pm | 303 +++++++++++++++++++++++++++++++++++
 script/lei                   |  58 +++++++
 t/lei.t                      |  80 +++++++++
 5 files changed, 448 insertions(+), 2 deletions(-)
 create mode 100644 lib/PublicInbox/LeiDaemon.pm
 create mode 100755 script/lei
 create mode 100644 t/lei.t

diff --git a/MANIFEST b/MANIFEST
index ac442606..7536b7c2 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -159,6 +159,7 @@ lib/PublicInbox/InboxIdle.pm
 lib/PublicInbox/InboxWritable.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
+lib/PublicInbox/LeiDaemon.pm
 lib/PublicInbox/Linkify.pm
 lib/PublicInbox/Listener.pm
 lib/PublicInbox/Lock.pm
@@ -226,6 +227,7 @@ sa_config/Makefile
 sa_config/README
 sa_config/root/etc/spamassassin/public-inbox.pre
 sa_config/user/.spamassassin/user_prefs
+script/lei
 script/public-inbox-compact
 script/public-inbox-convert
 script/public-inbox-edit
@@ -316,6 +318,7 @@ t/indexlevels-mirror.t
 t/init.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei.t
 t/linkify.t
 t/main-bin/spamc
 t/mda-mime.eml
diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index a2171535..6b92b60d 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -1,7 +1,9 @@
 # Copyright (C) 2015-2020 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-# contains common daemon code for the httpd, imapd, and nntpd servers.
-# This may be used for read-only IMAP server if we decide to implement it.
+#
+# Contains common daemon code for the httpd, imapd, and nntpd servers
+# and designed for handling thousands of untrusted clients over slow
+# and/or lossy connections.
 package PublicInbox::Daemon;
 use strict;
 use warnings;
diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
new file mode 100644
index 00000000..ae40b3a6
--- /dev/null
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -0,0 +1,303 @@
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# Backend for `lei' (local email interface).  Unlike the C10K-oriented
+# PublicInbox::Daemon, this is designed exclusively to handle trusted
+# local clients with read/write access to the FS and use as many
+# system resources as the local user has access to.
+package PublicInbox::LeiDaemon;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::DS);
+use Getopt::Long ();
+use Errno qw(EAGAIN ECONNREFUSED ENOENT);
+use POSIX qw(setsid);
+use IO::Socket::UNIX;
+use IO::Handle ();
+use Sys::Syslog qw(syslog openlog);
+use PublicInbox::Syscall qw($SFD_NONBLOCK EPOLLIN EPOLLONESHOT);
+use PublicInbox::Sigfd;
+use PublicInbox::DS qw(now);
+use PublicInbox::Spawn qw(spawn);
+our $quit = sub { exit(shift // 0) };
+my $glp = Getopt::Long::Parser->new;
+$glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
+
+sub x_it ($$) { # pronounced "exit"
+	my ($client, $code) = @_;
+	if (my $sig = ($code & 127)) {
+		kill($sig, $client->{pid} // $$);
+	} else {
+		$code >>= 8;
+		if (my $sock = $client->{sock}) {
+			say $sock "exit=$code";
+		} else { # for oneshot
+			$quit->($code);
+		}
+	}
+}
+
+sub emit ($$$) {
+	my ($client, $channel, $buf) = @_;
+	print { $client->{$channel} } $buf or warn "print FD[$channel]: $!";
+}
+
+sub fail ($$;$) {
+	my ($client, $buf, $exit_code) = @_;
+	$buf .= "\n" unless $buf =~ /\n\z/s;
+	emit($client, 2, $buf);
+	x_it($client, ($exit_code // 1) << 8);
+	undef;
+}
+
+sub _help ($;$) {
+	my ($client, $channel) = @_;
+	emit($client, $channel //= 1, <<EOF);
+usage: lei COMMAND [OPTIONS]
+
+...
+EOF
+	x_it($client, $channel == 2 ? 1 << 8 : 0); # stderr => failure
+}
+
+sub assert_args ($$$;$@) {
+	my ($client, $argv, $proto, $opt, @spec) = @_;
+	$opt //= {};
+	push @spec, qw(help|h);
+	$glp->getoptionsfromarray($argv, $opt, @spec) or
+		return fail($client, 'bad arguments or options');
+	if ($opt->{help}) {
+		_help($client);
+		undef;
+	} else {
+		my ($nreq, $rest) = split(/;/, $proto);
+		$nreq = (($nreq // '') =~ tr/$/$/);
+		my $argc = scalar(@$argv);
+		my $tot = ($rest // '') eq '@' ? $argc : ($proto =~ tr/$/$/);
+		return 1 if $argc <= $tot && $argc >= $nreq;
+		_help($client, 2);
+		undef
+	}
+}
+
+sub dispatch {
+	my ($client, $cmd, @argv) = @_;
+	local $SIG{__WARN__} = sub { emit($client, 2, "@_") };
+	local $SIG{__DIE__} = 'DEFAULT';
+	if (defined $cmd) {
+		my $func = "lei_$cmd";
+		$func =~ tr/-/_/;
+		if (my $cb = __PACKAGE__->can($func)) {
+			$client->{cmd} = $cmd;
+			$cb->($client, \@argv);
+		} elsif (grep(/\A-/, $cmd, @argv)) {
+			assert_args($client, [ $cmd, @argv ], '');
+		} else {
+			fail($client, "`$cmd' is not an lei command");
+		}
+	} else {
+		_help($client, 2);
+	}
+}
+
+sub lei_daemon_pid {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and emit($client, 1, "$$\n");
+}
+
+sub lei_DBG_pwd {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and
+		emit($client, 1, "$client->{env}->{PWD}\n");
+}
+
+sub lei_DBG_cwd {
+	my ($client, $argv) = @_;
+	require Cwd;
+	assert_args($client, $argv, '') and emit($client, 1, Cwd::cwd()."\n");
+}
+
+sub lei_DBG_false { x_it($_[0], 1 << 8) }
+
+sub lei_daemon_stop {
+	my ($client, $argv) = @_;
+	assert_args($client, $argv, '') and $quit->(0);
+}
+
+sub lei_help { _help($_[0]) }
+
+sub reap_exec { # dwaitpid callback
+	my ($client, $pid) = @_;
+	x_it($client, $?);
+}
+
+sub lei_git { # support passing through random git commands
+	my ($client, $argv) = @_;
+	my %opt = map { $_ => $client->{$_} } (0..2);
+	my $pid = spawn(['git', @$argv], $client->{env}, \%opt);
+	PublicInbox::DS::dwaitpid($pid, \&reap_exec, $client);
+}
+
+sub accept_dispatch { # Listener {post_accept} callback
+	my ($sock) = @_; # ignore other
+	$sock->blocking(1);
+	$sock->autoflush(1);
+	my $client = { sock => $sock };
+	vec(my $rin = '', fileno($sock), 1) = 1;
+	# `say $sock' triggers "die" in lei(1)
+	for my $i (0..2) {
+		if (select(my $rout = $rin, undef, undef, 1)) {
+			my $fd = IO::FDPass::recv(fileno($sock));
+			if ($fd >= 0) {
+				my $rdr = ($fd == 0 ? '<&=' : '>&=');
+				if (open(my $fh, $rdr, $fd)) {
+					$client->{$i} = $fh;
+				} else {
+					say $sock "open($rdr$fd) (FD=$i): $!";
+					return;
+				}
+			} else {
+				say $sock "recv FD=$i: $!";
+				return;
+			}
+		} else {
+			say $sock "timed out waiting to recv FD=$i";
+			return;
+		}
+	}
+	# $ARGV_STR = join("]\0[", @ARGV);
+	# $ENV_STR = join('', map { "$_=$ENV{$_}\0" } keys %ENV);
+	# $line = "$$\0\0>$ARGV_STR\0\0>$ENV_STR\0\0";
+	my ($client_pid, $argv, $env) = do {
+		local $/ = "\0\0\0"; # yes, 3 NULs at EOL, not 2
+		chomp(my $line = <$sock>);
+		split(/\0\0>/, $line, 3);
+	};
+	my %env = map { split(/=/, $_, 2) } split(/\0/, $env);
+	if (chdir($env{PWD})) {
+		$client->{env} = \%env;
+		$client->{pid} = $client_pid;
+		eval { dispatch($client, split(/\]\0\[/, $argv)) };
+		say $sock $@ if $@;
+	} else {
+		say $sock "chdir($env{PWD}): $!"; # implicit close
+	}
+}
+
+sub noop {}
+
+# lei(1) calls this when it can't connect
+sub lazy_start ($$) {
+	my ($path, $err) = @_;
+	if ($err == ECONNREFUSED) {
+		unlink($path) or die "unlink($path): $!";
+	} elsif ($err != ENOENT) {
+		die "connect($path): $!";
+	}
+	my $umask = umask(077) // die("umask(077): $!");
+	my $l = IO::Socket::UNIX->new(Local => $path,
+					Listen => 1024,
+					Type => SOCK_STREAM) or
+		$err = $!;
+	umask($umask) or die("umask(restore): $!");
+	$l or return $err;
+	my @st = stat($path) or die "stat($path): $!";
+	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
+	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
+	my $oldset = PublicInbox::Sigfd::block_signals();
+	my $pid = fork // die "fork: $!";
+	if ($pid) {
+		PublicInbox::Sigfd::sig_setmask($oldset);
+		return; # client will connect to $path
+	}
+	openlog($path, 'pid', 'user');
+	local $SIG{__DIE__} = sub {
+		syslog('crit', "@_");
+		exit $! if $!;
+		exit $? >> 8 if $? >> 8;
+		exit 255;
+	};
+	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
+	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!\n";
+	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!\n";
+	open STDERR, '>&STDIN' or die "redirect stderr failed: $!\n";
+	setsid();
+	$pid = fork // die "fork: $!";
+	exit if $pid;
+	$0 = "lei-daemon $path";
+	require PublicInbox::Listener;
+	require PublicInbox::EOFpipe;
+	$l->blocking(0);
+	$eof_w->blocking(0);
+	$eof_r->blocking(0);
+	my $listener = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
+	my $exit_code;
+	local $quit = sub {
+		$exit_code //= shift;
+		my $tmp = $listener or exit($exit_code);
+		unlink($path) if defined($path);
+		syswrite($eof_w, '.');
+		$l = $listener = $path = undef;
+		$tmp->close if $tmp; # DS::close
+		PublicInbox::DS->SetLoopTimeout(1000);
+	};
+	PublicInbox::EOFpipe->new($eof_r, sub {}, undef);
+	my $sig = {
+		CHLD => \&PublicInbox::DS::enqueue_reap,
+		QUIT => $quit,
+		INT => $quit,
+		TERM => $quit,
+		HUP => \&noop,
+		USR1 => \&noop,
+		USR2 => \&noop,
+	};
+	my $sigfd = PublicInbox::Sigfd->new($sig, $SFD_NONBLOCK);
+	local %SIG = (%SIG, %$sig) if !$sigfd;
+	if ($sigfd) { # TODO: use inotify/kqueue to detect unlinked sockets
+		PublicInbox::DS->SetLoopTimeout(5000);
+	} else {
+		# wake up every second to accept signals if we don't
+		# have signalfd or IO::KQueue:
+		PublicInbox::Sigfd::sig_setmask($oldset);
+		PublicInbox::DS->SetLoopTimeout(1000);
+	}
+	PublicInbox::DS->SetPostLoopCallback(sub {
+		my ($dmap, undef) = @_;
+		if (@st = defined($path) ? stat($path) : ()) {
+			if ($dev_ino_expect ne pack('dd', $st[0], $st[1])) {
+				warn "$path dev/ino changed, quitting\n";
+				$path = undef;
+			}
+		} elsif (defined($path)) {
+			warn "stat($path): $!, quitting ...\n";
+			undef $path; # don't unlink
+			$quit->();
+		}
+		return 1 if defined($path);
+		my $now = now();
+		my $n = 0;
+		for my $s (values %$dmap) {
+			$s->can('busy') or next;
+			if ($s->busy($now)) {
+				++$n;
+			} else {
+				$s->close;
+			}
+		}
+		$n; # true: continue, false: stop
+	});
+	PublicInbox::DS->EventLoop;
+	exit($exit_code // 0);
+}
+
+# for users w/o IO::FDPass
+sub oneshot {
+	dispatch({
+		0 => *STDIN{IO},
+		1 => *STDOUT{IO},
+		2 => *STDERR{IO},
+		env => \%ENV
+	}, @ARGV);
+}
+
+1;
diff --git a/script/lei b/script/lei
new file mode 100755
index 00000000..1b5af3a1
--- /dev/null
+++ b/script/lei
@@ -0,0 +1,58 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Cwd qw(cwd);
+use IO::Socket::UNIX;
+
+if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
+	my $path = do {
+		my $runtime_dir = ($ENV{XDG_RUNTIME_DIR} // '') . '/lei';
+		if ($runtime_dir eq '/lei') {
+			require File::Spec;
+			$runtime_dir = File::Spec->tmpdir."/lei-$<";
+		}
+		unless (-d $runtime_dir && -w _) {
+			require File::Path;
+			File::Path::mkpath($runtime_dir, 0, 0700);
+		}
+		"$runtime_dir/sock";
+	};
+	my $sock = IO::Socket::UNIX->new(Peer => $path, Type => SOCK_STREAM);
+	unless ($sock) { # start the daemon if not started
+		my $err = $!;
+		require PublicInbox::LeiDaemon;
+		$err = PublicInbox::LeiDaemon::lazy_start($path, $err);
+		# try connecting again anyways, unlink+bind may be racy
+		$sock = IO::Socket::UNIX->new(Peer => $path,
+						Type => SOCK_STREAM) // die
+			"connect($path): $! (bind($path): $err)";
+	}
+	my $pwd = $ENV{PWD};
+	my $cwd = cwd();
+	if ($pwd) { # prefer ENV{PWD} if it's a symlink to real cwd
+		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!\n";
+		my @st_pwd = stat($pwd);
+		# make sure st_dev/st_ino match for {PWD} to be valid
+		$pwd = $cwd if (!@st_pwd || $st_pwd[1] != $st_cwd[1] ||
+					$st_pwd[0] != $st_cwd[0]);
+	} else {
+		$pwd = $cwd;
+	}
+	local $ENV{PWD} = $pwd;
+	$sock->autoflush(1);
+	IO::FDPass::send(fileno($sock), $_) for (0..2);
+	my $buf = "$$\0\0>" . join("]\0[", @ARGV) . "\0\0>";
+	while (my ($k, $v) = each %ENV) { $buf .= "$k=$v\0" }
+	$buf .= "\0\0";
+	print $sock $buf or die "print(sock, buf): $!";
+	local $/ = "\n";
+	while (my $line = <$sock>) {
+		$line =~ /\Aexit=([0-9]+)\n\z/ and exit($1 + 0);
+		die $line;
+	}
+} else { # for systems lacking IO::FDPass
+	require PublicInbox::LeiDaemon;
+	PublicInbox::LeiDaemon::oneshot();
+}
diff --git a/t/lei.t b/t/lei.t
new file mode 100644
index 00000000..feee9270
--- /dev/null
+++ b/t/lei.t
@@ -0,0 +1,80 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use PublicInbox::Config;
+my $json = PublicInbox::Config::json() or plan skip_all => 'JSON missing';
+require_mods(qw(DBD::SQLite Search::Xapian));
+my ($home, $for_destroy) = tmpdir();
+my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
+
+SKIP: {
+	require_mods('IO::FDPass', 51);
+	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
+	mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
+
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	is($err, '', 'no error from daemon-pid');
+	like($out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
+	chomp(my $pid = $out);
+	ok(kill(0, $pid), 'pid is valid');
+	ok(-S $sock, 'sock created');
+
+	ok(!run_script([qw(lei)], undef, $opt), 'no args fails');
+	is($? >> 8, 1, '$? is 1');
+	is($out, '', 'nothing in stdout');
+	like($err, qr/^usage:/sm, 'usage in stderr');
+
+	for my $arg (['-h'], ['--help'], ['help'], [qw(daemon-pid --help)]) {
+		$out = $err = '';
+		ok(run_script(['lei', @$arg], undef, $opt), "lei @$arg");
+		like($out, qr/^usage:/sm, "usage in stdout (@$arg)");
+		is($err, '', "nothing in stderr (@$arg)");
+	}
+
+	ok(!run_script([qw(lei DBG-false)], undef, $opt), 'false(1) emulation');
+	is($? >> 8, 1, '$? set correctly');
+	is($err, '', 'no error from false(1) emulation');
+
+	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
+		$out = $err = '';
+		ok(!run_script(['lei', @$arg], undef, $opt), "lei @$arg");
+		is($? >> 8, 1, '$? set correctly');
+		isnt($err, '', 'something in stderr');
+		is($out, '', 'nothing in stdout');
+	}
+
+	$out = '';
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	chomp(my $pid_again = $out);
+	is($pid, $pid_again, 'daemon-pid idempotent');
+
+	ok(run_script([qw(lei daemon-stop)], undef, $opt), 'daemon-stop');
+	is($out, '', 'no output from daemon-stop');
+	is($err, '', 'no error from daemon-stop');
+	for (0..100) {
+		kill(0, $pid) or last;
+		tick();
+	}
+	ok(!-S $sock, 'sock gone');
+	ok(!kill(0, $pid), 'pid gone after stop');
+
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	chomp(my $new_pid = $out);
+	ok(kill(0, $new_pid), 'new pid is running');
+	ok(-S $sock, 'sock exists again');
+	unlink $sock or BAIL_OUT "unlink $!";
+	for (0..100) {
+		kill('CHLD', $new_pid) or last;
+		tick();
+	}
+	ok(!kill(0, $new_pid), 'daemon exits after unlink');
+};
+
+require_ok 'PublicInbox::LeiDaemon';
+
+done_testing;

^ permalink raw reply related	[relevance 28%]

* [PATCH 08/26] lei: ensure we run a restrictive umask
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (4 preceding siblings ...)
  2020-12-18 12:09 49% ` [PATCH 07/26] t/lei-oneshot: standalone oneshot (non-socket) test Eric Wong
@ 2020-12-18 12:09 68% ` Eric Wong
  2020-12-18 12:09 44% ` [PATCH 09/26] lei: support `daemon-env' for modifying long-lived env Eric Wong
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

While we configure the LeiStore git repos and DBs to have a
restrictive umask, lei may also write to Maildirs/mboxes/etc.

We will follow mutt behavior when saving files/messages to the FS.
We want to create files which are readable only by the local
user since this is intended for private mail and could be used
on shared systems.

We may allow passing the umask on a per-command-basis, but it's
probably not worth the effort to support.
---
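
To illustrate the effect of a process-wide umask(077), the sketch
below creates a file and a directory in a scratch directory and checks
that neither grants group/other permissions.  The paths are
hypothetical and the check assumes a POSIX filesystem without default
ACLs.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $old = umask(077) // die "umask(077): $!";

# open() requests 0666 and mkdir() requests 0777 by default;
# the kernel clears the bits set in the umask
open(my $fh, '>', "$dir/msg") or die "open: $!";
close($fh) or die "close: $!";
mkdir("$dir/Maildir") or die "mkdir: $!";

my $file_mode = (stat("$dir/msg"))[2] & 0777;
my $dir_mode = (stat("$dir/Maildir"))[2] & 0777;
umask($old); # only restored here to keep the example self-contained

printf("file=%04o dir=%04o\n", $file_mode, $dir_mode);
```

Setting the umask once in the daemon and in oneshot mode covers every
output path, including ones added later, without per-file chmod calls.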
 lib/PublicInbox/LeiDaemon.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index 010c1cba..1f170f1d 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -538,12 +538,11 @@ sub lazy_start {
 		die "connect($path): $!";
 	}
 	require IO::FDPass;
-	my $umask = umask(077) // die("umask(077): $!");
+	umask(077) // die("umask(077): $!");
 	my $l = IO::Socket::UNIX->new(Local => $path,
 					Listen => 1024,
 					Type => SOCK_STREAM) or
 		$err = $!;
-	umask($umask) or die("umask(restore): $!");
 	$l or return die "bind($path): $err";
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
@@ -638,6 +637,7 @@ sub oneshot {
 	my $exit = $main_pkg->can('exit'); # caller may override exit()
 	local $quit = $exit if $exit;
 	local %PATH2CFG;
+	umask(077) // die("umask(077): $!");
 	dispatch({
 		0 => *STDIN{IO},
 		1 => *STDOUT{IO},

^ permalink raw reply related	[relevance 68%]

* [PATCH 05/26] lei: use spawn (vfork + execve) for lazy start
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
  2020-12-18 12:09 28% ` [PATCH 01/26] lei: FD-passing and IPC basics Eric Wong
  2020-12-18 12:09 45% ` [PATCH 02/26] lei: proposed command-listing and options Eric Wong
@ 2020-12-18 12:09 61% ` Eric Wong
  2020-12-18 12:09 20% ` [PATCH 06/26] lei: refine help/option parsing, implement "init" Eric Wong
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

This allows us to rely on FD_CLOEXEC being set on pipes
from prove(1), so forgetting `daemon-stop' won't cause
tests to hang.

Unfortunately, daemon tests will be slower with this.
---
 lib/PublicInbox/LeiDaemon.pm | 12 +++++-------
 script/lei                   | 14 ++++++++++----
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index b4b1ac59..fd4d00d4 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -324,29 +324,27 @@ sub accept_dispatch { # Listener {post_accept} callback
 sub noop {}
 
 # lei(1) calls this when it can't connect
-sub lazy_start ($$) {
+sub lazy_start {
 	my ($path, $err) = @_;
 	if ($err == ECONNREFUSED) {
 		unlink($path) or die "unlink($path): $!";
 	} elsif ($err != ENOENT) {
 		die "connect($path): $!";
 	}
+	require IO::FDPass;
 	my $umask = umask(077) // die("umask(077): $!");
 	my $l = IO::Socket::UNIX->new(Local => $path,
 					Listen => 1024,
 					Type => SOCK_STREAM) or
 		$err = $!;
 	umask($umask) or die("umask(restore): $!");
-	$l or return $err;
+	$l or return die "bind($path): $err";
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
 	my $oldset = PublicInbox::Sigfd::block_signals();
 	my $pid = fork // die "fork: $!";
-	if ($pid) {
-		PublicInbox::Sigfd::sig_setmask($oldset);
-		return; # client will connect to $path
-	}
+	return if $pid;
 	openlog($path, 'pid', 'user');
 	local $SIG{__DIE__} = sub {
 		syslog('crit', "@_");
@@ -360,7 +358,7 @@ sub lazy_start ($$) {
 	open STDERR, '>&STDIN' or die "redirect stderr failed: $!\n";
 	setsid();
 	$pid = fork // die "fork: $!";
-	exit if $pid;
+	return if $pid;
 	$0 = "lei-daemon $path";
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
diff --git a/script/lei b/script/lei
index 1b5af3a1..637c1951 100755
--- a/script/lei
+++ b/script/lei
@@ -21,13 +21,19 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 	};
 	my $sock = IO::Socket::UNIX->new(Peer => $path, Type => SOCK_STREAM);
 	unless ($sock) { # start the daemon if not started
-		my $err = $!;
-		require PublicInbox::LeiDaemon;
-		$err = PublicInbox::LeiDaemon::lazy_start($path, $err);
+		my $err = $! + 0;
+		my $env = { PERL5LIB => join(':', @INC) };
+		my $cmd = [ $^X, qw[-MPublicInbox::LeiDaemon
+			-E PublicInbox::LeiDaemon::lazy_start(@ARGV)],
+			$path, $err ];
+		require PublicInbox::Spawn;
+		waitpid(PublicInbox::Spawn::spawn($cmd, $env), 0);
+		warn "lei-daemon exited with \$?=$?\n" if $?;
+
 		# try connecting again anyways, unlink+bind may be racy
 		$sock = IO::Socket::UNIX->new(Peer => $path,
 						Type => SOCK_STREAM) // die
-			"connect($path): $! (bind($path): $err)";
+			"connect($path): $! (after attempted daemon start)";
 	}
 	my $pwd = $ENV{PWD};
 	my $cwd = cwd();

^ permalink raw reply related	[relevance 61%]

* [PATCH 12/26] rename LeiDaemon package to PublicInbox::LEI
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (6 preceding siblings ...)
  2020-12-18 12:09 44% ` [PATCH 09/26] lei: support `daemon-env' for modifying long-lived env Eric Wong
@ 2020-12-18 12:09 60% ` Eric Wong
  2020-12-18 12:09 66% ` [PATCH 13/26] lei: support pass-through for `lei config' Eric Wong
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

"LEI" is an acronym, and ALL CAPS is consistent with existing
PublicInbox::{IMAP,HTTP,NNTP,WWW} naming for top-level modules;
3 of those 4 deal directly with sockets and requests.
---
 MANIFEST                                 | 2 +-
 lib/PublicInbox/{LeiDaemon.pm => LEI.pm} | 2 +-
 script/lei                               | 8 ++++----
 t/lei-oneshot.t                          | 4 ++--
 t/lei.t                                  | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)
 rename lib/PublicInbox/{LeiDaemon.pm => LEI.pm} (99%)

diff --git a/MANIFEST b/MANIFEST
index 898766e7..29b47843 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -159,7 +159,7 @@ lib/PublicInbox/InboxIdle.pm
 lib/PublicInbox/InboxWritable.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
-lib/PublicInbox/LeiDaemon.pm
+lib/PublicInbox/LEI.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
 lib/PublicInbox/Linkify.pm
diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LEI.pm
similarity index 99%
rename from lib/PublicInbox/LeiDaemon.pm
rename to lib/PublicInbox/LEI.pm
index 56f4aa7d..b5ba1f71 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -5,7 +5,7 @@
 # PublicInbox::Daemon, this is designed exclusively to handle trusted
 # local clients with read/write access to the FS and use as many
 # system resources as the local user has access to.
-package PublicInbox::LeiDaemon;
+package PublicInbox::LEI;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::DS);
diff --git a/script/lei b/script/lei
index fce088e9..e59e4316 100755
--- a/script/lei
+++ b/script/lei
@@ -23,8 +23,8 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 	unless ($sock) { # start the daemon if not started
 		my $err = $! + 0;
 		my $env = { PERL5LIB => join(':', @INC) };
-		my $cmd = [ $^X, qw[-MPublicInbox::LeiDaemon
-			-E PublicInbox::LeiDaemon::lazy_start(@ARGV)],
+		my $cmd = [ $^X, qw[-MPublicInbox::LEI
+			-E PublicInbox::LEI::lazy_start(@ARGV)],
 			$path, $err ];
 		require PublicInbox::Spawn;
 		waitpid(PublicInbox::Spawn::spawn($cmd, $env), 0);
@@ -59,6 +59,6 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 		die $line;
 	}
 } else { # for systems lacking IO::FDPass
-	require PublicInbox::LeiDaemon;
-	PublicInbox::LeiDaemon::oneshot(__PACKAGE__);
+	require PublicInbox::LEI;
+	PublicInbox::LEI::oneshot(__PACKAGE__);
 }
diff --git a/t/lei-oneshot.t b/t/lei-oneshot.t
index 848682ee..3b8e412d 100644
--- a/t/lei-oneshot.t
+++ b/t/lei-oneshot.t
@@ -13,8 +13,8 @@ use subs qw(exit);
 sub main {
 # the below "line" directive is a magic comment, see perlsyn(1) manpage
 # line 1 "lei-oneshot"
-	require PublicInbox::LeiDaemon;
-	PublicInbox::LeiDaemon::oneshot(__PACKAGE__);
+	require PublicInbox::LEI;
+	PublicInbox::LEI::oneshot(__PACKAGE__);
 	0;
 }
 1;
diff --git a/t/lei.t b/t/lei.t
index 53268908..7ecadf7d 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -159,7 +159,7 @@ SKIP: {
 	$test_lei_common = undef;
 };
 
-require_ok 'PublicInbox::LeiDaemon';
+require_ok 'PublicInbox::LEI';
 $LEI = 'lei-oneshot' if $test_lei_oneshot;
 $test_lei_common->() if $test_lei_common;
 


* [PATCH 07/26] t/lei-oneshot: standalone oneshot (non-socket) test
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (3 preceding siblings ...)
  2020-12-18 12:09 20% ` [PATCH 06/26] lei: refine help/option parsing, implement "init" Eric Wong
@ 2020-12-18 12:09 49% ` Eric Wong
  2020-12-18 12:09 68% ` [PATCH 08/26] lei: ensure we run a restrictive umask Eric Wong
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

We can use the same "local $ENV{FOO}" hack we do with
t/nntpd-v2.t to test the oneshot code path without imposing
an extra script in the users' $PATH.
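
For readers unfamiliar with the idiom, here is a minimal sketch of how
a scoped "local $ENV{FOO}" override behaves.  DEMO_ONESHOT and
code_path() are invented for this illustration and are not part of the
test suite:

```perl
use strict;
use warnings;

# Hypothetical helper: reports which code path a test would take,
# based on an environment variable (DEMO_ONESHOT is an invented name).
sub code_path { $ENV{DEMO_ONESHOT} ? 'oneshot' : 'daemon' }

delete $ENV{DEMO_ONESHOT}; # start from a known state
my @seen;
{
	# local() restores the previous state (here: unset) at scope exit
	local $ENV{DEMO_ONESHOT} = '1';
	push @seen, code_path(); # sees the override
}
push @seen, code_path(); # the override is gone again
print "@seen\n";
```

Anything require()d or run inside the block sees the override, and
nothing outside it does, which is why no extra script is needed.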
---
 MANIFEST                      |  1 +
 lib/PublicInbox/TestCommon.pm |  2 +-
 t/lei-oneshot.t               | 25 +++++++++++++++++++++++++
 t/lei.t                       | 32 ++++++++++++++++++++------------
 4 files changed, 47 insertions(+), 13 deletions(-)
 create mode 100644 t/lei-oneshot.t

diff --git a/MANIFEST b/MANIFEST
index 9eb97d14..898766e7 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -320,6 +320,7 @@ t/indexlevels-mirror.t
 t/init.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-oneshot.t
 t/lei.t
 t/lei_store.t
 t/linkify.t
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 2116575b..c236c589 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -168,7 +168,7 @@ sub run_script_exit {
 	die RUN_SCRIPT_EXIT;
 }
 
-my %cached_scripts;
+our %cached_scripts;
 sub key2sub ($) {
 	my ($key) = @_;
 	$cached_scripts{$key} //= do {
diff --git a/t/lei-oneshot.t b/t/lei-oneshot.t
new file mode 100644
index 00000000..848682ee
--- /dev/null
+++ b/t/lei-oneshot.t
@@ -0,0 +1,25 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use PublicInbox::TestCommon;
+$PublicInbox::TestCommon::cached_scripts{'lei-oneshot'} //= do {
+	eval <<'EOF';
+package LeiOneshot;
+use strict;
+use subs qw(exit);
+*exit = \&PublicInbox::TestCommon::run_script_exit;
+sub main {
+# the below "line" directive is a magic comment, see perlsyn(1) manpage
+# line 1 "lei-oneshot"
+	require PublicInbox::LeiDaemon;
+	PublicInbox::LeiDaemon::oneshot(__PACKAGE__);
+	0;
+}
+1;
+EOF
+	LeiOneshot->can('main');
+};
+local $ENV{TEST_LEI_ONESHOT} = '1';
+require './t/lei.t';
diff --git a/t/lei.t b/t/lei.t
index 9fb0ce00..507c7164 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -8,6 +8,12 @@ use PublicInbox::TestCommon;
 use PublicInbox::Config;
 use File::Path qw(rmtree);
 require_mods(qw(json DBD::SQLite Search::Xapian));
+my $LEI = 'lei';
+my $lei = sub {
+	my ($cmd, $env, $opt) = @_;
+	run_script([$LEI, @$cmd], $env, $opt);
+};
+
 my ($home, $for_destroy) = tmpdir();
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 delete local $ENV{XDG_DATA_HOME};
@@ -17,21 +23,21 @@ local $ENV{HOME} = $home;
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
 
 my $test_lei_common = sub {
-	ok(!run_script([qw(lei)], undef, $opt), 'no args fails');
+	ok(!$lei->([], undef, $opt), 'no args fails');
 	is($? >> 8, 1, '$? is 1');
 	is($out, '', 'nothing in stdout');
 	like($err, qr/^usage:/sm, 'usage in stderr');
 
 	for my $arg (['-h'], ['--help'], ['help'], [qw(daemon-pid --help)]) {
 		$out = $err = '';
-		ok(run_script(['lei', @$arg], undef, $opt), "lei @$arg");
+		ok($lei->($arg, undef, $opt), "lei @$arg");
 		like($out, qr/^usage:/sm, "usage in stdout (@$arg)");
 		is($err, '', "nothing in stderr (@$arg)");
 	}
 
 	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
 		$out = $err = '';
-		ok(!run_script(['lei', @$arg], undef, $opt), "lei @$arg");
+		ok(!$lei->($arg, undef, $opt), "lei @$arg");
 		is($? >> 8, 1, '$? set correctly');
 		isnt($err, '', 'something in stderr');
 		is($out, '', 'nothing in stdout');
@@ -47,29 +53,27 @@ my $test_lei_common = sub {
 	};
 	my $home_trash = [ "$home/.local", "$home/.config" ];
 	rmtree($home_trash);
-	ok(run_script([qw(lei init)], undef, $opt), 'init w/o args');
+	ok($lei->(['init'], undef, $opt), 'init w/o args');
 	$ok_err_info->('after init w/o args');
-	ok(run_script([qw(lei init)], undef, $opt), 'idempotent init w/o args');
+	ok($lei->(['init'], undef, $opt), 'idempotent init w/o args');
 	$ok_err_info->('after idempotent init w/o args');
 
-	ok(!run_script([qw(lei init), "$home/x"], undef, $opt),
+	ok(!$lei->(['init', "$home/x"], undef, $opt),
 		'init conflict');
 	is(grep(/^E:/, split(/^/, $err)), 1, 'got error on conflict');
 	ok(!-e "$home/x", 'nothing created on conflict');
 	rmtree($home_trash);
 
 	$err = '';
-	ok(run_script([qw(lei init), "$home/x"], undef, $opt),
-		'init conflict resolved');
+	ok($lei->(['init', "$home/x"], undef, $opt), 'init conflict resolved');
 	$ok_err_info->('init w/ arg');
-	ok(run_script([qw(lei init), "$home/x"], undef, $opt),
-		'init idempotent with path');
+	ok($lei->(['init', "$home/x"], undef, $opt), 'init idempotent w/ path');
 	$ok_err_info->('init idempotent w/ arg');
 	ok(-d "$home/x", 'created dir');
 	rmtree([ "$home/x", @$home_trash ]);
 
 	$err = '';
-	ok(!run_script([qw(lei init), "$home/x", "$home/2" ], undef, $opt),
+	ok(!$lei->(['init', "$home/x", "$home/2" ], undef, $opt),
 		'too many args fails');
 	like($err, qr/too many/, 'noted excessive');
 	ok(!-e "$home/x", 'x not created on excessive');
@@ -80,7 +84,9 @@ my $test_lei_common = sub {
 	is($out, '', 'nothing in stdout');
 };
 
+my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};
 SKIP: {
+	last SKIP if $test_lei_oneshot;
 	require_mods('IO::FDPass', 16);
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
 
@@ -118,10 +124,12 @@ SKIP: {
 		tick();
 	}
 	ok(!kill(0, $new_pid), 'daemon exits after unlink');
-	$test_lei_common = undef; # success over socket, can't test without
+	# success over socket, can't test without
+	$test_lei_common = undef;
 };
 
 require_ok 'PublicInbox::LeiDaemon';
+$LEI = 'lei-oneshot' if $test_lei_oneshot;
 $test_lei_common->() if $test_lei_common;
 
 done_testing;


* [PATCH 09/26] lei: support `daemon-env' for modifying long-lived env
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (5 preceding siblings ...)
  2020-12-18 12:09 68% ` [PATCH 08/26] lei: ensure we run a restrictive umask Eric Wong
@ 2020-12-18 12:09 44% ` Eric Wong
  2020-12-18 12:09 60% ` [PATCH 12/26] rename LeiDaemon package to PublicInbox::LEI Eric Wong
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

While lei(1) socket connections can set environment variables
for their running context, they may not completely remove some of
them.  The background daemon just inherits whatever env the
client spawning it had.  This command ensures the persistent env
can be modified as needed.

Similar to env(1), this supports "-u", "-" (--clear), and
"-0"/"-z" switches.  It may be useful to unset or change
or even completely clear the environment independently
of what a socket client feeds us.

"-i" is omitted since "--ignore-environment" seems like a bad
name for a persistent daemon as opposed to a one-shot command.
"-" and --clear (like clearenv(3)) will completely clobber
the environment.
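
The intended semantics can be sketched roughly as follows.  This is a
stand-alone illustration under assumptions: apply_env_changes() is a
hypothetical helper that edits a plain hash instead of the real %ENV,
and treating a bare NAME as an empty value is a choice made only for
this sketch:

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper mirroring the described semantics;
# it edits a plain hash rather than the real %ENV.
sub apply_env_changes {
	my ($env, $opt, @pairs) = @_;
	if ($opt->{clear}) {
		%$env = (); # like clearenv(3): drop everything
	} elsif (my $unset = $opt->{unset}) {
		delete @$env{@$unset}; # like repeated "env -u NAME"
	}
	for my $pair (@pairs) { # NAME=VALUE, split on the first "=" only
		my ($k, $v) = split(/=/, $pair, 2);
		$env->{$k} = $v // ''; # assumption: bare NAME => empty value
	}
	$env;
}

my %env = (HOME => '/home/user', FOO => 'BAR', LANG => 'C');
apply_env_changes(\%env, { unset => [ 'FOO' ] }, 'EDITOR=vi', 'X=a=b');
print join("\n", map { "$_=$env{$_}" } sort keys %env), "\n";
```

Clearing or unsetting happens before any NAME=VALUE assignments, so a
single invocation can both reset the environment and repopulate it.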

"Lonesome dash" support is added to our option/help parsing
for the "-" shortcut to "--clear".
Getopt::Long doesn't seem to support specs like "clear|" or
"stdin|", but only "", so we do a little pre/post-processing
to merge the cases.
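
A minimal sketch of the merging idea, using the documented ''
(lonesome dash) spec of Getopt::Long.  parse_clear() is a hypothetical
helper written for this note and is simpler than the pre/post-processing
in the patch itself:

```perl
use strict;
use warnings;
use Getopt::Long ();

# Returns 1 if --clear or a lone "-" was given in @$argv, else 0.
sub parse_clear {
	my ($argv) = @_;
	my $p = Getopt::Long::Parser->new;
	$p->configure(qw(gnu_getopt no_ignore_case));
	my $clear;
	# '' is the Getopt::Long spec for a lone "-"; both specs share
	# one variable so either spelling sets it
	$p->getoptionsfromarray($argv, 'clear' => \$clear, '' => \$clear)
		or die "bad options\n";
	$clear ? 1 : 0;
}

print parse_clear(['-']), parse_clear(['--clear']), parse_clear([]), "\n";
```

Binding both specs to one variable keeps the command implementation
unaware of which spelling the user typed.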
---
 lib/PublicInbox/LeiDaemon.pm | 55 ++++++++++++++++++++++++++++++++----
 t/lei.t                      | 31 ++++++++++++++++++++
 2 files changed, 80 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index 1f170f1d..56f4aa7d 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -60,7 +60,7 @@ our %CMD = ( # sorted in order of importance/use:
 
 'plonk' => [ '--thread|--from=IDENT',
 	'exclude mail matching From: or thread from non-Message-ID searches',
-	qw(thread|t stdin| from|f=s mid=s oid=s) ],
+	qw(stdin| thread|t from|f=s mid=s oid=s) ],
 'mark' => [ 'MESSAGE_FLAGS...',
 	'set/unset flags on message(s) from stdin',
 	qw(stdin| oid=s exact by-mid|mid:s) ],
@@ -103,6 +103,8 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(quiet|q) ],
 'daemon-stop' => [ '', 'stop the lei-daemon' ],
 'daemon-pid' => [ '', 'show the PID of the lei-daemon' ],
+'daemon-env' => [ '[NAME=VALUE...]', 'set, unset, or show daemon environment',
+	qw(clear| unset|u=s@ z|0) ],
 'help' => [ '[SUBCOMMAND]', 'show help' ],
 
 # XXX do we need this?
@@ -175,6 +177,16 @@ my %OPTDESC = (
 
 'by-mid|mid:s' => [ 'MID', 'match only by Message-ID, ignoring contents' ],
 'jobs:i' => 'set parallelism level',
+
+# xargs, env, use "-0", git(1) uses "-z".  Should we support z|0 everywhere?
+'z' => 'use NUL \\0 instead of newline (CR) to delimit lines',
+'z|0' => 'use NUL \\0 instead of newline (CR) to delimit lines',
+
+# note: no "--ignore-environment" / "-i" support like env(1) since that
+# is one-shot and this is for a persistent daemon:
+'clear|' => 'clear the daemon environment',
+'unset|u=s@' => ['NAME',
+	'unset matching NAME, may be specified multiple times'],
 ); # %OPTDESC
 
 sub x_it ($$) { # pronounced "exit"
@@ -257,7 +269,11 @@ sub _help ($;$) {
 				join(', ', @allow) . " or $last";
 		}
 		my $lhs = join(', ', @s, @l) . join('', @vals);
-		$lhs =~ s/\A--/    --/; # pad if no short options
+		if ($x =~ /\|\z/) { # "stdin|" or "clear|"
+			$lhs =~ s/\A--/- , --/;
+		} else {
+			$lhs =~ s/\A--/    --/; # pad if no short options
+		}
 		$lpad = length($lhs) if length($lhs) > $lpad;
 		push @opt_desc, $lhs, $desc;
 	}
@@ -289,9 +305,20 @@ sub optparse ($$$) {
 	my $opt = $client->{opt} = {};
 	my $info = $CMD{$cmd} // [ '[...]', '(undocumented command)' ];
 	my ($proto, $desc, @spec) = @$info;
-	$glp->getoptionsfromarray($argv, $opt, @spec, qw(help|h)) or
+	push @spec, qw(help|h);
+	my $lone_dash;
+	if ($spec[0] =~ s/\|\z//s) { # "stdin|" or "clear|" allows "-" alias
+		$lone_dash = $spec[0];
+		$opt->{$spec[0]} = \(my $var);
+		push @spec, '' => \$var;
+	}
+	$glp->getoptionsfromarray($argv, $opt, @spec) or
 		return _help($client, "bad arguments or options for $cmd");
 	return _help($client) if $opt->{help};
+
+	# "-" aliases "stdin" or "clear"
+	$opt->{$lone_dash} = ${$opt->{$lone_dash}} if defined $lone_dash;
+
 	my $i = 0;
 	my $POS_ARG = '[A-Z][A-Z0-9_]+';
 	my ($err, $inf);
@@ -461,12 +488,28 @@ E: leistore.dir=$cur already initialized and it is not $dir
 	return qerr($client, $exists);
 }
 
-sub lei_daemon_pid {
-	emit($_[0], 1, "$$\n");
-}
+sub lei_daemon_pid { emit($_[0], 1, "$$\n") }
 
 sub lei_daemon_stop { $quit->(0) }
 
+sub lei_daemon_env {
+	my ($client, @argv) = @_;
+	my $opt = $client->{opt};
+	if (defined $opt->{clear}) {
+		%ENV = ();
+	} elsif (my $u = $opt->{unset}) {
+		delete @ENV{@$u};
+	}
+	if (@argv) {
+		%ENV = (%ENV, map { split(/=/, $_, 2) } @argv);
+	} elsif (!defined($opt->{clear}) && !$opt->{unset}) {
+		my $eor = $opt->{z} ? "\0" : "\n";
+		my $buf = '';
+		while (my ($k, $v) = each %ENV) { $buf .= "$k=$v$eor" }
+		emit($client, 1, $buf)
+	}
+}
+
 sub lei_help { _help($_[0]) }
 
 sub reap_exec { # dwaitpid callback
diff --git a/t/lei.t b/t/lei.t
index 507c7164..53268908 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -20,6 +20,7 @@ delete local $ENV{XDG_DATA_HOME};
 delete local $ENV{XDG_CONFIG_HOME};
 local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 local $ENV{HOME} = $home;
+local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
 
 my $test_lei_common = sub {
@@ -104,6 +105,36 @@ SKIP: {
 	chomp(my $pid_again = $out);
 	is($pid, $pid_again, 'daemon-pid idempotent');
 
+	$out = '';
+	ok(run_script([qw(lei daemon-env -0)], undef, $opt), 'show env');
+	is($err, '', 'no errors in env dump');
+	my @env = split(/\0/, $out);
+	is(scalar grep(/\AHOME=\Q$home\E\z/, @env), 1, 'env has HOME');
+	is(scalar grep(/\AFOO=BAR\z/, @env), 1, 'env has FOO=BAR');
+	is(scalar grep(/\AXDG_RUNTIME_DIR=/, @env), 1, 'has XDG_RUNTIME_DIR');
+
+	$out = '';
+	ok(run_script([qw(lei daemon-env -u FOO)], undef, $opt), 'unset');
+	is($out.$err, '', 'no output for unset');
+	ok(run_script([qw(lei daemon-env -0)], undef, $opt), 'show again');
+	is($err, '', 'no errors in env dump');
+	@env = split(/\0/, $out);
+	is(scalar grep(/\AFOO=BAR\z/, @env), 0, 'env unset FOO');
+
+	$out = '';
+	ok(run_script([qw(lei daemon-env -u FOO -u HOME -u XDG_RUNTIME_DIR)],
+			undef, $opt), 'unset multiple');
+	is($out.$err, '', 'no errors output for unset');
+	ok(run_script([qw(lei daemon-env -0)], undef, $opt), 'show again');
+	is($err, '', 'no errors in env dump');
+	@env = split(/\0/, $out);
+	is(scalar grep(/\A(?:HOME|XDG_RUNTIME_DIR)=\z/, @env), 0, 'env unset@');
+	$out = '';
+	ok(run_script([qw(lei daemon-env -)], undef, $opt), 'clear env');
+	is($out.$err, '', 'no output');
+	ok(run_script([qw(lei daemon-env)], undef, $opt), 'env is empty');
+	is($out, '', 'env cleared');
+
 	ok(run_script([qw(lei daemon-stop)], undef, $opt), 'daemon-stop');
 	is($out, '', 'no output from daemon-stop');
 	is($err, '', 'no error from daemon-stop');


* [PATCH 06/26] lei: refine help/option parsing, implement "init"
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (2 preceding siblings ...)
  2020-12-18 12:09 61% ` [PATCH 05/26] lei: use spawn (vfork + execve) for lazy start Eric Wong
@ 2020-12-18 12:09 20% ` Eric Wong
  2020-12-18 12:09 49% ` [PATCH 07/26] t/lei-oneshot: standalone oneshot (non-socket) test Eric Wong
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

There's a bunch of work in here as the foundations are being
fleshed out.  One of the UI/UX goals is to make it easy to keep
built-in help and shell completions consistent.
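
To illustrate the consistency goal, here is a rough sketch of deriving
both usage text and completion candidates from one command table.  The
table below is a cut-down, hypothetical miniature, and switches(),
usage() and completions() are names invented for this note; shell
completion is not implemented by this patch:

```perl
use strict;
use warnings;

# Hypothetical miniature of a command table: name => [ args, desc, specs ]
my %CMD = (
	'init' => [ '[PATHNAME]', 'initialize storage', qw(quiet|q) ],
	'ls-query' => [ '[FILTER...]', 'list saved search queries',
			qw(name-only format|f=s z) ],
);

# Turn a Getopt::Long spec such as "format|f=s" into switch names.
sub switches {
	my ($spec) = @_;
	(my $names = $spec) =~ s/[=:!].*\z//s; # drop type/negation suffix
	map { length($_) > 1 ? "--$_" : "-$_" } split(/\|/, $names);
}

# Help text and completion candidates both come from the same table.
sub usage { my ($cmd) = @_; "usage: lei $cmd $CMD{$cmd}->[0]\n" }

sub completions {
	my ($cmd) = @_;
	my (undef, undef, @specs) = @{$CMD{$cmd}};
	my @sw = map { switches($_) } @specs;
	return sort @sw;
}

print usage('ls-query');
print join(' ', completions('ls-query')), "\n";
```

Since both outputs read the same specs, adding an option in one place
updates help and completion together.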
---
 lib/PublicInbox/LeiDaemon.pm | 401 ++++++++++++++++++++++++++---------
 lib/PublicInbox/LeiStore.pm  |   7 +-
 script/lei                   |   2 +-
 t/lei.t                      |  82 +++++--
 4 files changed, 378 insertions(+), 114 deletions(-)

diff --git a/lib/PublicInbox/LeiDaemon.pm b/lib/PublicInbox/LeiDaemon.pm
index fd4d00d4..010c1cba 100644
--- a/lib/PublicInbox/LeiDaemon.pm
+++ b/lib/PublicInbox/LeiDaemon.pm
@@ -15,13 +15,18 @@ use POSIX qw(setsid);
 use IO::Socket::UNIX;
 use IO::Handle ();
 use Sys::Syslog qw(syslog openlog);
+use PublicInbox::Config;
 use PublicInbox::Syscall qw($SFD_NONBLOCK EPOLLIN EPOLLONESHOT);
 use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now);
 use PublicInbox::Spawn qw(spawn);
-our $quit = sub { exit(shift // 0) };
+use Text::Wrap qw(wrap);
+use File::Path qw(mkpath);
+use File::Spec;
+our $quit = \&CORE::exit;
 my $glp = Getopt::Long::Parser->new;
 $glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
+our %PATH2CFG; # persistent for socket daemon
 
 # TBD: this is a documentation mechanism to show a subcommand
 # (may) pass options through to another command:
@@ -30,45 +35,48 @@ sub pass_through { () }
 # TODO: generate shell completion + help using %CMD and %OPTDESC
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
-'query' => [ 'SEARCH-TERMS...', 'search for messages matching terms', qw(
+'query' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
-	limit|n=i sort|s=s reverse|r offset=i remote local! extinbox!
+	limit|n=i sort|s=s@ reverse|r offset=i remote local! extinbox!
 	since|after=s until|before=s) ],
 
-'show' => [ '{MID|OID}', 'show a given object (Message-ID or object ID)',
+'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
 	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
 	pass_through('git show') ],
 
-'add-extinbox' => [ 'URL-OR-PATHNAME',
+'add-extinbox' => [ 'URL_OR_PATHNAME',
 	'add/set priority of a publicinbox|extindex for extra matches',
 	qw(prio=i) ],
-'ls-extinbox' => [ '[FILTER]', 'list publicinbox|extindex locations',
+'ls-extinbox' => [ '[FILTER...]', 'list publicinbox|extindex locations',
 	qw(format|f=s z local remote) ],
-'forget-extinbox' => [ '{URL-OR-PATHNAME|--prune}',
+'forget-extinbox' => [ '{URL_OR_PATHNAME|--prune}',
 	'exclude further results from a publicinbox|extindex',
 	qw(prune) ],
 
-'ls-query' => [ '[FILTER]', 'list saved search queries',
+'ls-query' => [ '[FILTER...]', 'list saved search queries',
 		qw(name-only format|f=s z) ],
 'rm-query' => [ 'QUERY_NAME', 'remove a saved search' ],
 'mv-query' => [ qw(OLD_NAME NEW_NAME), 'rename a saved search' ],
 
-'plonk' => [ '{--thread|--from=IDENT}',
+'plonk' => [ '--thread|--from=IDENT',
 	'exclude mail matching From: or thread from non-Message-ID searches',
-	qw(thread|t from|f=s mid=s oid=s) ],
-'mark' => [ 'MESSAGE-FLAGS', 'set/unset flags on message(s) from stdin',
+	qw(thread|t stdin| from|f=s mid=s oid=s) ],
+'mark' => [ 'MESSAGE_FLAGS...',
+	'set/unset flags on message(s) from stdin',
 	qw(stdin| oid=s exact by-mid|mid:s) ],
-'forget' => [ '--stdin', 'exclude message(s) on stdin from query results',
-	qw(stdin| oid=s  exact by-mid|mid:s) ],
+'forget' => [ '[--stdin|--oid=OID|--by-mid=MID]',
+	'exclude message(s) on stdin from query results',
+	qw(stdin| oid=s exact by-mid|mid:s quiet|q) ],
 
-'purge-mailsource' => [ '{URL-OR-PATHNAME|--all}',
+'purge-mailsource' => [ '{URL_OR_PATHNAME|--all}',
 	'remove imported messages from IMAP, Maildirs, and MH',
 	qw(exact! all jobs:i indexed) ],
 
 # code repos are used for `show' to solve blobs from patch mails
 'add-coderepo' => [ 'PATHNAME', 'add or set priority of a git code repo',
 	qw(prio=i) ],
-'ls-coderepo' => [ '[FILTER]', 'list known code repos', qw(format|f=s z) ],
+'ls-coderepo' => [ '[FILTER_TERMS...]',
+		'list known code repos', qw(format|f=s z) ],
 'forget-coderepo' => [ 'PATHNAME',
 	'stop using repo to solve blobs from patches',
 	qw(prune) ],
@@ -76,7 +84,7 @@ our %CMD = ( # sorted in order of importance/use:
 'add-watch' => [ '[URL_OR_PATHNAME]',
 		'watch for new messages and flag changes',
 	qw(import! flags! interval=s recursive|r exclude=s include=s) ],
-'ls-watch' => [ '[FILTER]', 'list active watches with numbers and status',
+'ls-watch' => [ '[FILTER...]', 'list active watches with numbers and status',
 		qw(format|f=s z) ],
 'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
 'resume-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
@@ -88,11 +96,13 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(stdin| limit|n=i offset=i recursive|r exclude=s include=s !flags),
 	],
 
-'config' => [ '[ANYTHING...]',
-		'git-config(1) wrapper for ~/.config/lei/config',
+'config' => [ '[...]', 'git-config(1) wrapper for ~/.config/lei/config',
 		pass_through('git config') ],
-'daemon-stop' => [ undef, 'stop the lei-daemon' ],
-'daemon-pid' => [ undef, 'show the PID of the lei-daemon' ],
+'init' => [ '[PATHNAME]',
+	'initialize storage, default: ~/.local/share/lei/store',
+	qw(quiet|q) ],
+'daemon-stop' => [ '', 'stop the lei-daemon' ],
+'daemon-pid' => [ '', 'show the PID of the lei-daemon' ],
 'help' => [ '[SUBCOMMAND]', 'show help' ],
 
 # XXX do we need this?
@@ -108,36 +118,43 @@ our %CMD = ( # sorted in order of importance/use:
 # $spec => [@ALLOWED_VALUES (default is first), $description],
 # $spec => $description
 # "$SUB_COMMAND TAB $spec" => as above
-my $stdin_formats = [ qw(auto raw mboxrd mboxcl2 mboxcl mboxo),
+my $stdin_formats = [ 'IN|auto|raw|mboxrd|mboxcl2|mboxcl|mboxo',
 		'specify message input format' ];
-my $ls_format = [ qw(plain json null), 'listing output format' ];
-my $show_format = [ qw(plain raw html mboxrd mboxcl2 mboxcl),
-		'message/object output format' ];
+my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 
 my %OPTDESC = (
+'help|h' => 'show this built-in help',
+'quiet|q' => 'be quiet',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
-'save-as=s' => 'save a search terms by given name',
+'save-as=s' => ['NAME', 'save a search terms by given name'],
 
-'type=s' => [qw(any mid git), 'disambiguate type' ],
+'type=s' => [ 'any|mid|git', 'disambiguate type' ],
 
-'dedupe|d=s' => [qw(content oid mid), 'deduplication strategy'],
-'thread|t' => 'every message in the same thread as the actual match(es)',
+'dedupe|d=s' => ['STRAT|content|oid|mid',
+		'deduplication strategy'],
+'show	thread|t' => 'display entire thread a message belongs to',
+'query	thread|t' =>
+	'return all messages in the same thread as the actual match(es)',
 'augment|a' => 'augment --output destination instead of clobbering',
 
-'output|o=s' => "destination (e.g. `/path/to/Maildir', or `-' for stdout)",
+'output|o=s' => [ 'DEST',
+	"destination (e.g. `/path/to/Maildir', or `-' for stdout)" ],
 
+'show	format|f=s' => [ 'OUT|plain|raw|html|mboxrd|mboxcl2|mboxcl',
+			'message/object output format' ],
 'mark	format|f=s' => $stdin_formats,
 'forget	format|f=s' => $stdin_formats,
-'query	format|f=s' => [qw(maildir mboxrd mboxcl2 mboxcl html oid),
-		q[specify output format (default: determined by --output)]],
+'query	format|f=s' => [ 'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|oid',
+		'specify output format, default depends on --output'],
 'ls-query	format|f=s' => $ls_format,
-'ls-extinbox format|f=s' => $ls_format,
+'ls-extinbox	format|f=s' => $ls_format,
 
-'limit|n=i' => 'integer limit on number of matches (default: 10000)',
-'offset=i' => 'search result offset (default: 0)',
+'limit|n=i' => ['NUM',
+	'limit on number of matches (default: 10000)' ],
+'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
-'sort|s=s@' => [qw(internaldate date relevance docid),
-		"order of results `--output'-dependent)"],
+'sort|s=s@' => [ 'VAL|internaldate,date,relevance,docid',
+		"order of results `--output'-dependent"],
 
 'prio=i' => 'priority of query source',
 
@@ -156,7 +173,7 @@ my %OPTDESC = (
 'exact' => 'operate on exact header matches only',
 'exact!' => 'rely on content match instead of exact header matches',
 
-'by-mid|mid:s' => 'match only by Message-ID, ignoring contents',
+'by-mid|mid:s' => [ 'MID', 'match only by Message-ID, ignoring contents' ],
 'jobs:i' => 'set parallelism level',
 ); # %OPTDESC
 
@@ -174,93 +191,282 @@ sub x_it ($$) { # pronounced "exit"
 	}
 }
 
-sub emit ($$$) {
-	my ($client, $channel, $buf) = @_;
-	print { $client->{$channel} } $buf or warn "print FD[$channel]: $!";
+sub emit {
+	my ($client, $channel) = @_; # $buf = $_[2]
+	print { $client->{$channel} } $_[2] or die "print FD[$channel]: $!";
 }
 
-sub fail ($$;$) {
-	my ($client, $buf, $exit_code) = @_;
+sub err {
+	my ($client, $buf) = @_;
 	$buf .= "\n" unless $buf =~ /\n\z/s;
 	emit($client, 2, $buf);
+}
+
+sub qerr { $_[0]->{opt}->{quiet} or err(@_) }
+
+sub fail ($$;$) {
+	my ($client, $buf, $exit_code) = @_;
+	err($client, $buf);
 	x_it($client, ($exit_code // 1) << 8);
 	undef;
 }
 
 sub _help ($;$) {
-	my ($client, $channel) = @_;
-	emit($client, $channel //= 1, <<EOF);
-usage: lei COMMAND [OPTIONS]
+	my ($client, $errmsg) = @_;
+	my $cmd = $client->{cmd} // 'COMMAND';
+	my @info = @{$CMD{$cmd} // [ '...', '...' ]};
+	my @top = ($cmd, shift(@info) // ());
+	my $cmd_desc = shift(@info);
+	my @opt_desc;
+	my $lpad = 2;
+	for my $sw (@info) { # qw(prio=s
+		my $desc = $OPTDESC{"$cmd\t$sw"} // $OPTDESC{$sw} // next;
+		my $arg_vals = '';
+		($arg_vals, $desc) = @$desc if ref($desc) eq 'ARRAY';
+
+		# lower-case is a keyword (e.g. `content', `oid'),
+		# ALL_CAPS is a string description (e.g. `PATH')
+		if ($desc !~ /default/ && $arg_vals =~ /\b([a-z]+)[,\|]/) {
+			$desc .= "\ndefault: `$1'";
+		}
+		my (@vals, @s, @l);
+		my $x = $sw;
+		if ($x =~ s/!\z//) { # solve! => --no-solve
+			$x = "no-$x";
+		} elsif ($x =~ s/:.+//) { # optional args: $x = "mid:s"
+			@vals = (' [', undef, ']');
+		} elsif ($x =~ s/=.+//) { # required arg: $x = "type=s"
+			@vals = (' ', undef);
+		} # else: no args $x = 'thread|t'
+		for (split(/\|/, $x)) { # help|h
+			length($_) > 1 ? push(@l, "--$_") : push(@s, "-$_");
+		}
+		if (!scalar(@vals)) { # no args 'thread|t'
+		} elsif ($arg_vals =~ s/\A([A-Z_]+)\b//) { # "NAME"
+			$vals[1] = $1;
+		} else {
+			$vals[1] = uc(substr($l[0], 2)); # "--type" => "TYPE"
+		}
+		if ($arg_vals =~ /([,\|])/) {
+			my $sep = $1;
+			my @allow = split(/\Q$sep\E/, $arg_vals);
+			my $must = $sep eq '|' ? 'Must' : 'Can';
+			@allow = map { "`$_'" } @allow;
+			my $last = pop @allow;
+			$desc .= "\n$must be one of: " .
+				join(', ', @allow) . " or $last";
+		}
+		my $lhs = join(', ', @s, @l) . join('', @vals);
+		$lhs =~ s/\A--/    --/; # pad if no short options
+		$lpad = length($lhs) if length($lhs) > $lpad;
+		push @opt_desc, $lhs, $desc;
+	}
+	my $msg = $errmsg ? "E: $errmsg\n" : '';
+	$msg .= <<EOF;
+usage: lei @top
+  $cmd_desc
 
-...
 EOF
-	x_it($client, $channel == 2 ? 1 << 8 : 0); # stderr => failure
+	$lpad += 2;
+	local $Text::Wrap::columns = 78 - $lpad;
+	my $padding = ' ' x ($lpad + 2);
+	while (my ($lhs, $rhs) = splice(@opt_desc, 0, 2)) {
+		$msg .= '  '.pack("A$lpad", $lhs);
+		$rhs = wrap('', '', $rhs);
+		$rhs =~ s/\n/\n$padding/sg; # LHS pad continuation lines
+		$msg .= $rhs;
+		$msg .= "\n";
+	}
+	my $channel = $errmsg ? 2 : 1;
+	emit($client, $channel, $msg);
+	x_it($client, $errmsg ? 1 << 8 : 0); # stderr => failure
+	undef;
 }
 
-sub assert_args ($$$;$@) {
-	my ($client, $argv, $proto, $opt, @spec) = @_;
-	$opt //= {};
-	push @spec, qw(help|h);
-	$glp->getoptionsfromarray($argv, $opt, @spec) or
-		return fail($client, 'bad arguments or options');
-	if ($opt->{help}) {
-		_help($client);
-		undef;
-	} else {
-		my ($nreq, $rest) = split(/;/, $proto);
-		$nreq = (($nreq // '') =~ tr/$/$/);
-		my $argc = scalar(@$argv);
-		my $tot = ($rest // '') eq '@' ? $argc : ($proto =~ tr/$/$/);
-		return 1 if $argc <= $tot && $argc >= $nreq;
-		_help($client, 2);
-		undef
+sub optparse ($$$) {
+	my ($client, $cmd, $argv) = @_;
+	$client->{cmd} = $cmd;
+	my $opt = $client->{opt} = {};
+	my $info = $CMD{$cmd} // [ '[...]', '(undocumented command)' ];
+	my ($proto, $desc, @spec) = @$info;
+	$glp->getoptionsfromarray($argv, $opt, @spec, qw(help|h)) or
+		return _help($client, "bad arguments or options for $cmd");
+	return _help($client) if $opt->{help};
+	my $i = 0;
+	my $POS_ARG = '[A-Z][A-Z0-9_]+';
+	my ($err, $inf);
+	my @args = split(/ /, $proto);
+	for my $var (@args) {
+		if ($var =~ /\A$POS_ARG\.\.\.\z/o) { # >= 1 args;
+			$inf = defined($argv->[$i]) and last;
+			$var =~ s/\.\.\.\z//;
+			$err = "$var not supplied";
+		} elsif ($var =~ /\A$POS_ARG\z/o) { # required arg at $i
+			$argv->[$i++] // ($err = "$var not supplied");
+		} elsif ($var =~ /\.\.\.\]\z/) { # optional args start
+			$inf = 1;
+			last;
+		} elsif ($var =~ /\A\[$POS_ARG\]\z/) { # one optional arg
+			$i++;
+		} elsif ($var =~ /\A.+?\|/) { # required FOO|--stdin
+			my @or = split(/\|/, $var);
+			my $ok;
+			for my $o (@or) {
+				if ($o =~ /\A--([a-z0-9\-]+)/) {
+					$ok = defined($opt->{$1});
+					last;
+				} elsif (defined($argv->[$i])) {
+					$ok = 1;
+					$i++;
+					last;
+				} # else continue looping
+			}
+			my $last = pop @or;
+			$err = join(', ', @or) . " or $last must be set";
+		} else {
+			warn "BUG: can't parse `$var' in $proto";
+		}
+		last if $err;
+	}
+	# warn "inf=$inf ".scalar(@$argv). ' '.scalar(@args)."\n";
+	if (!$inf && scalar(@$argv) > scalar(@args)) {
+		$err //= 'too many arguments';
 	}
+	$err ? fail($client, "usage: lei $cmd $proto\nE: $err") : 1;
 }
 
 sub dispatch {
 	my ($client, $cmd, @argv) = @_;
-	local $SIG{__WARN__} = sub { emit($client, 2, "@_") };
+	local $SIG{__WARN__} = sub { err($client, "@_") };
 	local $SIG{__DIE__} = 'DEFAULT';
-	if (defined $cmd) {
-		my $func = "lei_$cmd";
-		$func =~ tr/-/_/;
-		if (my $cb = __PACKAGE__->can($func)) {
-			$client->{cmd} = $cmd;
-			$cb->($client, \@argv);
-		} elsif (grep(/\A-/, $cmd, @argv)) {
-			assert_args($client, [ $cmd, @argv ], '');
-		} else {
-			fail($client, "`$cmd' is not an lei command");
-		}
+	return _help($client, 'no command given') unless defined($cmd);
+	my $func = "lei_$cmd";
+	$func =~ tr/-/_/;
+	if (my $cb = __PACKAGE__->can($func)) {
+		optparse($client, $cmd, \@argv) or return;
+		$cb->($client, @argv);
+	} elsif (grep(/\A-/, $cmd, @argv)) { # --help or -h only
+		my $opt = {};
+		$glp->getoptionsfromarray([$cmd, @argv], $opt, qw(help|h)) or
+			return _help($client, 'bad arguments or options');
+		_help($client);
 	} else {
-		_help($client, 2);
+		fail($client, "`$cmd' is not an lei command");
 	}
 }
 
-sub lei_daemon_pid {
-	my ($client, $argv) = @_;
-	assert_args($client, $argv, '') and emit($client, 1, "$$\n");
+sub _lei_cfg ($;$) {
+	my ($client, $creat) = @_;
+	my $env = $client->{env};
+	my $cfg_dir = File::Spec->canonpath(( $env->{XDG_CONFIG_HOME} //
+			($env->{HOME} // '/nonexistent').'/.config').'/lei');
+	my $f = "$cfg_dir/config";
+	my @st = stat($f);
+	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
+	if (my $cfg = $PATH2CFG{$f}) { # reuse existing object in common case
+		return ($client->{cfg} = $cfg) if $cur_st eq $cfg->{-st};
+	}
+	if (!@st) {
+		unless ($creat) {
+			delete $client->{cfg};
+			return;
+		}
+		-d $cfg_dir or mkpath($cfg_dir) or die "mkpath($cfg_dir): $!\n";
+		open my $fh, '>>', $f or die "open($f): $!\n";
+		@st = stat($fh) or die "fstat($f): $!\n";
+		$cur_st = pack('dd', $st[10], $st[7]);
+		qerr($client, "I: $f created");
+	}
+	my $cfg = PublicInbox::Config::git_config_dump($f);
+	$cfg->{-st} = $cur_st;
+	$cfg->{'-f'} = $f;
+	$client->{cfg} = $PATH2CFG{$f} = $cfg;
 }
 
-sub lei_DBG_pwd {
-	my ($client, $argv) = @_;
-	assert_args($client, $argv, '') and
-		emit($client, 1, "$client->{env}->{PWD}\n");
+sub _lei_store ($;$) {
+	my ($client, $creat) = @_;
+	my $cfg = _lei_cfg($client, $creat);
+	$cfg->{-lei_store} //= do {
+		require PublicInbox::LeiStore;
+		PublicInbox::SearchIdx::load_xapian_writable();
+		defined(my $dir = $cfg->{'leistore.dir'}) or return;
+		PublicInbox::LeiStore->new($dir, { creat => $creat });
+	};
+}
+
+sub lei_show {
+	my ($client, @argv) = @_;
 }
 
-sub lei_DBG_cwd {
-	my ($client, $argv) = @_;
-	require Cwd;
-	assert_args($client, $argv, '') and emit($client, 1, Cwd::cwd()."\n");
+sub lei_query {
+	my ($client, @argv) = @_;
 }
 
-sub lei_DBG_false { x_it($_[0], 1 << 8) }
+sub lei_mark {
+	my ($client, @argv) = @_;
+}
 
-sub lei_daemon_stop {
-	my ($client, $argv) = @_;
-	assert_args($client, $argv, '') and $quit->(0);
+sub lei_config {
+	my ($client, @argv) = @_;
+	my $env = $client->{env};
+	if (defined $env->{GIT_CONFIG}) {
+		my %copy = %$env;
+		delete $copy{GIT_CONFIG};
+		$env = \%copy;
+	}
+	if (my @conflict = (grep(/\A-f=?\z/, @argv),
+				grep(/\A--(?:global|system|
+					file|config-file)=?\z/x, @argv))) {
+		return fail($client, "@conflict not supported by lei config");
+	}
+	my $cfg = _lei_cfg($client, 1);
+	my $cmd = [ qw(git config -f), $cfg->{'-f'}, @argv ];
+	my %rdr = map { $_ => $client->{$_} } (0..2);
+	require PublicInbox::Import;
+	PublicInbox::Import::run_die($cmd, $env, \%rdr);
 }
 
+sub lei_init {
+	my ($client, $dir) = @_;
+	my $cfg = _lei_cfg($client, 1);
+	my $cur = $cfg->{'leistore.dir'};
+	my $env = $client->{env};
+	$dir //= ( $env->{XDG_DATA_HOME} //
+		($env->{HOME} // '/nonexistent').'/.local/share'
+		) . '/lei/store';
+	$dir = File::Spec->rel2abs($dir, $env->{PWD}); # PWD is symlink-aware
+	my @cur = stat($cur) if defined($cur);
+	$cur = File::Spec->canonpath($cur) if $cur;
+	my @dir = stat($dir);
+	my $exists = "I: leistore.dir=$cur already initialized" if @dir;
+	if (@cur) {
+		if ($cur eq $dir) {
+			_lei_store($client, 1)->done;
+			return qerr($client, $exists);
+		}
+
+		# some folks like symlinks and bind mounts :P
+		if (@dir && "$cur[0] $cur[1]" eq "$dir[0] $dir[1]") {
+			lei_config($client, 'leistore.dir', $dir);
+			_lei_store($client, 1)->done;
+			return qerr($client, "$exists (as $cur)");
+		}
+		return fail($client, <<"");
+E: leistore.dir=$cur already initialized and it is not $dir
+
+	}
+	lei_config($client, 'leistore.dir', $dir);
+	_lei_store($client, 1)->done;
+	$exists //= "I: leistore.dir=$dir newly initialized";
+	return qerr($client, $exists);
+}
+
+sub lei_daemon_pid {
+	emit($_[0], 1, "$$\n");
+}
+
+sub lei_daemon_stop { $quit->(0) }
+
 sub lei_help { _help($_[0]) }
 
 sub reap_exec { # dwaitpid callback
@@ -269,9 +475,9 @@ sub reap_exec { # dwaitpid callback
 }
 
 sub lei_git { # support passing through random git commands
-	my ($client, $argv) = @_;
-	my %opt = map { $_ => $client->{$_} } (0..2);
-	my $pid = spawn(['git', @$argv], $client->{env}, \%opt);
+	my ($client, @argv) = @_;
+	my %rdr = map { $_ => $client->{$_} } (0..2);
+	my $pid = spawn(['git', @argv], $client->{env}, \%rdr);
 	PublicInbox::DS::dwaitpid($pid, \&reap_exec, $client);
 }
 
@@ -360,6 +566,7 @@ sub lazy_start {
 	$pid = fork // die "fork: $!";
 	return if $pid;
 	$0 = "lei-daemon $path";
+	local %PATH2CFG;
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
 	$l->blocking(0);
@@ -427,6 +634,10 @@ sub lazy_start {
 
 # for users w/o IO::FDPass
 sub oneshot {
+	my ($main_pkg) = @_;
+	my $exit = $main_pkg->can('exit'); # caller may override exit()
+	local $quit = $exit if $exit;
+	local %PATH2CFG;
 	dispatch({
 		0 => *STDIN{IO},
 		1 => *STDOUT{IO},
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 56f668b8..b5b49efb 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -22,7 +22,12 @@ use PublicInbox::LeiSearch;
 sub new {
 	my (undef, $dir, $opt) = @_;
 	my $eidx = PublicInbox::ExtSearchIdx->new($dir, $opt);
-	bless { priv_eidx => $eidx }, __PACKAGE__;
+	my $self = bless { priv_eidx => $eidx }, __PACKAGE__;
+	if ($opt->{creat}) {
+		PublicInbox::SearchIdx::load_xapian_writable();
+		eidx_init($self);
+	}
+	$self;
 }
 
 sub git { $_[0]->{priv_eidx}->git } # read-only
diff --git a/script/lei b/script/lei
index 637c1951..fce088e9 100755
--- a/script/lei
+++ b/script/lei
@@ -60,5 +60,5 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 	}
 } else { # for systems lacking IO::FDPass
 	require PublicInbox::LeiDaemon;
-	PublicInbox::LeiDaemon::oneshot();
+	PublicInbox::LeiDaemon::oneshot(__PACKAGE__);
 }
diff --git a/t/lei.t b/t/lei.t
index 02f21322..9fb0ce00 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -6,23 +6,17 @@ use v5.10.1;
 use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Config;
+use File::Path qw(rmtree);
 require_mods(qw(json DBD::SQLite Search::Xapian));
 my ($home, $for_destroy) = tmpdir();
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
+delete local $ENV{XDG_DATA_HOME};
+delete local $ENV{XDG_CONFIG_HOME};
+local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
+local $ENV{HOME} = $home;
+mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
 
-SKIP: {
-	require_mods('IO::FDPass', 51);
-	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
-	mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
-	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
-
-	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
-	is($err, '', 'no error from daemon-pid');
-	like($out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
-	chomp(my $pid = $out);
-	ok(kill(0, $pid), 'pid is valid');
-	ok(-S $sock, 'sock created');
-
+my $test_lei_common = sub {
 	ok(!run_script([qw(lei)], undef, $opt), 'no args fails');
 	is($? >> 8, 1, '$? is 1');
 	is($out, '', 'nothing in stdout');
@@ -35,10 +29,6 @@ SKIP: {
 		is($err, '', "nothing in stderr (@$arg)");
 	}
 
-	ok(!run_script([qw(lei DBG-false)], undef, $opt), 'false(1) emulation');
-	is($? >> 8, 1, '$? set correctly');
-	is($err, '', 'no error from false(1) emulation');
-
 	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
 		$out = $err = '';
 		ok(!run_script(['lei', @$arg], undef, $opt), "lei @$arg");
@@ -47,6 +37,62 @@ SKIP: {
 		is($out, '', 'nothing in stdout');
 	}
 
+	# init tests
+	$out = $err = '';
+	my $ok_err_info = sub {
+		my ($msg) = @_;
+		is(grep(!/^I:/, split(/^/, $err)), 0, $msg) or
+			diag "$msg: err=$err";
+		$err = '';
+	};
+	my $home_trash = [ "$home/.local", "$home/.config" ];
+	rmtree($home_trash);
+	ok(run_script([qw(lei init)], undef, $opt), 'init w/o args');
+	$ok_err_info->('after init w/o args');
+	ok(run_script([qw(lei init)], undef, $opt), 'idempotent init w/o args');
+	$ok_err_info->('after idempotent init w/o args');
+
+	ok(!run_script([qw(lei init), "$home/x"], undef, $opt),
+		'init conflict');
+	is(grep(/^E:/, split(/^/, $err)), 1, 'got error on conflict');
+	ok(!-e "$home/x", 'nothing created on conflict');
+	rmtree($home_trash);
+
+	$err = '';
+	ok(run_script([qw(lei init), "$home/x"], undef, $opt),
+		'init conflict resolved');
+	$ok_err_info->('init w/ arg');
+	ok(run_script([qw(lei init), "$home/x"], undef, $opt),
+		'init idempotent with path');
+	$ok_err_info->('init idempotent w/ arg');
+	ok(-d "$home/x", 'created dir');
+	rmtree([ "$home/x", @$home_trash ]);
+
+	$err = '';
+	ok(!run_script([qw(lei init), "$home/x", "$home/2" ], undef, $opt),
+		'too many args fails');
+	like($err, qr/too many/, 'noted excessive');
+	ok(!-e "$home/x", 'x not created on excessive');
+	for my $d (@$home_trash) {
+		my $base = (split(m!/!, $d))[-1];
+		ok(!-d $d, "$base not created");
+	}
+	is($out, '', 'nothing in stdout');
+};
+
+SKIP: {
+	require_mods('IO::FDPass', 16);
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
+
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	is($err, '', 'no error from daemon-pid');
+	like($out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
+	chomp(my $pid = $out);
+	ok(kill(0, $pid), 'pid is valid');
+	ok(-S $sock, 'sock created');
+
+	$test_lei_common->();
+
 	$out = '';
 	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
 	chomp(my $pid_again = $out);
@@ -72,8 +118,10 @@ SKIP: {
 		tick();
 	}
 	ok(!kill(0, $new_pid), 'daemon exits after unlink');
+	$test_lei_common = undef; # success over socket, can't test without
 };
 
 require_ok 'PublicInbox::LeiDaemon';
+$test_lei_common->() if $test_lei_common;
 
 done_testing;

^ permalink raw reply related	[relevance 20%]

* [PATCH 23/26] build: add lei.sh + "make symlink-install" target
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (14 preceding siblings ...)
  2020-12-18 12:09 53% ` [PATCH 22/26] lei: start working on bash completion Eric Wong
@ 2020-12-18 12:09 67% ` Eric Wong
  2020-12-18 12:09 42% ` [PATCH 24/26] lei: support for -$DIGIT and -$SIG CLI switches Eric Wong
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

This could've been done ages ago, but I rarely invoked
public-inbox-* commands from an interactive terminal
like I would with lei.
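
The wrapper decides which script to run from the name it was invoked
as, so a single file can stand in for everything in script/*.  A
minimal sketch of that name-based dispatch follows; `resolve_script'
and the example paths are hypothetical, and the real lei.sh finds its
own location via realpath/readlink rather than taking it as an
argument:

```shell
# Map an invoked name (a stand-in for "$0") to a script in the source tree.
# $1: path the wrapper was invoked as, $2: root of the source tree
resolve_script() {
	invoked=$1 tree=$2
	c=$(basename "$invoked") # e.g. "lei" or "lei.sh"
	# strip a trailing ".sh" so lei.sh itself maps to script/lei
	printf '%s/script/%s\n' "$tree" "${c%.sh}"
}

# a symlink named "lei" and the wrapper itself resolve to the same script:
resolve_script /home/user/bin/lei /src/public-inbox
resolve_script /src/public-inbox/lei.sh /src/public-inbox
```

Keying on the basename is what lets "make symlink-install" point
every name in EXE_FILES at the same lei.sh.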
---
 MANIFEST    |  1 +
 Makefile.PL | 11 +++++++++++
 lei.sh      |  7 +++++++
 3 files changed, 19 insertions(+)
 create mode 100755 lei.sh

diff --git a/MANIFEST b/MANIFEST
index 1834e7bb..e2d4ef72 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -102,6 +102,7 @@ examples/unsubscribe-psgi@.service
 examples/unsubscribe.milter
 examples/unsubscribe.psgi
 examples/varnish-4.vcl
+lei.sh
 lib/PublicInbox/Address.pm
 lib/PublicInbox/AddressPP.pm
 lib/PublicInbox/Admin.pm
diff --git a/Makefile.PL b/Makefile.PL
index 8e710df2..a1a9161f 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -223,5 +223,16 @@ Makefile.PL : MANIFEST
 	touch -r MANIFEST \$@
 	\$(PERLRUN) \$@
 
+# Install symlinks to ~/bin (which is hopefully in PATH) which point to
+# this source tree.
+# prefix + bindir matches git.git Makefile:
+prefix = \$(HOME)
+bindir = \$(prefix)/bin
+symlink-install :
+	mkdir -p \$(bindir)
+	lei=\$\$(realpath lei.sh) && cd \$(bindir) && \\
+	for x in \$(EXE_FILES); do \\
+		ln -sf "\$\$lei" \$\$(basename "\$\$x"); \\
+	done
 EOF
 }
diff --git a/lei.sh b/lei.sh
new file mode 100755
index 00000000..f1510a73
--- /dev/null
+++ b/lei.sh
@@ -0,0 +1,7 @@
+#!/bin/sh -e
+# symlink this file to a directory in PATH to run lei (or anything in script/*)
+# without needing perms to install globally.  Used by "make symlink-install"
+p=$(realpath "$0" || readlink "$0") # neither is POSIX, but common
+p=$(dirname "$p") c=$(basename "$0") # both are POSIX
+exec ${PERL-perl} -w -I"$p"/lib "$p"/script/"${c%.sh}" "$@"
+: this script is too short to copyright

^ permalink raw reply related	[relevance 67%]

* [PATCH 13/26] lei: support pass-through for `lei config'
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (7 preceding siblings ...)
  2020-12-18 12:09 60% ` [PATCH 12/26] rename LeiDaemon package to PublicInbox::LEI Eric Wong
@ 2020-12-18 12:09 66% ` Eric Wong
  2020-12-18 12:09 47% ` [PATCH 14/26] lei: help: show actual paths being operated on Eric Wong
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

This will be a handy wrapper for "git config" for manipulating
~/.config/lei/config.  Since we'll have many commands, start
breaking up t/lei.t into more distinct sections for
ease-of-testing.
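
Roughly, the wrapper behaves like the shell sketch below.  This is
only an approximation under stated assumptions: `lei_config_sketch'
is a hypothetical name, the switch matching is a plain `case' rather
than the Getopt::Long parsing the patch uses (so abbreviated switches
are not caught, and a value starting with "-f" is rejected too), and
the config path ignores the relative-path handling done in Perl:

```shell
# Reject switches that would redirect git-config(1) to another file,
# then run "git config" against the lei config file only.
lei_config_sketch() {
	for arg in "$@"; do
		case $arg in
		-f*|--file|--file=*|--global|--system|\
		--config-file|--config-file=*)
			echo 'E: config file switches not supported' >&2
			return 1 ;;
		esac
	done
	# an XDG_CONFIG_HOME that is set wins over the HOME-based default
	cfg="${XDG_CONFIG_HOME-${HOME-/nonexistent}/.config}/lei/config"
	mkdir -p "$(dirname "$cfg")" || return
	# GIT_CONFIG must not leak into the child, hence the subshell
	( unset GIT_CONFIG; exec git config -f "$cfg" "$@" )
}

# usage (needs git in PATH):
#   lei_config_sketch a.b c
#   lei_config_sketch -l
```

Everything not rejected up front is handed to git untouched, which is
the point of the pass-through parser.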
---
 lib/PublicInbox/LEI.pm | 32 ++++++++++++-------------
 t/lei.t                | 54 +++++++++++++++++++++++++++++-------------
 2 files changed, 53 insertions(+), 33 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b5ba1f71..dbd2875d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -24,13 +24,16 @@ use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;
-my $glp = Getopt::Long::Parser->new;
-$glp->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
+my $GLP = Getopt::Long::Parser->new;
+$GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
+my $GLP_PASS = Getopt::Long::Parser->new;
+$GLP_PASS->configure(qw(gnu_getopt no_ignore_case auto_abbrev pass_through));
+
 our %PATH2CFG; # persistent for socket daemon
 
 # TBD: this is a documentation mechanism to show a subcommand
 # (may) pass options through to another command:
-sub pass_through { () }
+sub pass_through { $GLP_PASS }
 
 # TODO: generate shell completion + help using %CMD and %OPTDESC
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
@@ -97,7 +100,8 @@ our %CMD = ( # sorted in order of importance/use:
 	],
 
 'config' => [ '[...]', 'git-config(1) wrapper for ~/.config/lei/config',
-		pass_through('git config') ],
+	qw(config-file|system|global|file|f=s), # conflict detection
+	pass_through('git config') ],
 'init' => [ '[PATHNAME]',
 	'initialize storage, default: ~/.local/share/lei/store',
 	qw(quiet|q) ],
@@ -231,7 +235,7 @@ sub _help ($;$) {
 	my $cmd_desc = shift(@info);
 	my @opt_desc;
 	my $lpad = 2;
-	for my $sw (@info) { # qw(prio=s
+	for my $sw (grep { !ref($_) } @info) { # ("prio=s", "z", $GLP_PASS)
 		my $desc = $OPTDESC{"$cmd\t$sw"} // $OPTDESC{$sw} // next;
 		my $arg_vals = '';
 		($arg_vals, $desc) = @$desc if ref($desc) eq 'ARRAY';
@@ -305,6 +309,7 @@ sub optparse ($$$) {
 	my $opt = $client->{opt} = {};
 	my $info = $CMD{$cmd} // [ '[...]', '(undocumented command)' ];
 	my ($proto, $desc, @spec) = @$info;
+	my $glp = ref($spec[-1]) ? pop(@spec) : $GLP; # or $GLP_PASS
 	push @spec, qw(help|h);
 	my $lone_dash;
 	if ($spec[0] =~ s/\|\z//s) { # "stdin|" or "clear|" allows "-" alias
@@ -374,7 +379,7 @@ sub dispatch {
 		$cb->($client, @argv);
 	} elsif (grep(/\A-/, $cmd, @argv)) { # --help or -h only
 		my $opt = {};
-		$glp->getoptionsfromarray([$cmd, @argv], $opt, qw(help|h)) or
+		$GLP->getoptionsfromarray([$cmd, @argv], $opt, qw(help|h)) or
 			return _help($client, 'bad arguments or options');
 		_help($client);
 	} else {
@@ -402,7 +407,7 @@ sub _lei_cfg ($;$) {
 		open my $fh, '>>', $f or die "open($f): $!\n";
 		@st = stat($fh) or die "fstat($f): $!\n";
 		$cur_st = pack('dd', $st[10], $st[7]);
-		qerr($client, "I: $f created");
+		qerr($client, "I: $f created") if $client->{cmd} ne 'config';
 	}
 	my $cfg = PublicInbox::Config::git_config_dump($f);
 	$cfg->{-st} = $cur_st;
@@ -435,17 +440,10 @@ sub lei_mark {
 
 sub lei_config {
 	my ($client, @argv) = @_;
+	$client->{opt}->{'config-file'} and return fail $client,
+		"config file switches not supported by `lei config'";
 	my $env = $client->{env};
-	if (defined $env->{GIT_CONFIG}) {
-		my %copy = %$env;
-		delete $copy{GIT_CONFIG};
-		$env = \%copy;
-	}
-	if (my @conflict = (grep(/\A-f=?\z/, @argv),
-				grep(/\A--(?:global|system|
-					file|config-file)=?\z/x, @argv))) {
-		return fail($client, "@conflict not supported by lei config");
-	}
+	delete local $env->{GIT_CONFIG};
 	my $cfg = _lei_cfg($client, 1);
 	my $cmd = [ qw(git config -f), $cfg->{'-f'}, @argv ];
 	my %rdr = map { $_ => $client->{$_} } (0..2);
diff --git a/t/lei.t b/t/lei.t
index 7ecadf7d..b0943962 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -22,8 +22,13 @@ local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 local $ENV{HOME} = $home;
 local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
+my $home_trash = [ "$home/.local", "$home/.config" ];
+my $cleanup = sub {
+	rmtree([@$home_trash, @_]);
+	$out = $err = '';
+};
 
-my $test_lei_common = sub {
+my $test_help = sub {
 	ok(!$lei->([], undef, $opt), 'no args fails');
 	is($? >> 8, 1, '$? is 1');
 	is($out, '', 'nothing in stdout');
@@ -43,17 +48,17 @@ my $test_lei_common = sub {
 		isnt($err, '', 'something in stderr');
 		is($out, '', 'nothing in stdout');
 	}
+};
 
-	# init tests
-	$out = $err = '';
-	my $ok_err_info = sub {
-		my ($msg) = @_;
-		is(grep(!/^I:/, split(/^/, $err)), 0, $msg) or
-			diag "$msg: err=$err";
-		$err = '';
-	};
-	my $home_trash = [ "$home/.local", "$home/.config" ];
-	rmtree($home_trash);
+my $ok_err_info = sub {
+	my ($msg) = @_;
+	is(grep(!/^I:/, split(/^/, $err)), 0, $msg) or
+		diag "$msg: err=$err";
+	$err = '';
+};
+
+my $test_init = sub {
+	$cleanup->();
 	ok($lei->(['init'], undef, $opt), 'init w/o args');
 	$ok_err_info->('after init w/o args');
 	ok($lei->(['init'], undef, $opt), 'idempotent init w/o args');
@@ -63,17 +68,15 @@ my $test_lei_common = sub {
 		'init conflict');
 	is(grep(/^E:/, split(/^/, $err)), 1, 'got error on conflict');
 	ok(!-e "$home/x", 'nothing created on conflict');
-	rmtree($home_trash);
+	$cleanup->();
 
-	$err = '';
 	ok($lei->(['init', "$home/x"], undef, $opt), 'init conflict resolved');
 	$ok_err_info->('init w/ arg');
 	ok($lei->(['init', "$home/x"], undef, $opt), 'init idempotent w/ path');
 	$ok_err_info->('init idempotent w/ arg');
 	ok(-d "$home/x", 'created dir');
-	rmtree([ "$home/x", @$home_trash ]);
+	$cleanup->("$home/x");
 
-	$err = '';
 	ok(!$lei->(['init', "$home/x", "$home/2" ], undef, $opt),
 		'too many args fails');
 	like($err, qr/too many/, 'noted excessive');
@@ -82,7 +85,26 @@ my $test_lei_common = sub {
 		my $base = (split(m!/!, $d))[-1];
 		ok(!-d $d, "$base not created");
 	}
-	is($out, '', 'nothing in stdout');
+	is($out, '', 'nothing in stdout on init failure');
+};
+
+my $test_config = sub {
+	$cleanup->();
+	ok($lei->([qw(config a.b c)], undef, $opt), 'config set var');
+	is($out.$err, '', 'no output on var set');
+	ok($lei->([qw(config -l)], undef, $opt), 'config -l');
+	is($err, '', 'no errors on listing');
+	is($out, "a.b=c\n", 'got expected output');
+	ok(!$lei->([qw(config -f), "$home/.config/f", qw(x.y z)], undef, $opt),
+			'config set var with -f fails');
+	like($err, qr/not supported/, 'not supported noted');
+	ok(!-f "$home/.config/f", 'no file created');
+};
+
+my $test_lei_common = sub {
+	$test_help->();
+	$test_config->();
+	$test_init->();
 };
 
 my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};

^ permalink raw reply related	[relevance 66%]

* [PATCH 20/26] lei: restore default __DIE__ handler for event loop
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (11 preceding siblings ...)
  2020-12-18 12:09 50% ` [PATCH 16/26] lei: micro-optimize startup time Eric Wong
@ 2020-12-18 12:09 64% ` Eric Wong
  2020-12-18 12:09 38% ` [PATCH 21/26] lei: drop $SIG{__DIE__}, add oneshot fallbacks Eric Wong
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

The kqueue code paths will trigger exceptions which are caught
by eval{}, so we can't be calling exit() from the __DIE__
handler and expect eval to catch it.

We only need the __DIE__ handler to deal with fork or open
failures at startup (since stderr is pointed to /dev/null).
After that we can rely on OnDestroy writing errors to syslog
when it goes out of scope.
---
 lib/PublicInbox/LEI.pm | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 5399fade..95b48095 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -20,6 +20,7 @@ use PublicInbox::Syscall qw($SFD_NONBLOCK EPOLLIN EPOLLONESHOT);
 use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now);
 use PublicInbox::Spawn qw(spawn);
+use PublicInbox::OnDestroy;
 use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
 use File::Spec;
@@ -386,7 +387,6 @@ sub optparse ($$$) {
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
 	local $SIG{__WARN__} = sub { err($self, "@_") };
-	local $SIG{__DIE__} = 'DEFAULT';
 	return _help($self, 'no command given') unless defined($cmd);
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
@@ -602,12 +602,12 @@ sub lazy_start {
 	my $oldset = PublicInbox::Sigfd::block_signals();
 	my $pid = fork // die "fork: $!";
 	return if $pid;
+	require PublicInbox::Listener;
+	require PublicInbox::EOFpipe;
 	openlog($path, 'pid', 'user');
 	local $SIG{__DIE__} = sub {
 		syslog('crit', "@_");
-		exit $! if $!;
-		exit $? >> 8 if $? >> 8;
-		exit 255;
+		die; # calls the default __DIE__ handler
 	};
 	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
 	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!\n";
@@ -616,10 +616,13 @@ sub lazy_start {
 	setsid();
 	$pid = fork // die "fork: $!";
 	return if $pid;
+	$SIG{__DIE__} = 'DEFAULT';
+	my $on_destroy = PublicInbox::OnDestroy->new(sub {
+		my ($owner_pid) = @_;
+		syslog('crit', "$@") if $@ && $$ == $owner_pid;
+	}, $$);
 	$0 = "lei-daemon $path";
 	local %PATH2CFG;
-	require PublicInbox::Listener;
-	require PublicInbox::EOFpipe;
 	$l->blocking(0);
 	$eof_w->blocking(0);
 	$eof_r->blocking(0);
@@ -680,6 +683,7 @@ sub lazy_start {
 		$n; # true: continue, false: stop
 	});
 	PublicInbox::DS->EventLoop;
+	$@ = undef if $on_destroy; # quiet OnDestroy if we got here
 	exit($exit_code // 0);
 }
 

^ permalink raw reply related	[relevance 64%]

* [PATCH 22/26] lei: start working on bash completion
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (13 preceding siblings ...)
  2020-12-18 12:09 38% ` [PATCH 21/26] lei: drop $SIG{__DIE__}, add oneshot fallbacks Eric Wong
@ 2020-12-18 12:09 53% ` Eric Wong
  2020-12-18 12:09 67% ` [PATCH 23/26] build: add lei.sh + "make symlink-install" target Eric Wong
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

Much work still needs to be done, but that goes for this
entire project :P
---
 MANIFEST                               |  1 +
 contrib/completion/lei-completion.bash | 11 +++++
 lib/PublicInbox/LEI.pm                 | 61 +++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 1 deletion(-)
 create mode 100644 contrib/completion/lei-completion.bash

diff --git a/MANIFEST b/MANIFEST
index 8e870c22..1834e7bb 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -62,6 +62,7 @@ ci/README
 ci/deps.perl
 ci/profiles.sh
 ci/run.sh
+contrib/completion/lei-completion.bash
 contrib/css/216dark.css
 contrib/css/216light.css
 contrib/css/README
diff --git a/contrib/completion/lei-completion.bash b/contrib/completion/lei-completion.bash
new file mode 100644
index 00000000..67cdd3ed
--- /dev/null
+++ b/contrib/completion/lei-completion.bash
@@ -0,0 +1,11 @@
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# preliminary bash completion support for lei (Local Email Interface)
+# Needs a lot of work, see `lei__complete' in lib/PublicInbox/LEI.pm
+_lei() {
+	COMPREPLY=($(compgen -W "$(lei _complete ${COMP_WORDS[@]})" \
+			-- "${COMP_WORDS[COMP_CWORD]}"))
+	return 0
+}
+complete -o filenames -o bashdefault -F _lei lei
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index fd412324..7004e9d7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -132,7 +132,11 @@ our %CMD = ( # sorted in order of importance/use:
 
 'reorder-local-store-and-break-history' => [ '[REFNAME]',
 	'rewrite git history in an attempt to improve compression',
-	'gc!' ]
+	'gc!' ],
+
+# internal commands are prefixed with '_'
+'_complete' => [ '[...]', 'internal shell completion helper',
+		pass_through('everything') ],
 ); # @CMD
 
 # switch descriptions, try to keep consistent across commands
@@ -209,6 +213,10 @@ my %OPTDESC = (
 	'unset matching NAME, may be specified multiple times'],
 ); # %OPTDESC
 
+my %CONFIG_KEYS = (
+	'leistore.dir' => 'top-level storage location',
+);
+
 sub x_it ($$) { # pronounced "exit"
 	my ($self, $code) = @_;
 	if (my $sig = ($code & 127)) {
@@ -223,6 +231,8 @@ sub x_it ($$) { # pronounced "exit"
 	}
 }
 
+sub puts ($;@) { print { shift->{1} } map { "$_\n" } @_ }
+
 sub emit {
 	my ($self, $channel) = @_; # $buf = $_[2]
 	print { $self->{$channel} } $_[2] or die "print FD[$channel]: $!";
@@ -522,6 +532,55 @@ sub lei_daemon_env {
 
 sub lei_help { _help($_[0]) }
 
+# Shell completion helper.  Used by lei-completion.bash and hopefully
+# other shells.  Try to do as much here as possible to avoid redundancy
+# and improve maintainability.
+sub lei__complete {
+	my ($self, @argv) = @_; # argv = qw(lei and any other args...)
+	shift @argv; # ignore "lei", the entire command is sent
+	@argv or return puts $self, grep(!/^_/, keys %CMD);
+	my $cmd = shift @argv;
+	my $info = $CMD{$cmd} // do { # filter matching commands
+		@argv or puts $self, grep(/\A\Q$cmd\E/, keys %CMD);
+		return;
+	};
+	my ($proto, undef, @spec) = @$info;
+	my $cur = pop @argv;
+	my $re = defined($cur) ? qr/\A\Q$cur\E/ : qr/./;
+	if (substr($cur // '-', 0, 1) eq '-') { # --switches
+		# Consider moving to a table if we need more special cases.
+		# gross special case since the only git-config options
+		# we use Getopt::Long for are the ones we reject, so these
+		# are the ones we don't reject:
+		if ($cmd eq 'config') {
+			puts $self, grep(/$re/, keys %CONFIG_KEYS);
+			@spec = qw(add z|null get get-all unset unset-all
+				replace-all get-urlmatch
+				remove-section rename-section
+				name-only list|l edit|e
+				get-color-name get-colorbool);
+			# fall-through
+		}
+		# TODO: arg support
+		puts $self, grep(/$re/, map { # generate short/long names
+			my $eq = '';
+			if (s/=.+\z//) { # required arg, e.g. output|o=i
+				$eq = '=';
+			} elsif (s/:.+\z//) { # optional arg, e.g. mid:s
+			} else { # negation: solve! => no-solve|solve
+				s/\A(.+)!\z/no-$1|$1/;
+			}
+			map {
+				length > 1 ? "--$_$eq" : "-$_"
+			} split(/\|/, $_, -1) # help|h
+		} grep { !ref } @spec); # filter out $GLP_PASS ref
+	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
+		puts $self, grep(/$re/, keys %CONFIG_KEYS);
+	}
+	# TODO: URLs, pathnames, OIDs, MIDs, etc...  See optparse() for
+	# proto parsing.
+}
+
 sub reap_exec { # dwaitpid callback
 	my ($self, $pid) = @_;
 	x_it($self, $?);

^ permalink raw reply related	[relevance 53%]

* [PATCH 16/26] lei: micro-optimize startup time
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (10 preceding siblings ...)
  2020-12-18 12:09 37% ` [PATCH 15/26] lei: rename $client => $self and bless Eric Wong
@ 2020-12-18 12:09 50% ` Eric Wong
  2020-12-18 12:09 64% ` [PATCH 20/26] lei: restore default __DIE__ handler for event loop Eric Wong
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

We'll use lower-level Socket and avoid IO::Socket::UNIX,
use Cwd::fastcwd(*), avoid IO::Handle->autoflush by
using the select operator, and reuse buffer for reading
the socket while avoiding unnecessary $/ localization
in a tiny script.

All these things add up to ~5-10 ms savings on my loaded
system.

(*) caveats about fastcwd won't apply since lei won't work
    in removed directories.
---
 lib/PublicInbox/LEI.pm        | 13 ++++++-------
 lib/PublicInbox/TestCommon.pm |  1 +
 script/lei                    | 33 +++++++++++++++++----------------
 3 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f5824c59..5399fade 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -10,9 +10,9 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::DS);
 use Getopt::Long ();
+use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 use Errno qw(EAGAIN ECONNREFUSED ENOENT);
 use POSIX qw(setsid);
-use IO::Socket::UNIX;
 use IO::Handle ();
 use Sys::Syslog qw(syslog openlog);
 use PublicInbox::Config;
@@ -585,18 +585,17 @@ sub noop {}
 # lei(1) calls this when it can't connect
 sub lazy_start {
 	my ($path, $err) = @_;
+	require IO::FDPass; # require this early so caller sees it
 	if ($err == ECONNREFUSED) {
 		unlink($path) or die "unlink($path): $!";
 	} elsif ($err != ENOENT) {
+		$! = $err; # allow interpolation to stringify in die
 		die "connect($path): $!";
 	}
-	require IO::FDPass;
 	umask(077) // die("umask(077): $!");
-	my $l = IO::Socket::UNIX->new(Local => $path,
-					Listen => 1024,
-					Type => SOCK_STREAM) or
-		$err = $!;
-	$l or return die "bind($path): $err";
+	socket(my $l, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
+	bind($l, pack_sockaddr_un($path)) or die "bind($path): $!";
+	listen($l, 1024) or die "listen: $!";
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index c236c589..338e760c 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -261,6 +261,7 @@ sub run_script ($;$$) {
 		my $orig_io = _prepare_redirects($fhref);
 		_run_sub($sub, $key, \@argv);
 		_undo_redirects($orig_io);
+		select STDOUT;
 	}
 
 	# slurp the redirects back into user-supplied strings
diff --git a/script/lei b/script/lei
index e59e4316..2b041fb4 100755
--- a/script/lei
+++ b/script/lei
@@ -3,8 +3,7 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict;
 use v5.10.1;
-use Cwd qw(cwd);
-use IO::Socket::UNIX;
+use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 
 if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 	my $path = do {
@@ -13,14 +12,15 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 			require File::Spec;
 			$runtime_dir = File::Spec->tmpdir."/lei-$<";
 		}
-		unless (-d $runtime_dir && -w _) {
+		unless (-d $runtime_dir) {
 			require File::Path;
 			File::Path::mkpath($runtime_dir, 0, 0700);
 		}
 		"$runtime_dir/sock";
 	};
-	my $sock = IO::Socket::UNIX->new(Peer => $path, Type => SOCK_STREAM);
-	unless ($sock) { # start the daemon if not started
+	my $addr = pack_sockaddr_un($path);
+	socket(my $sock, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
+	unless (connect($sock, $addr)) { # start the daemon if not started
 		my $err = $! + 0;
 		my $env = { PERL5LIB => join(':', @INC) };
 		my $cmd = [ $^X, qw[-MPublicInbox::LEI
@@ -31,13 +31,14 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 		warn "lei-daemon exited with \$?=$?\n" if $?;
 
 		# try connecting again anyways, unlink+bind may be racy
-		$sock = IO::Socket::UNIX->new(Peer => $path,
-						Type => SOCK_STREAM) // die
+		connect($sock, $addr) or die
 			"connect($path): $! (after attempted daemon start)";
 	}
-	my $pwd = $ENV{PWD};
-	my $cwd = cwd();
-	if ($pwd) { # prefer ENV{PWD} if it's a symlink to real cwd
+	require Cwd;
+	my $cwd = Cwd::fastcwd() // die "fastcwd: $!";
+	my $pwd = $ENV{PWD} // '';
+	if ($pwd eq $cwd) { # likely, all good
+	} elsif ($pwd) { # prefer ENV{PWD} if it's a symlink to real cwd
 		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!\n";
 		my @st_pwd = stat($pwd);
 		# make sure st_dev/st_ino match for {PWD} to be valid
@@ -47,16 +48,16 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 		$pwd = $cwd;
 	}
 	local $ENV{PWD} = $pwd;
-	$sock->autoflush(1);
-	IO::FDPass::send(fileno($sock), $_) for (0..2);
 	my $buf = "$$\0\0>" . join("]\0[", @ARGV) . "\0\0>";
 	while (my ($k, $v) = each %ENV) { $buf .= "$k=$v\0" }
 	$buf .= "\0\0";
+	select $sock;
+	$| = 1; # unbuffer selected $sock
+	IO::FDPass::send(fileno($sock), $_) for (0..2);
 	print $sock $buf or die "print(sock, buf): $!";
-	local $/ = "\n";
-	while (my $line = <$sock>) {
-		$line =~ /\Aexit=([0-9]+)\n\z/ and exit($1 + 0);
-		die $line;
+	while ($buf = <$sock>) {
+		$buf =~ /\Aexit=([0-9]+)\n\z/ and exit($1 + 0);
+		die $buf;
 	}
 } else { # for systems lacking IO::FDPass
 	require PublicInbox::LEI;


* [PATCH 14/26] lei: help: show actual paths being operated on
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (8 preceding siblings ...)
  2020-12-18 12:09 66% ` [PATCH 13/26] lei: support pass-through for `lei config' Eric Wong
@ 2020-12-18 12:09 47% ` Eric Wong
  2020-12-18 12:09 37% ` [PATCH 15/26] lei: rename $client => $self and bless Eric Wong
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

Help output now shows the paths lei will actually operate on.
This lets us respect XDG_* environment variables which
override the $HOME-based defaults.
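
As a rough illustration of that lookup order, here is a minimal
sketch.  The name `store_path_for' is hypothetical; it only mirrors
the precedence used by _store_path in the diff below (XDG_DATA_HOME
first, then HOME, then a deliberately invalid placeholder), and it
assumes a Unix-like File::Spec.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: resolve the default store path from an
# environment hash ref rather than from %ENV directly.
sub store_path_for {
    my ($env) = @_;
    # XDG_DATA_HOME wins; otherwise fall back to $HOME/.local/share
    my $base = $env->{XDG_DATA_HOME} //
        (($env->{HOME} // '/nonexistent') . '/.local/share');
    # relative values are resolved against the caller's PWD
    File::Spec->rel2abs("$base/lei/store", $env->{PWD});
}

print store_path_for({ HOME => '/home/user' }), "\n";
print store_path_for({ HOME => '/home/user', XDG_DATA_HOME => '/XDH' }), "\n";
```

Passing the environment as a hash ref (instead of reading %ENV)
matters for a daemon which serves clients with differing
environments.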

We'll also make the $lei wrapper in t/lei.t easier to use by
auto-clearing $out/$err and reducing the [] needed for common
cases.
---
 lib/PublicInbox/LEI.pm | 42 +++++++++++++++++++++++++++---------------
 t/lei.t                | 27 ++++++++++++++++++++++-----
 2 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index dbd2875d..667ef765 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -35,6 +35,20 @@ our %PATH2CFG; # persistent for socket daemon
 # (may) pass options through to another command:
 sub pass_through { $GLP_PASS }
 
+sub _store_path ($) {
+	my ($env) = @_;
+	File::Spec->rel2abs(($env->{XDG_DATA_HOME} //
+		($env->{HOME} // '/nonexistent').'/.local/share')
+		.'/lei/store', $env->{PWD});
+}
+
+sub _config_path ($) {
+	my ($env) = @_;
+	File::Spec->rel2abs(($env->{XDG_CONFIG_HOME} //
+		($env->{HOME} // '/nonexistent').'/.config')
+		.'/lei/config', $env->{PWD});
+}
+
 # TODO: generate shell completion + help using %CMD and %OPTDESC
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
@@ -99,12 +113,13 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(stdin| limit|n=i offset=i recursive|r exclude=s include=s !flags),
 	],
 
-'config' => [ '[...]', 'git-config(1) wrapper for ~/.config/lei/config',
-	qw(config-file|system|global|file|f=s), # conflict detection
+'config' => [ '[...]', sub {
+		'git-config(1) wrapper for '._config_path($_[0]);
+	}, qw(config-file|system|global|file|f=s), # for conflict detection
 	pass_through('git config') ],
-'init' => [ '[PATHNAME]',
-	'initialize storage, default: ~/.local/share/lei/store',
-	qw(quiet|q) ],
+'init' => [ '[PATHNAME]', sub {
+		'initialize storage, default: '._store_path($_[0]);
+	}, qw(quiet|q) ],
 'daemon-stop' => [ '', 'stop the lei-daemon' ],
 'daemon-pid' => [ '', 'show the PID of the lei-daemon' ],
 'daemon-env' => [ '[NAME=VALUE...]', 'set, unset, or show daemon environment',
@@ -233,9 +248,10 @@ sub _help ($;$) {
 	my @info = @{$CMD{$cmd} // [ '...', '...' ]};
 	my @top = ($cmd, shift(@info) // ());
 	my $cmd_desc = shift(@info);
+	$cmd_desc = $cmd_desc->($client->{env}) if ref($cmd_desc) eq 'CODE';
 	my @opt_desc;
 	my $lpad = 2;
-	for my $sw (grep { !ref($_) } @info) { # ("prio=s", "z", $GLP_PASS)
+	for my $sw (grep { !ref } @info) { # ("prio=s", "z", $GLP_PASS)
 		my $desc = $OPTDESC{"$cmd\t$sw"} // $OPTDESC{$sw} // next;
 		my $arg_vals = '';
 		($arg_vals, $desc) = @$desc if ref($desc) eq 'ARRAY';
@@ -307,8 +323,8 @@ sub optparse ($$$) {
 	my ($client, $cmd, $argv) = @_;
 	$client->{cmd} = $cmd;
 	my $opt = $client->{opt} = {};
-	my $info = $CMD{$cmd} // [ '[...]', '(undocumented command)' ];
-	my ($proto, $desc, @spec) = @$info;
+	my $info = $CMD{$cmd} // [ '[...]' ];
+	my ($proto, undef, @spec) = @$info;
 	my $glp = ref($spec[-1]) ? pop(@spec) : $GLP; # or $GLP_PASS
 	push @spec, qw(help|h);
 	my $lone_dash;
@@ -389,10 +405,7 @@ sub dispatch {
 
 sub _lei_cfg ($;$) {
 	my ($client, $creat) = @_;
-	my $env = $client->{env};
-	my $cfg_dir = File::Spec->canonpath(( $env->{XDG_CONFIG_HOME} //
-			($env->{HOME} // '/nonexistent').'/.config').'/lei');
-	my $f = "$cfg_dir/config";
+	my $f = _config_path($client->{env});
 	my @st = stat($f);
 	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
 	if (my $cfg = $PATH2CFG{$f}) { # reuse existing object in common case
@@ -403,6 +416,7 @@ sub _lei_cfg ($;$) {
 			delete $client->{cfg};
 			return;
 		}
+		my (undef, $cfg_dir, undef) = File::Spec->splitpath($f);
 		-d $cfg_dir or mkpath($cfg_dir) or die "mkpath($cfg_dir): $!\n";
 		open my $fh, '>>', $f or die "open($f): $!\n";
 		@st = stat($fh) or die "fstat($f): $!\n";
@@ -456,9 +470,7 @@ sub lei_init {
 	my $cfg = _lei_cfg($client, 1);
 	my $cur = $cfg->{'leistore.dir'};
 	my $env = $client->{env};
-	$dir //= ( $env->{XDG_DATA_HOME} //
-		($env->{HOME} // '/nonexistent').'/.local/share'
-		) . '/lei/store';
+	$dir //= _store_path($env);
 	$dir = File::Spec->rel2abs($dir, $env->{PWD}); # PWD is symlink-aware
 	my @cur = stat($cur) if defined($cur);
 	$cur = File::Spec->canonpath($cur) if $cur;
diff --git a/t/lei.t b/t/lei.t
index b0943962..bdf6cc1c 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -9,13 +9,18 @@ use PublicInbox::Config;
 use File::Path qw(rmtree);
 require_mods(qw(json DBD::SQLite Search::Xapian));
 my $LEI = 'lei';
+my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my $lei = sub {
 	my ($cmd, $env, $opt) = @_;
+	$out = $err = '';
+	if (!ref($cmd)) {
+		($env, $opt) = grep { (!defined) || ref } @_;
+		$cmd = [ grep { defined && !ref } @_ ];
+	}
 	run_script([$LEI, @$cmd], $env, $opt);
 };
 
 my ($home, $for_destroy) = tmpdir();
-my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 delete local $ENV{XDG_DATA_HOME};
 delete local $ENV{XDG_CONFIG_HOME};
 local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
@@ -23,10 +28,7 @@ local $ENV{HOME} = $home;
 local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
 my $home_trash = [ "$home/.local", "$home/.config" ];
-my $cleanup = sub {
-	rmtree([@$home_trash, @_]);
-	$out = $err = '';
-};
+my $cleanup = sub { rmtree([@$home_trash, @_]) };
 
 my $test_help = sub {
 	ok(!$lei->([], undef, $opt), 'no args fails');
@@ -48,6 +50,21 @@ my $test_help = sub {
 		isnt($err, '', 'something in stderr');
 		is($out, '', 'nothing in stdout');
 	}
+	ok($lei->(qw(init -h), undef, $opt), 'init -h');
+	like($out, qr! \Q$home\E/\.local/share/lei/store\b!,
+		'actual path shown in init -h');
+	ok($lei->(qw(init -h), { XDG_DATA_HOME => '/XDH' }, $opt),
+		'init with XDG_DATA_HOME');
+	like($out, qr! /XDH/lei/store\b!, 'XDG_DATA_HOME in init -h');
+	is($err, '', 'no errors from init -h');
+
+	ok($lei->(qw(config -h), undef, $opt), 'config -h');
+	like($out, qr! \Q$home\E/\.config/lei/config\b!,
+		'actual path shown in config -h');
+	ok($lei->(qw(config -h), { XDG_CONFIG_HOME => '/XDC' }, $opt),
+		'config with XDG_CONFIG_HOME');
+	like($out, qr! /XDC/lei/config\b!, 'XDG_CONFIG_HOME in config -h');
+	is($err, '', 'no errors from config -h');
 };
 
 my $ok_err_info = sub {


* [PATCH 24/26] lei: support for -$DIGIT and -$SIG CLI switches
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (15 preceding siblings ...)
  2020-12-18 12:09 67% ` [PATCH 23/26] build: add lei.sh + "make symlink-install" target Eric Wong
@ 2020-12-18 12:09 42% ` Eric Wong
  2020-12-18 12:09 64% ` [PATCH 25/26] lei: revise output routines Eric Wong
  2020-12-18 12:09 42% ` [PATCH 26/26] lei: extinbox: start implementing in config file Eric Wong
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

I'm a bit spoiled by single-dash digit options from common
tools ("git log -$DIGIT", "kill -9", "tail -1", ...), so
we'll support them for limiting query results.
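
A minimal sketch of the idea, assuming Getopt::Long's documented
`pass_through' setting and its "<>" catch-all.  The helper name
`parse_limit' and the separate `args' list are hypothetical; the
patch itself uses a Getopt::Long::Parser object ($GLP_PASS) rather
than the global Configure call shown here.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# unknown switches such as "-10" reach the "<>" handler instead of
# being rejected (this changes global Getopt::Long state)
Getopt::Long::Configure(qw(pass_through));

# Hypothetical helper: treat a bare "-NUM" as --limit=NUM
sub parse_limit {
    my ($argv) = @_;
    my (%opt, @args);
    my $catch_all = sub {
        my ($arg) = @_;
        if ($arg =~ /\A-([0-9]+)\z/) {
            $opt{limit} = $1;
        } elsif ($arg =~ /\A-/) {
            die "bad argument: $arg\n"; # makes parsing fail
        } else {
            push @args, "$arg"; # ordinary positional argument
        }
    };
    GetOptionsFromArray($argv, \%opt, 'limit|n=i', '<>' => $catch_all)
        or return;
    { limit => $opt{limit}, args => \@args };
}

my $res = parse_limit([qw(-10 foo bar)]);
print "limit=$res->{limit} args=@{$res->{args}}\n";
```

The long and short forms (--limit=NUM, -n NUM) keep working, since
only arguments Getopt::Long does not recognize reach the catch-all.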

But first, make it easier to send arbitrary signals to
the daemon via "daemon-kill".  "daemon-stop" is now redundant
and removed, since "daemon-kill" defaults to SIGTERM to match
kill(1) behavior.
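
For illustration only, a sketch of the defaulting behavior using
Perl's built-in kill().  `signal_pid' is a hypothetical name and not
part of lei; the demo signals the current process with innocuous
signals, much as the new test does with -0 and -CHLD.

```perl
use strict;
use warnings;

# Hypothetical helper: send $sig to $pid, defaulting to TERM as
# kill(1) does.  $sig may be a number or a name without "SIG".
sub signal_pid {
    my ($pid, $sig) = @_;
    $sig //= 'TERM';
    kill($sig, $pid) or die "kill($sig, $pid): $!\n";
}

signal_pid($$, 0);      # signal 0 only checks that the process exists
signal_pid($$, 'CHLD'); # ignored by default, so this is harmless
print "still running as PID $$\n";
```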
---
 lib/PublicInbox/LEI.pm | 55 +++++++++++++++++++++++++++++-------------
 t/lei.t                | 18 +++++++++++---
 2 files changed, 52 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7004e9d7..c28c9b59 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -36,6 +36,21 @@ our %PATH2CFG; # persistent for socket daemon
 # (may) pass options through to another command:
 sub pass_through { $GLP_PASS }
 
+my $OPT;
+sub opt_dash {
+	my ($spec, $re_str) = @_; # 'limit|n=i', '([0-9]+)'
+	my ($key) = ($spec =~ m/\A([a-z]+)/g);
+	my $cb = sub { # Getopt::Long "<>" catch-all handler
+		my ($arg) = @_;
+		if ($arg =~ /\A-($re_str)\z/) {
+			$OPT->{$key} = $1;
+		} else {
+			die "bad argument for --$key: $arg\n";
+		}
+	};
+	($spec, '<>' => $cb, $GLP_PASS)
+}
+
 sub _store_path ($) {
 	my ($env) = @_;
 	File::Spec->rel2abs(($env->{XDG_DATA_HOME} //
@@ -55,8 +70,8 @@ sub _config_path ($) {
 our %CMD = ( # sorted in order of importance/use:
 'query' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
-	limit|n=i sort|s=s@ reverse|r offset=i remote local! extinbox!
-	since|after=s until|before=s) ],
+	sort|s=s@ reverse|r offset=i remote local! extinbox!
+	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
 	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
@@ -111,7 +126,7 @@ our %CMD = ( # sorted in order of importance/use:
 
 'import' => [ '{URL_OR_PATHNAME|--stdin}',
 	'one-shot import/update from URL or filesystem',
-	qw(stdin| limit|n=i offset=i recursive|r exclude=s include=s !flags),
+	qw(stdin| offset=i recursive|r exclude=s include=s !flags),
 	],
 
 'config' => [ '[...]', sub {
@@ -121,7 +136,8 @@ our %CMD = ( # sorted in order of importance/use:
 'init' => [ '[PATHNAME]', sub {
 		'initialize storage, default: '._store_path($_[0]);
 	}, qw(quiet|q) ],
-'daemon-stop' => [ '', 'stop the lei-daemon' ],
+'daemon-kill' => [ '[-SIGNAL]', 'signal the lei-daemon',
+	opt_dash('signal|s=s', '[0-9]+|(?:[A-Z][A-Z0-9]+)') ],
 'daemon-pid' => [ '', 'show the PID of the lei-daemon' ],
 'daemon-env' => [ '[NAME=VALUE...]', 'set, unset, or show daemon environment',
 	qw(clear| unset|u=s@ z|0) ],
@@ -175,8 +191,7 @@ my %OPTDESC = (
 'ls-query	format|f=s' => $ls_format,
 'ls-extinbox	format|f=s' => $ls_format,
 
-'limit|n=i' => ['NUM',
-	'limit on number of matches (default: 10000)' ],
+'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
 'sort|s=s@' => [ 'VAL|internaldate,date,relevance,docid',
@@ -211,6 +226,8 @@ my %OPTDESC = (
 'clear|' => 'clear the daemon environment',
 'unset|u=s@' => ['NAME',
 	'unset matching NAME, may be specified multiple times'],
+
+'signal|s=s' => [ 'SIG', 'signal to send lei-daemon (default: TERM)' ],
 ); # %OPTDESC
 
 my %CONFIG_KEYS = (
@@ -333,23 +350,23 @@ EOF
 sub optparse ($$$) {
 	my ($self, $cmd, $argv) = @_;
 	$self->{cmd} = $cmd;
-	my $opt = $self->{opt} = {};
+	$OPT = $self->{opt} = {};
 	my $info = $CMD{$cmd} // [ '[...]' ];
 	my ($proto, undef, @spec) = @$info;
-	my $glp = ref($spec[-1]) ? pop(@spec) : $GLP; # or $GLP_PASS
+	my $glp = ref($spec[-1]) eq ref($GLP) ? pop(@spec) : $GLP;
 	push @spec, qw(help|h);
 	my $lone_dash;
 	if ($spec[0] =~ s/\|\z//s) { # "stdin|" or "clear|" allows "-" alias
 		$lone_dash = $spec[0];
-		$opt->{$spec[0]} = \(my $var);
+		$OPT->{$spec[0]} = \(my $var);
 		push @spec, '' => \$var;
 	}
-	$glp->getoptionsfromarray($argv, $opt, @spec) or
+	$glp->getoptionsfromarray($argv, $OPT, @spec) or
 		return _help($self, "bad arguments or options for $cmd");
-	return _help($self) if $opt->{help};
+	return _help($self) if $OPT->{help};
 
 	# "-" aliases "stdin" or "clear"
-	$opt->{$lone_dash} = ${$opt->{$lone_dash}} if defined $lone_dash;
+	$OPT->{$lone_dash} = ${$OPT->{$lone_dash}} if defined $lone_dash;
 
 	my $i = 0;
 	my $POS_ARG = '[A-Z][A-Z0-9_]+';
@@ -365,14 +382,14 @@ sub optparse ($$$) {
 		} elsif ($var =~ /\.\.\.\]\z/) { # optional args start
 			$inf = 1;
 			last;
-		} elsif ($var =~ /\A\[$POS_ARG\]\z/) { # one optional arg
+		} elsif ($var =~ /\A\[-?$POS_ARG\]\z/) { # one optional arg
 			$i++;
 		} elsif ($var =~ /\A.+?\|/) { # required FOO|--stdin
 			my @or = split(/\|/, $var);
 			my $ok;
 			for my $o (@or) {
 				if ($o =~ /\A--([a-z0-9\-]+)/) {
-					$ok = defined($opt->{$1});
+					$ok = defined($OPT->{$1});
 					last;
 				} elsif (defined($argv->[$i])) {
 					$ok = 1;
@@ -510,7 +527,11 @@ E: leistore.dir=$cur already initialized and it is not $dir
 
 sub lei_daemon_pid { emit($_[0], 1, "$$\n") }
 
-sub lei_daemon_stop { $quit->(0) }
+sub lei_daemon_kill {
+	my ($self) = @_;
+	my $sig = $self->{opt}->{signal} // 'TERM';
+	kill($sig, $$) or fail($self, "kill($sig, $$): $!");
+}
 
 sub lei_daemon_env {
 	my ($self, @argv) = @_;
@@ -538,7 +559,7 @@ sub lei_help { _help($_[0]) }
 sub lei__complete {
 	my ($self, @argv) = @_; # argv = qw(lei and any other args...)
 	shift @argv; # ignore "lei", the entire command is sent
-	@argv or return puts $self, grep(!/^_/, keys %CMD);
+	@argv or return puts $self, grep(!/^_/, keys %CMD), qw(--help -h);
 	my $cmd = shift @argv;
 	my $info = $CMD{$cmd} // do { # filter matching commands
 		@argv or puts $self, grep(/\A\Q$cmd\E/, keys %CMD);
@@ -573,7 +594,7 @@ sub lei__complete {
 			map {
 				length > 1 ? "--$_$eq" : "-$_"
 			} split(/\|/, $_, -1) # help|h
-		} grep { !ref } @spec); # filter out $GLP_PASS ref
+		} grep { $OPTDESC{"$cmd\t$_"} || $OPTDESC{$_} } @spec);
 	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
 		puts $self, grep(/$re/, keys %CONFIG_KEYS);
 	}
diff --git a/t/lei.t b/t/lei.t
index cce90fff..30f9d2b6 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -127,7 +127,7 @@ my $test_lei_common = sub {
 my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};
 SKIP: {
 	last SKIP if $test_lei_oneshot;
-	require_mods(qw(IO::FDPass Cwd), 41);
+	require_mods(qw(IO::FDPass Cwd), 46);
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
 
 	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
@@ -174,9 +174,9 @@ SKIP: {
 	ok(run_script([qw(lei daemon-env)], undef, $opt), 'env is empty');
 	is($out, '', 'env cleared');
 
-	ok(run_script([qw(lei daemon-stop)], undef, $opt), 'daemon-stop');
-	is($out, '', 'no output from daemon-stop');
-	is($err, '', 'no error from daemon-stop');
+	ok(run_script([qw(lei daemon-kill)], undef, $opt), 'daemon-kill');
+	is($out, '', 'no output from daemon-kill');
+	is($err, '', 'no error from daemon-kill');
 	for (0..100) {
 		kill(0, $pid) or last;
 		tick();
@@ -189,6 +189,16 @@ SKIP: {
 	ok(kill(0, $new_pid), 'new pid is running');
 	ok(-S $sock, 'sock exists again');
 
+	$out = $err = '';
+	for my $sig (qw(-0 -CHLD)) {
+		ok(run_script([qw(lei daemon-kill), $sig ], undef, $opt),
+					"handles $sig");
+	}
+	is($out.$err, '', 'no output on innocuous signals');
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	chomp $out;
+	is($out, $new_pid, 'PID unchanged after -0/-CHLD');
+
 	if ('socket inaccessible') {
 		chmod 0000, $sock or BAIL_OUT "chmod 0000: $!";
 		$out = $err = '';


* [PATCH 21/26] lei: drop $SIG{__DIE__}, add oneshot fallbacks
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (12 preceding siblings ...)
  2020-12-18 12:09 64% ` [PATCH 20/26] lei: restore default __DIE__ handler for event loop Eric Wong
@ 2020-12-18 12:09 38% ` Eric Wong
  2020-12-18 12:09 53% ` [PATCH 22/26] lei: start working on bash completion Eric Wong
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

We'll force stdout+stderr to be a pipe the spawning client
controls, so there's no need to lose error reporting by
prematurely redirecting stdout+stderr to /dev/null.
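
The pipe handshake can be sketched roughly as follows.  This is a
simplified stand-in using plain fork() instead of
PublicInbox::Spawn, and `spawn_and_collect' is a hypothetical name:
the parent reads startup messages until the child redirects its
stdout+stderr away from the pipe, which the parent sees as EOF.

```perl
use strict;
use warnings;
use IO::Handle ();
use POSIX ();

# Hypothetical helper: run $startup in a child whose stdout+stderr
# are a pipe we control; return what it wrote before detaching.
sub spawn_and_collect {
    my ($startup) = @_;
    pipe(my ($r, $w)) or die "pipe: $!";
    STDOUT->flush; # avoid duplicating buffered output in the child
    STDERR->flush;
    my $pid = fork // die "fork: $!";
    if ($pid == 0) { # child
        close $r;
        open(STDOUT, '>&', $w) or POSIX::_exit(1);
        open(STDERR, '>&', $w) or POSIX::_exit(1);
        close $w;
        eval { $startup->(); 1 } or warn $@; # errors reach the pipe
        # startup is over: drop the pipe so the parent sees EOF
        open(STDOUT, '>', '/dev/null') or POSIX::_exit(1);
        open(STDERR, '>', '/dev/null') or POSIX::_exit(1);
        select(undef, undef, undef, 0.2); # stand-in for an event loop
        POSIX::_exit(0);
    }
    close $w; # parent must drop its write end, or EOF never arrives
    my @lines = <$r>; # returns at EOF: child detached or exited
    waitpid($pid, 0);
    @lines;
}

print spawn_and_collect(sub { warn "W: example startup warning\n" });
```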

We can now rely exclusively on OnDestroy to write to syslog() on
uncaught die failures.

Also support falling back to oneshot mode on socket and cwd
failures, since some commands may still be useful if the current
working directory goes missing :P
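
The fallback relies on a list assignment from eval being false when
the block dies.  A minimal sketch of that control flow, where
`run_with_fallback' and the returned strings are hypothetical
placeholders for the daemon and one-shot code paths:

```perl
use strict;
use warnings;

# Hypothetical helper: $connect returns ($sock, $pwd) or dies.
sub run_with_fallback {
    my ($connect) = @_;
    # a failed eval yields an empty list, so the assignment
    # evaluates to 0 (false) and we take the else branch
    if (my ($sock, $pwd) = eval { $connect->() }) {
        return "daemon:$pwd";
    } else {
        # stay quiet about a missing optional module
        warn $@ if $@ && index($@, 'IO::FDPass') < 0;
        return 'oneshot';
    }
}

print run_with_fallback(sub { ('SOCK', '/tmp') }), "\n";
print run_with_fallback(sub { die "IO::FDPass missing\n" }), "\n";
```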
---
 lib/PublicInbox/LEI.pm | 67 ++++++++++++++++++++----------------------
 script/lei             | 39 +++++++++++++++---------
 t/lei.t                | 31 +++++++++++++++++--
 3 files changed, 86 insertions(+), 51 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 95b48095..fd412324 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -12,7 +12,7 @@ use parent qw(PublicInbox::DS);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 use Errno qw(EAGAIN ECONNREFUSED ENOENT);
-use POSIX qw(setsid);
+use POSIX ();
 use IO::Handle ();
 use Sys::Syslog qw(syslog openlog);
 use PublicInbox::Config;
@@ -584,60 +584,44 @@ sub noop {}
 
 # lei(1) calls this when it can't connect
 sub lazy_start {
-	my ($path, $err) = @_;
-	require IO::FDPass; # require this early so caller sees it
-	if ($err == ECONNREFUSED) {
+	my ($path, $errno) = @_;
+	if ($errno == ECONNREFUSED) {
 		unlink($path) or die "unlink($path): $!";
-	} elsif ($err != ENOENT) {
-		$! = $err; # allow interpolation to stringify in die
+	} elsif ($errno != ENOENT) {
+		$! = $errno; # allow interpolation to stringify in die
 		die "connect($path): $!";
 	}
 	umask(077) // die("umask(077): $!");
 	socket(my $l, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
 	bind($l, pack_sockaddr_un($path)) or die "bind($path): $!";
-	listen($l, 1024) or die "listen $!";
+	listen($l, 1024) or die "listen: $!";
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
 	my $oldset = PublicInbox::Sigfd::block_signals();
-	my $pid = fork // die "fork: $!";
-	return if $pid;
+	require IO::FDPass;
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
-	openlog($path, 'pid', 'user');
-	local $SIG{__DIE__} = sub {
-		syslog('crit', "@_");
-		die; # calls the default __DIE__ handler
-	};
-	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
-	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!\n";
-	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!\n";
-	open STDERR, '>&STDIN' or die "redirect stderr failed: $!\n";
-	setsid();
-	$pid = fork // die "fork: $!";
+	(-p STDOUT && -p STDERR) or die "E: stdout+stderr must be pipes\n";
+	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!";
+	POSIX::setsid() > 0 or die "setsid: $!";
+	my $pid = fork // die "fork: $!";
 	return if $pid;
-	$SIG{__DIE__} = 'DEFAULT';
-	my $on_destroy = PublicInbox::OnDestroy->new(sub {
-		my ($owner_pid) = @_;
-		syslog('crit', "$@") if $@ && $$ == $owner_pid;
-	}, $$);
 	$0 = "lei-daemon $path";
 	local %PATH2CFG;
-	$l->blocking(0);
-	$eof_w->blocking(0);
-	$eof_r->blocking(0);
-	my $listener = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
+	$_->blocking(0) for ($l, $eof_r, $eof_w);
+	$l = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
 	my $exit_code;
 	local $quit = sub {
 		$exit_code //= shift;
-		my $tmp = $listener or exit($exit_code);
+		my $listener = $l or exit($exit_code);
 		unlink($path) if defined($path);
-		syswrite($eof_w, '.');
-		$l = $listener = $path = undef;
-		$tmp->close if $tmp; # DS::close
+		# closing eof_w triggers \&noop wakeup
+		$eof_w = $l = $path = undef;
+		$listener->close; # DS::close
 		PublicInbox::DS->SetLoopTimeout(1000);
 	};
-	PublicInbox::EOFpipe->new($eof_r, sub {}, undef);
+	PublicInbox::EOFpipe->new($eof_r, \&noop, undef);
 	my $sig = {
 		CHLD => \&PublicInbox::DS::enqueue_reap,
 		QUIT => $quit,
@@ -682,8 +666,21 @@ sub lazy_start {
 		}
 		$n; # true: continue, false: stop
 	});
+
+	# STDIN was redirected to /dev/null above, closing STDOUT and
+	# STDERR will cause the calling `lei' client process to finish
+	# reading <$daemon> pipe.
+	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
+	openlog($path, 'pid', 'user');
+	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
+	my $owner_pid = $$;
+	my $on_destroy = PublicInbox::OnDestroy->new(sub {
+		syslog('crit', "$@") if $@ && $$ == $owner_pid;
+	});
+	open STDERR, '>&STDIN' or die "redirect stderr failed: $!";
+	# $daemon pipe to `lei' closed, main loop begins:
 	PublicInbox::DS->EventLoop;
-	$@ = undef if $on_destroy; # quiet OnDestroy if we got here
+	@$on_destroy = (); # cancel on_destroy if we get here
 	exit($exit_code // 0);
 }
 
diff --git a/script/lei b/script/lei
index 2b041fb4..ceaf1e00 100755
--- a/script/lei
+++ b/script/lei
@@ -4,8 +4,8 @@
 use strict;
 use v5.10.1;
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
-
-if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
+if (my ($sock, $pwd) = eval {
+	require IO::FDPass; # will try to use a daemon to reduce load time
 	my $path = do {
 		my $runtime_dir = ($ENV{XDG_RUNTIME_DIR} // '') . '/lei';
 		if ($runtime_dir eq '/lei') {
@@ -21,32 +21,41 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 	my $addr = pack_sockaddr_un($path);
 	socket(my $sock, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
 	unless (connect($sock, $addr)) { # start the daemon if not started
-		my $err = $! + 0;
-		my $env = { PERL5LIB => join(':', @INC) };
 		my $cmd = [ $^X, qw[-MPublicInbox::LEI
 			-E PublicInbox::LEI::lazy_start(@ARGV)],
-			$path, $err ];
+			$path, $! + 0 ];
+		my $env = { PERL5LIB => join(':', @INC) };
+		pipe(my ($daemon, $w)) or die "pipe: $!";
+		my $opt = { 1 => $w, 2 => $w };
 		require PublicInbox::Spawn;
-		waitpid(PublicInbox::Spawn::spawn($cmd, $env), 0);
-		warn "lei-daemon exited with \$?=$?\n" if $?;
+		my $pid = PublicInbox::Spawn::spawn($cmd, $env, $opt);
+		$opt = $w = undef;
+		while (<$daemon>) { warn $_ } # EOF when STDERR is redirected
+		waitpid($pid, 0) or warn <<"";
+lei-daemon could not start, PID:$pid exited with \$?=$?
 
 		# try connecting again anyways, unlink+bind may be racy
-		connect($sock, $addr) or die
-			"connect($path): $! (after attempted daemon start)";
+		unless (connect($sock, $addr)) {
+			die <<"";
+connect($path): $! (after attempted daemon start)
+Falling back to (slow) one-shot mode
+
+		}
 	}
 	require Cwd;
-	my $cwd = Cwd::fastcwd() // die "fastcwd: $!";
+	my $cwd = Cwd::fastcwd() // die "fastcwd(PWD=".($ENV{PWD}//'')."): $!";
 	my $pwd = $ENV{PWD} // '';
-	if ($pwd eq $cwd) { # likely, all good
-	} elsif ($pwd) { # prefer ENV{PWD} if it's a symlink to real cwd
-		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!\n";
-		my @st_pwd = stat($pwd);
+	if ($pwd ne $cwd) { # prefer ENV{PWD} if it's a symlink to real cwd
+		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!";
+		my @st_pwd = stat($pwd); # PWD invalid, use cwd
 		# make sure st_dev/st_ino match for {PWD} to be valid
 		$pwd = $cwd if (!@st_pwd || $st_pwd[1] != $st_cwd[1] ||
 					$st_pwd[0] != $st_cwd[0]);
 	} else {
 		$pwd = $cwd;
 	}
+	($sock, $pwd);
+}) { # IO::FDPass, $sock, $pwd are all available:
 	local $ENV{PWD} = $pwd;
 	my $buf = "$$\0\0>" . join("]\0[", @ARGV) . "\0\0>";
 	while (my ($k, $v) = each %ENV) { $buf .= "$k=$v\0" }
@@ -60,6 +69,8 @@ if (eval { require IO::FDPass; 1 }) { # use daemon to reduce load time
 		die $buf;
 	}
 } else { # for systems lacking IO::FDPass
+	# don't warn about IO::FDPass since it's not commonly installed
+	warn $@ if $@ && index($@, 'IO::FDPass') < 0;
 	require PublicInbox::LEI;
 	PublicInbox::LEI::oneshot(__PACKAGE__);
 }
diff --git a/t/lei.t b/t/lei.t
index bdf6cc1c..cce90fff 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -127,7 +127,7 @@ my $test_lei_common = sub {
 my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};
 SKIP: {
 	last SKIP if $test_lei_oneshot;
-	require_mods('IO::FDPass', 16);
+	require_mods(qw(IO::FDPass Cwd), 41);
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
 
 	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
@@ -188,7 +188,34 @@ SKIP: {
 	chomp(my $new_pid = $out);
 	ok(kill(0, $new_pid), 'new pid is running');
 	ok(-S $sock, 'sock exists again');
-	unlink $sock or BAIL_OUT "unlink $!";
+
+	if ('socket inaccessible') {
+		chmod 0000, $sock or BAIL_OUT "chmod 0000: $!";
+		$out = $err = '';
+		ok(run_script([qw(lei help)], undef, $opt),
+			'connect fail, one-shot fallback works');
+		like($err, qr/\bconnect\(/, 'connect error noted');
+		like($out, qr/^usage: /, 'help output works');
+		chmod 0700, $sock or BAIL_OUT "chmod 0700: $!";
+	}
+	if ('oneshot on cwd gone') {
+		my $cwd = Cwd::fastcwd() or BAIL_OUT "fastcwd: $!";
+		my $d = "$home/to-be-removed";
+		mkdir $d or BAIL_OUT "mkdir($d) $!";
+		chdir $d or BAIL_OUT "chdir($d) $!";
+		if (rmdir($d)) {
+			$out = $err = '';
+			ok(run_script([qw(lei help)], undef, $opt),
+				'cwd fail, one-shot fallback works');
+		} else {
+			$err = "rmdir=$!";
+		}
+		chdir $cwd or BAIL_OUT "chdir($cwd) $!";
+		like($err, qr/cwd\(/, 'cwd error noted');
+		like($out, qr/^usage: /, 'help output still works');
+	}
+
+	unlink $sock or BAIL_OUT "unlink($sock) $!";
 	for (0..100) {
 		kill('CHLD', $new_pid) or last;
 		tick();


* [PATCH 15/26] lei: rename $client => $self and bless
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (9 preceding siblings ...)
  2020-12-18 12:09 47% ` [PATCH 14/26] lei: help: show actual paths being operated on Eric Wong
@ 2020-12-18 12:09 37% ` Eric Wong
  2020-12-18 12:09 50% ` [PATCH 16/26] lei: micro-optimize startup time Eric Wong
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

lei will get bigger, so follow existing OO conventions to make
it easy to call methods in PublicInbox::LEI from other packages.
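
To illustrate what blessing buys us, a small sketch with made-up
package names (Demo::LEI and Demo::Other are hypothetical, not part
of public-inbox): once the per-client hash is blessed, other
packages can call $lei->err(...) without spelling out the package.

```perl
use strict;
use warnings;

package Demo::LEI; # hypothetical stand-in for PublicInbox::LEI

sub new {
    my ($class, %args) = @_;
    bless { %args }, $class;
}

sub err {
    my ($self, $buf) = @_;
    $buf .= "\n" unless $buf =~ /\n\z/s;
    push @{$self->{errors}}, $buf; # the real code writes to FD 2
}

package Demo::Other; # some other package holding the object

sub check {
    my ($lei) = @_;
    $lei->err('something went wrong'); # no package name needed
}

package main;

my $demo = Demo::LEI->new(errors => []);
Demo::Other::check($demo);
print $demo->{errors}->[0];
```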
---
 lib/PublicInbox/LEI.pm | 146 ++++++++++++++++++++---------------------
 1 file changed, 73 insertions(+), 73 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 667ef765..f5824c59 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -209,12 +209,12 @@ my %OPTDESC = (
 ); # %OPTDESC
 
 sub x_it ($$) { # pronounced "exit"
-	my ($client, $code) = @_;
+	my ($self, $code) = @_;
 	if (my $sig = ($code & 127)) {
-		kill($sig, $client->{pid} // $$);
+		kill($sig, $self->{pid} // $$);
 	} else {
 		$code >>= 8;
-		if (my $sock = $client->{sock}) {
+		if (my $sock = $self->{sock}) {
 			say $sock "exit=$code";
 		} else { # for oneshot
 			$quit->($code);
@@ -223,32 +223,32 @@ sub x_it ($$) { # pronounced "exit"
 }
 
 sub emit {
-	my ($client, $channel) = @_; # $buf = $_[2]
-	print { $client->{$channel} } $_[2] or die "print FD[$channel]: $!";
+	my ($self, $channel) = @_; # $buf = $_[2]
+	print { $self->{$channel} } $_[2] or die "print FD[$channel]: $!";
 }
 
 sub err {
-	my ($client, $buf) = @_;
+	my ($self, $buf) = @_;
 	$buf .= "\n" unless $buf =~ /\n\z/s;
-	emit($client, 2, $buf);
+	emit($self, 2, $buf);
 }
 
 sub qerr { $_[0]->{opt}->{quiet} or err(@_) }
 
 sub fail ($$;$) {
-	my ($client, $buf, $exit_code) = @_;
-	err($client, $buf);
-	x_it($client, ($exit_code // 1) << 8);
+	my ($self, $buf, $exit_code) = @_;
+	err($self, $buf);
+	x_it($self, ($exit_code // 1) << 8);
 	undef;
 }
 
 sub _help ($;$) {
-	my ($client, $errmsg) = @_;
-	my $cmd = $client->{cmd} // 'COMMAND';
+	my ($self, $errmsg) = @_;
+	my $cmd = $self->{cmd} // 'COMMAND';
 	my @info = @{$CMD{$cmd} // [ '...', '...' ]};
 	my @top = ($cmd, shift(@info) // ());
 	my $cmd_desc = shift(@info);
-	$cmd_desc = $cmd_desc->($client->{env}) if ref($cmd_desc) eq 'CODE';
+	$cmd_desc = $cmd_desc->($self->{env}) if ref($cmd_desc) eq 'CODE';
 	my @opt_desc;
 	my $lpad = 2;
 	for my $sw (grep { !ref } @info) { # ("prio=s", "z", $GLP_PASS)
@@ -314,15 +314,15 @@ EOF
 		$msg .= "\n";
 	}
 	my $channel = $errmsg ? 2 : 1;
-	emit($client, $channel, $msg);
-	x_it($client, $errmsg ? 1 << 8 : 0); # stderr => failure
+	emit($self, $channel, $msg);
+	x_it($self, $errmsg ? 1 << 8 : 0); # stderr => failure
 	undef;
 }
 
 sub optparse ($$$) {
-	my ($client, $cmd, $argv) = @_;
-	$client->{cmd} = $cmd;
-	my $opt = $client->{opt} = {};
+	my ($self, $cmd, $argv) = @_;
+	$self->{cmd} = $cmd;
+	my $opt = $self->{opt} = {};
 	my $info = $CMD{$cmd} // [ '[...]' ];
 	my ($proto, undef, @spec) = @$info;
 	my $glp = ref($spec[-1]) ? pop(@spec) : $GLP; # or $GLP_PASS
@@ -334,8 +334,8 @@ sub optparse ($$$) {
 		push @spec, '' => \$var;
 	}
 	$glp->getoptionsfromarray($argv, $opt, @spec) or
-		return _help($client, "bad arguments or options for $cmd");
-	return _help($client) if $opt->{help};
+		return _help($self, "bad arguments or options for $cmd");
+	return _help($self) if $opt->{help};
 
 	# "-" aliases "stdin" or "clear"
 	$opt->{$lone_dash} = ${$opt->{$lone_dash}} if defined $lone_dash;
@@ -380,40 +380,40 @@ sub optparse ($$$) {
 	if (!$inf && scalar(@$argv) > scalar(@args)) {
 		$err //= 'too many arguments';
 	}
-	$err ? fail($client, "usage: lei $cmd $proto\nE: $err") : 1;
+	$err ? fail($self, "usage: lei $cmd $proto\nE: $err") : 1;
 }
 
 sub dispatch {
-	my ($client, $cmd, @argv) = @_;
-	local $SIG{__WARN__} = sub { err($client, "@_") };
+	my ($self, $cmd, @argv) = @_;
+	local $SIG{__WARN__} = sub { err($self, "@_") };
 	local $SIG{__DIE__} = 'DEFAULT';
-	return _help($client, 'no command given') unless defined($cmd);
+	return _help($self, 'no command given') unless defined($cmd);
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
 	if (my $cb = __PACKAGE__->can($func)) {
-		optparse($client, $cmd, \@argv) or return;
-		$cb->($client, @argv);
+		optparse($self, $cmd, \@argv) or return;
+		$cb->($self, @argv);
 	} elsif (grep(/\A-/, $cmd, @argv)) { # --help or -h only
 		my $opt = {};
 		$GLP->getoptionsfromarray([$cmd, @argv], $opt, qw(help|h)) or
-			return _help($client, 'bad arguments or options');
-		_help($client);
+			return _help($self, 'bad arguments or options');
+		_help($self);
 	} else {
-		fail($client, "`$cmd' is not an lei command");
+		fail($self, "`$cmd' is not an lei command");
 	}
 }
 
 sub _lei_cfg ($;$) {
-	my ($client, $creat) = @_;
-	my $f = _config_path($client->{env});
+	my ($self, $creat) = @_;
+	my $f = _config_path($self->{env});
 	my @st = stat($f);
 	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
 	if (my $cfg = $PATH2CFG{$f}) { # reuse existing object in common case
-		return ($client->{cfg} = $cfg) if $cur_st eq $cfg->{-st};
+		return ($self->{cfg} = $cfg) if $cur_st eq $cfg->{-st};
 	}
 	if (!@st) {
 		unless ($creat) {
-			delete $client->{cfg};
+			delete $self->{cfg};
 			return;
 		}
 		my (undef, $cfg_dir, undef) = File::Spec->splitpath($f);
@@ -421,17 +421,17 @@ sub _lei_cfg ($;$) {
 		open my $fh, '>>', $f or die "open($f): $!\n";
 		@st = stat($fh) or die "fstat($f): $!\n";
 		$cur_st = pack('dd', $st[10], $st[7]);
-		qerr($client, "I: $f created") if $client->{cmd} ne 'config';
+		qerr($self, "I: $f created") if $self->{cmd} ne 'config';
 	}
 	my $cfg = PublicInbox::Config::git_config_dump($f);
 	$cfg->{-st} = $cur_st;
 	$cfg->{'-f'} = $f;
-	$client->{cfg} = $PATH2CFG{$f} = $cfg;
+	$self->{cfg} = $PATH2CFG{$f} = $cfg;
 }
 
 sub _lei_store ($;$) {
-	my ($client, $creat) = @_;
-	my $cfg = _lei_cfg($client, $creat);
+	my ($self, $creat) = @_;
+	my $cfg = _lei_cfg($self, $creat);
 	$cfg->{-lei_store} //= do {
 		require PublicInbox::LeiStore;
 		PublicInbox::SearchIdx::load_xapian_writable();
@@ -441,35 +441,35 @@ sub _lei_store ($;$) {
 }
 
 sub lei_show {
-	my ($client, @argv) = @_;
+	my ($self, @argv) = @_;
 }
 
 sub lei_query {
-	my ($client, @argv) = @_;
+	my ($self, @argv) = @_;
 }
 
 sub lei_mark {
-	my ($client, @argv) = @_;
+	my ($self, @argv) = @_;
 }
 
 sub lei_config {
-	my ($client, @argv) = @_;
-	$client->{opt}->{'config-file'} and return fail $client,
+	my ($self, @argv) = @_;
+	$self->{opt}->{'config-file'} and return fail $self,
 		"config file switches not supported by `lei config'";
-	my $env = $client->{env};
+	my $env = $self->{env};
 	delete local $env->{GIT_CONFIG};
-	my $cfg = _lei_cfg($client, 1);
+	my $cfg = _lei_cfg($self, 1);
 	my $cmd = [ qw(git config -f), $cfg->{'-f'}, @argv ];
-	my %rdr = map { $_ => $client->{$_} } (0..2);
+	my %rdr = map { $_ => $self->{$_} } (0..2);
 	require PublicInbox::Import;
 	PublicInbox::Import::run_die($cmd, $env, \%rdr);
 }
 
 sub lei_init {
-	my ($client, $dir) = @_;
-	my $cfg = _lei_cfg($client, 1);
+	my ($self, $dir) = @_;
+	my $cfg = _lei_cfg($self, 1);
 	my $cur = $cfg->{'leistore.dir'};
-	my $env = $client->{env};
+	my $env = $self->{env};
 	$dir //= _store_path($env);
 	$dir = File::Spec->rel2abs($dir, $env->{PWD}); # PWD is symlink-aware
 	my @cur = stat($cur) if defined($cur);
@@ -478,24 +478,24 @@ sub lei_init {
 	my $exists = "I: leistore.dir=$cur already initialized" if @dir;
 	if (@cur) {
 		if ($cur eq $dir) {
-			_lei_store($client, 1)->done;
-			return qerr($client, $exists);
+			_lei_store($self, 1)->done;
+			return qerr($self, $exists);
 		}
 
 		# some folks like symlinks and bind mounts :P
 		if (@dir && "$cur[0] $cur[1]" eq "$dir[0] $dir[1]") {
-			lei_config($client, 'leistore.dir', $dir);
-			_lei_store($client, 1)->done;
-			return qerr($client, "$exists (as $cur)");
+			lei_config($self, 'leistore.dir', $dir);
+			_lei_store($self, 1)->done;
+			return qerr($self, "$exists (as $cur)");
 		}
-		return fail($client, <<"");
+		return fail($self, <<"");
 E: leistore.dir=$cur already initialized and it is not $dir
 
 	}
-	lei_config($client, 'leistore.dir', $dir);
-	_lei_store($client, 1)->done;
+	lei_config($self, 'leistore.dir', $dir);
+	_lei_store($self, 1)->done;
 	$exists //= "I: leistore.dir=$dir newly initialized";
-	return qerr($client, $exists);
+	return qerr($self, $exists);
 }
 
 sub lei_daemon_pid { emit($_[0], 1, "$$\n") }
@@ -503,8 +503,8 @@ sub lei_daemon_pid { emit($_[0], 1, "$$\n") }
 sub lei_daemon_stop { $quit->(0) }
 
 sub lei_daemon_env {
-	my ($client, @argv) = @_;
-	my $opt = $client->{opt};
+	my ($self, @argv) = @_;
+	my $opt = $self->{opt};
 	if (defined $opt->{clear}) {
 		%ENV = ();
 	} elsif (my $u = $opt->{unset}) {
@@ -516,29 +516,29 @@ sub lei_daemon_env {
 		my $eor = $opt->{z} ? "\0" : "\n";
 		my $buf = '';
 		while (my ($k, $v) = each %ENV) { $buf .= "$k=$v$eor" }
-		emit($client, 1, $buf)
+		emit($self, 1, $buf)
 	}
 }
 
 sub lei_help { _help($_[0]) }
 
 sub reap_exec { # dwaitpid callback
-	my ($client, $pid) = @_;
-	x_it($client, $?);
+	my ($self, $pid) = @_;
+	x_it($self, $?);
 }
 
 sub lei_git { # support passing through random git commands
-	my ($client, @argv) = @_;
-	my %rdr = map { $_ => $client->{$_} } (0..2);
-	my $pid = spawn(['git', @argv], $client->{env}, \%rdr);
-	PublicInbox::DS::dwaitpid($pid, \&reap_exec, $client);
+	my ($self, @argv) = @_;
+	my %rdr = map { $_ => $self->{$_} } (0..2);
+	my $pid = spawn(['git', @argv], $self->{env}, \%rdr);
+	PublicInbox::DS::dwaitpid($pid, \&reap_exec, $self);
 }
 
 sub accept_dispatch { # Listener {post_accept} callback
 	my ($sock) = @_; # ignore other
 	$sock->blocking(1);
 	$sock->autoflush(1);
-	my $client = { sock => $sock };
+	my $self = bless { sock => $sock }, __PACKAGE__;
 	vec(my $rin = '', fileno($sock), 1) = 1;
 	# `say $sock' triggers "die" in lei(1)
 	for my $i (0..2) {
@@ -547,7 +547,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 			if ($fd >= 0) {
 				my $rdr = ($fd == 0 ? '<&=' : '>&=');
 				if (open(my $fh, $rdr, $fd)) {
-					$client->{$i} = $fh;
+					$self->{$i} = $fh;
 				} else {
 					say $sock "open($rdr$fd) (FD=$i): $!";
 					return;
@@ -571,9 +571,9 @@ sub accept_dispatch { # Listener {post_accept} callback
 	};
 	my %env = map { split(/=/, $_, 2) } split(/\0/, $env);
 	if (chdir($env{PWD})) {
-		$client->{env} = \%env;
-		$client->{pid} = $client_pid;
-		eval { dispatch($client, split(/\]\0\[/, $argv)) };
+		$self->{env} = \%env;
+		$self->{pid} = $client_pid;
+		eval { dispatch($self, split(/\]\0\[/, $argv)) };
 		say $sock $@ if $@;
 	} else {
 		say $sock "chdir($env{PWD}): $!"; # implicit close
@@ -691,12 +691,12 @@ sub oneshot {
 	local $quit = $exit if $exit;
 	local %PATH2CFG;
 	umask(077) // die("umask(077): $!");
-	dispatch({
+	dispatch((bless {
 		0 => *STDIN{IO},
 		1 => *STDOUT{IO},
 		2 => *STDERR{IO},
 		env => \%ENV
-	}, @ARGV);
+	}, __PACKAGE__), @ARGV);
 }
 
 1;
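
The `_lei_cfg` hunk above reuses a cached config object while the file's ctime and size stay the same. A minimal sketch of that caching idea follows; `load_cfg` and the line-array "parser" are hypothetical stand-ins, since the real code runs the file through PublicInbox::Config::git_config_dump.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %PATH2CFG; # path => cached hashref, named after the hash in the diff

# Hypothetical loader: caches on (ctime, size) like _lei_cfg does, but
# only slurps lines instead of parsing git-config syntax.
sub load_cfg {
	my ($f) = @_;
	my @st = stat($f) or return;
	my $cur_st = pack('dd', $st[10], $st[7]); # 10:ctime, 7:size
	my $cfg = $PATH2CFG{$f};
	return $cfg if $cfg && $cfg->{'-st'} eq $cur_st; # unchanged: reuse
	open(my $fh, '<', $f) or die "open($f): $!\n";
	$cfg = { '-st' => $cur_st, '-f' => $f, lines => [ <$fh> ] };
	$PATH2CFG{$f} = $cfg;
}

my ($tfh, $path) = tempfile(UNLINK => 1);
print $tfh "[lei]\n";
close($tfh) or die "close: $!\n";
my $first = load_cfg($path);
my $again = load_cfg($path); # same object, file untouched

open(my $afh, '>>', $path) or die "open($path): $!\n";
print $afh "x = 1\n";
close($afh) or die "close: $!\n";
my $fresh = load_cfg($path); # size changed, so the file is re-read
```

Comparing a packed (ctime, size) pair is cheap, which matters for a long-lived daemon that checks the config on every command.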

* [PATCH 25/26] lei: revise output routines
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (16 preceding siblings ...)
  2020-12-18 12:09 42% ` [PATCH 24/26] lei: support for -$DIGIT and -$SIG CLI switches Eric Wong
@ 2020-12-18 12:09 64% ` Eric Wong
  2020-12-18 12:09 42% ` [PATCH 26/26] lei: extinbox: start implementing in config file Eric Wong
  18 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

Drop emit(), since we hard code the channel (client FD) 99% of
the time, and use prototypes to avoid parentheses because my
hands are tired.
---
 lib/PublicInbox/LEI.pm | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index c28c9b59..97c5d91b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -250,18 +250,13 @@ sub x_it ($$) { # pronounced "exit"
 
 sub puts ($;@) { print { shift->{1} } map { "$_\n" } @_ }
 
-sub emit {
-	my ($self, $channel) = @_; # $buf = $_[2]
-	print { $self->{$channel} } $_[2] or die "print FD[$channel]: $!";
-}
+sub out ($;@) { print { shift->{1} } @_ }
 
-sub err {
-	my ($self, $buf) = @_;
-	$buf .= "\n" unless $buf =~ /\n\z/s;
-	emit($self, 2, $buf);
+sub err ($;@) {
+	print { shift->{2} } @_, (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
 }
 
-sub qerr { $_[0]->{opt}->{quiet} or err(@_) }
+sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
 
 sub fail ($$;$) {
 	my ($self, $buf, $exit_code) = @_;
@@ -341,8 +336,7 @@ EOF
 		$msg .= $rhs;
 		$msg .= "\n";
 	}
-	my $channel = $errmsg ? 2 : 1;
-	emit($self, $channel, $msg);
+	print { $self->{$errmsg ? 2 : 1} } $msg;
 	x_it($self, $errmsg ? 1 << 8 : 0); # stderr => failure
 	undef;
 }
@@ -404,7 +398,6 @@ sub optparse ($$$) {
 		}
 		last if $err;
 	}
-	# warn "inf=$inf ".scalar(@$argv). ' '.scalar(@args)."\n";
 	if (!$inf && scalar(@$argv) > scalar(@args)) {
 		$err //= 'too many arguments';
 	}
@@ -413,7 +406,7 @@ sub optparse ($$$) {
 
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
-	local $SIG{__WARN__} = sub { err($self, "@_") };
+	local $SIG{__WARN__} = sub { err($self, @_) };
 	return _help($self, 'no command given') unless defined($cmd);
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
@@ -525,7 +518,7 @@ E: leistore.dir=$cur already initialized and it is not $dir
 	return qerr($self, $exists);
 }
 
-sub lei_daemon_pid { emit($_[0], 1, "$$\n") }
+sub lei_daemon_pid { puts shift, $$ }
 
 sub lei_daemon_kill {
 	my ($self) = @_;
@@ -547,7 +540,7 @@ sub lei_daemon_env {
 		my $eor = $opt->{z} ? "\0" : "\n";
 		my $buf = '';
 		while (my ($k, $v) = each %ENV) { $buf .= "$k=$v$eor" }
-		emit($self, 1, $buf)
+		out $self, $buf;
 	}
 }
 

* [PATCH 26/26] lei: extinbox: start implementing in config file
  2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
                   ` (17 preceding siblings ...)
  2020-12-18 12:09 64% ` [PATCH 25/26] lei: revise output routines Eric Wong
@ 2020-12-18 12:09 42% ` Eric Wong
  2020-12-18 20:23 71%   ` Eric Wong
  18 siblings, 1 reply; 200+ results
From: Eric Wong @ 2020-12-18 12:09 UTC (permalink / raw)
  To: meta

They need to be indexed by MiscIdx, but MiscIdx
still needs more work to support faster config
loading when dealing with ~100K data sources.
---
 lib/PublicInbox/LEI.pm         | 19 ++++-----
 lib/PublicInbox/LeiExtinbox.pm | 52 ++++++++++++++++++++++++
 t/lei.t                        | 72 ++++++++++++++++++++++++++++++++--
 3 files changed, 130 insertions(+), 13 deletions(-)
 create mode 100644 lib/PublicInbox/LeiExtinbox.pm

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 97c5d91b..b254e2c5 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -8,7 +8,7 @@
 package PublicInbox::LEI;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::DS);
+use parent qw(PublicInbox::DS PublicInbox::LeiExtinbox);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 use Errno qw(EAGAIN ECONNREFUSED ENOENT);
@@ -79,12 +79,12 @@ our %CMD = ( # sorted in order of importance/use:
 
 'add-extinbox' => [ 'URL_OR_PATHNAME',
 	'add/set priority of a publicinbox|extindex for extra matches',
-	qw(prio=i) ],
+	qw(boost=i quiet|q) ],
 'ls-extinbox' => [ '[FILTER...]', 'list publicinbox|extindex locations',
-	qw(format|f=s z local remote) ],
+	qw(format|f=s z|0 local remote quiet|q) ],
 'forget-extinbox' => [ '{URL_OR_PATHNAME|--prune}',
 	'exclude further results from a publicinbox|extindex',
-	qw(prune) ],
+	qw(prune quiet|q) ],
 
 'ls-query' => [ '[FILTER...]', 'list saved search queries',
 		qw(name-only format|f=s z) ],
@@ -107,7 +107,7 @@ our %CMD = ( # sorted in order of importance/use:
 
 # code repos are used for `show' to solve blobs from patch mails
 'add-coderepo' => [ 'PATHNAME', 'add or set priority of a git code repo',
-	qw(prio=i) ],
+	qw(boost=i) ],
 'ls-coderepo' => [ '[FILTER_TERMS...]',
 		'list known code repos', qw(format|f=s z) ],
 'forget-coderepo' => [ 'PATHNAME',
@@ -197,7 +197,7 @@ my %OPTDESC = (
 'sort|s=s@' => [ 'VAL|internaldate,date,relevance,docid',
 		"order of results `--output'-dependent"],
 
-'prio=i' => 'priority of query source',
+'boost=i' => 'increase/decrease priority of results (default: 0)',
 
 'local' => 'limit operations to the local filesystem',
 'local!' => 'exclude results from the local filesystem',
@@ -217,8 +217,7 @@ my %OPTDESC = (
 'by-mid|mid:s' => [ 'MID', 'match only by Message-ID, ignoring contents' ],
 'jobs:i' => 'set parallelism level',
 
-# xargs, env, use "-0", git(1) uses "-z".  Should we support z|0 everywhere?
-'z' => 'use NUL \\0 instead of newline (CR) to delimit lines',
+# xargs, env, use "-0", git(1) uses "-z".  We support z|0 everywhere
 'z|0' => 'use NUL \\0 instead of newline (CR) to delimit lines',
 
 # note: no "--ignore-environment" / "-i" support like env(1) since that
@@ -455,7 +454,9 @@ sub _lei_store ($;$) {
 	$cfg->{-lei_store} //= do {
 		require PublicInbox::LeiStore;
 		PublicInbox::SearchIdx::load_xapian_writable();
-		defined(my $dir = $cfg->{'leistore.dir'}) or return;
+		my $dir = $cfg->{'leistore.dir'};
+		$dir //= _store_path($self->{env}) if $creat;
+		return unless $dir;
 		PublicInbox::LeiStore->new($dir, { creat => $creat });
 	};
 }
diff --git a/lib/PublicInbox/LeiExtinbox.pm b/lib/PublicInbox/LeiExtinbox.pm
new file mode 100644
index 00000000..2f52b115
--- /dev/null
+++ b/lib/PublicInbox/LeiExtinbox.pm
@@ -0,0 +1,52 @@
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# *-extinbox commands of lei
+package PublicInbox::LeiExtinbox;
+use strict;
+use v5.10.1;
+use parent qw(Exporter);
+our @EXPORT = qw(lei_ls_extinbox lei_add_extinbox lei_forget_extinbox);
+
+sub lei_ls_extinbox {
+	my ($self, @argv) = @_;
+	my $stor = $self->_lei_store(0);
+	my $cfg = $self->_lei_cfg(0);
+	my $out = $self->{1};
+	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
+	my (%boost, @loc);
+	for my $sec (grep(/\Aextinbox\./, @{$cfg->{-section_order}})) {
+		my $loc = substr($sec, length('extinbox.'));
+		$boost{$loc} = $cfg->{"$sec.boost"};
+		push @loc, $loc;
+	}
+	my $out = $self->{1};
+	use sort 'stable';
+	# highest boost first, but stable for alphabetic tie break
+	for (sort { $boost{$b} <=> $boost{$a} } sort keys %boost) {
+		# TODO: use miscidx and show docid so forget/set is easier
+		print $out $_, $OFS, 'boost=', $boost{$_}, $ORS;
+	}
+}
+
+sub lei_add_extinbox {
+	my ($self, $url_or_dir) = @_;
+	my $cfg = $self->_lei_cfg(1);
+	if ($url_or_dir !~ m!\Ahttps?://!) {
+		$url_or_dir = File::Spec->canonpath($url_or_dir);
+	}
+	my $new_boost = $self->{opt}->{boost} // 0;
+	my $key = "extinbox.$url_or_dir.boost";
+	my $cur_boost = $cfg->{$key};
+	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
+	$self->lei_config($key, $new_boost);
+	my $stor = $self->_lei_store(1);
+	# TODO: add to MiscIdx
+	$stor->done;
+}
+
+sub lei_forget_extinbox {
+	# TODO
+}
+
+1;
diff --git a/t/lei.t b/t/lei.t
index 30f9d2b6..a95a0efc 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -7,17 +7,18 @@ use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Config;
 use File::Path qw(rmtree);
+require_git 2.6;
 require_mods(qw(json DBD::SQLite Search::Xapian));
 my $LEI = 'lei';
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my $lei = sub {
-	my ($cmd, $env, $opt) = @_;
+	my ($cmd, $env, $xopt) = @_;
 	$out = $err = '';
 	if (!ref($cmd)) {
-		($env, $opt) = grep { (!defined) || ref } @_;
-		$cmd = [ grep { defined } @_ ];
+		($env, $xopt) = grep { (!defined) || ref } @_;
+		$cmd = [ grep { defined && !ref } @_ ];
 	}
-	run_script([$LEI, @$cmd], $env, $opt);
+	run_script([$LEI, @$cmd], $env, $xopt // $opt);
 };
 
 my ($home, $for_destroy) = tmpdir();
@@ -29,6 +30,8 @@ local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
 my $home_trash = [ "$home/.local", "$home/.config" ];
 my $cleanup = sub { rmtree([@$home_trash, @_]) };
+my $config_file = "$home/.config/lei/config";
+my $store_dir = "$home/.local/share/lei";
 
 my $test_help = sub {
 	ok(!$lei->([], undef, $opt), 'no args fails');
@@ -118,10 +121,71 @@ my $test_config = sub {
 	ok(!-f "$home/config/f", 'no file created');
 };
 
+my $setup_publicinboxes = sub {
+	state $done = '';
+	return if $done eq $home;
+	use PublicInbox::InboxWritable;
+	for my $V (1, 2) {
+		run_script([qw(-init -Lmedium), "-V$V", "t$V",
+				'--newsgroup', "t.$V",
+				"$home/t$V", "http://example.com/t$V",
+				"t$V\@example.com" ]) or BAIL_OUT "init v$V";
+	}
+	my $cfg = PublicInbox::Config->new;
+	my $seen = 0;
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		my $im = PublicInbox::InboxWritable->new($ibx)->importer(0);
+		my $V = $ibx->version;
+		my @eml = glob('t/*.eml');
+		push(@eml, 't/data/0001.patch') if $V == 2;
+		for (@eml) {
+			next if $_ eq 't/psgi_v2-old.eml'; # dup mid
+			$im->add(eml_load($_)) or BAIL_OUT "v$V add $_";
+			$seen++;
+		}
+		$im->done;
+		if ($V == 1) {
+			run_script(['-index', $ibx->{inboxdir}]) or
+				BAIL_OUT 'index v1';
+		}
+	});
+	$done = $home;
+	$seen || BAIL_OUT 'no imports';
+};
+
+my $test_extinbox = sub {
+	$setup_publicinboxes->();
+	$cleanup->();
+	$lei->('ls-extinbox');
+	is($out.$err, '', 'ls-extinbox no output, yet');
+	ok(!-e $config_file && !-e $store_dir,
+		'nothing created by ls-extinbox');
+
+	my $cfg = PublicInbox::Config->new;
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		ok($lei->(qw(add-extinbox -q), $ibx->{inboxdir}),
+			'added extinbox');
+		is($out.$err, '', 'no output');
+	});
+	ok(-s $config_file && -e $store_dir,
+		'add-extinbox created config + store');
+	my $lcfg = PublicInbox::Config->new($config_file);
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		is($lcfg->{"extinbox.$ibx->{inboxdir}.boost"}, 0,
+			"configured boost on $ibx->{name}");
+	});
+	$lei->('ls-extinbox');
+	like($out, qr/boost=0\n/s, 'ls-extinbox has output');
+};
+
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
 	$test_init->();
+	$test_extinbox->();
 };
 
 my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};
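
The ordering used by `lei_ls_extinbox' (highest boost first, alphabetical on ties) can also be written as a single comparator. This is only a sketch of an equivalent ordering, not the code from the patch, and the locations and boost values are made up.

```perl
use strict;
use warnings;

# Hypothetical locations and boost values, for illustration only.
my %boost = (
	'/srv/b' => 0,
	'https://example.com/a' => 5,
	'/srv/a' => 0,
);

# Compare on boost (descending) first, then on location (ascending).
# This yields the same order as a stable boost sort applied to an
# alphabetically pre-sorted key list.
my @order = sort { $boost{$b} <=> $boost{$a} || $a cmp $b } keys %boost;
print "$_ boost=$boost{$_}\n" for @order;
```

A single comparator avoids depending on `use sort 'stable'`, at the cost of a slightly longer sort block.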

* Re: [PATCH 26/26] lei: extinbox: start implementing in config file
  2020-12-18 12:09 42% ` [PATCH 26/26] lei: extinbox: start implementing in config file Eric Wong
@ 2020-12-18 20:23 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-18 20:23 UTC (permalink / raw)
  To: meta

Will squash these changes in before pushing:

diff --git a/MANIFEST b/MANIFEST
index e2d4ef72..f0847e3c 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -162,6 +162,7 @@ lib/PublicInbox/InboxWritable.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiExtinbox.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
 lib/PublicInbox/Linkify.pm
diff --git a/lib/PublicInbox/LeiExtinbox.pm b/lib/PublicInbox/LeiExtinbox.pm
index 2f52b115..c2de7735 100644
--- a/lib/PublicInbox/LeiExtinbox.pm
+++ b/lib/PublicInbox/LeiExtinbox.pm
@@ -20,7 +20,6 @@ sub lei_ls_extinbox {
 		$boost{$loc} = $cfg->{"$sec.boost"};
 		push @loc, $loc;
 	}
-	my $out = $self->{1};
 	use sort 'stable';
 	# highest boost first, but stable for alphabetic tie break
 	for (sort { $boost{$b} <=> $boost{$a} } sort keys %boost) {

* [RFC] lei: rename proposed "query" command to "q", add JSON output
  @ 2020-12-26 11:13 55%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-26 11:13 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> I also use notmuch (via its Emacs interface).  As someone that will
> probably write an Emacs interface for lei (as part of piem), an aspect
> of notmuch that I'd be grateful to see in lei is a structured output
> format for easier parsing and for conveying the thread layout.  `notmuch
> show' and `notmuch search' have json and S-expressions.  I wouldn't
> expect to see S-expressions coming out of lei :), but perhaps json would
> be on the table for `lei show' and `lei query' given that it's planned
> for $ls_format.

OK, before I forget, JSON is added.
And an extremely long message on why I want to type less :x
----------8<---------
Subject: [PATCH] lei: rename proposed "query" command to "q", add JSON output

Using "query" as a verb may be confusing since we'll also refer to
queries as nouns with the "<ls|rm|mv>-query" sub-commands.  "query"
is also many characters to type without tab-completion for what I
expect to be one of the most commonly used sub-commands.

Furthermore, "q" is also the common query parameter name used by
our PSGI interface, as is the case with several major web search
engines; so there's an element of familiarity there.

The name "search" was disregarded because "show" could be a
commonly used lei sub-command, too, and typing "se" for
tab-completion may be slow since two-handed typists on QWERTY
keyboards won't be able to use alternating hands.

"f" or "find" could be a possibility here, too; but we're
currently using the term "forget" as a weaker version of
"remove" or "rm", though "ignore" could be substituted for
"forget", perhaps...

Kyle Meyer noted the lack of (proposed) JSON output support
so that's been added to the proposed UI.
---
 lib/PublicInbox/LEI.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b254e2c5..7002a1f7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -68,7 +68,7 @@ sub _config_path ($) {
 # TODO: generate shell completion + help using %CMD and %OPTDESC
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
-'query' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
+'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s@ reverse|r offset=i remote local! extinbox!
 	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
@@ -98,7 +98,7 @@ our %CMD = ( # sorted in order of importance/use:
 	'set/unset flags on message(s) from stdin',
 	qw(stdin| oid=s exact by-mid|mid:s) ],
 'forget' => [ '[--stdin|--oid=OID|--by-mid=MID]',
-	'exclude message(s) on stdin from query results',
+	"exclude message(s) on stdin from `q' search results",
 	qw(stdin| oid=s exact by-mid|mid:s quiet|q) ],
 
 'purge-mailsource' => [ '{URL_OR_PATHNAME|--all}',
@@ -175,7 +175,7 @@ my %OPTDESC = (
 'dedupe|d=s' => ['STRAT|content|oid|mid',
 		'deduplication strategy'],
 'show	thread|t' => 'display entire thread a message belongs to',
-'query	thread|t' =>
+'q	thread|t' =>
 	'return all messages in the same thread as the actual match(es)',
 'augment|a' => 'augment --output destination instead of clobbering',
 
@@ -186,7 +186,7 @@ my %OPTDESC = (
 			'message/object output format' ],
 'mark	format|f=s' => $stdin_formats,
 'forget	format|f=s' => $stdin_formats,
-'query	format|f=s' => [ 'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|oid',
+'q	format|f=s' => [ 'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|oid|json',
 		'specify output format, default depends on --output'],
 'ls-query	format|f=s' => $ls_format,
 'ls-extinbox	format|f=s' => $ls_format,
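
For readers unfamiliar with the option specs in %CMD, here is a rough sketch of how a few of the `q' specs behave when handed to Getopt::Long.  The spec list is a trimmed subset of the one above, and `parse_q' is a hypothetical helper, not part of lei.

```perl
use strict;
use warnings;
use Getopt::Long ();

# A trimmed subset of the `q' option specs from the patch.
my @spec = qw(output|o=s format|f=s thread|t sort|s=s@ limit|n=i);

# Hypothetical helper: returns (\%opt, \@search_terms), or an empty
# list if option parsing fails.
sub parse_q {
	my (@argv) = @_;
	my %opt;
	my $glp = Getopt::Long::Parser->new;
	$glp->configure(qw(gnu_getopt no_ignore_case));
	$glp->getoptionsfromarray(\@argv, \%opt, @spec) or return;
	(\%opt, \@argv); # non-option arguments remain in @argv
}

my ($opt, $terms) = parse_q(qw(-f json -n 10 -s date -t s:lei dt:2020));
printf "format=%s limit=%d sort=%s thread=%d terms=%s\n",
	$opt->{format}, $opt->{limit}, join(',', @{$opt->{sort}}),
	$opt->{thread}, join(' ', @$terms);
```

The `s@' suffix collects repeated --sort values into an array, and `|' separates the long name from its short alias.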

* "extinbox" term - was: [RFC 4/7] lei: proposed command-listing...
  2020-12-15 11:47 43% ` [RFC 4/7] lei: proposed command-listing and options Eric Wong
@ 2020-12-26 11:26 71%   ` Eric Wong
  2020-12-28 15:29 71%     ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2020-12-26 11:26 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +'add-extinbox' => [ 'URL-OR-PATHNAME',
> +	'add/set priority of a publicinbox|extindex for extra matches',
> +	qw(prio=i) ],
> +'ls-extinbox' => [ '[FILTER]', 'list publicinbox|extindex sources',
> +	qw(format|f=s z local remote) ],
> +'forget-extinbox' => [ '{URL-OR-PATHNAME|--prune}',
> +	'exclude further results from a publicinbox|extindex',
> +	qw(prune) ],

I'm a bit iffy on "extinbox".  It's supposed to be a short
version meaning "either external index or a public inbox".

However, it's the same length and only two middle letters
away from "extindex" (short for "external index").

Would "inboxish" be an appropriate term in place of "extinbox"?
There's precedent with git using the terms "treeish" and
"committish".

I also don't want to force a user to specify the type, since it
will support HTTP(S) URLs and not just on-filesystem storage.
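
As a sketch of how the type can be inferred rather than specified, the check below mirrors the URL test in lei_add_extinbox from patch 26/26; `normalize_location' is a hypothetical name, and Unix-style paths are assumed.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: normalize a location without asking the user
# whether it is a URL or a directory.
sub normalize_location {
	my ($loc) = @_;
	return $loc if $loc =~ m!\Ahttps?://!; # remote: keep the URL as given
	File::Spec->canonpath($loc); # local: collapse "//", "/./", trailing "/"
}

print normalize_location('https://example.com/t1/'), "\n";
print normalize_location('/srv//inbox/./t1/'), "\n";
```

Canonicalizing local paths keeps the same directory from being stored under several spellings in the config file.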

* Re: "extinbox" term - was: [RFC 4/7] lei: proposed command-listing...
  2020-12-26 11:26 71%   ` "extinbox" term - was: [RFC 4/7] lei: proposed command-listing Eric Wong
@ 2020-12-28 15:29 71%     ` Kyle Meyer
  2020-12-28 21:55 71%       ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2020-12-28 15:29 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Eric Wong <e@80x24.org> wrote:
>> +'add-extinbox' => [ 'URL-OR-PATHNAME',
>> +	'add/set priority of a publicinbox|extindex for extra matches',
>> +	qw(prio=i) ],
>> +'ls-extinbox' => [ '[FILTER]', 'list publicinbox|extindex sources',
>> +	qw(format|f=s z local remote) ],
>> +'forget-extinbox' => [ '{URL-OR-PATHNAME|--prune}',
>> +	'exclude further results from a publicinbox|extindex',
>> +	qw(prune) ],
>
> I'm a bit iffy on "extinbox".  It's supposed to be a short
> version meaning "either external index or a public inbox".
>
> However, it's the same length and only two middle letters
> away from "extindex" (short for "external index").

Fwiw my brain made the incorrect extinbox => extindex jump when first
glancing over the command names before reading the descriptions.

> Would "inboxish" be an appropriate term in place of "extinbox"?
> There's precedent with git using the terms "treeish" and
> "committish".

Yeah, that seems okay.  I think "ish" would certainly make it clear to
the reader that there is more going on while avoiding the issue above,
but I wonder if that's really much better than just using "inbox" in the
command names and making the descriptions state something to the effect
of "... or external index".  At least from the standpoint of the search
UI, it seems natural to think of an external index as an "inbox", but
perhaps such an overloading is setting things up for confusion.

* Re: "extinbox" term - was: [RFC 4/7] lei: proposed command-listing...
  2020-12-28 15:29 71%     ` Kyle Meyer
@ 2020-12-28 21:55 71%       ` Eric Wong
  2020-12-29  3:01 71%         ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2020-12-28 21:55 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > Eric Wong <e@80x24.org> wrote:
> >> +'add-extinbox' => [ 'URL-OR-PATHNAME',
> >> +	'add/set priority of a publicinbox|extindex for extra matches',
> >> +	qw(prio=i) ],
> >> +'ls-extinbox' => [ '[FILTER]', 'list publicinbox|extindex sources',
> >> +	qw(format|f=s z local remote) ],
> >> +'forget-extinbox' => [ '{URL-OR-PATHNAME|--prune}',
> >> +	'exclude further results from a publicinbox|extindex',
> >> +	qw(prune) ],
> >
> > I'm a bit iffy on "extinbox".  It's supposed to be a short
> > version meaning "either external index or a public inbox".
> >
> > However, it's the same length and only two middle letters
> > away from "extindex" (short for "external index").
> 
> Fwiw my brain made the incorrect extinbox => extindex jump when first
> glancing over the command names before reading the descriptions.

What about just "external"?  It could probably be extended to
handle existing IMAP, JMAP, notmuch, mairix, etc... as search
sources with query translation, even.

> > Would "inboxish" be an appropriate term in place of "extinbox"?
> > There's precedent with git using the terms "treeish" and
> > "committish".
> 
> Yeah, that seems okay.  I think "ish" would certainly make it clear to
> the reader that there is more going on while avoiding the issue above,
> but I wonder if that's really much better than just using "inbox" in the
> command names and making the descriptions state something to the effect
> of "... or external index".  At least from the standpoint of the search
> UI, it seems natural to think of an external index as an "inbox", but
> perhaps such an overloading is setting things up for confusion.

I'm using inboxish/ibxish internally, at least.  But now I'm
thinking "external" would give us more flexibility w.r.t. future
features.

* Re: "extinbox" term - was: [RFC 4/7] lei: proposed command-listing...
  2020-12-28 21:55 71%       ` Eric Wong
@ 2020-12-29  3:01 71%         ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2020-12-29  3:01 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:


> What about just "external"?  It could probably be extended to
> handle existing IMAP, JMAP, notmuch, mairix, etc... as search
> sources with query translation, even.

"external" sounds good to me.

* [PATCH 00/36] another round of lei stuff
@ 2020-12-31 13:51 51% Eric Wong
  2020-12-31 13:51 37% ` [PATCH 10/36] lei: implement various deduplication strategies Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2020-12-31 13:51 UTC (permalink / raw)
  To: meta

This is against lei branch @ commit
0c8106d44f317175e122744b43407bf067183175 in
https://public-inbox.org/public-inbox.git

Infrastructure for reading + writing local Maildirs and a
bunch of mbox formats is done (including gz/bz2/xz support)
and its usage should be familiar to mairix(1) users.
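
As one example of how the mbox variants differ, mboxrd escapes any body line matching /^>*From / by prepending one more ">", which makes the escaping reversible.  The helper names below are hypothetical and this is only a sketch of the quoting rule, not the LeiToMail or MboxReader code.

```perl
use strict;
use warnings;

# mboxrd: add one ">" to every line that starts with optional ">"
# characters followed by "From ".
sub mboxrd_escape {
	my ($body) = @_;
	$body =~ s/^(>*From )/>$1/gm;
	$body;
}

# Reverse of the above: strip exactly one leading ">" from such lines.
sub mboxrd_unescape {
	my ($body) = @_;
	$body =~ s/^>(>*From )/$1/gm;
	$body;
}

my $body = "From here on\n>From quoted\nplain\n";
my $escaped = mboxrd_escape($body);
my $restored = mboxrd_unescape($escaped);
print $escaped;
```

Because every matching line gains exactly one ">", unescaping restores the original body byte-for-byte.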

Infrastructure for deduplication + augmenting search results
is in place and tested.

Going to skip MH and MMDF for now.  IMAP/JMAP might happen
sooner, but deduplication needs low latency.

"extinbox" is renamed to "external".

Basic infrastructure like PublicInbox::IPC and SharedKV
should've been done and in use ages ago...  I look forward to
using them, at least.

Some DS safety fixes, since lei will use it in stranger ways
than current code does.

Bad enough we have messages with duplicate Message-IDs, lei will
need to deal with Unsent/Drafts messages w/o Message-IDs at all!

Eric Wong (36):
  import: respect init.defaultBranch
  lei_store: use per-machine refname as git HEAD
  revert "lei_store: use per-machine refname as git HEAD"
  lei_to_mail: initial implementation for writing mbox formats
  sharedkv: fork()-friendly key-value store
  sharedkv: split out index_values
  lei_to_mail: start atomic and compressed mbox writing
  mboxreader: new class for reading various mbox formats
  lei_to_mail: start --augment, dedupe, bz2 and xz
  lei: implement various deduplication strategies
  lei_to_mail: lazy-require LeiDedupe
  lei_to_mail: support for non-seekable outputs
  lei_to_mail: support Maildir, fix+test --augment
  ipc: generic IPC dispatch based on Storable
  ipc: support Sereal
  lei_store: add ->set_eml, ->add_eml can return smsg
  lei: rename "extinbox" => "external"
  mid: use defined-or with `push' for uniqueness check
  mid: hoist out mids_in sub
  lei_store: handle messages without Message-ID at all
  ipc: use shutdown(2), base atfork* callback
  lei_to_mail: unlink mboxes if not augmenting
  lei: add --mfolder as an option
  spawn: move run_die here from PublicInbox::Import
  init: remove embedded UnlinkMe package
  t/run.perl: avoid uninitialized var on incomplete test
  gcf2client: reap process on DESTROY
  lei_to_mail: open FIFOs O_WRONLY so we block
  searchidxshard: call DS->Reset at worker start
  t/ipc.t: test for references via `die'
  use PublicInbox::DS for dwaitpid
  syscall: SFD_NONBLOCK can be a constant, again
  lei: avoid Spawn package when starting daemon
  avoid calling waitpid from children in DESTROY
  ds: clobber $in_loop first at reset
  on_destroy: support PID owner guard

 MANIFEST                                      |  12 +-
 lib/PublicInbox/DS.pm                         |  42 +-
 lib/PublicInbox/DSKQXS.pm                     |   4 +-
 lib/PublicInbox/Daemon.pm                     |   4 +-
 lib/PublicInbox/Gcf2Client.pm                 |  18 +-
 lib/PublicInbox/Git.pm                        |   7 +-
 lib/PublicInbox/IPC.pm                        | 165 ++++++++
 lib/PublicInbox/Import.pm                     |  36 +-
 lib/PublicInbox/LEI.pm                        |  44 +--
 lib/PublicInbox/LeiDedupe.pm                  | 100 +++++
 .../{LeiExtinbox.pm => LeiExternal.pm}        |  18 +-
 lib/PublicInbox/LeiStore.pm                   |  32 +-
 lib/PublicInbox/LeiToMail.pm                  | 361 ++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm                 |   2 +-
 lib/PublicInbox/Lock.pm                       |  17 +-
 lib/PublicInbox/MID.pm                        |  15 +-
 lib/PublicInbox/MboxReader.pm                 | 127 ++++++
 lib/PublicInbox/OnDestroy.pm                  |   5 +
 lib/PublicInbox/OverIdx.pm                    |   2 +
 lib/PublicInbox/ProcessPipe.pm                |  34 +-
 lib/PublicInbox/Qspawn.pm                     |  43 +--
 lib/PublicInbox/SearchIdxShard.pm             |   1 +
 lib/PublicInbox/SharedKV.pm                   | 148 +++++++
 lib/PublicInbox/Sigfd.pm                      |   4 +-
 lib/PublicInbox/Smsg.pm                       |   6 +-
 lib/PublicInbox/Spawn.pm                      |   9 +-
 lib/PublicInbox/Syscall.pm                    |   4 +-
 lib/PublicInbox/TestCommon.pm                 |  25 +-
 lib/PublicInbox/V2Writable.pm                 |  10 +-
 script/lei                                    |  17 +-
 script/public-inbox-init                      |  32 +-
 script/public-inbox-watch                     |   4 +-
 t/convert-compact.t                           |   4 +-
 t/index-git-times.t                           |   3 +-
 t/ipc.t                                       |  80 ++++
 t/lei.t                                       |  22 +-
 t/lei_dedupe.t                                |  59 +++
 t/lei_store.t                                 |  47 ++-
 t/lei_to_mail.t                               | 246 ++++++++++++
 t/lei_xsearch.t                               |   2 +-
 t/mbox_reader.t                               |  75 ++++
 t/on_destroy.t                                |   9 +
 t/plack.t                                     |   4 +-
 t/run.perl                                    |   3 +-
 t/shared_kv.t                                 |  58 +++
 t/sigfd.t                                     |   6 +-
 46 files changed, 1755 insertions(+), 211 deletions(-)
 create mode 100644 lib/PublicInbox/IPC.pm
 create mode 100644 lib/PublicInbox/LeiDedupe.pm
 rename lib/PublicInbox/{LeiExtinbox.pm => LeiExternal.pm} (75%)
 create mode 100644 lib/PublicInbox/LeiToMail.pm
 create mode 100644 lib/PublicInbox/MboxReader.pm
 create mode 100644 lib/PublicInbox/SharedKV.pm
 create mode 100644 t/ipc.t
 create mode 100644 t/lei_dedupe.t
 create mode 100644 t/lei_to_mail.t
 create mode 100644 t/mbox_reader.t
 create mode 100644 t/shared_kv.t


^ permalink raw reply	[relevance 51%]

* [PATCH 10/36] lei: implement various deduplication strategies
  2020-12-31 13:51 51% [PATCH 00/36] another round of lei stuff Eric Wong
@ 2020-12-31 13:51 37% ` Eric Wong
  2020-12-31 13:51 44% ` [PATCH 17/36] lei: rename "extinbox" => "external" Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-31 13:51 UTC (permalink / raw)
  To: meta

For writing mboxes and Maildirs, users may wish to use
stricter or looser deduplication strategies.  This
gives them more control.
---
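A minimal sketch of how the strategies differ, for readers
skimming the diff below.  It is not the code from this patch: a
plain hash stands in for PublicInbox::SharedKV, messages are
simple hashes rather than PublicInbox::Eml objects, and the names
set_maybe, %key_for and new_dedupe are hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical stand-in for SharedKV->set_maybe: true only when the
# key was not already present.
sub set_maybe { my ($seen, $k) = @_; !$seen->{$k}++ }

# Each strategy maps a message to the key used for comparison.
my %key_for = (
	content => sub { sha256_hex($_[0]->{body}) },
	mid => sub { $_[0]->{mid} },
);

sub new_dedupe {
	my ($strat) = @_;
	return sub { 0 } if $strat eq 'none'; # nothing is ever a duplicate
	my $key = $key_for{$strat} //
		die "unsupported dedupe strategy: $strat\n";
	my %seen;
	sub { !set_maybe(\%seen, $key->($_[0])) }; # true if already seen
}

# two messages sharing a Message-ID but with different bodies
my $m1 = { mid => '<x@example>', body => "hello\n" };
my $m2 = { mid => '<x@example>', body => "different\n" };
for my $strat (qw(content mid none)) {
	my $is_dup = new_dedupe($strat);
	my @r = map { $is_dup->($_) ? 1 : 0 } ($m1, $m1, $m2);
	print "$strat: @r\n";
}
```

The "mid" strategy treats $m2 as a duplicate of $m1 while
"content" does not, which is the trade-off between looser and
stricter deduplication.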
 MANIFEST                     |  2 +
 lib/PublicInbox/LEI.pm       |  2 +-
 lib/PublicInbox/LeiDedupe.pm | 96 ++++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiToMail.pm | 26 +++++-----
 t/lei_dedupe.t               | 59 ++++++++++++++++++++++
 t/lei_to_mail.t              |  3 ++
 6 files changed, 176 insertions(+), 12 deletions(-)
 create mode 100644 lib/PublicInbox/LeiDedupe.pm
 create mode 100644 t/lei_dedupe.t

diff --git a/MANIFEST b/MANIFEST
index 1fb1e181..7ce2075e 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -162,6 +162,7 @@ lib/PublicInbox/InboxWritable.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExtinbox.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
@@ -330,6 +331,7 @@ t/iso-2202-jp.eml
 t/kqnotify.t
 t/lei-oneshot.t
 t/lei.t
+t/lei_dedupe.t
 t/lei_store.t
 t/lei_to_mail.t
 t/lei_xsearch.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7002a1f7..9aa4d95a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -172,7 +172,7 @@ my %OPTDESC = (
 
 'type=s' => [ 'any|mid|git', 'disambiguate type' ],
 
-'dedupe|d=s' => ['STRAT|content|oid|mid',
+'dedupe|d=s' => ['STRAT|content|oid|mid|none',
 		'deduplication strategy'],
 'show	thread|t' => 'display entire thread a message belongs to',
 'q	thread|t' =>
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
new file mode 100644
index 00000000..c6eb7196
--- /dev/null
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -0,0 +1,96 @@
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+package PublicInbox::LeiDedupe;
+use strict;
+use v5.10.1;
+use PublicInbox::SharedKV;
+use PublicInbox::ContentHash qw(content_hash);
+
+# n.b. mutt sets most of these headers; not sure about Bytes
+our @OID_IGNORE = qw(Status X-Status Content-Length Lines Bytes);
+
+# best-effort regeneration of OID when augmenting existing results
+sub _regen_oid ($) {
+	my ($eml) = @_;
+	my @stash; # stash away headers we shouldn't have in git
+	for my $k (@OID_IGNORE) {
+		my @v = $eml->header_raw($k) or next;
+		push @stash, [ $k, \@v ];
+		$eml->header_set($k); # restore below
+	}
+	my $dig = Digest::SHA->new(1); # XXX SHA256 later
+	my $buf = $eml->as_string;
+	$dig->add('blob '.length($buf)."\0");
+	$dig->add($buf);
+	undef $buf;
+
+	for my $kv (@stash) { # restore stashed headers
+		my ($k, $v) = @$kv; # $v is the array ref stashed above
+		$eml->header_set($k, @$v);
+	}
+	$dig->digest;
+}
+
+sub _oidbin ($) { defined($_[0]) ? pack('H*', $_[0]) : undef }
+
+# the paranoid option
+sub dedupe_oid () {
+	my $skv = PublicInbox::SharedKV->new;
+	($skv, sub { # may be called in a child process
+		my ($eml, $oid) = @_;
+		$skv->set_maybe(_oidbin($oid) // _regen_oid($eml), '');
+	});
+}
+
+# dangerous if there's duplicate messages with different Message-IDs
+sub dedupe_mid () {
+	my $skv = PublicInbox::SharedKV->new;
+	($skv, sub { # may be called in a child process
+		my ($eml, $oid) = @_;
+		# TODO: lei will support non-public messages w/o Message-ID
+		my $mid = $eml->header_raw('Message-ID') // _oidbin($oid) //
+			content_hash($eml);
+		$skv->set_maybe($mid, '');
+	});
+}
+
+# our default deduplication strategy (used by v2, also)
+sub dedupe_content () {
+	my $skv = PublicInbox::SharedKV->new;
+	($skv, sub { # may be called in a child process
+		my ($eml) = @_; # oid = $_[1], ignored
+		$skv->set_maybe(content_hash($eml), '');
+	});
+}
+
+# no deduplication at all
+sub dedupe_none () { (undef, sub { 1 }) }
+
+sub new {
+	my ($cls, $lei) = @_;
+	my $dd = $lei->{opt}->{dedupe} // 'content';
+	my $dd_new = $cls->can("dedupe_$dd") //
+			die "unsupported dedupe strategy: $dd\n";
+	bless [ $dd_new->() ], $cls; # [ $skv, $cb ]
+}
+
+# returns true on seen messages according to the deduplication strategy,
+# returns false if unseen
+sub is_dup {
+	my ($self, $eml, $oid) = @_;
+	!$self->[1]->($eml, $oid);
+}
+
+sub prepare_dedupe {
+	my ($self) = @_;
+	my $skv = $self->[0];
+	$skv ? $skv->dbh : undef;
+}
+
+sub pause_dedupe {
+	my ($self) = @_;
+	my $skv = $self->[0];
+	delete($skv->{dbh}) if $skv;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 294291b2..ead00d1a 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -8,9 +8,8 @@ use v5.10.1;
 use PublicInbox::Eml;
 use PublicInbox::Lock;
 use PublicInbox::ProcessPipe;
-use PublicInbox::SharedKV;
 use PublicInbox::Spawn qw(which spawn popen_rd);
-use PublicInbox::ContentHash qw(content_hash);
+use PublicInbox::LeiDedupe;
 use Symbol qw(gensym);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET);
@@ -226,10 +225,11 @@ sub dup_src ($) {
 	$dup;
 }
 
-# --augment existing output destination, without duplicating anything
+# --augment existing output destination, with deduplication
 sub _augment { # MboxReader eml_cb
 	my ($eml, $lei) = @_;
-	$lei->{skv}->set_maybe(content_hash($eml), '');
+	# ignore return value, just populate the skv
+	$lei->{dedupe_cb}->is_dup($eml);
 }
 
 sub _mbox_write_cb ($$$$) {
@@ -240,23 +240,27 @@ sub _mbox_write_cb ($$$$) {
 	open $out, '+>>', $dst or die "open $dst: $!";
 	# Perl does SEEK_END even with O_APPEND :<
 	seek($out, 0, SEEK_SET) or die "seek $dst: $!";
-	my $atomic = !!(($lei->{opt}->{jobs} // 0) > 1);
-	$lei->{skv} = PublicInbox::SharedKV->new;
-	$lei->{skv}->dbh;
+	my $jobs = $lei->{opt}->{jobs} // 0;
+	my $atomic = $jobs > 1;
+	my $dedupe = $lei->{dedupe} = PublicInbox::LeiDedupe->new($lei);
 	state $zsfx_allow = join('|', keys %zsfx2cmd);
 	my ($zsfx) = ($dst =~ /\.($zsfx_allow)\z/);
 	if ($lei->{opt}->{augment}) {
-		my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) :
-				dup_src($out);
-		PublicInbox::MboxReader->$mbox($rd, \&_augment, $lei);
+		if (-s $out && $dedupe->prepare_dedupe) {
+			my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) :
+					dup_src($out);
+			PublicInbox::MboxReader->$mbox($rd, \&_augment, $lei);
+		}
+		$dedupe->pause_dedupe if $jobs; # are we forking?
 	} else {
 		truncate($out, 0) or die "truncate $dst: $!";
+		$dedupe->prepare_dedupe if !$jobs;
 	}
 	($out, $pipe_lk) = compress_dst($out, $zsfx, $lei) if $zsfx;
 	sub {
 		my ($buf, $oid, $kw) = @_;
 		my $eml = PublicInbox::Eml->new($buf);
-		if ($lei->{skv}->set_maybe(content_hash($eml), '')) {
+		if (!$lei->{dedupe}->is_dup($eml, $oid)) {
 			$buf = $eml2mbox->($eml, $kw);
 			my $lock = $pipe_lk->lock_for_scope if $pipe_lk;
 			write_in_full($out, $buf, $atomic);
diff --git a/t/lei_dedupe.t b/t/lei_dedupe.t
new file mode 100644
index 00000000..08f38aa0
--- /dev/null
+++ b/t/lei_dedupe.t
@@ -0,0 +1,59 @@
+#!perl -w
+# Copyright (C) 2020 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use PublicInbox::Eml;
+require_mods(qw(DBD::SQLite));
+use_ok 'PublicInbox::LeiDedupe';
+my $eml = eml_load('t/plack-qp.eml');
+my $mid = $eml->header_raw('Message-ID');
+my $different = eml_load('t/msg_iter-order.eml');
+$different->header_set('Message-ID', $mid);
+
+my $lei = { opt => { dedupe => 'none' } };
+my $dd = PublicInbox::LeiDedupe->new($lei);
+$dd->prepare_dedupe;
+ok(!$dd->is_dup($eml), '1st is_dup w/o dedupe');
+ok(!$dd->is_dup($eml), '2nd is_dup w/o dedupe');
+ok(!$dd->is_dup($different), 'different is_dup w/o dedupe');
+
+for my $strat (undef, 'content') {
+	$lei->{opt}->{dedupe} = $strat;
+	$dd = PublicInbox::LeiDedupe->new($lei);
+	$dd->prepare_dedupe;
+	my $desc = $strat // 'default';
+	ok(!$dd->is_dup($eml), "1st is_dup with $desc dedupe");
+	ok($dd->is_dup($eml), "2nd seen with $desc dedupe");
+	ok(!$dd->is_dup($different), "different is_dup with $desc dedupe");
+}
+$lei->{opt}->{dedupe} = 'bogus';
+eval { PublicInbox::LeiDedupe->new($lei) };
+like($@, qr/unsupported.*bogus/, 'died on bogus strategy');
+
+$lei->{opt}->{dedupe} = 'mid';
+$dd = PublicInbox::LeiDedupe->new($lei);
+$dd->prepare_dedupe;
+ok(!$dd->is_dup($eml), '1st is_dup with mid dedupe');
+ok($dd->is_dup($eml), '2nd seen with mid dedupe');
+ok($dd->is_dup($different), 'different seen with mid dedupe');
+
+$lei->{opt}->{dedupe} = 'oid';
+$dd = PublicInbox::LeiDedupe->new($lei);
+$dd->prepare_dedupe;
+
+# --augment won't have OIDs:
+ok(!$dd->is_dup($eml), '1st is_dup with oid dedupe (augment)');
+ok($dd->is_dup($eml), '2nd seen with oid dedupe (augment)');
+ok(!$dd->is_dup($different), 'different is_dup with oid dedupe (augment)');
+$different->header_set('Status', 'RO');
+ok($dd->is_dup($different), 'different seen with oid dedupe Status removed');
+
+ok(!$dd->is_dup($eml, '01d'), '1st is_dup with oid dedupe');
+ok($dd->is_dup($different, '01d'), 'different content ignored if oid matches');
+ok($dd->is_dup($eml, '01D'), 'case insensitive oid comparison :P');
+ok(!$dd->is_dup($eml, '01dbad'), 'case insensitive oid comparison :P');
+
+done_testing;
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index e4551e69..5be4e285 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -6,6 +6,7 @@ use v5.10.1;
 use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Eml;
+require_mods(qw(DBD::SQLite));
 use_ok 'PublicInbox::LeiToMail';
 my $from = "Content-Length: 10\nSubject: x\n\nFrom hell\n";
 my $noeol = "Subject: x\n\nFrom hell";
@@ -86,6 +87,7 @@ my $orig = do {
 
 	local $lei->{opt} = { jobs => 2 };
 	$wcb = PublicInbox::LeiToMail->write_cb("mboxcl2:$fn", $lei);
+	$lei->{dedupe}->prepare_dedupe;
 	$wcb->(\($dup = $buf), 'deadbeef', [ qw(seen) ]);
 	undef $wcb;
 	open $fh, '<', $fn or BAIL_OUT $!;
@@ -110,6 +112,7 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 		local $lei->{opt} = { jobs => 2 }; # for atomic writes
 		unlink $f or BAIL_OUT "unlink $!";
 		$wcb = PublicInbox::LeiToMail->write_cb($dst, $lei);
+		$lei->{dedupe}->prepare_dedupe;
 		$wcb->(\($dup = $buf), 'deadbeef', [ qw(seen) ]);
 		undef $wcb;
 		is(xqx([@$dc_cmd, $f]), $orig, "$zsfx matches with lock");

^ permalink raw reply related	[relevance 37%]

* [PATCH 17/36] lei: rename "extinbox" => "external"
  2020-12-31 13:51 51% [PATCH 00/36] another round of lei stuff Eric Wong
  2020-12-31 13:51 37% ` [PATCH 10/36] lei: implement various deduplication strategies Eric Wong
@ 2020-12-31 13:51 44% ` Eric Wong
  2020-12-31 13:51 71% ` [PATCH 23/36] lei: add --mfolder as an --output alias Eric Wong
  2020-12-31 13:51 56% ` [PATCH 33/36] lei: avoid Spawn package when starting daemon Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-31 13:51 UTC (permalink / raw)
  To: meta

The words "extinbox" and "extindex" are too similar and easy to
confuse with each other.  Rename "extinbox" to "external", since
these could be IMAP, JMAP or other non-public-inbox search APIs.

Link: https://public-inbox.org/meta/20201226112649.GB6226@dcvr/
---
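A rough sketch of the resulting config key layout, assuming a
flattened "section.key" hash.  The %cfg contents and the example
locations are made up for illustration, and the real code walks
$cfg->{-section_order} rather than sorting keys.

```perl
use strict;
use warnings;

# hypothetical flattened config, keyed as "external.<location>.boost"
my %cfg = (
	'external./path/to/inbox.boost' => 0,
	'external.https://example.com/meta/.boost' => 5,
	'core.bare' => 'false',
);
my @locs;
for my $key (sort grep(/\Aexternal\..+\.boost\z/s, keys %cfg)) {
	# drop the "external." prefix and ".boost" suffix
	my $loc = substr($key, length('external.'), -length('.boost'));
	push @locs, $loc;
	print "$loc boost=$cfg{$key}\n";
}
```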
 MANIFEST                                      |  2 +-
 lib/PublicInbox/LEI.pm                        | 12 +++++-----
 .../{LeiExtinbox.pm => LeiExternal.pm}        | 18 +++++++--------
 lib/PublicInbox/LeiXSearch.pm                 |  2 +-
 t/lei.t                                       | 22 +++++++++----------
 t/lei_xsearch.t                               |  2 +-
 6 files changed, 29 insertions(+), 29 deletions(-)
 rename lib/PublicInbox/{LeiExtinbox.pm => LeiExternal.pm} (75%)

diff --git a/MANIFEST b/MANIFEST
index 96ad52bf..6dc08f01 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -164,7 +164,7 @@ lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
 lib/PublicInbox/LeiDedupe.pm
-lib/PublicInbox/LeiExtinbox.pm
+lib/PublicInbox/LeiExternal.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
 lib/PublicInbox/LeiToMail.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9aa4d95a..f960aa72 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -8,7 +8,7 @@
 package PublicInbox::LEI;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::DS PublicInbox::LeiExtinbox);
+use parent qw(PublicInbox::DS PublicInbox::LeiExternal);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 use Errno qw(EAGAIN ECONNREFUSED ENOENT);
@@ -70,19 +70,19 @@ sub _config_path ($) {
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
-	sort|s=s@ reverse|r offset=i remote local! extinbox!
+	sort|s=s@ reverse|r offset=i remote local! external!
 	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
 	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
 	pass_through('git show') ],
 
-'add-extinbox' => [ 'URL_OR_PATHNAME',
+'add-external' => [ 'URL_OR_PATHNAME',
 	'add/set priority of a publicinbox|extindex for extra matches',
 	qw(boost=i quiet|q) ],
-'ls-extinbox' => [ '[FILTER...]', 'list publicinbox|extindex locations',
+'ls-external' => [ '[FILTER...]', 'list publicinbox|extindex locations',
 	qw(format|f=s z|0 local remote quiet|q) ],
-'forget-extinbox' => [ '{URL_OR_PATHNAME|--prune}',
+'forget-external' => [ '{URL_OR_PATHNAME|--prune}',
 	'exclude further results from a publicinbox|extindex',
 	qw(prune quiet|q) ],
 
@@ -189,7 +189,7 @@ my %OPTDESC = (
 'q	format|f=s' => [ 'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|oid|json',
 		'specify output format, default depends on --output'],
 'ls-query	format|f=s' => $ls_format,
-'ls-extinbox	format|f=s' => $ls_format,
+'ls-external	format|f=s' => $ls_format,
 
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
diff --git a/lib/PublicInbox/LeiExtinbox.pm b/lib/PublicInbox/LeiExternal.pm
similarity index 75%
rename from lib/PublicInbox/LeiExtinbox.pm
rename to lib/PublicInbox/LeiExternal.pm
index c2de7735..0378551a 100644
--- a/lib/PublicInbox/LeiExtinbox.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -1,22 +1,22 @@
 # Copyright (C) 2020 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
-# *-extinbox commands of lei
-package PublicInbox::LeiExtinbox;
+# *-external commands of lei
+package PublicInbox::LeiExternal;
 use strict;
 use v5.10.1;
 use parent qw(Exporter);
-our @EXPORT = qw(lei_ls_extinbox lei_add_extinbox lei_forget_extinbox);
+our @EXPORT = qw(lei_ls_external lei_add_external lei_forget_external);
 
-sub lei_ls_extinbox {
+sub lei_ls_external {
 	my ($self, @argv) = @_;
 	my $stor = $self->_lei_store(0);
 	my $cfg = $self->_lei_cfg(0);
 	my $out = $self->{1};
 	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
 	my (%boost, @loc);
-	for my $sec (grep(/\Aextinbox\./, @{$cfg->{-section_order}})) {
-		my $loc = substr($sec, length('extinbox.'));
+	for my $sec (grep(/\Aexternal\./, @{$cfg->{-section_order}})) {
+		my $loc = substr($sec, length('external.'));
 		$boost{$loc} = $cfg->{"$sec.boost"};
 		push @loc, $loc;
 	}
@@ -28,14 +28,14 @@ sub lei_ls_extinbox {
 	}
 }
 
-sub lei_add_extinbox {
+sub lei_add_external {
 	my ($self, $url_or_dir) = @_;
 	my $cfg = $self->_lei_cfg(1);
 	if ($url_or_dir !~ m!\Ahttps?://!) {
 		$url_or_dir = File::Spec->canonpath($url_or_dir);
 	}
 	my $new_boost = $self->{opt}->{boost} // 0;
-	my $key = "extinbox.$url_or_dir.boost";
+	my $key = "external.$url_or_dir.boost";
 	my $cur_boost = $cfg->{$key};
 	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
 	$self->lei_config($key, $new_boost);
@@ -44,7 +44,7 @@ sub lei_add_extinbox {
 	$stor->done;
 }
 
-sub lei_forget_extinbox {
+sub lei_forget_external {
 	# TODO
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1a81b14a..7d251afd 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -18,7 +18,7 @@ sub new {
 	}, $class
 }
 
-sub attach_extinbox {
+sub attach_external {
 	my ($self, $ibxish) = @_; # ibxish = ExtSearch or Inbox
 	if (!$ibxish->can('over')) {
 		push @{$self->{remotes}}, $ibxish
diff --git a/t/lei.t b/t/lei.t
index a95a0efc..764a7fe4 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -154,38 +154,38 @@ my $setup_publicinboxes = sub {
 	$seen || BAIL_OUT 'no imports';
 };
 
-my $test_extinbox = sub {
+my $test_external = sub {
 	$setup_publicinboxes->();
 	$cleanup->();
-	$lei->('ls-extinbox');
-	is($out.$err, '', 'ls-extinbox no output, yet');
+	$lei->('ls-external');
+	is($out.$err, '', 'ls-external no output, yet');
 	ok(!-e $config_file && !-e $store_dir,
-		'nothing created by ls-extinbox');
+		'nothing created by ls-external');
 
 	my $cfg = PublicInbox::Config->new;
 	$cfg->each_inbox(sub {
 		my ($ibx) = @_;
-		ok($lei->(qw(add-extinbox -q), $ibx->{inboxdir}),
-			'added extinbox');
+		ok($lei->(qw(add-external -q), $ibx->{inboxdir}),
+			'added external');
 		is($out.$err, '', 'no output');
 	});
 	ok(-s $config_file && -e $store_dir,
-		'add-extinbox created config + store');
+		'add-external created config + store');
 	my $lcfg = PublicInbox::Config->new($config_file);
 	$cfg->each_inbox(sub {
 		my ($ibx) = @_;
-		is($lcfg->{"extinbox.$ibx->{inboxdir}.boost"}, 0,
+		is($lcfg->{"external.$ibx->{inboxdir}.boost"}, 0,
 			"configured boost on $ibx->{name}");
 	});
-	$lei->('ls-extinbox');
-	like($out, qr/boost=0\n/s, 'ls-extinbox has output');
+	$lei->('ls-external');
+	like($out, qr/boost=0\n/s, 'ls-external has output');
 };
 
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
 	$test_init->();
-	$test_extinbox->();
+	$test_external->();
 };
 
 my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};
diff --git a/t/lei_xsearch.t b/t/lei_xsearch.t
index c41213bd..178c3d37 100644
--- a/t/lei_xsearch.t
+++ b/t/lei_xsearch.t
@@ -49,7 +49,7 @@ $eidx->eidx_sync({fsync => 0});
 my $es = PublicInbox::ExtSearch->new("$home/eidx");
 my $lxs = PublicInbox::LeiXSearch->new;
 for my $ibxish (shuffle($es, @ibx)) {
-	$lxs->attach_extinbox($ibxish);
+	$lxs->attach_external($ibxish);
 }
 my $nr = $lxs->xdb->get_doccount;
 my $mset = $lxs->mset('d:19931002..19931003', { limit => $nr });

^ permalink raw reply related	[relevance 44%]

* [PATCH 23/36] lei: add --mfolder as an --output alias
  2020-12-31 13:51 51% [PATCH 00/36] another round of lei stuff Eric Wong
  2020-12-31 13:51 37% ` [PATCH 10/36] lei: implement various deduplication strategies Eric Wong
  2020-12-31 13:51 44% ` [PATCH 17/36] lei: rename "extinbox" => "external" Eric Wong
@ 2020-12-31 13:51 71% ` Eric Wong
  2020-12-31 13:51 56% ` [PATCH 33/36] lei: avoid Spawn package when starting daemon Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-31 13:51 UTC (permalink / raw)
  To: meta

This will be helpful for mairix users.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f960aa72..bb77198e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -69,7 +69,7 @@ sub _config_path ($) {
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
-	save-as=s output|o=s format|f=s dedupe|d=s thread|t augment|a
+	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s@ reverse|r offset=i remote local! external!
 	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
 

^ permalink raw reply related	[relevance 71%]

* [PATCH 33/36] lei: avoid Spawn package when starting daemon
  2020-12-31 13:51 51% [PATCH 00/36] another round of lei stuff Eric Wong
                   ` (2 preceding siblings ...)
  2020-12-31 13:51 71% ` [PATCH 23/36] lei: add --mfolder as an --output alias Eric Wong
@ 2020-12-31 13:51 56% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2020-12-31 13:51 UTC (permalink / raw)
  To: meta

Spawn was designed to speed up process spawning inside
long-lived daemons with largish memory usage.  It does not help
for short-lived scripts which only exist to start and connect to
a daemon.

This change actually speeds up initial lei startup from
~190ms to ~140ms(!).  Normal usage once the daemon is running
is unaffected, at <20ms for help text.

While we're in the area, simplify Cwd error message generation,
too.
---
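A minimal sketch of the list-form pipe open that replaces the
pipe() + spawn() + waitpid() sequence in script/lei.  The child
command here is a placeholder one-liner, not the real lazy_start
invocation.

```perl
use strict;
use warnings;

# list-form '-|' open: runs $^X without a shell and gives us a
# read handle connected to the child's stdout
open(my $child, '-|', $^X, '-e', 'print "ready\n"')
	or die "popen: $!";
my @lines = <$child>; # reads until the child closes its stdout
# close() on a pipe handle waits for the child and sets $?
close($child) or warn "child failed, \$?=$?\n";
print @lines;
```

Since close() reaps the child, no separate PID bookkeeping is
needed in the short-lived client.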
 lib/PublicInbox/LEI.pm | 10 +++++-----
 script/lei             | 17 ++++++-----------
 2 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 03302f8a..b84e24ef 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -675,7 +675,7 @@ sub lazy_start {
 	require IO::FDPass;
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
-	(-p STDOUT && -p STDERR) or die "E: stdout+stderr must be pipes\n";
+	(-p STDOUT) or die "E: stdout must be a pipe\n";
 	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!";
 	POSIX::setsid() > 0 or die "setsid: $!";
 	my $pid = fork // die "fork: $!";
@@ -740,10 +740,9 @@ sub lazy_start {
 		$n; # true: continue, false: stop
 	});
 
-	# STDIN was redirected to /dev/null above, closing STDOUT and
-	# STDERR will cause the calling `lei' client process to finish
-	# reading <$daemon> pipe.
-	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
+	# STDIN was redirected to /dev/null above, closing STDERR and
+	# STDOUT will cause the calling `lei' client process to finish
+	# reading the <$daemon> pipe.
 	openlog($path, 'pid', 'user');
 	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
 	my $owner_pid = $$;
@@ -751,6 +750,7 @@ sub lazy_start {
 		syslog('crit', "$@") if $@ && $$ == $owner_pid;
 	});
 	open STDERR, '>&STDIN' or die "redirect stderr failed: $!";
+	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
 	# $daemon pipe to `lei' closed, main loop begins:
 	PublicInbox::DS->EventLoop;
 	@$on_destroy = (); # cancel on_destroy if we get here
diff --git a/script/lei b/script/lei
index ceaf1e00..0457adfd 100755
--- a/script/lei
+++ b/script/lei
@@ -21,18 +21,13 @@ if (my ($sock, $pwd) = eval {
 	my $addr = pack_sockaddr_un($path);
 	socket(my $sock, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
 	unless (connect($sock, $addr)) { # start the daemon if not started
-		my $cmd = [ $^X, qw[-MPublicInbox::LEI
+		local $ENV{PERL5LIB} = join(':', @INC);
+		open(my $daemon, '-|', $^X, qw[-MPublicInbox::LEI
 			-E PublicInbox::LEI::lazy_start(@ARGV)],
-			$path, $! + 0 ];
-		my $env = { PERL5LIB => join(':', @INC) };
-		pipe(my ($daemon, $w)) or die "pipe: $!";
-		my $opt = { 1 => $w, 2 => $w };
-		require PublicInbox::Spawn;
-		my $pid = PublicInbox::Spawn::spawn($cmd, $env, $opt);
-		$opt = $w = undef;
+			$path, $! + 0) or die "popen: $!";
 		while (<$daemon>) { warn $_ } # EOF when STDERR is redirected
-		waitpid($pid, 0) or warn <<"";
-lei-daemon could not start, PID:$pid exited with \$?=$?
+		close($daemon) or warn <<"";
+lei-daemon could not start, exited with \$?=$?
 
 		# try connecting again anyways, unlink+bind may be racy
 		unless (connect($sock, $addr)) {
@@ -43,8 +38,8 @@ Falling back to (slow) one-shot mode
 		}
 	}
 	require Cwd;
-	my $cwd = Cwd::fastcwd() // die "fastcwd(PWD=".($ENV{PWD}//'').": $!";
 	my $pwd = $ENV{PWD} // '';
+	my $cwd = Cwd::fastcwd() // die "fastcwd(PWD=$pwd): $!";
 	if ($pwd ne $cwd) { # prefer ENV{PWD} if it's a symlink to real cwd
 		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!";
 		my @st_pwd = stat($pwd); # PWD invalid, use cwd

^ permalink raw reply related	[relevance 56%]

* [PATCH 2/4] t/lei: fix TEST_RUN_MODE=0, simplify oneshot fallback
  @ 2021-01-01  5:47 57% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-01  5:47 UTC (permalink / raw)
  To: meta

We need to use an absolute path after chdir in run modes
where scripts aren't loaded into in-memory subs.

The oneshot test was also failing under TEST_RUN_MODE=0 due to
no "lei-oneshot" command existing on the FS.  So we force a
socket failure by making XDG_RUNTIME_DIR too large to fit into
the 108-byte .sun_path field of "struct sockaddr_un".  This
even lets us simplify lei-oneshot significantly.
---
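A small sketch of why the long XDG_RUNTIME_DIR forces the
fallback.  The 108-byte limit is the Linux value and is hard-coded
here as an assumption; fits_sun_path is a hypothetical helper and
no socket is actually created.

```perl
use strict;
use warnings;

# Assumption: 108 bytes is the Linux value, other systems use less
my $sun_path_max = 108;

sub fits_sun_path {
	my ($path) = @_;
	# portable code leaves room for a trailing NUL byte
	length($path) < $sun_path_max;
}

my $ok = "/tmp/xdg_run/lei/sock";
my $too_long = "/tmp/IO::FDPass" . ('.sun_path' x 108) . "/lei/sock";
for my $p ($ok, $too_long) {
	printf("%4d bytes: %s\n", length($p),
		fits_sun_path($p) ? 'usable' : 'too long, expect fallback');
}
```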
 t/lei-oneshot.t | 17 -----------------
 t/lei.t         | 29 +++++++++++++++++------------
 2 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/t/lei-oneshot.t b/t/lei-oneshot.t
index 2b34f982..7688da5b 100644
--- a/t/lei-oneshot.t
+++ b/t/lei-oneshot.t
@@ -4,22 +4,5 @@
 use strict;
 use v5.10.1;
 use PublicInbox::TestCommon;
-$PublicInbox::TestCommon::cached_scripts{'lei-oneshot'} //= do {
-	eval <<'EOF';
-package LeiOneshot;
-use strict;
-use subs qw(exit);
-*exit = \&PublicInbox::TestCommon::run_script_exit;
-sub main {
-# the below "line" directive is a magic comment, see perlsyn(1) manpage
-# line 1 "lei-oneshot"
-	require PublicInbox::LEI;
-	PublicInbox::LEI::oneshot(__PACKAGE__);
-	0;
-}
-1;
-EOF
-	LeiOneshot->can('main');
-};
 local $ENV{TEST_LEI_ONESHOT} = '1';
 require './t/lei.t';
diff --git a/t/lei.t b/t/lei.t
index 690878ce..41638950 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -9,7 +9,6 @@ use PublicInbox::Config;
 use File::Path qw(rmtree);
 require_git 2.6;
 require_mods(qw(json DBD::SQLite Search::Xapian));
-my $LEI = 'lei';
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my $lei = sub {
 	my ($cmd, $env, $xopt) = @_;
@@ -18,13 +17,12 @@ my $lei = sub {
 		($env, $xopt) = grep { (!defined) || ref } @_;
 		$cmd = [ grep { defined && !ref } @_ ];
 	}
-	run_script([$LEI, @$cmd], $env, $xopt // $opt);
+	run_script(['lei', @$cmd], $env, $xopt // $opt);
 };
 
 my ($home, $for_destroy) = tmpdir();
 delete local $ENV{XDG_DATA_HOME};
 delete local $ENV{XDG_CONFIG_HOME};
-local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 local $ENV{HOME} = $home;
 local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
@@ -188,10 +186,16 @@ my $test_lei_common = sub {
 	$test_external->();
 };
 
-my $test_lei_oneshot = $ENV{TEST_LEI_ONESHOT};
-SKIP: {
-	last SKIP if $test_lei_oneshot;
+if ($ENV{TEST_LEI_ONESHOT}) {
+	require_ok 'PublicInbox::LEI';
+	# force sun_path[108] overflow, "IO::FDPass" avoids warning
+	local $ENV{XDG_RUNTIME_DIR} = "$home/IO::FDPass".('.sun_path' x 108);
+	$test_lei_common->();
+}
+
+SKIP: { # real socket
 	require_mods(qw(IO::FDPass Cwd), 46);
+	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
 
 	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
@@ -275,11 +279,17 @@ SKIP: {
 	if ('oneshot on cwd gone') {
 		my $cwd = Cwd::fastcwd() or BAIL_OUT "fastcwd: $!";
 		my $d = "$home/to-be-removed";
+		my $lei_path = 'lei';
+		# we chdir, so we need an abs_path for run_script
+		if (($ENV{TEST_RUN_MODE}//2) != 2) {
+			$lei_path = PublicInbox::TestCommon::key2script('lei');
+			$lei_path = Cwd::abs_path($lei_path);
+		}
 		mkdir $d or BAIL_OUT "mkdir($d) $!";
 		chdir $d or BAIL_OUT "chdir($d) $!";
 		if (rmdir($d)) {
 			$out = $err = '';
-			ok(run_script([qw(lei help)], undef, $opt),
+			ok(run_script([$lei_path, 'help'], undef, $opt),
 				'cwd fail, one-shot fallback works');
 		} else {
 			$err = "rmdir=$!";
@@ -296,11 +306,6 @@ SKIP: {
 	}
 	ok(!kill(0, $new_pid), 'daemon exits after unlink');
 	# success over socket, can't test without
-	$test_lei_common = undef;
 };
 
-require_ok 'PublicInbox::LEI';
-$LEI = 'lei-oneshot' if $test_lei_oneshot;
-$test_lei_common->() if $test_lei_common;
-
 done_testing;

^ permalink raw reply related	[relevance 57%]

* [PATCH 0/3] lei-related test fixes
@ 2021-01-03  9:48 70% Eric Wong
  2021-01-03  9:48 60% ` [PATCH 1/3] t/lei: use $lei->() callback wrapper Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-03  9:48 UTC (permalink / raw)
  To: meta

Still chasing down a weird problem which causes t/lei.t and
t/lei-oneshot.t to fail on FreeBSD 11.4 with IO::FDPass under
high load.   No syscall errors are reported, but it's like the
FDs aren't passed at all...  Maybe it's fixed in 12.x

1/3 is to cut down on noise

2/3 is a no-brainer :x

3/3 was for me to play around with, but also avoids malloc and
    a potential leak in IO::FDPass (upstream's been notified).
    However, I'm considering just making our C code pass all
    3 FDs with one syscall since it's possible.

In any case, the C parts of PublicInbox::Spawn should probably
be renamed PublicInbox::C...

Eric Wong (3):
  t/lei: use $lei->() callback wrapper
  testcommon: prepare_redirects: fix error message
  spawn: support send_fd+recv_fd w/o IO::FDPass

 lib/PublicInbox/LEI.pm        |  6 ++-
 lib/PublicInbox/Spawn.pm      | 78 ++++++++++++++++++++++++++++++--
 lib/PublicInbox/TestCommon.pm |  4 +-
 script/lei                    |  7 ++-
 t/lei.t                       | 84 ++++++++++++++++-------------------
 t/spawn.t                     | 18 ++++++++
 6 files changed, 141 insertions(+), 56 deletions(-)

^ permalink raw reply	[relevance 70%]

* [PATCH 1/3] t/lei: use $lei->() callback wrapper
  2021-01-03  9:48 70% [PATCH 0/3] lei-related test fixes Eric Wong
@ 2021-01-03  9:48 60% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-03  9:48 UTC (permalink / raw)
  To: meta

This shortens the test and should make it easier to debug and
add new tests.
---
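A rough sketch of the wrapper pattern the test relies on.  The
$run closure is a hypothetical stand-in for run_script(), and
resetting the capture buffers inside the wrapper is an assumption
based on the `$out = $err = ''` lines removed from callers below.

```perl
use strict;
use warnings;

my ($out, $err) = ('', '');
# hypothetical stand-in for run_script(): appends to the buffers
my $run = sub {
	my ($cmd, $env) = @_;
	$out .= "ran: @$cmd\n";
	$err .= 'env: ' . join(',', sort keys %$env) . "\n" if $env;
	1;
};
my $lei = sub {
	my ($cmd, $env) = @_;
	$out = $err = ''; # reset once here instead of in every caller
	if (!ref($cmd)) { # allow $lei->(qw(init -h), { VAR => 'x' })
		($env) = grep { ref } @_;
		$cmd = [ grep { defined && !ref } @_ ];
	}
	$run->([ 'lei', @$cmd ], $env);
};
$lei->(qw(init -h), { XDG_DATA_HOME => '/XDH' });
print $out, $err;
$lei->([ 'help' ]);
print $out;
```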
 t/lei.t | 78 ++++++++++++++++++++++++---------------------------------
 1 file changed, 33 insertions(+), 45 deletions(-)

diff --git a/t/lei.t b/t/lei.t
index 6f6a5888..541d83ce 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -35,37 +35,35 @@ my $config_file = "$home/.config/lei/config";
 my $store_dir = "$home/.local/share/lei";
 
 my $test_help = sub {
-	ok(!$lei->([], undef, $opt), 'no args fails');
+	ok(!$lei->(), 'no args fails');
 	is($? >> 8, 1, '$? is 1');
 	is($out, '', 'nothing in stdout');
 	like($err, qr/^usage:/sm, 'usage in stderr');
 
 	for my $arg (['-h'], ['--help'], ['help'], [qw(daemon-pid --help)]) {
-		$out = $err = '';
-		ok($lei->($arg, undef, $opt), "lei @$arg");
+		ok($lei->($arg), "lei @$arg");
 		like($out, qr/^usage:/sm, "usage in stdout (@$arg)");
 		is($err, '', "nothing in stderr (@$arg)");
 	}
 
 	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
-		$out = $err = '';
-		ok(!$lei->($arg, undef, $opt), "lei @$arg");
+		ok(!$lei->($arg), "lei @$arg");
 		is($? >> 8, 1, '$? set correctly');
 		isnt($err, '', 'something in stderr');
 		is($out, '', 'nothing in stdout');
 	}
-	ok($lei->(qw(init -h), undef, $opt), 'init -h');
+	ok($lei->(qw(init -h)), 'init -h');
 	like($out, qr! \Q$home\E/\.local/share/lei/store\b!,
 		'actual path shown in init -h');
-	ok($lei->(qw(init -h), { XDG_DATA_HOME => '/XDH' }, $opt),
+	ok($lei->(qw(init -h), { XDG_DATA_HOME => '/XDH' }),
 		'init with XDG_DATA_HOME');
 	like($out, qr! /XDH/lei/store\b!, 'XDG_DATA_HOME in init -h');
 	is($err, '', 'no errors from init -h');
 
-	ok($lei->(qw(config -h), undef, $opt), 'config-h');
+	ok($lei->(qw(config -h)), 'config-h');
 	like($out, qr! \Q$home\E/\.config/lei/config\b!,
 		'actual path shown in config -h');
-	ok($lei->(qw(config -h), { XDG_CONFIG_HOME => '/XDC' }, $opt),
+	ok($lei->(qw(config -h), { XDG_CONFIG_HOME => '/XDC' }),
 		'config with XDG_CONFIG_HOME');
 	like($out, qr! /XDC/lei/config\b!, 'XDG_CONFIG_HOME in config -h');
 	is($err, '', 'no errors from config -h');
@@ -75,31 +73,28 @@ my $ok_err_info = sub {
 	my ($msg) = @_;
 	is(grep(!/^I:/, split(/^/, $err)), 0, $msg) or
 		diag "$msg: err=$err";
-	$err = '';
 };
 
 my $test_init = sub {
 	$cleanup->();
-	ok($lei->(['init'], undef, $opt), 'init w/o args');
+	ok($lei->('init'), 'init w/o args');
 	$ok_err_info->('after init w/o args');
-	ok($lei->(['init'], undef, $opt), 'idempotent init w/o args');
+	ok($lei->('init'), 'idempotent init w/o args');
 	$ok_err_info->('after idempotent init w/o args');
 
-	ok(!$lei->(['init', "$home/x"], undef, $opt),
-		'init conflict');
+	ok(!$lei->('init', "$home/x"), 'init conflict');
 	is(grep(/^E:/, split(/^/, $err)), 1, 'got error on conflict');
 	ok(!-e "$home/x", 'nothing created on conflict');
 	$cleanup->();
 
-	ok($lei->(['init', "$home/x"], undef, $opt), 'init conflict resolved');
+	ok($lei->('init', "$home/x"), 'init conflict resolved');
 	$ok_err_info->('init w/ arg');
-	ok($lei->(['init', "$home/x"], undef, $opt), 'init idempotent w/ path');
+	ok($lei->('init', "$home/x"), 'init idempotent w/ path');
 	$ok_err_info->('init idempotent w/ arg');
 	ok(-d "$home/x", 'created dir');
 	$cleanup->("$home/x");
 
-	ok(!$lei->(['init', "$home/x", "$home/2" ], undef, $opt),
-		'too many args fails');
+	ok(!$lei->('init', "$home/x", "$home/2"), 'too many args fails');
 	like($err, qr/too many/, 'noted excessive');
 	ok(!-e "$home/x", 'x not created on excessive');
 	for my $d (@$home_trash) {
@@ -111,12 +106,12 @@ my $test_init = sub {
 
 my $test_config = sub {
 	$cleanup->();
-	ok($lei->([qw(config a.b c)], undef, $opt), 'config set var');
+	ok($lei->(qw(config a.b c)), 'config set var');
 	is($out.$err, '', 'no output on var set');
-	ok($lei->([qw(config -l)], undef, $opt), 'config -l');
+	ok($lei->(qw(config -l)), 'config -l');
 	is($err, '', 'no errors on listing');
 	is($out, "a.b=c\n", 'got expected output');
-	ok(!$lei->([qw(config -f), "$home/.config/f", qw(x.y z)], undef, $opt),
+	ok(!$lei->(qw(config -f), "$home/.config/f", qw(x.y z)),
 			'config set var with -f fails');
 	like($err, qr/not supported/, 'not supported noted');
 	ok(!-f "$home/config/f", 'no file created');
@@ -201,7 +196,7 @@ SKIP: { # real socket
 	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
 
-	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	ok($lei->('daemon-pid'), 'daemon-pid');
 	is($err, '', 'no error from daemon-pid');
 	like($out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
 	chomp(my $pid = $out);
@@ -210,42 +205,39 @@ SKIP: { # real socket
 
 	$test_lei_common->();
 
-	$out = '';
-	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	ok($lei->('daemon-pid'), 'daemon-pid');
 	chomp(my $pid_again = $out);
 	is($pid, $pid_again, 'daemon-pid idempotent');
 
-	$out = '';
-	ok(run_script([qw(lei daemon-env -0)], undef, $opt), 'show env');
+	ok($lei->(qw(daemon-env -0)), 'show env');
 	is($err, '', 'no errors in env dump');
 	my @env = split(/\0/, $out);
 	is(scalar grep(/\AHOME=\Q$home\E\z/, @env), 1, 'env has HOME');
 	is(scalar grep(/\AFOO=BAR\z/, @env), 1, 'env has FOO=BAR');
 	is(scalar grep(/\AXDG_RUNTIME_DIR=/, @env), 1, 'has XDG_RUNTIME_DIR');
 
-	$out = '';
-	ok(run_script([qw(lei daemon-env -u FOO)], undef, $opt), 'unset');
+	ok($lei->(qw(daemon-env -u FOO)), 'unset');
 	is($out.$err, '', 'no output for unset');
-	ok(run_script([qw(lei daemon-env -0)], undef, $opt), 'show again');
+	ok($lei->(qw(daemon-env -0)), 'show again');
 	is($err, '', 'no errors in env dump');
 	@env = split(/\0/, $out);
 	is(scalar grep(/\AFOO=BAR\z/, @env), 0, 'env unset FOO');
 
-	$out = '';
-	ok(run_script([qw(lei daemon-env -u FOO -u HOME -u XDG_RUNTIME_DIR)],
-			undef, $opt), 'unset multiple');
+	ok($lei->(qw(daemon-env -u FOO -u HOME -u XDG_RUNTIME_DIR)),
+			'unset multiple');
 	is($out.$err, '', 'no errors output for unset');
-	ok(run_script([qw(lei daemon-env -0)], undef, $opt), 'show again');
+
+	ok($lei->(qw(daemon-env -0)), 'show again');
 	is($err, '', 'no errors in env dump');
 	@env = split(/\0/, $out);
 	is(scalar grep(/\A(?:HOME|XDG_RUNTIME_DIR)=\z/, @env), 0, 'env unset@');
-	$out = '';
-	ok(run_script([qw(lei daemon-env -)], undef, $opt), 'clear env');
+
+	ok($lei->(qw(daemon-env -)), 'clear env');
 	is($out.$err, '', 'no output');
-	ok(run_script([qw(lei daemon-env)], undef, $opt), 'env is empty');
+	ok($lei->(qw(daemon-env)), 'env is empty');
 	is($out, '', 'env cleared');
 
-	ok(run_script([qw(lei daemon-kill)], undef, $opt), 'daemon-kill');
+	ok($lei->(qw(daemon-kill)), 'daemon-kill');
 	is($out, '', 'no output from daemon-kill');
 	is($err, '', 'no error from daemon-kill');
 	for (0..100) {
@@ -255,26 +247,22 @@ SKIP: { # real socket
 	ok(!-S $sock, 'sock gone');
 	ok(!kill(0, $pid), 'pid gone after stop');
 
-	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	ok($lei->(qw(daemon-pid)), 'daemon-pid');
 	chomp(my $new_pid = $out);
 	ok(kill(0, $new_pid), 'new pid is running');
 	ok(-S $sock, 'sock exists again');
 
-	$out = $err = '';
 	for my $sig (qw(-0 -CHLD)) {
-		ok(run_script([qw(lei daemon-kill), $sig ], undef, $opt),
-					"handles $sig");
+		ok($lei->('daemon-kill', $sig), "handles $sig");
 	}
 	is($out.$err, '', 'no output on innocuous signals');
-	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid');
+	ok($lei->('daemon-pid'), 'daemon-pid');
 	chomp $out;
 	is($out, $new_pid, 'PID unchanged after -0/-CHLD');
 
 	if ('socket inaccessible') {
 		chmod 0000, $sock or BAIL_OUT "chmod 0000: $!";
-		$out = $err = '';
-		ok(run_script([qw(lei help)], undef, $opt),
-			'connect fail, one-shot fallback works');
+		ok($lei->('help'), 'connect fail, one-shot fallback works');
 		like($err, qr/\bconnect\(/, 'connect error noted');
 		like($out, qr/^usage: /, 'help output works');
 		chmod 0700, $sock or BAIL_OUT "chmod 0700: $!";

^ permalink raw reply related	[relevance 60%]

* [PATCH 2/2] lei: fix output race in client/daemon mode
  @ 2021-01-03 11:24 70% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-03 11:24 UTC (permalink / raw)
  To: meta

The daemon needs to flush stdout before disconnecting or killing
clients, otherwise they may reread empty data on redirected
outputs.  We also don't want to unbuffer stdout too early in
case we have lots of small chunks of data to output.

The received stderr ($self->{2}) will always have autoflush,
matching normal STDERR behavior.
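
To illustrate the buffering behavior described above, here is a minimal
sketch; it is not the daemon's actual code.  It assumes a regular file
opened through the default buffered PerlIO layer, and write_then_flush
is a hypothetical helper name:

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: print a small chunk, then report the on-disk
# size before and after enabling autoflush on the handle.
sub write_then_flush {
	my ($fh, $path, $data) = @_;
	print $fh $data or die "print: $!";
	# usually 0 here, the data still sits in the PerlIO buffer
	my $before = (stat($path))[7];
	$fh->autoflush(1); # setting $| also flushes pending data right away
	my $after = (stat($path))[7];
	($before, $after);
}

my ($fh, $path) = tempfile(UNLINK => 1);
my ($before, $after) = write_then_flush($fh, $path, "hello\n");
print "before=$before after=$after\n";
```

Enabling autoflush late keeps many small writes cheap while still
guaranteeing the data is visible to a client that rereads the file.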
---
 lib/PublicInbox/LEI.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3ad5e01a..6f21da35 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -236,6 +236,7 @@ my %CONFIG_KEYS = (
 
 sub x_it ($$) { # pronounced "exit"
 	my ($self, $code) = @_;
+	$self->{1}->autoflush(1); # make sure client sees stdout before exit
 	if (my $sig = ($code & 127)) {
 		kill($sig, $self->{pid} // $$);
 	} else {
@@ -635,6 +636,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 		say $sock "timed out waiting to recv FDs";
 		return;
 	}
+	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	# $ARGV_STR = join("]\0[", @ARGV);
 	# $ENV_STR = join('', map { "$_=$ENV{$_}\0" } keys %ENV);
 	# $line = "$$\0\0>$ARGV_STR\0\0>$ENV_STR\0\0";
@@ -773,4 +775,8 @@ sub oneshot {
 	}, __PACKAGE__), @ARGV);
 }
 
+# ensures stdout hits the FS before sock disconnects so a client
+# can immediately reread it
+sub DESTROY { $_[0]->{1}->autoflush(1) }
+
 1;

^ permalink raw reply related	[relevance 70%]

* [PATCH] lei: prefer IO::FDPass over our Inline::C recv_3fds
@ 2021-01-03 20:58 52% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-03 20:58 UTC (permalink / raw)
  To: meta

While our recv_3fds() implementation is more efficient
syscall-wise, loading Inline takes nearly 50ms on my machine
even after Inline::C memoizes the build.  The current ~20ms in
the fast path is barely acceptable to me, and 50ms would be
unusable.

Eventually, script/lei may invoke tcc(1) or cc(1) directly in
the fast path, but it needs @INC for the slow path, at least.

We'll encode the number of FDs into the socket name to allow
parallel installations, for now.
---
 lib/PublicInbox/LEI.pm   | 12 +++++++++---
 lib/PublicInbox/Spawn.pm | 11 -----------
 script/lei               | 18 ++++++++++++------
 t/lei.t                  | 10 ++++++----
 4 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 6f21da35..f41f63ed 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -660,7 +660,7 @@ sub noop {}
 
 # lei(1) calls this when it can't connect
 sub lazy_start {
-	my ($path, $errno) = @_;
+	my ($path, $errno, $nfd) = @_;
 	if ($errno == ECONNREFUSED) {
 		unlink($path) or die "unlink($path): $!";
 	} elsif ($errno != ENOENT) {
@@ -675,8 +675,14 @@ sub lazy_start {
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
 	my $oldset = PublicInbox::Sigfd::block_signals();
-	$recv_3fds = PublicInbox::Spawn->can('recv_3fds') or die
-		"Inline::C not installed/configured or IO::FDPass missing\n";
+	if ($nfd == 1) {
+		require IO::FDPass;
+		$recv_3fds = sub { map { IO::FDPass::recv($_[0]) } (0..2) };
+	} elsif ($nfd == 3) {
+		$recv_3fds = PublicInbox::Spawn->can('recv_3fds');
+	}
+	$recv_3fds or die
+		"IO::FDPass missing or Inline::C not installed/configured\n";
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
 	(-p STDOUT) or die "E: stdout must be a pipe\n";
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 61e95433..cd94ba96 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -315,17 +315,6 @@ unless ($set_nodatacow) {
 	*nodatacow_fd = \&PublicInbox::NDC_PP::nodatacow_fd;
 	*nodatacow_dir = \&PublicInbox::NDC_PP::nodatacow_dir;
 }
-unless (__PACKAGE__->can('recv_3fds')) {
-	eval { # try the XS IO::FDPass package
-		require IO::FDPass;
-		no warnings 'once';
-		*recv_3fds = sub { map { IO::FDPass::recv($_[0]) } (0..2) };
-		*send_3fds = sub ($$$$) {
-			my $sockfd = shift;
-			IO::FDPass::send($sockfd, shift) for (0..2);
-		};
-	};
-}
 
 undef $set_nodatacow;
 undef $vfork_spawn;
diff --git a/script/lei b/script/lei
index 029881f8..2ea98da4 100755
--- a/script/lei
+++ b/script/lei
@@ -4,11 +4,17 @@
 use strict;
 use v5.10.1;
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
-my $send_3fds;
+my ($send_3fds, $nfd);
 if (my ($sock, $pwd) = eval {
-	require PublicInbox::Spawn;
-	$send_3fds = PublicInbox::Spawn->can('send_3fds') or die
-		"Inline::C not installed/configured or IO::FDPass missing\n";
+	$send_3fds = eval {
+		require IO::FDPass;
+		$nfd = 1; # 1 FD per-sendmsg
+		sub { IO::FDPass::send($_[0], $_[$_]) for (1..3) }
+	} // do {
+		require PublicInbox::Spawn; # takes ~50ms even if built *sigh*
+		$nfd = 3; # 3 FDs per-sendmsg(2)
+		PublicInbox::Spawn->can('send_3fds');
+	} // die "IO::FDPass missing or Inline::C not installed/configured\n";
 	my $path = do {
 		my $runtime_dir = ($ENV{XDG_RUNTIME_DIR} // '') . '/lei';
 		if ($runtime_dir eq '/lei') {
@@ -19,7 +25,7 @@ if (my ($sock, $pwd) = eval {
 			require File::Path;
 			File::Path::mkpath($runtime_dir, 0, 0700);
 		}
-		"$runtime_dir/sock";
+		"$runtime_dir/$nfd.sock";
 	};
 	my $addr = pack_sockaddr_un($path);
 	socket(my $sock, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
@@ -27,7 +33,7 @@ if (my ($sock, $pwd) = eval {
 		local $ENV{PERL5LIB} = join(':', @INC);
 		open(my $daemon, '-|', $^X, qw[-MPublicInbox::LEI
 			-E PublicInbox::LEI::lazy_start(@ARGV)],
-			$path, $! + 0) or die "popen: $!";
+			$path, $! + 0, $nfd) or die "popen: $!";
 		while (<$daemon>) { warn $_ } # EOF when STDERR is redirected
 		close($daemon) or warn <<"";
 lei-daemon could not start, exited with \$?=$?
diff --git a/t/lei.t b/t/lei.t
index 42c0eb8f..5afb8351 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -193,12 +193,14 @@ if ($ENV{TEST_LEI_ONESHOT}) {
 
 SKIP: { # real socket
 	require_mods(qw(Cwd), my $nr = 46);
-	require PublicInbox::Spawn;
-	skip "Inline::C not installed/configured or IO::FDPass missing", $nr
-		unless PublicInbox::Spawn->can('send_3fds');
+	my $nfd = eval { require IO::FDPass; 1 } // do {
+		require PublicInbox::Spawn;
+		PublicInbox::Spawn->can('send_3fds') ? 3 : undef;
+	} //
+	skip 'IO::FDPass missing or Inline::C not installed/configured', $nr;
 
 	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
-	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/sock";
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/$nfd.sock";
 
 	ok($lei->('daemon-pid'), 'daemon-pid');
 	is($err, '', 'no error from daemon-pid');

^ permalink raw reply related	[relevance 52%]

* [PATCH 0/2] lei: some usage bits
@ 2021-01-04  4:16 71% Eric Wong
  2021-01-04  4:16 67% ` [PATCH 1/2] lei: fix opt_dash to pass non-dash args to @argv Eric Wong
  2021-01-04  4:16 71% ` [PATCH 2/2] lei: improve idempotent "init" error message Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-01-04  4:16 UTC (permalink / raw)
  To: meta

Still trying to wrap my head around xsearch but my head hurts :<

Eric Wong (2):
  lei: fix opt_dash to pass non-dash args to @argv
  lei: improve idempotent "init" error message

 lib/PublicInbox/LEI.pm | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 1/2] lei: fix opt_dash to pass non-dash args to @argv
  2021-01-04  4:16 71% [PATCH 0/2] lei: some usage bits Eric Wong
@ 2021-01-04  4:16 67% ` Eric Wong
  2021-01-04  4:16 71% ` [PATCH 2/2] lei: improve idempotent "init" error message Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-01-04  4:16 UTC (permalink / raw)
  To: meta

The special "<>" handling in Getopt::Long actually invokes the
callback for every single command-line arg, not just those
prefixed by "-".  This will let us pass arbitrary non-dashed
words for search queries so users can type queries naturally
without quoting (unless they want phrase search).
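
A rough sketch of the "<>" behavior, assuming the "permute" and
"pass_through" configuration (lei's real configuration may differ).
parse_query_args is a hypothetical name, loosely modeled on opt_dash:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: "-NUM" sets the limit, plain words are
# collected, and any other dashed argument is rejected.
sub parse_query_args {
	my @argv = @_;
	my ($limit, @words);
	my $cb = sub { # Getopt::Long "<>" catch-all handler
		my ($arg) = @_;
		if ($arg =~ /\A-([0-9]+)\z/) {
			$limit = $1;
		} elsif (substr($arg, 0, 1) eq '-') {
			die "bad argument: $arg\n";
		} else {
			push @words, "$arg";
		}
	};
	Getopt::Long::Configure(qw(permute pass_through));
	GetOptionsFromArray(\@argv, 'limit|n=i' => \$limit, '<>' => $cb)
		or return;
	($limit, \@words);
}

my ($limit, $words) = parse_query_args(qw(-5 hello world));
print "limit=$limit words=@$words\n";
```

Since the callback sees every unrecognized argument, it has to sort
plain words from unknown dashed options itself.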
---
 lib/PublicInbox/LEI.pm | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f41f63ed..50453dde 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -38,18 +38,27 @@ our %PATH2CFG; # persistent for socket daemon
 sub pass_through { $GLP_PASS }
 
 my $OPT;
-sub opt_dash {
+sub opt_dash ($$) {
 	my ($spec, $re_str) = @_; # 'limit|n=i', '([0-9]+)'
 	my ($key) = ($spec =~ m/\A([a-z]+)/g);
 	my $cb = sub { # Getopt::Long "<>" catch-all handler
 		my ($arg) = @_;
 		if ($arg =~ /\A-($re_str)\z/) {
 			$OPT->{$key} = $1;
+		} elsif ($arg eq '--') { # "--" arg separator, ignore first
+			push @{$OPT->{-argv}}, $arg if $OPT->{'--'}++;
+		# lone (single) dash is handled elsewhere
+		} elsif (substr($arg, 0, 1) eq '-') {
+			if ($OPT->{'--'}) {
+				push @{$OPT->{-argv}}, $arg;
+			} else {
+				die "bad argument: $arg\n";
+			}
 		} else {
-			die "bad argument for --$key: $arg\n";
+			push @{$OPT->{-argv}}, $arg;
 		}
 	};
-	($spec, '<>' => $cb, $GLP_PASS)
+	($spec, '<>' => $cb, $GLP_PASS) # for Getopt::Long
 }
 
 sub _store_path ($) {
@@ -360,6 +369,8 @@ sub optparse ($$$) {
 		return _help($self, "bad arguments or options for $cmd");
 	return _help($self) if $OPT->{help};
 
+	push @$argv, @{$OPT->{-argv}} if defined($OPT->{-argv});
+
 	# "-" aliases "stdin" or "clear"
 	$OPT->{$lone_dash} = ${$OPT->{$lone_dash}} if defined $lone_dash;
 

^ permalink raw reply related	[relevance 67%]

* [PATCH 2/2] lei: improve idempotent "init" error message
  2021-01-04  4:16 71% [PATCH 0/2] lei: some usage bits Eric Wong
  2021-01-04  4:16 67% ` [PATCH 1/2] lei: fix opt_dash to pass non-dash args to @argv Eric Wong
@ 2021-01-04  4:16 71% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-01-04  4:16 UTC (permalink / raw)
  To: meta

Showing "leistore.dir= already initialized" because $cur is
undefined isn't useful.
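
A small sketch of the fallback, assuming Unix-style paths;
shown_store_dir is a hypothetical helper and not part of LEI.pm:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: pick the path to show in the "already
# initialized" message, falling back to $dir when $cur is undef.
sub shown_store_dir {
	my ($cur, $dir) = @_;
	File::Spec->canonpath($cur // $dir);
}

print 'I: leistore.dir=', shown_store_dir(undef, '/home/u//lei/store/'),
	" already initialized\n";
```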
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 50453dde..9a3b1ee3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -505,7 +505,7 @@ sub lei_init {
 	$dir //= _store_path($env);
 	$dir = File::Spec->rel2abs($dir, $env->{PWD}); # PWD is symlink-aware
 	my @cur = stat($cur) if defined($cur);
-	$cur = File::Spec->canonpath($cur) if $cur;
+	$cur = File::Spec->canonpath($cur // $dir);
 	my @dir = stat($dir);
 	my $exists = "I: leistore.dir=$cur already initialized" if @dir;
 	if (@cur) {

^ permalink raw reply related	[relevance 71%]

* [PATCH 1/4] lei: completion: fix filename completion
  2021-01-05  9:04 71% [PATCH 0/4] more lei usability stuff Eric Wong
@ 2021-01-05  9:04 71% ` Eric Wong
  2021-01-05  9:04 65% ` [PATCH 2/4] lei: automatic pager support Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-05  9:04 UTC (permalink / raw)
  To: meta

"-o default" is what we want from "complete"; "-o filenames" just
tells readline that the result from the "_lei" function might be a
filename and to quote it appropriately.
---
 contrib/completion/lei-completion.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/lei-completion.bash b/contrib/completion/lei-completion.bash
index 5f47433b..0b82b109 100644
--- a/contrib/completion/lei-completion.bash
+++ b/contrib/completion/lei-completion.bash
@@ -8,4 +8,4 @@ _lei() {
 			-- "${COMP_WORDS[COMP_CWORD]}"))
 	return 0
 }
-complete -o filenames -o bashdefault -F _lei lei
+complete -o default -o bashdefault -F _lei lei

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/4] more lei usability stuff
@ 2021-01-05  9:04 71% Eric Wong
  2021-01-05  9:04 71% ` [PATCH 1/4] lei: completion: fix filename completion Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-01-05  9:04 UTC (permalink / raw)
  To: meta

Eric Wong (4):
  lei: completion: fix filename completion
  lei: automatic pager support
  lei: use client env as-is, drop daemon-env command
  address: pairs: new helper for JMAP (and maybe lei)

 contrib/completion/lei-completion.bash |  2 +-
 lib/PublicInbox/Address.pm             | 11 ++++-
 lib/PublicInbox/AddressPP.pm           | 21 ++++++++
 lib/PublicInbox/LEI.pm                 | 68 +++++++++++++++-----------
 t/address.t                            | 33 ++++++++++---
 t/lei.t                                | 30 +-----------
 6 files changed, 100 insertions(+), 65 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 2/4] lei: automatic pager support
  2021-01-05  9:04 71% [PATCH 0/4] more lei usability stuff Eric Wong
  2021-01-05  9:04 71% ` [PATCH 1/4] lei: completion: fix filename completion Eric Wong
@ 2021-01-05  9:04 65% ` Eric Wong
  2021-01-05  9:04 51% ` [PATCH 3/4] lei: use client env as-is, drop daemon-env command Eric Wong
  2021-01-05  9:04 50% ` [PATCH 4/4] address: pairs: new helper for JMAP (and maybe lei) Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-05  9:04 UTC (permalink / raw)
  To: meta

Just like git, we'll start a pager when outputting to a terminal
for user-friendliness when reading many messages.
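
The decision can be sketched roughly as follows.  pager_setup is a
hypothetical helper, and the pager name is assumed to have already
been obtained (e.g. from `git var GIT_PAGER'):

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether to page and fill in
# pager-friendly defaults without clobbering what the user already set.
sub pager_setup {
	my ($pager, $env, $is_tty) = @_;
	return unless $is_tty; # only page when writing to a terminal
	return if !defined($pager) || $pager eq '' || $pager eq 'cat';
	$env->{LESS} //= 'FRX'; # the default git also uses for less(1)
	$env->{LV} //= '-c';
	$pager;
}

my %env = (LESS => 'R');
my $pager = pager_setup('less', \%env, -t STDOUT);
print defined($pager) ? "would run $pager\n" : "no pager\n";
```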
---
 lib/PublicInbox/LEI.pm | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9a3b1ee3..6073a713 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -19,7 +19,7 @@ use PublicInbox::Config;
 use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLONESHOT);
 use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now dwaitpid);
-use PublicInbox::Spawn qw(spawn run_die);
+use PublicInbox::Spawn qw(spawn run_die popen_rd);
 use PublicInbox::OnDestroy;
 use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
@@ -619,6 +619,26 @@ sub lei_git { # support passing through random git commands
 	dwaitpid($pid, \&reap_exec, $self);
 }
 
+# caller needs to "-t $self->{1}" to check if tty
+sub start_pager {
+	my ($self) = @_;
+	my $env = $self->{env};
+	my $fh = popen_rd([qw(git var GIT_PAGER)], $env);
+	chomp(my $pager = <$fh> // '');
+	close($fh) or warn "`git var PAGER' error: \$?=$?";
+	return if $pager eq 'cat' || $pager eq '';
+	$env->{LESS} //= 'FRX';
+	$env->{LV} //= '-c';
+	$env->{COLUMNS} //= 80; # TODO TIOCGWINSZ
+	$env->{MORE} //= 'FRX' if $^O eq 'freebsd';
+	pipe(my ($r, $w)) or return warn "pipe: $!";
+	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
+	$self->{1} = $w;
+	$self->{2} = $w if -t $self->{2};
+	$self->{'pager.pid'} = spawn([$pager], $env, $rdr);
+	$env->{GIT_PAGER_IN_USE} = 'true'; # we may spawn git
+}
+
 sub accept_dispatch { # Listener {post_accept} callback
 	my ($sock) = @_; # ignore other
 	$sock->blocking(1);
@@ -794,6 +814,12 @@ sub oneshot {
 
 # ensures stdout hits the FS before sock disconnects so a client
 # can immediately reread it
-sub DESTROY { $_[0]->{1}->autoflush(1) }
+sub DESTROY {
+	my ($self) = @_;
+	$self->{1}->autoflush(1);
+	if (my $pid = delete $self->{'pager.pid'}) {
+		dwaitpid($pid, undef, $self->{sock});
+	}
+}
 
 1;

^ permalink raw reply related	[relevance 65%]

* [PATCH 3/4] lei: use client env as-is, drop daemon-env command
  2021-01-05  9:04 71% [PATCH 0/4] more lei usability stuff Eric Wong
  2021-01-05  9:04 71% ` [PATCH 1/4] lei: completion: fix filename completion Eric Wong
  2021-01-05  9:04 65% ` [PATCH 2/4] lei: automatic pager support Eric Wong
@ 2021-01-05  9:04 51% ` Eric Wong
  2021-01-05  9:04 50% ` [PATCH 4/4] address: pairs: new helper for JMAP (and maybe lei) Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-05  9:04 UTC (permalink / raw)
  To: meta

There may be subtle misbehaviours when mixing the existing
daemon env and the client-supplied env.  Just do the simplest
thing and use the client env as-is.

We'll also start the ->event_step callback since we'll need
to remember some things for long-lived commands.
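
A minimal sketch of using the client env as-is, assuming the client
sends NUL-terminated NAME=VALUE entries.  Both helper names below are
hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: decode "NAME=VALUE\0NAME=VALUE\0..." as sent by
# the client; the split limit of 2 keeps any '=' inside a value intact.
sub parse_env_buf {
	my ($buf) = @_;
	my %env = map { split(/=/, $_, 2) } split(/\0/, $buf);
	\%env;
}

# Hypothetical helper: run a callback with only the client's env.
sub run_with_client_env {
	my ($env, $cb) = @_;
	local %ENV = %$env; # the daemon's own %ENV returns on scope exit
	$cb->();
}

my $env = parse_env_buf("HOME=/home/u\0FOO=a=b\0");
print run_with_client_env($env, sub { "FOO=$ENV{FOO}\n" });
```

Because `local' restores the daemon's %ENV on scope exit, one client's
variables should not leak into the next request.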
---
 lib/PublicInbox/LEI.pm | 38 ++++++++++++--------------------------
 t/lei.t                | 30 +-----------------------------
 2 files changed, 13 insertions(+), 55 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 6073a713..9c3308ad 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -149,8 +149,6 @@ our %CMD = ( # sorted in order of importance/use:
 'daemon-kill' => [ '[-SIGNAL]', 'signal the lei-daemon',
 	opt_dash('signal|s=s', '[0-9]+|(?:[A-Z][A-Z0-9]+)') ],
 'daemon-pid' => [ '', 'show the PID of the lei-daemon' ],
-'daemon-env' => [ '[NAME=VALUE...]', 'set, unset, or show daemon environment',
-	qw(clear| unset|u=s@ z|0) ],
 'help' => [ '[SUBCOMMAND]', 'show help' ],
 
 # XXX do we need this?
@@ -230,12 +228,6 @@ my %OPTDESC = (
 # xargs, env, use "-0", git(1) uses "-z".  We support z|0 everywhere
 'z|0' => 'use NUL \\0 instead of newline (CR) to delimit lines',
 
-# note: no "--ignore-environment" / "-i" support like env(1) since that
-# is one-shot and this is for a persistent daemon:
-'clear|' => 'clear the daemon environment',
-'unset|u=s@' => ['NAME',
-	'unset matching NAME, may be specified multiple times'],
-
 'signal|s=s' => [ 'SIG', 'signal to send lei-daemon (default: TERM)' ],
 ); # %OPTDESC
 
@@ -538,24 +530,6 @@ sub lei_daemon_kill {
 	kill($sig, $$) or fail($self, "kill($sig, $$): $!");
 }
 
-sub lei_daemon_env {
-	my ($self, @argv) = @_;
-	my $opt = $self->{opt};
-	if (defined $opt->{clear}) {
-		%ENV = ();
-	} elsif (my $u = $opt->{unset}) {
-		delete @ENV{@$u};
-	}
-	if (@argv) {
-		%ENV = (%ENV, map { split(/=/, $_, 2) } @argv);
-	} elsif (!defined($opt->{clear}) && !$opt->{unset}) {
-		my $eor = $opt->{z} ? "\0" : "\n";
-		my $buf = '';
-		while (my ($k, $v) = each %ENV) { $buf .= "$k=$v$eor" }
-		out $self, $buf;
-	}
-}
-
 sub lei_help { _help($_[0]) }
 
 # Shell completion helper.  Used by lei-completion.bash and hopefully
@@ -678,6 +652,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 	};
 	my %env = map { split(/=/, $_, 2) } split(/\0/, $env);
 	if (chdir($env{PWD})) {
+		local %ENV = %env;
 		$self->{env} = \%env;
 		$self->{pid} = $client_pid;
 		eval { dispatch($self, split(/\]\0\[/, $argv)) };
@@ -687,6 +662,17 @@ sub accept_dispatch { # Listener {post_accept} callback
 	}
 }
 
+# for long-running results
+sub event_step {
+	my ($self) = @_;
+	local %ENV = %{$self->{env}};
+	eval {}; # TODO
+	if ($@) {
+		say { $self->{sock} } $@;
+		$self->close; # PublicInbox::DS::close
+	}
+}
+
 sub noop {}
 
 # lei(1) calls this when it can't connect
diff --git a/t/lei.t b/t/lei.t
index 5afb8351..6d47e307 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -192,7 +192,7 @@ if ($ENV{TEST_LEI_ONESHOT}) {
 }
 
 SKIP: { # real socket
-	require_mods(qw(Cwd), my $nr = 46);
+	require_mods(qw(Cwd), my $nr = 105);
 	my $nfd = eval { require IO::FDPass; 1 } // do {
 		require PublicInbox::Spawn;
 		PublicInbox::Spawn->can('send_3fds') ? 3 : undef;
@@ -215,34 +215,6 @@ SKIP: { # real socket
 	chomp(my $pid_again = $out);
 	is($pid, $pid_again, 'daemon-pid idempotent');
 
-	ok($lei->(qw(daemon-env -0)), 'show env');
-	is($err, '', 'no errors in env dump');
-	my @env = split(/\0/, $out);
-	is(scalar grep(/\AHOME=\Q$home\E\z/, @env), 1, 'env has HOME');
-	is(scalar grep(/\AFOO=BAR\z/, @env), 1, 'env has FOO=BAR');
-	is(scalar grep(/\AXDG_RUNTIME_DIR=/, @env), 1, 'has XDG_RUNTIME_DIR');
-
-	ok($lei->(qw(daemon-env -u FOO)), 'unset');
-	is($out.$err, '', 'no output for unset');
-	ok($lei->(qw(daemon-env -0)), 'show again');
-	is($err, '', 'no errors in env dump');
-	@env = split(/\0/, $out);
-	is(scalar grep(/\AFOO=BAR\z/, @env), 0, 'env unset FOO');
-
-	ok($lei->(qw(daemon-env -u FOO -u HOME -u XDG_RUNTIME_DIR)),
-			'unset multiple');
-	is($out.$err, '', 'no errors output for unset');
-
-	ok($lei->(qw(daemon-env -0)), 'show again');
-	is($err, '', 'no errors in env dump');
-	@env = split(/\0/, $out);
-	is(scalar grep(/\A(?:HOME|XDG_RUNTIME_DIR)=\z/, @env), 0, 'env unset@');
-
-	ok($lei->(qw(daemon-env -)), 'clear env');
-	is($out.$err, '', 'no output');
-	ok($lei->(qw(daemon-env)), 'env is empty');
-	is($out, '', 'env cleared');
-
 	ok($lei->(qw(daemon-kill)), 'daemon-kill');
 	is($out, '', 'no output from daemon-kill');
 	is($err, '', 'no error from daemon-kill');

^ permalink raw reply related	[relevance 51%]

* [PATCH 4/4] address: pairs: new helper for JMAP (and maybe lei)
  2021-01-05  9:04 71% [PATCH 0/4] more lei usability stuff Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-05  9:04 51% ` [PATCH 3/4] lei: use client env as-is, drop daemon-env command Eric Wong
@ 2021-01-05  9:04 50% ` Eric Wong
  2021-01-05  9:24 70%   ` JSON pretty-printing [was: [4/4] ... (and maybe lei)] Eric Wong
  3 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-05  9:04 UTC (permalink / raw)
  To: meta

Per JMAP RFC 8621 sec 4.1.2.3, we should be able to
denote the lack of a phrase/comment corresponding to an
email address with a JSON "null" (or Perl `undef').

  [
    { "name": "James Smythe", "email": "james@example.com" },
    { "name": null, "email": "jane@example.com" },
    { "name": "John Smith", "email": "john@example.com" }
  ]

The new "pairs" method just returns a 2 dimensional array
and the consumer will fill in the field names if necessary
(or not).

lei(1) may use the two dimensional array as-is for JSON output.
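
For consumers which want the named fields, a sketch of filling them
in might look like the following.  pairs_to_jmap is a hypothetical
helper, and JSON::PP is used only because it ships with Perl:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical consumer: turn [ [ name, email ], ... ] into the
# JMAP-style list of { name, email } objects.
sub pairs_to_jmap {
	my ($pairs) = @_;
	[ map { +{ name => $_->[0], email => $_->[1] } } @$pairs ];
}

my $pairs = [ [ 'User', 'e@example.com' ], [ undef, 'e@example.org' ] ];
my $json = JSON::PP->new->canonical(1); # sorted keys for stable output
print $json->encode(pairs_to_jmap($pairs)), "\n";
```

An undef name is encoded as a JSON null, matching the RFC example.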
---
 lib/PublicInbox/Address.pm   | 11 ++++++++++-
 lib/PublicInbox/AddressPP.pm | 21 +++++++++++++++++++++
 t/address.t                  | 33 +++++++++++++++++++++++++++------
 3 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/Address.pm b/lib/PublicInbox/Address.pm
index f5af4c23..a090fa43 100644
--- a/lib/PublicInbox/Address.pm
+++ b/lib/PublicInbox/Address.pm
@@ -2,7 +2,9 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 package PublicInbox::Address;
 use strict;
-use warnings;
+use v5.10.1;
+use parent 'Exporter';
+our @EXPORT_OK = qw(pairs);
 
 sub xs_emails {
 	grep { defined } map { $_->address() } parse_email_addresses($_[0])
@@ -17,11 +19,18 @@ sub xs_names {
 	} parse_email_addresses($_[0]);
 }
 
+sub xs_pairs { # for JMAP, RFC 8621 section 4.1.2.3
+	[ map { # LHS (name) may be undef
+		[ $_->phrase // $_->comment, $_->address ]
+	} parse_email_addresses($_[0]) ];
+}
+
 eval {
 	require Email::Address::XS;
 	Email::Address::XS->import(qw(parse_email_addresses));
 	*emails = \&xs_emails;
 	*names = \&xs_names;
+	*pairs = \&xs_pairs;
 };
 
 if ($@) {
diff --git a/lib/PublicInbox/AddressPP.pm b/lib/PublicInbox/AddressPP.pm
index c04de74b..6a3ae4fe 100644
--- a/lib/PublicInbox/AddressPP.pm
+++ b/lib/PublicInbox/AddressPP.pm
@@ -13,6 +13,7 @@ sub emails {
 }
 
 sub names {
+	# split by address and post-address comment
 	my @p = split(/<?([^@<>]+)\@[\w\.\-]+>?\s*(\(.*?\))?(?:,\s*|\z)/,
 			$_[0]);
 	my @ret;
@@ -35,4 +36,24 @@ sub names {
 	@ret;
 }
 
+sub pairs { # for JMAP, RFC 8621 section 4.1.2.3
+	my ($s) = @_;
+	[ map {
+		my $addr = $_;
+		if ($s =~ s/\A\s*(.*?)\s*<\Q$addr\E>\s*(.*?)\s*(?:,|\z)// ||
+		    $s =~ s/\A\s*(.*?)\s*\Q$addr\E\s*(.*?)\s*(?:,|\z)//) {
+			my ($phrase, $comment) = ($1, $2);
+			$phrase =~ tr/\r\n\t / /s;
+			$phrase =~ s/\A['"\s]*//;
+			$phrase =~ s/['"\s]*\z//;
+			$phrase =~ s/\s*<*\s*\z//;
+			$phrase = undef if $phrase !~ /\S/;
+			$comment = ($comment =~ /\((.*?)\)/) ? $1 : undef;
+			[ $phrase // $comment, $addr ]
+		} else {
+			();
+		}
+	} emails($s) ];
+}
+
 1;
diff --git a/t/address.t b/t/address.t
index 0adcf46d..6aa94628 100644
--- a/t/address.t
+++ b/t/address.t
@@ -7,26 +7,40 @@ use_ok 'PublicInbox::Address';
 
 sub test_pkg {
 	my ($pkg) = @_;
-	my $emails = \&{"${pkg}::emails"};
-	my $names = \&{"${pkg}::names"};
+	my $emails = $pkg->can('emails');
+	my $names = $pkg->can('names');
+	my $pairs = $pkg->can('pairs');
 
 	is_deeply([qw(e@example.com e@example.org)],
 		[$emails->('User <e@example.com>, e@example.org')],
 		'address extraction works as expected');
 
+	is_deeply($pairs->('User <e@example.com>, e@example.org'),
+			[[qw(User e@example.com)], [undef, 'e@example.org']],
+		"pair extraction works ($pkg)");
+
 	is_deeply(['user@example.com'],
 		[$emails->('<user@example.com (Comment)>')],
 		'comment after domain accepted before >');
+	is_deeply($pairs->('<user@example.com (Comment)>'),
+		[[qw(Comment user@example.com)]], "comment as name ($pkg)");
 
-	my @names = $names->(
-		'User <e@e>, e@e, "John A. Doe" <j@d>, <x@x>, <y@x> (xyz), '.
-		'U Ser <u@x> (do not use)');
+	my $s = 'User <e@e>, e@e, "John A. Doe" <j@d>, <x@x>, <y@x> (xyz), '.
+		'U Ser <u@x> (do not use)';
+	my @names = $names->($s);
 	is_deeply(\@names, ['User', 'e', 'John A. Doe', 'x', 'xyz', 'U Ser'],
 		'name extraction works as expected');
+	is_deeply($pairs->($s), [ [ 'User', 'e@e' ], [ undef, 'e@e' ],
+			[ 'John A. Doe', 'j@d' ], [ undef, 'x@x' ],
+			[ 'xyz', 'y@x' ], [ 'U Ser', 'u@x' ] ],
+		"pairs extraction works for $pkg");
 
 	@names = $names->('"user@example.com" <user@example.com>');
 	is_deeply(['user'], \@names,
 		'address-as-name extraction works as expected');
+	is_deeply($pairs->('"user@example.com" <user@example.com>'),
+		[ [ 'user@example.com', 'user@example.com' ] ],
+		"pairs for $pkg");
 
 	{
 		my $backwards = 'u@example.com (John Q. Public)';
@@ -34,10 +48,17 @@ sub test_pkg {
 		is_deeply(\@names, ['John Q. Public'], 'backwards name OK');
 		my @emails = $emails->($backwards);
 		is_deeply(\@emails, ['u@example.com'], 'backwards emails OK');
+
+		is_deeply($pairs->($backwards),
+			[ [ 'John Q. Public', 'u@example.com' ] ],
+			"backwards pairs $pkg");
 	}
 
-	@names = $names->('"Quote Unneeded" <user@example.com>');
+	$s = '"Quote Unneeded" <user@example.com>';
+	@names = $names->($s);
 	is_deeply(['Quote Unneeded'], \@names, 'extra quotes dropped');
+	is_deeply($pairs->($s), [ [ 'Quote Unneeded', 'user@example.com' ] ],
+		"extra quotes dropped in pairs $pkg");
 
 	my @emails = $emails->('Local User <user>');
 	is_deeply([], \@emails , 'no address for local address');

^ permalink raw reply related	[relevance 50%]

* JSON pretty-printing [was: [4/4] ... (and maybe lei)]
  2021-01-05  9:04 50% ` [PATCH 4/4] address: pairs: new helper for JMAP (and maybe lei) Eric Wong
@ 2021-01-05  9:24 70%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-05  9:24 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
>   [
>     { "name": "James Smythe", "email": "james@example.com" },
>     { "name": null, "email": "jane@example.com" },
>     { "name": "John Smith", "email": "john@example.com" }
>   ]

I might make JSON the default "lei q" output to the terminal or
pager when no mbox/Maildir/IMAP destination is specified.

Unfortunately, attempting JSON pretty-printing seems to add
excessive vertical white space and I can't easily match the
formatting above.  jq(1) can't seem to do what I want, either.

With the Perl JSON modules, the array of 3 hash tables would
either be all on one-line (not human friendly), or have
excessive vertical space:

   [
      {
         "email" : "james@example.com",
         "name" : "James Smythe"
      },
      {
         "email" : "jane@example.com",
         "name" : null
      },
      {
         "email" : "john@example.com",
         "name" : "John Smith"
      }
   ]

So maybe I'll bypass some of the structural/indentation stuff
and only rely on the JSON modules to properly quote strings.

I'm already doing that when outputting search results (similar
to "git log"), just on a per-smsg level to avoid building giant
arrays with 1000K search results in memory all at once.

JSON requiring keys (not just values) to be quoted also annoys
me a bit as being extra visual noise.
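
A minimal sketch of the "only rely on the JSON modules to properly
quote strings" idea, assuming core JSON::PP stands in for whichever
JSON module is actually loaded.  The helper name compact_array is
hypothetical and not part of public-inbox; the array brackets, commas
and newlines are written by hand, and the module only encodes each
record:

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper (not part of public-inbox): emit an array of
# records with one compact object per line.  The JSON module is only
# asked to encode each record; the structure of the enclosing array
# is produced manually.
sub compact_array {
	my (@records) = @_;
	my $json = JSON::PP->new->canonical(1); # sorted keys, no indentation
	return "[]\n" unless @records;
	"[\n" . join(",\n", map { '  ' . $json->encode($_) } @records) . "\n]\n";
}

print compact_array(
	{ name => 'James Smythe', email => 'james@example.com' },
	{ name => undef, email => 'jane@example.com' },
	{ name => 'John Smith', email => 'john@example.com' },
);
```

This gives one line per record without the extra vertical space, though
keys come out sorted and without the spacing of the hand-written
example quoted above.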

^ permalink raw reply	[relevance 70%]

* [PATCH 00/22] lei query overview views
@ 2021-01-10 12:14 57% Eric Wong
  2021-01-10 12:14 29% ` [PATCH 01/22] lei query + pagination sorta working Eric Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:14 UTC (permalink / raw)
  To: meta

Usage summary:

	lei add-external /path/to/v1-or-v2-inbox
	lei add-external /path/to/another-inbox-or-ext-index
			# URLs aren't supported, yet :<

	lei q SEARCH TERMS GO HERE... # pager should open with JSON output

For faster startup time than what Inline::C can give:

	apt-get install libsocket-msghdr-perl # Socket::MsgHdr

Having neither Inline::C nor Socket::MsgHdr means parallel
queries won't work.

I went back-and-forth on a bunch of things but ultimately gave
up trying to support IO::FDPass since it got too fragile and
difficult to test with the work-queue distribution.

The pager runs from the client process (if using Socket::MsgHdr
or Inline::C), now.  It took a fair amount of work from my slow
brain to get pager shutdown to be instantaneous, though queries
which haven't output anything aren't easily interruptible...

The wq_* IPC stuff will be reused in the normal read-only
WWW/IMAP search at some point, too.

Eric Wong (22):
  lei query + pagination sorta working
  lei q: deduplicate smsg
  ds: block signals when reaping
  ipc: add support for asynchronous callbacks
  cmd_ipc: send FDs with buffer payload
  ipc: avoid excessive evals
  ipc: work queue support via SOCK_SEQPACKET
  ipc: eliminate ipc_worker_stop method
  ipc: wq: support dynamic worker count change
  ipc: drop -ipc_parent_pid field
  ipc: DESTROY and wq_workers methods
  lei: rename $w to $wpager for warning message
  lei: fix oneshot TTY detection by passing STD*{GLOB}
  lei: query: ensure pager exit is instantaneous
  ipc: start supporting sending/receiving more than 3 FDs
  ipc: fix IO::FDPass use with a worker limit of 1
  ipc: drop unused fields, default sighandlers for wq
  lei: get rid of client {pid} field
  lei: fork + FD cleanup
  lei: run pager in client script
  lei_xsearch: transfer 4 FDs internally, drop IO::FDPass
  lei: query: restore JSON output overview

 MANIFEST                        |   4 +
 lib/PublicInbox/CmdIPC4.pm      |  36 ++++
 lib/PublicInbox/DS.pm           |  16 +-
 lib/PublicInbox/Daemon.pm       |  10 +-
 lib/PublicInbox/ExtSearchIdx.pm |   4 +-
 lib/PublicInbox/IPC.pm          | 280 ++++++++++++++++++++++++++++----
 lib/PublicInbox/LEI.pm          | 180 +++++++++++++-------
 lib/PublicInbox/LeiDedupe.pm    |  29 +++-
 lib/PublicInbox/LeiExternal.pm  |  33 ++--
 lib/PublicInbox/LeiOverview.pm  | 188 +++++++++++++++++++++
 lib/PublicInbox/LeiQuery.pm     |  92 +++++++++++
 lib/PublicInbox/LeiStore.pm     |   2 +-
 lib/PublicInbox/LeiToMail.pm    |   2 +
 lib/PublicInbox/LeiXSearch.pm   | 118 +++++++++++++-
 lib/PublicInbox/Search.pm       |  10 +-
 lib/PublicInbox/SearchView.pm   |  10 +-
 lib/PublicInbox/Sigfd.pm        |  12 +-
 lib/PublicInbox/Spawn.pm        |  85 ++++++----
 lib/PublicInbox/Watch.pm        |   8 +-
 script/lei                      |  76 +++++----
 script/public-inbox-watch       |   4 +-
 t/cmd_ipc.t                     |  82 ++++++++++
 t/ipc.t                         | 115 ++++++++++++-
 t/lei.t                         |  31 +++-
 t/lei_dedupe.t                  |  14 ++
 t/lei_xsearch.t                 |   5 +
 t/spawn.t                       |  33 +---
 27 files changed, 1233 insertions(+), 246 deletions(-)
 create mode 100644 lib/PublicInbox/CmdIPC4.pm
 create mode 100644 lib/PublicInbox/LeiOverview.pm
 create mode 100644 lib/PublicInbox/LeiQuery.pm
 create mode 100644 t/cmd_ipc.t

^ permalink raw reply	[relevance 57%]

* [PATCH 02/22] lei q: deduplicate smsg
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
  2021-01-10 12:14 29% ` [PATCH 01/22] lei query + pagination sorta working Eric Wong
@ 2021-01-10 12:14 50% ` Eric Wong
  2021-01-10 12:15 71% ` [PATCH 12/22] lei: rename $w to $wpager for warning message Eric Wong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:14 UTC (permalink / raw)
  To: meta

We don't want duplicate messages in results overviews, either.
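
A rough stand-alone sketch of overview-level deduplication, assuming an
in-memory hash in place of the shared key-value store used by the
patch.  The names overview_digest and make_seen_cb are hypothetical,
and treating undefined fields as empty strings is an assumption of
this sketch:

```perl
use strict;
use warnings;
use Digest::SHA ();

# Fields mirror the ones hashed in the patch; undefined fields are
# treated as empty strings here, which is an assumption of this sketch.
my @FIELDS = qw(from to cc ds subject references mid);

sub overview_digest {
	my ($smsg) = @_;
	my $x = join("\0", map { $smsg->{$_} // '' } @FIELDS);
	utf8::encode($x); # Digest::SHA wants bytes, not wide characters
	Digest::SHA::sha256($x); # raw 32-byte digest
}

# In-memory stand-in for the shared key-value store: the returned
# callback is true the first time a digest is seen, false afterwards.
sub make_seen_cb {
	my %seen;
	sub { !$seen{overview_digest($_[0])}++ };
}

my $first_time = make_seen_cb();
my $smsg = { from => 'a@example.com', subject => 'hi', mid => 'x@y', ds => 1 };
print $first_time->($smsg) ? "new\n" : "dup\n";
print $first_time->($smsg) ? "new\n" : "dup\n";
```

The callback follows the same convention as the closures in the patch:
a true return means "not seen before", so the caller negates it to ask
"is this a duplicate?".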
---
 lib/PublicInbox/LeiDedupe.pm | 29 ++++++++++++++++++++++++++++-
 lib/PublicInbox/LeiQuery.pm  |  5 +++++
 t/lei_dedupe.t               | 14 ++++++++++++++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index c4e5dffb..58eee533 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -33,12 +33,24 @@ sub _regen_oid ($) {
 
 sub _oidbin ($) { defined($_[0]) ? pack('H*', $_[0]) : undef }
 
+sub smsg_hash ($) {
+	my ($smsg) = @_;
+	my $dig = Digest::SHA->new(256);
+	my $x = join("\0", @$smsg{qw(from to cc ds subject references mid)});
+	utf8::encode($x);
+	$dig->add($x);
+	$dig->digest;
+}
+
 # the paranoid option
 sub dedupe_oid () {
 	my $skv = PublicInbox::SharedKV->new;
 	($skv, sub { # may be called in a child process
 		my ($eml, $oid) = @_;
 		$skv->set_maybe(_oidbin($oid) // _regen_oid($eml), '');
+	}, sub {
+		my ($smsg) = @_;
+		$skv->set_maybe(_oidbin($smsg->{blob}), '');
 	});
 }
 
@@ -51,6 +63,12 @@ sub dedupe_mid () {
 		my $mid = $eml->header_raw('Message-ID') // _oidbin($oid) //
 			content_hash($eml);
 		$skv->set_maybe($mid, '');
+	}, sub {
+		my ($smsg) = @_;
+		my $mid = $smsg->{mid};
+		$mid = undef if $mid eq '';
+		$mid //= smsg_hash($smsg) // _oidbin($smsg->{blob});
+		$skv->set_maybe($mid, '');
 	});
 }
 
@@ -60,11 +78,15 @@ sub dedupe_content () {
 	($skv, sub { # may be called in a child process
 		my ($eml) = @_; # oid = $_[1], ignored
 		$skv->set_maybe(content_hash($eml), '');
+	}, sub {
+		my ($smsg) = @_;
+		$skv->set_maybe(smsg_hash($smsg), '');
 	});
 }
 
 # no deduplication at all
-sub dedupe_none () { (undef, sub { 1 }) }
+sub true { 1 }
+sub dedupe_none () { (undef, \&true, \&true) }
 
 sub new {
 	my ($cls, $lei, $dst) = @_;
@@ -85,6 +107,11 @@ sub is_dup {
 	!$self->[1]->($eml, $oid);
 }
 
+sub is_smsg_dup {
+	my ($self, $smsg) = @_;
+	!$self->[2]->($smsg);
+}
+
 sub prepare_dedupe {
 	my ($self) = @_;
 	my $skv = $self->[0];
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index d14da1bc..f69dccad 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -69,6 +69,8 @@ sub lei_q {
 	} @argv);
 	$opt->{limit} //= 10000;
 	my $lxs;
+	require PublicInbox::LeiDedupe;
+	my $dd = PublicInbox::LeiDedupe->new($self);
 
 	# --local is enabled by default
 	my @src = $opt->{'local'} ? ($sto->search) : ();
@@ -135,6 +137,7 @@ sub lei_q {
 		delete @$smsg{qw(tid num)}; # only makes sense if single src
 		chomp($buf = $json->encode(_smsg_unbless($smsg)));
 	};
+	$dd->prepare_dedupe;
 	for my $src (@src) {
 		my $srch = $src->search;
 		my $over = $src->over;
@@ -145,6 +148,7 @@ sub lei_q {
 		if ($smsg_for) {
 			for my $it ($mset->items) {
 				my $smsg = $smsg_for->($srch, $it) or next;
+				next if $dd->is_smsg_dup($smsg);
 				$self->out($buf .= $ORS) if defined $buf;
 				$smsg->{relevance} = get_pct($it);
 				$emit_cb->($smsg);
@@ -160,6 +164,7 @@ sub lei_q {
 			while ($over && $over->expand_thread($ctx)) {
 				for my $n (@{$ctx->{xids}}) {
 					my $t = $over->get_art($n) or next;
+					next if $dd->is_smsg_dup($t);
 					if (my $p = delete $n2p{$t->{num}}) {
 						$t->{relevance} = $p;
 					}
diff --git a/t/lei_dedupe.t b/t/lei_dedupe.t
index b5e2b8f9..6e971b9b 100644
--- a/t/lei_dedupe.t
+++ b/t/lei_dedupe.t
@@ -6,12 +6,16 @@ use v5.10.1;
 use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Eml;
+use PublicInbox::Smsg;
 require_mods(qw(DBD::SQLite));
 use_ok 'PublicInbox::LeiDedupe';
 my $eml = eml_load('t/plack-qp.eml');
 my $mid = $eml->header_raw('Message-ID');
 my $different = eml_load('t/msg_iter-order.eml');
 $different->header_set('Message-ID', $mid);
+my $smsg = bless { ds => time }, 'PublicInbox::Smsg';
+$smsg->populate($eml);
+$smsg->{$_} //= '' for (qw(to cc references)) ;
 
 my $lei = { opt => { dedupe => 'none' } };
 my $dd = PublicInbox::LeiDedupe->new($lei);
@@ -19,6 +23,8 @@ $dd->prepare_dedupe;
 ok(!$dd->is_dup($eml), '1st is_dup w/o dedupe');
 ok(!$dd->is_dup($eml), '2nd is_dup w/o dedupe');
 ok(!$dd->is_dup($different), 'different is_dup w/o dedupe');
+ok(!$dd->is_smsg_dup($smsg), 'smsg dedupe none 1');
+ok(!$dd->is_smsg_dup($smsg), 'smsg dedupe none 2');
 
 for my $strat (undef, 'content') {
 	$lei->{opt}->{dedupe} = $strat;
@@ -28,6 +34,8 @@ for my $strat (undef, 'content') {
 	ok(!$dd->is_dup($eml), "1st is_dup with $desc dedupe");
 	ok($dd->is_dup($eml), "2nd seen with $desc dedupe");
 	ok(!$dd->is_dup($different), "different is_dup with $desc dedupe");
+	ok(!$dd->is_smsg_dup($smsg), "is_smsg_dup pass w/ $desc dedupe");
+	ok($dd->is_smsg_dup($smsg), "is_smsg_dup reject w/ $desc dedupe");
 }
 $lei->{opt}->{dedupe} = 'bogus';
 eval { PublicInbox::LeiDedupe->new($lei) };
@@ -39,6 +47,8 @@ $dd->prepare_dedupe;
 ok(!$dd->is_dup($eml), '1st is_dup with mid dedupe');
 ok($dd->is_dup($eml), '2nd seen with mid dedupe');
 ok($dd->is_dup($different), 'different seen with mid dedupe');
+ok(!$dd->is_smsg_dup($smsg), 'smsg mid dedupe pass');
+ok($dd->is_smsg_dup($smsg), 'smsg mid dedupe reject');
 
 $lei->{opt}->{dedupe} = 'oid';
 $dd = PublicInbox::LeiDedupe->new($lei);
@@ -56,4 +66,8 @@ ok($dd->is_dup($different, '01d'), 'different content ignored if oid matches');
 ok($dd->is_dup($eml, '01D'), 'case insensitive oid comparison :P');
 ok(!$dd->is_dup($eml, '01dbad'), 'case insensitive oid comparison :P');
 
+$smsg->{blob} = 'dead';
+ok(!$dd->is_smsg_dup($smsg), 'smsg dedupe pass');
+ok($dd->is_smsg_dup($smsg), 'smsg dedupe reject');
+
 done_testing;

^ permalink raw reply related	[relevance 50%]

* [PATCH 01/22] lei query + pagination sorta working
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
@ 2021-01-10 12:14 29% ` Eric Wong
  2021-01-10 12:14 50% ` [PATCH 02/22] lei q: deduplicate smsg Eric Wong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:14 UTC (permalink / raw)
  To: meta

Parallelism and interactivity with pager + SIGPIPE need work,
but results are shown and phrase search works without shell
users having to apply Xapian quoting rules on top of standard
shell quoting.
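
For illustration, a stand-alone sketch of the argv handling described
above.  The function name argv_to_qstr is hypothetical; it copies each
argument instead of modifying it in place, but otherwise follows the
same rule: an argument containing whitespace becomes a quoted phrase,
and a leading "prefix:" stays outside the quotes:

```perl
use strict;
use warnings;

# Illustrative stand-alone version of the argv handling: any argument
# containing whitespace becomes a quoted phrase, and a leading
# "prefix:" is kept outside the quotes.
sub argv_to_qstr {
	join(' ', map {
		my $arg = $_; # copy, so the caller's arguments are not modified
		$arg !~ /\s/ ? $arg
			: $arg =~ s/\A(\w+:)// ? qq{$1"$arg"} : qq{"$arg"};
	} @_);
}

print argv_to_qstr('s:use boolean prefix'), "\n"; # s:"use boolean prefix"
print argv_to_qstr('hello', 'two words'), "\n";   # hello "two words"
```
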
---
 MANIFEST                       |   1 +
 lib/PublicInbox/LEI.pm         |  12 +--
 lib/PublicInbox/LeiExternal.pm |  33 ++++---
 lib/PublicInbox/LeiQuery.pm    | 176 +++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiStore.pm    |   2 +-
 lib/PublicInbox/LeiToMail.pm   |   2 +
 lib/PublicInbox/LeiXSearch.pm  |  22 ++++-
 lib/PublicInbox/Search.pm      |  10 +-
 lib/PublicInbox/SearchView.pm  |  10 +-
 t/lei.t                        |  11 ++-
 t/lei_xsearch.t                |   5 +
 11 files changed, 250 insertions(+), 34 deletions(-)
 create mode 100644 lib/PublicInbox/LeiQuery.pm

diff --git a/MANIFEST b/MANIFEST
index 6dc08f01..609160dd 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -165,6 +165,7 @@ lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
+lib/PublicInbox/LeiQuery.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
 lib/PublicInbox/LeiToMail.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9c3308ad..a5658e6d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -8,7 +8,8 @@
 package PublicInbox::LEI;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::DS PublicInbox::LeiExternal);
+use parent qw(PublicInbox::DS PublicInbox::LeiExternal
+	PublicInbox::LeiQuery);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 use Errno qw(EAGAIN ECONNREFUSED ENOENT);
@@ -80,7 +81,7 @@ sub _config_path ($) {
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
-	sort|s=s@ reverse|r offset=i remote local! external!
+	sort|s=s reverse|r offset=i remote local! external! pretty
 	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -202,8 +203,9 @@ my %OPTDESC = (
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
-'sort|s=s@' => [ 'VAL|internaldate,date,relevance,docid',
+'sort|s=s' => [ 'VAL|received,relevance,docid',
 		"order of results `--output'-dependent"],
+'reverse|r' => [ 'reverse search results' ], # like sort(1)
 
 'boost=i' => 'increase/decrease priority of results (default: 0)',
 
@@ -469,10 +471,6 @@ sub lei_show {
 	my ($self, @argv) = @_;
 }
 
-sub lei_query {
-	my ($self, @argv) = @_;
-}
-
 sub lei_mark {
 	my ($self, @argv) = @_;
 }
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 4facd451..64faf5a0 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -8,24 +8,35 @@ use v5.10.1;
 use parent qw(Exporter);
 our @EXPORT = qw(lei_ls_external lei_add_external lei_forget_external);
 
-sub lei_ls_external {
-	my ($self, @argv) = @_;
-	my $stor = $self->_lei_store(0);
+sub _externals_each {
+	my ($self, $cb, @arg) = @_;
 	my $cfg = $self->_lei_cfg(0);
-	my $out = $self->{1};
-	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
-	my (%boost, @loc);
+	my %boost;
 	for my $sec (grep(/\Aexternal\./, @{$cfg->{-section_order}})) {
 		my $loc = substr($sec, length('external.'));
 		$boost{$loc} = $cfg->{"$sec.boost"};
-		push @loc, $loc;
 	}
-	use sort 'stable';
+	return \%boost if !wantarray && !$cb;
+
 	# highest boost first, but stable for alphabetic tie break
-	for (sort { $boost{$b} <=> $boost{$a} } sort keys %boost) {
-		# TODO: use miscidx and show docid so forget/set is easier
-		print $out $_, $OFS, 'boost=', $boost{$_}, $ORS;
+	use sort 'stable';
+	my @order = sort { $boost{$b} <=> $boost{$a} } sort keys %boost;
+	return @order if !$cb;
+	for my $loc (@order) {
+		$cb->(@arg, $loc, $boost{$loc});
 	}
+	@order; # scalar or array
+}
+
+sub lei_ls_external {
+	my ($self, @argv) = @_;
+	my $stor = $self->_lei_store(0);
+	my $out = $self->{1};
+	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
+	$self->_externals_each(sub {
+		my ($loc, $boost_val) = @_;
+		print $out $loc, $OFS, 'boost=', $boost_val, $ORS;
+	});
 }
 
 sub lei_add_external {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
new file mode 100644
index 00000000..d14da1bc
--- /dev/null
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -0,0 +1,176 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# handles lei <q|ls-query|rm-query|mv-query> commands
+package PublicInbox::LeiQuery;
+use strict;
+use v5.10.1;
+use PublicInbox::MID qw($MID_EXTRACT);
+use POSIX qw(strftime);
+use PublicInbox::Address qw(pairs);
+use PublicInbox::Search qw(get_pct);
+
+sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
+
+# prepares an smsg for JSON
+sub _smsg_unbless ($) {
+	my ($smsg) = @_;
+
+	delete @$smsg{qw(lines bytes)};
+	$smsg->{rcvd} = _iso8601(delete $smsg->{ts}); # JMAP receivedAt
+	$smsg->{dt} = _iso8601(delete $smsg->{ds}); # JMAP UTCDate
+
+	if (my $r = delete $smsg->{references}) {
+		$smsg->{references} = [
+				map { "<$_>" } ($r =~ m/$MID_EXTRACT/go) ];
+	}
+	if (my $m = delete($smsg->{mid})) {
+		$smsg->{'m'} = "<$m>";
+	}
+	# XXX breaking to/cc into structured arrays or tables which
+	# distinguish "$phrase <$address>" causes pretty-printed JSON
+	# to take up too much vertical space.  I can't get
+	# Cpanel::JSON::XS, JSON::XS, or jq(1) to indent only when
+	# wrapping is necessary, rather than blindly indenting and
+	# adding vertical space everywhere.
+	for my $f (qw(from to cc)) {
+		my $v = delete $smsg->{$f} or next;
+		$smsg->{substr($f, 0, 1)} = $v;
+	}
+	$smsg->{'s'} = delete $smsg->{subject};
+	# can we be bothered to parse From/To/Cc into arrays?
+	scalar { %$smsg }; # unbless
+}
+
+sub _vivify_external { # _externals_each callback
+	my ($src, $dir) = @_;
+	if (-f "$dir/ei.lock") {
+		require PublicInbox::ExtSearch;
+		push @$src, PublicInbox::ExtSearch->new($dir);
+	} elsif (-f "$dir/inbox.lock" || -d "$dir/public-inbox") { # v2, v1
+		require PublicInbox::Inbox;
+		push @$src, bless { inboxdir => $dir }, 'PublicInbox::Inbox';
+	} else {
+		warn "W: ignoring $dir, unable to determine type\n";
+	}
+}
+
+# the main "lei q SEARCH_TERMS" method
+sub lei_q {
+	my ($self, @argv) = @_;
+	my $sto = $self->_lei_store(1);
+	my $cfg = $self->_lei_cfg(1);
+	my $opt = $self->{opt};
+	my $qstr = join(' ', map {;
+		# Consider spaces in argv to be for phrase search in Xapian.
+		# In other words, the users should need only care about
+		# normal shell quotes and not have to learn Xapian quoting.
+		/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
+	} @argv);
+	$opt->{limit} //= 10000;
+	my $lxs;
+
+	# --local is enabled by default
+	my @src = $opt->{'local'} ? ($sto->search) : ();
+
+	# --external is enabled by default, but allow --no-external
+	if ($opt->{external} // 1) {
+		$self->_externals_each(\&_vivify_external, \@src);
+		# {tid} is not unique between indices, so we have to search
+		# each src individually
+		if (!$opt->{thread}) {
+			require PublicInbox::LeiXSearch;
+			my $lxs = PublicInbox::LeiXSearch->new;
+			# local is always first
+			$lxs->attach_external($_) for @src;
+			@src = ($lxs);
+		}
+	}
+	my $out = $self->{output} // '-';
+	$out = 'json:/dev/stdout' if $out eq '-';
+	my $isatty = -t $self->{1};
+	$self->start_pager if $isatty;
+	my $json = substr($out, 0, 5) eq 'json:' ?
+		ref(PublicInbox::Config->json)->new : undef;
+	if ($json) {
+		if ($opt->{pretty} //= $isatty) {
+			$json->pretty(1)->space_before(0);
+			$json->indent_length($opt->{indent} // 2);
+		}
+		$json->utf8; # avoid Wide character in print warnings
+		$json->ascii(1) if $opt->{ascii}; # for "\uXXXX"
+		$json->canonical;
+	}
+
+	# src: LeiXSearch || LeiSearch || Inbox
+	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
+	delete $mset_opt{limit} if $opt->{limit} < 0;
+	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
+	if (defined(my $sort = $opt->{'sort'})) {
+		if ($sort eq 'relevance') {
+			$mset_opt{relevance} = 1;
+		} elsif ($sort eq 'docid') {
+			$mset_opt{relevance} = $mset_opt{asc} ? -1 : -2;
+		} elsif ($sort =~ /\Areceived(?:-?[aA]t)?\z/) {
+			# the default
+		} else {
+			die "unrecognized --sort=$sort\n";
+		}
+	}
+	# $self->out($json->encode(\%mset_opt));
+	# descending docid order
+	$mset_opt{relevance} //= -2 if $opt->{thread};
+	# my $wcb = PublicInbox::LeiToMail->write_cb($out, $self);
+
+	# even w/o pretty, do the equivalent of a --pretty=oneline
+	# output so "lei q SEARCH_TERMS | wc -l" can be useful:
+	my $ORS = $json ? ($opt->{pretty} ? ', ' : ",\n") : "\n";
+	my $buf;
+
+	# we can generate too many records to hold in RAM, so we stream
+	# and fake a JSON array starting here:
+	$self->out('[') if $json;
+	my $emit_cb = sub {
+		my ($smsg) = @_;
+		delete @$smsg{qw(tid num)}; # only makes sense if single src
+		chomp($buf = $json->encode(_smsg_unbless($smsg)));
+	};
+	for my $src (@src) {
+		my $srch = $src->search;
+		my $over = $src->over;
+		my $smsg_for = $src->can('smsg_for'); # LeiXSearch
+		my $mo = { %mset_opt };
+		my $mset = $srch->mset($qstr, $mo);
+		my $ctx = {};
+		if ($smsg_for) {
+			for my $it ($mset->items) {
+				my $smsg = $smsg_for->($srch, $it) or next;
+				$self->out($buf .= $ORS) if defined $buf;
+				$smsg->{relevance} = get_pct($it);
+				$emit_cb->($smsg);
+			}
+		} else { # --thread
+			my $ids = $srch->mset_to_artnums($mset, $mo);
+			$ctx->{ids} = $ids;
+			my $i = 0;
+			my %n2p = map {
+				($ids->[$i++], get_pct($_));
+			} $mset->items;
+			undef $mset;
+			while ($over && $over->expand_thread($ctx)) {
+				for my $n (@{$ctx->{xids}}) {
+					my $t = $over->get_art($n) or next;
+					if (my $p = delete $n2p{$t->{num}}) {
+						$t->{relevance} = $p;
+					}
+					$self->out($buf .= $ORS);
+					$emit_cb->($t);
+				}
+				@{$ctx->{xids}} = ();
+			}
+		}
+	}
+	$self->out($buf .= "]\n"); # done
+}
+
+1;
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 7cda7e44..a7d7d953 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -23,7 +23,7 @@ sub new {
 	my (undef, $dir, $opt) = @_;
 	my $eidx = PublicInbox::ExtSearchIdx->new($dir, $opt);
 	my $self = bless { priv_eidx => $eidx }, __PACKAGE__;
-	eidx_init($self) if $opt->{creat};
+	eidx_init($self)->done if $opt->{creat};
 	$self;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 851c015b..4c65dce2 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -354,6 +354,8 @@ sub write_cb { # returns a callback for git_to_mail
 		_mbox_write_cb($cls, $1, $dst, $lei);
 	} elsif ($dst =~ s!\A[Mm]aildir:!!) { # typically capitalized
 		_maildir_write_cb($dst, $lei);
+	} else {
+		undef;
 	}
 	# TODO: Maildir, MH, IMAP, JMAP ...
 }
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 33e9c413..b670bc2f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -20,9 +20,16 @@ sub new {
 
 sub attach_external {
 	my ($self, $ibxish) = @_; # ibxish = ExtSearch or Inbox
-	if (!$ibxish->can('over')) {
-		push @{$self->{remotes}}, $ibxish
+
+	if (!$ibxish->can('over') || !$ibxish->over) {
+		return push(@{$self->{remotes}}, $ibxish)
 	}
+	my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
+	my $srch = $ibxish->search or
+		return warn("$desc not indexed for Xapian\n");
+	my @shards = $srch->xdb_shards_flat or
+		return warn("$desc has no Xapian shards\n");
+
 	if (delete $self->{xdb}) { # XXX: do we need this?
 		# clobber existing {xdb} if amending
 		my $expect = delete $self->{nshard};
@@ -41,13 +48,18 @@ sub attach_external {
 		$nr == $expect or die
 			"BUG: reloaded $nr shards, expected $expect"
 	}
-	my @shards = $ibxish->search->xdb_shards_flat;
 	push @{$self->{shards_flat}}, @shards;
 	push(@{$self->{shard2ibx}}, $ibxish) for (@shards);
 }
 
+# returns a list of local inboxes (or count in scalar context)
+sub locals {
+	my %uniq = map {; "$_" => $_ } @{$_[0]->{shard2ibx} // []};
+	values %uniq;
+}
+
 # called by PublicInbox::Search::xdb
-sub xdb_shards_flat { @{$_[0]->{shards_flat}} }
+sub xdb_shards_flat { @{$_[0]->{shards_flat} // []} }
 
 # like over->get_art
 sub smsg_for {
@@ -69,4 +81,6 @@ sub recent {
 	$self->mset($qstr //= 'bytes:1..', $opt);
 }
 
+sub over {}
+
 1;
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 0bdf6fc6..7f68ee01 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -6,7 +6,7 @@
 package PublicInbox::Search;
 use strict;
 use parent qw(Exporter);
-our @EXPORT_OK = qw(retry_reopen int_val);
+our @EXPORT_OK = qw(retry_reopen int_val get_pct);
 use List::Util qw(max);
 
 # values for searching, changing the numeric value breaks
@@ -424,4 +424,12 @@ sub int_val ($$) {
 	sortable_unserialise($val) + 0; # PV => IV conversion
 }
 
+sub get_pct ($) { # mset item
+	# Capped at "99%" since "100%" takes an extra column in the
+	# thread skeleton view.  <xapian/mset.h> says the value isn't
+	# very meaningful, anyways.
+	my $n = $_[0]->get_percent;
+	$n > 99 ? 99 : $n;
+}
+
 1;
diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm
index 6b36f795..d50d3cf6 100644
--- a/lib/PublicInbox/SearchView.pm
+++ b/lib/PublicInbox/SearchView.pm
@@ -14,7 +14,7 @@ use PublicInbox::WwwAtomStream;
 use PublicInbox::WwwStream qw(html_oneshot);
 use PublicInbox::SearchThread;
 use PublicInbox::SearchQuery;
-use PublicInbox::Search;
+use PublicInbox::Search qw(get_pct);
 my %rmap_inc;
 
 sub mbox_results {
@@ -276,14 +276,6 @@ sub sort_relevance {
 	} @{$_[0]} ]
 }
 
-sub get_pct ($) {
-	# Capped at "99%" since "100%" takes an extra column in the
-	# thread skeleton view.  <xapian/mset.h> says the value isn't
-	# very meaningful, anyways.
-	my $n = $_[0]->get_percent;
-	$n > 99 ? 99 : $n;
-}
-
 sub mset_thread {
 	my ($ctx, $mset, $q) = @_;
 	my $ibx = $ctx->{ibx};
diff --git a/t/lei.t b/t/lei.t
index 6d47e307..72c50308 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -122,7 +122,7 @@ my $setup_publicinboxes = sub {
 	return if $done eq $home;
 	use PublicInbox::InboxWritable;
 	for my $V (1, 2) {
-		run_script([qw(-init -Lmedium), "-V$V", "t$V",
+		run_script([qw(-init), "-V$V", "t$V",
 				'--newsgroup', "t.$V",
 				"$home/t$V", "http://example.com/t$V",
 				"t$V\@example.com" ]) or BAIL_OUT "init v$V";
@@ -175,6 +175,15 @@ my $test_external = sub {
 	});
 	$lei->('ls-external');
 	like($out, qr/boost=0\n/s, 'ls-external has output');
+
+	# note, on a Bourne shell users should be able to use either:
+	#	s:"use boolean prefix"
+	#	"s:use boolean prefix"
+	# or use single quotes, it should not matter.  Users only need
+	# to know shell quoting rules, not Xapian quoting rules.
+	# No double-quoting should be imposed on users on the CLI
+	$lei->('q', 's:use boolean prefix');
+	like($out, qr/search: use boolean prefix/, 'phrase search got result');
 };
 
 my $test_lei_common = sub {
diff --git a/t/lei_xsearch.t b/t/lei_xsearch.t
index 3774b4c1..8b03c1f2 100644
--- a/t/lei_xsearch.t
+++ b/t/lei_xsearch.t
@@ -70,4 +70,9 @@ my $max = max(map { $_->{docid} } @msgs);
 is($lxs->smsg_for(($mset->items)[0])->{docid}, $max,
 	'got highest docid');
 
+my @ibxish = $lxs->locals;
+is(scalar(@ibxish), scalar(@ibx) + 1, 'got locals back');
+is($lxs->search, $lxs, '->search works');
+is($lxs->over, undef, '->over fails');
+
 done_testing;

^ permalink raw reply related	[relevance 29%]

* [PATCH 12/22] lei: rename $w to $wpager for warning message
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
  2021-01-10 12:14 29% ` [PATCH 01/22] lei query + pagination sorta working Eric Wong
  2021-01-10 12:14 50% ` [PATCH 02/22] lei q: deduplicate smsg Eric Wong
@ 2021-01-10 12:15 71% ` Eric Wong
  2021-01-10 12:15 71% ` [PATCH 13/22] lei: fix oneshot TTY detection by passing STD*{GLOB} Eric Wong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

Perl keeps track of the variable name for error messages
when auto-closing an FD fails, so this will help identify
the source of a close error.
---
 lib/PublicInbox/LEI.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1f4ed0f6..24f5930b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -603,10 +603,10 @@ sub start_pager {
 	$env->{LV} //= '-c';
 	$env->{COLUMNS} //= 80; # TODO TIOCGWINSZ
 	$env->{MORE} //= 'FRX' if $^O eq 'freebsd';
-	pipe(my ($r, $w)) or return warn "pipe: $!";
+	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
-	$self->{1} = $w;
-	$self->{2} = $w if -t $self->{2};
+	$self->{1} = $wpager;
+	$self->{2} = $wpager if -t $self->{2};
 	my $pid = spawn([$pager], $env, $rdr);
 	dwaitpid($pid, undef, $self->{sock});
 	$env->{GIT_PAGER_IN_USE} = 'true'; # we may spawn git

^ permalink raw reply related	[relevance 71%]

* [PATCH 13/22] lei: fix oneshot TTY detection by passing STD*{GLOB}
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-10 12:15 71% ` [PATCH 12/22] lei: rename $w to $wpager for warning message Eric Wong
@ 2021-01-10 12:15 71% ` Eric Wong
  2021-01-10 12:15 33% ` [PATCH 14/22] lei: query: ensure pager exit is instantaneous Eric Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

...  instead of STD*{IO}.  I'm not sure why *STDOUT{IO} being an
IO::File object disqualifies it from the "-t" perlop check
returning true on TTY, but it does.  So use *STDOUT{GLOB} for
now.

http://nntp.perl.org/group/perl.perl5.porters/258760
Message-ID: <X/kgIqIuh4ZtUZNR@dcvr>
---
 lib/PublicInbox/LEI.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 24f5930b..17023191 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -795,9 +795,9 @@ sub oneshot {
 	local %PATH2CFG;
 	umask(077) // die("umask(077): $!");
 	dispatch((bless {
-		0 => *STDIN{IO},
-		1 => *STDOUT{IO},
-		2 => *STDERR{IO},
+		0 => *STDIN{GLOB},
+		1 => *STDOUT{GLOB},
+		2 => *STDERR{GLOB},
 		env => \%ENV
 	}, __PACKAGE__), @ARGV);
 }

^ permalink raw reply related	[relevance 71%]

* [PATCH 18/22] lei: get rid of client {pid} field
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
                   ` (4 preceding siblings ...)
  2021-01-10 12:15 33% ` [PATCH 14/22] lei: query: ensure pager exit is instantaneous Eric Wong
@ 2021-01-10 12:15 62% ` Eric Wong
  2021-01-10 12:15 41% ` [PATCH 19/22] lei: fork + FD cleanup Eric Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

Using kill(2) is too dangerous since extremely long
queries may allow the original PID of the aborted lei(1)
client process to be recycled by a new process.  It would
be bad if the lei_xsearch worker process issued a kill
on the wrong process.

So just rely on sending the exit message via socket.
---
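The status arithmetic left in x_it after this change can be
sketched standalone as below.  exit_msg is a hypothetical helper
which only builds the message string; the real code writes it to
the client socket (or calls $quit in oneshot mode).

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the arithmetic in the patched x_it():
# nonzero low 7 bits mean "signal", so the value is passed through
# untouched; otherwise the exit code lives in the high byte, as with $?
sub exit_msg {
	my ($code) = @_;
	my $sig = ($code & 127);
	$code >>= 8 unless $sig;
	"exit=$code";
}

print exit_msg(0), "\n";      # clean exit
print exit_msg(1 << 8), "\n"; # exit(1) as it would appear in $?
print exit_msg(141), "\n";    # 141 & 127 == 13, SIGPIPE on Linux
```

Since the client only ever sees a number, no PID is needed on
either side.
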
 lib/PublicInbox/LEI.pm      | 18 +++++++-----------
 lib/PublicInbox/LeiQuery.pm |  2 +-
 script/lei                  |  2 +-
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f8b8cd4a..0cbf342c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -240,15 +240,12 @@ my %CONFIG_KEYS = (
 sub x_it ($$) { # pronounced "exit"
 	my ($self, $code) = @_;
 	$self->{1}->autoflush(1); # make sure client sees stdout before exit
-	if (my $sig = ($code & 127)) {
-		kill($sig, $self->{pid} // $$);
-	} else {
-		$code >>= 8;
-		if (my $sock = $self->{sock}) {
-			say $sock "exit=$code";
-		} else { # for oneshot
-			$quit->($code);
-		}
+	my $sig = ($code & 127);
+	$code >>= 8 unless $sig;
+	if (my $sock = $self->{sock}) {
+		say $sock "exit=$code";
+	} else { # for oneshot
+		$quit->($code);
 	}
 }
 
@@ -675,13 +672,12 @@ sub accept_dispatch { # Listener {post_accept} callback
 		say $sock "request command truncated";
 		return;
 	}
-	my ($client_pid, $argc, @argv) = split(/\0/, $buf, -1);
+	my ($argc, @argv) = split(/\0/, $buf, -1);
 	undef $buf;
 	my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
 	if (chdir($env{PWD})) {
 		local %ENV = %env;
 		$self->{env} = \%env;
-		$self->{pid} = $client_pid + 0;
 		eval { dispatch($self, @argv) };
 		say $sock $@ if $@;
 	} else {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 040c284d..d5376be5 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -76,7 +76,7 @@ sub lei_q {
 	}
 	my $j = $opt->{jobs} // scalar(@srcs) > 4 ? 4 : scalar(@srcs);
 	$j = 1 if !$opt->{thread};
-	if ($self->{pid}) {
+	if ($self->{sock}) {
 		$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
 			// $self->wq_workers($j);
 	}
diff --git a/script/lei b/script/lei
index 5e30f4d7..bea06b2c 100755
--- a/script/lei
+++ b/script/lei
@@ -62,7 +62,7 @@ Falling back to (slow) one-shot mode
 	1;
 }) { # (Socket::MsgHdr|IO::FDPass|Inline::C), $sock, $pwd are all available:
 	local $ENV{PWD} = $pwd;
-	my $buf = join("\0", $$, scalar(@ARGV), @ARGV);
+	my $buf = join("\0", scalar(@ARGV), @ARGV);
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
 	select $sock;

^ permalink raw reply related	[relevance 62%]

* [PATCH 19/22] lei: fork + FD cleanup
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
                   ` (5 preceding siblings ...)
  2021-01-10 12:15 62% ` [PATCH 18/22] lei: get rid of client {pid} field Eric Wong
@ 2021-01-10 12:15 41% ` Eric Wong
  2021-01-10 12:15 52% ` [PATCH 20/22] lei: run pager in client script Eric Wong
  2021-01-10 12:15 33% ` [PATCH 22/22] lei: query: restore JSON output overview Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

Do a better job of closing FDs that we don't want shared with
the work queue workers.  We'll also fix naming and use
"atfork_prepare" instead of "atfork_parent" to match
pthread_atfork(3) naming.
---
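The close-list idea can be tried without forking.  A rough
sketch, assuming a plain hash in place of the work queue object;
close_queued is a hypothetical stand-in for the base class
ipc_atfork_child in this patch:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the base-class ipc_atfork_child:
# close everything queued under {-ipc_atfork_child_close}, once.
sub close_queued {
	my ($self) = @_;
	my $io = delete($self->{-ipc_atfork_child_close}) or return;
	close($_) for @$io;
	undef;
}

pipe(my ($r, $w)) or die "pipe: $!";
my $wq = {};
# what the parent queues up before fork(), cf. atfork_prepare_wq:
push @{$wq->{-ipc_atfork_child_close}}, grep { defined } ($r, $w, undef);

close_queued($wq); # would run in the child after fork()
print defined(fileno($r)) ? "still open\n" : "closed\n";
close_queued($wq); # harmless: the list was deleted by the first call
```

In the real code the parent keeps its copies open since only the
forked child runs the close.
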
 lib/PublicInbox/IPC.pm        | 57 +++++++++++++++++++++++------------
 lib/PublicInbox/LEI.pm        | 18 +++++++++--
 lib/PublicInbox/LeiQuery.pm   |  7 +++--
 lib/PublicInbox/LeiXSearch.pm | 11 +++++--
 4 files changed, 68 insertions(+), 25 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 4db4b8ea..88f81e47 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -126,7 +126,7 @@ sub ipc_worker_spawn {
 	pipe(my ($r_res, $w_res)) or die "pipe: $!";
 	my $sigset = $oldset // PublicInbox::DS::block_signals();
 	my $parent = $$;
-	$self->ipc_atfork_parent;
+	$self->ipc_atfork_prepare;
 	defined(my $pid = fork) or die "fork: $!";
 	if ($pid == 0) {
 		eval { PublicInbox::DS->Reset };
@@ -155,8 +155,14 @@ sub ipc_worker_reap { # dwaitpid callback
 }
 
 # for base class, override in sub classes
-sub ipc_atfork_parent {}
-sub ipc_atfork_child {}
+sub ipc_atfork_prepare {}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $io = delete($self->{-ipc_atfork_child_close}) or return;
+	close($_) for @$io;
+	undef;
+}
 
 # idempotent, can be called regardless of whether worker is active or not
 sub ipc_worker_stop {
@@ -251,14 +257,21 @@ sub ipc_sibling_atfork_child {
 	$pid == $$ and die "BUG: $$ ipc_atfork_child called on itself";
 }
 
+sub _close_recvd ($) {
+	my ($self) = @_;
+	close($_) for (grep { defined } (delete @$self{0..2}));
+}
+
 sub wq_worker_loop ($) {
 	my ($self) = @_;
 	my $buf;
 	my $len = $self->{wq_req_len} // (4096 * 33);
-	my ($rec, $sub, @args);
+	my ($sub, $args);
 	my $s2 = $self->{-wq_s2} // die 'BUG: no -wq_s2';
 	local $SIG{PIPE} = sub {
-		die(bless(\"$_[0]", __PACKAGE__.'::PIPE')) if $sub;
+		my $cur_sub = $sub;
+		_close_recvd($self);
+		die(bless(\$cur_sub, __PACKAGE__.'::PIPE')) if $cur_sub;
 	};
 	my $rcv = $self->{-wq_recv_cmd} // $recv_cmd;
 	while (1) {
@@ -267,22 +280,25 @@ sub wq_worker_loop ($) {
 		my @m = @{$self->{wq_open_modes} // [qw( +<&= >&= >&= )]};
 		for my $fd (@fds) {
 			my $mode = shift(@m);
-			if (open(my $fh, $mode, $fd)) {
-				$self->{$i++} = $fh;
-				$fh->autoflush(1);
+			if (open(my $cmdfh, $mode, $fd)) {
+				$self->{$i++} = $cmdfh;
+				$cmdfh->autoflush(1);
 			} else {
 				die "$$ open($mode$fd) (FD:$i): $!";
 			}
 		}
-		# Sereal dies, Storable returns undef
-		$rec = thaw($buf) //
+		# Sereal dies on truncated data, Storable returns undef
+		$args = thaw($buf) //
 			die "thaw error on buffer of size:".length($buf);
-		($sub, @args) = @$rec;
-		eval { $self->$sub(@args) };
+		eval {
+			$sub = shift @$args;
+			eval { $self->$sub(@$args) };
+			undef $sub; # quiet SIG{PIPE} handler
+			die $@ if $@;
+		};
 		warn "$$ wq_worker: $@" if $@ && ref $@ ne __PACKAGE__.'::PIPE';
-		undef $sub; # quiet SIG{PIPE} handler
 		# need to close explicitly to avoid warnings after SIGPIPE
-		close($_) for (delete(@$self{0..2}));
+		_close_recvd($self);
 	}
 }
 
@@ -306,14 +322,17 @@ sub _wq_worker_start ($$) {
 		eval { PublicInbox::DS->Reset };
 		close(delete $self->{-wq_s1});
 		delete $self->{qw(-wq_workers -wq_ppid)};
-		$SIG{$_} = 'IGNORE' for (qw(TTOU TTIN));
-		$SIG{$_} = 'DEFAULT' for (qw(TERM QUIT INT));
+		$SIG{$_} = 'IGNORE' for (qw(PIPE TTOU TTIN));
+		$SIG{$_} = 'DEFAULT' for (qw(TERM QUIT INT CHLD));
 		local $0 = $self->{-wq_ident};
 		PublicInbox::DS::sig_setmask($oldset);
+		# ensure we properly exit even if warn() dies:
+		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		my $on_destroy = $self->ipc_atfork_child;
 		eval { wq_worker_loop($self) };
 		warn "worker $self->{-wq_ident} PID:$$ died: $@" if $@;
-		exit($@ ? 1 : 0);
+		undef $on_destroy;
+		undef $end; # trigger exit
 	} else {
 		$self->{-wq_workers}->{$pid} = \undef;
 	}
@@ -326,7 +345,7 @@ sub wq_workers_start {
 	return if $self->{-wq_s1}; # idempotent
 	my ($s1, $s2);
 	socketpair($s1, $s2, AF_UNIX, $SEQPACKET, 0) or die "socketpair: $!";
-	$self->ipc_atfork_parent;
+	$self->ipc_atfork_prepare;
 	$nr_workers //= 4;
 	$nr_workers = $WQ_MAX_WORKERS if $nr_workers > $WQ_MAX_WORKERS;
 	my $sigset = $oldset // PublicInbox::DS::block_signals();
@@ -343,7 +362,7 @@ sub wq_worker_incr { # SIGTTIN handler
 	my ($self, $oldset) = @_;
 	$self->{-wq_s2} or return;
 	return if wq_workers($self) >= $WQ_MAX_WORKERS;
-	$self->ipc_atfork_parent;
+	$self->ipc_atfork_prepare;
 	my $sigset = $oldset // PublicInbox::DS::block_signals();
 	_wq_worker_start($self, $sigset);
 	PublicInbox::DS::sig_setmask($sigset) unless $oldset;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0cbf342c..1ef0cbec 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -33,6 +33,7 @@ my $GLP_PASS = Getopt::Long::Parser->new;
 $GLP_PASS->configure(qw(gnu_getopt no_ignore_case auto_abbrev pass_through));
 
 our %PATH2CFG; # persistent for socket daemon
+our @TO_CLOSE_ATFORK_CHILD;
 
 # TBD: this is a documentation mechanism to show a subcommand
 # (may) pass options through to another command:
@@ -266,12 +267,20 @@ sub fail ($$;$) {
 	undef;
 }
 
+sub atfork_prepare_wq {
+	my ($self, $wq) = @_;
+	push @{$wq->{-ipc_atfork_child_close}}, @TO_CLOSE_ATFORK_CHILD,
+				grep { defined } @$self{qw(0 1 2 sock)}
+}
+
 # usage: local %SIG = (%SIG, $lei->atfork_child_wq($wq));
 sub atfork_child_wq {
 	my ($self, $wq) = @_;
 	$self->{sock} //= $wq->{0};
 	$self->{$_} //= $wq->{$_} for (0..2);
 	my $oldpipe = $SIG{PIPE};
+	%PATH2CFG = ();
+	@TO_CLOSE_ATFORK_CHILD = ();
 	(
 		__WARN__ => sub { err($self, @_) },
 		PIPE => sub {
@@ -281,11 +290,14 @@ sub atfork_child_wq {
 	);
 }
 
-# usage: ($lei, @io) = $lei->atfork_prepare_wq($wq);
-sub atfork_prepare_wq {
+# usage: ($lei, @io) = $lei->atfork_parent_wq($wq);
+sub atfork_parent_wq {
 	my ($self, $wq) = @_;
 	if ($wq->wq_workers) {
+		my $env = delete $self->{env}; # env is inherited at fork
 		my $ret = bless { %$self }, ref($self);
+		$self->{env} = $env;
+		delete @$ret{qw(-lei_store cfg)};
 		my $in = delete $ret->{0};
 		($ret, delete($ret->{sock}) // $in, delete @$ret{1, 2});
 	} else {
@@ -738,6 +750,7 @@ sub lazy_start {
 	return if $pid;
 	$0 = "lei-daemon $path";
 	local %PATH2CFG;
+	local @TO_CLOSE_ATFORK_CHILD = ($l, $eof_r, $eof_w);
 	$_->blocking(0) for ($l, $eof_r, $eof_w);
 	$l = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
 	my $exit_code;
@@ -764,6 +777,7 @@ sub lazy_start {
 	local %SIG = (%SIG, %$sig) if !$sigfd;
 	local $SIG{PIPE} = 'IGNORE';
 	if ($sigfd) { # TODO: use inotify/kqueue to detect unlinked sockets
+		push @TO_CLOSE_ATFORK_CHILD, $sigfd->{sock};
 		PublicInbox::DS->SetLoopTimeout(5000);
 	} else {
 		# wake up every second to accept signals if we don't
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index d5376be5..9a383cef 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -66,7 +66,7 @@ sub lei_q {
 
 	# --local is enabled by default
 	# src: LeiXSearch || LeiSearch || Inbox
-	my @srcs = $opt->{'local'} ? ($sto->search) : ();
+	my @srcs;
 	require PublicInbox::LeiXSearch;
 	my $lxs = PublicInbox::LeiXSearch->new;
 
@@ -74,12 +74,15 @@ sub lei_q {
 	if ($opt->{external} // 1) {
 		$self->_externals_each(\&_vivify_external, \@srcs);
 	}
-	my $j = $opt->{jobs} // scalar(@srcs) > 4 ? 4 : scalar(@srcs);
+	my $j = $opt->{jobs} // scalar(@srcs) > 3 ? 3 : scalar(@srcs);
 	$j = 1 if !$opt->{thread};
+	$j++ if $opt->{'local'}; # for sto->search below
 	if ($self->{sock}) {
+		$self->atfork_prepare_wq($lxs);
 		$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
 			// $self->wq_workers($j);
 	}
+	unshift(@srcs, $sto->search) if $opt->{'local'};
 	my $out = $opt->{output} // '-';
 	$out = 'json:/dev/stdout' if $out eq '-';
 	my $isatty = -t $self->{1};
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index c0df21a8..b4172734 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -9,6 +9,7 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
 use PublicInbox::Search qw(get_pct);
+use Sys::Syslog qw(syslog);
 
 sub new {
 	my ($class) = @_;
@@ -92,13 +93,13 @@ sub _mset_more ($$) {
 
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
+	local %SIG = (%SIG, $lei->atfork_child_wq($self));
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
 	unless ($srch && $over) {
 		my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
 		warn "$desc not indexed by Xapian\n";
 		return;
 	}
-	local %SIG = (%SIG, $lei->atfork_child_wq($self));
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	do {
@@ -145,7 +146,7 @@ sub query_mset { # non-parallel for non-"--thread" users
 
 sub do_query {
 	my ($self, $lei_orig, $srcs) = @_;
-	my ($lei, @io) = $lei_orig->atfork_prepare_wq($self);
+	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
 	$io[1]->autoflush(1);
 	$io[2]->autoflush(1);
 	if ($lei->{opt}->{thread}) {
@@ -161,4 +162,10 @@ sub do_query {
 	}
 }
 
+sub ipc_atfork_child {
+	my ($self) = @_;
+	$SIG{__WARN__} = sub { syslog('warning', "@_") };
+	$self->SUPER::ipc_atfork_child; # PublicInbox::IPC
+}
+
 1;

^ permalink raw reply related	[relevance 41%]

* [PATCH 14/22] lei: query: ensure pager exit is instantaneous
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-10 12:15 71% ` [PATCH 13/22] lei: fix oneshot TTY detection by passing STD*{GLOB} Eric Wong
@ 2021-01-10 12:15 33% ` Eric Wong
  2021-01-10 12:15 62% ` [PATCH 18/22] lei: get rid of client {pid} field Eric Wong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

Improve interactivity and user experience by allowing the user
to return to the terminal immediately when the pager is exited
(e.g. hitting the `q' key in less(1)).

This is a massive change which restructures query handling to
allow parallel search when --thread expansion is in use and
offloading to a separate worker when --thread is not in use.

The Xapian query offload changes allow us to reenter the event
loop right away once the search(es) are shipped off to the work
queue workers.

This means the main lei-daemon process can forget the lei(1)
client socket immediately once it's handed off to worker
processes.

We now unblock SIGPIPE in query workers and send an exit(141)
response to the lei(1) client socket to denote SIGPIPE.

This also allows parallelization for users using "lei q" from
multiple terminals.

JSON output is currently broken and will need to be restructured
for more flexibility and fork-safety.
---
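The do/while paging used by the new query_* subs can be sketched
on its own.  mset_more below applies the same test as _mset_more
but takes the batch size directly; fake_mset is a hypothetical
stand-in for $srch->mset and is not a Xapian API.

```perl
use strict;
use warnings;

# Same test as _mset_more in the patch: continue while the last batch
# was non-empty and the advanced offset is still below the limit.
sub mset_more {
	my ($size, $mo) = @_;
	$size && (($mo->{offset} += $size) < ($mo->{limit} // 10000));
}

# Hypothetical stand-in for $srch->mset: returns up to $batch items
# starting at $mo->{offset}.
sub fake_mset {
	my ($all, $mo, $batch) = @_;
	my $end = $mo->{offset} + $batch - 1;
	$end = $#$all if $end > $#$all;
	@$all[$mo->{offset} .. $end];
}

my @all = (1 .. 25);
my $mo = { offset => 0, limit => 20 };
my (@seen, @batch);
do {
	@batch = fake_mset(\@all, $mo, 10);
	push @seen, @batch;
} while (mset_more(scalar(@batch), $mo));
print scalar(@seen), " results\n";
```

The limit is only checked between batches in this sketch, so the
batch size decides how closely it is honored.
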
 lib/PublicInbox/IPC.pm        |  14 +++--
 lib/PublicInbox/LEI.pm        |  34 +++++++++++-
 lib/PublicInbox/LeiQuery.pm   | 102 +++++++++-------------------------
 lib/PublicInbox/LeiXSearch.pm |  80 +++++++++++++++++++++++++-
 4 files changed, 147 insertions(+), 83 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 8a3120c9..be5b2f45 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -234,6 +234,9 @@ sub wq_worker_loop ($) {
 	my $len = $self->{wq_req_len} // (4096 * 33);
 	my ($rec, $sub, @args);
 	my $s2 = $self->{-wq_s2} // die 'BUG: no -wq_s2';
+	local $SIG{PIPE} = sub {
+		die(bless(\"$_[0]", __PACKAGE__.'::PIPE')) if $sub;
+	};
 	until ($self->{-wq_quit}) {
 		my (@fds) = $recv_cmd->($s2, $buf, $len) or return; # EOF
 		my $i = 0;
@@ -242,6 +245,7 @@ sub wq_worker_loop ($) {
 			my $mode = shift(@m);
 			if (open(my $fh, $mode, $fd)) {
 				$self->{$i++} = $fh;
+				$fh->autoflush(1);
 			} else {
 				die "$$ open($mode$fd) (FD:$i): $!";
 			}
@@ -251,8 +255,10 @@ sub wq_worker_loop ($) {
 			die "thaw error on buffer of size:".length($buf);
 		($sub, @args) = @$rec;
 		eval { $self->$sub(@args) };
-		warn "$$ wq_worker: $@" if $@;
-		delete @$self{0, 1, 2};
+		warn "$$ wq_worker: $@" if $@ && ref $@ ne __PACKAGE__.'::PIPE';
+		undef $sub; # quiet SIG{PIPE} handler
+		# need to close explicitly to avoid warnings after SIGPIPE
+		close($_) for (delete(@$self{0..2}));
 	}
 }
 
@@ -284,8 +290,8 @@ sub _wq_worker_start ($$) {
 		PublicInbox::DS::sig_setmask($oldset);
 		my $on_destroy = $self->ipc_atfork_child;
 		eval { wq_worker_loop($self) };
-		die "worker $self->{-wq_ident} PID:$$ died: $@\n" if $@;
-		exit;
+		warn "worker $self->{-wq_ident} PID:$$ died: $@" if $@;
+		exit($@ ? 1 : 0);
 	} else {
 		$self->{-wq_workers}->{$pid} = \undef;
 	}
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17023191..f8b8cd4a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -269,6 +269,33 @@ sub fail ($$;$) {
 	undef;
 }
 
+# usage: local %SIG = (%SIG, $lei->atfork_child_wq($wq));
+sub atfork_child_wq {
+	my ($self, $wq) = @_;
+	$self->{sock} //= $wq->{0};
+	$self->{$_} //= $wq->{$_} for (0..2);
+	my $oldpipe = $SIG{PIPE};
+	(
+		__WARN__ => sub { err($self, @_) },
+		PIPE => sub {
+			$self->x_it(141);
+			$oldpipe->() if ref($oldpipe) eq 'CODE';
+		}
+	);
+}
+
+# usage: ($lei, @io) = $lei->atfork_prepare_wq($wq);
+sub atfork_prepare_wq {
+	my ($self, $wq) = @_;
+	if ($wq->wq_workers) {
+		my $ret = bless { %$self }, ref($self);
+		my $in = delete $ret->{0};
+		($ret, delete($ret->{sock}) // $in, delete @$ret{1, 2});
+	} else {
+		($self, ($self->{sock} // $self->{0}), @$self{1, 2});
+	}
+}
+
 sub _help ($;$) {
 	my ($self, $errmsg) = @_;
 	my $cmd = $self->{cmd} // 'COMMAND';
@@ -608,8 +635,8 @@ sub start_pager {
 	$self->{1} = $wpager;
 	$self->{2} = $wpager if -t $self->{2};
 	my $pid = spawn([$pager], $env, $rdr);
-	dwaitpid($pid, undef, $self->{sock});
 	$env->{GIT_PAGER_IN_USE} = 'true'; # we may spawn git
+	[ $pid, @$rdr{1, 2} ];
 }
 
 sub accept_dispatch { # Listener {post_accept} callback
@@ -675,6 +702,8 @@ sub event_step {
 
 sub noop {}
 
+our $oldset; sub oldset { $oldset }
+
 # lei(1) calls this when it can't connect
 sub lazy_start {
 	my ($path, $errno, $nfd) = @_;
@@ -691,7 +720,7 @@ sub lazy_start {
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
-	my $oldset = PublicInbox::DS::block_signals();
+	local $oldset = PublicInbox::DS::block_signals();
 	if ($nfd == 1) {
 		require PublicInbox::CmdIPC1;
 		$recv_cmd = PublicInbox::CmdIPC1->can('recv_cmd1');
@@ -737,6 +766,7 @@ sub lazy_start {
 	};
 	my $sigfd = PublicInbox::Sigfd->new($sig, SFD_NONBLOCK);
 	local %SIG = (%SIG, %$sig) if !$sigfd;
+	local $SIG{PIPE} = 'IGNORE';
 	if ($sigfd) { # TODO: use inotify/kqueue to detect unlinked sockets
 		PublicInbox::DS->SetLoopTimeout(5000);
 	} else {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index f69dccad..040c284d 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -8,7 +8,7 @@ use v5.10.1;
 use PublicInbox::MID qw($MID_EXTRACT);
 use POSIX qw(strftime);
 use PublicInbox::Address qw(pairs);
-use PublicInbox::Search qw(get_pct);
+use PublicInbox::DS qw(dwaitpid);
 
 sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
 
@@ -61,37 +61,30 @@ sub lei_q {
 	my $sto = $self->_lei_store(1);
 	my $cfg = $self->_lei_cfg(1);
 	my $opt = $self->{opt};
-	my $qstr = join(' ', map {;
-		# Consider spaces in argv to be for phrase search in Xapian.
-		# In other words, the users should need only care about
-		# normal shell quotes and not have to learn Xapian quoting.
-		/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
-	} @argv);
-	$opt->{limit} //= 10000;
-	my $lxs;
 	require PublicInbox::LeiDedupe;
 	my $dd = PublicInbox::LeiDedupe->new($self);
 
 	# --local is enabled by default
-	my @src = $opt->{'local'} ? ($sto->search) : ();
+	# src: LeiXSearch || LeiSearch || Inbox
+	my @srcs = $opt->{'local'} ? ($sto->search) : ();
+	require PublicInbox::LeiXSearch;
+	my $lxs = PublicInbox::LeiXSearch->new;
 
 	# --external is enabled by default, but allow --no-external
 	if ($opt->{external} // 1) {
-		$self->_externals_each(\&_vivify_external, \@src);
-		# {tid} is not unique between indices, so we have to search
-		# each src individually
-		if (!$opt->{thread}) {
-			require PublicInbox::LeiXSearch;
-			my $lxs = PublicInbox::LeiXSearch->new;
-			# local is always first
-			$lxs->attach_external($_) for @src;
-			@src = ($lxs);
-		}
+		$self->_externals_each(\&_vivify_external, \@srcs);
 	}
-	my $out = $self->{output} // '-';
+	my $j = $opt->{jobs} // scalar(@srcs) > 4 ? 4 : scalar(@srcs);
+	$j = 1 if !$opt->{thread};
+	if ($self->{pid}) {
+		$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
+			// $self->wq_workers($j);
+	}
+	my $out = $opt->{output} // '-';
 	$out = 'json:/dev/stdout' if $out eq '-';
 	my $isatty = -t $self->{1};
-	$self->start_pager if $isatty;
+	# no forking workers after this
+	my $pid_old12 = $self->start_pager if $isatty;
 	my $json = substr($out, 0, 5) eq 'json:' ?
 		ref(PublicInbox::Config->json)->new : undef;
 	if ($json) {
@@ -104,10 +97,14 @@ sub lei_q {
 		$json->canonical;
 	}
 
-	# src: LeiXSearch || LeiSearch || Inbox
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
-	delete $mset_opt{limit} if $opt->{limit} < 0;
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
+	$mset_opt{qstr} = join(' ', map {;
+		# Consider spaces in argv to be for phrase search in Xapian.
+		# In other words, the users should need only care about
+		# normal shell quotes and not have to learn Xapian quoting.
+		/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
+	} @argv);
 	if (defined(my $sort = $opt->{'sort'})) {
 		if ($sort eq 'relevance') {
 			$mset_opt{relevance} = 1;
@@ -123,59 +120,12 @@ sub lei_q {
 	# descending docid order
 	$mset_opt{relevance} //= -2 if $opt->{thread};
 	# my $wcb = PublicInbox::LeiToMail->write_cb($out, $self);
-
-	# even w/o pretty, do the equivalent of a --pretty=oneline
-	# output so "lei q SEARCH_TERMS | wc -l" can be useful:
-	my $ORS = $json ? ($opt->{pretty} ? ', ' : ",\n") : "\n";
-	my $buf;
-
-	# we can generate too many records to hold in RAM, so we stream
-	# and fake a JSON array starting here:
-	$self->out('[') if $json;
-	my $emit_cb = sub {
-		my ($smsg) = @_;
-		delete @$smsg{qw(tid num)}; # only makes sense if single src
-		chomp($buf = $json->encode(_smsg_unbless($smsg)));
-	};
-	$dd->prepare_dedupe;
-	for my $src (@src) {
-		my $srch = $src->search;
-		my $over = $src->over;
-		my $smsg_for = $src->can('smsg_for'); # LeiXSearch
-		my $mo = { %mset_opt };
-		my $mset = $srch->mset($qstr, $mo);
-		my $ctx = {};
-		if ($smsg_for) {
-			for my $it ($mset->items) {
-				my $smsg = $smsg_for->($srch, $it) or next;
-				next if $dd->is_smsg_dup($smsg);
-				$self->out($buf .= $ORS) if defined $buf;
-				$smsg->{relevance} = get_pct($it);
-				$emit_cb->($smsg);
-			}
-		} else { # --thread
-			my $ids = $srch->mset_to_artnums($mset, $mo);
-			$ctx->{ids} = $ids;
-			my $i = 0;
-			my %n2p = map {
-				($ids->[$i++], get_pct($_));
-			} $mset->items;
-			undef $mset;
-			while ($over && $over->expand_thread($ctx)) {
-				for my $n (@{$ctx->{xids}}) {
-					my $t = $over->get_art($n) or next;
-					next if $dd->is_smsg_dup($t);
-					if (my $p = delete $n2p{$t->{num}}) {
-						$t->{relevance} = $p;
-					}
-					$self->out($buf .= $ORS);
-					$emit_cb->($t);
-				}
-				@{$ctx->{xids}} = ();
-			}
-		}
+	$self->{mset_opt} = \%mset_opt;
+	$lxs->do_query($self, \@srcs);
+	if ($pid_old12) {
+		$self->{$_} = $pid_old12->[$_] for (1, 2);
+		dwaitpid($pid_old12->[0], undef, $self->{sock});
 	}
-	$self->out($buf .= "]\n"); # done
 }
 
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index b670bc2f..a3010efe 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -7,7 +7,8 @@
 package PublicInbox::LeiXSearch;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::LeiSearch);
+use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
+use PublicInbox::Search qw(get_pct);
 
 sub new {
 	my ($class) = @_;
@@ -83,4 +84,81 @@ sub recent {
 
 sub over {}
 
+sub _mset_more ($$) {
+	my ($mset, $mo) = @_;
+	my $size = $mset->size;
+	$size && (($mo->{offset} += $size) < ($mo->{limit} // 10000));
+}
+
+sub query_thread_mset { # for --thread
+	my ($self, $lei, $ibxish) = @_;
+	my ($srch, $over) = ($ibxish->search, $ibxish->over);
+	unless ($srch && $over) {
+		my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
+		warn "$desc not indexed by Xapian\n";
+		return;
+	}
+	local %SIG = (%SIG, $lei->atfork_child_wq($self));
+	my $mo = { %{$lei->{mset_opt}} };
+	my $mset;
+	do {
+		$mset = $srch->mset($mo->{qstr}, $mo);
+		my $ids = $srch->mset_to_artnums($mset, $mo);
+		my $ctx = { ids => $ids };
+		my $i = 0;
+		my %n2p = map { ($ids->[$i++], get_pct($_)) } $mset->items;
+		while ($over->expand_thread($ctx)) {
+			for my $n (@{$ctx->{xids}}) {
+				my $smsg = $over->get_art($n) or next;
+				# next if $dd->is_smsg_dup($smsg); TODO
+				if (my $p = delete $n2p{$smsg->{num}}) {
+					$smsg->{relevance} = $p;
+				}
+				print { $self->{1} } Dumper($smsg);
+				# $self->out($buf .= $ORS);
+				# $emit_cb->($smsg);
+			}
+			@{$ctx->{xids}} = ();
+		}
+	} while (_mset_more($mset, $mo));
+}
+
+sub query_mset { # non-parallel for non-"--thread" users
+	my ($self, $lei, $srcs) = @_;
+	my $mo = { %{$lei->{mset_opt}} };
+	my $mset;
+	local %SIG = (%SIG, $lei->atfork_child_wq($self));
+	$self->attach_external($_) for @$srcs;
+	do {
+		$mset = $self->mset($mo->{qstr}, $mo);
+		for my $it ($mset->items) {
+			my $smsg = smsg_for($self, $it) or next;
+			# next if $dd->is_smsg_dup($smsg);
+			$smsg->{relevance} = get_pct($it);
+			use Data::Dumper;
+			print { $self->{1} } Dumper($smsg);
+			# $self->out($buf .= $ORS) if defined $buf;
+			#$emit_cb->($smsg);
+		}
+	} while (_mset_more($mset, $mo));
+}
+
+sub do_query {
+	my ($self, $lei_orig, $srcs) = @_;
+	my ($lei, @io) = $lei_orig->atfork_prepare_wq($self);
+	$io[1]->autoflush(1);
+	$io[2]->autoflush(1);
+	if ($lei->{opt}->{thread}) {
+		for my $ibxish (@$srcs) {
+			$self->wq_do('query_thread_mset', @io, $lei, $ibxish);
+		}
+	} else {
+		$self->wq_do('query_mset', @io, $lei, $srcs);
+	}
+	# TODO
+	for my $rmt (@{$self->{remotes} // []}) {
+		$self->wq_do('query_thread_mbox', @io, $lei, $rmt);
+	}
+}
+
 1;

^ permalink raw reply related	[relevance 33%]

* [PATCH 20/22] lei: run pager in client script
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
                   ` (6 preceding siblings ...)
  2021-01-10 12:15 41% ` [PATCH 19/22] lei: fork + FD cleanup Eric Wong
@ 2021-01-10 12:15 52% ` Eric Wong
  2021-01-10 12:15 33% ` [PATCH 22/22] lei: query: restore JSON output overview Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

While most single keystrokes work fine when the pager is
launched from the background daemon, Ctrl-C and WINCH can cause
strangeness when connected to the wrong terminal.
---
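The two daemon-to-client messages after this change ("exit=N"
and "exec ARGC\0ARGV...\0K=V...") can be picked apart as in the
sketch below.  parse_msg is a hypothetical helper which only
classifies the buffer; the real loop in script/lei calls exit or
exec_cmd directly and also deals with the received FDs.

```perl
use strict;
use warnings;

# Hypothetical parser for the two messages handled by the client loop;
# it only classifies, it does not exit or exec anything.
sub parse_msg {
	my ($buf) = @_;
	if ($buf =~ /\Aexit=([0-9]+)\n\z/) {
		return { type => 'exit', code => $1 + 0 };
	} elsif ($buf =~ /\Aexec (.+)\n\z/) {
		my ($argc, @argv) = split(/\0/, $1);
		# everything past the first $argc fields is KEY=VALUE
		my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
		return { type => 'exec', argv => \@argv, env => \%env };
	}
	die "unexpected message: $buf";
}

my $msg = parse_msg("exec 1\0less\0LESS=FRX\0LV=-c\n");
print "$msg->{type}: @{$msg->{argv}} (LESS=$msg->{env}{LESS})\n";
```

The NUL separators rely on "." matching "\0", which holds as
long as the payload itself contains no newline.
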
 lib/PublicInbox/LEI.pm      | 26 +++++++++++++++++++-------
 lib/PublicInbox/LeiQuery.pm |  5 +++--
 script/lei                  | 28 +++++++++++++++++++++++++---
 3 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1ef0cbec..d19fb311 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -26,7 +26,7 @@ use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;
-my $recv_cmd;
+my ($recv_cmd, $send_cmd);
 my $GLP = Getopt::Long::Parser->new;
 $GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
 my $GLP_PASS = Getopt::Long::Parser->new;
@@ -244,7 +244,8 @@ sub x_it ($$) { # pronounced "exit"
 	my $sig = ($code & 127);
 	$code >>= 8 unless $sig;
 	if (my $sock = $self->{sock}) {
-		say $sock "exit=$code";
+		my $fds = [ map { fileno($_) } @$self{0..2} ];
+		$send_cmd->($sock, $fds, "exit=$code\n", 0);
 	} else { # for oneshot
 		$quit->($code);
 	}
@@ -635,15 +636,23 @@ sub start_pager {
 	chomp(my $pager = <$fh> // '');
 	close($fh) or warn "`git var PAGER' error: \$?=$?";
 	return if $pager eq 'cat' || $pager eq '';
-	$env->{LESS} //= 'FRX';
-	$env->{LV} //= '-c';
-	$env->{COLUMNS} //= 80; # TODO TIOCGWINSZ
-	$env->{MORE} //= 'FRX' if $^O eq 'freebsd';
+	# TODO TIOCGWINSZ
+	my %new_env = (LESS => 'FRX', LV => '-c', COLUMNS => 80);
+	$new_env{MORE} = 'FRX' if $^O eq 'freebsd';
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
+	my $pid;
+	if (my $sock = $self->{sock}) { # lei(1) process runs it
+		delete @new_env{keys %$env}; # only set iff unset
+		my $buf = "exec 1\0".$pager;
+		while (my ($k, $v) = each %new_env) { $buf .= "\0$k=$v" };
+		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
+		$send_cmd->($sock, $fds, $buf .= "\n", 0);
+	} else {
+		$pid = spawn([$pager], $env, $rdr);
+	}
 	$self->{1} = $wpager;
 	$self->{2} = $wpager if -t $self->{2};
-	my $pid = spawn([$pager], $env, $rdr);
 	$env->{GIT_PAGER_IN_USE} = 'true'; # we may spawn git
 	[ $pid, @$rdr{1, 2} ];
 }
@@ -731,10 +740,13 @@ sub lazy_start {
 	local $oldset = PublicInbox::DS::block_signals();
 	if ($nfd == 1) {
 		require PublicInbox::CmdIPC1;
+		$send_cmd = PublicInbox::CmdIPC1->can('send_cmd1');
 		$recv_cmd = PublicInbox::CmdIPC1->can('recv_cmd1');
 	} elsif ($nfd == 4) {
+		$send_cmd = PublicInbox::Spawn->can('send_cmd4');
 		$recv_cmd = PublicInbox::Spawn->can('recv_cmd4') // do {
 			require PublicInbox::CmdIPC4;
+			$send_cmd = PublicInbox::CmdIPC4->can('send_cmd4');
 			PublicInbox::CmdIPC4->can('recv_cmd4');
 		};
 	}
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 9a383cef..6e778785 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -125,9 +125,10 @@ sub lei_q {
 	# my $wcb = PublicInbox::LeiToMail->write_cb($out, $self);
 	$self->{mset_opt} = \%mset_opt;
 	$lxs->do_query($self, \@srcs);
-	if ($pid_old12) {
+	if ($pid_old12) { # [ pid, stdout, stderr ]
+		my $pid = $pid_old12->[0];
 		$self->{$_} = $pid_old12->[$_] for (1, 2);
-		dwaitpid($pid_old12->[0], undef, $self->{sock});
+		dwaitpid($pid, undef, $self->{sock}) if $pid;
 	}
 }
 
diff --git a/script/lei b/script/lei
index bea06b2c..aac8fa94 100755
--- a/script/lei
+++ b/script/lei
@@ -6,16 +6,33 @@ use v5.10.1;
 use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
 use PublicInbox::CmdIPC4;
 my $narg = 4;
+my $recv_cmd = PublicInbox::CmdIPC4->can('recv_cmd4');
 my $send_cmd = PublicInbox::CmdIPC4->can('send_cmd4') // do {
 	require PublicInbox::CmdIPC1; # 2nd choice
 	$narg = 1;
+	$recv_cmd = PublicInbox::CmdIPC1->can('recv_cmd1');
 	PublicInbox::CmdIPC1->can('send_cmd1');
 } // do {
 	require PublicInbox::Spawn; # takes ~50ms even if built *sigh*
 	$narg = 4;
+	$recv_cmd = PublicInbox::Spawn->can('recv_cmd4');
 	PublicInbox::Spawn->can('send_cmd4');
 };
 
+sub exec_cmd {
+	my ($fds, $argc, @argv) = @_;
+	my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
+	my @m = (*STDIN{IO}, '<&=',  *STDOUT{IO}, '>&=',
+		*STDERR{IO}, '>&=');
+	for my $fd (@$fds) {
+		my ($old_io, $mode) = splice(@m, 0, 2);
+		open($old_io, $mode, $fd) or die "open $mode$fd: $!";
+	}
+	%ENV = (%ENV, %env);
+	exec(@argv);
+	die "exec: @argv: $!";
+}
+
 my ($sock, $pwd);
 if ($send_cmd && eval {
 	my $path = do {
@@ -68,9 +85,14 @@ Falling back to (slow) one-shot mode
 	select $sock;
 	$| = 1; # unbuffer selected $sock
 	$send_cmd->($sock, [ 0, 1, 2 ], $buf, 0);
-	while ($buf = <$sock>) {
-		$buf =~ /\Aexit=([0-9]+)\n\z/ and exit($1 + 0);
-		die $buf;
+	while (my (@fds) = $recv_cmd->($sock, $buf, 4096 * 33)) {
+		if ($buf =~ /\Aexit=([0-9]+)\n\z/) {
+			exit($1);
+		} elsif ($buf =~ /\Aexec (.+)\n\z/) {
+			exec_cmd(\@fds, split(/\0/, $1));
+		} else {
+			die $buf;
+		}
 	}
 } else { # for systems lacking Socket::MsgHdr, IO::FDPass or Inline::C
 	warn $@ if $@;

^ permalink raw reply related	[relevance 52%]

* [PATCH 22/22] lei: query: restore JSON output overview
  2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
                   ` (7 preceding siblings ...)
  2021-01-10 12:15 52% ` [PATCH 20/22] lei: run pager in client script Eric Wong
@ 2021-01-10 12:15 33% ` Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-10 12:15 UTC (permalink / raw)
  To: meta

This internal API is better suited for fork-friendliness (but
locking + dedupe still need to be re-added).

Normal "json" is the default, though stream-friendly "concatjson"
and "jsonl" (AKA "ndjson" AKA "ldjson") also seem to work
(though tests aren't working, yet).

For normal "json", the biggest downside is the necessity of a
trailing "null" element at the end of the array because of
parallel processes, since (AFAIK) regular JSON doesn't allow
trailing commas, unlike JavaScript.
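
For illustration only (not the code in this patch): a minimal
sketch of the trailing "null" approach, assuming JSON::PP and
two made-up records.  Every writer emits "element,\n" without
knowing whether it is last, and the parent closes the array:

```perl
use strict;
use warnings;
use JSON::PP;

# hypothetical records standing in for search results
my @results = ({ s => 'hello' }, { s => 'world' });
my $json = JSON::PP->new->canonical;

# each writer appends "element,\n" without knowing if it is last
my $out = '[';
$out .= $json->encode($_) . ",\n" for @results;

# the parent closes the array; "null" absorbs the final comma
$out .= "null]\n";

# a consumer drops the trailing null (undef) after decoding
my $decoded = $json->decode($out);
pop @$decoded unless defined $decoded->[-1];
print scalar(@$decoded), " results\n";
```

Consumers need to know about the sentinel; the stream-friendly
formats avoid it entirely.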
---
 MANIFEST                       |   1 +
 lib/PublicInbox/LeiOverview.pm | 188 +++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiQuery.pm    |  66 +++---------
 lib/PublicInbox/LeiXSearch.pm  |  25 +++--
 4 files changed, 217 insertions(+), 63 deletions(-)
 create mode 100644 lib/PublicInbox/LeiOverview.pm

diff --git a/MANIFEST b/MANIFEST
index caddd8df..810aec42 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -166,6 +166,7 @@ lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
+lib/PublicInbox/LeiOverview.pm
 lib/PublicInbox/LeiQuery.pm
 lib/PublicInbox/LeiSearch.pm
 lib/PublicInbox/LeiStore.pm
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
new file mode 100644
index 00000000..8a1f4f82
--- /dev/null
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -0,0 +1,188 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# per-mitem/smsg iterators for search results
+# "ovv" => "Overview viewer"
+package PublicInbox::LeiOverview;
+use strict;
+use v5.10.1;
+use POSIX qw(strftime);
+use File::Spec;
+use PublicInbox::MID qw($MID_EXTRACT);
+use PublicInbox::Address qw(pairs);
+use PublicInbox::Config;
+use PublicInbox::Search qw(get_pct);
+
+# cf. https://en.wikipedia.org/wiki/JSON_streaming
+my $JSONL = 'ldjson|ndjson|jsonl'; # 3 names for the same thing
+
+sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
+
+sub new {
+	my ($class, $lei) = @_;
+	my $opt = $lei->{opt};
+	my $out = $opt->{output} // '-';
+	$out = '/dev/stdout' if $out eq '-';
+
+	my $fmt = $opt->{'format'};
+	$fmt = lc($fmt) if defined $fmt;
+	if ($out =~ s/\A([a-z]+)://is) { # e.g. Maildir:/home/user/Mail/
+		my $ofmt = lc $1;
+		$fmt //= $ofmt;
+		return $lei->fail(<<"") if $fmt ne $ofmt;
+--format=$fmt and --output=$ofmt conflict
+
+	}
+	$fmt //= 'json' if $out eq '/dev/stdout';
+	$fmt //= 'maildir'; # TODO
+
+	if (index($out, '://') < 0) { # not a URL, so assume path
+		 $out = File::Spec->canonpath($out);
+	} # else URL
+
+	my $self = bless { fmt => $fmt, out => $out }, $class;
+	my $json;
+	if ($fmt =~ /\A($JSONL|(?:concat)?json)\z/) {
+		$json = $self->{json} = ref(PublicInbox::Config->json);
+	}
+	my ($isatty, $seekable);
+	if ($out eq '/dev/stdout') {
+		$isatty = -t $lei->{1};
+		$lei->start_pager if $isatty;
+		$opt->{pretty} //= $isatty;
+	} elsif ($json) {
+		return $lei->fail('JSON formats only output to stdout');
+	}
+	$self;
+}
+
+# called once by parent
+sub ovv_begin {
+	my ($self, $lei) = @_;
+	if ($self->{fmt} eq 'json') {
+		print { $lei->{1} } '[';
+	} # TODO HTML/Atom/...
+}
+
+# called once by parent (via PublicInbox::EOFpipe)
+sub ovv_end {
+	my ($self, $lei) = @_;
+	if ($self->{fmt} eq 'json') {
+		# JSON doesn't allow trailing commas, and preventing
+		# trailing commas is a PITA when parallelizing outputs
+		print { $lei->{1} } "null]\n";
+	} elsif ($self->{fmt} eq 'concatjson') {
+		print { $lei->{1} } "\n";
+	}
+}
+
+sub ovv_atfork_child {
+	my ($self) = @_;
+	# reopen dedupe here
+}
+
+# prepares an smsg for JSON
+sub _unbless_smsg {
+	my ($smsg, $mitem) = @_;
+
+	delete @$smsg{qw(lines bytes num tid)};
+	$smsg->{rcvd} = _iso8601(delete $smsg->{ts}); # JMAP receivedAt
+	$smsg->{dt} = _iso8601(delete $smsg->{ds}); # JMAP UTCDate
+	$smsg->{relevance} = get_pct($mitem) if $mitem;
+
+	if (my $r = delete $smsg->{references}) {
+		$smsg->{references} = [
+				map { "<$_>" } ($r =~ m/$MID_EXTRACT/go) ];
+	}
+	if (my $m = delete($smsg->{mid})) {
+		$smsg->{'m'} = "<$m>";
+	}
+	for my $f (qw(from to cc)) {
+		my $v = delete $smsg->{$f} or next;
+		$smsg->{substr($f, 0, 1)} = pairs($v);
+	}
+	$smsg->{'s'} = delete $smsg->{subject};
+	# can we be bothered to parse From/To/Cc into arrays?
+	scalar { %$smsg }; # unbless
+}
+
+sub ovv_atexit_child {
+	my ($self, $lei) = @_;
+	my $bref = delete $lei->{ovv_buf} or return;
+	print { $lei->{1} } $$bref;
+}
+
+# JSON module ->pretty output wastes too much vertical white space,
+# this (IMHO) provides better use of screen real-estate while not
+# being excessively compact:
+sub _json_pretty {
+	my ($json, $k, $v) = @_;
+	if (ref $v eq 'ARRAY') {
+		if (@$v) {
+			my $sep = ",\n" . (' ' x (length($k) + 7));
+			if (ref($v->[0])) { # f/t/c
+				$v = '[' . join($sep, map {
+					my $pair = $json->encode($_);
+					$pair =~ s/(null|"),"/$1, "/g;
+					$pair;
+				} @$v) . ']';
+			} else { # references
+				$v = '[' . join($sep, map {
+					substr($json->encode([$_]), 1, -1);
+				} @$v) . ']';
+			}
+		} else {
+			$v = '[]';
+		}
+	}
+	qq{  "$k": }.$v;
+}
+
+sub ovv_each_smsg_cb {
+	my ($self, $lei) = @_;
+	$lei->{ovv_buf} = \(my $buf = '');
+	my $json = $self->{json}->new;
+	if ($json) {
+		$json->utf8->canonical;
+		$json->ascii(1) if $lei->{opt}->{ascii};
+	}
+	if ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
+		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
+		sub { # DIY prettiness :P
+			my ($smsg, $mitem) = @_;
+			$smsg = _unbless_smsg($smsg, $mitem);
+			$buf .= "{\n";
+			$buf .= join(",\n", map {
+				my $v = $smsg->{$_};
+				if (ref($v)) {
+					_json_pretty($json, $_, $v);
+				} else {
+					$v = $json->encode([$v]);
+					qq{  "$_": }.substr($v, 1, -1);
+				}
+			} sort keys %$smsg);
+			$buf .= $EOR;
+			if (length($buf) > 65536) {
+				print { $lei->{1} } $buf;
+				$buf = '';
+			}
+		}
+	} elsif ($json) {
+		my $ORS = $self->{fmt} eq 'json' ? ",\n" : "\n"; # JSONL
+		sub {
+			my ($smsg, $mitem) = @_;
+			delete @$smsg{qw(tid num)};
+			$buf .= $json->encode(_unbless_smsg(@_)) . $ORS;
+			if (length($buf) > 65536) {
+				print { $lei->{1} } $buf;
+				$buf = '';
+			}
+		}
+	} elsif ($self->{fmt} eq 'oid') {
+		sub {
+			my ($smsg, $mitem) = @_;
+		}
+	} # else { ...
+}
+
+1;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 2f4b99e5..7ca01454 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -5,43 +5,8 @@
 package PublicInbox::LeiQuery;
 use strict;
 use v5.10.1;
-use PublicInbox::MID qw($MID_EXTRACT);
-use POSIX qw(strftime);
-use PublicInbox::Address qw(pairs);
 use PublicInbox::DS qw(dwaitpid);
 
-sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
-
-# prepares an smsg for JSON
-sub _smsg_unbless ($) {
-	my ($smsg) = @_;
-
-	delete @$smsg{qw(lines bytes)};
-	$smsg->{rcvd} = _iso8601(delete $smsg->{ts}); # JMAP receivedAt
-	$smsg->{dt} = _iso8601(delete $smsg->{ds}); # JMAP UTCDate
-
-	if (my $r = delete $smsg->{references}) {
-		$smsg->{references} = [
-				map { "<$_>" } ($r =~ m/$MID_EXTRACT/go) ];
-	}
-	if (my $m = delete($smsg->{mid})) {
-		$smsg->{'m'} = "<$m>";
-	}
-	# XXX breaking to/cc, into structured arrays or tables which
-	# distinguish "$phrase <$address>" causes pretty printing JSON
-	# to take up too much vertical space.  I can't get either
-	# Cpanel::JSON::XS or JSON::XS or jq(1) only indent when
-	# wrapping is necessary, rather than blindly indenting and
-	# adding vertical space everywhere.
-	for my $f (qw(from to cc)) {
-		my $v = delete $smsg->{$f} or next;
-		$smsg->{substr($f, 0, 1)} = $v;
-	}
-	$smsg->{'s'} = delete $smsg->{subject};
-	# can we be bothered to parse From/To/Cc into arrays?
-	scalar { %$smsg }; # unbless
-}
-
 sub _vivify_external { # _externals_each callback
 	my ($src, $dir) = @_;
 	if (-f "$dir/ei.lock") {
@@ -68,6 +33,7 @@ sub lei_q {
 	# src: LeiXSearch || LeiSearch || Inbox
 	my @srcs;
 	require PublicInbox::LeiXSearch;
+	require PublicInbox::LeiOverview;
 	my $lxs = PublicInbox::LeiXSearch->new;
 
 	# --external is enabled by default, but allow --no-external
@@ -83,23 +49,9 @@ sub lei_q {
 			// $lxs->wq_workers($j);
 	}
 	unshift(@srcs, $sto->search) if $opt->{'local'};
-	my $out = $opt->{output} // '-';
-	$out = 'json:/dev/stdout' if $out eq '-';
-	my $isatty = -t $self->{1};
 	# no forking workers after this
-	$self->start_pager if $isatty;
-	my $json = substr($out, 0, 5) eq 'json:' ?
-		ref(PublicInbox::Config->json)->new : undef;
-	if ($json) {
-		if ($opt->{pretty} //= $isatty) {
-			$json->pretty(1)->space_before(0);
-			$json->indent_length($opt->{indent} // 2);
-		}
-		$json->utf8; # avoid Wide character in print warnings
-		$json->ascii(1) if $opt->{ascii}; # for "\uXXXX"
-		$json->canonical;
-	}
-
+	require PublicInbox::LeiOverview;
+	$self->{ovv} = PublicInbox::LeiOverview->new($self);
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{qstr} = join(' ', map {;
@@ -124,7 +76,17 @@ sub lei_q {
 	$mset_opt{relevance} //= -2 if $opt->{thread};
 	# my $wcb = PublicInbox::LeiToMail->write_cb($out, $self);
 	$self->{mset_opt} = \%mset_opt;
-	$lxs->do_query($self, \@srcs);
+	$self->{ovv}->ovv_begin($self);
+	pipe(my ($eof_wait, $qry_done)) or die "pipe $!";
+	require PublicInbox::EOFpipe;
+	my $eof = PublicInbox::EOFpipe->new($eof_wait, \&query_done, $self);
+	$lxs->do_query($self, $qry_done, \@srcs);
+	$eof->event_step unless $self->{sock};
+}
+
+sub query_done { # PublicInbox::EOFpipe callback
+	my ($self) = @_;
+	$self->{ovv}->ovv_end($self);
 }
 
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 94f7c2bc..c030b2b2 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -8,7 +8,6 @@ package PublicInbox::LeiXSearch;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
-use PublicInbox::Search qw(get_pct);
 use Sys::Syslog qw(syslog);
 
 sub new {
@@ -102,26 +101,26 @@ sub query_thread_mset { # for --thread
 	}
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		my $ids = $srch->mset_to_artnums($mset, $mo);
 		my $ctx = { ids => $ids };
 		my $i = 0;
-		my %n2p = map { ($ids->[$i++], get_pct($_)) } $mset->items;
+		my %n2item = map { ($ids->[$i++], $_) } $mset->items;
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
 				# next if $dd->is_smsg_dup($smsg); TODO
-				if (my $p = delete $n2p{$smsg->{num}}) {
-					$smsg->{relevance} = $p;
-				}
-				print { $self->{1} } Dumper($smsg);
+				my $mitem = delete $n2item{$smsg->{num}};
+				$each_smsg->($smsg, $mitem);
 				# $self->out($buf .= $ORS);
 				# $emit_cb->($smsg);
 			}
 			@{$ctx->{xids}} = ();
 		}
 	} while (_mset_more($mset, $mo));
+	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
 sub query_mset { # non-parallel for non-"--thread" users
@@ -130,23 +129,24 @@ sub query_mset { # non-parallel for non-"--thread" users
 	my $mset;
 	local %SIG = (%SIG, $lei->atfork_child_wq($self));
 	$self->attach_external($_) for @$srcs;
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
 		for my $it ($mset->items) {
 			my $smsg = smsg_for($self, $it) or next;
 			# next if $dd->is_smsg_dup($smsg);
-			$smsg->{relevance} = get_pct($it);
-			use Data::Dumper;
-			print { $self->{1} } Dumper($smsg);
+			$each_smsg->($smsg, $it);
 			# $self->out($buf .= $ORS) if defined $buf;
 			#$emit_cb->($smsg);
 		}
 	} while (_mset_more($mset, $mo));
+	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
 sub do_query {
-	my ($self, $lei_orig, $srcs) = @_;
+	my ($self, $lei_orig, $qry_done, $srcs) = @_;
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
+	$io[0] = $qry_done; # don't need stdin
 	$io[1]->autoflush(1);
 	$io[2]->autoflush(1);
 	if ($lei->{opt}->{thread}) {
@@ -160,6 +160,9 @@ sub do_query {
 	for my $rmt (@{$self->{remotes} // []}) {
 		$self->wq_do('query_thread_mbox', \@io, $lei, $rmt);
 	}
+
+	# sent off to children, they will drop remaining references to it
+	close $qry_done;
 }
 
 sub ipc_atfork_child {
@@ -170,7 +173,7 @@ sub ipc_atfork_child {
 
 sub ipc_atfork_prepare {
 	my ($self) = @_;
-	$self->wq_set_recv_modes(qw[<&= >&= >&= +<&=]);
+	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&=]);
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
 }
 

^ permalink raw reply related	[relevance 33%]

* [PATCH 00/14] lei: another pile of changes
@ 2021-01-14  7:06 64% Eric Wong
  2021-01-14  7:06 23% ` [PATCH 02/14] lei: test SIGPIPE, stop xsearch workers on client abort Eric Wong
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

PATCH 2/14 took forever to figure out; turns out I was hunting
an old bug in Perl :x (which led to PATCH 3/14, too)

We could probably go farther on 5/14 and eliminate the
need for @TO_CLOSE_ATFORK_CHILD completely, but my brain
was fried from 2/14 :x.

The "ts:" => "rt:" change is technically user-visible,
but "ts:" was never publicly documented so I doubt it
affects anybody.  "rt:" (received time) may be documented
in the future.

Eric Wong (14):
  cmd_ipc: support + test EINTR + EAGAIN, no FDs
  lei: test SIGPIPE, stop xsearch workers on client abort
  daemon+watch: fix localization of %SIG for non-signalfd users
  lei: do not unlink socket path at exit
  lei: reduce live FD references in wq child
  lei: rely on localized $current_lei for warnings
  lei_dedupe+shared_kv: ensure round-tripping serialization
  lei q: reinstate smsg dedupe
  search: rename "ts:" prefix to "rt:"
  lei_overview: rename "references" to "refs"
  lei: q: lock stdout on overview output
  leixsearch: remove some commented out code
  lei: remove temporary var on open
  lei: pass FD to CWD via cmsg, use fchdir on server

 MANIFEST                        |   2 +
 lib/PublicInbox/CmdIPC4.pm      |   6 +-
 lib/PublicInbox/Daemon.pm       |   4 +-
 lib/PublicInbox/IMAPsearchqp.pm |   6 +-
 lib/PublicInbox/IPC.pm          |  45 +++-----
 lib/PublicInbox/LEI.pm          | 182 +++++++++++++++++---------------
 lib/PublicInbox/LeiDedupe.pm    |  29 ++---
 lib/PublicInbox/LeiOverview.pm  |  43 +++++++-
 lib/PublicInbox/LeiQuery.pm     |  27 ++---
 lib/PublicInbox/LeiXSearch.pm   |  60 +++++++----
 lib/PublicInbox/Lock.pm         |   2 +-
 lib/PublicInbox/Search.pm       |   2 +-
 lib/PublicInbox/SharedKV.pm     |  12 ++-
 lib/PublicInbox/Spawn.pm        |  13 ++-
 script/lei                      |  88 +++++++++------
 script/public-inbox-watch       |   2 +-
 t/cmd_ipc.t                     |  32 ++++++
 t/imap_searchqp.t               |   6 +-
 t/lei.t                         |  33 +-----
 t/lei_dedupe.t                  |  13 +++
 t/lei_overview.t                |  33 ++++++
 xt/lei-sigpipe.t                |  32 ++++++
 22 files changed, 417 insertions(+), 255 deletions(-)
 create mode 100644 t/lei_overview.t
 create mode 100644 xt/lei-sigpipe.t

^ permalink raw reply	[relevance 64%]

* [PATCH 04/14] lei: do not unlink socket path at exit
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
  2021-01-14  7:06 23% ` [PATCH 02/14] lei: test SIGPIPE, stop xsearch workers on client abort Eric Wong
@ 2021-01-14  7:06 69% ` Eric Wong
  2021-01-14  7:06 68% ` [PATCH 05/14] lei: reduce live FD references in wq child Eric Wong
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

This matches existing -httpd/-nntpd/-imapd daemon behavior.
From what I can recall, it is less racy for the process doing
bind(2) to unlink it if stale.
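
For illustration only (not the daemon code): a rough sketch of
the binding process cleaning up a stale path itself.  The
bind_unix helper and the connect(2) probe are assumptions, and
the probe is still not race-free:

```perl
use strict;
use warnings;
use Errno qw(ECONNREFUSED ENOENT);
use File::Temp ();
use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);

# hypothetical helper: the binding process removes a stale socket
# itself, instead of relying on the previous daemon to unlink at exit
sub bind_unix {
	my ($path) = @_;
	my $addr = pack_sockaddr_un($path);
	socket(my $probe, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
	if (connect($probe, $addr)) {
		die "$path already has a live listener\n";
	}
	my $errno = $! + 0;
	if ($errno == ECONNREFUSED) { # nobody is listening: stale
		unlink($path) or die "unlink $path: $!";
	} elsif ($errno != ENOENT) {
		$! = $errno;
		die "connect $path: $!";
	}
	socket(my $l, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
	bind($l, $addr) or die "bind $path: $!";
	listen($l, 1024) or die "listen: $!";
	$l;
}

my $dir = File::Temp->newdir;
my $path = "$dir/sock";
my $first = bind_unix($path);
close $first; # "daemon" exits without unlinking; $path remains
my $second = bind_unix($path); # the stale path is replaced here
print "listening again on $path\n";
```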
---
 lib/PublicInbox/LEI.pm | 1 -
 t/lei.t                | 4 ++--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2889fa76..7a1df0bb 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -789,7 +789,6 @@ sub lazy_start {
 	local $quit = sub {
 		$exit_code //= shift;
 		my $listener = $l or exit($exit_code);
-		unlink($path) if defined($path);
 		# closing eof_w triggers \&noop wakeup
 		$eof_w = $l = $path = undef;
 		$listener->close; # DS::close
diff --git a/t/lei.t b/t/lei.t
index 3ebaade6..240735bf 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -237,13 +237,13 @@ SKIP: { # real socket
 		kill(0, $pid) or last;
 		tick();
 	}
-	ok(!-S $sock, 'sock gone');
+	ok(-S $sock, 'sock still exists');
 	ok(!kill(0, $pid), 'pid gone after stop');
 
 	ok($lei->(qw(daemon-pid)), 'daemon-pid');
 	chomp(my $new_pid = $out);
 	ok(kill(0, $new_pid), 'new pid is running');
-	ok(-S $sock, 'sock exists again');
+	ok(-S $sock, 'sock still exists');
 
 	for my $sig (qw(-0 -CHLD)) {
 		ok($lei->('daemon-kill', $sig), "handles $sig");

^ permalink raw reply related	[relevance 69%]

* [PATCH 05/14] lei: reduce live FD references in wq child
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
  2021-01-14  7:06 23% ` [PATCH 02/14] lei: test SIGPIPE, stop xsearch workers on client abort Eric Wong
  2021-01-14  7:06 69% ` [PATCH 04/14] lei: do not unlink socket path at exit Eric Wong
@ 2021-01-14  7:06 68% ` Eric Wong
  2021-01-14  7:06 66% ` [PATCH 06/14] lei: rely on localized $current_lei for warnings Eric Wong
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

We can shrink the @TO_CLOSE_ATFORK_CHILD array by two
elements, at least.  It may be possible to eliminate this
array entirely, but clobbering $quit doesn't seem to
remove references to $eof_w or the $listener socket.
---
 lib/PublicInbox/LEI.pm | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7a1df0bb..fd2b722c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -279,6 +279,7 @@ sub atfork_child_wq {
 	my ($self, $wq) = @_;
 	@$self{qw(0 1 2 sock)} = delete(@$wq{0..3});
 	%PATH2CFG = ();
+	$quit = \&CORE::exit;
 	@TO_CLOSE_ATFORK_CHILD = ();
 	(__WARN__ => sub { err($self, @_) },
 	PIPE => sub {
@@ -782,8 +783,8 @@ sub lazy_start {
 	return if $pid;
 	$0 = "lei-daemon $path";
 	local %PATH2CFG;
-	local @TO_CLOSE_ATFORK_CHILD = ($l, $eof_r, $eof_w);
-	$_->blocking(0) for ($l, $eof_r, $eof_w);
+	local @TO_CLOSE_ATFORK_CHILD = ($l, $eof_w);
+	$l->blocking(0);
 	$l = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
 	my $exit_code;
 	local $quit = sub {
@@ -795,6 +796,7 @@ sub lazy_start {
 		PublicInbox::DS->SetLoopTimeout(1000);
 	};
 	PublicInbox::EOFpipe->new($eof_r, \&noop, undef);
+	undef $eof_r;
 	my $sig = {
 		CHLD => \&PublicInbox::DS::enqueue_reap,
 		QUIT => $quit,
@@ -806,9 +808,10 @@ sub lazy_start {
 	};
 	my $sigfd = PublicInbox::Sigfd->new($sig, SFD_NONBLOCK);
 	local @SIG{keys %$sig} = values(%$sig) unless $sigfd;
+	undef $sig;
 	local $SIG{PIPE} = 'IGNORE';
 	if ($sigfd) { # TODO: use inotify/kqueue to detect unlinked sockets
-		push @TO_CLOSE_ATFORK_CHILD, $sigfd->{sock};
+		undef $sigfd;
 		PublicInbox::DS->SetLoopTimeout(5000);
 	} else {
 		# wake up every second to accept signals if we don't

^ permalink raw reply related	[relevance 68%]

* [PATCH 06/14] lei: rely on localized $current_lei for warnings
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-14  7:06 68% ` [PATCH 05/14] lei: reduce live FD references in wq child Eric Wong
@ 2021-01-14  7:06 66% ` Eric Wong
  2021-01-14  7:06 62% ` [PATCH 08/14] lei q: reinstate smsg dedupe Eric Wong
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

This lets us get rid of the Sys::Syslog import and __WARN__
override in LeiXSearch, though we still need it with
->atfork_child_wq.
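
For illustration only: a minimal sketch of one process-wide
__WARN__ handler consulting a localized global.  $current and
the two log arrays are stand-ins (assumptions) for $current_lei,
err() and syslog():

```perl
use strict;
use warnings;

our $current;   # hypothetical stand-in for $current_lei
my @client_log; # stands in for err($current_lei, ...)
my @sys_log;    # stands in for syslog('warning', ...)

# one process-wide handler, installed once
$SIG{__WARN__} = sub {
	$current ? push(@client_log, "$current: @_")
		: push(@sys_log, "@_");
};

sub dispatch {
	my ($client) = @_;
	local $current = $client; # restored automatically on return
	warn "oops\n";
}

dispatch('client-1');  # routed to the client
warn "daemon-level\n"; # no client active, goes to the system log
print "client: @client_log", "system: @sys_log";
```

Since "local" restores the previous value on scope exit, nested
or failed dispatches should not leave a stale client behind.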
---
 lib/PublicInbox/LEI.pm        | 7 +++++--
 lib/PublicInbox/LeiXSearch.pm | 7 -------
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index fd2b722c..a8fea16d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -26,6 +26,7 @@ use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;
+our $current_lei;
 my ($recv_cmd, $send_cmd);
 my $GLP = Getopt::Long::Parser->new;
 $GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
@@ -447,7 +448,7 @@ sub optparse ($$$) {
 
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
-	local $SIG{__WARN__} = sub { err($self, @_) };
+	local $current_lei = $self; # for __WARN__
 	return _help($self, 'no command given') unless defined($cmd);
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
@@ -849,7 +850,9 @@ sub lazy_start {
 	# STDOUT will cause the calling `lei' client process to finish
 	# reading the <$daemon> pipe.
 	openlog($path, 'pid', 'user');
-	local $SIG{__WARN__} = sub { syslog('warning', "@_") };
+	local $SIG{__WARN__} = sub {
+		$current_lei ? err($current_lei, @_) : syslog('warning', "@_");
+	};
 	my $on_destroy = PublicInbox::OnDestroy->new($$, sub {
 		syslog('crit', "$@") if $@;
 	});
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index d06b6f1d..68889e81 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -8,7 +8,6 @@ package PublicInbox::LeiXSearch;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
-use Sys::Syslog qw(syslog);
 
 sub new {
 	my ($class) = @_;
@@ -187,12 +186,6 @@ sub do_query {
 	}
 }
 
-sub ipc_atfork_child {
-	my ($self) = @_;
-	$SIG{__WARN__} = sub { syslog('warning', "@_") };
-	$self->SUPER::ipc_atfork_child; # PublicInbox::IPC
-}
-
 sub ipc_atfork_prepare {
 	my ($self) = @_;
 	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&=]);

^ permalink raw reply related	[relevance 66%]

* [PATCH 08/14] lei q: reinstate smsg dedupe
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-14  7:06 66% ` [PATCH 06/14] lei: rely on localized $current_lei for warnings Eric Wong
@ 2021-01-14  7:06 62% ` Eric Wong
  2021-01-14  7:06 49% ` [PATCH 11/14] lei: q: lock stdout on overview output Eric Wong
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

Now that dedupe is serialization- and fork-safe, we can
wire it back up in our query results paths.
---
 lib/PublicInbox/LeiQuery.pm   | 5 ++---
 lib/PublicInbox/LeiXSearch.pm | 8 ++++++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 1a3e1193..69d2f9a6 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -26,14 +26,13 @@ sub lei_q {
 	my $sto = $self->_lei_store(1);
 	my $cfg = $self->_lei_cfg(1);
 	my $opt = $self->{opt};
-	require PublicInbox::LeiDedupe;
-	my $dd = PublicInbox::LeiDedupe->new($self);
 
 	# --local is enabled by default
 	# src: LeiXSearch || LeiSearch || Inbox
 	my @srcs;
 	require PublicInbox::LeiXSearch;
 	require PublicInbox::LeiOverview;
+	require PublicInbox::LeiDedupe;
 	my $lxs = PublicInbox::LeiXSearch->new;
 
 	# --external is enabled by default, but allow --no-external
@@ -49,8 +48,8 @@ sub lei_q {
 
 	unshift(@srcs, $sto->search) if $opt->{'local'};
 	# no forking workers after this
-	require PublicInbox::LeiOverview;
 	$self->{ovv} = PublicInbox::LeiOverview->new($self);
+	$self->{dd} = PublicInbox::LeiDedupe->new($self);
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{qstr} = join(' ', map {;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 68889e81..80e7a7f7 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -103,6 +103,8 @@ sub query_thread_mset { # for --thread
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
+	my $dd = $lei->{dd};
+	$dd->prepare_dedupe;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		my $ids = $srch->mset_to_artnums($mset, $mo);
@@ -112,7 +114,7 @@ sub query_thread_mset { # for --thread
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
-				# next if $dd->is_smsg_dup($smsg); TODO
+				next if $dd->is_smsg_dup($smsg);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
 				# $self->out($buf .= $ORS);
@@ -132,11 +134,13 @@ sub query_mset { # non-parallel for non-"--thread" users
 	my $mset;
 	$self->attach_external($_) for @$srcs;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
+	my $dd = $lei->{dd};
+	$dd->prepare_dedupe;
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
 		for my $it ($mset->items) {
 			my $smsg = smsg_for($self, $it) or next;
-			# next if $dd->is_smsg_dup($smsg);
+			next if $dd->is_smsg_dup($smsg);
 			$each_smsg->($smsg, $it);
 			# $self->out($buf .= $ORS) if defined $buf;
 			#$emit_cb->($smsg);

^ permalink raw reply related	[relevance 62%]

* [PATCH 11/14] lei: q: lock stdout on overview output
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
                   ` (4 preceding siblings ...)
  2021-01-14  7:06 62% ` [PATCH 08/14] lei q: reinstate smsg dedupe Eric Wong
@ 2021-01-14  7:06 49% ` Eric Wong
  2021-01-15  0:18 71%   ` Eric Wong
  2021-01-14  7:06 71% ` [PATCH 13/14] lei: remove temporary var on open Eric Wong
  2021-01-14  7:06 48% ` [PATCH 14/14] lei: pass FD to CWD via cmsg, use fchdir on server Eric Wong
  7 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

Most writes to stdout aren't atomic and we need locking to
prevent workers from interleaving and corrupting JSON output.
The one case where stdout won't require locking is when it
points to a regular file opened with O_APPEND, as POSIX
O_APPEND semantics guarantee atomicity.
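
For illustration only (not the code in this patch): a sketch of
the decision above, assuming flock(2) on a separate lock file.
The is_append and emit helpers and the temporary paths are
made up for the example:

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL O_APPEND LOCK_EX LOCK_UN);
use File::Temp ();
use IO::Handle;

# true if $fh was opened with O_APPEND, in which case the kernel
# positions every write(2) at EOF and no lock is taken here
sub is_append {
	my ($fh) = @_;
	my $fl = fcntl($fh, F_GETFL, 0) // die "fcntl: $!";
	($fl & O_APPEND) ? 1 : 0;
}

# hypothetical helper: write $buf to $out_fh, holding an exclusive
# flock(2) on $lock_fh unless $out_fh is in append mode
sub emit {
	my ($lock_fh, $out_fh, $buf) = @_;
	my $need_lock = !is_append($out_fh);
	flock($lock_fh, LOCK_EX) or die "flock: $!" if $need_lock;
	print $out_fh $buf or die "print: $!";
	$out_fh->flush or die "flush: $!";
	flock($lock_fh, LOCK_UN) or die "unlock: $!" if $need_lock;
	$need_lock ? 'locked' : 'unlocked';
}

my $dir = File::Temp->newdir;
open(my $lock_fh, '>', "$dir/lock") or die "open: $!";
open(my $app, '>>', "$dir/out") or die "open: $!";
open(my $plain, '>', "$dir/plain") or die "open: $!";
print emit($lock_fh, $app, "a\n"), "\n";   # append-mode handle
print emit($lock_fh, $plain, "b\n"), "\n"; # ordinary handle
```

The patch itself decides once at setup time rather than on
every write; the per-call check here only keeps the sketch short.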
---
 MANIFEST                       |  1 +
 lib/PublicInbox/LeiOverview.pm | 34 ++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm  |  9 +++++----
 lib/PublicInbox/Lock.pm        |  2 +-
 t/lei_overview.t               | 33 +++++++++++++++++++++++++++++++++
 5 files changed, 74 insertions(+), 5 deletions(-)
 create mode 100644 t/lei_overview.t

diff --git a/MANIFEST b/MANIFEST
index 2ca240fc..0ebdaccc 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -338,6 +338,7 @@ t/kqnotify.t
 t/lei-oneshot.t
 t/lei.t
 t/lei_dedupe.t
+t/lei_overview.t
 t/lei_store.t
 t/lei_to_mail.t
 t/lei_xsearch.t
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index ec0921ba..44c21837 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -6,8 +6,11 @@
 package PublicInbox::LeiOverview;
 use strict;
 use v5.10.1;
+use parent qw(PublicInbox::Lock);
 use POSIX qw(strftime);
+use Fcntl qw(F_GETFL O_APPEND);
 use File::Spec;
+use File::Temp ();
 use PublicInbox::MID qw($MID_EXTRACT);
 use PublicInbox::Address qw(pairs);
 use PublicInbox::Config;
@@ -18,6 +21,23 @@ my $JSONL = 'ldjson|ndjson|jsonl'; # 3 names for the same thing
 
 sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
 
+# we open this in the parent process before ->wq_do handoff
+sub ovv_out_lk_init ($) {
+	my ($self) = @_;
+	$self->{tmp_lk_id} = "$self.$$";
+	my $tmp = File::Temp->new("lei-ovv.out.$$.lock-XXXXXX",
+					TMPDIR => 1, UNLINK => 0);
+	$self->{lock_path} = $tmp->filename;
+}
+
+sub ovv_out_lk_cancel ($) {
+	my ($self) = @_;
+	($self->{tmp_lk_id}//'') eq "$self.$$" and
+		unlink(delete($self->{lock_path}));
+}
+
+*DESTROY = \&ovv_out_lk_cancel;
+
 sub new {
 	my ($class, $lei) = @_;
 	my $opt = $lei->{opt};
@@ -50,8 +70,17 @@ sub new {
 		$isatty = -t $lei->{1};
 		$lei->start_pager if $isatty;
 		$opt->{pretty} //= $isatty;
+		if (!$isatty && -f _) {
+			my $fl = fcntl($lei->{1}, F_GETFL, 0) //
+				return $lei->fail("fcntl(stdout): $!");
+			ovv_out_lk_init($self) unless ($fl & O_APPEND);
+		} else {
+			ovv_out_lk_init($self);
+		}
 	} elsif ($json) {
 		return $lei->fail('JSON formats only output to stdout');
+	} else {
+		return $lei->fail("TODO: $out -f $fmt");
 	}
 	$self;
 }
@@ -109,6 +138,7 @@ sub _unbless_smsg {
 sub ovv_atexit_child {
 	my ($self, $lei) = @_;
 	if (my $bref = delete $lei->{ovv_buf}) {
+		my $lk = $self->lock_for_scope;
 		print { $lei->{1} } $$bref;
 	}
 }
@@ -142,7 +172,9 @@ sub _json_pretty {
 sub ovv_each_smsg_cb {
 	my ($self, $lei) = @_;
 	$lei->{ovv_buf} = \(my $buf = '');
+	delete(@$self{qw(lock_path tmp_lk_id)}) unless $lei->{-parallel};
 	my $json = $self->{json}->new;
+	$lei->{1}->autoflush(1);
 	if ($json) {
 		$json->utf8->canonical;
 		$json->ascii(1) if $lei->{opt}->{ascii};
@@ -164,6 +196,7 @@ sub ovv_each_smsg_cb {
 			} sort keys %$smsg);
 			$buf .= $EOR;
 			if (length($buf) > 65536) {
+				my $lk = $self->lock_for_scope;
 				print { $lei->{1} } $buf;
 				$buf = '';
 			}
@@ -175,6 +208,7 @@ sub ovv_each_smsg_cb {
 			delete @$smsg{qw(tid num)};
 			$buf .= $json->encode(_unbless_smsg(@_)) . $ORS;
 			if (length($buf) > 65536) {
+				my $lk = $self->lock_for_scope;
 				print { $lei->{1} } $buf;
 				$buf = '';
 			}
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 80e7a7f7..ee93e074 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -158,20 +158,21 @@ sub query_done { # PublicInbox::EOFpipe callback
 sub do_query {
 	my ($self, $lei_orig, $srcs) = @_;
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
-
+	my $remotes = $self->{remotes} // [];
 	pipe(my ($eof_wait, $qry_done)) or die "pipe $!";
 	$io[0] = $qry_done; # don't need stdin
-	$io[1]->autoflush(1);
-	$io[2]->autoflush(1);
+
 	if ($lei->{opt}->{thread}) {
+		$lei->{-parallel} = scalar(@$remotes) + scalar(@$srcs) - 1;
 		for my $ibxish (@$srcs) {
 			$self->wq_do('query_thread_mset', \@io, $lei, $ibxish);
 		}
 	} else {
+		$lei->{-parallel} = scalar(@$remotes);
 		$self->wq_do('query_mset', \@io, $lei, $srcs);
 	}
 	# TODO
-	for my $rmt (@{$self->{remotes} // []}) {
+	for my $rmt (@$remotes) {
 		$self->wq_do('query_thread_mbox', \@io, $lei, $rmt);
 	}
 	@io = ();
diff --git a/lib/PublicInbox/Lock.pm b/lib/PublicInbox/Lock.pm
index 2c5ebf27..bb213de4 100644
--- a/lib/PublicInbox/Lock.pm
+++ b/lib/PublicInbox/Lock.pm
@@ -37,7 +37,7 @@ sub lock_release {
 # caller must use return value
 sub lock_for_scope {
 	my ($self, @single_pid) = @_;
-	$self->lock_acquire;
+	lock_acquire($self) or return; # lock_path not set
 	PublicInbox::OnDestroy->new(@single_pid, \&lock_release, $self);
 }
 
diff --git a/t/lei_overview.t b/t/lei_overview.t
new file mode 100644
index 00000000..896cc01a
--- /dev/null
+++ b/t/lei_overview.t
@@ -0,0 +1,33 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use POSIX qw(_exit);
+require_ok 'PublicInbox::LeiOverview';
+
+my $ovv = bless {}, 'PublicInbox::LeiOverview';
+$ovv->ovv_out_lk_init;
+my $lock_path = $ovv->{lock_path};
+ok(-f $lock_path, 'lock init');
+undef $ovv;
+ok(!-f $lock_path, 'lock DESTROY');
+
+$ovv = bless {}, 'PublicInbox::LeiOverview';
+$ovv->ovv_out_lk_init;
+$lock_path = $ovv->{lock_path};
+ok(-f $lock_path, 'lock init #2');
+my $pid = fork // BAIL_OUT "fork $!";
+if ($pid == 0) {
+	undef $ovv;
+	_exit(0);
+}
+is(waitpid($pid, 0), $pid, 'child exited');
+is($?, 0, 'no error in child process');
+ok(-f $lock_path, 'lock was not destroyed by child');
+undef $ovv;
+ok(!-f $lock_path, 'lock DESTROY #2');
+
+done_testing;

^ permalink raw reply related	[relevance 49%]

* [PATCH 02/14] lei: test SIGPIPE, stop xsearch workers on client abort
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
@ 2021-01-14  7:06 23% ` Eric Wong
  2021-01-14  7:06 69% ` [PATCH 04/14] lei: do not unlink socket path at exit Eric Wong
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

The new test ensures consistency between oneshot and
client/daemon users.  Cancelling an in-progress result now also
stops xsearch workers to avoid wasted CPU and I/O.

Note that the lei->atfork_child_wq usage changes in order to
work around a bug in Perl 5:
http://nntp.perl.org/group/perl.perl5.porters/258784
<CAHhgV8hPbcmkzWizp6Vijw921M5BOXixj4+zTh3nRS9vRBYk8w@mail.gmail.com>

This switches the internal protocol to SOCK_SEQPACKET AF_UNIX
sockets so that messages from the daemon telling the client to
run the pager and to kill/exit the client script are never
merged into a single read.
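
A minimal sketch of the record boundaries SOCK_SEQPACKET gives us
(not code from this patch: the helper name seqpacket_roundtrip is
made up for illustration, and it assumes a platform where AF_UNIX
supports SOCK_SEQPACKET, e.g. Linux):

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR);

# hypothetical helper: send each message as its own record, then
# read them back with one recv() call per record
sub seqpacket_roundtrip {
	my (@msgs) = @_;
	socketpair(my $s1, my $s2, AF_UNIX, SOCK_SEQPACKET, 0) or
		die "socketpair: $!";
	for my $msg (@msgs) {
		defined(send($s1, $msg, MSG_EOR)) or die "send: $!";
	}
	my @got;
	for (@msgs) {
		# the 4096-byte limit is far larger than either message,
		# yet recv() stops at the record boundary
		defined(recv($s2, my $buf, 4096, 0)) or die "recv: $!";
		push @got, $buf;
	}
	@got;
}

print "[$_]\n" for seqpacket_roundtrip('exec 1', 'x_it 256');
```

Each recv() should hand back exactly one message even though both
were queued before the first read, which is why the trailing "\n"
delimiters are no longer needed.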
---
 MANIFEST                       |   1 +
 lib/PublicInbox/IPC.pm         |  45 ++++------
 lib/PublicInbox/LEI.pm         | 158 +++++++++++++++++----------------
 lib/PublicInbox/LeiOverview.pm |   5 +-
 lib/PublicInbox/LeiQuery.pm    |  22 ++---
 lib/PublicInbox/LeiXSearch.pm  |  34 +++++--
 script/lei                     |  74 ++++++++++-----
 t/lei.t                        |   2 +-
 xt/lei-sigpipe.t               |  32 +++++++
 9 files changed, 225 insertions(+), 148 deletions(-)
 create mode 100644 xt/lei-sigpipe.t

diff --git a/MANIFEST b/MANIFEST
index 810aec42..2ca240fc 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -429,6 +429,7 @@ xt/git_async_cmp.t
 xt/httpd-async-stream.t
 xt/imapd-mbsync-oimap.t
 xt/imapd-validate.t
+xt/lei-sigpipe.t
 xt/mem-imapd-tls.t
 xt/mem-msgview.t
 xt/msgtime_cmp.t
diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index c54fcc64..fbc91f6f 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -130,7 +130,8 @@ sub ipc_worker_spawn {
 
 sub ipc_worker_reap { # dwaitpid callback
 	my ($self, $pid) = @_;
-	warn "PID:$pid died with \$?=$?\n" if $?;
+	# SIGTERM (15) is our default exit signal
+	warn "PID:$pid died with \$?=$?\n" if $? && ($? & 127) != 15;
 }
 
 # for base class, override in sub classes
@@ -236,50 +237,31 @@ sub ipc_sibling_atfork_child {
 	$pid == $$ and die "BUG: $$ ipc_atfork_child called on itself";
 }
 
-sub _close_recvd ($) {
-	my ($self) = @_;
-	my $x = $self->{-wq_recv_modes};
-	my $end = $x ? $#$x : 2;
-	close($_) for (grep { defined } (delete @$self{0..$end}));
-}
-
 sub wq_worker_loop ($) {
 	my ($self) = @_;
-	my $buf;
 	my $len = $self->{wq_req_len} // (4096 * 33);
-	my ($sub, $args);
 	my $s2 = $self->{-wq_s2} // die 'BUG: no -wq_s2';
-	local $SIG{PIPE} = sub {
-		my $cur_sub = $sub;
-		_close_recvd($self);
-		die(bless(\$cur_sub, 'PublicInbox::SIGPIPE')) if $cur_sub;
-	};
 	while (1) {
-		my (@fds) = $recv_cmd->($s2, $buf, $len) or return; # EOF
-		my $i = 0;
+		my @fds = $recv_cmd->($s2, my $buf, $len) or return; # EOF
 		my @m = @{$self->{-wq_recv_modes} // [qw( +<&= >&= >&= )]};
+		my $nfd = 0;
 		for my $fd (@fds) {
 			my $mode = shift(@m);
 			if (open(my $cmdfh, $mode, $fd)) {
-				$self->{$i++} = $cmdfh;
+				$self->{$nfd++} = $cmdfh;
 				$cmdfh->autoflush(1);
 			} else {
-				die "$$ open($mode$fd) (FD:$i): $!";
+				die "$$ open($mode$fd) (FD:$nfd): $!";
 			}
 		}
 		# Sereal dies on truncated data, Storable returns undef
-		$args = thaw($buf) //
+		my $args = thaw($buf) //
 			die "thaw error on buffer of size:".length($buf);
-		eval {
-			$sub = shift @$args;
-			eval { $self->$sub(@$args) };
-			undef $sub; # quiet SIG{PIPE} handler
-			die $@ if $@;
-		};
+		my $sub = shift @$args;
+		eval { $self->$sub(@$args) };
 		warn "$$ wq_worker: $@" if $@ &&
 					ref($@) ne 'PublicInbox::SIGPIPE';
-		# need to close explicitly to avoid warnings after SIGPIPE
-		_close_recvd($self);
+		delete @$self{0..($nfd-1)};
 	}
 }
 
@@ -400,9 +382,16 @@ sub wq_close {
 	}
 }
 
+sub wq_kill {
+	my ($self, $sig) = @_;
+	my $workers = $self->{-wq_workers} or return;
+	kill($sig // 'TERM', keys %$workers);
+}
+
 sub WQ_MAX_WORKERS { $WQ_MAX_WORKERS }
 
 sub DESTROY {
+	wq_kill($_[0]);
 	wq_close($_[0]);
 	ipc_worker_stop($_[0]);
 }
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7313738e..2889fa76 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -11,13 +11,13 @@ use v5.10.1;
 use parent qw(PublicInbox::DS PublicInbox::LeiExternal
 	PublicInbox::LeiQuery);
 use Getopt::Long ();
-use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
-use Errno qw(EAGAIN ECONNREFUSED ENOENT);
+use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
+use Errno qw(EAGAIN EINTR ECONNREFUSED ENOENT ECONNRESET);
 use POSIX ();
 use IO::Handle ();
 use Sys::Syslog qw(syslog openlog);
 use PublicInbox::Config;
-use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLONESHOT);
+use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLET);
 use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now dwaitpid);
 use PublicInbox::Spawn qw(spawn run_die popen_rd);
@@ -238,16 +238,15 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-sub x_it ($$) { # pronounced "exit"
+# pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
+sub x_it ($$) {
 	my ($self, $code) = @_;
-	$self->{1}->autoflush(1); # make sure client sees stdout before exit
-	my $sig = ($code & 127);
-	$code >>= 8 unless $sig;
+	# make sure client sees stdout before exit
+	$self->{1}->autoflush(1) if $self->{1};
 	if (my $sock = $self->{sock}) {
-		my $fds = [ map { fileno($_) } @$self{0..2} ];
-		$send_cmd->($sock, $fds, "exit=$code\n", 0);
-	} else { # for oneshot
-		$quit->($code);
+		send($sock, "x_it $code", MSG_EOR);
+	} elsif (!($code & 127)) { # oneshot, ignore signals
+		$quit->($code >> 8);
 	}
 }
 
@@ -274,22 +273,20 @@ sub atfork_prepare_wq {
 				grep { defined } @$self{qw(0 1 2 sock)}
 }
 
-# usage: local %SIG = (%SIG, $lei->atfork_child_wq($wq));
+# usage: my %sig = $lei->atfork_child_wq($wq);
+#	 local @SIG{keys %sig} = values %sig;
 sub atfork_child_wq {
 	my ($self, $wq) = @_;
-	return () if $self->{0}; # did not fork
-	$self->{$_} = $wq->{$_} for (0..2);
-	$self->{sock} = $wq->{3} // die 'BUG: no {sock}'; # may be undef
-	my $oldpipe = $SIG{PIPE};
+	@$self{qw(0 1 2 sock)} = delete(@$wq{0..3});
 	%PATH2CFG = ();
 	@TO_CLOSE_ATFORK_CHILD = ();
-	(
-		__WARN__ => sub { err($self, @_) },
-		PIPE => sub {
-			$self->x_it(141);
-			$oldpipe->() if ref($oldpipe) eq 'CODE';
-		}
-	);
+	(__WARN__ => sub { err($self, @_) },
+	PIPE => sub {
+		$self->x_it(13); # SIGPIPE = 13
+		# we need to close explicitly to avoid Perl warning on SIGPIPE
+		close($_) for (delete @$self{1..2});
+		die bless(\"$_[0]", 'PublicInbox::SIGPIPE'),
+	});
 }
 
 # usage: ($lei, @io) = $lei->atfork_parent_wq($wq);
@@ -300,9 +297,9 @@ sub atfork_parent_wq {
 		my $ret = bless { %$self }, ref($self);
 		$self->{env} = $env;
 		delete @$ret{qw(-lei_store cfg pgr)};
-		($ret, delete @$ret{qw(0 1 2 sock)});
+		($ret, delete @$ret{0..2}, delete($ret->{sock}) // ());
 	} else {
-		($self, @$self{qw(0 1 2 sock)});
+		($self, @$self{0..2}, $self->{sock} // ());
 	}
 }
 
@@ -647,7 +644,7 @@ sub start_pager {
 		my $buf = "exec 1\0".$pager;
 		while (my ($k, $v) = each %new_env) { $buf .= "\0$k=$v" };
 		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
-		$send_cmd->($sock, $fds, $buf .= "\n", 0);
+		$send_cmd->($sock, $fds, $buf, MSG_EOR);
 	} else {
 		$pgr->[0] = spawn([$pager], $env, $rdr);
 	}
@@ -660,50 +657,39 @@ sub start_pager {
 sub stop_pager {
 	my ($self) = @_;
 	my $pgr = delete($self->{pgr}) or return;
-	my $pid = $pgr->[0];
-	close $self->{1};
-	# {2} may not be redirected
-	$self->{1} = $pgr->[1];
 	$self->{2} = $pgr->[2];
+	# do not restore original stdout, just close it so we error out
+	close(delete($self->{1})) if $self->{1};
+	my $pid = $pgr->[0];
 	dwaitpid($pid, undef, $self->{sock}) if $pid;
 }
 
 sub accept_dispatch { # Listener {post_accept} callback
 	my ($sock) = @_; # ignore other
-	$sock->blocking(1);
 	$sock->autoflush(1);
 	my $self = bless { sock => $sock }, __PACKAGE__;
-	vec(my $rin = '', fileno($sock), 1) = 1;
-	# `say $sock' triggers "die" in lei(1)
-	my $buf;
-	if (select(my $rout = $rin, undef, undef, 1)) {
-		my @fds = $recv_cmd->($sock, $buf, 4096 * 33); # >MAX_ARG_STRLEN
-		if (scalar(@fds) == 3) {
-			my $i = 0;
-			for my $rdr (qw(<&= >&= >&=)) {
-				my $fd = shift(@fds);
-				if (open(my $fh, $rdr, $fd)) {
-					$self->{$i++} = $fh;
-				}  else {
-					say $sock "open($rdr$fd) (FD=$i): $!";
-					return;
-				}
+	vec(my $rvec, fileno($sock), 1) = 1;
+	select($rvec, undef, undef, 1) or
+		return send($sock, 'timed out waiting to recv FDs', MSG_EOR);
+	my @fds = $recv_cmd->($sock, my $buf, 4096 * 33); # >MAX_ARG_STRLEN
+	if (scalar(@fds) == 3) {
+		my $i = 0;
+		for my $rdr (qw(<&= >&= >&=)) {
+			my $fd = shift(@fds);
+			if (open(my $fh, $rdr, $fd)) {
+				$self->{$i++} = $fh;
+				next;
 			}
-		} else {
-			say $sock "recv_cmd failed: $!";
-			return;
+			return send($sock, "open($rdr$fd) (FD=$i): $!", MSG_EOR);
 		}
 	} else {
-		say $sock "timed out waiting to recv FDs";
-		return;
+		return send($sock, "recv_cmd failed: $!", MSG_EOR);
 	}
 	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	# $ENV_STR = join('', map { "\0$_=$ENV{$_}" } keys %ENV);
 	# $buf = "$$\0$argc\0".join("\0", @ARGV).$ENV_STR."\0\0";
-	if (substr($buf, -2, 2, '') ne "\0\0") { # s/\0\0\z//
-		say $sock "request command truncated";
-		return;
-	}
+	substr($buf, -2, 2, '') eq "\0\0" or  # s/\0\0\z//
+		return send($sock, 'request command truncated', MSG_EOR);
 	my ($argc, @argv) = split(/\0/, $buf, -1);
 	undef $buf;
 	my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
@@ -711,23 +697,50 @@ sub accept_dispatch { # Listener {post_accept} callback
 		local %ENV = %env;
 		$self->{env} = \%env;
 		eval { dispatch($self, @argv) };
-		say $sock $@ if $@;
+		send($sock, $@, MSG_EOR) if $@;
 	} else {
-		say $sock "chdir($env{PWD}): $!"; # implicit close
+		send($sock, "chdir($env{PWD}): $!", MSG_EOR); # implicit close
 	}
 }
 
+sub dclose {
+	my ($self) = @_;
+	delete $self->{lxs}; # stops LeiXSearch queries
+	$self->close; # PublicInbox::DS::close
+}
+
 # for long-running results
 sub event_step {
 	my ($self) = @_;
 	local %ENV = %{$self->{env}};
-	eval {}; # TODO
-	if ($@) {
-		say { $self->{sock} } $@;
-		$self->close; # PublicInbox::DS::close
+	my $sock = $self->{sock};
+	eval {
+		while (my @fds = $recv_cmd->($sock, my $buf, 4096)) {
+			if (scalar(@fds) == 1 && !defined($fds[0])) {
+				return if $! == EAGAIN;
+				next if $! == EINTR;
+				last if $! == ECONNRESET;
+				die "recvmsg: $!";
+			}
+			for my $fd (@fds) {
+				open my $rfh, '+<&=', $fd;
+			}
+			die "unrecognized client signal: $buf";
+		}
+		dclose($self);
+	};
+	if (my $err = $@) {
+		eval { $self->fail($err) };
+		dclose($self);
 	}
 }
 
+sub event_step_init {
+	my ($self) = @_;
+	$self->{sock}->blocking(0);
+	$self->SUPER::new($self->{sock}, EPOLLIN|EPOLLET);
+}
+
 sub noop {}
 
 our $oldset; sub oldset { $oldset }
@@ -742,7 +755,7 @@ sub lazy_start {
 		die "connect($path): $!";
 	}
 	umask(077) // die("umask(077): $!");
-	socket(my $l, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
+	socket(my $l, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
 	bind($l, pack_sockaddr_un($path)) or die "bind($path): $!";
 	listen($l, 1024) or die "listen: $!";
 	my @st = stat($path) or die "stat($path): $!";
@@ -793,7 +806,7 @@ sub lazy_start {
 		USR2 => \&noop,
 	};
 	my $sigfd = PublicInbox::Sigfd->new($sig, SFD_NONBLOCK);
-	local %SIG = (%SIG, %$sig) if !$sigfd;
+	local @SIG{keys %$sig} = values(%$sig) unless $sigfd;
 	local $SIG{PIPE} = 'IGNORE';
 	if ($sigfd) { # TODO: use inotify/kqueue to detect unlinked sockets
 		push @TO_CLOSE_ATFORK_CHILD, $sigfd->{sock};
@@ -853,24 +866,19 @@ sub oneshot {
 	local $quit = $exit if $exit;
 	local %PATH2CFG;
 	umask(077) // die("umask(077): $!");
-	local $SIG{PIPE} = sub { die(bless(\"$_[0]", 'PublicInbox::SIGPIPE')) };
-	eval {
-		my $self = bless {
-			0 => *STDIN{GLOB},
-			1 => *STDOUT{GLOB},
-			2 => *STDERR{GLOB},
-			env => \%ENV
-		}, __PACKAGE__;
-		dispatch($self, @ARGV);
-	};
-	die $@ if $@ && ref($@) ne 'PublicInbox::SIGPIPE';
+	dispatch((bless {
+		0 => *STDIN{GLOB},
+		1 => *STDOUT{GLOB},
+		2 => *STDERR{GLOB},
+		env => \%ENV
+	}, __PACKAGE__), @ARGV);
 }
 
 # ensures stdout hits the FS before sock disconnects so a client
 # can immediately reread it
 sub DESTROY {
 	my ($self) = @_;
-	$self->{1}->autoflush(1);
+	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
 }
 
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 8a1f4f82..194c5e28 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -108,8 +108,9 @@ sub _unbless_smsg {
 
 sub ovv_atexit_child {
 	my ($self, $lei) = @_;
-	my $bref = delete $lei->{ovv_buf} or return;
-	print { $lei->{1} } $$bref;
+	if (my $bref = delete $lei->{ovv_buf}) {
+		print { $lei->{1} } $$bref;
+	}
 }
 
 # JSON module ->pretty output wastes too much vertical white space,
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 7ca01454..1a3e1193 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -40,14 +40,13 @@ sub lei_q {
 	if ($opt->{external} // 1) {
 		$self->_externals_each(\&_vivify_external, \@srcs);
 	}
-	my $j = $opt->{jobs} // scalar(@srcs) > 3 ? 3 : scalar(@srcs);
+	my $j = $opt->{jobs} // (scalar(@srcs) > 3 ? 3 : scalar(@srcs));
 	$j = 1 if !$opt->{thread};
 	$j++ if $opt->{'local'}; # for sto->search below
-	if ($self->{sock}) {
-		$self->atfork_prepare_wq($lxs);
-		$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
-			// $lxs->wq_workers($j);
-	}
+	$self->atfork_prepare_wq($lxs);
+	$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
+		// $lxs->wq_workers($j);
+
 	unshift(@srcs, $sto->search) if $opt->{'local'};
 	# no forking workers after this
 	require PublicInbox::LeiOverview;
@@ -77,16 +76,7 @@ sub lei_q {
 	# my $wcb = PublicInbox::LeiToMail->write_cb($out, $self);
 	$self->{mset_opt} = \%mset_opt;
 	$self->{ovv}->ovv_begin($self);
-	pipe(my ($eof_wait, $qry_done)) or die "pipe $!";
-	require PublicInbox::EOFpipe;
-	my $eof = PublicInbox::EOFpipe->new($eof_wait, \&query_done, $self);
-	$lxs->do_query($self, $qry_done, \@srcs);
-	$eof->event_step unless $self->{sock};
-}
-
-sub query_done { # PublicInbox::EOFpipe callback
-	my ($self) = @_;
-	$self->{ovv}->ovv_end($self);
+	$lxs->do_query($self, \@srcs);
 }
 
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index c030b2b2..d06b6f1d 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -92,7 +92,9 @@ sub _mset_more ($$) {
 
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
-	local %SIG = (%SIG, $lei->atfork_child_wq($self));
+	my %sig = $lei->atfork_child_wq($self);
+	local @SIG{keys %sig} = values %sig;
+
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
 	unless ($srch && $over) {
 		my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
@@ -125,9 +127,10 @@ sub query_thread_mset { # for --thread
 
 sub query_mset { # non-parallel for non-"--thread" users
 	my ($self, $lei, $srcs) = @_;
+	my %sig = $lei->atfork_child_wq($self);
+	local @SIG{keys %sig} = values %sig;
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
-	local %SIG = (%SIG, $lei->atfork_child_wq($self));
 	$self->attach_external($_) for @$srcs;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	do {
@@ -143,9 +146,17 @@ sub query_mset { # non-parallel for non-"--thread" users
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
+sub query_done { # PublicInbox::EOFpipe callback
+	my ($lei) = @_;
+	$lei->{ovv}->ovv_end($lei);
+	$lei->dclose;
+}
+
 sub do_query {
-	my ($self, $lei_orig, $qry_done, $srcs) = @_;
+	my ($self, $lei_orig, $srcs) = @_;
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
+
+	pipe(my ($eof_wait, $qry_done)) or die "pipe $!";
 	$io[0] = $qry_done; # don't need stdin
 	$io[1]->autoflush(1);
 	$io[2]->autoflush(1);
@@ -160,9 +171,20 @@ sub do_query {
 	for my $rmt (@{$self->{remotes} // []}) {
 		$self->wq_do('query_thread_mbox', \@io, $lei, $rmt);
 	}
-
-	# sent off to children, they will drop remaining references to it
-	close $qry_done;
+	@io = ();
+	close $qry_done; # fully closed when children are done
+
+	# query_done will run when query_*mset close $qry_done
+	if ($lei_orig->{sock}) { # watch for client premature exit
+		require PublicInbox::EOFpipe;
+		PublicInbox::EOFpipe->new($eof_wait, \&query_done, $lei_orig);
+		$lei_orig->{lxs} = $self;
+		$lei_orig->event_step_init;
+	} else {
+		$self->wq_close;
+		read($eof_wait, my $buf, 1); # wait for close($lei->{0})
+		query_done($lei_orig); # may SIGPIPE
+	}
 }
 
 sub ipc_atfork_child {
diff --git a/script/lei b/script/lei
index 5c32ab88..9610a876 100755
--- a/script/lei
+++ b/script/lei
@@ -3,32 +3,47 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict;
 use v5.10.1;
-use Socket qw(AF_UNIX SOCK_STREAM pack_sockaddr_un);
+use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
+use Errno qw(EINTR ECONNRESET);
 use PublicInbox::CmdIPC4;
 my $narg = 4;
+my ($sock, $pwd);
 my $recv_cmd = PublicInbox::CmdIPC4->can('recv_cmd4');
 my $send_cmd = PublicInbox::CmdIPC4->can('send_cmd4') // do {
 	require PublicInbox::Spawn; # takes ~50ms even if built *sigh*
-	$narg = 4;
 	$recv_cmd = PublicInbox::Spawn->can('recv_cmd4');
 	PublicInbox::Spawn->can('send_cmd4');
 };
 
+sub sigchld {
+	my ($sig) = @_;
+	my $flags = $sig ? POSIX::WNOHANG() : 0;
+	while (waitpid(-1, $flags) > 0) {}
+}
+
 sub exec_cmd {
 	my ($fds, $argc, @argv) = @_;
-	my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
-	my @m = (*STDIN{IO}, '<&=',  *STDOUT{IO}, '>&=',
-		*STDERR{IO}, '>&=');
+	my @m = (*STDIN{IO}, '<&=',  *STDOUT{IO}, '>&=', *STDERR{IO}, '>&=');
+	my @rdr;
 	for my $fd (@$fds) {
 		my ($old_io, $mode) = splice(@m, 0, 2);
-		open($old_io, $mode, $fd) or die "open $mode$fd: $!";
+		open(my $tmpfh, $mode, $fd) or die "open $mode$fd: $!";
+		push @rdr, $old_io, $mode, $tmpfh;
+	}
+	require POSIX; # WNOHANG
+	$SIG{CHLD} = \&sigchld;
+	my $pid = fork // die "fork: $!";
+	if ($pid == 0) {
+		my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
+		while (my ($old_io, $mode, $tmpfh) = splice(@rdr, 0, 3)) {
+			open $old_io, $mode, $tmpfh or die "open $mode: $!";
+		}
+		%ENV = (%ENV, %env);
+		exec(@argv);
+		die "exec: @argv: $!";
 	}
-	%ENV = (%ENV, %env);
-	exec(@argv);
-	die "exec: @argv: $!";
 }
 
-my ($sock, $pwd);
 if ($send_cmd && eval {
 	my $path = do {
 		my $runtime_dir = ($ENV{XDG_RUNTIME_DIR} // '') . '/lei';
@@ -40,10 +55,10 @@ if ($send_cmd && eval {
 			require File::Path;
 			File::Path::mkpath($runtime_dir, 0, 0700);
 		}
-		"$runtime_dir/$narg.sock";
+		"$runtime_dir/$narg.seq.sock";
 	};
 	my $addr = pack_sockaddr_un($path);
-	socket($sock, AF_UNIX, SOCK_STREAM, 0) or die "socket: $!";
+	socket($sock, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
 	unless (connect($sock, $addr)) { # start the daemon if not started
 		local $ENV{PERL5LIB} = join(':', @INC);
 		open(my $daemon, '-|', $^X, qw[-MPublicInbox::LEI
@@ -73,22 +88,41 @@ Falling back to (slow) one-shot mode
 	}
 	1;
 }) { # (Socket::MsgHdr|Inline::C), $sock, $pwd are all available:
-	local $ENV{PWD} = $pwd;
+	$ENV{PWD} = $pwd;
 	my $buf = join("\0", scalar(@ARGV), @ARGV);
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
-	select $sock;
-	$| = 1; # unbuffer selected $sock
-	$send_cmd->($sock, [ 0, 1, 2 ], $buf, 0);
-	while (my (@fds) = $recv_cmd->($sock, $buf, 4096 * 33)) {
-		if ($buf =~ /\Aexit=([0-9]+)\n\z/) {
-			exit($1);
-		} elsif ($buf =~ /\Aexec (.+)\n\z/) {
+	$send_cmd->($sock, [ 0, 1, 2 ], $buf, MSG_EOR);
+	$SIG{TERM} = $SIG{INT} = $SIG{QUIT} = sub {
+		my ($sig) = @_; # 'TERM', not an integer :<
+		$SIG{$sig} = 'DEFAULT';
+		kill($sig, $$); # exit($signo + 128)
+	};
+	my $x_it_code = 0;
+	while (1) {
+		my (@fds) = $recv_cmd->($sock, $buf, 4096 * 33);
+		if (scalar(@fds) == 1 && !defined($fds[0])) {
+			last if $! == ECONNRESET;
+			next if $! == EINTR;
+			die "recvmsg: $!";
+		}
+		last if $buf eq '';
+		if ($buf =~ /\Ax_it ([0-9]+)\z/) {
+			$x_it_code = $1 + 0;
+			last;
+		} elsif ($buf =~ /\Aexec (.+)\z/) {
 			exec_cmd(\@fds, split(/\0/, $1));
 		} else {
+			sigchld();
 			die $buf;
 		}
 	}
+	sigchld();
+	if (my $sig = ($x_it_code & 127)) {
+		kill $sig, $$;
+		sleep;
+	}
+	exit($x_it_code >> 8);
 } else { # for systems lacking Socket::MsgHdr or Inline::C
 	warn $@ if $@;
 	require PublicInbox::LEI;
diff --git a/t/lei.t b/t/lei.t
index 6819f182..3ebaade6 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -215,7 +215,7 @@ SKIP: { # real socket
 	skip 'Socket::MsgHdr or Inline::C missing or unconfigured', $nr;
 
 	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
-	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/$nfd.sock";
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/$nfd.seq.sock";
 
 	ok($lei->('daemon-pid'), 'daemon-pid');
 	is($err, '', 'no error from daemon-pid');
diff --git a/xt/lei-sigpipe.t b/xt/lei-sigpipe.t
new file mode 100644
index 00000000..4d35bbb3
--- /dev/null
+++ b/xt/lei-sigpipe.t
@@ -0,0 +1,32 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use v5.10.1;
+use Test::More;
+use PublicInbox::TestCommon;
+use POSIX qw(WTERMSIG WIFSIGNALED SIGPIPE);
+require_mods(qw(json DBD::SQLite Search::Xapian));
+# XXX this needs an already configured lei instance with many messages
+
+my $do_test = sub {
+	my $env = shift // {};
+	pipe(my ($r, $w)) or BAIL_OUT $!;
+	open my $err, '+>', undef or BAIL_OUT $!;
+	my $opt = { run_mode => 0, 1 => $w, 2 => $err };
+	my $tp = start_script([qw(lei q -t), 'bytes:1..'], $env, $opt);
+	close $w;
+	sysread($r, my $buf, 1);
+	close $r; # trigger SIGPIPE
+	$tp->join;
+	ok(WIFSIGNALED($?), 'signaled');
+	is(WTERMSIG($?), SIGPIPE, 'got SIGPIPE');
+	seek($err, 0, 0);
+	my @err = grep(!m{mkdir /dev/null\b}, <$err>);
+	is_deeply(\@err, [], 'no errors');
+};
+
+$do_test->();
+$do_test->({XDG_RUNTIME_DIR => '/dev/null'});
+
+done_testing;

^ permalink raw reply related	[relevance 23%]

* [PATCH 13/14] lei: remove temporary var on open
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
                   ` (5 preceding siblings ...)
  2021-01-14  7:06 49% ` [PATCH 11/14] lei: q: lock stdout on overview output Eric Wong
@ 2021-01-14  7:06 71% ` Eric Wong
  2021-01-14  7:06 48% ` [PATCH 14/14] lei: pass FD to CWD via cmsg, use fchdir on server Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

We can place the IO/GLOB ref directly into $self, here.
---
 lib/PublicInbox/LEI.pm | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index a8fea16d..9786e7ac 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -678,11 +678,8 @@ sub accept_dispatch { # Listener {post_accept} callback
 		my $i = 0;
 		for my $rdr (qw(<&= >&= >&=)) {
 			my $fd = shift(@fds);
-			if (open(my $fh, $rdr, $fd)) {
-				$self->{$i++} = $fh;
-				next;
-			}
-			return send($sock, "open($rdr$fd) (FD=$i): $!", MSG_EOR);
+			open($self->{$i++}, $rdr, $fd) and next;
+			send($sock, "open($rdr$fd) (FD=$i): $!", MSG_EOR);
 		}
 	} else {
 		return send($sock, "recv_cmd failed: $!", MSG_EOR);

^ permalink raw reply related	[relevance 71%]

* [PATCH 14/14] lei: pass FD to CWD via cmsg, use fchdir on server
  2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
                   ` (6 preceding siblings ...)
  2021-01-14  7:06 71% ` [PATCH 13/14] lei: remove temporary var on open Eric Wong
@ 2021-01-14  7:06 48% ` Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-14  7:06 UTC (permalink / raw)
  To: meta

Perl chdir() automatically does fchdir(2) if given a file
or directory handle since 5.8.8/5.10.0, so we can safely
rely on it given our 5.10.1+ requirement.

This means we no longer have to waste several milliseconds
loading Cwd.so and making stat() calls to ensure ENV{PWD} is
correct and usable in the server.  It also lets us work in
directories that are no longer accessible via pathname.
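
A minimal sketch of the chdir-on-a-handle behavior (not code from
this patch: chdir_via_handle is a made-up name, and it assumes a
platform with fchdir(2) where open() may read-open a directory,
e.g. Linux):

```perl
use strict;
use warnings;
use Cwd ();
use File::Temp ();

# hypothetical helper: enter $dir through an already-open handle;
# the pathname is not consulted again after open()
sub chdir_via_handle {
	my ($dir) = @_;
	open(my $dh, '<', $dir) or die "open($dir): $!";
	chdir($dh) or die "fchdir: $!"; # fchdir(2) under the hood
	Cwd::getcwd();
}

my $orig = Cwd::getcwd();
my $tmp = File::Temp->newdir;
my $got = chdir_via_handle("$tmp");
chdir($orig) or die "chdir($orig): $!"; # let File::Temp clean up
print "entered $got\n";
```

In lei itself the handle arrives over the socket as a fourth FD
instead of being opened by the server, but the chdir() call is
the same.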
---
 lib/PublicInbox/LEI.pm | 14 +++++++-------
 script/lei             | 18 +++---------------
 t/lei.t                | 27 ++-------------------------
 3 files changed, 12 insertions(+), 47 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9786e7ac..1f4a3082 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -674,9 +674,9 @@ sub accept_dispatch { # Listener {post_accept} callback
 	select($rvec, undef, undef, 1) or
 		return send($sock, 'timed out waiting to recv FDs', MSG_EOR);
 	my @fds = $recv_cmd->($sock, my $buf, 4096 * 33); # >MAX_ARG_STRLEN
-	if (scalar(@fds) == 3) {
+	if (scalar(@fds) == 4) {
 		my $i = 0;
-		for my $rdr (qw(<&= >&= >&=)) {
+		for my $rdr (qw(<&= >&= >&= <&=)) {
 			my $fd = shift(@fds);
 			open($self->{$i++}, $rdr, $fd) and next;
 			send($sock, "open($rdr$fd) (FD=$i): $!", MSG_EOR);
@@ -692,13 +692,13 @@ sub accept_dispatch { # Listener {post_accept} callback
 	my ($argc, @argv) = split(/\0/, $buf, -1);
 	undef $buf;
 	my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
-	if (chdir($env{PWD})) {
+	if (chdir(delete($self->{3}))) {
 		local %ENV = %env;
 		$self->{env} = \%env;
 		eval { dispatch($self, @argv) };
 		send($sock, $@, MSG_EOR) if $@;
 	} else {
-		send($sock, "chdir($env{PWD}): $!", MSG_EOR); # implicit close
+		send($sock, "fchdir: $!", MSG_EOR); # implicit close
 	}
 }
 
@@ -746,7 +746,7 @@ our $oldset; sub oldset { $oldset }
 
 # lei(1) calls this when it can't connect
 sub lazy_start {
-	my ($path, $errno, $nfd) = @_;
+	my ($path, $errno, $narg) = @_;
 	if ($errno == ECONNREFUSED) {
 		unlink($path) or die "unlink($path): $!";
 	} elsif ($errno != ENOENT) {
@@ -761,7 +761,7 @@ sub lazy_start {
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
 	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
 	local $oldset = PublicInbox::DS::block_signals();
-	if ($nfd == 4) {
+	if ($narg == 5) {
 		$send_cmd = PublicInbox::Spawn->can('send_cmd4');
 		$recv_cmd = PublicInbox::Spawn->can('recv_cmd4') // do {
 			require PublicInbox::CmdIPC4;
@@ -770,7 +770,7 @@ sub lazy_start {
 		};
 	}
 	$recv_cmd or die <<"";
-(Socket::MsgHdr || Inline::C) missing/unconfigured (nfd=$nfd);
+(Socket::MsgHdr || Inline::C) missing/unconfigured (narg=$narg);
 
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
diff --git a/script/lei b/script/lei
index 9610a876..a4a0217b 100755
--- a/script/lei
+++ b/script/lei
@@ -6,7 +6,7 @@ use v5.10.1;
 use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
 use Errno qw(EINTR ECONNRESET);
 use PublicInbox::CmdIPC4;
-my $narg = 4;
+my $narg = 5;
 my ($sock, $pwd);
 my $recv_cmd = PublicInbox::CmdIPC4->can('recv_cmd4');
 my $send_cmd = PublicInbox::CmdIPC4->can('send_cmd4') // do {
@@ -74,25 +74,13 @@ connect($path): $! (after attempted daemon start)
 Falling back to (slow) one-shot mode
 
 	}
-	require Cwd;
-	$pwd = $ENV{PWD} // '';
-	my $cwd = Cwd::fastcwd() // die "fastcwd(PWD=$pwd): $!";
-	if ($pwd ne $cwd) { # prefer ENV{PWD} if it's a symlink to real cwd
-		my @st_cwd = stat($cwd) or die "stat(cwd=$cwd): $!";
-		my @st_pwd = stat($pwd); # PWD invalid, use cwd
-		# make sure st_dev/st_ino match for {PWD} to be valid
-		$pwd = $cwd if (!@st_pwd || $st_pwd[1] != $st_cwd[1] ||
-					$st_pwd[0] != $st_cwd[0]);
-	} else {
-		$pwd = $cwd;
-	}
 	1;
 }) { # (Socket::MsgHdr|Inline::C), $sock, $pwd are all available:
-	$ENV{PWD} = $pwd;
+	open my $dh, '<', '.' or die "open(.) $!";
 	my $buf = join("\0", scalar(@ARGV), @ARGV);
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
-	$send_cmd->($sock, [ 0, 1, 2 ], $buf, MSG_EOR);
+	$send_cmd->($sock, [ 0, 1, 2, fileno($dh) ], $buf, MSG_EOR);
 	$SIG{TERM} = $SIG{INT} = $SIG{QUIT} = sub {
 		my ($sig) = @_; # 'TERM', not an integer :<
 		$SIG{$sig} = 'DEFAULT';
diff --git a/t/lei.t b/t/lei.t
index 240735bf..2349dca4 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -208,9 +208,9 @@ if ($ENV{TEST_LEI_ONESHOT}) {
 
 SKIP: { # real socket
 	require_mods(qw(Cwd), my $nr = 105);
-	my $nfd = eval { require Socket::MsgHdr; 4 } // do {
+	my $nfd = eval { require Socket::MsgHdr; 5 } // do {
 		require PublicInbox::Spawn;
-		PublicInbox::Spawn->can('send_cmd4') ? 4 : undef;
+		PublicInbox::Spawn->can('send_cmd4') ? 5 : undef;
 	} //
 	skip 'Socket::MsgHdr or Inline::C missing or unconfigured', $nr;
 
@@ -260,29 +260,6 @@ SKIP: { # real socket
 		like($out, qr/^usage: /, 'help output works');
 		chmod 0700, $sock or BAIL_OUT "chmod 0700: $!";
 	}
-	if ('oneshot on cwd gone') {
-		my $cwd = Cwd::fastcwd() or BAIL_OUT "fastcwd: $!";
-		my $d = "$home/to-be-removed";
-		my $lei_path = 'lei';
-		# we chdir, so we need an abs_path fur run_script
-		if (($ENV{TEST_RUN_MODE}//2) != 2) {
-			$lei_path = PublicInbox::TestCommon::key2script('lei');
-			$lei_path = Cwd::abs_path($lei_path);
-		}
-		mkdir $d or BAIL_OUT "mkdir($d) $!";
-		chdir $d or BAIL_OUT "chdir($d) $!";
-		if (rmdir($d)) {
-			$out = $err = '';
-			ok(run_script([$lei_path, 'help'], undef, $opt),
-				'cwd fail, one-shot fallback works');
-		} else {
-			$err = "rmdir=$!";
-		}
-		chdir $cwd or BAIL_OUT "chdir($cwd) $!";
-		like($err, qr/cwd\(/, 'cwd error noted');
-		like($out, qr/^usage: /, 'help output still works');
-	}
-
 	unlink $sock or BAIL_OUT "unlink($sock) $!";
 	for (0..100) {
 		kill('CHLD', $new_pid) or last;

^ permalink raw reply related	[relevance 48%]

* Re: [PATCH 11/14] lei: q: lock stdout on overview output
  2021-01-14  7:06 49% ` [PATCH 11/14] lei: q: lock stdout on overview output Eric Wong
@ 2021-01-15  0:18 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-15  0:18 UTC (permalink / raw)
  To: meta

Will squash this to fix a warning:

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 44c21837..ef5f27c1 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -36,8 +36,6 @@ sub ovv_out_lk_cancel ($) {
 		unlink(delete($self->{lock_path}));
 }
 
-*DESTROY = \&ovv_out_lk_cancel;
-
 sub new {
 	my ($class, $lei) = @_;
 	my $opt = $lei->{opt};
@@ -220,4 +218,7 @@ sub ovv_each_smsg_cb {
 	} # else { ...
 }
 
+no warnings 'once';
+*DESTROY = \&ovv_out_lk_cancel;
+
 1;

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/4] lei q: outputs to Maildir and mbox* working
@ 2021-01-16 11:36 71% Eric Wong
  2021-01-16 11:36 22% ` [PATCH 3/4] lei: q: results output " Eric Wong
  2021-01-16 11:36 71% ` [PATCH 4/4] lei: pager: pass correct env in oneshot mode Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-01-16 11:36 UTC (permalink / raw)
  To: meta

Only lightly-tested but this is the key "inspired by mairix"
part.  It's slow compared to mairix due to git storage and not
being able to use hardlinks, but git blob extraction will be
parallelizable.

Eric Wong (4):
  lei_to_mail: prepare for worker offload
  ipc: children don't kill on DESTROY, reduce FD sharing
  lei: q: results output to Maildir and mbox* working
  lei: pager: pass correct env in oneshot mode

 MANIFEST                       |   1 +
 lib/PublicInbox/IPC.pm         |  21 ++--
 lib/PublicInbox/LEI.pm         |  30 +++--
 lib/PublicInbox/LeiDedupe.pm   |   3 +-
 lib/PublicInbox/LeiOverview.pm |  60 ++++++----
 lib/PublicInbox/LeiQuery.pm    |  14 +--
 lib/PublicInbox/LeiToMail.pm   | 206 +++++++++++++++++++++------------
 lib/PublicInbox/LeiXSearch.pm  | 119 ++++++++++++++-----
 lib/PublicInbox/OpPipe.pm      |  41 +++++++
 t/lei.t                        |  20 ++++
 t/lei_to_mail.t                |  64 +++++-----
 11 files changed, 398 insertions(+), 181 deletions(-)
 create mode 100644 lib/PublicInbox/OpPipe.pm

^ permalink raw reply	[relevance 71%]

* [PATCH 4/4] lei: pager: pass correct env in oneshot mode
  2021-01-16 11:36 71% [PATCH 0/4] lei q: outputs to Maildir and mbox* working Eric Wong
  2021-01-16 11:36 22% ` [PATCH 3/4] lei: q: results output " Eric Wong
@ 2021-01-16 11:36 71% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-01-16 11:36 UTC (permalink / raw)
  To: meta

Pass the newly built %new_env rather than the existing $env when
spawning the pager from oneshot (non-daemon) mode.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f849c9df..56254c45 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -656,7 +656,7 @@ sub start_pager {
 		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
 		$send_cmd->($sock, $fds, $buf, MSG_EOR);
 	} else {
-		$pgr->[0] = spawn([$pager], $env, $rdr);
+		$pgr->[0] = spawn([$pager], \%new_env, $rdr);
 	}
 	$self->{1} = $wpager;
 	$self->{2} = $wpager if -t $self->{2};
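
A rough sketch of how pager defaults can be layered onto an existing
environment without clobbering what the user already set.  The default
values are the ones start_pager uses elsewhere in this series; the
pager_env() helper name is hypothetical, and how spawn() merges its $env
argument with %ENV is not shown here.

```perl
use strict;
use warnings;

# returns only the defaults which the given environment lacks
sub pager_env {
	my ($env) = @_; # hashref of the existing environment
	my %new_env = (LESS => 'FRX', LV => '-c', COLUMNS => 80);
	delete @new_env{keys %$env}; # only set iff unset
	\%new_env;
}

my $new_env = pager_env({ LESS => 'R', HOME => '/home/user' });
print join(' ', map { "$_=$new_env->{$_}" } sort keys %$new_env), "\n";
```

Here the caller's LESS is left alone, and only the missing LV and COLUMNS
are returned for the pager process.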

^ permalink raw reply related	[relevance 71%]

* [PATCH 3/4] lei: q: results output to Maildir and mbox* working
  2021-01-16 11:36 71% [PATCH 0/4] lei q: outputs to Maildir and mbox* working Eric Wong
@ 2021-01-16 11:36 22% ` Eric Wong
  2021-01-16 11:36 71% ` [PATCH 4/4] lei: pager: pass correct env in oneshot mode Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-01-16 11:36 UTC (permalink / raw)
  To: meta

All the augment and deduplication stuff seems to be working
based on unit tests.  OpPipe is a nice general addition that
will probably make future state machines easier.
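
To illustrate the OpPipe idea outside of the DS event loop, here is a
minimal blocking sketch: one byte is read from a pipe and looked up in a
dispatch table of "byte => [ sub, @operands ]", with the empty string
standing for EOF.  The op_step() name and the operands are made up for
illustration; the real module is in the diff below.

```perl
use strict;
use warnings;

my @ran; # records which ops fired, in order

# byte => [ sub, @operands ]; '' is what sysread leaves behind on EOF
my %op_map = (
	'.' => [ sub { push @ran, "start @_" }, 'query' ],
	'' => [ sub { push @ran, 'done' } ],
);

# read one byte (blocking) and run the matching sub; returns 0 on EOF
sub op_step {
	my ($rd, $op_map) = @_;
	my $n = sysread($rd, my $byte, 1) // die "sysread: $!";
	$byte //= ''; # be explicit about the EOF case
	my $op = $op_map->{$byte} or die "BUG: unknown byte `$byte'";
	my ($sub, @args) = @$op;
	$sub->(@args);
	$n;
}

pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, '.') // die "syswrite: $!";
close $wr or die "close: $!"; # reader sees EOF after the '.'
1 while op_step($rd, \%op_map);
print "$_\n" for @ran;
```

A worker signals progress by writing a single byte, and completion is
signalled implicitly once every copy of the write end is closed.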
---
 MANIFEST                       |   1 +
 lib/PublicInbox/LEI.pm         |  27 +++++---
 lib/PublicInbox/LeiDedupe.pm   |   3 +-
 lib/PublicInbox/LeiOverview.pm |  44 ++++++++----
 lib/PublicInbox/LeiQuery.pm    |  14 ++--
 lib/PublicInbox/LeiToMail.pm   |  89 ++++++++++++++++---------
 lib/PublicInbox/LeiXSearch.pm  | 118 ++++++++++++++++++++++++---------
 lib/PublicInbox/OpPipe.pm      |  41 ++++++++++++
 t/lei.t                        |  20 ++++++
 t/lei_to_mail.t                |   4 +-
 10 files changed, 266 insertions(+), 95 deletions(-)
 create mode 100644 lib/PublicInbox/OpPipe.pm

diff --git a/MANIFEST b/MANIFEST
index 0ebdaccc..0de1de4a 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -193,6 +193,7 @@ lib/PublicInbox/NNTPD.pm
 lib/PublicInbox/NNTPdeflate.pm
 lib/PublicInbox/NewsWWW.pm
 lib/PublicInbox/OnDestroy.pm
+lib/PublicInbox/OpPipe.pm
 lib/PublicInbox/Over.pm
 lib/PublicInbox/OverIdx.pm
 lib/PublicInbox/ProcessPipe.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 5568904d..f849c9df 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -256,7 +256,9 @@ sub puts ($;@) { print { shift->{1} } map { "$_\n" } @_ }
 sub out ($;@) { print { shift->{1} } @_ }
 
 sub err ($;@) {
-	print { shift->{2} } @_, (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
+	my $self = shift;
+	my $err = $self->{2} // *STDERR{IO};
+	print $err @_, (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
 }
 
 sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
@@ -270,8 +272,11 @@ sub fail ($$;$) {
 
 sub atfork_prepare_wq {
 	my ($self, $wq) = @_;
-	push @{$wq->{-ipc_atfork_child_close}}, @TO_CLOSE_ATFORK_CHILD,
-				grep { defined } @$self{qw(0 1 2 sock)}
+	my $tcafc = $wq->{-ipc_atfork_child_close};
+	push @$tcafc, @TO_CLOSE_ATFORK_CHILD;
+	if (my $sock = $self->{sock}) {
+		push @$tcafc, @$self{qw(0 1 2)}, $sock;
+	}
 }
 
 # usage: my %sig = $lei->atfork_child_wq($wq);
@@ -286,7 +291,9 @@ sub atfork_child_wq {
 	PIPE => sub {
 		$self->x_it(13); # SIGPIPE = 13
 		# we need to close explicitly to avoid Perl warning on SIGPIPE
-		close($_) for (delete @$self{1..2});
+		close(delete $self->{1});
+		# regular files and /dev/null (-c) won't trigger SIGPIPE
+		close(delete $self->{2}) unless (-f $self->{2} || -c _);
 		syswrite($self->{0}, '!') unless $self->{sock}; # for eof_wait
 		die bless(\"$_[0]", 'PublicInbox::SIGPIPE'),
 	});
@@ -641,7 +648,7 @@ sub start_pager {
 	$new_env{MORE} = 'FRX' if $^O eq 'freebsd';
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
-	my $pgr = [ undef, @$rdr{1, 2} ];
+	my $pgr = [ undef, @$rdr{1, 2}, $$ ];
 	if (my $sock = $self->{sock}) { # lei(1) process runs it
 		delete @new_env{keys %$env}; # only set iff unset
 		my $buf = "exec 1\0".$pager;
@@ -664,7 +671,7 @@ sub stop_pager {
 	# do not restore original stdout, just close it so we error out
 	close(delete($self->{1})) if $self->{1};
 	my $pid = $pgr->[0];
-	dwaitpid($pid, undef, $self->{sock}) if $pid;
+	dwaitpid($pid, undef, $self->{sock}) if $pid && $pgr->[3] == $$;
 }
 
 sub accept_dispatch { # Listener {post_accept} callback
@@ -706,7 +713,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 sub dclose {
 	my ($self) = @_;
 	delete $self->{lxs}; # stops LeiXSearch queries
-	$self->close; # PublicInbox::DS::close
+	$self->close if $self->{sock}; # PublicInbox::DS::close
 }
 
 # for long-running results
@@ -737,8 +744,10 @@ sub event_step {
 
 sub event_step_init {
 	my ($self) = @_;
-	$self->{sock}->blocking(0);
-	$self->SUPER::new($self->{sock}, EPOLLIN|EPOLLET);
+	if (my $sock = $self->{sock}) { # using DS->EventLoop
+		$sock->blocking(0);
+		$self->SUPER::new($sock, EPOLLIN|EPOLLET);
+	}
 }
 
 sub noop {}
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 81754361..3f478aa4 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -89,8 +89,9 @@ sub true { 1 }
 sub dedupe_none ($) { (\&true, \&true) }
 
 sub new {
-	my ($cls, $lei, $dst) = @_;
+	my ($cls, $lei) = @_;
 	my $dd = $lei->{opt}->{dedupe} // 'content';
+	my $dst = $lei->{ovv}->{dst};
 
 	# allow "none" to bypass Eml->new if writing to directory:
 	return if ($dd eq 'none' && substr($dst // '', -1) eq '/');
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 9846bc8a..c0b423f6 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -15,6 +15,8 @@ use PublicInbox::MID qw($MID_EXTRACT);
 use PublicInbox::Address qw(pairs);
 use PublicInbox::Config;
 use PublicInbox::Search qw(get_pct);
+use PublicInbox::LeiDedupe;
+use PublicInbox::LeiToMail;
 
 # cf. https://en.wikipedia.org/wiki/JSON_streaming
 my $JSONL = 'ldjson|ndjson|jsonl'; # 3 names for the same thing
@@ -44,7 +46,7 @@ sub new {
 
 	my $fmt = $opt->{'format'};
 	$fmt = lc($fmt) if defined $fmt;
-	if ($dst =~ s/\A([a-z]+)://is) { # e.g. Maildir:/home/user/Mail/
+	if ($dst =~ s/\A([a-z0-9]+)://is) { # e.g. Maildir:/home/user/Mail/
 		my $ofmt = lc $1;
 		$fmt //= $ofmt;
 		return $lei->fail(<<"") if $fmt ne $ofmt;
@@ -52,13 +54,14 @@ sub new {
 
 	}
 	$fmt //= 'json' if $dst eq '/dev/stdout';
-	$fmt //= 'maildir'; # TODO
+	$fmt //= 'maildir';
 
 	if (index($dst, '://') < 0) { # not a URL, so assume path
 		 $dst = File::Spec->canonpath($dst);
 	} # else URL
 
 	my $self = bless { fmt => $fmt, dst => $dst }, $class;
+	$lei->{ovv} = $self;
 	my $json;
 	if ($fmt =~ /\A($JSONL|(?:concat)?json)\z/) {
 		$json = $self->{json} = ref(PublicInbox::Config->json);
@@ -75,11 +78,13 @@ sub new {
 		} else {
 			ovv_out_lk_init($self);
 		}
-	} elsif ($json) {
-		return $lei->fail('JSON formats only output to stdout');
-	} else {
-		return $lei->fail("TODO: $dst -f $fmt");
 	}
+	if (!$json) {
+		# default to the cheapest sort since MUA usually resorts
+		$lei->{opt}->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
+		$lei->{l2m} = PublicInbox::LeiToMail->new($lei);
+	}
+	$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
 	$self;
 }
 
@@ -135,9 +140,13 @@ sub _unbless_smsg {
 
 sub ovv_atexit_child {
 	my ($self, $lei) = @_;
+	if (my $git = delete $self->{git}) {
+		$git->async_wait_all;
+	}
 	if (my $bref = delete $lei->{ovv_buf}) {
+		my $out = $lei->{1} or return;
 		my $lk = $self->lock_for_scope;
-		print { $lei->{1} } $$bref;
+		print $out $$bref;
 	}
 }
 
@@ -167,17 +176,28 @@ sub _json_pretty {
 	qq{  "$k": }.$v;
 }
 
-sub ovv_each_smsg_cb {
-	my ($self, $lei) = @_;
+sub ovv_each_smsg_cb { # runs in wq worker usually
+	my ($self, $lei, $ibxish) = @_;
 	$lei->{ovv_buf} = \(my $buf = '');
 	delete(@$self{qw(lock_path tmp_lk_id)}) unless $lei->{-parallel};
-	my $json = $self->{json}->new;
+	my $json;
 	$lei->{1}->autoflush(1);
-	if ($json) {
+	if (my $pkg = $self->{json}) {
+		$json = $pkg->new;
 		$json->utf8->canonical;
 		$json->ascii(1) if $lei->{opt}->{ascii};
 	}
-	if ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
+	if (my $l2m = $lei->{l2m}) {
+		my $wcb = $l2m->write_cb($lei);
+		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
+		$self->{git} = $git; # for ovv_atexit_child
+		my $g2m = $l2m->can('git_to_mail');
+		sub {
+			my ($smsg, $mitem) = @_;
+			my $kw = []; # TODO get from mitem
+			$git->cat_async($smsg->{blob}, $g2m, [ $wcb, $kw ]);
+		};
+	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
 		sub { # DIY prettiness :P
 			my ($smsg, $mitem) = @_;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 69d2f9a6..a80d5887 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -23,8 +23,6 @@ sub _vivify_external { # _externals_each callback
 # the main "lei q SEARCH_TERMS" method
 sub lei_q {
 	my ($self, @argv) = @_;
-	my $sto = $self->_lei_store(1);
-	my $cfg = $self->_lei_cfg(1);
 	my $opt = $self->{opt};
 
 	# --local is enabled by default
@@ -32,7 +30,7 @@ sub lei_q {
 	my @srcs;
 	require PublicInbox::LeiXSearch;
 	require PublicInbox::LeiOverview;
-	require PublicInbox::LeiDedupe;
+	PublicInbox::Config->json;
 	my $lxs = PublicInbox::LeiXSearch->new;
 
 	# --external is enabled by default, but allow --no-external
@@ -46,10 +44,10 @@ sub lei_q {
 	$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
 		// $lxs->wq_workers($j);
 
-	unshift(@srcs, $sto->search) if $opt->{'local'};
 	# no forking workers after this
-	$self->{ovv} = PublicInbox::LeiOverview->new($self);
-	$self->{dd} = PublicInbox::LeiDedupe->new($self);
+	my $ovv = PublicInbox::LeiOverview->new($self) or return;
+	my $sto = $self->_lei_store(1);
+	unshift(@srcs, $sto->search) if $opt->{'local'};
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{qstr} = join(' ', map {;
@@ -69,12 +67,10 @@ sub lei_q {
 			die "unrecognized --sort=$sort\n";
 		}
 	}
-	# $self->out($json->encode(\%mset_opt));
 	# descending docid order
 	$mset_opt{relevance} //= -2 if $opt->{thread};
-	# my $wcb = PublicInbox::LeiToMail->write_cb($out, $self);
 	$self->{mset_opt} = \%mset_opt;
-	$self->{ovv}->ovv_begin($self);
+	$ovv->ovv_begin($self);
 	$lxs->do_query($self, \@srcs);
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 5d4b7978..744f331d 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -187,8 +187,9 @@ sub zsfx2cmd ($$$) {
 	\@cmd;
 }
 
-sub compress_dst {
-	my ($self, $zsfx, $lei) = @_;
+sub _post_augment_mbox { # open a compressor process
+	my ($self, $lei) = @_;
+	my $zsfx = $self->{zsfx} or return;
 	my $cmd = zsfx2cmd($zsfx, undef, $lei);
 	pipe(my ($r, $w)) or die "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $lei->{1}, 2 => $lei->{2} };
@@ -209,7 +210,9 @@ sub decompress_src ($$$) {
 
 sub dup_src ($) {
 	my ($in) = @_;
-	open my $dup, '+>>&', $in or die "dup: $!";
+	# fileno needed because wq_set_recv_modes only used ">&=" for {1}
+	# and Perl blindly trusts that to reject the '+' (readability flag)
+	open my $dup, '+>>&=', fileno($in) or die "dup: $!";
 	$dup;
 }
 
@@ -321,11 +324,13 @@ sub new {
 	} else {
 		die "bad mail --format=$fmt\n";
 	}
-	my $dedupe = $lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei, $dst);
+	$lei->{dedupe} = PublicInbox::LeiDedupe->new($lei);
 	$self;
 }
 
-sub _prepare_maildir {
+sub _pre_augment_maildir {} # noop
+
+sub _do_augment_maildir {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
 	if ($lei->{opt}->{augment}) {
@@ -338,6 +343,11 @@ sub _prepare_maildir {
 	} else { # clobber existing Maildir
 		_maildir_each_file($dst, \&_unlink);
 	}
+}
+
+sub _post_augment_maildir {
+	my ($self, $lei) = @_;
+	my $dst = $lei->{ovv}->{dst};
 	for my $x (qw(tmp new cur)) {
 		my $d = $dst.$x;
 		next if -d $d;
@@ -347,45 +357,64 @@ sub _prepare_maildir {
 	}
 }
 
-sub _prepare_mbox {
+sub _pre_augment_mbox {
 	my ($self, $lei) = @_;
 	my $dst = $lei->{ovv}->{dst};
-	my ($out, $seekable);
-	if ($dst eq '/dev/stdout') {
-		$out = $lei->{1};
-	} else {
+	if ($dst ne '/dev/stdout') {
 		my $mode = -p $dst ? '>' : '+>>';
 		if (-f _ && !$lei->{opt}->{augment} and !unlink($dst)) {
 			$! == ENOENT or die "unlink($dst): $!";
 		}
-		open $out, $mode, $dst or die "open($dst): $!";
-		# Perl does SEEK_END even with O_APPEND :<
-		$seekable = seek($out, 0, SEEK_SET);
-		die "seek($dst): $!\n" if !$seekable && $! != ESPIPE;
+		open my $out, $mode, $dst or die "open($dst): $!";
 		$lei->{1} = $out;
 	}
+	# Perl does SEEK_END even with O_APPEND :<
+	$self->{seekable} = seek($lei->{1}, 0, SEEK_SET);
+	if (!$self->{seekable} && $! != ESPIPE && $dst ne '/dev/stdout') {
+		die "seek($dst): $!\n";
+	}
 	state $zsfx_allow = join('|', keys %zsfx2cmd);
-	my ($zsfx) = ($dst =~ /\.($zsfx_allow)\z/);
+	($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/);
+}
+
+sub _do_augment_mbox {
+	my ($self, $lei) = @_;
+	return if !$lei->{opt}->{augment};
 	my $dedupe = $lei->{dedupe};
-	if ($lei->{opt}->{augment}) {
-		die "cannot augment $dst, not seekable\n" if !$seekable;
-		if (-s $out && $dedupe && $dedupe->prepare_dedupe) {
-			my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) :
-					dup_src($out);
-			my $fmt = $lei->{ovv}->{fmt};
-			require PublicInbox::MboxReader;
-			PublicInbox::MboxReader->$fmt($rd, \&_augment, $lei);
-		}
-		# maybe some systems don't honor O_APPEND, Perl does this:
-		seek($out, 0, SEEK_END) or die "seek $dst: $!";
-		$dedupe->pause_dedupe if $dedupe;
+	my $dst = $lei->{ovv}->{dst};
+	die "cannot augment $dst, not seekable\n" if !$self->{seekable};
+	my $out = $lei->{1};
+	if (-s $out && $dedupe && $dedupe->prepare_dedupe) {
+		my $zsfx = $self->{zsfx};
+		my $rd = $zsfx ? decompress_src($out, $zsfx, $lei) :
+				dup_src($out);
+		my $fmt = $lei->{ovv}->{fmt};
+		require PublicInbox::MboxReader;
+		PublicInbox::MboxReader->$fmt($rd, \&_augment, $lei);
 	}
-	compress_dst($self, $zsfx, $lei) if $zsfx;
+	# maybe some systems don't honor O_APPEND, Perl does this:
+	seek($out, 0, SEEK_END) or die "seek $dst: $!";
+	$dedupe->pause_dedupe if $dedupe;
+}
+
+sub pre_augment { # fast (1 disk seek), runs in main daemon
+	my ($self, $lei) = @_;
+	# _pre_augment_maildir, _pre_augment_mbox
+	my $m = "_pre_augment_$self->{base_type}";
+	$self->$m($lei);
+}
+
+sub do_augment { # slow, runs in wq worker
+	my ($self, $lei) = @_;
+	# _do_augment_maildir, _do_augment_mbox
+	my $m = "_do_augment_$self->{base_type}";
+	$self->$m($lei);
 }
 
-sub do_prepare {
+sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 	my ($self, $lei) = @_;
-	my $m = "_prepare_$self->{base_type}";
+	# _post_augment_maildir, _post_augment_mbox
+	my $m = "_post_augment_$self->{base_type}";
 	$self->$m($lei);
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 8b70167c..9563ad63 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -9,6 +9,10 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
 use PublicInbox::DS qw(dwaitpid);
+use PublicInbox::OpPipe;
+use PublicInbox::Import;
+use File::Temp 0.19 (); # 0.19 for ->newdir
+use File::Spec ();
 
 sub new {
 	my ($class) = @_;
@@ -103,9 +107,9 @@ sub query_thread_mset { # for --thread
 	}
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
-	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
-	my $dd = $lei->{dd};
-	$dd->prepare_dedupe;
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
+	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
+	$dedupe->prepare_dedupe;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		my $ids = $srch->mset_to_artnums($mset, $mo);
@@ -115,7 +119,7 @@ sub query_thread_mset { # for --thread
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
-				next if $dd->is_smsg_dup($smsg);
+				next if $dedupe->is_smsg_dup($smsg);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
 			}
@@ -132,65 +136,113 @@ sub query_mset { # non-parallel for non-"--thread" users
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	$self->attach_external($_) for @$srcs;
-	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
-	my $dd = $lei->{dd};
-	$dd->prepare_dedupe;
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $self);
+	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
+	$dedupe->prepare_dedupe;
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
 		for my $it ($mset->items) {
 			my $smsg = smsg_for($self, $it) or next;
-			next if $dd->is_smsg_dup($smsg);
+			next if $dedupe->is_smsg_dup($smsg);
 			$each_smsg->($smsg, $it);
 		}
 	} while (_mset_more($mset, $mo));
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
-sub query_done { # PublicInbox::EOFpipe callback
+sub git {
+	my ($self) = @_;
+	my (%seen, @dirs);
+	my $tmp = File::Temp->newdir('lei_xsrch_git-XXXXXXXX', TMPDIR => 1);
+	for my $ibx (@{$self->{shard2ibx} // []}) {
+		my $d = File::Spec->canonpath($ibx->git->{git_dir});
+		$seen{$d} //= push @dirs, "$d/objects\n"
+	}
+	my $git_dir = $tmp->dirname;
+	PublicInbox::Import::init_bare($git_dir);
+	my $f = "$git_dir/objects/info/alternates";
+	open my $alt, '>', $f or die "open($f): $!";
+	print $alt @dirs or die "print $f: $!";
+	close $alt or die "close $f: $!";
+	my $git = PublicInbox::Git->new($git_dir);
+	$git->{-tmp} = $tmp;
+	$git;
+}
+
+sub query_done { # EOF callback
 	my ($lei) = @_;
 	$lei->{ovv}->ovv_end($lei);
 	$lei->dclose;
 }
 
-sub do_query {
-	my ($self, $lei_orig, $srcs) = @_;
-	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
+sub start_query { # always runs in main (lei-daemon) process
+	my ($self, $io, $lei, $srcs) = @_;
+	if (my $l2m = $lei->{l2m}) {
+		$lei->{1} = $io->[1];
+		$l2m->post_augment($lei);
+		$io->[1] = delete $lei->{1};
+	}
 	my $remotes = $self->{remotes} // [];
-	pipe(my ($eof_wait, $qry_done)) or die "pipe $!";
-	$io[0] = $qry_done; # don't need stdin
-
 	if ($lei->{opt}->{thread}) {
 		$lei->{-parallel} = scalar(@$remotes) + scalar(@$srcs) - 1;
 		for my $ibxish (@$srcs) {
-			$self->wq_do('query_thread_mset', \@io, $lei, $ibxish);
+			$self->wq_do('query_thread_mset', $io, $lei, $ibxish);
 		}
 	} else {
 		$lei->{-parallel} = scalar(@$remotes);
-		$self->wq_do('query_mset', \@io, $lei, $srcs);
+		$self->wq_do('query_mset', $io, $lei, $srcs);
 	}
 	# TODO
 	for my $rmt (@$remotes) {
-		$self->wq_do('query_thread_mbox', \@io, $lei, $rmt);
+		$self->wq_do('query_thread_mbox', $io, $lei, $rmt);
 	}
-	@io = ();
-	close $qry_done; # fully closed when children are done
-
-	# query_done will run when query_*mset close $qry_done
-	if ($lei_orig->{sock}) { # watch for client premature exit
-		require PublicInbox::EOFpipe;
-		PublicInbox::EOFpipe->new($eof_wait, \&query_done, $lei_orig);
-		$lei_orig->{lxs} = $self;
-		$lei_orig->event_step_init;
+	close $io->[0]; # qry_status_wr
+	@$io = ();
+}
+
+sub query_prepare { # wq_do
+	my ($self, $lei) = @_;
+	my %sig = $lei->atfork_child_wq($self);
+	local @SIG{keys %sig} = values %sig;
+	if (my $l2m = $lei->{l2m}) {
+		eval { $l2m->do_augment($lei) };
+		return $lei->fail($@) if $@;
+	}
+	# trigger PublicInbox::OpPipe->event_step
+	my $qry_status_wr = $lei->{0} or
+		return $lei->fail('BUG: qry_status_wr missing');
+	$qry_status_wr->autoflush(1);
+	print $qry_status_wr '.' or # this should never fail...
+		return $lei->fail("BUG? print qry_status_wr: $!");
+}
+
+sub do_query {
+	my ($self, $lei_orig, $srcs) = @_;
+	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
+	$io[0] = undef;
+	pipe(my $qry_status_rd, $io[0]) or die "pipe $!";
+
+	$lei_orig->{lxs} = $self;
+	$lei_orig->event_step_init; # wait for shutdowns
+	my $op_map = { '' => [ \&query_done, $lei_orig ] };
+	my $in_loop = exists $lei_orig->{sock};
+	my $opp = PublicInbox::OpPipe->new($qry_status_rd, $op_map, $in_loop);
+	if (my $l2m = $lei->{l2m}) {
+		$l2m->pre_augment($lei_orig); # may redirect $lei->{1} for mbox
+		$io[1] = $lei_orig->{1};
+		$op_map->{'.'} = [ \&start_query, $self, \@io, $lei, $srcs ];
+		$self->wq_do('query_prepare', \@io, $lei);
+		$opp->event_step if !$in_loop;
 	} else {
+		start_query($self, \@io, $lei, $srcs);
+	}
+	unless ($in_loop) {
 		my @pids = $self->wq_close;
-		# wait for close($lei->{0})
-		if (read($eof_wait, my $buf, 1)) {
-			# if we get a SIGPIPE from one, kill the rest
-			kill('TERM', @pids) if $buf eq '!';
-		}
+		# for the $lei->atfork_child_wq PIPE handler:
+		$op_map->{'!'} = [ \&CORE::kill, 'TERM', @pids ];
+		$opp->event_step;
 		my $ipc_worker_reap = $self->can('ipc_worker_reap');
 		dwaitpid($_, $ipc_worker_reap, $self) for @pids;
-		query_done($lei_orig); # may SIGPIPE
 	}
 }
 
diff --git a/lib/PublicInbox/OpPipe.pm b/lib/PublicInbox/OpPipe.pm
new file mode 100644
index 00000000..295a8aa5
--- /dev/null
+++ b/lib/PublicInbox/OpPipe.pm
@@ -0,0 +1,41 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# bytecode dispatch pipe, reads a byte, runs a sub
+# byte => [ sub, @operands ]
+package PublicInbox::OpPipe;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::DS);
+use PublicInbox::Syscall qw(EPOLLIN);
+
+sub new {
+	my ($cls, $rd, $op_map, $in_loop) = @_;
+	my $self = bless { sock => $rd, op_map => $op_map }, $cls;
+	# 1031: F_SETPIPE_SZ, 4096: page size
+	fcntl($rd, 1031, 4096) if $^O eq 'linux';
+	if ($in_loop) { # iff using DS->EventLoop
+		$rd->blocking(0);
+		$self->SUPER::new($rd, EPOLLIN);
+	}
+	$self;
+}
+
+sub event_step {
+	my ($self) = @_;
+	my $rd = $self->{sock};
+	my $byte;
+	until (defined(sysread($rd, $byte, 1))) {
+		return if $!{EAGAIN};
+		next if $!{EINTR};
+		die "read \$rd: $!";
+	}
+	my $op = $self->{op_map}->{$byte} or die "BUG: unknown byte `$byte'";
+	if ($byte eq '') { # close on EOF
+		$rd->blocking ? delete($self->{sock}) : $self->close;
+	}
+	my ($sub, @args) = @$op;
+	$sub->(@args);
+}
+
+1;
diff --git a/t/lei.t b/t/lei.t
index 2349dca4..c4692217 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -7,6 +7,7 @@ use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Config;
 use File::Path qw(rmtree);
+use Fcntl qw(SEEK_SET);
 require_git 2.6;
 require_mods(qw(json DBD::SQLite Search::Xapian));
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
@@ -188,6 +189,25 @@ my $test_external = sub {
 	# No double-quoting should be imposed on users on the CLI
 	$lei->('q', 's:use boolean prefix');
 	like($out, qr/search: use boolean prefix/, 'phrase search got result');
+
+	$lei->('q', '-o', "mboxcl2:$home/mbox", 's:use boolean prefix');
+	open my $mb, '<', "$home/mbox" or fail "no mbox: $!";
+	my @s = grep(/^Subject:/, <$mb>);
+	is(scalar(@s), 1, '1 result in mbox');
+	$lei->('q', '-a', '-o', "mboxcl2:$home/mbox", 's:see attachment');
+	is($err, '', 'no errors from augment');
+	seek($mb, 0, SEEK_SET) or BAIL_OUT "seek: $!";
+	@s = grep(/^Subject:/, <$mb>);
+	is(scalar(@s), 2, '2 results in mbox');
+
+	$lei->('q', '-a', '-o', "mboxcl2:$home/mbox", 's:nonexistent');
+	is($err, '', 'no errors on no results');
+	seek($mb, 0, SEEK_SET) or BAIL_OUT "seek: $!";
+	my @s2 = grep(/^Subject:/, <$mb>);
+	is_deeply(\@s2, \@s, 'same 2 old results w/ --augment and bad search');
+
+	$lei->('q', '-o', "mboxcl2:$home/mbox", 's:nonexistent');
+	is(-s "$home/mbox", 0, 'clobber w/o --augment');
 };
 
 my $test_lei_common = sub {
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index d5beb3d2..083e0df4 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -94,7 +94,9 @@ my $wcb_get = sub {
 		my $dup = Storable::thaw(Storable::freeze($l2m));
 		is_deeply($dup, $l2m, "$fmt round-trips through storable");
 	}
-	$l2m->do_prepare($lei);
+	$l2m->pre_augment($lei);
+	$l2m->do_augment($lei);
+	$l2m->post_augment($lei);
 	my $cb = $l2m->write_cb($lei);
 	delete $lei->{1};
 	$cb;
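
The split of do_prepare into three phases relies on building a method name
from the phase and the base type.  Below is a small sketch of that
dispatch, using a hypothetical Demo::ToMail class whose hooks only record
what they would do; the phase descriptions paraphrase the comments in the
diff above.

```perl
package Demo::ToMail; # hypothetical, loosely modeled on LeiToMail
use strict;
use warnings;

sub new {
	my ($class, $base_type) = @_;
	bless { base_type => $base_type, log => [] }, $class;
}

# per-format hooks; a format may leave a phase as a no-op
sub _pre_augment_mbox { push @{$_[0]->{log}}, 'open destination' }
sub _do_augment_mbox { push @{$_[0]->{log}}, 'scan existing messages' }
sub _post_augment_mbox { push @{$_[0]->{log}}, 'spawn compressor' }
sub _pre_augment_maildir {} # noop
sub _do_augment_maildir { push @{$_[0]->{log}}, 'scan or clobber files' }
sub _post_augment_maildir { push @{$_[0]->{log}}, 'mkdir tmp new cur' }

# build the method name from the phase and base type, then call it
sub run_phase {
	my ($self, $phase) = @_;
	my $m = "_${phase}_augment_$self->{base_type}";
	$self->can($m) or die "no $m\n";
	$self->$m;
}

package main;

# run all three phases in order and return the recorded steps
sub phases_for {
	my ($base_type) = @_;
	my $l2m = Demo::ToMail->new($base_type);
	$l2m->run_phase($_) for qw(pre do post);
	@{$l2m->{log}};
}

print join('; ', phases_for($_)), "\n" for qw(mbox maildir);
```

Keeping the slow middle phase separate is what lets it run in a worker
while the two fast phases stay in the main process.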

^ permalink raw reply related	[relevance 22%]

* [PATCH] lei q: add --mua-cmd switch
@ 2021-01-17  8:52 50% Eric Wong
  2021-01-17 10:19 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-17  8:52 UTC (permalink / raw)
  To: meta

It can be convenient to invoke an MUA as search results
are being written to it, as an eager person may want to
start seeing results ASAP.  This lets Maildir users
see results in the MUA as we are writing them.  Users
of IMAP will eventually be able to take advantage of
this, too.

Since we don't support mbox locking (yet?), we'll only invoke
the MUA after results are done for mbox formats.
---
 lib/PublicInbox/LEI.pm        | 45 ++++++++++++++++++++++++++++-------
 lib/PublicInbox/LeiToMail.pm  |  4 ++++
 lib/PublicInbox/LeiXSearch.pm |  4 ++++
 3 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 56254c45..ee5c26a8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -83,7 +83,7 @@ sub _config_path ($) {
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
-	sort|s=s reverse|r offset=i remote local! external! pretty
+	sort|s=s reverse|r offset=i remote local! external! pretty mua-cmd=s
 	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -192,6 +192,8 @@ my %OPTDESC = (
 
 'output|o=s' => [ 'DEST',
 	"destination (e.g. `/path/to/Maildir', or `-' for stdout)" ],
+'mua-cmd|mua=s' => [ 'COMMAND',
+	"MUA to run on --output Maildir or mbox (e.g. `mutt -f %f')" ],
 
 'show	format|f=s' => [ 'OUT|plain|raw|html|mboxrd|mboxcl2|mboxcl',
 			'message/object output format' ],
@@ -635,6 +637,32 @@ sub lei_git { # support passing through random git commands
 	dwaitpid($pid, \&reap_exec, $self);
 }
 
+sub exec_buf ($$) {
+	my ($argv, $env) = @_;
+	my $argc = scalar @$argv;
+	my $buf = 'exec '.join("\0", $argc, @$argv);
+	while (my ($k, $v) = each %$env) { $buf .= "\0$k=$v" };
+	$buf;
+}
+
+sub start_mua {
+	my ($self, $sock) = @_;
+	my $mua = $self->{opt}->{'mua-cmd'} // return;
+	my $mfolder = $self->{ovv}->{dst};
+	require Text::ParseWords;
+	my $replaced;
+	my @cmd = Text::ParseWords::shellwords($mua);
+	# mutt uses '%f' for open-hook with compressed folders, so we use %f
+	@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
+	push @cmd, $mfolder unless defined($replaced);
+	$sock //= $self->{sock};
+	if ($sock) { # lei(1) client process runs it
+		send($sock, exec_buf(\@cmd, {}), MSG_EOR);
+	} else { # oneshot
+		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
+	}
+}
+
 # caller needs to "-t $self->{1}" to check if tty
 sub start_pager {
 	my ($self) = @_;
@@ -644,19 +672,17 @@ sub start_pager {
 	close($fh) or warn "`git var PAGER' error: \$?=$?";
 	return if $pager eq 'cat' || $pager eq '';
 	# TODO TIOCGWINSZ
-	my %new_env = (LESS => 'FRX', LV => '-c', COLUMNS => 80);
-	$new_env{MORE} = 'FRX' if $^O eq 'freebsd';
+	my $new_env = { LESS => 'FRX', LV => '-c', COLUMNS => 80 };
+	$new_env->{MORE} = 'FRX' if $^O eq 'freebsd';
 	pipe(my ($r, $wpager)) or return warn "pipe: $!";
 	my $rdr = { 0 => $r, 1 => $self->{1}, 2 => $self->{2} };
 	my $pgr = [ undef, @$rdr{1, 2}, $$ ];
 	if (my $sock = $self->{sock}) { # lei(1) process runs it
-		delete @new_env{keys %$env}; # only set iff unset
-		my $buf = "exec 1\0".$pager;
-		while (my ($k, $v) = each %new_env) { $buf .= "\0$k=$v" };
+		delete @$new_env{keys %$env}; # only set iff unset
 		my $fds = [ map { fileno($_) } @$rdr{0..2} ];
-		$send_cmd->($sock, $fds, $buf, MSG_EOR);
+		$send_cmd->($sock, $fds, exec_buf([$pager], $new_env), MSG_EOR);
 	} else {
-		$pgr->[0] = spawn([$pager], \%new_env, $rdr);
+		$pgr->[0] = spawn([$pager], $new_env, $rdr);
 	}
 	$self->{1} = $wpager;
 	$self->{2} = $wpager if -t $self->{2};
@@ -892,6 +918,9 @@ sub DESTROY {
 	my ($self) = @_;
 	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
+	if (my $mua_pid = delete $self->{"mua.pid.$self.$$"}) {
+		waitpid($mua_pid, 0);
+	}
 }
 
 1;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 744f331d..0e23b8da 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -418,4 +418,8 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 	$self->$m($lei);
 }
 
+sub lock_free {
+	$_[0]->{base_type} =~ /\A(?:maildir|mh|imap|jmap)\z/ ? 1 : 0;
+}
+
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 9563ad63..91864cd0 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -172,6 +172,9 @@ sub git {
 sub query_done { # EOF callback
 	my ($lei) = @_;
 	$lei->{ovv}->ovv_end($lei);
+	if (my $l2m = $lei->{l2m}) {
+		$lei->start_mua unless $l2m->lock_free;
+	}
 	$lei->dclose;
 }
 
@@ -181,6 +184,7 @@ sub start_query { # always runs in main (lei-daemon) process
 		$lei->{1} = $io->[1];
 		$l2m->post_augment($lei);
 		$io->[1] = delete $lei->{1};
+		$lei->start_mua($io->[3]) if $l2m->lock_free;
 	}
 	my $remotes = $self->{remotes} // [];
 	if ($lei->{opt}->{thread}) {
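
The %f handling in start_mua can be tried in isolation.  This sketch lifts
the same three lines into a hypothetical mua_argv() helper; the folder
paths are made up, and Text::ParseWords is a core module.

```perl
use strict;
use warnings;
use Text::ParseWords ();

# split the --mua-cmd string like a shell would, then substitute
# a standalone %f word or append the mail folder
sub mua_argv {
	my ($mua, $mfolder) = @_;
	my $replaced;
	my @cmd = Text::ParseWords::shellwords($mua);
	@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
	push @cmd, $mfolder unless defined($replaced);
	@cmd;
}

print join(' | ', mua_argv('mutt -f %f', '/tmp/results/')), "\n";
print join(' | ', mua_argv('mailx -f', '/tmp/results.mbox')), "\n";
```

Only a word that is exactly `%f' is replaced, so something like
`folder=%f' would be passed through unchanged and the folder appended.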

^ permalink raw reply related	[relevance 50%]

* Re: [PATCH] lei q: add --mua-cmd switch
  2021-01-17  8:52 50% [PATCH] lei q: add --mua-cmd switch Eric Wong
@ 2021-01-17 10:19 71% ` Eric Wong
  2021-01-17 10:28 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-17 10:19 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> @@ -192,6 +192,8 @@ my %OPTDESC = (
>  
>  'output|o=s' => [ 'DEST',
>  	"destination (e.g. `/path/to/Maildir', or `-' for stdout)" ],
> +'mua-cmd|mua=s' => [ 'COMMAND',
> +	"MUA to run on --output Maildir or mbox (e.g. `mutt -f %f')" ],

I'm wondering if displaying "mutt" in a one-line help message is
showing unnecessary favoritism for a non-standard MUA.

Fwiw, "mailx -f" is POSIX, but only for mbox.  Both Heirloom
mailx and GNU mailutils mailx support Maildir, but bsd-mailx
(the default on Debian, AFAIK) does not...

FreeBSD mailx does not do Maildir, either (tested 12.1 and 11.4).

^ permalink raw reply	[relevance 71%]

* Re: [PATCH] lei q: add --mua-cmd switch
  2021-01-17 10:19 71% ` Eric Wong
@ 2021-01-17 10:28 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-17 10:28 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> Eric Wong <e@80x24.org> wrote:
> > @@ -192,6 +192,8 @@ my %OPTDESC = (
> >  
> >  'output|o=s' => [ 'DEST',
> >  	"destination (e.g. `/path/to/Maildir', or `-' for stdout)" ],
> > +'mua-cmd|mua=s' => [ 'COMMAND',
> > +	"MUA to run on --output Maildir or mbox (e.g. `mutt -f %f')" ],
> 
> I'm wondering if displaying "mutt" in a one-line help message is
> showing unnecessary favoritism for a non-standard MUA.
> 
> Fwiw, "mailx -f" is POSIX, but only for mbox.

Though git also favors less(1) (and lei follows suit),
despite more(1) being the POSIX pager.

^ permalink raw reply	[relevance 71%]

* [PATCH 0/2] lei q: write faster, mutt does less work
@ 2021-01-18 10:30 71% Eric Wong
  2021-01-18 10:30 35% ` [PATCH 1/2] lei q: parallelize Maildir and mbox writing Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-18 10:30 UTC (permalink / raw)
  To: meta

1/2 was tricky and still ugly, but the speedup is great (>100%)
and opens up the door for even more speedups.

2/2 ought to help with other MUAs, but I've only tested with
mutt.  AFAIK every MUA clears the \Recent flag unless it opens a
mail folder read-only, so this saves a bunch of renames by the
MUA with Maildirs.

Eric Wong (2):
  lei q: parallelize Maildir and mbox writing
  lei_to_mail: optimize for MUAs

 lib/PublicInbox/IPC.pm         |  3 ++
 lib/PublicInbox/LEI.pm         | 36 +++++++++++++++------
 lib/PublicInbox/LeiOverview.pm | 36 +++++++++++++++++++--
 lib/PublicInbox/LeiQuery.pm    | 12 +++++--
 lib/PublicInbox/LeiToMail.pm   | 59 +++++++++++++++++++++++++++++-----
 lib/PublicInbox/LeiXSearch.pm  | 27 ++++++++++------
 lib/PublicInbox/Spawn.pm       |  2 +-
 t/lei_to_mail.t                | 16 ++++++---
 t/mbox_reader.t                |  2 ++
 9 files changed, 153 insertions(+), 40 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 1/2] lei q: parallelize Maildir and mbox writing
  2021-01-18 10:30 71% [PATCH 0/2] lei q: write faster, mutt does less work Eric Wong
@ 2021-01-18 10:30 35% ` Eric Wong
  2021-01-18 21:19 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-18 10:30 UTC (permalink / raw)
  To: meta

With 4 dedicated workers, this seems to provide a 100-120%
speedup on a 4 core machine when writing thousands of search
results to a Maildir or mbox.  This also sets us up for
high-latency IMAP destinations in the future.

This opens the door to more speedup opportunities such
as optimizing dedupe locking and other ways to reduce
contention.

This change is fairly complex and convoluted, unfortunately.
Further work may allow us to simplify it and even improve
performance.
---
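A note on the deep_clone helper added to IPC.pm below: the
{dedupe} object is copied before being handed to the lei2mail
workers.  A minimal sketch of the freeze/thaw round trip,
assuming Storable as the serializer (the data structure here
is made up for illustration):

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# serialize and immediately deserialize to get an independent copy
sub deep_clone { thaw(freeze($_[-1])) }

my $orig = { seen => { abc => 1 }, list => [ 1, 2 ] };
my $copy = deep_clone($orig);

# changes to the copy must not leak into the original
$copy->{seen}->{def} = 1;
push @{$copy->{list}}, 3;

my $orig_seen = scalar(keys %{$orig->{seen}});
my $orig_list = scalar(@{$orig->{list}});
print "orig: seen=$orig_seen list=$orig_list\n";
```
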
 lib/PublicInbox/IPC.pm         |  3 +++
 lib/PublicInbox/LEI.pm         | 36 ++++++++++++++++++++++++----------
 lib/PublicInbox/LeiOverview.pm | 36 +++++++++++++++++++++++++++++++---
 lib/PublicInbox/LeiQuery.pm    | 12 +++++++++---
 lib/PublicInbox/LeiToMail.pm   | 29 +++++++++++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm  | 27 +++++++++++++++----------
 lib/PublicInbox/Spawn.pm       |  2 +-
 7 files changed, 118 insertions(+), 27 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 78cb8400..8fec2e62 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -397,4 +397,7 @@ sub DESTROY {
 	ipc_worker_stop($self);
 }
 
+# Sereal doesn't have dclone
+sub deep_clone { thaw(freeze($_[-1])) }
+
 1;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 61f2a65b..6b6ee0f5 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -279,13 +279,21 @@ sub atfork_prepare_wq {
 	if (my $sock = $self->{sock}) {
 		push @$tcafc, @$self{qw(0 1 2)}, $sock;
 	}
+	for my $f (qw(lxs l2m)) {
+		my $ipc = $self->{$f} or next;
+		push @$tcafc, grep { defined }
+				@$ipc{qw(-wq_s1 -wq_s2 -ipc_req -ipc_res)};
+	}
 }
 
 # usage: my %sig = $lei->atfork_child_wq($wq);
 #	 local @SIG{keys %sig} = values %sig;
 sub atfork_child_wq {
 	my ($self, $wq) = @_;
-	@$self{qw(0 1 2 sock)} = delete(@$wq{0..3});
+	my ($sock, $l2m_wq_s1);
+	(@$self{qw(0 1 2)}, $sock, $l2m_wq_s1) = delete(@$wq{0..4});
+	$self->{sock} = $sock if -S $sock;
+	$self->{l2m}->{-wq_s1} = $l2m_wq_s1 if $l2m_wq_s1;
 	%PATH2CFG = ();
 	$quit = \&CORE::exit;
 	@TO_CLOSE_ATFORK_CHILD = ();
@@ -304,15 +312,23 @@ sub atfork_child_wq {
 # usage: ($lei, @io) = $lei->atfork_parent_wq($wq);
 sub atfork_parent_wq {
 	my ($self, $wq) = @_;
-	if ($wq->wq_workers) {
-		my $env = delete $self->{env}; # env is inherited at fork
-		my $ret = bless { %$self }, ref($self);
-		$self->{env} = $env;
-		delete @$ret{qw(-lei_store cfg pgr)};
-		($ret, delete @$ret{0..2}, delete($ret->{sock}) // ());
-	} else {
-		($self, @$self{0..2}, $self->{sock} // ());
+	my $env = delete $self->{env}; # env is inherited at fork
+	my $ret = bless { %$self }, ref($self);
+	if (my $dedupe = delete $ret->{dedupe}) {
+		$ret->{dedupe} = $wq->deep_clone($dedupe);
+	}
+	$self->{env} = $env;
+	delete @$ret{qw(-lei_store cfg pgr lxs)}; # keep l2m
+	my @io = delete @$ret{0..2};
+	$io[3] = delete($ret->{sock}) // *STDERR{GLOB};
+	my $l2m = $ret->{l2m};
+	if ($l2m && $l2m != $wq) {
+		$io[4] = $l2m->{-wq_s1} if $l2m->{-wq_s1};
+		if (my @pids = $l2m->wq_close) {
+			$wq->{l2m_pids} = \@pids;
+		}
 	}
+	($ret, @io);
 }
 
 sub _help ($;$) {
@@ -656,7 +672,7 @@ sub start_mua {
 	@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
 	push @cmd, $mfolder unless defined($replaced);
 	$sock //= $self->{sock};
-	if ($sock) { # lei(1) client process runs it
+	if ($PublicInbox::DS::in_loop) { # lei(1) client process runs it
 		send($sock, exec_buf(\@cmd, {}), MSG_EOR);
 	} else { # oneshot
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index c0b423f6..538d6bd5 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -140,6 +140,16 @@ sub _unbless_smsg {
 
 sub ovv_atexit_child {
 	my ($self, $lei) = @_;
+	if (my $l2m = delete $lei->{l2m}) {
+		# gracefully stop lei2mail processes after all
+		# ->write_mail work is complete
+		delete $l2m->{-wq_s1};
+		if (my $rd = delete $l2m->{each_smsg_done}) {
+			read($rd, my $buf, 1); # wait for EOF
+		}
+	}
+	# order matters, git->{-tmp}->DESTROY must not fire until
+	# {each_smsg_done} hits EOF above
 	if (my $git = delete $self->{git}) {
 		$git->async_wait_all;
 	}
@@ -178,8 +188,6 @@ sub _json_pretty {
 
 sub ovv_each_smsg_cb { # runs in wq worker usually
 	my ($self, $lei, $ibxish) = @_;
-	$lei->{ovv_buf} = \(my $buf = '');
-	delete(@$self{qw(lock_path tmp_lk_id)}) unless $lei->{-parallel};
 	my $json;
 	$lei->{1}->autoflush(1);
 	if (my $pkg = $self->{json}) {
@@ -187,7 +195,27 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		$json->utf8->canonical;
 		$json->ascii(1) if $lei->{opt}->{ascii};
 	}
-	if (my $l2m = $lei->{l2m}) {
+	my $l2m = $lei->{l2m};
+	if ($l2m && $l2m->{-wq_s1}) {
+		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
+		# n.b. $io[0] = qry_status_wr, $io[1] = mbox|stdout,
+		# $io[4] becomes a notification pipe that triggers EOF
+		# in this wq worker when all outstanding ->write_mail
+		# calls are complete
+		die "BUG: \$io[4] $io[4] unexpected" if $io[4];
+		pipe($l2m->{each_smsg_done}, $io[4]) or die "pipe: $!";
+		fcntl($io[4], 1031, 4096) if $^O eq 'linux';
+		delete @$lei_ipc{qw(l2m opt mset_opt cmd)};
+		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
+		$self->{git} = $git;
+		my $git_dir = $git->{git_dir};
+		sub {
+			my ($smsg, $mitem) = @_;
+			my $kw = []; # TODO get from mitem
+			$l2m->wq_do('write_mail', \@io, $git_dir,
+					$smsg->{blob}, $lei_ipc, $kw)
+		}
+	} elsif ($l2m) {
 		my $wcb = $l2m->write_cb($lei);
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git; # for ovv_atexit_child
@@ -199,6 +227,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		};
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
+		$lei->{ovv_buf} = \(my $buf = '');
 		sub { # DIY prettiness :P
 			my ($smsg, $mitem) = @_;
 			$smsg = _unbless_smsg($smsg, $mitem);
@@ -221,6 +250,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		}
 	} elsif ($json) {
 		my $ORS = $self->{fmt} eq 'json' ? ",\n" : "\n"; # JSONL
+		$lei->{ovv_buf} = \(my $buf = '');
 		sub {
 			my ($smsg, $mitem) = @_;
 			delete @$smsg{qw(tid num)};
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index a80d5887..d6e801e3 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -41,11 +41,17 @@ sub lei_q {
 	$j = 1 if !$opt->{thread};
 	$j++ if $opt->{'local'}; # for sto->search below
 	$self->atfork_prepare_wq($lxs);
-	$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset)
-		// $lxs->wq_workers($j);
+	$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset);
+	$self->{lxs} = $lxs;
 
-	# no forking workers after this
 	my $ovv = PublicInbox::LeiOverview->new($self) or return;
+	if (my $l2m = $self->{l2m}) {
+		$j = 4 if $j <= 4; # TODO configurable
+		$self->atfork_prepare_wq($l2m);
+		$l2m->wq_workers_start('lei2mail', $j, $self->oldset);
+	}
+
+	# no forking workers after this
 	my $sto = $self->_lei_store(1);
 	unshift(@srcs, $sto->search) if $opt->{'local'};
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 0e23b8da..17d48a90 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -5,6 +5,7 @@
 package PublicInbox::LeiToMail;
 use strict;
 use v5.10.1;
+use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::Lock;
 use PublicInbox::ProcessPipe;
@@ -14,6 +15,8 @@ use Symbol qw(gensym);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
 use Errno qw(EEXIST ESPIPE ENOENT);
+use File::Temp 0.19 (); # 0.19 for ->newdir
+use PublicInbox::Git;
 
 my %kw2char = ( # Maildir characters
 	draft => 'D',
@@ -422,4 +425,30 @@ sub lock_free {
 	$_[0]->{base_type} =~ /\A(?:maildir|mh|imap|jmap)\z/ ? 1 : 0;
 }
 
+sub write_mail { # via ->wq_do
+	my ($self, $git_dir, $oid, $lei, $kw) = @_;
+	my $wcb = $self->{wcb} //= do { # first message
+		my %sig = $lei->atfork_child_wq($self);
+		@SIG{keys %sig} = values %sig; # not local
+		$lei->{dedupe}->prepare_dedupe;
+		$self->write_cb($lei);
+	};
+	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
+	$git->cat_async($oid, \&git_to_mail, [ $wcb, $kw ]);
+}
+
+sub ipc_atfork_prepare {
+	my ($self) = @_;
+	# (qry_status_wr, stdout|mbox, stderr, 3: sock, 4: each_smsg_done_wr)
+	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= >&=]);
+	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
+}
+
+sub DESTROY {
+	my ($self) = @_;
+	for my $pid_git (grep(/\A$$\0/, keys %$self)) {
+		$self->{$pid_git}->async_wait_all;
+	}
+}
+
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 91864cd0..dc5cf3b6 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -126,6 +126,7 @@ sub query_thread_mset { # for --thread
 			@{$ctx->{xids}} = ();
 		}
 	} while (_mset_more($mset, $mo));
+	undef $each_smsg; # drops @io for l2m->{each_smsg_done}
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
@@ -147,6 +148,7 @@ sub query_mset { # non-parallel for non-"--thread" users
 			$each_smsg->($smsg, $it);
 		}
 	} while (_mset_more($mset, $mo));
+	undef $each_smsg; # drops @io for l2m->{each_smsg_done}
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
@@ -170,11 +172,14 @@ sub git {
 }
 
 sub query_done { # EOF callback
-	my ($lei) = @_;
-	$lei->{ovv}->ovv_end($lei);
-	if (my $l2m = $lei->{l2m}) {
-		$lei->start_mua unless $l2m->lock_free;
+	my ($self, $lei) = @_;
+	my $l2m = delete $lei->{l2m};
+	if (my $pids = delete $self->{l2m_pids}) {
+		my $ipc_worker_reap = $self->can('ipc_worker_reap');
+		dwaitpid($_, $ipc_worker_reap, $l2m) for @$pids;
 	}
+	$lei->{ovv}->ovv_end($lei);
+	$lei->start_mua if $l2m && !$l2m->lock_free;
 	$lei->dclose;
 }
 
@@ -188,12 +193,10 @@ sub start_query { # always runs in main (lei-daemon) process
 	}
 	my $remotes = $self->{remotes} // [];
 	if ($lei->{opt}->{thread}) {
-		$lei->{-parallel} = scalar(@$remotes) + scalar(@$srcs) - 1;
 		for my $ibxish (@$srcs) {
 			$self->wq_do('query_thread_mset', $io, $lei, $ibxish);
 		}
 	} else {
-		$lei->{-parallel} = scalar(@$remotes);
 		$self->wq_do('query_mset', $io, $lei, $srcs);
 	}
 	# TODO
@@ -226,12 +229,12 @@ sub do_query {
 	$io[0] = undef;
 	pipe(my $qry_status_rd, $io[0]) or die "pipe $!";
 
-	$lei_orig->{lxs} = $self;
 	$lei_orig->event_step_init; # wait for shutdowns
-	my $op_map = { '' => [ \&query_done, $lei_orig ] };
+	my $op_map = { '' => [ \&query_done, $self, $lei_orig ] };
 	my $in_loop = exists $lei_orig->{sock};
 	my $opp = PublicInbox::OpPipe->new($qry_status_rd, $op_map, $in_loop);
-	if (my $l2m = $lei->{l2m}) {
+	my $l2m = $lei->{l2m};
+	if ($l2m) {
 		$l2m->pre_augment($lei_orig); # may redirect $lei->{1} for mbox
 		$io[1] = $lei_orig->{1};
 		$op_map->{'.'} = [ \&start_query, $self, \@io, $lei, $srcs ];
@@ -246,13 +249,17 @@ sub do_query {
 		$op_map->{'!'} = [ \&CORE::kill, 'TERM', @pids ];
 		$opp->event_step;
 		my $ipc_worker_reap = $self->can('ipc_worker_reap');
+		if (my $l2m_pids = delete $self->{l2m_pids}) {
+			dwaitpid($_, $ipc_worker_reap, $l2m) for @$l2m_pids;
+		}
 		dwaitpid($_, $ipc_worker_reap, $self) for @pids;
 	}
 }
 
 sub ipc_atfork_prepare {
 	my ($self) = @_;
-	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&=]);
+	# (qry_status_wr, stdout|mbox, stderr, 3: sock, 4: $l2m->{-wq_s1})
+	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= +<&=]);
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
 }
 
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index e5c0b1e9..b03f2d59 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -209,7 +209,7 @@ my $fdpass = <<'FDPASS';
 #include <sys/socket.h>
 
 #if defined(CMSG_SPACE) && defined(CMSG_LEN)
-#define SEND_FD_CAPA 4
+#define SEND_FD_CAPA 5
 #define SEND_FD_SPACE (SEND_FD_CAPA * sizeof(int))
 union my_cmsg {
 	struct cmsghdr hdr;

^ permalink raw reply related	[relevance 35%]

* Re: [PATCH 1/2] lei q: parallelize Maildir and mbox writing
  2021-01-18 10:30 35% ` [PATCH 1/2] lei q: parallelize Maildir and mbox writing Eric Wong
@ 2021-01-18 21:19 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-18 21:19 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
<snip>
> @@ -14,6 +15,8 @@ use Symbol qw(gensym);
>  use IO::Handle; # ->autoflush
>  use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
>  use Errno qw(EEXIST ESPIPE ENOENT);
> +use File::Temp 0.19 (); # 0.19 for ->newdir

File::Temp is unnecessary, will squash this in:

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 17d48a90..8d030227 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -15,7 +15,6 @@ use Symbol qw(gensym);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
 use Errno qw(EEXIST ESPIPE ENOENT);
-use File::Temp 0.19 (); # 0.19 for ->newdir
 use PublicInbox::Git;
 
 my %kw2char = ( # Maildir characters

My initial attempt at making this change didn't have the
{each_smsg_done} pipe performing keepalive.  Instead it created
another tmpdir copy of the tmpdir created by LeiXSearch->git.

That was ugly and didn't work, since it was possible for the
original LeiXSearch->git tmpdir to go out-of-scope before the
lei2mail worker even got a chance to copy the bare directory
and its alternates file.

The {each_smsg_done} pipe is much nicer since the kernel can
keep track of in-flight pipes and doesn't inflict extra FS
activity.
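
A minimal standalone sketch of the pipe-as-keepalive idea,
assuming plain fork() workers rather than the wq_do machinery
(the worker count and sleep times are made up).  Each worker
holds a copy of the write end, so the reader only sees EOF once
every worker has exited:

```perl
use strict;
use warnings;

pipe(my $done_rd, my $done_wr) or die "pipe: $!";
my @pids;
for my $i (1..3) {
	my $pid = fork // die "fork: $!";
	if ($pid == 0) { # worker inherits $done_wr and holds it
		close $done_rd;
		select(undef, undef, undef, 0.05 * $i); # pretend to work
		exit 0; # exiting closes this worker's copy of $done_wr
	}
	push @pids, $pid;
}
close $done_wr; # parent must drop its copy or EOF never arrives
my $n = read($done_rd, my $buf, 1); # blocks until all workers exit
defined($n) or die "read: $!";
print "EOF after all workers finished\n" if $n == 0;
waitpid($_, 0) for @pids;
```

No bytes are ever written to the pipe; the kernel's reference
counting on the write end is the whole notification mechanism.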

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/9] lei bugfixes and error handling
@ 2021-01-19  9:34 71% Eric Wong
  2021-01-19  9:34 47% ` [PATCH 1/9] lei q: start ->mset while query_prepare runs Eric Wong
                   ` (12 more replies)
  0 siblings, 13 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

1/9 could have some potential when we start handling remotes,
3/9 seems necessary, unfortunately :<
4/9 helped find some bugs,
9/9 is incomplete, but that describes everything :<

Eric Wong (9):
  lei q: start ->mset while query_prepare runs
  lei q: fix SIGPIPE handling from lei2mail workers
  lei q: do not spawn MUA early
  lei: write daemon errors to the sock directory
  lei q: fix augment of compressed mailboxes
  lei_overview: do not write if $lei->{1} is gone
  t/lei: fix double-running of socket test with oneshot
  lei: test some likely errors due to misuse
  lei_overview: start implementing format detection

 lib/PublicInbox/LEI.pm         |  22 ++++---
 lib/PublicInbox/LeiOverview.pm |  25 ++++++--
 lib/PublicInbox/LeiToMail.pm   |  39 +++++++-----
 lib/PublicInbox/LeiXSearch.pm  | 105 +++++++++++++++++++++++----------
 lib/PublicInbox/Spawn.pm       |   2 +-
 t/lei.t                        |  75 +++++++++++++++--------
 t/lei_to_mail.t                |   4 +-
 xt/lei-sigpipe.t               |  29 +++++----
 8 files changed, 200 insertions(+), 101 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 4/9] lei: write daemon errors to the sock directory
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-19  9:34 69% ` [PATCH 3/9] lei q: do not spawn MUA early Eric Wong
@ 2021-01-19  9:34 71% ` Eric Wong
  2021-01-19  9:34 41% ` [PATCH 5/9] lei q: fix augment of compressed mailboxes Eric Wong
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

Most everything should be captured by the __WARN__ handlers and
routed to syslog, but it appears Perl may write to stderr in
some emergency cases, as can libc or other libraries.  Just
point it to a small file that's cleared on reboot.
---
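A minimal sketch of how the errors.log path falls out of the
socket path, using the same regexp as the hunk below.
err_log_for is a hypothetical helper and the socket path is
only an example:

```perl
use strict;
use warnings;

# err_log_for is a hypothetical helper, not part of LEI.pm
sub err_log_for {
	my ($path) = @_;
	# capture everything up to and including the last slash
	my ($dir) = ($path =~ m!\A(.+?/)[^/]+\z!) or
		die "no directory component in $path\n";
	$dir . 'errors.log';
}

my $log = err_log_for('/run/user/1000/lei/5.seq.sock');
print "$log\n";
```
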
 lib/PublicInbox/LEI.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 802d2cd9..e4f8bedb 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -830,7 +830,9 @@ sub lazy_start {
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
 	(-p STDOUT) or die "E: stdout must be a pipe\n";
-	open(STDIN, '+<', '/dev/null') or die "redirect stdin failed: $!";
+	my ($err) = ($path =~ m!\A(.+?/)[^/]+\z!);
+	$err .= 'errors.log';
+	open(STDIN, '+>>', $err) or die "open($err): $!";
 	POSIX::setsid() > 0 or die "setsid: $!";
 	my $pid = fork // die "fork: $!";
 	return if $pid;

^ permalink raw reply related	[relevance 71%]

* [PATCH 6/9] lei_overview: do not write if $lei->{1} is gone
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (4 preceding siblings ...)
  2021-01-19  9:34 41% ` [PATCH 5/9] lei q: fix augment of compressed mailboxes Eric Wong
@ 2021-01-19  9:34 71% ` Eric Wong
  2021-01-19  9:34 69% ` [PATCH 7/9] t/lei: fix double-running of socket test with oneshot Eric Wong
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

We'll invalidate the {1} (stdout) field on SIGPIPE,
so don't trigger a Perl warning by writing to it.
---
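For context on the "null]\n" terminator touched below, a
minimal sketch of why it exists, assuming the array is opened
with "[" elsewhere and that every writer appends ",\n" after
its record (the records are made up):

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new->utf8->canonical;
my $out = '[';
# each writer emits "record,\n" without knowing if it is the last
$out .= $json->encode($_) . ",\n" for ({ s => 'a' }, { s => 'b' });
$out .= "null]\n"; # the terminator makes the trailing comma legal
my $res = $json->decode($out);
printf "%d elements, last is %s\n", scalar(@$res),
	defined($res->[-1]) ? 'defined' : 'null';
```

Consumers need to skip the final null element.
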
 lib/PublicInbox/LeiOverview.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 538d6bd5..8781259a 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -99,12 +99,13 @@ sub ovv_begin {
 # called once by parent (via PublicInbox::EOFpipe)
 sub ovv_end {
 	my ($self, $lei) = @_;
+	my $out = $lei->{1} or return;
 	if ($self->{fmt} eq 'json') {
 		# JSON doesn't allow trailing commas, and preventing
 		# trailing commas is a PITA when parallelizing outputs
-		print { $lei->{1} } "null]\n";
+		print $out "null]\n";
 	} elsif ($self->{fmt} eq 'concatjson') {
-		print { $lei->{1} } "\n";
+		print $out "\n";
 	}
 }
 

^ permalink raw reply related	[relevance 71%]

* [PATCH 3/9] lei q: do not spawn MUA early
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
  2021-01-19  9:34 47% ` [PATCH 1/9] lei q: start ->mset while query_prepare runs Eric Wong
  2021-01-19  9:34 48% ` [PATCH 2/9] lei q: fix SIGPIPE handling from lei2mail workers Eric Wong
@ 2021-01-19  9:34 69% ` Eric Wong
  2021-01-19  9:34 71% ` [PATCH 4/9] lei: write daemon errors to the sock directory Eric Wong
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

I'm not sure why, but mutt sometimes won't detect small
Maildirs quickly.  We'll display a progress meter when
writing results, instead.
---
 lib/PublicInbox/LeiToMail.pm  | 4 ----
 lib/PublicInbox/LeiXSearch.pm | 3 +--
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 8e58ad11..99388b5b 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -439,10 +439,6 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 	$self->$m($lei);
 }
 
-sub lock_free {
-	$_[0]->{base_type} =~ /\A(?:maildir|mh|imap|jmap)\z/ ? 1 : 0;
-}
-
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $oid, $lei, $kw) = @_;
 	my $not_done = delete $self->{4}; # write end of {each_smsg_done}
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 45a073a0..120857b8 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -191,7 +191,7 @@ sub query_done { # EOF callback
 		dwaitpid($_, $ipc_worker_reap, $l2m) for @$pids;
 	}
 	$lei->{ovv}->ovv_end($lei);
-	$lei->start_mua if $l2m && !$l2m->lock_free;
+	$lei->start_mua if $l2m;
 	$lei->dclose;
 }
 
@@ -201,7 +201,6 @@ sub start_query { # always runs in main (lei-daemon) process
 		$lei->{1} = $io->[1];
 		$l2m->post_augment($lei);
 		$io->[1] = delete $lei->{1};
-		$lei->start_mua($io->[3]) if $l2m->lock_free;
 	}
 	my $remotes = $self->{remotes} // [];
 	if ($lei->{opt}->{thread}) {

^ permalink raw reply related	[relevance 69%]

* [PATCH 7/9] t/lei: fix double-running of socket test with oneshot
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (5 preceding siblings ...)
  2021-01-19  9:34 71% ` [PATCH 6/9] lei_overview: do not write if $lei->{1} is gone Eric Wong
@ 2021-01-19  9:34 69% ` Eric Wong
  2021-01-19  9:34 58% ` [PATCH 8/9] lei: test some likely errors due to misuse Eric Wong
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

We split out t/lei-oneshot.t and t/lei.t so it's easier
to isolate run-mode-specific bugs and behavior, so there's
no reason to rerun the socket daemon tests.
---
 t/lei.t | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/t/lei.t b/t/lei.t
index 8eede13e..c804ff59 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -234,18 +234,14 @@ if ($ENV{TEST_LEI_ONESHOT}) {
 	local $ENV{XDG_RUNTIME_DIR} = $xrd;
 	$err_filter = qr!\Q$xrd!;
 	$test_lei_common->();
-}
-
+} else {
 SKIP: { # real socket
-	require_mods(qw(Cwd), my $nr = 105);
-	my $nfd = eval { require Socket::MsgHdr; 5 } // do {
+	eval { require Socket::MsgHdr; 1 } // do {
 		require PublicInbox::Spawn;
-		PublicInbox::Spawn->can('send_cmd4') ? 5 : undef;
-	} //
-	skip 'Socket::MsgHdr or Inline::C missing or unconfigured', $nr;
-
+		PublicInbox::Spawn->can('send_cmd4');
+	} // skip 'Socket::MsgHdr or Inline::C missing or unconfigured', 115;
 	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
-	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/$nfd.seq.sock";
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
 
 	ok($lei->('daemon-pid'), 'daemon-pid');
 	is($err, '', 'no error from daemon-pid');
@@ -297,6 +293,7 @@ SKIP: { # real socket
 	}
 	ok(!kill(0, $new_pid), 'daemon exits after unlink');
 	# success over socket, can't test without
-};
+}; # SKIP
+} # else
 
 done_testing;

^ permalink raw reply related	[relevance 69%]

* [PATCH 8/9] lei: test some likely errors due to misuse
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (6 preceding siblings ...)
  2021-01-19  9:34 69% ` [PATCH 7/9] t/lei: fix double-running of socket test with oneshot Eric Wong
@ 2021-01-19  9:34 58% ` Eric Wong
  2021-01-20  5:04 71% ` [PATCH 0/7] lei: fixes piled higher and deeper Eric Wong
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

Because user errors happen...
---
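A minimal sketch of the destination checks added to
LeiToMail->new below, pulled out into a hypothetical check_dst
helper so they can be tried standalone.  The `_' filehandle
reuses the stat buffer from the preceding -e test:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# check_dst is a hypothetical helper, not part of LeiToMail.pm
sub check_dst {
	my ($fmt, $dst) = @_;
	if ($fmt eq 'maildir') {
		-e $dst && !-d _ and die
			"$dst exists and is not a directory\n";
	} else { # mbox family: regular files and FIFOs are fine
		-e $dst && !-f _ && !-p _ and die
			"$dst exists and is not a regular file\n";
	}
	1;
}

my $dir = tempdir(CLEANUP => 1);
eval { check_dst('maildir', '/dev/null') };
my $e1 = $@;
eval { check_dst('mboxcl2', $dir) };
my $e2 = $@;
my $ok = check_dst('maildir', "$dir/does-not-exist");
print $e1, $e2;
```

A destination which does not exist yet passes either check,
since it is created later.
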
 lib/PublicInbox/LeiOverview.pm |  3 ++-
 lib/PublicInbox/LeiToMail.pm   |  6 +++++-
 lib/PublicInbox/LeiXSearch.pm  |  9 ++++++++-
 t/lei.t                        | 14 ++++++++++++++
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 8781259a..a7021b03 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -82,7 +82,8 @@ sub new {
 	if (!$json) {
 		# default to the cheapest sort since MUA usually resorts
 		$lei->{opt}->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
-		$lei->{l2m} = PublicInbox::LeiToMail->new($lei);
+		$lei->{l2m} = eval { PublicInbox::LeiToMail->new($lei) };
+		return $lei->fail($@) if $@;
 	}
 	$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
 	$self;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a6e517ea..49b5c8ab 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -339,8 +339,12 @@ sub new {
 	my $self = bless {}, $cls;
 	if ($fmt eq 'maildir') {
 		$self->{base_type} = 'maildir';
+		-e $dst && !-d _ and die
+				"$dst exists and is not a directory\n";
 		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
 	} elsif (substr($fmt, 0, 4) eq 'mbox') {
+		-e $dst && !-f _ && !-p _ and die
+				"$dst exists and is not a regular file\n";
 		$self->can("eml2$fmt") or die "bad mbox --format=$fmt\n";
 		$self->{base_type} = 'mbox';
 	} else {
@@ -374,7 +378,7 @@ sub _post_augment_maildir {
 		my $d = $dst.$x;
 		next if -d $d;
 		require File::Path;
-		File::Path::mkpath($d) or die "mkpath($d): $!";
+		File::Path::mkpath($d);
 		-d $d or die "$d is not a directory";
 	}
 }
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 002791c2..fa37543f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -201,7 +201,14 @@ sub query_done { # EOF callback
 sub do_post_augment {
 	my ($lei, $zpipe, $au_done) = @_;
 	my $l2m = $lei->{l2m} or die 'BUG: no {l2m}';
-	$l2m->post_augment($lei, $zpipe);
+	eval { $l2m->post_augment($lei, $zpipe) };
+	if (my $err = $@) {
+		if (my $lxs = delete $lei->{lxs}) {
+			$lxs->wq_kill;
+			$lxs->wq_close;
+		}
+		$lei->fail("$err");
+	}
 	close $au_done; # triggers wait_startq
 }
 
diff --git a/t/lei.t b/t/lei.t
index c804ff59..8bb4e439 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -181,6 +181,20 @@ my $test_external = sub {
 	$lei->('ls-external');
 	like($out, qr/boost=0\n/s, 'ls-external has output');
 
+	ok(!$lei->(qw(q s:prefix -o /dev/null -f maildir)), 'bad maildir');
+	like($err, qr!/dev/null exists and is not a directory!,
+		'error shown');
+	is($? >> 8, 1, 'errored out with exit 1');
+
+	ok(!$lei->(qw(q s:prefix -f mboxcl2 -o), $home), 'bad mbox');
+	like($err, qr!\Q$home\E exists and is not a regular file!,
+		'error shown');
+	is($? >> 8, 1, 'errored out with exit 1');
+
+	ok(!$lei->(qw(q s:prefix -o /dev/stdout -f Mbox2)), 'bad format');
+	like($err, qr/bad mbox --format=mbox2/, 'error shown');
+	is($? >> 8, 1, 'errored out with exit 1');
+
 	# note, on a Bourne shell users should be able to use either:
 	#	s:"use boolean prefix"
 	#	"s:use boolean prefix"

^ permalink raw reply related	[relevance 58%]

* [PATCH 2/9] lei q: fix SIGPIPE handling from lei2mail workers
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
  2021-01-19  9:34 47% ` [PATCH 1/9] lei q: start ->mset while query_prepare runs Eric Wong
@ 2021-01-19  9:34 48% ` Eric Wong
  2021-01-19  9:34 69% ` [PATCH 3/9] lei q: do not spawn MUA early Eric Wong
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

We need to properly propagate SIGPIPE to the top-level
lei-daemon process and avoid relying on auto-close,
since auto-close triggers Perl warnings whereas an explicit
close() does not.
---
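A minimal sketch of the error classification used in the
_mbox_write_cb hunk below.  This is a simulation: no real
signal is delivered, and $write is a stand-in which dies with
a blessed scalar the way the PIPE handler does.  write_one
and the string standing in for a filehandle are hypothetical:

```perl
use strict;
use warnings;

my $out = 'fake-handle'; # stands in for the output filehandle
my @written;
my $write = sub { # simulated writer, "boom" fakes a SIGPIPE
	my ($fh, $buf) = @_;
	if ($buf eq 'boom') {
		my $obj = \(my $name = 'PIPE');
		die bless($obj, 'PublicInbox::SIGPIPE');
	}
	push @written, $buf;
};

sub write_one {
	my ($buf) = @_;
	return unless $out; # output already gone, skip quietly
	eval { $write->($out, $buf) };
	if ($@) {
		# anything other than SIGPIPE is a real error
		die $@ if ref($@) ne 'PublicInbox::SIGPIPE';
		undef $out; # reader went away, stop writing
	}
}

write_one($_) for qw(a boom b);
print "written: @written\n";
```
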
 lib/PublicInbox/LEI.pm        | 15 +++++++++------
 lib/PublicInbox/LeiToMail.pm  |  7 ++++++-
 lib/PublicInbox/LeiXSearch.pm | 23 +++++++++++++++++++----
 xt/lei-sigpipe.t              | 29 ++++++++++++++++-------------
 4 files changed, 50 insertions(+), 24 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 4b1dc673..802d2cd9 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -301,10 +301,13 @@ sub atfork_child_wq {
 	PIPE => sub {
 		$self->x_it(13); # SIGPIPE = 13
 		# we need to close explicitly to avoid Perl warning on SIGPIPE
-		close(delete $self->{1});
-		# regular files and /dev/null (-c) won't trigger SIGPIPE
-		close(delete $self->{2}) unless (-f $self->{2} || -c _);
-		syswrite($self->{0}, '!') unless $self->{sock}; # for eof_wait
+		for my $i (1, 2) {
+			next unless $self->{$i} && (-p $self->{$i} || -S _);
+			close(delete $self->{$i});
+		}
+		# trigger the LeiXSearch $done OpPipe:
+		syswrite($self->{0}, '!') if $self->{0} && -p $self->{0};
+		$SIG{PIPE} = 'DEFAULT';
 		die bless(\"$_[0]", 'PublicInbox::SIGPIPE'),
 	});
 }
@@ -322,7 +325,7 @@ sub atfork_parent_wq {
 	my @io = delete @$ret{0..2};
 	$io[3] = delete($ret->{sock}) // *STDERR{GLOB};
 	my $l2m = $ret->{l2m};
-	if ($l2m && $l2m != $wq) {
+	if ($l2m && $l2m != $wq) { # $wq == lxs
 		$io[4] = $l2m->{-wq_s1} if $l2m->{-wq_s1};
 		if (my @pids = $l2m->wq_close) {
 			$wq->{l2m_pids} = \@pids;
@@ -672,7 +675,7 @@ sub start_mua {
 	@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
 	push @cmd, $mfolder unless defined($replaced);
 	$sock //= $self->{sock};
-	if ($PublicInbox::DS::in_loop) { # lei(1) client process runs it
+	if ($sock) { # lei(1) client process runs it
 		send($sock, exec_buf(\@cmd, {}), MSG_EOR);
 	} else { # oneshot
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a1dce550..8e58ad11 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -247,11 +247,16 @@ sub _mbox_write_cb ($$) {
 	$dedupe->prepare_dedupe;
 	sub { # for git_to_mail
 		my ($buf, $oid, $kw) = @_;
+		return unless $out;
 		my $eml = PublicInbox::Eml->new($buf);
 		if (!$dedupe->is_dup($eml, $oid)) {
 			$buf = $eml2mbox->($eml, $kw);
 			my $lk = $ovv->lock_for_scope;
-			$write->($out, $buf);
+			eval { $write->($out, $buf) };
+			if ($@) {
+				die $@ if ref($@) ne 'PublicInbox::SIGPIPE';
+				undef $out
+			}
 		}
 	}
 }
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 73fd17f4..45a073a0 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -219,7 +219,7 @@ sub start_query { # always runs in main (lei-daemon) process
 	@$io = ();
 }
 
-sub query_prepare { # for wq_do,
+sub query_prepare { # called by wq_do
 	my ($self, $lei) = @_;
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
@@ -227,6 +227,18 @@ sub query_prepare { # for wq_do,
 	$lei->fail($@) if $@;
 }
 
+sub sigpipe_handler {
+	my ($self, $lei_orig, $pids) = @_;
+	if ($pids) { # one-shot (no event loop)
+		kill 'TERM', @$pids;
+		kill 'PIPE', $$;
+	} else {
+		$self->wq_kill;
+		$self->wq_close;
+	}
+	close(delete $lei_orig->{1}) if $lei_orig->{1};
+}
+
 sub do_query {
 	my ($self, $lei_orig, $srcs) = @_;
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
@@ -234,7 +246,10 @@ sub do_query {
 	pipe(my $done, $io[0]) or die "pipe $!";
 
 	$lei_orig->event_step_init; # wait for shutdowns
-	my $done_op = { '' => [ \&query_done, $self, $lei_orig ] };
+	my $done_op = {
+		'' => [ \&query_done, $self, $lei_orig ],
+		'!' => [ \&sigpipe_handler, $self, $lei_orig ]
+	};
 	my $in_loop = exists $lei_orig->{sock};
 	$done = PublicInbox::OpPipe->new($done, $done_op, $in_loop);
 	my $l2m = $lei->{l2m};
@@ -244,7 +259,7 @@ sub do_query {
 		my @l2m_io = (undef, @io[1..$#io]);
 		pipe(my $startq, $l2m_io[0]) or die "pipe: $!";
 		$self->wq_do('query_prepare', \@l2m_io, $lei);
-		$io[4] //= *STDERR{GLOB};
+		$io[4] = *STDERR{GLOB}; # don't send l2m->{-wq_s1}
 		die "BUG: unexpected \$io[5]: $io[5]" if $io[5];
 		fcntl($startq, 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
 		$io[5] = $startq;
@@ -253,7 +268,7 @@ sub do_query {
 	unless ($in_loop) {
 		my @pids = $self->wq_close;
 		# for the $lei->atfork_child_wq PIPE handler:
-		$done_op->{'!'} = [ \&CORE::kill, 'TERM', @pids ];
+		$done_op->{'!'}->[3] = \@pids;
 		$done->event_step;
 		my $ipc_worker_reap = $self->can('ipc_worker_reap');
 		if (my $l2m_pids = delete $self->{l2m_pids}) {
diff --git a/xt/lei-sigpipe.t b/xt/lei-sigpipe.t
index 4d35bbb3..448bd7db 100644
--- a/xt/lei-sigpipe.t
+++ b/xt/lei-sigpipe.t
@@ -11,19 +11,22 @@ require_mods(qw(json DBD::SQLite Search::Xapian));
 
 my $do_test = sub {
 	my $env = shift // {};
-	pipe(my ($r, $w)) or BAIL_OUT $!;
-	open my $err, '+>', undef or BAIL_OUT $!;
-	my $opt = { run_mode => 0, 1 => $w, 2 => $err };
-	my $tp = start_script([qw(lei q -t), 'bytes:1..'], $env, $opt);
-	close $w;
-	sysread($r, my $buf, 1);
-	close $r; # trigger SIGPIPE
-	$tp->join;
-	ok(WIFSIGNALED($?), 'signaled');
-	is(WTERMSIG($?), SIGPIPE, 'got SIGPIPE');
-	seek($err, 0, 0);
-	my @err = grep(!m{mkdir /dev/null\b}, <$err>);
-	is_deeply(\@err, [], 'no errors');
+	for my $out ([], [qw(-f mboxcl2)]) {
+		pipe(my ($r, $w)) or BAIL_OUT $!;
+		open my $err, '+>', undef or BAIL_OUT $!;
+		my $opt = { run_mode => 0, 1 => $w, 2 => $err };
+		my $cmd = [qw(lei q -t), @$out, 'bytes:1..'];
+		my $tp = start_script($cmd, $env, $opt);
+		close $w;
+		sysread($r, my $buf, 1);
+		close $r; # trigger SIGPIPE
+		$tp->join;
+		ok(WIFSIGNALED($?), "signaled @$out");
+		is(WTERMSIG($?), SIGPIPE, "got SIGPIPE @$out");
+		seek($err, 0, 0);
+		my @err = grep(!m{mkdir /dev/null\b}, <$err>);
+		is_deeply(\@err, [], "no errors @$out");
+	}
 };
 
 $do_test->();

^ permalink raw reply related	[relevance 48%]

* [PATCH 1/9] lei q: start ->mset while query_prepare runs
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
@ 2021-01-19  9:34 47% ` Eric Wong
  2021-01-19  9:34 48% ` [PATCH 2/9] lei q: fix SIGPIPE handling from lei2mail workers Eric Wong
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

We don't need the result of query_prepare (for augmenting or
mass unlinking) until we're ready to deduplicate and write
results to the filesystem.  This ought to let us hide some of
the cost of Xapian searches on multi-device/core systems for
extremely expensive searches.
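
To illustrate the idea (not the actual lei code), here is a minimal
sketch of using pipe EOF as a "start" barrier so a search can overlap
with a slow preparation step.  The names wait_for_eof and overlap are
hypothetical, and a plain fork stands in for the wq workers:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: block until every copy of the write end is closed.
# Returns true on a clean EOF, false if a byte was unexpectedly received.
sub wait_for_eof {
	my ($rd) = @_;
	my $n = read($rd, my $buf, 1) // die "read: $!";
	$n == 0;
}

# Run $prepare in a child while $search runs in the parent.  The search
# result is only returned after the child has finished preparing.
sub overlap {
	my ($prepare, $search) = @_;
	pipe(my ($rd, $wr)) or die "pipe: $!";
	my $pid = fork // die "fork: $!";
	if ($pid == 0) { # child: the slow preparation step
		close $rd;
		$prepare->();
		POSIX::_exit(0); # exiting closes $wr => EOF in the parent
	}
	close $wr; # parent must drop its copy, or EOF never arrives
	my $res = $search->(); # overlaps with $prepare in the child
	wait_for_eof($rd) or die 'BUG: unexpected data on pipe';
	waitpid($pid, 0);
	$res;
}
```

The design point is that nothing is ever written to the pipe: closing
the last write end is the whole message, so no framing is required.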
---
 lib/PublicInbox/LEI.pm        |  2 +-
 lib/PublicInbox/LeiToMail.pm  |  3 +-
 lib/PublicInbox/LeiXSearch.pm | 54 ++++++++++++++++++++---------------
 lib/PublicInbox/Spawn.pm      |  2 +-
 4 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 6b6ee0f5..4b1dc673 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -293,7 +293,7 @@ sub atfork_child_wq {
 	my ($sock, $l2m_wq_s1);
 	(@$self{qw(0 1 2)}, $sock, $l2m_wq_s1) = delete(@$wq{0..4});
 	$self->{sock} = $sock if -S $sock;
-	$self->{l2m}->{-wq_s1} = $l2m_wq_s1 if $l2m_wq_s1;
+	$self->{l2m}->{-wq_s1} = $l2m_wq_s1 if $l2m_wq_s1 && -S $l2m_wq_s1;
 	%PATH2CFG = ();
 	$quit = \&CORE::exit;
 	@TO_CLOSE_ATFORK_CHILD = ();
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index dcf6d8a3..a1dce550 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -440,6 +440,7 @@ sub lock_free {
 
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $oid, $lei, $kw) = @_;
+	my $not_done = delete $self->{4}; # write end of {each_smsg_done}
 	my $wcb = $self->{wcb} //= do { # first message
 		my %sig = $lei->atfork_child_wq($self);
 		@SIG{keys %sig} = values %sig; # not local
@@ -447,7 +448,7 @@ sub write_mail { # via ->wq_do
 		$self->write_cb($lei);
 	};
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
-	$git->cat_async($oid, \&git_to_mail, [ $wcb, $kw ]);
+	$git->cat_async($oid, \&git_to_mail, [ $wcb, $kw, $not_done ]);
 }
 
 sub ipc_atfork_prepare {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index dc5cf3b6..73fd17f4 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -94,8 +94,17 @@ sub _mset_more ($$) {
 	$size && (($mo->{offset} += $size) < ($mo->{limit} // 10000));
 }
 
+# $startq will EOF when query_prepare is done augmenting and allow
+# query_mset and query_thread_mset to proceed.
+sub wait_startq ($) {
+	my ($startq) = @_;
+	$_[0] = undef;
+	read($startq, my $query_prepare_done, 1);
+}
+
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
+	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
 
@@ -119,6 +128,7 @@ sub query_thread_mset { # for --thread
 		while ($over->expand_thread($ctx)) {
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
+				wait_startq($startq) if $startq;
 				next if $dedupe->is_smsg_dup($smsg);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
@@ -132,6 +142,7 @@ sub query_thread_mset { # for --thread
 
 sub query_mset { # non-parallel for non-"--thread" users
 	my ($self, $lei, $srcs) = @_;
+	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
 	my $mo = { %{$lei->{mset_opt}} };
@@ -144,6 +155,7 @@ sub query_mset { # non-parallel for non-"--thread" users
 		$mset = $self->mset($mo->{qstr}, $mo);
 		for my $it ($mset->items) {
 			my $smsg = smsg_for($self, $it) or next;
+			wait_startq($startq) if $startq;
 			next if $dedupe->is_smsg_dup($smsg);
 			$each_smsg->($smsg, $it);
 		}
@@ -207,47 +219,42 @@ sub start_query { # always runs in main (lei-daemon) process
 	@$io = ();
 }
 
-sub query_prepare { # wq_do
+sub query_prepare { # for wq_do,
 	my ($self, $lei) = @_;
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
-	if (my $l2m = $lei->{l2m}) {
-		eval { $l2m->do_augment($lei) };
-		return $lei->fail($@) if $@;
-	}
-	# trigger PublicInbox::OpPipe->event_step
-	my $qry_status_wr = $lei->{0} or
-		return $lei->fail('BUG: qry_status_wr missing');
-	$qry_status_wr->autoflush(1);
-	print $qry_status_wr '.' or # this should never fail...
-		return $lei->fail("BUG? print qry_status_wr: $!");
+	eval { $lei->{l2m}->do_augment($lei) };
+	$lei->fail($@) if $@;
 }
 
 sub do_query {
 	my ($self, $lei_orig, $srcs) = @_;
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
 	$io[0] = undef;
-	pipe(my $qry_status_rd, $io[0]) or die "pipe $!";
+	pipe(my $done, $io[0]) or die "pipe $!";
 
 	$lei_orig->event_step_init; # wait for shutdowns
-	my $op_map = { '' => [ \&query_done, $self, $lei_orig ] };
+	my $done_op = { '' => [ \&query_done, $self, $lei_orig ] };
 	my $in_loop = exists $lei_orig->{sock};
-	my $opp = PublicInbox::OpPipe->new($qry_status_rd, $op_map, $in_loop);
+	$done = PublicInbox::OpPipe->new($done, $done_op, $in_loop);
 	my $l2m = $lei->{l2m};
 	if ($l2m) {
 		$l2m->pre_augment($lei_orig); # may redirect $lei->{1} for mbox
 		$io[1] = $lei_orig->{1};
-		$op_map->{'.'} = [ \&start_query, $self, \@io, $lei, $srcs ];
-		$self->wq_do('query_prepare', \@io, $lei);
-		$opp->event_step if !$in_loop;
-	} else {
-		start_query($self, \@io, $lei, $srcs);
+		my @l2m_io = (undef, @io[1..$#io]);
+		pipe(my $startq, $l2m_io[0]) or die "pipe: $!";
+		$self->wq_do('query_prepare', \@l2m_io, $lei);
+		$io[4] //= *STDERR{GLOB};
+		die "BUG: unexpected \$io[5]: $io[5]" if $io[5];
+		fcntl($startq, 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
+		$io[5] = $startq;
 	}
+	start_query($self, \@io, $lei, $srcs);
 	unless ($in_loop) {
 		my @pids = $self->wq_close;
 		# for the $lei->atfork_child_wq PIPE handler:
-		$op_map->{'!'} = [ \&CORE::kill, 'TERM', @pids ];
-		$opp->event_step;
+		$done_op->{'!'} = [ \&CORE::kill, 'TERM', @pids ];
+		$done->event_step;
 		my $ipc_worker_reap = $self->can('ipc_worker_reap');
 		if (my $l2m_pids = delete $self->{l2m_pids}) {
 			dwaitpid($_, $ipc_worker_reap, $l2m) for @$l2m_pids;
@@ -258,8 +265,9 @@ sub do_query {
 
 sub ipc_atfork_prepare {
 	my ($self) = @_;
-	# (qry_status_wr, stdout|mbox, stderr, 3: sock, 4: $l2m->{-wq_s1})
-	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= +<&=]);
+	# (0: qry_status_wr, 1: stdout|mbox, 2: stderr,
+	#  3: sock, 4: $l2m->{-wq_s1}, 5: $startq)
+	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= +<&= <&=]);
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
 }
 
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index b03f2d59..376d2190 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -209,7 +209,7 @@ my $fdpass = <<'FDPASS';
 #include <sys/socket.h>
 
 #if defined(CMSG_SPACE) && defined(CMSG_LEN)
-#define SEND_FD_CAPA 5
+#define SEND_FD_CAPA 6
 #define SEND_FD_SPACE (SEND_FD_CAPA * sizeof(int))
 union my_cmsg {
 	struct cmsghdr hdr;

^ permalink raw reply related	[relevance 47%]

* [PATCH 5/9] lei q: fix augment of compressed mailboxes
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-19  9:34 71% ` [PATCH 4/9] lei: write daemon errors to the sock directory Eric Wong
@ 2021-01-19  9:34 41% ` Eric Wong
  2021-01-19  9:34 71% ` [PATCH 6/9] lei_overview: do not write if $lei->{1} is gone Eric Wong
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-19  9:34 UTC (permalink / raw)
  To: meta

We need to delay writing out the mailbox until the compressor
process is up and running, so startq now signals EOF only after
post_augment has spawned the compressor.  This means we must
create the pipe early and hand it off to the workers before
augmenting, despite spawning the gzip/pigz/xz/bzip2 process
after augment is complete.
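
A rough sketch of the "create the pipe early, spawn the reader late"
ordering, assuming a plain fork+exec instead of PublicInbox::Spawn.
spawn_on_pipe is a hypothetical name, and $cmd would be something like
[ 'gzip', '-c' ]:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @$cmd with its stdin connected to the read
# end of an already-created pipe and its stdout going to $out_path.
# Returns the child PID and the write end of the pipe.
sub spawn_on_pipe {
	my ($zpipe, $cmd, $out_path) = @_;
	my ($r, $w) = splice(@$zpipe, 0, 2);
	my $pid = fork // die "fork: $!";
	if ($pid == 0) {
		close $w; # or the child never sees EOF on its stdin
		open(STDIN, '<&', $r) or POSIX::_exit(126);
		open(STDOUT, '>', $out_path) or POSIX::_exit(126);
		exec { $cmd->[0] } @$cmd or POSIX::_exit(127);
	}
	close $r; # only the child reads from here on
	($pid, $w);
}
```

Since the pipe exists before the reader does, whoever holds the write
end can be handed it in advance.  Data written early just sits in the
kernel pipe buffer (up to its capacity) until the reader starts.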
---
 lib/PublicInbox/LEI.pm        |  1 +
 lib/PublicInbox/LeiToMail.pm  | 19 +++++++++-------
 lib/PublicInbox/LeiXSearch.pm | 40 +++++++++++++++++++++------------
 t/lei.t                       | 42 ++++++++++++++++++++++-------------
 t/lei_to_mail.t               |  4 ++--
 5 files changed, 66 insertions(+), 40 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e4f8bedb..f3edfe82 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -758,6 +758,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 sub dclose {
 	my ($self) = @_;
 	delete $self->{lxs}; # stops LeiXSearch queries
+	close(delete $self->{1}) if $self->{1}; # may reap_compress
 	$self->close if $self->{sock}; # PublicInbox::DS::close
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 99388b5b..a6e517ea 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -200,18 +200,19 @@ sub zsfx2cmd ($$$) {
 }
 
 sub _post_augment_mbox { # open a compressor process
-	my ($self, $lei) = @_;
+	my ($self, $lei, $zpipe) = @_;
 	my $zsfx = $self->{zsfx} or return;
 	my $cmd = zsfx2cmd($zsfx, undef, $lei);
-	pipe(my ($r, $w)) or die "pipe: $!";
+	my ($r, $w) = splice(@$zpipe, 0, 2);
 	my $rdr = { 0 => $r, 1 => $lei->{1}, 2 => $lei->{2} };
 	my $pid = spawn($cmd, $lei->{env}, $rdr);
-	$lei->{"pid.$pid"} = $cmd;
 	my $pp = gensym;
-	tie *$pp, 'PublicInbox::ProcessPipe', $pid, $w, \&reap_compress, $lei;
+	my $dup = bless { "pid.$pid" => $cmd }, ref($lei);
+	$dup->{$_} = $lei->{$_} for qw(2 sock);
+	tie *$pp, 'PublicInbox::ProcessPipe', $pid, $w, \&reap_compress, $dup;
 	$lei->{1} = $pp;
 	die 'BUG: unexpected {ovv}->{lock_path}' if $lei->{ovv}->{lock_path};
-	$lei->{ovv}->ovv_out_lk_init if ($lei->{opt}->{jobs} // 2) > 1;
+	$lei->{ovv}->ovv_out_lk_init;
 }
 
 sub decompress_src ($$$) {
@@ -395,7 +396,9 @@ sub _pre_augment_mbox {
 		die "seek($dst): $!\n";
 	}
 	state $zsfx_allow = join('|', keys %zsfx2cmd);
-	($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/);
+	($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/) or return;
+	pipe(my ($r, $w)) or die "pipe: $!";
+	[ $r, $w ];
 }
 
 sub _do_augment_mbox {
@@ -433,10 +436,10 @@ sub do_augment { # slow, runs in wq worker
 }
 
 sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
-	my ($self, $lei) = @_;
+	my ($self, $lei, @args) = @_;
 	# _post_augment_maildir, _post_augment_mbox
 	my $m = "_post_augment_$self->{base_type}";
-	$self->$m($lei);
+	$self->$m($lei, @args);
 }
 
 sub write_mail { # via ->wq_do
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 120857b8..002791c2 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -191,17 +191,22 @@ sub query_done { # EOF callback
 		dwaitpid($_, $ipc_worker_reap, $l2m) for @$pids;
 	}
 	$lei->{ovv}->ovv_end($lei);
-	$lei->start_mua if $l2m;
+	if ($l2m) { # calls LeiToMail reap_compress
+		close(delete($lei->{1})) if $lei->{1};
+		$lei->start_mua;
+	}
 	$lei->dclose;
 }
 
+sub do_post_augment {
+	my ($lei, $zpipe, $au_done) = @_;
+	my $l2m = $lei->{l2m} or die 'BUG: no {l2m}';
+	$l2m->post_augment($lei, $zpipe);
+	close $au_done; # triggers wait_startq
+}
+
 sub start_query { # always runs in main (lei-daemon) process
 	my ($self, $io, $lei, $srcs) = @_;
-	if (my $l2m = $lei->{l2m}) {
-		$lei->{1} = $io->[1];
-		$l2m->post_augment($lei);
-		$io->[1] = delete $lei->{1};
-	}
 	my $remotes = $self->{remotes} // [];
 	if ($lei->{opt}->{thread}) {
 		for my $ibxish (@$srcs) {
@@ -221,9 +226,11 @@ sub start_query { # always runs in main (lei-daemon) process
 sub query_prepare { # called by wq_do
 	my ($self, $lei) = @_;
 	my %sig = $lei->atfork_child_wq($self);
+	-p $lei->{0} or die "BUG: \$done pipe expected";
 	local @SIG{keys %sig} = values %sig;
 	eval { $lei->{l2m}->do_augment($lei) };
 	$lei->fail($@) if $@;
+	syswrite($lei->{0}, '.') == 1 or die "do_post_augment trigger: $!";
 }
 
 sub sigpipe_handler {
@@ -253,26 +260,31 @@ sub do_query {
 	$done = PublicInbox::OpPipe->new($done, $done_op, $in_loop);
 	my $l2m = $lei->{l2m};
 	if ($l2m) {
-		$l2m->pre_augment($lei_orig); # may redirect $lei->{1} for mbox
+		# may redirect $lei->{1} for mbox
+		my $zpipe = $l2m->pre_augment($lei_orig);
 		$io[1] = $lei_orig->{1};
-		my @l2m_io = (undef, @io[1..$#io]);
-		pipe(my $startq, $l2m_io[0]) or die "pipe: $!";
-		$self->wq_do('query_prepare', \@l2m_io, $lei);
+		pipe(my ($startq, $au_done)) or die "pipe: $!";
+		$done_op->{'.'} = [ \&do_post_augment, $lei_orig,
+					$zpipe, $au_done ];
 		$io[4] = *STDERR{GLOB}; # don't send l2m->{-wq_s1}
+		$self->wq_do('query_prepare', \@io, $lei);
 		die "BUG: unexpected \$io[5]: $io[5]" if $io[5];
 		fcntl($startq, 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
 		$io[5] = $startq;
+		$io[1] = $zpipe->[1] if $zpipe;
 	}
 	start_query($self, \@io, $lei, $srcs);
 	unless ($in_loop) {
 		my @pids = $self->wq_close;
 		# for the $lei->atfork_child_wq PIPE handler:
 		$done_op->{'!'}->[3] = \@pids;
-		$done->event_step;
+		# $done->event_step;
+		# my $ipc_worker_reap = $self->can('ipc_worker_reap');
+		# if (my $l2m_pids = delete $self->{l2m_pids}) {
+			# dwaitpid($_, $ipc_worker_reap, $l2m) for @$l2m_pids;
+		# }
+		while ($done->{sock}) { $done->event_step }
 		my $ipc_worker_reap = $self->can('ipc_worker_reap');
-		if (my $l2m_pids = delete $self->{l2m_pids}) {
-			dwaitpid($_, $ipc_worker_reap, $l2m) for @$l2m_pids;
-		}
 		dwaitpid($_, $ipc_worker_reap, $self) for @pids;
 	}
 }
diff --git a/t/lei.t b/t/lei.t
index c4692217..8eede13e 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -189,25 +189,35 @@ my $test_external = sub {
 	# No double-quoting should be imposed on users on the CLI
 	$lei->('q', 's:use boolean prefix');
 	like($out, qr/search: use boolean prefix/, 'phrase search got result');
+	require IO::Uncompress::Gunzip;
+	for my $sfx ('', '.gz') {
+		my $f = "$home/mbox$sfx";
+		$lei->('q', '-o', "mboxcl2:$f", 's:use boolean prefix');
+		my $cat = $sfx eq '' ? sub {
+			open my $mb, '<', $f or fail "no mbox: $!";
+			<$mb>
+		} : sub {
+			my $z = IO::Uncompress::Gunzip->new($f, MultiStream=>1);
+			<$z>;
+		};
+		my @s = grep(/^Subject:/, $cat->());
+		is(scalar(@s), 1, "1 result in mbox$sfx");
+		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
+		is($err, '', 'no errors from augment');
+		@s = grep(/^Subject:/, my @wtf = $cat->());
+		is(scalar(@s), 2, "2 results in mbox$sfx");
 
-	$lei->('q', '-o', "mboxcl2:$home/mbox", 's:use boolean prefix');
-	open my $mb, '<', "$home/mbox" or fail "no mbox: $!";
-	my @s = grep(/^Subject:/, <$mb>);
-	is(scalar(@s), 1, '1 result in mbox');
-	$lei->('q', '-a', '-o', "mboxcl2:$home/mbox", 's:see attachment');
-	is($err, '', 'no errors from augment');
-	seek($mb, 0, SEEK_SET) or BAIL_OUT "seek: $!";
-	@s = grep(/^Subject:/, <$mb>);
-	is(scalar(@s), 2, '2 results in mbox');
+		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
+		is($err, '', "no errors on no results ($sfx)");
 
-	$lei->('q', '-a', '-o', "mboxcl2:$home/mbox", 's:nonexistent');
-	is($err, '', 'no errors on no results');
-	seek($mb, 0, SEEK_SET) or BAIL_OUT "seek: $!";
-	my @s2 = grep(/^Subject:/, <$mb>);
-	is_deeply(\@s2, \@s, 'same 2 old results w/ --augment and bad search');
+		my @s2 = grep(/^Subject:/, $cat->());
+		is_deeply(\@s2, \@s,
+			"same 2 old results w/ --augment and bad search $sfx");
 
-	$lei->('q', '-o', "mboxcl2:$home/mbox", 's:nonexistent');
-	is(-s "$home/mbox", 0, 'clobber w/o --augment');
+		$lei->('q', '-o', "mboxcl2:$f", 's:nonexistent');
+		my @res = $cat->();
+		is_deeply(\@res, [], "clobber w/o --augment $sfx");
+	}
 };
 
 my $test_lei_common = sub {
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index e5ac8eac..6673d9a6 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -94,9 +94,9 @@ my $wcb_get = sub {
 		my $dup = Storable::thaw(Storable::freeze($l2m));
 		is_deeply($dup, $l2m, "$fmt round-trips through storable");
 	}
-	$l2m->pre_augment($lei);
+	my $zpipe = $l2m->pre_augment($lei);
 	$l2m->do_augment($lei);
-	$l2m->post_augment($lei);
+	$l2m->post_augment($lei, $zpipe);
 	my $cb = $l2m->write_cb($lei);
 	delete $lei->{1};
 	$cb;

^ permalink raw reply related	[relevance 41%]

* [PATCH 0/7] lei: fixes piled higher and deeper
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (7 preceding siblings ...)
  2021-01-19  9:34 58% ` [PATCH 8/9] lei: test some likely errors due to misuse Eric Wong
@ 2021-01-20  5:04 71% ` Eric Wong
  2021-01-20  5:04 63% ` [PATCH 1/7] lei: allow more mbox inode types Eric Wong
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-20  5:04 UTC (permalink / raw)
  To: meta

1/7 was necessary on my FreeBSD 11.x VM
2/7 fixes TEST_RUN_MODE=0
3/7 fixes a long-standing (well, several weeks) annoyance
4/7 depended on 3/7, sorta
5/7 should've been done ages ago
6/7 oops :x
7/7 belts and suspenders

Eric Wong (7):
  lei: allow more mbox inode types
  lei: exit code in oneshot mode
  overidx: eidx_prep: fix leftover dbh reference
  lei q: cleanup store initialization
  lei: dump and clear errors.log in daemon mode
  lei_xsearch: keep l2m->{-wq_s1} while preparing query
  lei_to_mail: call PublicInbox::IPC::DESTROY

 lib/PublicInbox/LEI.pm         | 32 +++++++++++++++++++++++++++-----
 lib/PublicInbox/LeiOverview.pm |  6 ++----
 lib/PublicInbox/LeiQuery.pm    | 18 ++++++++----------
 lib/PublicInbox/LeiToMail.pm   |  5 +++--
 lib/PublicInbox/LeiXSearch.pm  |  4 ++--
 lib/PublicInbox/OverIdx.pm     |  8 +++-----
 t/lei.t                        |  8 +++++++-
 7 files changed, 52 insertions(+), 29 deletions(-)


^ permalink raw reply	[relevance 71%]

* [PATCH 2/7] lei: exit code in oneshot mode
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (9 preceding siblings ...)
  2021-01-20  5:04 63% ` [PATCH 1/7] lei: allow more mbox inode types Eric Wong
@ 2021-01-20  5:04 71% ` Eric Wong
  2021-01-20  5:04 67% ` [PATCH 4/7] lei q: cleanup store initialization Eric Wong
  2021-01-20  5:04 53% ` [PATCH 5/7] lei: dump and clear errors.log in daemon mode Eric Wong
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-20  5:04 UTC (permalink / raw)
  To: meta

waitpid() in DESTROY ends up setting $? for the exit status,
thus we must reap IPC children before calling CORE::exit.

This fixes t/lei-oneshot.t with TEST_RUN_MODE=0
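
The $? clobbering described above can be reproduced with a small
sketch.  The Reaper class and status_after_destroy are hypothetical
and only stand in for the IPC worker objects:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for an IPC worker handle which reaps on DESTROY
sub Reaper::new {
	my ($class, $exit_code, $protect) = @_;
	my $pid = fork // die "fork: $!";
	POSIX::_exit($exit_code) if $pid == 0; # child exits right away
	bless { pid => $pid, protect => $protect }, $class;
}

sub Reaper::DESTROY {
	my ($self) = @_;
	my $pid = delete $self->{pid} or return;
	if ($self->{protect}) {
		local $?; # restored when this block exits
		waitpid($pid, 0);
	} else {
		waitpid($pid, 0); # overwrites the global $?
	}
}

# Returns $? as seen by the caller after a Reaper goes out of scope
sub status_after_destroy {
	my ($exit_code, $protect) = @_;
	$? = 0;
	{
		my $obj = Reaper->new($exit_code, $protect);
	} # DESTROY runs here
	$?;
}
```

The patch below takes a different route from "local $?": it destroys
the worker objects explicitly before calling $quit, so no reaping is
left to happen during exit.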
---
 lib/PublicInbox/LEI.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f3edfe82..97ae2c41 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -249,6 +249,11 @@ sub x_it ($$) {
 	if (my $sock = $self->{sock}) {
 		send($sock, "x_it $code", MSG_EOR);
 	} elsif (!($code & 127)) { # oneshot, ignore signals
+		# don't want to end up using $? from child processes
+		for my $f (qw(lxs l2m)) {
+			my $wq = delete $self->{$f} or next;
+			$wq->DESTROY;
+		}
 		$quit->($code >> 8);
 	}
 }

^ permalink raw reply related	[relevance 71%]

* [PATCH 4/7] lei q: cleanup store initialization
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (10 preceding siblings ...)
  2021-01-20  5:04 71% ` [PATCH 2/7] lei: exit code in oneshot mode Eric Wong
@ 2021-01-20  5:04 67% ` Eric Wong
  2021-01-20  5:04 53% ` [PATCH 5/7] lei: dump and clear errors.log in daemon mode Eric Wong
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-20  5:04 UTC (permalink / raw)
  To: meta

Since we no longer leak an FD for over.sqlite3, we can
initialize the local store before forking workers and actually
enable --local by default, as originally intended.
---
 lib/PublicInbox/LeiQuery.pm | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index d6e801e3..941bc299 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -23,14 +23,15 @@ sub _vivify_external { # _externals_each callback
 # the main "lei q SEARCH_TERMS" method
 sub lei_q {
 	my ($self, @argv) = @_;
-	my $opt = $self->{opt};
-
-	# --local is enabled by default
-	# src: LeiXSearch || LeiSearch || Inbox
-	my @srcs;
 	require PublicInbox::LeiXSearch;
 	require PublicInbox::LeiOverview;
-	PublicInbox::Config->json;
+	PublicInbox::Config->json; # preload before forking
+	my $opt = $self->{opt};
+	my @srcs; # any number of LeiXSearch || LeiSearch || Inbox
+	if ($opt->{'local'} //= 1) { # --local is enabled by default
+		my $sto = $self->_lei_store(1);
+		push @srcs, $sto->search;
+	}
 	my $lxs = PublicInbox::LeiXSearch->new;
 
 	# --external is enabled by default, but allow --no-external
@@ -39,7 +40,6 @@ sub lei_q {
 	}
 	my $j = $opt->{jobs} // (scalar(@srcs) > 3 ? 3 : scalar(@srcs));
 	$j = 1 if !$opt->{thread};
-	$j++ if $opt->{'local'}; # for sto->search below
 	$self->atfork_prepare_wq($lxs);
 	$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset);
 	$self->{lxs} = $lxs;
@@ -50,10 +50,8 @@ sub lei_q {
 		$self->atfork_prepare_wq($l2m);
 		$l2m->wq_workers_start('lei2mail', $j, $self->oldset);
 	}
-
 	# no forking workers after this
-	my $sto = $self->_lei_store(1);
-	unshift(@srcs, $sto->search) if $opt->{'local'};
+
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{qstr} = join(' ', map {;

^ permalink raw reply related	[relevance 67%]

* [PATCH 1/7] lei: allow more mbox inode types
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (8 preceding siblings ...)
  2021-01-20  5:04 71% ` [PATCH 0/7] lei: fixes piled higher and deeper Eric Wong
@ 2021-01-20  5:04 63% ` Eric Wong
  2021-01-20  5:04 71% ` [PATCH 2/7] lei: exit code in oneshot mode Eric Wong
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-20  5:04 UTC (permalink / raw)
  To: meta

We may attempt to write an mbox to any terminal, block, or
character device, not just regular files and FIFOs/pipes.
The only thing that is known to not work is a directory.

Sockets may be possible with some OSes (e.g. Plan 9) or
filesystems.  This fixes t/lei.t on FreeBSD 11.x
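
As a sketch of the relaxed check (mbox_dst_ok is a hypothetical name,
the real test lives inline in LeiToMail->new), relying on Perl's `_'
filehandle to reuse the stat(2) result of the first file test:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $dst is acceptable as an mbox destination.
# Anything but a directory is allowed as long as it is either missing
# (and will be created) or writable by us.
sub mbox_dst_ok {
	my ($dst) = @_;
	# -d does the stat(2); `_' reuses that result for -e and -w
	return 0 if -d $dst || (-e _ && !-w _);
	1;
}
```

For example, mbox_dst_ok('/dev/null') is true on typical Unix systems,
while any existing directory is rejected.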
---
 lib/PublicInbox/LeiOverview.pm | 6 ++----
 lib/PublicInbox/LeiToMail.pm   | 4 ++--
 t/lei.t                        | 2 +-
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index dcc3088b..cab2b055 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -42,12 +42,10 @@ sub detect_fmt ($$) {
 	my ($lei, $dst) = @_;
 	if ($dst =~ m!\A([:/]+://)!) {
 		$lei->fail("$1 support not implemented, yet\n");
-	} elsif (!-e $dst) {
-		'maildir'; # the default
+	} elsif (!-e $dst || -d _) {
+		'maildir'; # the default TODO: MH?
 	} elsif (-f _ || -p _) {
 		$lei->fail("unable to determine mbox family of $dst\n");
-	} elsif (-d _) { # TODO: MH?
-		'maildir';
 	} else {
 		$lei->fail("unable to determine format of $dst\n");
 	}
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 49b5c8ab..9d9b5748 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -343,8 +343,8 @@ sub new {
 				"$dst exists and is not a directory\n";
 		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
 	} elsif (substr($fmt, 0, 4) eq 'mbox') {
-		-e $dst && !-f _ && !-p _ and die
-				"$dst exists and is not a regular file\n";
+		(-d $dst || (-e _ && !-w _)) and die
+			"$dst exists and is not a writable file\n";
 		$self->can("eml2$fmt") or die "bad mbox --format=$fmt\n";
 		$self->{base_type} = 'mbox';
 	} else {
diff --git a/t/lei.t b/t/lei.t
index 64cb5f0e..d49dc01a 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -187,7 +187,7 @@ my $test_external = sub {
 	is($? >> 8, 1, 'errored out with exit 1');
 
 	ok(!$lei->(qw(q s:prefix -f mboxcl2 -o), $home), 'bad mbox');
-	like($err, qr!\Q$home\E exists and is not a regular file!,
+	like($err, qr!\Q$home\E exists and is not a writable file!,
 		'error shown');
 	is($? >> 8, 1, 'errored out with exit 1');
 

^ permalink raw reply related	[relevance 63%]

* [PATCH 5/7] lei: dump and clear errors.log in daemon mode
  2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
                   ` (11 preceding siblings ...)
  2021-01-20  5:04 67% ` [PATCH 4/7] lei q: cleanup store initialization Eric Wong
@ 2021-01-20  5:04 53% ` Eric Wong
  12 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-20  5:04 UTC (permalink / raw)
  To: meta

Inspired by "dmesg -c", this should help users report bugs
and avoid eating up $XDG_RUNTIME_DIR.

Once lei is ready for release, hopefully the need for this
should be few and far between, but shit happens.
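
The "dump and clear" pattern boils down to something like the sketch
below.  dump_and_clear is a hypothetical helper which takes any handle
opened with '+>>' (the patch itself operates on STDIN and warns rather
than returning the contents):

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: return everything in the log, then empty it.
# $fh must be opened read/append ('+>>') so later writes still land
# at the (new) end of the file after truncation.
sub dump_and_clear {
	my ($fh) = @_;
	return '' unless -s $fh;
	seek($fh, 0, SEEK_SET) or die "seek: $!";
	my $buf = do { local $/; <$fh> };
	truncate($fh, 0) or die "truncate: $!";
	$buf // '';
}
```

Opening in append mode is what makes truncation safe here: the kernel
repositions every write to the end of the file, so there is no stale
offset to leave a hole of NUL bytes behind.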
---
 lib/PublicInbox/LEI.pm | 27 ++++++++++++++++++++++-----
 t/lei.t                |  6 ++++++
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 97ae2c41..6be6d10b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -15,6 +15,7 @@ use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
 use Errno qw(EAGAIN EINTR ECONNREFUSED ENOENT ECONNRESET);
 use POSIX ();
 use IO::Handle ();
+use Fcntl qw(SEEK_SET);
 use Sys::Syslog qw(syslog openlog);
 use PublicInbox::Config;
 use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLET);
@@ -26,7 +27,7 @@ use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;
-our $current_lei;
+our ($current_lei, $errors_log);
 my ($recv_cmd, $send_cmd);
 my $GLP = Getopt::Long::Parser->new;
 $GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
@@ -246,6 +247,7 @@ sub x_it ($$) {
 	my ($self, $code) = @_;
 	# make sure client sees stdout before exit
 	$self->{1}->autoflush(1) if $self->{1};
+	dump_and_clear_log();
 	if (my $sock = $self->{sock}) {
 		send($sock, "x_it $code", MSG_EOR);
 	} elsif (!($code & 127)) { # oneshot, ignore signals
@@ -264,7 +266,7 @@ sub out ($;@) { print { shift->{1} } @_ }
 
 sub err ($;@) {
 	my $self = shift;
-	my $err = $self->{2} // *STDERR{IO};
+	my $err = $self->{2} // ($self->{pgr} // [])->[2] // *STDERR{IO};
 	print $err @_, (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
 }
 
@@ -300,6 +302,7 @@ sub atfork_child_wq {
 	$self->{sock} = $sock if -S $sock;
 	$self->{l2m}->{-wq_s1} = $l2m_wq_s1 if $l2m_wq_s1 && -S $l2m_wq_s1;
 	%PATH2CFG = ();
+	undef $errors_log;
 	$quit = \&CORE::exit;
 	@TO_CLOSE_ATFORK_CHILD = ();
 	(__WARN__ => sub { err($self, @_) },
@@ -483,6 +486,7 @@ sub optparse ($$$) {
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
 	local $current_lei = $self; # for __WARN__
+	dump_and_clear_log("from previous run\n");
 	return _help($self, 'no command given') unless defined($cmd);
 	my $func = "lei_$cmd";
 	$func =~ tr/-/_/;
@@ -772,6 +776,7 @@ sub event_step {
 	my ($self) = @_;
 	local %ENV = %{$self->{env}};
 	my $sock = $self->{sock};
+	local $current_lei = $self;
 	eval {
 		while (my @fds = $recv_cmd->($sock, my $buf, 4096)) {
 			if (scalar(@fds) == 1 && !defined($fds[0])) {
@@ -805,6 +810,15 @@ sub noop {}
 
 our $oldset; sub oldset { $oldset }
 
+sub dump_and_clear_log {
+	if (defined($errors_log) && -s STDIN && seek(STDIN, 0, SEEK_SET)) {
+		my @pfx = @_;
+		unshift(@pfx, "$errors_log ") if @pfx;
+		warn @pfx, do { local $/; <STDIN> };
+		truncate(STDIN, 0) or warn "ftruncate ($errors_log): $!";
+	}
+}
+
 # lei(1) calls this when it can't connect
 sub lazy_start {
 	my ($path, $errno, $narg) = @_;
@@ -836,9 +850,12 @@ sub lazy_start {
 	require PublicInbox::Listener;
 	require PublicInbox::EOFpipe;
 	(-p STDOUT) or die "E: stdout must be a pipe\n";
-	my ($err) = ($path =~ m!\A(.+?/)[^/]+\z!);
-	$err .= 'errors.log';
-	open(STDIN, '+>>', $err) or die "open($err): $!";
+	local $errors_log;
+	($errors_log) = ($path =~ m!\A(.+?/)[^/]+\z!);
+	$errors_log .= 'errors.log';
+	open(STDIN, '+>>', $errors_log) or die "open($errors_log): $!";
+	STDIN->autoflush(1);
+	dump_and_clear_log("from previous daemon process:\n");
 	POSIX::setsid() > 0 or die "setsid: $!";
 	my $pid = fork // die "fork: $!";
 	return if $pid;
diff --git a/t/lei.t b/t/lei.t
index d49dc01a..ef820fe3 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -258,6 +258,7 @@ SKIP: { # real socket
 	} // skip 'Socket::MsgHdr or Inline::C missing or unconfigured', 115;
 	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
+	my $err_log = "$ENV{XDG_RUNTIME_DIR}/lei/errors.log";
 
 	ok($lei->('daemon-pid'), 'daemon-pid');
 	is($err, '', 'no error from daemon-pid');
@@ -267,10 +268,15 @@ SKIP: { # real socket
 	ok(-S $sock, 'sock created');
 
 	$test_lei_common->();
+	is(-s $err_log, 0, 'nothing in errors.log');
+	open my $efh, '>>', $err_log or BAIL_OUT $!;
+	print $efh "phail\n" or BAIL_OUT $!;
+	close $efh or BAIL_OUT $!;
 
 	ok($lei->('daemon-pid'), 'daemon-pid');
 	chomp(my $pid_again = $out);
 	is($pid, $pid_again, 'daemon-pid idempotent');
+	like($err, qr/phail/, 'got mock "phail" error previous run');
 
 	ok($lei->(qw(daemon-kill)), 'daemon-kill');
 	is($out, '', 'no output from daemon-kill');

^ permalink raw reply related	[relevance 53%]

* [PATCH 00/12] lei: another dump
@ 2021-01-21 19:46 70% Eric Wong
  2021-01-21 19:46 52% ` [PATCH 02/12] lei q: retrieve keywords for local, non-external messages Eric Wong
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

1/12 is a user-visible change, but there's no users, yet :P
more externals work coming...

12/12 may be too specific to bash, help from non-bash users
appreciated

Eric Wong (12):
  lei_overview: rename {relevance} => {pct}
  lei q: retrieve keywords for local, non-external messages
  lei_xsearch: eliminate some unused, commented-out code
  lei: show {pct} and {oid} in From_ lines and filenames
  lei: fix inadvertent FD sharing
  lei_to_mail: avoid segfault on exit
  lei: oneshot: use client $io[2] for placeholder
  lei: remove INT/QUIT/TERM handlers, fix daemon EOF
  lei_xsearch: reduce reference paths to lxs
  lei: remove @TO_CLOSE_ATFORK_CHILD
  lei: forget-external support with canonicalization
  lei forget-external: bash completion support

 MANIFEST                       |  1 +
 lib/PublicInbox/IPC.pm         | 23 +++++++--
 lib/PublicInbox/LEI.pm         | 86 ++++++++++++++++++++--------------
 lib/PublicInbox/LeiExternal.pm | 71 ++++++++++++++++++++++++----
 lib/PublicInbox/LeiOverview.pm | 17 +++----
 lib/PublicInbox/LeiQuery.pm    | 21 +++++----
 lib/PublicInbox/LeiSearch.pm   | 16 ++-----
 lib/PublicInbox/LeiToMail.pm   | 80 ++++++++++++++++++-------------
 lib/PublicInbox/LeiXSearch.pm  | 60 ++++++++++++------------
 lib/PublicInbox/Search.pm      | 20 +++++++-
 script/lei                     |  5 --
 t/lei.t                        |  9 ++++
 t/lei_external.t               | 18 +++++++
 t/lei_to_mail.t                | 41 +++++++++-------
 14 files changed, 300 insertions(+), 168 deletions(-)
 create mode 100644 t/lei_external.t

^ permalink raw reply	[relevance 70%]

* [PATCH 02/12] lei q: retrieve keywords for local, non-external messages
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
@ 2021-01-21 19:46 52% ` Eric Wong
  2021-01-21 19:46 31% ` [PATCH 04/12] lei: show {pct} and {oid} in From_ lines and filenames Eric Wong
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

This isn't tested for now, so maybe it works.
---
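For anyone reading along without a Xapian install, the prefix
walk that xap_terms() does can be sketched in plain Perl.  This is
only an illustration under stated assumptions: @terms is a made-up,
bytewise-sorted stand-in for a document's term list,
terms_with_prefix() is a hypothetical name, and unlike the patch it
stops at the first term past the prefix range instead of calling
skip_to on every iteration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Xapian term list: terms are kept in
# bytewise-sorted order, which is what makes a prefix walk possible.
my @terms = sort qw(G123 Kanswered Kflagged Kseen Qmid@example Tmail XDFID);

# Collect every term starting with $pfx and return a hashref keyed by
# the term with the prefix stripped (values are undef, as in xap_terms).
sub terms_with_prefix {
	my ($pfx, @sorted) = @_;
	my %ret;
	for my $tn (@sorted) {
		next if $tn lt $pfx; # emulate skip_to($pfx)
		last if index($tn, $pfx) != 0; # walked past the prefix range
		$ret{substr($tn, length($pfx))} = undef;
	}
	\%ret;
}

my $kw = terms_with_prefix('K', @terms);
print join(',', sort keys %$kw), "\n";
```

The hash-with-undef-values return shape mirrors what msg_keywords
hands back, so callers can either sort the keys or test membership
with exists.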
 lib/PublicInbox/LeiOverview.pm |  8 +++-----
 lib/PublicInbox/LeiSearch.pm   | 16 +++-------------
 lib/PublicInbox/LeiXSearch.pm  | 14 ++++++++++----
 lib/PublicInbox/Search.pm      | 20 +++++++++++++++++++-
 4 files changed, 35 insertions(+), 23 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 8799f1cc..47d9eb31 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -224,9 +224,8 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		my $git_dir = $git->{git_dir};
 		sub {
 			my ($smsg, $mitem) = @_;
-			my $kw = []; # TODO get from mitem
 			$l2m->wq_do('write_mail', \@io, $git_dir,
-					$smsg->{blob}, $lei_ipc, $kw)
+					$smsg->{blob}, $lei_ipc, $smsg->{kw});
 		}
 	} elsif ($l2m) {
 		my $wcb = $l2m->write_cb($lei);
@@ -235,8 +234,8 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		my $g2m = $l2m->can('git_to_mail');
 		sub {
 			my ($smsg, $mitem) = @_;
-			my $kw = []; # TODO get from mitem
-			$git->cat_async($smsg->{blob}, $g2m, [ $wcb, $kw ]);
+			$git->cat_async($smsg->{blob}, $g2m,
+					[ $wcb, $smsg->{kw} ]);
 		};
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
@@ -266,7 +265,6 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		$lei->{ovv_buf} = \(my $buf = '');
 		sub {
 			my ($smsg, $mitem) = @_;
-			delete @$smsg{qw(tid num)};
 			$buf .= $json->encode(_unbless_smsg(@_)) . $ORS;
 			if (length($buf) > 65536) {
 				my $lk = $self->lock_for_scope;
diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index b7e337de..440bacf5 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -5,7 +5,7 @@ package PublicInbox::LeiSearch;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::ExtSearch);
-use PublicInbox::Search;
+use PublicInbox::Search qw(xap_terms);
 
 # get combined docid from over.num:
 # (not generic Xapian, only works with our sharding scheme)
@@ -19,19 +19,9 @@ sub msg_keywords {
 	my ($self, $num) = @_; # num_or_mitem
 	my $xdb = $self->xdb; # set {nshard};
 	my $docid = ref($num) ? $num->get_docid : num2docid($self, $num);
-	my %kw;
-	eval {
-		my $end = $xdb->termlist_end($docid);
-		my $cur = $xdb->termlist_begin($docid);
-		for (; $cur != $end; $cur++) {
-			$cur->skip_to('K');
-			last if $cur == $end;
-			my $kw = $cur->get_termname;
-			$kw =~ s/\AK//s and $kw{$kw} = undef;
-		}
-	};
+	my $kw = xap_terms('K', $xdb, $docid);
 	warn "E: #$docid ($num): $@\n" if $@;
-	wantarray ? sort(keys(%kw)) : \%kw;
+	wantarray ? sort(keys(%$kw)) : $kw;
 }
 
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index a6d827de..d7688ede 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -13,6 +13,7 @@ use PublicInbox::OpPipe;
 use PublicInbox::Import;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
+use PublicInbox::Search qw(xap_terms);
 
 sub new {
 	my ($class) = @_;
@@ -74,7 +75,12 @@ sub smsg_for {
 	my $docid = $mitem->get_docid;
 	my $shard = ($docid - 1) % $nshard;
 	my $num = int(($docid - 1) / $nshard) + 1;
-	my $smsg = $self->{shard2ibx}->[$shard]->over->get_art($num);
+	my $ibx = $self->{shard2ibx}->[$shard];
+	my $smsg = $ibx->over->get_art($num);
+	if (ref($ibx->can('msg_keywords'))) {
+		my $kw = xap_terms('K', $mitem->get_document);
+		$smsg->{kw} = [ sort keys %$kw ];
+	}
 	$smsg->{docid} = $docid;
 	$smsg;
 }
@@ -153,11 +159,11 @@ sub query_mset { # non-parallel for non-"--thread" users
 	$dedupe->prepare_dedupe;
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
-		for my $it ($mset->items) {
-			my $smsg = smsg_for($self, $it) or next;
+		for my $mitem ($mset->items) {
+			my $smsg = smsg_for($self, $mitem) or next;
 			wait_startq($startq) if $startq;
 			next if $dedupe->is_smsg_dup($smsg);
-			$each_smsg->($smsg, $it);
+			$each_smsg->($smsg, $mitem);
 		}
 	} while (_mset_more($mset, $mo));
 	undef $each_smsg; # drops @io for l2m->{each_smsg_done}
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index a4b40f94..7c6a16be 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -6,7 +6,7 @@
 package PublicInbox::Search;
 use strict;
 use parent qw(Exporter);
-our @EXPORT_OK = qw(retry_reopen int_val get_pct);
+our @EXPORT_OK = qw(retry_reopen int_val get_pct xap_terms);
 use List::Util qw(max);
 
 # values for searching, changing the numeric value breaks
@@ -432,4 +432,22 @@ sub get_pct ($) { # mset item
 	$n > 99 ? 99 : $n;
 }
 
+sub xap_terms ($$;@) {
+	my ($pfx, $xdb_or_doc, @docid) = @_; # @docid may be empty ()
+	my %ret;
+	eval {
+		my $end = $xdb_or_doc->termlist_end(@docid);
+		my $cur = $xdb_or_doc->termlist_begin(@docid);
+		for (; $cur != $end; $cur++) {
+			$cur->skip_to($pfx);
+			last if $cur == $end;
+			my $tn = $cur->get_termname;
+			if (index($tn, $pfx) == 0) {
+				$ret{substr($tn, length($pfx))} = undef;
+			}
+		}
+	};
+	\%ret;
+}
+
 1;

^ permalink raw reply related	[relevance 52%]

* [PATCH 05/12] lei: fix inadvertent FD sharing
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
  2021-01-21 19:46 52% ` [PATCH 02/12] lei q: retrieve keywords for local, non-external messages Eric Wong
  2021-01-21 19:46 31% ` [PATCH 04/12] lei: show {pct} and {oid} in From_ lines and filenames Eric Wong
@ 2021-01-21 19:46 45% ` Eric Wong
  2021-01-21 19:46 71% ` [PATCH 07/12] lei: oneshot: use client $io[2] for placeholder Eric Wong
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

$wq->{-ipc_atfork_child_close} needed to be initialized properly.
And start setting $0 in workers to improve visibility.
---
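The initialization problem is easy to reproduce outside of lei.
A minimal sketch, assuming plain hashes in place of the real $wq
object and placeholder strings in place of file handles: when the
hash slot is undef, push only autovivifies the lexical copy, so the
list never reaches the hash.

```perl
use strict;
use warnings;

my @extra = qw(fd3 fd4); # placeholder values standing in for handles

# Before: the hash slot is undef, so push autovivifies only the
# lexical copy and the hash never sees the new array.
my %wq_old;
my $old = $wq_old{-ipc_atfork_child_close};
push @$old, @extra;

# After: //= stores a fresh array ref in the hash first, so the
# lexical and the hash slot refer to the same array.
my %wq_new;
my $new = $wq_new{-ipc_atfork_child_close} //= [];
push @$new, @extra;

printf "old slot: %s\n",
	defined $wq_old{-ipc_atfork_child_close} ? 'set' : 'undef';
printf "new slot: %d entries\n",
	scalar @{$wq_new{-ipc_atfork_child_close}};
```

In the "before" case the child would have nothing to close at fork
time, which is how descriptors end up shared by accident.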
 lib/PublicInbox/IPC.pm        | 22 ++++++++++++++++++----
 lib/PublicInbox/LEI.pm        |  9 +++++----
 lib/PublicInbox/LeiQuery.pm   | 21 +++++++++++----------
 lib/PublicInbox/LeiToMail.pm  |  2 +-
 lib/PublicInbox/LeiXSearch.pm | 27 ++++++++++++---------------
 5 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 8fec2e62..24f45e03 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -134,6 +134,12 @@ sub ipc_worker_reap { # dwaitpid callback
 	warn "PID:$pid died with \$?=$?\n" if $? && ($? & 127) != 15;
 }
 
+sub wq_wait_old {
+	my ($self) = @_;
+	my $pids = delete $self->{"-wq_old_pids.$$"} or return;
+	dwaitpid($_, \&ipc_worker_reap, $self) for @$pids;
+}
+
 # for base class, override in sub classes
 sub ipc_atfork_prepare {}
 
@@ -370,17 +376,25 @@ sub wq_workers {
 }
 
 sub wq_close {
-	my ($self) = @_;
+	my ($self, $nohang) = @_;
 	delete @$self{qw(-wq_s1 -wq_s2)} or return;
 	my $ppid = delete $self->{-wq_ppid} or return;
 	my $workers = delete $self->{-wq_workers} // die 'BUG: no wq_workers';
 	return if $ppid != $$; # can't reap siblings or parents
-	return (keys %$workers) if wantarray; # caller will reap
-	for my $pid (keys %$workers) {
-		dwaitpid($pid, \&ipc_worker_reap, $self);
+	my @pids = map { $_ + 0 } keys %$workers;
+	if ($nohang) {
+		push @{$self->{"-wq_old_pids.$$"}}, @pids;
+	} else {
+		dwaitpid($_, \&ipc_worker_reap, $self) for @pids;
 	}
 }
 
+sub wq_kill_old {
+	my ($self) = @_;
+	my $pids = $self->{"-wq_old_pids.$$"} or return;
+	kill 'TERM', @$pids;
+}
+
 sub wq_kill {
 	my ($self, $sig) = @_;
 	my $workers = $self->{-wq_workers} or return;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 6be6d10b..2cb2bf40 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -281,11 +281,14 @@ sub fail ($$;$) {
 
 sub atfork_prepare_wq {
 	my ($self, $wq) = @_;
-	my $tcafc = $wq->{-ipc_atfork_child_close};
+	my $tcafc = $wq->{-ipc_atfork_child_close} //= [];
 	push @$tcafc, @TO_CLOSE_ATFORK_CHILD;
 	if (my $sock = $self->{sock}) {
 		push @$tcafc, @$self{qw(0 1 2)}, $sock;
 	}
+	if (my $pgr = $self->{pgr}) {
+		push @$tcafc, @$pgr[1,2];
+	}
 	for my $f (qw(lxs l2m)) {
 		my $ipc = $self->{$f} or next;
 		push @$tcafc, grep { defined }
@@ -335,9 +338,7 @@ sub atfork_parent_wq {
 	my $l2m = $ret->{l2m};
 	if ($l2m && $l2m != $wq) { # $wq == lxs
 		$io[4] = $l2m->{-wq_s1} if $l2m->{-wq_s1};
-		if (my @pids = $l2m->wq_close) {
-			$wq->{l2m_pids} = \@pids;
-		}
+		$l2m->wq_close(1);
 	}
 	($ret, @io);
 }
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 941bc299..7d634b5e 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -32,24 +32,25 @@ sub lei_q {
 		my $sto = $self->_lei_store(1);
 		push @srcs, $sto->search;
 	}
-	my $lxs = PublicInbox::LeiXSearch->new;
 
+	my $lxs = $self->{lxs} = PublicInbox::LeiXSearch->new;
 	# --external is enabled by default, but allow --no-external
-	if ($opt->{external} // 1) {
+	if ($opt->{external} //= 1) {
 		$self->_externals_each(\&_vivify_external, \@srcs);
 	}
-	my $j = $opt->{jobs} // (scalar(@srcs) > 3 ? 3 : scalar(@srcs));
-	$j = 1 if !$opt->{thread};
-	$self->atfork_prepare_wq($lxs);
-	$lxs->wq_workers_start('lei_xsearch', $j, $self->oldset);
-	$self->{lxs} = $lxs;
-
+	my $xj = $opt->{jobs} // (scalar(@srcs) > 3 ? 3 : scalar(@srcs));
+	$xj = 1 if !$opt->{thread};
 	my $ovv = PublicInbox::LeiOverview->new($self) or return;
+	$self->atfork_prepare_wq($lxs);
+	$lxs->wq_workers_start('lei_xsearch', $xj, $self->oldset);
+	delete $lxs->{-ipc_atfork_child_close};
 	if (my $l2m = $self->{l2m}) {
-		$j = 4 if $j <= 4; # TODO configurable
+		my $mj = 4; # TODO: configurable
 		$self->atfork_prepare_wq($l2m);
-		$l2m->wq_workers_start('lei2mail', $j, $self->oldset);
+		$l2m->wq_workers_start('lei2mail', $mj, $self->oldset);
+		delete $l2m->{-ipc_atfork_child_close};
 	}
+
 	# no forking workers after this
 
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 3dcce9e7..87cc9c47 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -467,7 +467,7 @@ sub write_mail { # via ->wq_do
 
 sub ipc_atfork_prepare {
 	my ($self) = @_;
-	# (qry_status_wr, stdout|mbox, stderr, 3: sock, 4: each_smsg_done_wr)
+	# (done_wr, stdout|mbox, stderr, 3: sock, 4: each_smsg_done_wr)
 	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= >&=]);
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
 }
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 13611882..7b33677e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -110,6 +110,7 @@ sub wait_startq ($) {
 
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
+	local $0 = "$0 query_thread_mset";
 	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
@@ -148,6 +149,7 @@ sub query_thread_mset { # for --thread
 
 sub query_mset { # non-parallel for non-"--thread" users
 	my ($self, $lei, $srcs) = @_;
+	local $0 = "$0 query_mset";
 	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
@@ -192,12 +194,10 @@ sub git {
 sub query_done { # EOF callback
 	my ($self, $lei) = @_;
 	my $l2m = delete $lei->{l2m};
-	if (my $pids = delete $self->{l2m_pids}) {
-		my $ipc_worker_reap = $self->can('ipc_worker_reap');
-		dwaitpid($_, $ipc_worker_reap, $l2m) for @$pids;
-	}
+	$l2m->wq_wait_old if $l2m;
+	$self->wq_wait_old;
 	$lei->{ovv}->ovv_end($lei);
-	if ($l2m) { # calls LeiToMail reap_compress
+	if ($l2m) { # close() calls LeiToMail reap_compress
 		close(delete($lei->{1})) if $lei->{1};
 		$lei->start_mua;
 	}
@@ -232,12 +232,12 @@ sub start_query { # always runs in main (lei-daemon) process
 	for my $rmt (@$remotes) {
 		$self->wq_do('query_thread_mbox', $io, $lei, $rmt);
 	}
-	close $io->[0]; # qry_status_wr
 	@$io = ();
 }
 
 sub query_prepare { # called by wq_do
 	my ($self, $lei) = @_;
+	local $0 = "$0 query_prepare";
 	my %sig = $lei->atfork_child_wq($self);
 	-p $lei->{0} or die "BUG: \$done pipe expected";
 	local @SIG{keys %sig} = values %sig;
@@ -246,11 +246,11 @@ sub query_prepare { # called by wq_do
 	syswrite($lei->{0}, '.') == 1 or die "do_post_augment trigger: $!";
 }
 
-sub sigpipe_handler {
-	my ($self, $lei_orig, $pids) = @_;
-	if ($pids) { # one-shot (no event loop)
-		kill 'TERM', @$pids;
+sub sigpipe_handler { # handles SIGPIPE from wq workers
+	my ($self, $lei_orig) = @_;
+	if ($self->wq_kill_old) {
 		kill 'PIPE', $$;
+		$self->wq_wait_old;
 	} else {
 		$self->wq_kill;
 		$self->wq_close;
@@ -287,19 +287,16 @@ sub do_query {
 		$io[1] = $zpipe->[1] if $zpipe;
 	}
 	start_query($self, \@io, $lei, $srcs);
+	$self->wq_close(1);
 	unless ($in_loop) {
-		my @pids = $self->wq_close;
 		# for the $lei->atfork_child_wq PIPE handler:
-		$done_op->{'!'}->[3] = \@pids;
 		while ($done->{sock}) { $done->event_step }
-		my $ipc_worker_reap = $self->can('ipc_worker_reap');
-		dwaitpid($_, $ipc_worker_reap, $self) for @pids;
 	}
 }
 
 sub ipc_atfork_prepare {
 	my ($self) = @_;
-	# (0: qry_status_wr, 1: stdout|mbox, 2: stderr,
+	# (0: done_wr, 1: stdout|mbox, 2: stderr,
 	#  3: sock, 4: $l2m->{-wq_s1}, 5: $startq)
 	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= +<&= <&=]);
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC

^ permalink raw reply related	[relevance 45%]

* [PATCH 04/12] lei: show {pct} and {oid} in From_ lines and filenames
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
  2021-01-21 19:46 52% ` [PATCH 02/12] lei q: retrieve keywords for local, non-external messages Eric Wong
@ 2021-01-21 19:46 31% ` Eric Wong
  2021-01-21 19:46 45% ` [PATCH 05/12] lei: fix inadvertent FD sharing Eric Wong
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

From_ lines are shown when mbox* variants are output to stdout,
making {oid} and {pct} visible without the risk of that
information propagating to other importer processes, as it could
if it were in lei-specific X-* headers.

Maildirs already had OIDs in the filename, now they gain Xapian
{pct} in case anybody cares.
---
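A rough sketch of the resulting From_ line and Maildir filename
shapes, for illustration only.  ident_for(), from_line() and
maildir_name() are hypothetical helpers, the $smsg values are made
up, and the 'lei' fallback is shared by both here for brevity
whereas the patch only applies it to the From_ line.

```perl
use strict;
use warnings;

# Build the identifier shared by the From_ line and the Maildir name:
# the blob OID, plus "=PCT" when a Xapian percentage is known.
sub ident_for {
	my ($smsg) = @_;
	my $ident = $smsg->{blob} // 'lei';
	$ident .= "=$smsg->{pct}" if defined $smsg->{pct};
	$ident;
}

# mbox "From " separator line with a fixed epoch timestamp
sub from_line {
	my ($smsg, $type) = @_;
	'From ' . ident_for($smsg) . "\@$type Thu Jan  1 00:00:00 1970\n";
}

# Maildir basename: identifier, then ":2," and sorted flag characters
sub maildir_name {
	my ($smsg, @flag_chars) = @_;
	ident_for($smsg) . ':2,' . join('', sort @flag_chars);
}

my $smsg = { blob => 'deadbeef', pct => 99 }; # made-up values
print from_line($smsg, 'mboxrd');
print maildir_name($smsg, qw(S R)), "\n";
```

Since the identifier sits in the From_ line rather than in a
header, it does not survive as message content when another tool
imports the mbox.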
 lib/PublicInbox/LeiOverview.pm |  9 ++---
 lib/PublicInbox/LeiToMail.pm   | 60 +++++++++++++++++++---------------
 t/lei_to_mail.t                | 41 +++++++++++++----------
 3 files changed, 61 insertions(+), 49 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 47d9eb31..7a4fa857 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -224,8 +224,9 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		my $git_dir = $git->{git_dir};
 		sub {
 			my ($smsg, $mitem) = @_;
-			$l2m->wq_do('write_mail', \@io, $git_dir,
-					$smsg->{blob}, $lei_ipc, $smsg->{kw});
+			$smsg->{pct} = get_pct($mitem) if $mitem;
+			$l2m->wq_do('write_mail', \@io, $git_dir, $smsg,
+					$lei_ipc);
 		}
 	} elsif ($l2m) {
 		my $wcb = $l2m->write_cb($lei);
@@ -234,8 +235,8 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		my $g2m = $l2m->can('git_to_mail');
 		sub {
 			my ($smsg, $mitem) = @_;
-			$git->cat_async($smsg->{blob}, $g2m,
-					[ $wcb, $smsg->{kw} ]);
+			$smsg->{pct} = get_pct($mitem) if $mitem;
+			$git->cat_async($smsg->{blob}, $g2m, [ $wcb, $smsg ]);
 		};
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 1be0b09c..3dcce9e7 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -32,14 +32,14 @@ my %kw2status = (
 );
 
 sub _mbox_hdr_buf ($$$) {
-	my ($eml, $type, $kw) = @_;
+	my ($eml, $type, $smsg) = @_;
 	$eml->header_set($_) for (qw(Lines Bytes Content-Length));
 
 	# Messages are always 'O' (non-\Recent in IMAP), it saves
 	# MUAs the trouble of rewriting the mbox if no other
 	# changes are made
 	my %hdr = (Status => [ 'O' ]); # set Status, X-Status
-	for my $k (@$kw) {
+	for my $k (@{$smsg->{kw} // []}) {
 		if (my $ent = $kw2status{$k}) {
 			push @{$hdr{$ent->[0]}}, $ent->[1];
 		} else { # X-Label?
@@ -53,9 +53,11 @@ sub _mbox_hdr_buf ($$$) {
 
 	# fixup old bug from import (pre-a0c07cba0e5d8b6a)
 	$$buf =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
+	my $ident = $smsg->{blob} // 'lei';
+	if (defined(my $pct = $smsg->{pct})) { $ident .= "=$pct" }
 
 	substr($$buf, 0, 0, # prepend From line
-		"From lei\@$type Thu Jan  1 00:00:00 1970$eml->{crlf}");
+		"From $ident\@$type Thu Jan  1 00:00:00 1970$eml->{crlf}");
 	$buf;
 }
 
@@ -71,8 +73,8 @@ sub _print_full {
 }
 
 sub eml2mboxrd ($;$) {
-	my ($eml, $kw) = @_;
-	my $buf = _mbox_hdr_buf($eml, 'mboxrd', $kw);
+	my ($eml, $smsg) = @_;
+	my $buf = _mbox_hdr_buf($eml, 'mboxrd', $smsg);
 	if (my $bdy = delete $eml->{bdy}) {
 		$$bdy =~ s/^(>*From )/>$1/gm;
 		$$buf .= $eml->{crlf};
@@ -84,8 +86,8 @@ sub eml2mboxrd ($;$) {
 }
 
 sub eml2mboxo {
-	my ($eml, $kw) = @_;
-	my $buf = _mbox_hdr_buf($eml, 'mboxo', $kw);
+	my ($eml, $smsg) = @_;
+	my $buf = _mbox_hdr_buf($eml, 'mboxo', $smsg);
 	if (my $bdy = delete $eml->{bdy}) {
 		$$bdy =~ s/^From />From /gm;
 		$$buf .= $eml->{crlf};
@@ -108,8 +110,8 @@ sub _mboxcl_common ($$$) {
 
 # mboxcl still escapes "From " lines
 sub eml2mboxcl {
-	my ($eml, $kw) = @_;
-	my $buf = _mbox_hdr_buf($eml, 'mboxcl', $kw);
+	my ($eml, $smsg) = @_;
+	my $buf = _mbox_hdr_buf($eml, 'mboxcl', $smsg);
 	my $crlf = $eml->{crlf};
 	if (my $bdy = delete $eml->{bdy}) {
 		$$bdy =~ s/^From />From /gm;
@@ -121,8 +123,8 @@ sub eml2mboxcl {
 
 # mboxcl2 has no "From " escaping
 sub eml2mboxcl2 {
-	my ($eml, $kw) = @_;
-	my $buf = _mbox_hdr_buf($eml, 'mboxcl2', $kw);
+	my ($eml, $smsg) = @_;
+	my $buf = _mbox_hdr_buf($eml, 'mboxcl2', $smsg);
 	my $crlf = $eml->{crlf};
 	if (my $bdy = delete $eml->{bdy}) {
 		_mboxcl_common($buf, $bdy, $crlf);
@@ -140,10 +142,11 @@ sub git_to_mail { # git->cat_async callback
 			warn "unexpected type=$type for $oid\n";
 		}
 	}
-	if ($size > 0) {
-		my ($write_cb, $kw) = @$arg;
-		$write_cb->($bref, $oid, $kw);
+	my ($write_cb, $smsg) = @$arg;
+	if ($smsg->{blob} ne $oid) {
+		die "BUG: expected=$smsg->{blob} got=$oid";
 	}
+	$write_cb->($bref, $smsg) if $size > 0;
 }
 
 sub reap_compress { # dwaitpid callback
@@ -247,11 +250,11 @@ sub _mbox_write_cb ($$) {
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe;
 	sub { # for git_to_mail
-		my ($buf, $oid, $kw) = @_;
+		my ($buf, $smsg) = @_;
 		return unless $out;
 		my $eml = PublicInbox::Eml->new($buf);
-		if (!$dedupe->is_dup($eml, $oid)) {
-			$buf = $eml2mbox->($eml, $kw);
+		if (!$dedupe->is_dup($eml, $smsg->{blob})) {
+			$buf = $eml2mbox->($eml, $smsg);
 			my $lk = $ovv->lock_for_scope;
 			eval { $write->($out, $buf) };
 			if ($@) {
@@ -283,12 +286,15 @@ sub _augment_file { # _maildir_each_file cb
 sub _unlink { unlink($_[0]) }
 
 sub _buf2maildir {
-	my ($dst, $buf, $oid, $kw) = @_;
+	my ($dst, $buf, $smsg) = @_;
+	my $kw = $smsg->{kw} // [];
 	my $sfx = join('', sort(map { $kw2char{$_} // () } @$kw));
 	my $rand = ''; # chosen by die roll :P
 	my ($tmp, $fh, $final);
+	my $common = $smsg->{blob};
+	if (defined(my $pct = $smsg->{pct})) { $common .= "=$pct" }
 	do {
-		$tmp = $dst.'tmp/'.$rand."oid=$oid";
+		$tmp = $dst.'tmp/'.$rand.$common;
 	} while (!sysopen($fh, $tmp, O_CREAT|O_EXCL|O_WRONLY) &&
 		$! == EEXIST && ($rand = int(rand 0x7fffffff).','));
 	if (print $fh $$buf and close($fh)) {
@@ -299,14 +305,14 @@ sub _buf2maildir {
 		$dst .= 'cur/';
 		$rand = '';
 		do {
-			$final = $dst.$rand."oid=$oid:2,$sfx";
+			$final = $dst.$rand.$common.':2,'.$sfx;
 		} while (!link($tmp, $final) && $! == EEXIST &&
 			($rand = int(rand 0x7fffffff).','));
 		unlink($tmp) or warn "W: failed to unlink $tmp: $!\n";
 	} else {
 		my $err = $!;
 		unlink($tmp);
-		die "Error writing $oid to $dst: $err";
+		die "Error writing $smsg->{blob} to $dst: $err";
 	}
 }
 
@@ -316,12 +322,12 @@ sub _maildir_write_cb ($$) {
 	$dedupe->prepare_dedupe;
 	my $dst = $lei->{ovv}->{dst};
 	sub { # for git_to_mail
-		my ($buf, $oid, $kw) = @_;
-		return _buf2maildir($dst, $buf, $oid, $kw) if !$dedupe;
+		my ($buf, $smsg) = @_;
+		return _buf2maildir($dst, $buf, $smsg) if !$dedupe;
 		my $eml = PublicInbox::Eml->new($$buf); # copy buf
-		return if $dedupe->is_dup($eml, $oid);
+		return if $dedupe->is_dup($eml, $smsg->{blob});
 		undef $eml;
-		_buf2maildir($dst, $buf, $oid, $kw);
+		_buf2maildir($dst, $buf, $smsg);
 	}
 }
 
@@ -447,7 +453,7 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 }
 
 sub write_mail { # via ->wq_do
-	my ($self, $git_dir, $oid, $lei, $kw) = @_;
+	my ($self, $git_dir, $smsg, $lei) = @_;
 	my $not_done = delete $self->{4}; # write end of {each_smsg_done}
 	my $wcb = $self->{wcb} //= do { # first message
 		my %sig = $lei->atfork_child_wq($self);
@@ -456,7 +462,7 @@ sub write_mail { # via ->wq_do
 		$self->write_cb($lei);
 	};
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
-	$git->cat_async($oid, \&git_to_mail, [ $wcb, $kw, $not_done ]);
+	$git->cat_async($smsg->{blob}, \&git_to_mail, [$wcb, $smsg, $not_done]);
 }
 
 sub ipc_atfork_prepare {
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 6673d9a6..47c0e3d4 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -18,11 +18,12 @@ my $noeol = "Subject: x\n\nFrom hell";
 my $crlf = $noeol;
 $crlf =~ s/\n/\r\n/g;
 my $kw = [qw(seen answered flagged)];
+my $smsg = { kw => $kw, blob => '0'x40 };
 my @MBOX = qw(mboxcl2 mboxrd mboxcl mboxo);
 for my $mbox (@MBOX) {
 	my $m = "eml2$mbox";
 	my $cb = PublicInbox::LeiToMail->can($m);
-	my $s = $cb->(PublicInbox::Eml->new($from), $kw);
+	my $s = $cb->(PublicInbox::Eml->new($from), $smsg);
 	is(substr($$s, -1, 1), "\n", "trailing LF in normal $mbox");
 	my $eml = PublicInbox::Eml->new($s);
 	is($eml->header('Status'), 'OR', "Status: set by $m");
@@ -40,7 +41,7 @@ for my $mbox (@MBOX) {
 	} else {
 		is(scalar(@cl), 0, "$m clobbered Content-Length");
 	}
-	$s = $cb->(PublicInbox::Eml->new($noeol), $kw);
+	$s = $cb->(PublicInbox::Eml->new($noeol), $smsg);
 	is(substr($$s, -1, 1), "\n",
 		"trailing LF added by $m when original lacks EOL");
 	$eml = PublicInbox::Eml->new($s);
@@ -49,7 +50,7 @@ for my $mbox (@MBOX) {
 	} else {
 		is($eml->body_raw, ">From hell\n", "From escaped once by $m");
 	}
-	$s = $cb->(PublicInbox::Eml->new($crlf), $kw);
+	$s = $cb->(PublicInbox::Eml->new($crlf), $smsg);
 	is(substr($$s, -2, 2), "\r\n",
 		"trailing CRLF added $m by original lacks EOL");
 	$eml = PublicInbox::Eml->new($s);
@@ -62,7 +63,7 @@ for my $mbox (@MBOX) {
 		is($eml->header('Content-Length') + length("\r\n"),
 			length($eml->body_raw), "$m Content-Length matches");
 	} elsif ($mbox eq 'mboxrd') {
-		$s = $cb->($eml, $kw);
+		$s = $cb->($eml, $smsg);
 		$eml = PublicInbox::Eml->new($s);
 		is($eml->body_raw,
 			">>From hell\r\n\r\n", "From escaped again by $m");
@@ -102,11 +103,12 @@ my $wcb_get = sub {
 	$cb;
 };
 
+my $deadbeef = { blob => 'deadbeef', kw => [ qw(seen) ] };
 my $orig = do {
 	my $wcb = $wcb_get->($mbox, $fn);
 	is(ref $wcb, 'CODE', 'write_cb returned callback');
 	ok(-f $fn && !-s _, 'empty file created');
-	$wcb->(\(my $dup = $buf), 'deadbeef', [ qw(seen) ]);
+	$wcb->(\(my $dup = $buf), $deadbeef);
 	undef $wcb;
 	open my $fh, '<', $fn or BAIL_OUT $!;
 	my $raw = do { local $/; <$fh> };
@@ -116,7 +118,7 @@ my $orig = do {
 	local $lei->{opt} = { jobs => 2 };
 	$wcb = $wcb_get->($mbox, $fn);
 	ok(-f $fn && !-s _, 'truncated mbox destination');
-	$wcb->(\($dup = $buf), 'deadbeef', [ qw(seen) ]);
+	$wcb->(\($dup = $buf), $deadbeef);
 	undef $wcb;
 	open $fh, '<', $fn or BAIL_OUT $!;
 	is(do { local $/; <$fh> }, $raw, 'jobs > 1');
@@ -131,7 +133,7 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 		ok($dc_cmd, "decompressor for .$zsfx");
 		my $f = "$fn.$zsfx";
 		my $wcb = $wcb_get->($mbox, $f);
-		$wcb->(\(my $dup = $buf), 'deadbeef', [ qw(seen) ]);
+		$wcb->(\(my $dup = $buf), $deadbeef);
 		undef $wcb;
 		my $uncompressed = xqx([@$dc_cmd, $f]);
 		is($uncompressed, $orig, "$zsfx works unlocked");
@@ -139,13 +141,13 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 		local $lei->{opt} = { jobs => 2 }; # for atomic writes
 		unlink $f or BAIL_OUT "unlink $!";
 		$wcb = $wcb_get->($mbox, $f);
-		$wcb->(\($dup = $buf), 'deadbeef', [ qw(seen) ]);
+		$wcb->(\($dup = $buf), $deadbeef);
 		undef $wcb;
 		is(xqx([@$dc_cmd, $f]), $orig, "$zsfx matches with lock");
 
 		local $lei->{opt} = { augment => 1 };
 		$wcb = $wcb_get->($mbox, $f);
-		$wcb->(\($dup = $buf . "\nx\n"), 'deadbeef', [ qw(seen) ]);
+		$wcb->(\($dup = $buf . "\nx\n"), $deadbeef);
 		undef $wcb; # commit
 
 		my $cat = popen_rd([@$dc_cmd, $f]);
@@ -157,7 +159,7 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 
 		local $lei->{opt} = { augment => 1, jobs => 2 };
 		$wcb = $wcb_get->($mbox, $f);
-		$wcb->(\($dup = $buf . "\ny\n"), 'deadbeef', [ qw(seen) ]);
+		$wcb->(\($dup = $buf . "\ny\n"), $deadbeef);
 		undef $wcb; # commit
 
 		my @raw3;
@@ -179,7 +181,8 @@ my $as_orig = sub {
 unlink $fn or BAIL_OUT $!;
 if ('default deduplication uses content_hash') {
 	my $wcb = $wcb_get->('mboxo', $fn);
-	$wcb->(\(my $x = $buf), 'deadbeef', []) for (1..2);
+	$deadbeef->{kw} = [];
+	$wcb->(\(my $x = $buf), $deadbeef) for (1..2);
 	undef $wcb; # undef to commit changes
 	my $cmp = '';
 	open my $fh, '<', $fn or BAIL_OUT $!;
@@ -188,7 +191,7 @@ if ('default deduplication uses content_hash') {
 
 	local $lei->{opt} = { augment => 1 };
 	$wcb = $wcb_get->('mboxo', $fn);
-	$wcb->(\($x = $buf . "\nx\n"), 'deadbeef', []) for (1..2);
+	$wcb->(\($x = $buf . "\nx\n"), $deadbeef) for (1..2);
 	undef $wcb; # undef to commit changes
 	open $fh, '<', $fn or BAIL_OUT $!;
 	my @x;
@@ -202,7 +205,7 @@ if ('default deduplication uses content_hash') {
 	open my $tmp, '+>', undef or BAIL_OUT $!;
 	local $lei->{1} = $tmp;
 	my $wcb = $wcb_get->('mboxrd', '/dev/stdout');
-	$wcb->(\(my $x = $buf), 'deadbeef', []);
+	$wcb->(\(my $x = $buf), $deadbeef);
 	undef $wcb; # commit
 	seek($tmp, 0, SEEK_SET) or BAIL_OUT $!;
 	my $cmp = '';
@@ -216,7 +219,7 @@ SKIP: { # FIFO support
 	mkfifo($fn, 0600) or skip("mkfifo not supported: $!", 1);
 	my $cat = popen_rd([which('cat'), $fn]);
 	my $wcb = $wcb_get->('mboxo', $fn);
-	$wcb->(\(my $x = $buf), 'deadbeef', []);
+	$wcb->(\(my $x = $buf), $deadbeef);
 	undef $wcb; # commit
 	my $cmp = '';
 	PublicInbox::MboxReader->mboxo($cat, sub { $cmp .= $as_orig->(@_) });
@@ -227,7 +230,8 @@ SKIP: { # FIFO support
 	my $md = "$tmpdir/maildir/";
 	my $wcb = $wcb_get->('maildir', $md);
 	is(ref($wcb), 'CODE', 'got Maildir callback');
-	$wcb->(\(my $x = $buf), 'badc0ffee', []);
+	my $b4dc0ffee = { blob => 'badc0ffee', kw => [] };
+	$wcb->(\(my $x = $buf), $b4dc0ffee);
 
 	my @f;
 	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @f, shift });
@@ -235,7 +239,8 @@ SKIP: { # FIFO support
 	is(do { local $/; <$fh> }, $buf, 'wrote to Maildir');
 
 	$wcb = $wcb_get->('maildir', $md);
-	$wcb->(\($x = $buf."\nx\n"), 'deadcafe', []);
+	my $deadcafe = { blob => 'deadcafe', kw => [] };
+	$wcb->(\($x = $buf."\nx\n"), $deadcafe);
 
 	my @x = ();
 	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @x, shift });
@@ -246,8 +251,8 @@ SKIP: { # FIFO support
 
 	local $lei->{opt}->{augment} = 1;
 	$wcb = $wcb_get->('maildir', $md);
-	$wcb->(\($x = $buf."\ny\n"), 'deadcafe', []);
-	$wcb->(\($x = $buf."\ny\n"), 'b4dc0ffee', []); # skipped by dedupe
+	$wcb->(\($x = $buf."\ny\n"), $deadcafe);
+	$wcb->(\($x = $buf."\ny\n"), $b4dc0ffee); # skipped by dedupe
 	@f = ();
 	PublicInbox::LeiToMail::_maildir_each_file($md, sub { push @f, shift });
 	is(scalar grep(/\A\Q$x[0]\E\z/, @f), 1, 'old file still there');

^ permalink raw reply related	[relevance 31%]

* [PATCH 07/12] lei: oneshot: use client $io[2] for placeholder
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-21 19:46 45% ` [PATCH 05/12] lei: fix inadvertent FD sharing Eric Wong
@ 2021-01-21 19:46 71% ` Eric Wong
  2021-01-21 19:46 67% ` [PATCH 08/12] lei: remove INT/QUIT/TERM handlers, fix daemon EOF Eric Wong
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

STDERR may actually get closed in ->ipc_atfork_child in
oneshot mode, so ensure we pass in a valid file handle
to avoid warnings in ->wq_do.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2cb2bf40..11ea385f 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -334,7 +334,7 @@ sub atfork_parent_wq {
 	$self->{env} = $env;
 	delete @$ret{qw(-lei_store cfg pgr lxs)}; # keep l2m
 	my @io = delete @$ret{0..2};
-	$io[3] = delete($ret->{sock}) // *STDERR{GLOB};
+	$io[3] = delete($ret->{sock}) // $io[2];
 	my $l2m = $ret->{l2m};
 	if ($l2m && $l2m != $wq) { # $wq == lxs
 		$io[4] = $l2m->{-wq_s1} if $l2m->{-wq_s1};

^ permalink raw reply related	[relevance 71%]

* [PATCH 12/12] lei forget-external: bash completion support
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
                   ` (6 preceding siblings ...)
  2021-01-21 19:46 43% ` [PATCH 11/12] lei: forget-external support with canonicalization Eric Wong
@ 2021-01-21 19:46 69% ` Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

The tricky bit was getting around word splitting bash
does on URLs.  This may work with other shells, too.
---
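To show what the colon workaround does, here is a standalone
sketch of the same filtering logic.  It assumes a made-up list of
config section names instead of a real $cfg->{-section_order}, and
complete_external() is a hypothetical name with the argument order
rearranged for the example.

```perl
use strict;
use warnings;

# Hypothetical config section names in "external.<location>" form
my @sections = (
	'external./home/user/inbox',
	'external.https://example.com/meta/',
	'core',
);

# $cur is the word being completed; @argv holds the words bash
# passed before it (bash splits URLs at ':').
sub complete_external {
	my ($sections, $cur, @argv) = @_;
	my $colon = ($argv[-1] // '') eq ':';
	my $re = $cur =~ /\A[\w-]/ ? '' : '.*';
	map {
		my $x = substr($_, length('external.'));
		# only return the part specified on the CLI
		$colon && $x =~ /(\Q$cur\E.*)/ ? $1 : $x;
	} grep(/\Aexternal\.$re\Q$cur/, @$sections);
}

# completing "https" before any colon was typed
my @full = complete_external(\@sections, 'https');
# completing "//example" after bash split off "https" and ":"
my @tail = complete_external(\@sections, '//example', 'https', ':');
print "$_\n" for (@full, @tail);
```

In the second call only the part after the colon is returned,
since bash has already consumed "https" and ":" as separate words.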
 lib/PublicInbox/LEI.pm         |  4 ++++
 lib/PublicInbox/LeiExternal.pm | 17 +++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9c3d7279..ef3f90fc 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -655,6 +655,10 @@ sub lei__complete {
 	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
 		puts $self, grep(/$re/, keys %CONFIG_KEYS);
 	}
+	$cmd =~ tr/-/_/;
+	if (my $sub = $self->can("_complete_$cmd")) {
+		puts $self, $sub->($self, @argv, $cur);
+	}
 	# TODO: URLs, pathnames, OIDs, MIDs, etc...  See optparse() for
 	# proto parsing.
 }
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 21071058..59c3c367 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -93,4 +93,21 @@ sub lei_forget_external {
 	}
 }
 
+# shell completion helper called by lei__complete
+sub _complete_forget_external {
+	my ($self, @argv) = @_;
+	my $cfg = $self->_lei_cfg(0);
+	my $cur = pop @argv;
+	# Workaround bash word-splitting URLs to ['https', ':', '//' ...]
+	# Maybe there's a better way to go about this in
+	# contrib/completion/lei-completion.bash
+	my $colon = ($argv[-1] // '') eq ':';
+	my $re = $cur =~ /\A[\w-]/ ? '' : '.*';
+	map {
+		my $x = substr($_, length('external.'));
+		# only return the part specified on the CLI
+		$colon && $x =~ /(\Q$cur\E.*)/ ? $1 : $x;
+	} grep(/\Aexternal\.$re\Q$cur/, @{$cfg->{-section_order}});
+}
+
 1;

^ permalink raw reply related	[relevance 69%]

* [PATCH 08/12] lei: remove INT/QUIT/TERM handlers, fix daemon EOF
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-21 19:46 71% ` [PATCH 07/12] lei: oneshot: use client $io[2] for placeholder Eric Wong
@ 2021-01-21 19:46 67% ` Eric Wong
  2021-01-21 19:46 58% ` [PATCH 10/12] lei: remove @TO_CLOSE_ATFORK_CHILD Eric Wong
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

The signal handlers on the client side were unnecessary;
all we need is to handle socket EOF properly in the daemon
by killing xsearch and l2m workers.
---
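A small sketch of why EOF handling is enough, under simplifying
assumptions: a socketpair stands in for the client/daemon connection,
and SOCK_STREAM is used here for portability where lei uses
SOCK_SEQPACKET.  When the client goes away for any reason (including
death by signal), the kernel closes its end and the peer reads EOF.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Stand-ins for the lei client and lei-daemon ends of one connection
socketpair(my $client, my $daemon, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
	or die "socketpair: $!";

close($client) or die "close: $!"; # client exits, e.g. killed by SIGINT

# The daemon side sees EOF: sysread returns 0 (undef would be an error)
my $n = sysread($daemon, my $buf, 4096);
defined($n) or die "sysread: $!";
print $n == 0 ? "EOF: stop workers\n" : "got $n bytes\n";
```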
 lib/PublicInbox/IPC.pm | 1 +
 lib/PublicInbox/LEI.pm | 9 ++++++++-
 script/lei             | 5 -----
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 24f45e03..dbb87e4e 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -408,6 +408,7 @@ sub DESTROY {
 	my $ppid = $self->{-wq_ppid};
 	wq_kill($self) if $ppid && $ppid == $$;
 	wq_close($self);
+	wq_wait_old($self);
 	ipc_worker_stop($self);
 }
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 11ea385f..ccfc1649 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -767,7 +767,14 @@ sub accept_dispatch { # Listener {post_accept} callback
 
 sub dclose {
 	my ($self) = @_;
-	delete $self->{lxs}; # stops LeiXSearch queries
+	for my $f (qw(lxs l2m)) {
+		my $wq = delete $self->{$f} or next;
+		if ($wq->wq_kill) {
+			$wq->wq_close;
+		} elsif ($wq->wq_kill_old) {
+			$wq->wq_wait_old;
+		}
+	}
 	close(delete $self->{1}) if $self->{1}; # may reap_compress
 	$self->close if $self->{sock}; # PublicInbox::DS::close
 }
diff --git a/script/lei b/script/lei
index a4a0217b..8dcea562 100755
--- a/script/lei
+++ b/script/lei
@@ -81,11 +81,6 @@ Falling back to (slow) one-shot mode
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
 	$send_cmd->($sock, [ 0, 1, 2, fileno($dh) ], $buf, MSG_EOR);
-	$SIG{TERM} = $SIG{INT} = $SIG{QUIT} = sub {
-		my ($sig) = @_; # 'TERM', not an integer :<
-		$SIG{$sig} = 'DEFAULT';
-		kill($sig, $$); # exit($signo + 128)
-	};
 	my $x_it_code = 0;
 	while (1) {
 		my (@fds) = $recv_cmd->($sock, $buf, 4096 * 33);

^ permalink raw reply related	[relevance 67%]

* [PATCH 10/12] lei: remove @TO_CLOSE_ATFORK_CHILD
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
                   ` (4 preceding siblings ...)
  2021-01-21 19:46 67% ` [PATCH 08/12] lei: remove INT/QUIT/TERM handlers, fix daemon EOF Eric Wong
@ 2021-01-21 19:46 58% ` Eric Wong
  2021-01-21 19:46 43% ` [PATCH 11/12] lei: forget-external support with canonicalization Eric Wong
  2021-01-21 19:46 69% ` [PATCH 12/12] lei forget-external: bash completion support Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

..At least limit it to a single file handle.  The write end of
the EOFpipe can be limited in scope and auto-closed when $quit
is clobbered, leaving only the listener.  The listener is the
only handle that needs to be closed explicitly due to it being on
the stack in the Listener->event_step => accept_dispatch => lei_$FOO
code path.

Everything else gets clobbered by DS->Reset in children after
forking.
---
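A standalone sketch of the scoping idea, not the daemon code itself:
if a closure holds the only reference to the write end of a pipe,
clobbering the closure closes that handle and the reader sees EOF.
The variable names mirror the patch, but the closure body here is a
placeholder.

```perl
use strict;
use warnings;

my $eof_r;
my $quit = do {
	pipe($eof_r, my $eof_w) or die "pipe: $!";
	sub { fileno($eof_w) }; # closure holds the only reference to $eof_w
};

print "write end still open\n" if defined($quit->());
undef $quit; # clobbering the closure drops $eof_w, closing the write end

my $n = sysread($eof_r, my $buf, 1);
print "reader sees EOF\n" if defined($n) && $n == 0;
```

No explicit close() is needed for the write end, which is why only
the listener remains in the list of handles to close after fork.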
 lib/PublicInbox/LEI.pm | 40 +++++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ccfc1649..37b45a00 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -27,7 +27,7 @@ use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;
-our ($current_lei, $errors_log);
+our ($current_lei, $errors_log, $listener);
 my ($recv_cmd, $send_cmd);
 my $GLP = Getopt::Long::Parser->new;
 $GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
@@ -35,7 +35,6 @@ my $GLP_PASS = Getopt::Long::Parser->new;
 $GLP_PASS->configure(qw(gnu_getopt no_ignore_case auto_abbrev pass_through));
 
 our %PATH2CFG; # persistent for socket daemon
-our @TO_CLOSE_ATFORK_CHILD;
 
 # TBD: this is a documentation mechanism to show a subcommand
 # (may) pass options through to another command:
@@ -281,8 +280,7 @@ sub fail ($$;$) {
 
 sub atfork_prepare_wq {
 	my ($self, $wq) = @_;
-	my $tcafc = $wq->{-ipc_atfork_child_close} //= [];
-	push @$tcafc, @TO_CLOSE_ATFORK_CHILD;
+	my $tcafc = $wq->{-ipc_atfork_child_close} //= [ $listener // () ];
 	if (my $sock = $self->{sock}) {
 		push @$tcafc, @$self{qw(0 1 2)}, $sock;
 	}
@@ -307,7 +305,6 @@ sub atfork_child_wq {
 	%PATH2CFG = ();
 	undef $errors_log;
 	$quit = \&CORE::exit;
-	@TO_CLOSE_ATFORK_CHILD = ();
 	(__WARN__ => sub { err($self, @_) },
 	PIPE => sub {
 		$self->x_it(13); # SIGPIPE = 13
@@ -837,12 +834,12 @@ sub lazy_start {
 		die "connect($path): $!";
 	}
 	umask(077) // die("umask(077): $!");
-	socket(my $l, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
-	bind($l, pack_sockaddr_un($path)) or die "bind($path): $!";
-	listen($l, 1024) or die "listen: $!";
+	local $listener;
+	socket($listener, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
+	bind($listener, pack_sockaddr_un($path)) or die "bind($path): $!";
+	listen($listener, 1024) or die "listen: $!";
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
-	pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
 	local $oldset = PublicInbox::DS::block_signals();
 	if ($narg == 5) {
 		$send_cmd = PublicInbox::Spawn->can('send_cmd4');
@@ -869,20 +866,21 @@ sub lazy_start {
 	return if $pid;
 	$0 = "lei-daemon $path";
 	local %PATH2CFG;
-	local @TO_CLOSE_ATFORK_CHILD = ($l, $eof_w);
-	$l->blocking(0);
-	$l = PublicInbox::Listener->new($l, \&accept_dispatch, $l);
+	$listener->blocking(0);
 	my $exit_code;
-	local $quit = sub {
-		$exit_code //= shift;
-		my $listener = $l or exit($exit_code);
-		# closing eof_w triggers \&noop wakeup
-		$eof_w = $l = $path = undef;
-		$listener->close; # DS::close
-		PublicInbox::DS->SetLoopTimeout(1000);
+	my $pil = PublicInbox::Listener->new($listener, \&accept_dispatch);
+	local $quit = do {
+		pipe(my ($eof_r, $eof_w)) or die "pipe: $!";
+		PublicInbox::EOFpipe->new($eof_r, \&noop, undef);
+		sub {
+			$exit_code //= shift;
+			my $lis = $pil or exit($exit_code);
+			# closing eof_w triggers \&noop wakeup
+			$listener = $eof_w = $pil = $path = undef;
+			$lis->close; # DS::close
+			PublicInbox::DS->SetLoopTimeout(1000);
+		};
 	};
-	PublicInbox::EOFpipe->new($eof_r, \&noop, undef);
-	undef $eof_r;
 	my $sig = {
 		CHLD => \&PublicInbox::DS::enqueue_reap,
 		QUIT => $quit,

^ permalink raw reply related	[relevance 58%]

* [PATCH 11/12] lei: forget-external support with canonicalization
  2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
                   ` (5 preceding siblings ...)
  2021-01-21 19:46 58% ` [PATCH 10/12] lei: remove @TO_CLOSE_ATFORK_CHILD Eric Wong
@ 2021-01-21 19:46 43% ` Eric Wong
  2021-01-21 19:46 69% ` [PATCH 12/12] lei forget-external: bash completion support Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-21 19:46 UTC (permalink / raw)
  To: meta

For proper matching, we'll do a better job of canonicalizing
URLs and path names.  Of course, users may edit the file
outside of lei, so ensure we try both the canonicalized form
and the as-is form provided by the user.

I also don't think we'll need to store externals info in
MiscIdx; just the config file is fine.
---
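A rough sketch of the "try both forms" idea.  canon_url is a
hypothetical, simplified stand-in for _canonicalize (the patch uses
the URI module and PublicInbox::Config::rel2abs_collapsed); it only
handles plain http(s) URLs without userinfo.  The config key layout
"external.$location.boost" is taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical, simplified canonicalizer: lowercase scheme and host,
# squeeze repeated '/', and force exactly one trailing '/'.
sub canon_url {
	my ($url) = @_;
	my ($scheme, $host, $path) = $url =~ m!\A(https?)://([^/]+)(.*)\z!i
		or return $url; # not an HTTP(S) URL, leave as-is
	$path .= '/';
	$path =~ tr!/!/!s; # squeeze redundant '/'
	lc($scheme) . '://' . lc($host) . $path;
}

# Keys to try when unsetting: the user's literal form, then the canonical one
sub config_keys {
	my ($loc) = @_;
	map { "external.$_.boost" } ($loc, canon_url($loc));
}

print "$_\n" for config_keys('https://EXAMPLE.com/ibx');
```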
 MANIFEST                       |  1 +
 lib/PublicInbox/LEI.pm         | 24 ++++++++++-----
 lib/PublicInbox/LeiExternal.pm | 54 +++++++++++++++++++++++++++-------
 t/lei.t                        |  9 ++++++
 t/lei_external.t               | 18 ++++++++++++
 5 files changed, 88 insertions(+), 18 deletions(-)
 create mode 100644 t/lei_external.t

diff --git a/MANIFEST b/MANIFEST
index 0de1de4a..ddee1539 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -339,6 +339,7 @@ t/kqnotify.t
 t/lei-oneshot.t
 t/lei.t
 t/lei_dedupe.t
+t/lei_external.t
 t/lei_overview.t
 t/lei_store.t
 t/lei_to_mail.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 37b45a00..9c3d7279 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -21,7 +21,7 @@ use PublicInbox::Config;
 use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLET);
 use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now dwaitpid);
-use PublicInbox::Spawn qw(spawn run_die popen_rd);
+use PublicInbox::Spawn qw(spawn popen_rd);
 use PublicInbox::OnDestroy;
 use Text::Wrap qw(wrap);
 use File::Path qw(mkpath);
@@ -95,7 +95,7 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(boost=i quiet|q) ],
 'ls-external' => [ '[FILTER...]', 'list publicinbox|extindex locations',
 	qw(format|f=s z|0 local remote quiet|q) ],
-'forget-external' => [ '{URL_OR_PATHNAME|--prune}',
+'forget-external' => [ 'URL_OR_PATHNAME...|--prune',
 	'exclude further results from a publicinbox|extindex',
 	qw(prune quiet|q) ],
 
@@ -114,7 +114,7 @@ our %CMD = ( # sorted in order of importance/use:
 	"exclude message(s) on stdin from `q' search results",
 	qw(stdin| oid=s exact by-mid|mid:s quiet|q) ],
 
-'purge-mailsource' => [ '{URL_OR_PATHNAME|--all}',
+'purge-mailsource' => [ 'URL_OR_PATHNAME|--all',
 	'remove imported messages from IMAP, Maildirs, and MH',
 	qw(exact! all jobs:i indexed) ],
 
@@ -137,7 +137,7 @@ our %CMD = ( # sorted in order of importance/use:
 'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
 	qw(prune) ],
 
-'import' => [ '{URL_OR_PATHNAME|--stdin}',
+'import' => [ 'URL_OR_PATHNAME|--stdin',
 	'one-shot import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include=s !flags),
 	],
@@ -468,6 +468,7 @@ sub optparse ($$$) {
 					last;
 				} # else continue looping
 			}
+			last if $ok;
 			my $last = pop @or;
 			$err = join(', ', @or) . " or $last must be set";
 		} else {
@@ -547,16 +548,23 @@ sub lei_mark {
 	my ($self, @argv) = @_;
 }
 
-sub lei_config {
+sub _config {
 	my ($self, @argv) = @_;
-	$self->{opt}->{'config-file'} and return fail $self,
-		"config file switches not supported by `lei config'";
 	my $env = $self->{env};
 	delete local $env->{GIT_CONFIG};
+	delete local $ENV{GIT_CONFIG};
 	my $cfg = _lei_cfg($self, 1);
 	my $cmd = [ qw(git config -f), $cfg->{'-f'}, @argv ];
 	my %rdr = map { $_ => $self->{$_} } (0..2);
-	run_die($cmd, $env, \%rdr);
+	waitpid(spawn($cmd, $env, \%rdr), 0);
+}
+
+sub lei_config {
+	my ($self, @argv) = @_;
+	$self->{opt}->{'config-file'} and return fail $self,
+		"config file switches not supported by `lei config'";
+	_config(@_);
+	x_it($self, $?) if $?;
 }
 
 sub lei_init {
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 64faf5a0..21071058 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -7,6 +7,7 @@ use strict;
 use v5.10.1;
 use parent qw(Exporter);
 our @EXPORT = qw(lei_ls_external lei_add_external lei_forget_external);
+use PublicInbox::Config;
 
 sub _externals_each {
 	my ($self, $cb, @arg) = @_;
@@ -30,7 +31,6 @@ sub _externals_each {
 
 sub lei_ls_external {
 	my ($self, @argv) = @_;
-	my $stor = $self->_lei_store(0);
 	my $out = $self->{1};
 	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
 	$self->_externals_each(sub {
@@ -39,24 +39,58 @@ sub lei_ls_external {
 	});
 }
 
+sub _canonicalize {
+	my ($location) = @_;
+	if ($location !~ m!\Ahttps?://!) {
+		PublicInbox::Config::rel2abs_collapsed($location);
+	} else {
+		require URI;
+		my $uri = URI->new($location)->canonical;
+		my $path = $uri->path . '/';
+		$path =~ tr!/!/!s; # squeeze redundant '/'
+		$uri->path($path);
+		$uri->as_string;
+	}
+}
+
 sub lei_add_external {
-	my ($self, $url_or_dir) = @_;
+	my ($self, $location) = @_;
 	my $cfg = $self->_lei_cfg(1);
-	if ($url_or_dir !~ m!\Ahttps?://!) {
-		$url_or_dir = File::Spec->canonpath($url_or_dir);
-	}
 	my $new_boost = $self->{opt}->{boost} // 0;
-	my $key = "external.$url_or_dir.boost";
+	$location = _canonicalize($location);
+	my $key = "external.$location.boost";
 	my $cur_boost = $cfg->{$key};
 	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
 	$self->lei_config($key, $new_boost);
-	my $stor = $self->_lei_store(1);
-	# TODO: add to MiscIdx
-	$stor->done;
+	$self->_lei_store(1)->done; # just create the store
 }
 
 sub lei_forget_external {
-	# TODO
+	my ($self, @locations) = @_;
+	my $cfg = $self->_lei_cfg(1);
+	my $quiet = $self->{opt}->{quiet};
+	for my $loc (@locations) {
+		my (@unset, @not_found);
+		for my $l ($loc, _canonicalize($loc)) {
+			my $key = "external.$l.boost";
+			delete($cfg->{$key});
+			$self->_config('--unset', $key);
+			if ($? == 0) {
+				push @unset, $key;
+			} elsif (($? >> 8) == 5) {
+				push @not_found, $key;
+			} else {
+				$self->err("# --unset $key error");
+				return $self->x_it($?);
+			}
+		}
+		if (@unset) {
+			next if $quiet;
+			$self->err("# $_ unset") for @unset;
+		} elsif (@not_found) {
+			$self->err("# $_ not found") for @not_found;
+		} # else { already exited
+	}
 }
 
 1;
diff --git a/t/lei.t b/t/lei.t
index ef820fe3..50ad2bb1 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -180,6 +180,15 @@ my $test_external = sub {
 	});
 	$lei->('ls-external');
 	like($out, qr/boost=0\n/s, 'ls-external has output');
+	ok($lei->(qw(add-external -q https://EXAMPLE.com/ibx)), 'add remote');
+	is($err, '', 'no warnings after add-external');
+	$lei->('ls-external');
+	like($out, qr!https://example\.com/ibx/!s, 'added canonical URL');
+	is($err, '', 'no warnings on ls-external');
+	ok($lei->(qw(forget-external -q https://EXAMPLE.com/ibx)),
+		'forget');
+	$lei->('ls-external');
+	unlike($out, qr!https://example\.com/ibx/!s, 'removed canonical URL');
 
 	ok(!$lei->(qw(q s:prefix -o /dev/null -f maildir)), 'bad maildir');
 	like($err, qr!/dev/null exists and is not a directory!,
diff --git a/t/lei_external.t b/t/lei_external.t
new file mode 100644
index 00000000..1f0048a1
--- /dev/null
+++ b/t/lei_external.t
@@ -0,0 +1,18 @@
+#!perl -w
+use strict;
+use v5.10.1;
+use Test::More;
+my $cls = 'PublicInbox::LeiExternal';
+require_ok $cls;
+my $canon = $cls->can('_canonicalize');
+my $exp = 'https://example.com/my-inbox/';
+is($canon->('https://example.com/my-inbox'), $exp, 'trailing slash added');
+is($canon->('https://example.com/my-inbox//'), $exp, 'trailing slash removed');
+is($canon->('https://example.com//my-inbox/'), $exp, 'leading slash removed');
+is($canon->('https://EXAMPLE.com/my-inbox/'), $exp, 'lowercased');
+is($canon->('/this/path/is/nonexistent/'), '/this/path/is/nonexistent',
+	'non-existent pathname canonicalized');
+is($canon->('/this//path/'), '/this/path', 'extra slashes gone');
+is($canon->('/ALL/CAPS'), '/ALL/CAPS', 'caps preserved');
+
+done_testing;

^ permalink raw reply related	[relevance 43%]

* [PATCH 00/10] lei: externals more stuff
@ 2021-01-23 10:27 71% Eric Wong
  2021-01-23 10:27 47% ` [PATCH 01/10] lei: move external vivification to xsearch Eric Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

I don't know what I'm doing anymore, and maybe I never did.

Eric Wong (10):
  lei: move external vivification to xsearch
  lei: support remote externals
  lei_to_mail: drop cyclic reference if not using IPC
  lei: oneshot: preserve stdout if writing mbox
  lei: default "-f $mfolder" args for common MUAs
  lei completion: handle URLs with port numbers
  lei forget-external: just show the location
  lei q: support a bunch of curl(1) options
  lei forget-external: do not show redundant "not found" lines
  lei add-external: don't allow non-existent directories

 lib/PublicInbox/LEI.pm         |  46 +++++++----
 lib/PublicInbox/LeiExternal.pm |  41 ++++++++--
 lib/PublicInbox/LeiOverview.pm |  10 ++-
 lib/PublicInbox/LeiQuery.pm    |  68 ++++++++++++-----
 lib/PublicInbox/LeiToMail.pm   |  24 ++++--
 lib/PublicInbox/LeiXSearch.pm  | 136 ++++++++++++++++++++++++++++-----
 lib/PublicInbox/ProcessPipe.pm |   2 +
 script/lei                     |   2 +
 t/lei.t                        |  43 +++++++++++
 t/lei_xsearch.t                |   5 +-
 10 files changed, 309 insertions(+), 68 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 01/10] lei: move external vivification to xsearch
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
@ 2021-01-23 10:27 47% ` Eric Wong
  2021-01-23 10:27 35% ` [PATCH 02/10] lei: support remote externals Eric Wong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

This seems like a better place to put it given upcoming
URI support, which starts in this commit.
---
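For orientation, a standalone sketch of the type checks that
prepare_external performs in the diff below.  external_type is a
hypothetical name and returns a label instead of constructing
ExtSearch or Inbox objects; the file tests are the ones from the patch.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical classifier mirroring the checks in prepare_external
sub external_type {
	my ($loc) = @_;
	return 'remote' if $loc =~ m!\Ahttps?://!;
	return 'extindex' if -f "$loc/ei.lock";
	return 'inbox' if -f "$loc/inbox.lock" || -d "$loc/public-inbox"; # v2, v1
	undef; # unable to determine type
}

my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/ei.lock") or die "open: $!";
close($fh) or die "close: $!";

print external_type($dir), "\n";
print external_type('https://example.com/inbox/'), "\n";
```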
 lib/PublicInbox/LeiQuery.pm   | 27 +++++------------
 lib/PublicInbox/LeiXSearch.pm | 57 ++++++++++++++++++++++++-----------
 t/lei_xsearch.t               |  5 ++-
 3 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 7d634b5e..eebf217b 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -7,19 +7,6 @@ use strict;
 use v5.10.1;
 use PublicInbox::DS qw(dwaitpid);
 
-sub _vivify_external { # _externals_each callback
-	my ($src, $dir) = @_;
-	if (-f "$dir/ei.lock") {
-		require PublicInbox::ExtSearch;
-		push @$src, PublicInbox::ExtSearch->new($dir);
-	} elsif (-f "$dir/inbox.lock" || -d "$dir/public-inbox") { # v2, v1
-		require PublicInbox::Inbox;
-		push @$src, bless { inboxdir => $dir }, 'PublicInbox::Inbox';
-	} else {
-		warn "W: ignoring $dir, unable to determine type\n";
-	}
-}
-
 # the main "lei q SEARCH_TERMS" method
 sub lei_q {
 	my ($self, @argv) = @_;
@@ -27,19 +14,19 @@ sub lei_q {
 	require PublicInbox::LeiOverview;
 	PublicInbox::Config->json; # preload before forking
 	my $opt = $self->{opt};
-	my @srcs; # any number of LeiXSearch || LeiSearch || Inbox
+	my $lxs = $self->{lxs} = PublicInbox::LeiXSearch->new;
+	# any number of LeiXSearch || LeiSearch || Inbox
 	if ($opt->{'local'} //= 1) { # --local is enabled by default
 		my $sto = $self->_lei_store(1);
-		push @srcs, $sto->search;
+		$lxs->prepare_external($sto->search);
 	}
 
-	my $lxs = $self->{lxs} = PublicInbox::LeiXSearch->new;
 	# --external is enabled by default, but allow --no-external
 	if ($opt->{external} //= 1) {
-		$self->_externals_each(\&_vivify_external, \@srcs);
+		my $cb = $lxs->can('prepare_external');
+		$self->_externals_each($cb, $lxs);
 	}
-	my $xj = $opt->{jobs} // (scalar(@srcs) > 3 ? 3 : scalar(@srcs));
-	$xj = 1 if !$opt->{thread};
+	my $xj = $opt->{thread} ? $lxs->locals : ($lxs->remotes + 1);
 	my $ovv = PublicInbox::LeiOverview->new($self) or return;
 	$self->atfork_prepare_wq($lxs);
 	$lxs->wq_workers_start('lei_xsearch', $xj, $self->oldset);
@@ -76,7 +63,7 @@ sub lei_q {
 	$mset_opt{relevance} //= -2 if $opt->{thread};
 	$self->{mset_opt} = \%mset_opt;
 	$ovv->ovv_begin($self);
-	$lxs->do_query($self, \@srcs);
+	$lxs->do_query($self);
 }
 
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 987a9896..10c25246 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -26,10 +26,6 @@ sub new {
 
 sub attach_external {
 	my ($self, $ibxish) = @_; # ibxish = ExtSearch or Inbox
-
-	if (!$ibxish->can('over') || !$ibxish->over) {
-		return push(@{$self->{remotes}}, $ibxish)
-	}
 	my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
 	my $srch = $ibxish->search or
 		return warn("$desc not indexed for Xapian\n");
@@ -59,10 +55,9 @@ sub attach_external {
 }
 
 # returns a list of local inboxes (or count in scalar context)
-sub locals {
-	my %uniq = map {; "$_" => $_ } @{$_[0]->{shard2ibx} // []};
-	values %uniq;
-}
+sub locals { @{$_[0]->{locals} // []} }
+
+sub remotes { @{$_[0]->{remotes} // []} }
 
 # called by PublicInbox::Search::xdb
 sub xdb_shards_flat { @{$_[0]->{shards_flat} // []} }
@@ -148,14 +143,16 @@ sub query_thread_mset { # for --thread
 }
 
 sub query_mset { # non-parallel for non-"--thread" users
-	my ($self, $lei, $srcs) = @_;
+	my ($self, $lei) = @_;
 	local $0 = "$0 query_mset";
 	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
-	$self->attach_external($_) for @$srcs;
+	for my $loc (locals($self)) {
+		attach_external($self, $loc);
+	}
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $self);
 	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
 	$dedupe->prepare_dedupe;
@@ -172,6 +169,10 @@ sub query_mset { # non-parallel for non-"--thread" users
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
+sub query_remote_mboxrd {
+	my ($self, $lei, $uri) = @_;
+}
+
 sub git {
 	my ($self) = @_;
 	my (%seen, @dirs);
@@ -221,18 +222,17 @@ sub do_post_augment {
 }
 
 sub start_query { # always runs in main (lei-daemon) process
-	my ($self, $io, $lei, $srcs) = @_;
-	my $remotes = $self->{remotes} // [];
+	my ($self, $io, $lei) = @_;
 	if ($lei->{opt}->{thread}) {
-		for my $ibxish (@$srcs) {
+		for my $ibxish (locals($self)) {
 			$self->wq_do('query_thread_mset', $io, $lei, $ibxish);
 		}
 	} else {
-		$self->wq_do('query_mset', $io, $lei, $srcs);
+		$self->wq_do('query_mset', $io, $lei);
 	}
 	# TODO
-	for my $rmt (@$remotes) {
-		$self->wq_do('query_thread_mbox', $io, $lei, $rmt);
+	for my $uri (remotes($self)) {
+		$self->wq_do('query_remote_mboxrd', $io, $lei, $uri);
 	}
 	@$io = ();
 }
@@ -259,7 +259,7 @@ sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
 }
 
 sub do_query {
-	my ($self, $lei_orig, $srcs) = @_;
+	my ($self, $lei_orig) = @_;
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
 	$io[0] = undef;
 	pipe(my $done, $io[0]) or die "pipe $!";
@@ -286,7 +286,7 @@ sub do_query {
 		$io[5] = $startq;
 		$io[1] = $zpipe->[1] if $zpipe;
 	}
-	start_query($self, \@io, $lei, $srcs);
+	start_query($self, \@io, $lei);
 	$self->wq_close(1);
 	unless ($in_loop) {
 		# for the $lei->atfork_child_wq PIPE handler:
@@ -302,4 +302,25 @@ sub ipc_atfork_prepare {
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
 }
 
+sub prepare_external {
+	my ($self, $loc, $boost) = @_; # n.b. already ordered by boost
+	if (ref $loc) { # already a URI, or PublicInbox::Inbox-like object
+		return push(@{$self->{remotes}}, $loc) if $loc->can('scheme');
+	} elsif ($loc =~ m!\Ahttps?://!) {
+		require URI;
+		return push(@{$self->{remotes}}, URI->new($loc));
+	} elsif (-f "$loc/ei.lock") {
+		require PublicInbox::ExtSearch;
+		$loc = PublicInbox::ExtSearch->new($loc);
+	} elsif (-f "$loc/inbox.lock" || -d "$loc/public-inbox") {
+		require PublicInbox::Inbox; # v2, v1
+		$loc = bless { inboxdir => $loc }, 'PublicInbox::Inbox';
+	} else {
+		warn "W: ignoring $loc, unable to determine type\n";
+		return;
+	}
+	push @{$self->{locals}}, $loc;
+}
+
+
 1;
diff --git a/t/lei_xsearch.t b/t/lei_xsearch.t
index 8b03c1f2..f745ea3e 100644
--- a/t/lei_xsearch.t
+++ b/t/lei_xsearch.t
@@ -49,7 +49,10 @@ $eidx->eidx_sync({fsync => 0});
 my $es = PublicInbox::ExtSearch->new("$home/eidx");
 my $lxs = PublicInbox::LeiXSearch->new;
 for my $ibxish (shuffle($es, @ibx)) {
-	$lxs->attach_external($ibxish);
+	$lxs->prepare_external($ibxish);
+}
+for my $loc ($lxs->locals) {
+	$lxs->attach_external($loc);
 }
 my $nr = $lxs->xdb->get_doccount;
 my $mset = $lxs->mset('d:19931002..19931003', { limit => $nr });

^ permalink raw reply related	[relevance 47%]

* [PATCH 07/10] lei forget-external: just show the location
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
                   ` (4 preceding siblings ...)
  2021-01-23 10:27 67% ` [PATCH 06/10] lei completion: handle URLs with port numbers Eric Wong
@ 2021-01-23 10:27 71% ` Eric Wong
  2021-01-23 10:27 47% ` [PATCH 08/10] lei q: support a bunch of curl(1) options Eric Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

No need to show the full key name since the user mainly
uses the location.
---
 lib/PublicInbox/LeiExternal.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index a4e644ee..5b5f08d1 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -76,9 +76,9 @@ sub lei_forget_external {
 			delete($cfg->{$key});
 			$self->_config('--unset', $key);
 			if ($? == 0) {
-				push @unset, $key;
+				push @unset, $l;
 			} elsif (($? >> 8) == 5) {
-				push @not_found, $key;
+				push @not_found, $l;
 			} else {
 				$self->err("# --unset $key error");
 				return $self->x_it($?);
@@ -86,7 +86,7 @@ sub lei_forget_external {
 		}
 		if (@unset) {
 			next if $quiet;
-			$self->err("# $_ unset") for @unset;
+			$self->err("# $_ gone") for @unset;
 		} elsif (@not_found) {
 			$self->err("# $_ not found") for @not_found;
 		} # else { already exited

^ permalink raw reply related	[relevance 71%]

* [PATCH 09/10] lei forget-external: don't show redundant "not found"
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
                   ` (6 preceding siblings ...)
  2021-01-23 10:27 47% ` [PATCH 08/10] lei q: support a bunch of curl(1) options Eric Wong
@ 2021-01-23 10:27 71% ` Eric Wong
  2021-01-23 10:27 71% ` [PATCH 10/10] lei add-external: don't allow non-existent directories Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

Pathname/URL canonicalization may not change the result at
all, so there's no point in trying (and failing) the same
form twice when the as-is and canonicalized forms are identical.
---
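The %seen idiom used in the diff below, shown in isolation (the
pathname is made up for illustration):

```perl
use strict;
use warnings;

my (%seen, @tried);
# as-is and canonicalized forms of an already-canonical pathname coincide
for my $l ('/srv/inbox', '/srv/inbox') {
	next if $seen{$l}++; # second, identical form is skipped
	push @tried, $l;
}
print scalar(@tried), " form(s) tried\n";
```

Since %seen is declared outside the per-location loop in the patch,
a location repeated on the command line is also only tried once.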
 lib/PublicInbox/LeiExternal.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 5b5f08d1..e7693e09 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -69,9 +69,11 @@ sub lei_forget_external {
 	my ($self, @locations) = @_;
 	my $cfg = $self->_lei_cfg(1);
 	my $quiet = $self->{opt}->{quiet};
+	my %seen;
 	for my $loc (@locations) {
 		my (@unset, @not_found);
 		for my $l ($loc, _canonicalize($loc)) {
+			next if $seen{$l}++;
 			my $key = "external.$l.boost";
 			delete($cfg->{$key});
 			$self->_config('--unset', $key);

^ permalink raw reply related	[relevance 71%]

* [PATCH 10/10] lei add-external: don't allow non-existent directories
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
                   ` (7 preceding siblings ...)
  2021-01-23 10:27 71% ` [PATCH 09/10] lei forget-external: don't show redundant "not found" Eric Wong
@ 2021-01-23 10:27 71% ` Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

At least not yet, though we may support mirroring via git.
---
 lib/PublicInbox/LeiExternal.pm | 3 +++
 t/lei.t                        | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index e7693e09..bf07c41c 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -58,6 +58,9 @@ sub lei_add_external {
 	my $cfg = $self->_lei_cfg(1);
 	my $new_boost = $self->{opt}->{boost} // 0;
 	$location = _canonicalize($location);
+	if ($location !~ m!\Ahttps?://! && !-d $location) {
+		return $self->fail("$location not a directory");
+	}
 	my $key = "external.$location.boost";
 	my $cur_boost = $cfg->{$key};
 	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
diff --git a/t/lei.t b/t/lei.t
index 6b45f5b7..60ca75c5 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -193,6 +193,10 @@ my $test_external = sub {
 	ok(!-e $config_file && !-e $store_dir,
 		'nothing created by ls-external');
 
+	ok(!$lei->('add-external', "$home/nonexistent"),
+		"fails on non-existent dir");
+	$lei->('ls-external');
+	is($out.$err, '', 'ls-external still has no output');
 	my $cfg = PublicInbox::Config->new;
 	$cfg->each_inbox(sub {
 		my ($ibx) = @_;

^ permalink raw reply related	[relevance 71%]

* [PATCH 05/10] lei: default "-f $mfolder" args for common MUAs
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-23 10:27 66% ` [PATCH 04/10] lei: oneshot: preserve stdout if writing mbox Eric Wong
@ 2021-01-23 10:27 68% ` Eric Wong
  2021-01-23 10:27 67% ` [PATCH 06/10] lei completion: handle URLs with port numbers Eric Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

At least mail, mailx, mutt, and neomutt follow this convention.
Heirloom mailx doesn't support Maildir (our default), but GNU
mailutils mail/mailx does.
---
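A standalone sketch of the argv construction described above.
mua_argv is a hypothetical helper and "my-mua" is a made-up command;
the MUA list and the '%f' placeholder follow the patch.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Hypothetical helper: build the argv used to open $mfolder in an MUA
sub mua_argv {
	my ($mua, $mfolder) = @_;
	my (@cmd, $replaced);
	if ($mua =~ /\A(?:mutt|mailx|mail|neomutt)\z/) {
		@cmd = ($mua, '-f'); # these take "-f FOLDER"
	} else {
		# user-supplied command line; '%f' marks where the folder goes
		@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ }
			shellwords($mua);
	}
	push @cmd, $mfolder unless defined($replaced);
	@cmd;
}

print join(' ', mua_argv('mutt', '/tmp/results')), "\n";
print join(' ', mua_argv('my-mua --open %f --ro', '/tmp/results')), "\n";
```

Note that @cmd is assigned, not redeclared, inside the else branch so
the result survives the block.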
 lib/PublicInbox/LEI.pm | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ba744ef3..890be575 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -698,17 +698,21 @@ sub exec_buf ($$) {
 }
 
 sub start_mua {
-	my ($self, $sock) = @_;
+	my ($self) = @_;
 	my $mua = $self->{opt}->{'mua-cmd'} // return;
 	my $mfolder = $self->{ovv}->{dst};
-	require Text::ParseWords;
-	my $replaced;
-	my @cmd = Text::ParseWords::shellwords($mua);
-	# mutt uses '%f' for open-hook with compressed folders, so we use %f
-	@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
+	my (@cmd, $replaced);
+	if ($mua =~ /\A(?:mutt|mailx|mail|neomutt)\z/) {
+		@cmd = ($mua, '-f');
+	# TODO: help wanted: other common FOSS MUAs
+	} else {
+		require Text::ParseWords;
+		@cmd = Text::ParseWords::shellwords($mua);
+		# mutt uses '%f' for open-hook with compressed mbox, we follow
+		@cmd = map { $_ eq '%f' ? ($replaced = $mfolder) : $_ } @cmd;
+	}
 	push @cmd, $mfolder unless defined($replaced);
-	$sock //= $self->{sock};
-	if ($sock) { # lei(1) client process runs it
+	if (my $sock = $self->{sock}) { # lei(1) client process runs it
 		send($sock, exec_buf(\@cmd, {}), MSG_EOR);
 	} else { # oneshot
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);

^ permalink raw reply related	[relevance 68%]

* [PATCH 06/10] lei completion: handle URLs with port numbers
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-23 10:27 68% ` [PATCH 05/10] lei: default "-f $mfolder" args for common MUAs Eric Wong
@ 2021-01-23 10:27 67% ` Eric Wong
  2021-01-23 10:27 71% ` [PATCH 07/10] lei forget-external: just show the location Eric Wong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

This improves the experience for developers running local
instances of PublicInbox::WWW without permissions to bind
port 80 or 443.
---
 lib/PublicInbox/LeiExternal.pm | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 59c3c367..a4e644ee 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -101,12 +101,36 @@ sub _complete_forget_external {
 	# Workaround bash word-splitting URLs to ['https', ':', '//' ...]
 	# Maybe there's a better way to go about this in
 	# contrib/completion/lei-completion.bash
-	my $colon = ($argv[-1] // '') eq ':';
-	my $re = $cur =~ /\A[\w-]/ ? '' : '.*';
+	my $re = '';
+	if (@argv) {
+		my @x = @argv;
+		if ($cur eq ':' && @x) {
+			push @x, $cur;
+			$cur = '';
+		}
+		while (@x > 2 && $x[0] !~ /\Ahttps?\z/ && $x[1] ne ':') {
+			shift @x;
+		}
+		if (@x >= 2) { # qw(https : hostname : 443) or qw(http :)
+			$re = join('', @x);
+		} else { # just filter out the flags and hope for the best
+			$re = join('', grep(!/^-/, @argv));
+		}
+		$re = quotemeta($re);
+	}
+	# FIXME: bash completion off "http:" or "https:" when the last
+	# character is a colon doesn't work properly even if we're
+	# returning "//$HTTP_HOST/$PATH_INFO/", not sure why, could
+	# be a bash issue.
 	map {
 		my $x = substr($_, length('external.'));
 		# only return the part specified on the CLI
-		$colon && $x =~ /(\Q$cur\E.*)/ ? $1 : $x;
+		if ($x =~ /\A$re(\Q$cur\E.*)/) {
+			# don't duplicate if already 100% completed
+			$cur eq $1 ? () : $1;
+		} else {
+			();
+		}
 	} grep(/\Aexternal\.$re\Q$cur/, @{$cfg->{-section_order}});
 }
 

^ permalink raw reply related	[relevance 67%]
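
To illustrate the workaround in the patch above: bash hands the
completion code a URL that has already been split on ':' characters,
so the pieces have to be glued back together before they can be
matched against configured externals.  The sketch below is a
simplified take under stated assumptions, not the patch itself:
rejoin_url_words() and complete_urls() are hypothetical names, the
loop condition is stricter than the one in the patch, and the example
URLs are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: glue bash-split words such as
# qw(https : //host : 8080) back into one string.
sub rejoin_url_words {
	my @x = @_;
	# drop leading words (flags, etc.) until we reach "SCHEME", ":"
	shift @x while @x > 2 && !($x[0] =~ /\Ahttps?\z/ && $x[1] eq ':');
	join('', @x);
}

# Hypothetical helper: return only the part of each known URL that
# follows what was already typed, since bash replaces just $cur.
sub complete_urls {
	my ($words, $cur, @known) = @_;
	my $re = quotemeta(rejoin_url_words(@$words));
	my @out;
	for my $url (@known) {
		next unless $url =~ /\A$re(\Q$cur\E.*)/;
		push @out, $1 unless $1 eq $cur; # already 100% completed
	}
	@out;
}
```

For instance, with the words qw(https : //example.com :) already on
the command line and '80' as the current word, a known external of
https://example.com:8080/meta/ would be offered as '8080/meta/'.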

* [PATCH 04/10] lei: oneshot: preserve stdout if writing mbox
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
  2021-01-23 10:27 47% ` [PATCH 01/10] lei: move external vivification to xsearch Eric Wong
  2021-01-23 10:27 35% ` [PATCH 02/10] lei: support remote externals Eric Wong
@ 2021-01-23 10:27 66% ` Eric Wong
  2021-01-23 10:27 68% ` [PATCH 05/10] lei: default "-f $mfolder" args for common MUAs Eric Wong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

We still need stdout if launching an MUA.
---
 lib/PublicInbox/LEI.pm        | 5 ++++-
 lib/PublicInbox/LeiToMail.pm  | 1 +
 lib/PublicInbox/LeiXSearch.pm | 9 ++++++++-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f6bc920d..ba744ef3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -298,6 +298,9 @@ sub atfork_prepare_wq {
 	if (my $pgr = $self->{pgr}) {
 		push @$tcafc, @$pgr[1,2];
 	}
+	if (my $old_1 = $self->{old_1}) {
+		push @$tcafc, $old_1;
+	}
 	for my $f (qw(lxs l2m)) {
 		my $ipc = $self->{$f} or next;
 		push @$tcafc, grep { defined }
@@ -340,7 +343,7 @@ sub atfork_parent_wq {
 		$ret->{dedupe} = $wq->deep_clone($dedupe);
 	}
 	$self->{env} = $env;
-	delete @$ret{qw(-lei_store cfg pgr lxs)}; # keep l2m
+	delete @$ret{qw(-lei_store cfg old_1 pgr lxs)}; # keep l2m
 	my @io = delete @$ret{0..2};
 	$io[3] = delete($ret->{sock}) // $io[2];
 	my $l2m = $ret->{l2m};
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 438fb175..5f38add1 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -405,6 +405,7 @@ sub _pre_augment_mbox {
 			$! == ENOENT or die "unlink($dst): $!";
 		}
 		open my $out, $mode, $dst or die "open($dst): $!";
+		$lei->{old_1} = $lei->{1};
 		$lei->{1} = $out;
 	}
 	# Perl does SEEK_END even with O_APPEND :<
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index d32fe09a..8d36bca9 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -252,7 +252,14 @@ sub query_done { # EOF callback
 	}
 	$lei->{ovv}->ovv_end($lei);
 	if ($has_l2m) { # close() calls LeiToMail reap_compress
-		close(delete($lei->{1})) if $lei->{1};
+		if (my $out = delete $lei->{old_1}) {
+			if (my $mbout = $lei->{1}) {
+				close($mbout) or return $lei->fail(<<"");
+Error closing $lei->{ovv}->{dst}: $!
+
+			}
+			$lei->{1} = $out;
+		}
 		$lei->start_mua;
 	}
 	$lei->dclose;

^ permalink raw reply related	[relevance 66%]
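
The save-and-restore dance from the patch above, reduced to two small
functions.  This is only a sketch of the idea: redirect_to_mbox() and
restore_stdout() are hypothetical names, and a plain hash stands in
for the $lei object.  The real code reports close() failures through
$lei->fail.

```perl
use strict;
use warnings;

# Hypothetical helper: point slot 1 at the mbox, remembering stdout.
sub redirect_to_mbox {
	my ($lei, $mbox_fh) = @_;
	$lei->{old_1} = $lei->{1}; # the real stdout, needed by the MUA
	$lei->{1} = $mbox_fh;
}

# Hypothetical helper: close the mbox and put stdout back.
# Returns true on success, false if closing the mbox failed.
sub restore_stdout {
	my ($lei) = @_;
	my $out = delete $lei->{old_1} or return 1; # never redirected
	if (my $mbout = $lei->{1}) {
		close($mbout) or return; # caller reports $!
	}
	$lei->{1} = $out;
	1;
}
```

The point of keeping old_1 around is that an MUA started after the
query still gets the terminal as its stdout, not the closed mbox
handle.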

* [PATCH 08/10] lei q: support a bunch of curl(1) options
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
                   ` (5 preceding siblings ...)
  2021-01-23 10:27 71% ` [PATCH 07/10] lei forget-external: just show the location Eric Wong
@ 2021-01-23 10:27 47% ` Eric Wong
  2021-01-23 10:27 71% ` [PATCH 09/10] lei forget-external: don't show redundant "not found" Eric Wong
  2021-01-23 10:27 71% ` [PATCH 10/10] lei add-external: don't allow non-existent directories Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

Some of these options will make sense on weird networks
(behind firewalls, etc.).  Some of these options may not make
sense at all.

This accommodates users who prefer the SOCKS5 proxy support in
curl over torsocks(1), but we'll still support torsocks by
default since some Tor instances aren't on the default
127.0.0.1:9050.
---
 lib/PublicInbox/LEI.pm        |  4 ++--
 lib/PublicInbox/LeiQuery.pm   | 41 +++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm | 13 +++++++++++
 3 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 890be575..a9123c6e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -84,8 +84,8 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote local! external! pretty mua-cmd=s
-	verbose|v
-	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
+	torsocks=s no-torsocks verbose|v since|after=s until|before=s),
+	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
 	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index eebf217b..acab3c2c 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -66,4 +66,45 @@ sub lei_q {
 	$lxs->do_query($self);
 }
 
+# Stuff we may pass through to curl (as of 7.64.0), see curl manpage for
+# details, so most options which make sense for HTTP/HTTPS (including proxy
+# support for Tor and other methods of getting past weird networks).
+# Most of these are untested by us, some may not make sense for our use case
+# and typos below are likely.
+# n.b. some short options (-$NUMBER) are not supported since they conflict
+# with other "lei q" switches.
+# FIXME: Getopt::Long doesn't easily let us support options with
+# '.' in them (e.g. --http1.1)
+sub curl_opt { qw(
+	abstract-unix-socket=s anyauth basic cacert=s capath=s
+	cert-status cert-type cert|E=s ciphers=s config|K=s@
+	connect-timeout=s connect-to=s cookie-jar|c=s cookie|b=s crlfile=s
+	digest disable dns-interface=s dns-ipv4-addr=s dns-ipv6-addr=s
+	dns-servers=s doh-url=s egd-file=s engine=s false-start
+	happy-eyeballs-timeout-ms=s haproxy-protocol header|H=s@
+	http2-prior-knowledge http2 insecure|k
+	interface=s ipv4 ipv6 junk-session-cookies
+	key-type=s key=s limit-rate=s local-port=s location-trusted location|L
+	max-redirs=i max-time=s negotiate netrc-file=s netrc-optional netrc
+	no-alpn no-buffer|N no-npn no-sessionid noproxy=s ntlm-wb ntlm
+	pass=s pinnedpubkey=s post301 post302 post303 preproxy=s
+	proxy-anyauth proxy-basic proxy-cacert=s proxy-capath=s
+	proxy-cert-type=s proxy-cert=s proxy-ciphers=s proxy-crlfile=s
+	proxy-digest proxy-header=s@ proxy-insecure
+	proxy-key-type=s proxy-key proxy-negotiate proxy-ntlm proxy-pass=s
+	proxy-pinnedpubkey=s proxy-service-name=s proxy-ssl-allow-beast
+	proxy-tls13-ciphers=s proxy-tlsauthtype=s proxy-tlspassword=s
+	proxy-tlsuser=s proxy-tlsv1 proxy-user|U=s proxy=s
+	proxytunnel=s pubkey=s random-file=s referer=s resolve=s
+	retry-connrefused retry-delay=s retry-max-time=s retry=i
+	sasl-ir service-name=s socks4=s socks4a=s socks5-basic
+	socks5-gssapi-service-name=s socks5-gssapi socks5-hostname=s socks5=s
+	speed-limit|Y speed-type|y ssl-allow-beast sslv2 sslv3
+	suppress-connect-headers tcp-fastopen tls-max=s
+	tls13-ciphers=s tlsauthtype=s tlspassword=s tlsuser=s
+	tlsv1 trace-ascii=s trace-time trace=s
+	unix-socket=s user-agent|A=s user|u=s
+)
+}
+
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 8d36bca9..defe5e67 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -193,6 +193,7 @@ sub query_remote_mboxrd {
 	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
 	$dedupe->prepare_dedupe;
 	my @cmd = qw(curl -XPOST -sSf);
+	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
 	my $tor = $opt->{torsocks} //= 'auto';
 	if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
 			(($lei->{env}->{LD_PRELOAD}//'') !~ /torsocks/)) {
@@ -202,6 +203,18 @@ sub query_remote_mboxrd {
 	}
 	my $verbose = $opt->{verbose};
 	push @cmd, '-v' if $verbose;
+	for my $o ($lei->curl_opt) {
+		$o =~ s/\|[a-z0-9]\b//i; # remove single char short option
+		if ($o =~ s/=[is]@\z//) {
+			my $ary = $opt->{$o} or next;
+			push @cmd, map { ("--$o", $_) } @$ary;
+		} elsif ($o =~ s/=[is]\z//) {
+			my $val = $opt->{$o} // next;
+			push @cmd, "--$o", $val;
+		} elsif ($opt->{$o}) {
+			push @cmd, "--$o";
+		}
+	}
 	push @cmd, $uri->as_string;
 	$lei->err("# @cmd") if $verbose;
 	$? = 0;

^ permalink raw reply related	[relevance 47%]
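
The loop added to query_remote_mboxrd in the patch above turns
Getopt::Long-style option specs back into a curl command line.  Here
is a standalone sketch of that translation, assuming the parsed
options live in a plain hash keyed by long option name.  curl_args()
is a hypothetical name, and the option values in the usage note
(including the proxy URL) are only examples.

```perl
use strict;
use warnings;

# Hypothetical helper: map parsed options back to curl arguments.
sub curl_args {
	my ($opt, @specs) = @_;
	my @args;
	for my $o (@specs) {
		$o =~ s/\|[a-z0-9]\b//i; # remove single char short option
		if ($o =~ s/=[is]\@\z//) { # repeatable option with value
			my $ary = $opt->{$o} or next;
			push @args, map { ("--$o", $_) } @$ary;
		} elsif ($o =~ s/=[is]\z//) { # single value, 0 is valid
			my $val = $opt->{$o} // next;
			push @args, "--$o", $val;
		} elsif ($opt->{$o}) { # boolean flag
			push @args, "--$o";
		}
	}
	@args;
}
```

Given the specs qw(header|H=s@ insecure|k max-redirs=i proxy=s ipv4)
and options where header has two values, insecure is set,
max-redirs is 0 and proxy is 'socks5h://127.0.0.1:9050', the result
repeats --header once per value, keeps "--max-redirs 0" (hence the
defined-or check) and omits the unset --ipv4.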

* [PATCH 02/10] lei: support remote externals
  2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
  2021-01-23 10:27 47% ` [PATCH 01/10] lei: move external vivification to xsearch Eric Wong
@ 2021-01-23 10:27 35% ` Eric Wong
  2021-01-24  6:01 62%   ` Kyle Meyer
  2021-01-23 10:27 66% ` [PATCH 04/10] lei: oneshot: preserve stdout if writing mbox Eric Wong
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-23 10:27 UTC (permalink / raw)
  To: meta

Via curl(1), since that lets us easily use tor on a
per-connection basis via LD_PRELOAD (torsocks) or proxy.
We'll eventually support more curl options which can allow
users to get past firewalls and deal with other odd network
configurations.
---
 lib/PublicInbox/LEI.pm         | 19 ++++++++++--
 lib/PublicInbox/LeiOverview.pm | 10 +++++-
 lib/PublicInbox/LeiToMail.pm   | 20 +++++++-----
 lib/PublicInbox/LeiXSearch.pm  | 57 +++++++++++++++++++++++++++++++++-
 lib/PublicInbox/ProcessPipe.pm |  2 ++
 script/lei                     |  2 ++
 t/lei.t                        | 39 +++++++++++++++++++++++
 7 files changed, 137 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ef3f90fc..f6bc920d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -84,6 +84,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote local! external! pretty mua-cmd=s
+	verbose|v
 	since|after=s until|before=s), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -278,6 +279,16 @@ sub fail ($$;$) {
 	undef;
 }
 
+sub child_error { # passes non-fatal curl exit codes to user
+	my ($self, $child_error) = @_; # child_error is $?
+	if (my $sock = $self->{sock}) { # send to lei(1) client
+		send($sock, "child_error $child_error", MSG_EOR);
+	} else { # oneshot
+		$self->{child_error} = $child_error;
+	}
+	undef;
+}
+
 sub atfork_prepare_wq {
 	my ($self, $wq) = @_;
 	my $tcafc = $wq->{-ipc_atfork_child_close} //= [ $listener // () ];
@@ -959,19 +970,21 @@ sub lazy_start {
 	exit($exit_code // 0);
 }
 
-# for users w/o Socket::Msghdr
+# for users w/o Socket::Msghdr installed or Inline::C enabled
 sub oneshot {
 	my ($main_pkg) = @_;
 	my $exit = $main_pkg->can('exit'); # caller may override exit()
 	local $quit = $exit if $exit;
 	local %PATH2CFG;
 	umask(077) // die("umask(077): $!");
-	dispatch((bless {
+	my $self = bless {
 		0 => *STDIN{GLOB},
 		1 => *STDOUT{GLOB},
 		2 => *STDERR{GLOB},
 		env => \%ENV
-	}, __PACKAGE__), @ARGV);
+	}, __PACKAGE__;
+	dispatch($self, @ARGV);
+	x_it($self, $self->{child_error}) if $self->{child_error};
 }
 
 # ensures stdout hits the FS before sock disconnects so a client
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 7a4fa857..49538a60 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -209,7 +209,15 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		$json->ascii(1) if $lei->{opt}->{ascii};
 	}
 	my $l2m = $lei->{l2m};
-	if ($l2m && $l2m->{-wq_s1}) {
+	if ($l2m && $ibxish->can('scheme')) { # remote https?:// mboxrd
+		delete $l2m->{-wq_s1};
+		my $g2m = $l2m->can('git_to_mail');
+		my $wcb = $l2m->write_cb($lei);
+		sub {
+			my ($smsg, undef, $eml) = @_; # no mitem in $_[1]
+			$wcb->(undef, $smsg, $eml);
+		};
+	} elsif ($l2m && $l2m->{-wq_s1}) {
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
 		# n.b. $io[0] = qry_status_wr, $io[1] = mbox|stdout,
 		# $io[4] becomes a notification pipe that triggers EOF
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index cea68319..43c59da0 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -251,9 +251,9 @@ sub _mbox_write_cb ($$) {
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe;
 	sub { # for git_to_mail
-		my ($buf, $smsg) = @_;
+		my ($buf, $smsg, $eml) = @_;
 		return unless $out;
-		my $eml = PublicInbox::Eml->new($buf);
+		$eml //= PublicInbox::Eml->new($buf);
 		if (!$dedupe->is_dup($eml, $smsg->{blob})) {
 			$buf = $eml2mbox->($eml, $smsg);
 			my $lk = $ovv->lock_for_scope;
@@ -286,18 +286,23 @@ sub _augment_file { # _maildir_each_file cb
 # _maildir_each_file callback, \&CORE::unlink doesn't work with it
 sub _unlink { unlink($_[0]) }
 
+sub _rand () {
+	state $seq = 0;
+	sprintf('%x,%x,%x,%x', rand(0xffffffff), time, $$, ++$seq);
+}
+
 sub _buf2maildir {
 	my ($dst, $buf, $smsg) = @_;
 	my $kw = $smsg->{kw} // [];
 	my $sfx = join('', sort(map { $kw2char{$_} // () } @$kw));
 	my $rand = ''; # chosen by die roll :P
 	my ($tmp, $fh, $final);
-	my $common = $smsg->{blob};
+	my $common = $smsg->{blob} // _rand;
 	if (defined(my $pct = $smsg->{pct})) { $common .= "=$pct" }
 	do {
 		$tmp = $dst.'tmp/'.$rand.$common;
 	} while (!sysopen($fh, $tmp, O_CREAT|O_EXCL|O_WRONLY) &&
-		$! == EEXIST && ($rand = int(rand 0x7fffffff).','));
+		$! == EEXIST && ($rand = _rand.','));
 	if (print $fh $$buf and close($fh)) {
 		# ignore new/ and write only to cur/, otherwise MUAs
 		# with R/W access to the Maildir will end up doing
@@ -308,7 +313,7 @@ sub _buf2maildir {
 		do {
 			$final = $dst.$rand.$common.':2,'.$sfx;
 		} while (!link($tmp, $final) && $! == EEXIST &&
-			($rand = int(rand 0x7fffffff).','));
+			($rand = _rand.','));
 		unlink($tmp) or warn "W: failed to unlink $tmp: $!\n";
 	} else {
 		my $err = $!;
@@ -323,9 +328,10 @@ sub _maildir_write_cb ($$) {
 	$dedupe->prepare_dedupe;
 	my $dst = $lei->{ovv}->{dst};
 	sub { # for git_to_mail
-		my ($buf, $smsg) = @_;
+		my ($buf, $smsg, $eml) = @_;
+		$buf //= \($eml->as_string);
 		return _buf2maildir($dst, $buf, $smsg) if !$dedupe;
-		my $eml = PublicInbox::Eml->new($$buf); # copy buf
+		$eml //= PublicInbox::Eml->new($$buf); # copy buf
 		return if $dedupe->is_dup($eml, $smsg->{blob});
 		undef $eml;
 		_buf2maildir($dst, $buf, $smsg);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 10c25246..d32fe09a 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -14,6 +14,7 @@ use PublicInbox::Import;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
 use PublicInbox::Search qw(xap_terms);
+use PublicInbox::Spawn qw(popen_rd);
 
 sub new {
 	my ($class) = @_;
@@ -169,8 +170,58 @@ sub query_mset { # non-parallel for non-"--thread" users
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
+sub each_eml { # callback for MboxReader->mboxrd
+	my ($eml, $self, $lei, $each_smsg) = @_;
+	my $smsg = bless {}, 'PublicInbox::Smsg';
+	$smsg->populate($eml);
+	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
+	delete @$smsg{qw(From Subject -ds -ts)};
+	if (my $startq = delete($self->{5})) { wait_startq($startq) }
+	return if !$lei->{l2m} && $lei->{dedupe}->is_smsg_dup($smsg);
+	$each_smsg->($smsg, undef, $eml);
+}
+
 sub query_remote_mboxrd {
 	my ($self, $lei, $uri) = @_;
+	local $0 = "$0 query_remote_mboxrd";
+	my %sig = $lei->atfork_child_wq($self); # keep $self->{5} startq
+	local @SIG{keys %sig} = values %sig;
+	my $opt = $lei->{opt};
+	$uri->query_form(q => $lei->{mset_opt}->{qstr}, x => 'm',
+			$opt->{thread} ? (t => 1) : ());
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uri);
+	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
+	$dedupe->prepare_dedupe;
+	my @cmd = qw(curl -XPOST -sSf);
+	my $tor = $opt->{torsocks} //= 'auto';
+	if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
+			(($lei->{env}->{LD_PRELOAD}//'') !~ /torsocks/)) {
+		unshift @cmd, 'torsocks';
+	} elsif (PublicInbox::Config::git_bool($tor)) {
+		unshift @cmd, 'torsocks';
+	}
+	my $verbose = $opt->{verbose};
+	push @cmd, '-v' if $verbose;
+	push @cmd, $uri->as_string;
+	$lei->err("# @cmd") if $verbose;
+	$? = 0;
+	my $fh = popen_rd(\@cmd, $lei->{env}, { 2 => $lei->{2} });
+	$fh = IO::Uncompress::Gunzip->new($fh);
+	eval {
+		PublicInbox::MboxReader->mboxrd($fh, \&each_eml,
+						$self, $lei, $each_smsg);
+	};
+	return $lei->fail("E: @cmd: $@") if $@;
+	if (($? >> 8) == 22) { # HTTP 404 from curl(1)
+		$uri->query_form(q => $lei->{mset_opt}->{qstr});
+		$lei->err('# no results from '.$uri->as_string);
+	} elsif ($?) {
+		$uri->query_form(q => $lei->{mset_opt}->{qstr});
+		$lei->err('E: '.$uri->as_string);
+		$lei->child_error($?);
+	}
+	undef $each_smsg;
+	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
 sub git {
@@ -230,7 +281,6 @@ sub start_query { # always runs in main (lei-daemon) process
 	} else {
 		$self->wq_do('query_mset', $io, $lei);
 	}
-	# TODO
 	for my $uri (remotes($self)) {
 		$self->wq_do('query_remote_mboxrd', $io, $lei, $uri);
 	}
@@ -263,6 +313,7 @@ sub do_query {
 	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
 	$io[0] = undef;
 	pipe(my $done, $io[0]) or die "pipe $!";
+	$lei_orig->{1}->autoflush(1);
 
 	$lei_orig->event_step_init; # wait for shutdowns
 	my $done_op = {
@@ -296,6 +347,10 @@ sub do_query {
 
 sub ipc_atfork_prepare {
 	my ($self) = @_;
+	if (exists $self->{remotes}) {
+		require PublicInbox::MboxReader;
+		require IO::Uncompress::Gunzip;
+	}
 	# (0: done_wr, 1: stdout|mbox, 2: stderr,
 	#  3: sock, 4: $l2m->{-wq_s1}, 5: $startq)
 	$self->wq_set_recv_modes(qw[+<&= >&= >&= +<&= +<&= <&=]);
diff --git a/lib/PublicInbox/ProcessPipe.pm b/lib/PublicInbox/ProcessPipe.pm
index e540dc22..97e9c268 100644
--- a/lib/PublicInbox/ProcessPipe.pm
+++ b/lib/PublicInbox/ProcessPipe.pm
@@ -13,6 +13,8 @@ sub TIEHANDLE {
 		$class;
 }
 
+sub BINMODE { binmode(shift->{fh}) } # for IO::Uncompress::Gunzip
+
 sub READ { read($_[0]->{fh}, $_[1], $_[2], $_[3] || 0) }
 
 sub READLINE { readline($_[0]->{fh}) }
diff --git a/script/lei b/script/lei
index 8dcea562..8c40bf12 100755
--- a/script/lei
+++ b/script/lei
@@ -93,6 +93,8 @@ Falling back to (slow) one-shot mode
 		if ($buf =~ /\Ax_it ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
 			last;
+		} elsif ($buf =~ /\Achild_error ([0-9]+)\z/) {
+			$x_it_code = $1 + 0;
 		} elsif ($buf =~ /\Aexec (.+)\z/) {
 			exec_cmd(\@fds, split(/\0/, $1));
 		} else {
diff --git a/t/lei.t b/t/lei.t
index 50ad2bb1..6b45f5b7 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -8,11 +8,15 @@ use PublicInbox::TestCommon;
 use PublicInbox::Config;
 use File::Path qw(rmtree);
 use Fcntl qw(SEEK_SET);
+use PublicInbox::Spawn qw(which);
 require_git 2.6;
 require_mods(qw(json DBD::SQLite Search::Xapian));
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my ($home, $for_destroy) = tmpdir();
 my $err_filter;
+my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
+	http://czquwvybam4bgbro.onion/meta/
+	http://ou63pmih66umazou.onion/meta/);
 my $lei = sub {
 	my ($cmd, $env, $xopt) = @_;
 	$out = $err = '';
@@ -155,6 +159,32 @@ my $setup_publicinboxes = sub {
 	$seen || BAIL_OUT 'no imports';
 };
 
+my $test_external_remote = sub {
+	my ($url, $k) = @_;
+SKIP: {
+	my $nr = 4;
+	skip "$k unset", $nr if !$url;
+	which('curl') or skip 'no curl', $nr;
+	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
+	$lei->('ls-external');
+	for my $e (split(/^/ms, $out)) {
+		$e =~ s/\s+boost.*//s;
+		$lei->('forget-external', '-q', $e) or
+			fail "error forgetting $e: $err"
+	}
+	$lei->('add-external', $url);
+	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
+	ok($lei->('q', "m:$mid"), "query $url");
+	is($err, '', "no errors on $url");
+	my $res = PublicInbox::Config->json->decode($out);
+	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
+	ok($lei->('q', "m:$mid", 'd:..20101002'), 'no results, no error');
+	like($err, qr/404/, 'noted 404');
+	is($out, "[null]\n", 'got null results');
+	$lei->('forget-external', $url);
+} # /SKIP
+}; # /sub
+
 my $test_external = sub {
 	$setup_publicinboxes->();
 	$cleanup->();
@@ -243,6 +273,15 @@ my $test_external = sub {
 	}
 	ok(!$lei->('q', '-o', "$home/mbox", 's:nope'),
 			'fails if mbox format unspecified');
+	my %e = (
+		TEST_LEI_EXTERNAL_HTTPS => 'https://public-inbox.org/meta/',
+		TEST_LEI_EXTERNAL_ONION => $onions[int(rand(scalar(@onions)))],
+	);
+	for my $k (keys %e) {
+		my $url = $ENV{$k} // '';
+		$url = $e{$k} if $url eq '1';
+		$test_external_remote->($url, $k);
+	}
 };
 
 my $test_lei_common = sub {

^ permalink raw reply related	[relevance 35%]
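
The patch above inspects $? after curl finishes.  A small sketch of
how such a wait status can be decoded follows; classify_curl_status()
is a hypothetical name and the returned strings are made up for
illustration.  Note that with -f, curl uses exit code 22 for HTTP
responses of 400 and above in general, so a 404 ("no results") is
only one of the cases that can land there.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a wait status as found in $?
sub classify_curl_status {
	my ($status) = @_;
	return 'ok' if $status == 0;
	# low 7 bits: signal number if the child was killed
	return 'signal '.($status & 127) if $status & 127;
	my $code = $status >> 8; # high byte: exit code
	$code == 22 ? 'http-error' : "exit $code";
}
```

In lei itself, the non-fatal case is reported on stderr and other
failures are forwarded to the client via child_error, as shown in the
diff.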

* Re: [PATCH 02/10] lei: support remote externals
  2021-01-23 10:27 35% ` [PATCH 02/10] lei: support remote externals Eric Wong
@ 2021-01-24  6:01 62%   ` Kyle Meyer
  2021-01-24 12:02 70%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-01-24  6:01 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

>  sub query_remote_mboxrd {
[...]
> +	my @cmd = qw(curl -XPOST -sSf);

I've been playing around with lei locally (wow :>).

The one snag I've hit is hooking up the http archives that I host
(<https://yhetil.org>).  It seems to boil down to the `curl -XPOST'
command failing.  For example, this works fine with public-inbox.org:

  $ curl -sSf -XPOST 'https://public-inbox.org/meta/?q=s:lei&x=m' | zless

But it fails with the mirror of meta at yhetil.org:

  $ curl -sSf -XPOST 'https://yhetil.org/meta/?q=s:lei&x=m' | zless
  curl: (22) The requested URL returned error: 400 Bad Request

If I add -d'' to the call, it works and produces the same output as the
above call against public-inbox.org/meta.  And if I add this option to
query_remote_mboxrd (i.e. applying the change at the end), `lei q' works
for me as expected.

yhetil.org uses nginx and varnish, and I'm _very_ far from being an
expert in either of those, so I have no doubt that the above error could
be the result of me configuring something incorrectly.  However, despite
a fair amount of time and effort, I couldn't figure out how to tweak
things to make the above command work without --data.

I quickly checked <https://lore.kernel.org>, and it seems like it would
give a similar response to -XPOST without data:

  $ curl -fSs -XPOST 'https://lore.kernel.org/git/?q=get-urlmatch&x=m'
  curl: (22) The requested URL returned error: 400
  $  curl -d'' -fSs -XPOST 'https://lore.kernel.org/git/?q=get-urlmatch&x=m' | zless
  # works


diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index defe5e67..766e9f5f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -192,7 +192,7 @@ sub query_remote_mboxrd {
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uri);
 	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
 	$dedupe->prepare_dedupe;
-	my @cmd = qw(curl -XPOST -sSf);
+	my @cmd = (qw(curl -XPOST -sSf -d), '');
 	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
 	my $tor = $opt->{torsocks} //= 'auto';
 	if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&


^ permalink raw reply related	[relevance 62%]
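
One detail worth keeping in mind when carrying the -d'' shell idiom
into Perl: qw() performs no shell-style quote processing, so quotes
inside it stay in the argument.  The sketch below only demonstrates
the Perl side; the variable names are made up, and how curl or a
given server treats each request body is not shown here.

```perl
use strict;
use warnings;

# Inside qw(), the two quote characters are part of the word:
my @literal = qw(curl -XPOST -d'' -sSf);

# An empty argument has to be its own list element:
my @explicit = (qw(curl -XPOST -sSf -d), '');

printf "literal arg: %s (%d chars)\n", $literal[2], length($literal[2]);
printf "explicit arg is %d chars long\n", length($explicit[-1]);
```

When the list is executed without a shell, the first form would
presumably hand curl a two-character body consisting of the quotes,
while the second passes a truly empty string to -d.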

* [PATCH 0/9] lei remotes fixes and updates
@ 2021-01-24 11:46 71% Eric Wong
  2021-01-24 11:46 53% ` [PATCH 1/9] lei q: limit concurrency to 4 remote connections Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2021-01-24 11:46 UTC (permalink / raw)
  To: meta

Eric Wong (9):
  lei q: limit concurrency to 4 remote connections
  ipc: wq supports arbitrarily large payloads
  ipc: get rid of wq_set_recv_modes
  lei q: disable remote externals if locals exist
  lei q: honor --no-local to force remote searches
  lei_xsearch: use curl -d '' for nginx compatibility
  lei q: fix JSON overview with remote externals
  smsg: make parse_references an object method
  smsg: parse_references: micro-optimization to avoid ++

 lib/PublicInbox/IPC.pm         |  85 +++++++++++++++++----------
 lib/PublicInbox/LEI.pm         |   9 ++-
 lib/PublicInbox/LeiOverview.pm |   2 +-
 lib/PublicInbox/LeiQuery.pm    |  13 ++++-
 lib/PublicInbox/LeiToMail.pm   |   7 +--
 lib/PublicInbox/LeiXSearch.pm  | 101 ++++++++++++++++++---------------
 lib/PublicInbox/OverIdx.pm     |  22 +------
 lib/PublicInbox/SearchIdx.pm   |   2 +-
 lib/PublicInbox/Smsg.pm        |  22 ++++++-
 script/lei                     |  11 ++--
 t/cmd_ipc.t                    |  16 ++++++
 t/ipc.t                        |  21 ++++++-
 t/lei.t                        |   3 +
 13 files changed, 196 insertions(+), 118 deletions(-)


^ permalink raw reply	[relevance 71%]

* [PATCH 4/9] lei q: disable remote externals if locals exist
  2021-01-24 11:46 71% [PATCH 0/9] lei remotes fixes and updates Eric Wong
  2021-01-24 11:46 53% ` [PATCH 1/9] lei q: limit concurrency to 4 remote connections Eric Wong
@ 2021-01-24 11:46 67% ` Eric Wong
  2021-01-24 11:46 65% ` [PATCH 5/9] lei q: honor --no-local to force remote searches Eric Wong
  2021-01-24 11:46 54% ` [PATCH 7/9] lei q: fix JSON overview with remote externals Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-24 11:46 UTC (permalink / raw)
  To: meta

--remote should be explicitly enabled if local externals are
present, since users may be offline or on expensive + metered
Internet while traveling.

In the future, --remote will probably default to
caching/memoizing all messages it fetches to increase the
usefulness of --local.
---
 lib/PublicInbox/LEI.pm      | 2 +-
 lib/PublicInbox/LeiQuery.pm | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 473a28a9..378113e8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -83,7 +83,7 @@ sub _config_path ($) {
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
-	sort|s=s reverse|r offset=i remote local! external! pretty mua-cmd=s
+	sort|s=s reverse|r offset=i remote! local! external! pretty mua-cmd=s
 	torsocks=s no-torsocks verbose|v since|after=s until|before=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index a7938e8b..7713902b 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -24,7 +24,9 @@ sub lei_q {
 	# --external is enabled by default, but allow --no-external
 	if ($opt->{external} //= 1) {
 		my $cb = $lxs->can('prepare_external');
-		$self->_externals_each($cb, $lxs);
+		my $ne = $self->_externals_each($cb, $lxs);
+		$opt->{remote} //= $ne == $lxs->remotes;
+		delete($lxs->{remotes}) if !$opt->{remote};
 	}
 	my $xj = $lxs->concurrency($opt);
 	my $ovv = PublicInbox::LeiOverview->new($self) or return;

^ permalink raw reply related	[relevance 67%]
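
The default chosen in the patch above can be stated as a tiny pure
function.  This is a sketch under the assumption that
_externals_each returns the total number of externals and that
remotes() yields the remote ones; remote_default() is a hypothetical
name.

```perl
use strict;
use warnings;

# Hypothetical helper: should remote externals be queried?
sub remote_default {
	my ($explicit, $n_externals, $n_remotes) = @_;
	# an explicit --remote or --no-remote always wins;
	# otherwise go remote only if there is nothing local to search
	$explicit // ($n_externals == $n_remotes ? 1 : 0);
}
```

So three externals of which one is remote means the network is left
alone unless --remote is given, while a configuration with only
remote externals keeps working without extra switches.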

* [PATCH 1/9] lei q: limit concurrency to 4 remote connections
  2021-01-24 11:46 71% [PATCH 0/9] lei remotes fixes and updates Eric Wong
@ 2021-01-24 11:46 53% ` Eric Wong
  2021-01-24 11:46 67% ` [PATCH 4/9] lei q: disable remote externals if locals exist Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-24 11:46 UTC (permalink / raw)
  To: meta

Unfortunately, this isn't a per-host limit yet, but it
nevertheless reduces load on existing PublicInbox::WWW
instances, since requesting an mboxrd is one of the more
expensive operations.
---
 lib/PublicInbox/LeiQuery.pm   |  2 +-
 lib/PublicInbox/LeiXSearch.pm | 82 +++++++++++++++++++++--------------
 2 files changed, 51 insertions(+), 33 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index acab3c2c..a7938e8b 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -26,7 +26,7 @@ sub lei_q {
 		my $cb = $lxs->can('prepare_external');
 		$self->_externals_each($cb, $lxs);
 	}
-	my $xj = $opt->{thread} ? $lxs->locals : ($lxs->remotes + 1);
+	my $xj = $lxs->concurrency($opt);
 	my $ovv = PublicInbox::LeiOverview->new($self) or return;
 	$self->atfork_prepare_wq($lxs);
 	$lxs->wq_workers_start('lei_xsearch', $xj, $self->oldset);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index defe5e67..1c093a94 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -182,25 +182,16 @@ sub each_eml { # callback for MboxReader->mboxrd
 }
 
 sub query_remote_mboxrd {
-	my ($self, $lei, $uri) = @_;
+	my ($self, $lei, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
 	my %sig = $lei->atfork_child_wq($self); # keep $self->{5} startq
 	local @SIG{keys %sig} = values %sig;
-	my $opt = $lei->{opt};
-	$uri->query_form(q => $lei->{mset_opt}->{qstr}, x => 'm',
-			$opt->{thread} ? (t => 1) : ());
-	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uri);
+	my ($opt, $env) = @$lei{qw(opt env)};
+	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
+	push(@qform, t => 1) if $opt->{thread};
 	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
 	$dedupe->prepare_dedupe;
 	my @cmd = qw(curl -XPOST -sSf);
-	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
-	my $tor = $opt->{torsocks} //= 'auto';
-	if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
-			(($lei->{env}->{LD_PRELOAD}//'') !~ /torsocks/)) {
-		unshift @cmd, 'torsocks';
-	} elsif (PublicInbox::Config::git_bool($tor)) {
-		unshift @cmd, 'torsocks';
-	}
 	my $verbose = $opt->{verbose};
 	push @cmd, '-v' if $verbose;
 	for my $o ($lei->curl_opt) {
@@ -215,25 +206,36 @@ sub query_remote_mboxrd {
 			push @cmd, "--$o";
 		}
 	}
-	push @cmd, $uri->as_string;
-	$lei->err("# @cmd") if $verbose;
-	$? = 0;
-	my $fh = popen_rd(\@cmd, $lei->{env}, { 2 => $lei->{2} });
-	$fh = IO::Uncompress::Gunzip->new($fh);
-	eval {
-		PublicInbox::MboxReader->mboxrd($fh, \&each_eml,
-						$self, $lei, $each_smsg);
-	};
-	return $lei->fail("E: @cmd: $@") if $@;
-	if (($? >> 8) == 22) { # HTTP 404 from curl(1)
-		$uri->query_form(q => $lei->{mset_opt}->{qstr});
-		$lei->err('# no results from '.$uri->as_string);
-	} elsif ($?) {
-		$uri->query_form(q => $lei->{mset_opt}->{qstr});
-		$lei->err('E: '.$uri->as_string);
-		$lei->child_error($?);
+	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
+	my $tor = $opt->{torsocks} //= 'auto';
+	for my $uri (@$uris) {
+		$uri->query_form(@qform);
+		my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uri);
+		my $cmd = [ @cmd, $uri->as_string ];
+		if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
+				(($env->{LD_PRELOAD}//'') !~ /torsocks/)) {
+			unshift @$cmd, 'torsocks';
+		} elsif (PublicInbox::Config::git_bool($tor)) {
+			unshift @$cmd, 'torsocks';
+		}
+		$lei->err("# @$cmd") if $verbose;
+		$? = 0;
+		my $fh = popen_rd($cmd, $env, { 2 => $lei->{2} });
+		$fh = IO::Uncompress::Gunzip->new($fh);
+		eval {
+			PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
+							$lei, $each_smsg);
+		};
+		return $lei->fail("E: @$cmd: $@") if $@;
+		if (($? >> 8) == 22) { # HTTP 404 from curl(1)
+			$uri->query_form(q => $lei->{mset_opt}->{qstr});
+			$lei->err('# no results from '.$uri->as_string);
+		} elsif ($?) {
+			$uri->query_form(q => $lei->{mset_opt}->{qstr});
+			$lei->err('E: '.$uri->as_string);
+			$lei->child_error($?);
+		}
 	}
-	undef $each_smsg;
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
@@ -292,6 +294,17 @@ sub do_post_augment {
 	close $au_done; # triggers wait_startq
 }
 
+my $MAX_PER_HOST = 4;
+sub MAX_PER_HOST { $MAX_PER_HOST }
+
+sub concurrency {
+	my ($self, $opt) = @_;
+	my $nl = $opt->{thread} ? locals($self) : 1;
+	my $nr = remotes($self);
+	$nr = $MAX_PER_HOST if $nr > $MAX_PER_HOST;
+	$nl + $nr;
+}
+
 sub start_query { # always runs in main (lei-daemon) process
 	my ($self, $io, $lei) = @_;
 	if ($lei->{opt}->{thread}) {
@@ -301,8 +314,13 @@ sub start_query { # always runs in main (lei-daemon) process
 	} else {
 		$self->wq_do('query_mset', $io, $lei);
 	}
+	my $i = 0;
+	my $q = [];
 	for my $uri (remotes($self)) {
-		$self->wq_do('query_remote_mboxrd', $io, $lei, $uri);
+		push @{$q->[$i++ % $MAX_PER_HOST]}, $uri;
+	}
+	for my $uris (@$q) {
+		$self->wq_do('query_remote_mboxrd', $io, $lei, $uris);
 	}
 	@$io = ();
 }
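
The bucketing done in start_query above can be tried in isolation.
This is only a minimal sketch: `bucket_uris` is a hypothetical helper
name (not part of LeiXSearch), and plain URL strings on example.com
stand in for URI objects.  Note that buckets are filled by position
in the list, not by hostname.

```perl
use strict;
use warnings;

my $MAX_PER_HOST = 4; # same cap as the patch

# Hypothetical helper (not in LeiXSearch): deal URIs round-robin into
# at most $MAX_PER_HOST buckets; each bucket would become one wq_do() job.
sub bucket_uris {
	my (@uris) = @_;
	my ($i, $q) = (0, []);
	push @{$q->[$i++ % $MAX_PER_HOST]}, $_ for @uris;
	$q;
}

my $q = bucket_uris(map { "https://example.com/inbox$_/" } 1..6);
printf "%d jobs\n", scalar(@$q);
print join(' ', @$_), "\n" for @$q;
```

With six URIs this gives four jobs; the first job handles inbox1 and
inbox5 sequentially, the second inbox2 and inbox6.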

^ permalink raw reply related	[relevance 53%]

* [PATCH 5/9] lei q: honor --no-local to force remote searches
  2021-01-24 11:46 71% [PATCH 0/9] lei remotes fixes and updates Eric Wong
  2021-01-24 11:46 53% ` [PATCH 1/9] lei q: limit concurrency to 4 remote connections Eric Wong
  2021-01-24 11:46 67% ` [PATCH 4/9] lei q: disable remote externals if locals exist Eric Wong
@ 2021-01-24 11:46 65% ` Eric Wong
  2021-01-24 12:31 71%   ` exit codes [was: [PATCH 5/9] lei q: honor --no-local to force remote searches] Eric Wong
  2021-01-24 11:46 54% ` [PATCH 7/9] lei q: fix JSON overview with remote externals Eric Wong
  3 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-24 11:46 UTC (permalink / raw)
  To: meta

This can be useful for testing remote behavior, or for
augmenting local results.  It'll also be possible to explicitly
include/exclude externals via CLI switches (once names are
decided).
---
 lib/PublicInbox/LeiQuery.pm   | 9 ++++++++-
 lib/PublicInbox/LeiXSearch.pm | 2 +-
 t/lei.t                       | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 7713902b..953d1fc2 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -26,7 +26,14 @@ sub lei_q {
 		my $cb = $lxs->can('prepare_external');
 		my $ne = $self->_externals_each($cb, $lxs);
 		$opt->{remote} //= $ne == $lxs->remotes;
-		delete($lxs->{remotes}) if !$opt->{remote};
+		if ($opt->{'local'}) {
+			delete($lxs->{remotes}) if !$opt->{remote};
+		} else {
+			delete($lxs->{locals});
+		}
+	}
+	unless ($lxs->locals || $lxs->remotes) {
+		return $self->fail('no local or remote inboxes to search');
 	}
 	my $xj = $lxs->concurrency($opt);
 	my $ovv = PublicInbox::LeiOverview->new($self) or return;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index c396c597..0417db24 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -311,7 +311,7 @@ sub start_query { # always runs in main (lei-daemon) process
 		for my $ibxish (locals($self)) {
 			$self->wq_do('query_thread_mset', $io, $lei, $ibxish);
 		}
-	} else {
+	} elsif (locals($self)) {
 		$self->wq_do('query_mset', $io, $lei);
 	}
 	my $i = 0;
diff --git a/t/lei.t b/t/lei.t
index 60ca75c5..3fd1d1fe 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -277,6 +277,9 @@ my $test_external = sub {
 	}
 	ok(!$lei->('q', '-o', "$home/mbox", 's:nope'),
 			'fails if mbox format unspecified');
+	ok(!$lei->(qw(q --no-local s:see)), '--no-local');
+	is($? >> 8, 1, 'proper exit code');
+	like($err, qr/no local or remote.+? to search/, 'no inbox');
 	my %e = (
 		TEST_LEI_EXTERNAL_HTTPS => 'https://public-inbox.org/meta/',
 		TEST_LEI_EXTERNAL_ONION => $onions[int(rand(scalar(@onions)))],

^ permalink raw reply related	[relevance 65%]

* [PATCH 7/9] lei q: fix JSON overview with remote externals
  2021-01-24 11:46 71% [PATCH 0/9] lei remotes fixes and updates Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-24 11:46 65% ` [PATCH 5/9] lei q: honor --no-local to force remote searches Eric Wong
@ 2021-01-24 11:46 54% ` Eric Wong
  2021-01-24 12:37 71%   ` Eric Wong
  3 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-24 11:46 UTC (permalink / raw)
  To: meta

We can't (and don't need to) repeatedly get the $each_smsg
callback for each URI since that clobbers {ovv_buf} before
it can be output.

I initially thought this was a dedupe-related bug and
moved the dedupe code into the $each_smsg callback to
minimize differences.  Nevertheless it's a nice code
reduction.

I also thought it was related to incomplete smsg info,
so {references} is now filled in correctly for dedupe.
---
 lib/PublicInbox/LeiOverview.pm |  2 +-
 lib/PublicInbox/LeiXSearch.pm  | 15 +++++----------
 2 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 49538a60..928d66cb 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -209,7 +209,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		$json->ascii(1) if $lei->{opt}->{ascii};
 	}
 	my $l2m = $lei->{l2m};
-	if ($l2m && $ibxish->can('scheme')) { # remote https?:// mboxrd
+	if ($l2m && !$ibxish) { # remote https?:// mboxrd
 		delete $l2m->{-wq_s1};
 		my $g2m = $l2m->can('git_to_mail');
 		my $wcb = $l2m->write_cb($lei);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index c6ff5679..2bedf000 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -15,6 +15,7 @@ use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
 use PublicInbox::Search qw(xap_terms);
 use PublicInbox::Spawn qw(popen_rd);
+use PublicInbox::MID qw(mids);
 
 sub new {
 	my ($class) = @_;
@@ -120,8 +121,6 @@ sub query_thread_mset { # for --thread
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
-	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
-	$dedupe->prepare_dedupe;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		my $ids = $srch->mset_to_artnums($mset, $mo);
@@ -132,7 +131,6 @@ sub query_thread_mset { # for --thread
 			for my $n (@{$ctx->{xids}}) {
 				my $smsg = $over->get_art($n) or next;
 				wait_startq($startq) if $startq;
-				next if $dedupe->is_smsg_dup($smsg);
 				my $mitem = delete $n2item{$smsg->{num}};
 				$each_smsg->($smsg, $mitem);
 			}
@@ -155,14 +153,11 @@ sub query_mset { # non-parallel for non-"--thread" users
 		attach_external($self, $loc);
 	}
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $self);
-	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
-	$dedupe->prepare_dedupe;
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
 		for my $mitem ($mset->items) {
 			my $smsg = smsg_for($self, $mitem) or next;
 			wait_startq($startq) if $startq;
-			next if $dedupe->is_smsg_dup($smsg);
 			$each_smsg->($smsg, $mitem);
 		}
 	} while (_mset_more($mset, $mo));
@@ -174,10 +169,10 @@ sub each_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self, $lei, $each_smsg) = @_;
 	my $smsg = bless {}, 'PublicInbox::Smsg';
 	$smsg->populate($eml);
+	PublicInbox::OverIdx::parse_references($smsg, $eml, mids($eml));
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
 	if (my $startq = delete($self->{5})) { wait_startq($startq) }
-	return if !$lei->{l2m} && $lei->{dedupe}->is_smsg_dup($smsg);
 	$each_smsg->($smsg, undef, $eml);
 }
 
@@ -189,8 +184,6 @@ sub query_remote_mboxrd {
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
-	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
-	$dedupe->prepare_dedupe;
 	my @cmd = (qw(curl -sSf -d), '');
 	my $verbose = $opt->{verbose};
 	push @cmd, '-v' if $verbose;
@@ -208,9 +201,9 @@ sub query_remote_mboxrd {
 	}
 	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
 	my $tor = $opt->{torsocks} //= 'auto';
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uris->[0]);
 	for my $uri (@$uris) {
 		$uri->query_form(@qform);
-		my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uri);
 		my $cmd = [ @cmd, $uri->as_string ];
 		if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
 				(($env->{LD_PRELOAD}//'') !~ /torsocks/)) {
@@ -236,6 +229,7 @@ sub query_remote_mboxrd {
 			$lei->child_error($?);
 		}
 	}
+	undef $each_smsg;
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
@@ -387,6 +381,7 @@ sub ipc_atfork_prepare {
 	my ($self) = @_;
 	if (exists $self->{remotes}) {
 		require PublicInbox::MboxReader;
+		require PublicInbox::OverIdx; # parse_references
 		require IO::Uncompress::Gunzip;
 	}
 	# FDS: (0: done_wr, 1: stdout|mbox, 2: stderr,

^ permalink raw reply related	[relevance 54%]

* Re: [PATCH 02/10] lei: support remote externals
  2021-01-24  6:01 62%   ` Kyle Meyer
@ 2021-01-24 12:02 70%     ` Eric Wong
  2021-01-24 12:12 71%       ` Eric Wong
  2021-01-24 22:11 71%       ` Kyle Meyer
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2021-01-24 12:02 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> I've been playing around with lei locally (wow :>).

Glad you're enjoying it.  I'm still not completely happy with
some of the internals (the IPC stuff is a bit adventurous and
perhaps overkill), but functionality's getting slowly fleshed
out.

Btw, since you seem to be figuring things out without existing
docs, could I convince you to start manpages for lei?

Don't feel obligated, but it might be better for everybody since
my brain tends to skip over stuff that's only obvious because I
designed it :x (and my brain feels "off" :<)

I'm envisioning git-style manpages, with subcommands each having
their own manpage and an lei-overview(7) with common examples
as quick-start for beginners.

<snip>

> yhetil.org uses nginx and varnish, and I'm _very_ far from being an
> expert in either of those, so I have no doubt that the above error could
> be the result of me configuring something incorrectly.  However, despite
> a fair amount of time and effort, I couldn't figure out how to tweak
> things to make the above command work without --data.

Thanks, it may be nginx-specific behavior, but
https://public-inbox.org/meta/20210124114655.12815-7-e@80x24.org/
should do the trick.

> -	my @cmd = qw(curl -XPOST -sSf);
> +	my @cmd = qw(curl -XPOST -d'' -sSf);

That '' is a syntax error for me, and curl nags on -XPOST
with -d, so I've omitted -XPOST from my patch.  Thanks again for
the report.
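
For reference, qw() splits on whitespace only and does no quote
processing, so the two spellings hand curl different argument lists.
A small sketch that only builds the lists and runs nothing:

```perl
use strict;
use warnings;

# qw() splits on whitespace and does no quote processing, so the two
# quote characters stay inside the word as literal text
my @quoted = qw(curl -XPOST -d'' -sSf);

# an explicit empty string gives curl a real empty argument after -d
my @empty = (qw(curl -sSf -d), '');

print join(' ', map { "[$_]" } @quoted), "\n";
print join(' ', map { "[$_]" } @empty), "\n";
```

In the first list, curl would see the two quote characters as part
of the -d argument; in the second, -d is followed by an empty string.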

^ permalink raw reply	[relevance 70%]

* Re: [PATCH 02/10] lei: support remote externals
  2021-01-24 12:02 70%     ` Eric Wong
@ 2021-01-24 12:12 71%       ` Eric Wong
  2021-01-24 22:11 71%       ` Kyle Meyer
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2021-01-24 12:12 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Kyle Meyer <kyle@kyleam.com> wrote:
> > -	my @cmd = qw(curl -XPOST -sSf);
> > +	my @cmd = qw(curl -XPOST -d'' -sSf);
> 
> That '' is a syntax error for me, and curl nags on -XPOST
> with -d, so I've omitted -XPOST from my patch.  Thanks again for
> the report.

Never mind :x  Way too sleepy :<

^ permalink raw reply	[relevance 71%]

* exit codes [was: [PATCH 5/9] lei q: honor --no-local to force remote searches]
  2021-01-24 11:46 65% ` [PATCH 5/9] lei q: honor --no-local to force remote searches Eric Wong
@ 2021-01-24 12:31 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-24 12:31 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> --- a/t/lei.t
> +++ b/t/lei.t
> @@ -277,6 +277,9 @@ my $test_external = sub {
>  	}
>  	ok(!$lei->('q', '-o', "$home/mbox", 's:nope'),
>  			'fails if mbox format unspecified');
> +	ok(!$lei->(qw(q --no-local s:see)), '--no-local');
> +	is($? >> 8, 1, 'proper exit code');

I'm wondering if the BSD EX_* constants from sysexits.h make sense
for lei and whether users will care; keep in mind git doesn't use
them.  They may also conflict with curl(1) exit codes, which we
propagate back to the user.

public-inbox-mda uses those codes because MTAs understand them;
but I'm not sure if lei will be invoked by MTAs via procmail
and what not...

sysexits.ph is distributed with Debian and FreeBSD Perl,
but not CentOS7.  However, the codes are stable across
architectures and OSes (AFAIK), unlike syscall numbers.
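
To make the possible overlap concrete, here is a rough sketch.
sysexits.h reserves 64 through 78 (EX__BASE through EX__MAX), and
curl(1) also documents exit codes inside that range.  The helper
names `decode_status` and `in_sysexits_range` are hypothetical and
only for illustration; the status values are made up, not captured
from a real run.

```perl
use strict;
use warnings;

# sysexits.h reserves 64..78 (EX_USAGE .. EX_CONFIG)
use constant { EX__BASE => 64, EX__MAX => 78 };

# Hypothetical helper: split a wait status ($?) into exit code and signal
sub decode_status {
	my ($status) = @_;
	return +{ code => $status >> 8, signal => $status & 127 };
}

# true if an exit code falls inside the sysexits.h range
sub in_sysexits_range {
	my ($code) = @_;
	$code >= EX__BASE && $code <= EX__MAX;
}

# made-up wait statuses: curl's 22 (HTTP error with -f), 67, and 1
for my $status (22 << 8, 67 << 8, 1 << 8) {
	my $st = decode_status($status);
	printf "exit=%d sysexits-range=%s\n", $st->{code},
		in_sysexits_range($st->{code}) ? 'yes' : 'no';
}
```

A propagated curl exit code of 67 would be indistinguishable from
EX_NOUSER to a caller that interprets sysexits.h values.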

^ permalink raw reply	[relevance 71%]

* Re: [PATCH 7/9] lei q: fix JSON overview with remote externals
  2021-01-24 11:46 54% ` [PATCH 7/9] lei q: fix JSON overview with remote externals Eric Wong
@ 2021-01-24 12:37 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-24 12:37 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> --- a/lib/PublicInbox/LeiOverview.pm
> +++ b/lib/PublicInbox/LeiOverview.pm
> @@ -209,7 +209,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
>  		$json->ascii(1) if $lei->{opt}->{ascii};
>  	}
>  	my $l2m = $lei->{l2m};
> -	if ($l2m && $ibxish->can('scheme')) { # remote https?:// mboxrd
> +	if ($l2m && !$ibxish) { # remote https?:// mboxrd

I made this change last, thus necessitating changes to callers:

> --- a/lib/PublicInbox/LeiXSearch.pm
> +++ b/lib/PublicInbox/LeiXSearch.pm

> @@ -208,9 +201,9 @@ sub query_remote_mboxrd {
>  	}
>  	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
>  	my $tor = $opt->{torsocks} //= 'auto';
> +	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uris->[0]);

Will squash this in, otherwise --remote --no-local won't work:

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 841257c1..fb608d00 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -201,7 +201,7 @@ sub query_remote_mboxrd {
 	}
 	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
 	my $tor = $opt->{torsocks} //= 'auto';
-	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $uris->[0]);
+	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	for my $uri (@$uris) {
 		$uri->query_form(@qform);
 		my $cmd = [ @cmd, $uri->as_string ];

^ permalink raw reply related	[relevance 71%]

* Re: [PATCH 02/10] lei: support remote externals
  2021-01-24 12:02 70%     ` Eric Wong
  2021-01-24 12:12 71%       ` Eric Wong
@ 2021-01-24 22:11 71%       ` Kyle Meyer
  2021-01-25 18:37 71%         ` Eric Wong
  1 sibling, 1 reply; 200+ results
From: Kyle Meyer @ 2021-01-24 22:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Btw, since you seem to be figuring things out without existing
> docs, could I convince you to start manpages for lei?

Sure, happy for a way to contribute.  I'm currently a bit behind with
some other volunteer work, but I should be able to carve out time for
this next weekend.

^ permalink raw reply	[relevance 71%]

* [PATCH 0/5] lei: more fixes and usability enhancement
@ 2021-01-25  1:18 71% Eric Wong
  2021-01-25  1:18 55% ` [PATCH 1/5] lei: reinstate JSON smsg output deduplication Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 200+ results
From: Eric Wong @ 2021-01-25  1:18 UTC (permalink / raw)
  To: meta

cccuuurrrlll wwwiiilll nnnooo lllooonnngggeeerrr
ooouuutttpppuuuttt llliiikkkeee ttthhhiiisss

Eric Wong (5):
  lei: reinstate JSON smsg output deduplication
  lei q: drop "oid" output format
  lei q: demangle and quiet curl output
  lei q: reject remotes early if curl(1) is missing
  lei q: continue remote search if torsocks(1) is missing

 lib/PublicInbox/LEI.pm         |  3 +-
 lib/PublicInbox/LeiOverview.pm | 11 +++---
 lib/PublicInbox/LeiXSearch.pm  | 70 ++++++++++++++++++++++++++--------
 t/lei.t                        | 21 ++++++++--
 4 files changed, 79 insertions(+), 26 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 2/5] lei q: drop "oid" output format
  2021-01-25  1:18 71% [PATCH 0/5] lei: more fixes and usability enhancement Eric Wong
  2021-01-25  1:18 55% ` [PATCH 1/5] lei: reinstate JSON smsg output deduplication Eric Wong
@ 2021-01-25  1:18 69% ` Eric Wong
  2021-01-25  1:18 55% ` [PATCH 3/5] lei q: demangle and quiet curl output Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25  1:18 UTC (permalink / raw)
  To: meta

The default deduplication command-line arguments would be
nonsensical for such an option and probably confusing.  It
doesn't seem worth the code to support OID-only output when it's
easy enough to use one of the JSON formats to extract the same
info.

We also don't have OIDs if using remotes, and the
to-be-implemented memoization will be optional.
---
 lib/PublicInbox/LEI.pm         | 3 ++-
 lib/PublicInbox/LeiOverview.pm | 4 ----
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 378113e8..09eac58c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -200,7 +200,8 @@ my %OPTDESC = (
 			'message/object output format' ],
 'mark	format|f=s' => $stdin_formats,
 'forget	format|f=s' => $stdin_formats,
-'q	format|f=s' => [ 'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|oid|json',
+'q	format|f=s' => [
+	'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|json|jsonl|concatjson',
 		'specify output format, default depends on --output'],
 'ls-query	format|f=s' => $ls_format,
 'ls-external	format|f=s' => $ls_format,
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 880c7acc..ea35871c 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -286,10 +286,6 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 				$buf = '';
 			}
 		}
-	} elsif ($self->{fmt} eq 'oid') {
-		sub {
-			my ($smsg, $mitem) = @_;
-		}
 	} # else { ...
 }
 

^ permalink raw reply related	[relevance 69%]

* [PATCH 4/5] lei q: reject remotes early if curl(1) is missing
  2021-01-25  1:18 71% [PATCH 0/5] lei: more fixes and usability enhancement Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-25  1:18 55% ` [PATCH 3/5] lei q: demangle and quiet curl output Eric Wong
@ 2021-01-25  1:18 65% ` Eric Wong
  2021-01-25  1:18 71% ` [PATCH 5/5] lei q: continue remote search if torsocks(1) " Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25  1:18 UTC (permalink / raw)
  To: meta

This ought to provide a better experience for users who
attempt to use remote externals but don't have curl
installed.

We can avoid repeating PATH search in every worker here, too.
---
 lib/PublicInbox/LeiXSearch.pm | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 68be8ada..369f6f89 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -14,7 +14,7 @@ use PublicInbox::Import;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
 use PublicInbox::Search qw(xap_terms);
-use PublicInbox::Spawn qw(popen_rd spawn);
+use PublicInbox::Spawn qw(popen_rd spawn which);
 use PublicInbox::MID qw(mids);
 use Fcntl qw(SEEK_SET F_SETFL O_APPEND O_RDWR);
 
@@ -192,7 +192,7 @@ sub query_remote_mboxrd {
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
-	my @cmd = (qw(curl -sSf -d), '');
+	my @cmd = ($self->{curl}, qw(-sSf -d), '');
 	my $verbose = $opt->{verbose};
 	my $reap;
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
@@ -411,13 +411,22 @@ sub ipc_atfork_prepare {
 	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
 }
 
+sub add_uri {
+	my ($self, $uri) = @_;
+	if (my $curl = $self->{curl} //= which('curl') // 0) {
+		push @{$self->{remotes}}, $uri;
+	} else {
+		warn "curl missing, ignoring $uri\n";
+	}
+}
+
 sub prepare_external {
 	my ($self, $loc, $boost) = @_; # n.b. already ordered by boost
 	if (ref $loc) { # already a URI, or PublicInbox::Inbox-like object
-		return push(@{$self->{remotes}}, $loc) if $loc->can('scheme');
+		return add_uri($self, $loc) if $loc->can('scheme');
 	} elsif ($loc =~ m!\Ahttps?://!) {
 		require URI;
-		return push(@{$self->{remotes}}, URI->new($loc));
+		return add_uri($self, URI->new($loc));
 	} elsif (-f "$loc/ei.lock") {
 		require PublicInbox::ExtSearch;
 		$loc = PublicInbox::ExtSearch->new($loc);

^ permalink raw reply related	[relevance 65%]

* [PATCH 3/5] lei q: demangle and quiet curl output
  2021-01-25  1:18 71% [PATCH 0/5] lei: more fixes and usability enhancement Eric Wong
  2021-01-25  1:18 55% ` [PATCH 1/5] lei: reinstate JSON smsg output deduplication Eric Wong
  2021-01-25  1:18 69% ` [PATCH 2/5] lei q: drop "oid" output format Eric Wong
@ 2021-01-25  1:18 55% ` Eric Wong
  2021-01-25  1:18 65% ` [PATCH 4/5] lei q: reject remotes early if curl(1) is missing Eric Wong
  2021-01-25  1:18 71% ` [PATCH 5/5] lei q: continue remote search if torsocks(1) " Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25  1:18 UTC (permalink / raw)
  To: meta

curl(1) writes to stderr one byte-at-a-time (presumably for the
progress bar).  This ends up being unreadable on my terminal
when parallel processes are trying to write error messages.

So instead, we'll capture the output to a file and run
'tail -f' on it if --verbose is enabled.

Since HTTP 404s from non-existent results are a common response,
we'll ignore them and stay silent, matching behavior of local
searches.
---
 lib/PublicInbox/LeiXSearch.pm | 45 ++++++++++++++++++++++++++---------
 t/lei.t                       |  2 +-
 2 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index fb608d00..68be8ada 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -14,8 +14,9 @@ use PublicInbox::Import;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
 use PublicInbox::Search qw(xap_terms);
-use PublicInbox::Spawn qw(popen_rd);
+use PublicInbox::Spawn qw(popen_rd spawn);
 use PublicInbox::MID qw(mids);
+use Fcntl qw(SEEK_SET F_SETFL O_APPEND O_RDWR);
 
 sub new {
 	my ($class) = @_;
@@ -176,6 +177,13 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$each_smsg->($smsg, undef, $eml);
 }
 
+# PublicInbox::OnDestroy callback
+sub kill_reap {
+	my ($pid) = @_;
+	kill('KILL', $pid); # spawn() blocks other signals
+	waitpid($pid, 0);
+}
+
 sub query_remote_mboxrd {
 	my ($self, $lei, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
@@ -186,7 +194,20 @@ sub query_remote_mboxrd {
 	push(@qform, t => 1) if $opt->{thread};
 	my @cmd = (qw(curl -sSf -d), '');
 	my $verbose = $opt->{verbose};
-	push @cmd, '-v' if $verbose;
+	my $reap;
+	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
+	fcntl($cerr, F_SETFL, O_APPEND|O_RDWR) or warn "set O_APPEND: $!";
+	my $rdr = { 2 => $cerr };
+	my $coff = 0;
+	if ($verbose) {
+		# spawn a process to force line-buffering, otherwise curl
+		# will write 1 character at-a-time and parallel outputs
+		# mmmaaayyy llloookkk llliiikkkeee ttthhhiiisss
+		push @cmd, '-v';
+		my $o = { 1 => $lei->{2}, 2 => $lei->{2} };
+		my $pid = spawn(['tail', '-f', $cerr->filename], undef, $o);
+		$reap = PublicInbox::OnDestroy->new(\&kill_reap, $pid);
+	}
 	for my $o ($lei->curl_opt) {
 		$o =~ s/\|[a-z0-9]\b//i; # remove single char short option
 		if ($o =~ s/=[is]@\z//) {
@@ -213,21 +234,23 @@ sub query_remote_mboxrd {
 		}
 		$lei->err("# @$cmd") if $verbose;
 		$? = 0;
-		my $fh = popen_rd($cmd, $env, { 2 => $lei->{2} });
+		my $fh = popen_rd($cmd, $env, $rdr);
 		$fh = IO::Uncompress::Gunzip->new($fh);
 		eval {
 			PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
 							$lei, $each_smsg);
 		};
 		return $lei->fail("E: @$cmd: $@") if $@;
-		if (($? >> 8) == 22) { # HTTP 404 from curl(1)
-			$uri->query_form(q => $lei->{mset_opt}->{qstr});
-			$lei->err('# no results from '.$uri->as_string);
-		} elsif ($?) {
-			$uri->query_form(q => $lei->{mset_opt}->{qstr});
-			$lei->err('E: '.$uri->as_string);
-			$lei->child_error($?);
-		}
+		next unless $?;
+		seek($cerr, $coff, SEEK_SET) or warn "seek(curl stderr): $!\n";
+		my $e = do { local $/; <$cerr> } //
+				die "read(curl stderr): $!\n";
+		$coff += length($e);
+		next if (($? >> 8) == 22 && $e =~ /\b404\b/);
+		$lei->child_error($?);
+		$uri->query_form(q => $lei->{mset_opt}->{qstr});
+		# --verbose already showed the error via tail(1)
+		$lei->err("E: $uri \$?=$?\n", $verbose ? () : $e);
 	}
 	undef $each_smsg;
 	$lei->{ovv}->ovv_atexit_child($lei);
diff --git a/t/lei.t b/t/lei.t
index f826a966..69338257 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -179,7 +179,7 @@ SKIP: {
 	my $res = $json->decode($out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
 	ok($lei->('q', "m:$mid", 'd:..20101002'), 'no results, no error');
-	like($err, qr/404/, 'noted 404');
+	is($err, '', 'no output on 404, matching local FS behavior');
 	is($out, "[null]\n", 'got null results');
 	$lei->('forget-external', $url);
 } # /SKIP
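
The $coff bookkeeping above (seek to the last offset, slurp whatever
was appended, advance the offset) can be sketched on its own.  This
is an illustration under assumptions: `read_new` is a hypothetical
helper name, and a second append-mode handle stands in for curl
writing to its stderr.

```perl
use strict;
use warnings;
use File::Temp ();
use IO::Handle ();
use Fcntl qw(SEEK_SET);

# Hypothetical helper: return whatever was appended to $fh since $$off,
# advancing $$off past it (mirrors the $coff bookkeeping in the patch)
sub read_new {
	my ($fh, $off) = @_;
	seek($fh, $$off, SEEK_SET) or die "seek: $!";
	my $buf = do { local $/; <$fh> } // '';
	$$off += length($buf);
	$buf;
}

my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
# stand-in for curl: a second handle appending to the same file
open(my $wr, '>>', $cerr->filename) or die "open: $!";
$wr->autoflush(1);

my $off = 0;
print $wr "first failure\n";
my $first = read_new($cerr, \$off);
print $wr "second failure\n";
my $second = read_new($cerr, \$off);
print "1: $first", "2: $second";
```

Each call returns only the text written since the previous call, so
the error from one URI is not repeated when a later URI fails.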

^ permalink raw reply related	[relevance 55%]

* [PATCH 1/5] lei: reinstate JSON smsg output deduplication
  2021-01-25  1:18 71% [PATCH 0/5] lei: more fixes and usability enhancement Eric Wong
@ 2021-01-25  1:18 55% ` Eric Wong
  2021-01-25  1:18 69% ` [PATCH 2/5] lei q: drop "oid" output format Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25  1:18 UTC (permalink / raw)
  To: meta

This was accidentally clobbered completely in
("lei q: fix JSON overview with remote externals").
There are now more tests to prevent future regressions.
---
 lib/PublicInbox/LeiOverview.pm |  7 ++++++-
 t/lei.t                        | 19 ++++++++++++++++---
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 928d66cb..880c7acc 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -203,12 +203,14 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 	my ($self, $lei, $ibxish) = @_;
 	my $json;
 	$lei->{1}->autoflush(1);
+	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
 	if (my $pkg = $self->{json}) {
 		$json = $pkg->new;
 		$json->utf8->canonical;
 		$json->ascii(1) if $lei->{opt}->{ascii};
+		$lei->{ovv_buf} = \(my $buf = '');
 	}
-	my $l2m = $lei->{l2m};
+	my $l2m = $lei->{l2m} or $dedupe->prepare_dedupe;
 	if ($l2m && !$ibxish) { # remote https?:// mboxrd
 		delete $l2m->{-wq_s1};
 		my $g2m = $l2m->can('git_to_mail');
@@ -241,6 +243,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git; # for ovv_atexit_child
 		my $g2m = $l2m->can('git_to_mail');
+		$dedupe->prepare_dedupe;
 		sub {
 			my ($smsg, $mitem) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
@@ -251,6 +254,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		$lei->{ovv_buf} = \(my $buf = '');
 		sub { # DIY prettiness :P
 			my ($smsg, $mitem) = @_;
+			return if $dedupe->is_smsg_dup($smsg);
 			$smsg = _unbless_smsg($smsg, $mitem);
 			$buf .= "{\n";
 			$buf .= join(",\n", map {
@@ -274,6 +278,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		$lei->{ovv_buf} = \(my $buf = '');
 		sub {
 			my ($smsg, $mitem) = @_;
+			return if $dedupe->is_smsg_dup($smsg);
 			$buf .= $json->encode(_unbless_smsg(@_)) . $ORS;
 			if (length($buf) > 65536) {
 				my $lk = $self->lock_for_scope;
diff --git a/t/lei.t b/t/lei.t
index 3fd1d1fe..f826a966 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -17,6 +17,7 @@ my $err_filter;
 my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
 	http://czquwvybam4bgbro.onion/meta/
 	http://ou63pmih66umazou.onion/meta/);
+my $json = ref(PublicInbox::Config->json)->new->utf8->canonical;
 my $lei = sub {
 	my ($cmd, $env, $xopt) = @_;
 	$out = $err = '';
@@ -142,8 +143,7 @@ my $setup_publicinboxes = sub {
 		my ($ibx) = @_;
 		my $im = PublicInbox::InboxWritable->new($ibx)->importer(0);
 		my $V = $ibx->version;
-		my @eml = glob('t/*.eml');
-		push(@eml, 't/data/0001.patch') if $V == 2;
+		my @eml = (glob('t/*.eml'), 't/data/0001.patch');
 		for (@eml) {
 			next if $_ eq 't/psgi_v2-old.eml'; # dup mid
 			$im->add(eml_load($_)) or BAIL_OUT "v$V add $_";
@@ -176,7 +176,7 @@ SKIP: {
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
 	ok($lei->('q', "m:$mid"), "query $url");
 	is($err, '', "no errors on $url");
-	my $res = PublicInbox::Config->json->decode($out);
+	my $res = $json->decode($out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
 	ok($lei->('q', "m:$mid", 'd:..20101002'), 'no results, no error');
 	like($err, qr/404/, 'noted 404');
@@ -246,6 +246,19 @@ my $test_external = sub {
 	# No double-quoting should be imposed on users on the CLI
 	$lei->('q', 's:use boolean prefix');
 	like($out, qr/search: use boolean prefix/, 'phrase search got result');
+	my $res = $json->decode($out);
+	is(scalar(@$res), 2, 'only 2 element array (1 result)');
+	is($res->[1], undef, 'final element is undef'); # XXX should this be?
+	is(ref($res->[0]), 'HASH', 'first element is hashref');
+	$lei->('q', '--pretty', 's:use boolean prefix');
+	my $pretty = $json->decode($out);
+	is_deeply($res, $pretty, '--pretty is identical after decode');
+
+	for my $fmt (qw(ldjson ndjson jsonl)) {
+		$lei->('q', '-f', $fmt, 's:use boolean prefix');
+		is($out, $json->encode($pretty->[0])."\n", "-f $fmt");
+	}
+
 	require IO::Uncompress::Gunzip;
 	for my $sfx ('', '.gz') {
 		my $f = "$home/mbox$sfx";

^ permalink raw reply related	[relevance 55%]

* [PATCH 5/5] lei q: continue remote search if torsocks(1) is missing
  2021-01-25  1:18 71% [PATCH 0/5] lei: more fixes and usability enhancement Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-25  1:18 65% ` [PATCH 4/5] lei q: reject remotes early if curl(1) is missing Eric Wong
@ 2021-01-25  1:18 71% ` Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25  1:18 UTC (permalink / raw)
  To: meta

torsocks is just one of many ways to get curl to use Tor,
so we'll continue if we can't find torsocks in our PATH
and assume the user has a proxy configured via curlrc,
the command-line, environment variable, or even firewall
rules.
---
 lib/PublicInbox/LeiXSearch.pm | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 369f6f89..b470c113 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -228,10 +228,16 @@ sub query_remote_mboxrd {
 		my $cmd = [ @cmd, $uri->as_string ];
 		if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
 				(($env->{LD_PRELOAD}//'') !~ /torsocks/)) {
-			unshift @$cmd, 'torsocks';
+			unshift @$cmd, which('torsocks');
 		} elsif (PublicInbox::Config::git_bool($tor)) {
-			unshift @$cmd, 'torsocks';
+			unshift @$cmd, which('torsocks');
 		}
+
+		# continue anyways if torsocks is missing; a proxy may be
+		# specified via CLI, curlrc, environment variable, or even
+		# firewall rule
+		shift(@$cmd) if !$cmd->[0];
+
 		$lei->err("# @$cmd") if $verbose;
 		$? = 0;
 		my $fh = popen_rd($cmd, $env, $rdr);

^ permalink raw reply related	[relevance 71%]

* [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution
  @ 2021-01-25  6:41 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25  6:41 UTC (permalink / raw)
  To: meta

The default stat() lacks subsecond granularity and may
lead to config updates being ignored.
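
To illustrate (a sketch only; `stat_key' is a hypothetical helper
which mimics the pack('dd', ctime, size) cache key used in _lei_cfg),
the two stat implementations can be compared by passing either one
in as a code reference:

```perl
use strict;
use warnings;
use Time::HiRes (); # do not override stat() here, so both can be compared

# hypothetical helper: build a cache key from ctime and size
sub stat_key {
	my ($f, $stat) = @_;
	my @st = $stat->($f) or return ''; # missing file => empty key
	pack('dd', $st[10], $st[7]); # 10: ctime, 7: size
}

# key with whole-second ctime (builtin stat)
sub core_key { stat_key($_[0], sub { stat($_[0]) }) }

# key with fractional ctime, where the platform and FS provide it
sub hires_key { stat_key($_[0], \&Time::HiRes::stat) }
```

Two same-sized writes within one second may yield the same core_key()
but different hires_key() values, assuming the filesystem stores
subsecond timestamps.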
---
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 09eac58c..effc6c52 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -24,6 +24,7 @@ use PublicInbox::DS qw(now dwaitpid);
 use PublicInbox::Spawn qw(spawn popen_rd);
 use PublicInbox::OnDestroy;
 use Text::Wrap qw(wrap);
+use Time::HiRes qw(stat); # ctime comparisons for config cache
 use File::Path qw(mkpath);
 use File::Spec;
 our $quit = \&CORE::exit;

^ permalink raw reply related	[relevance 71%]

* RFC: lei q --include/-I and similar switch names
@ 2021-01-25  7:33 61% Eric Wong
  2021-01-27  2:04 71% ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-25  7:33 UTC (permalink / raw)
  To: meta

"add-external" sometimes feels like an unnecessary burden
for a one-off search, so it'd be nice to be able to search
an external once, or exclude certain externals.

I'm set on supporting "-I$DIR_OR_URL" since that's common
command-line usage for gcc/clang/tcc, perl, ruby to include
extra search paths for headers/modules.

--exclude is naturally the opposite of --include, and I don't
know if --exclude needs a short name.  "-v" (like "grep -v")
isn't available, but maybe "-X" works, since it's something we
can't pass from the CLI to curl.  We're not passing "-x" to
curl, either, but we pass "--proxy" through, of course.

I don't know if "--only" is a good name and don't know of any
common tools with similar functionality (but I know very little
in general :x).  "--exclusive" would be confusing with
"--exclude" and require more typing even with tab-completion.
If we use "--only", then we can use -O for --only since we can't
forward -O to curl, either...

Anyways here's the proposed Getopt::Long spec changes to think
about:

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index effc6c52..a8f06e2c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -85,6 +85,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty mua-cmd=s
+	include|I=s@ only|O=s@ exclude=s@
 	torsocks=s no-torsocks verbose|v since|after=s until|before=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 

The trailing '@' means it's an array and can be specified more
than once, 's' means it's a string ('i' for integers), and '=' means
the arg is required (':' denotes optional args).
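
A minimal sketch of how such specs behave with stock Getopt::Long
(default configuration; `parse_q' is a hypothetical helper and only a
small subset of the switches is shown):

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: parse a subset of the proposed `lei q' switches
sub parse_q {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt,
		'include|I=s@',	# string, repeatable, arg required
		'only|O=s@',
		'exclude=s@',	# no short name (yet)
		'limit|n=i')	# integer, arg required
		or die "bad switches\n";
	(\%opt, \@argv); # remaining @argv holds the search terms
}
```

With this, `parse_q(qw(-I /a --include /b s:foo))' should leave both
paths in $opt->{include} and `s:foo' in the remaining arguments.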

And I'll probably split date-time search switches so users can
choose either the easily-faked Date: header or difficult-to-fake
most-recent Received header.  Also, date-time searches will use
git's date parser, and not force users to use YYYYMMDD or similar.

^ permalink raw reply related	[relevance 61%]

* Re: [PATCH 02/10] lei: support remote externals
  2021-01-24 22:11 71%       ` Kyle Meyer
@ 2021-01-25 18:37 71%         ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-25 18:37 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> 
> > Btw, since you seem to be figuring things out without existing
> > docs, could I convince you to start manpages for lei?
> 
> Sure, happy for a way to contribute.  I'm currently a bit behind with
> some other volunteer work, but I should be able to carve out time for
> this next weekend.

No worries and thanks in advance!

I think most of the stuff implemented for lei is stable so far,
but more features will appear :)  And please let us know if
there's anything that's too surprising or bad.

^ permalink raw reply	[relevance 71%]

* Re: RFC: lei q --include/-I and similar switch names
  2021-01-25  7:33 61% RFC: lei q --include/-I and similar switch names Eric Wong
@ 2021-01-27  2:04 71% ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-01-27  2:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> "add-external" sometimes feels like an unnecessary burden
> for a one-off search, so it'd be nice to be able to search
> an external once, or exclude certain externals.
>
> I'm set on supporting "-I$DIR_OR_URL" since that's common
> command-line usage for gcc/clang/tcc, perl, ruby to include
> extra search paths for headers/modules.

Sounds very useful.

> --exclude is naturally the opposite of --include, and I don't
> know if --exclude needs a short name.  "-v" (like "grep -v")
> isn't available, but maybe "-X" works, since it's something we
> can't pass from the CLI to curl.  We're not passing "-x" to
> curl, either, but we pass "--proxy" through, of course.

My two cents: I think it'd be okay to leave --exclude without a short
name...

> I don't know if "--only" is a good name and don't know of any
> common tools with similar functionality (but I know very little
> in general :x).  "--exclusive" would be confusing with
> "--exclude" and require more typing even with tab-completion.
> If we use "--only", then we can use -O for --only since we can't
> forward -O to curl, either...

... and --only/-O sounds good to me.

^ permalink raw reply	[relevance 71%]

* [PATCH 0/9] lei completion, some small updates
@ 2021-01-27  9:42 71% Eric Wong
  2021-01-27  9:42 71% ` [PATCH 2/9] lei: drop "git" command forwarding Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 200+ results
From: Eric Wong @ 2021-01-27  9:42 UTC (permalink / raw)
  To: meta

6/9 for bash completion brings us closer to very DRY
help + completion internals.

A couple of small other things while working on
something bigger...

Eric Wong (9):
  eml: favor index() over regexp match
  lei: drop "git" command forwarding
  lei: fix comment regarding client payload
  lei: ensure PWD is set correctly for path expansion
  gcf2: rely on Perl 5.10 to avoid needless ++
  lei: complete option switch args
  lei_overview: clear redundant ovv_buf definition
  v2writable: nproc: use sysconf() on Linux and FreeBSD
  lei: dclose: fix typo

 lib/PublicInbox/Eml.pm         |   2 +-
 lib/PublicInbox/Gcf2.pm        |   9 +--
 lib/PublicInbox/LEI.pm         | 108 ++++++++++++++++++++-------------
 lib/PublicInbox/LeiOverview.pm |   1 -
 lib/PublicInbox/V2Writable.pm  |   6 ++
 t/lei.t                        |  46 ++++++++++++++
 6 files changed, 123 insertions(+), 49 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 2/9] lei: drop "git" command forwarding
  2021-01-27  9:42 71% [PATCH 0/9] lei completion, some small updates Eric Wong
@ 2021-01-27  9:42 71% ` Eric Wong
  2021-01-27  9:42 71% ` [PATCH 3/9] lei: fix comment regarding client payload Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-27  9:42 UTC (permalink / raw)
  To: meta

It was intended as a proof-of-concept and is no longer needed.
---
 lib/PublicInbox/LEI.pm | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index effc6c52..abd7fc48 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -679,18 +679,6 @@ sub lei__complete {
 	# proto parsing.
 }
 
-sub reap_exec { # dwaitpid callback
-	my ($self, $pid) = @_;
-	x_it($self, $?);
-}
-
-sub lei_git { # support passing through random git commands
-	my ($self, @argv) = @_;
-	my %rdr = map { $_ => $self->{$_} } (0..2);
-	my $pid = spawn(['git', @argv], $self->{env}, \%rdr);
-	dwaitpid($pid, \&reap_exec, $self);
-}
-
 sub exec_buf ($$) {
 	my ($argv, $env) = @_;
 	my $argc = scalar @$argv;

^ permalink raw reply related	[relevance 71%]

* [PATCH 3/9] lei: fix comment regarding client payload
  2021-01-27  9:42 71% [PATCH 0/9] lei completion, some small updates Eric Wong
  2021-01-27  9:42 71% ` [PATCH 2/9] lei: drop "git" command forwarding Eric Wong
@ 2021-01-27  9:42 71% ` Eric Wong
  2021-01-27  9:42 52% ` [PATCH 4/9] lei: set PWD correctly for path expansion Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-27  9:42 UTC (permalink / raw)
  To: meta

The client PID is no longer sent to the daemon.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index abd7fc48..c017fd4e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -765,7 +765,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 	}
 	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	# $ENV_STR = join('', map { "\0$_=$ENV{$_}" } keys %ENV);
-	# $buf = "$$\0$argc\0".join("\0", @ARGV).$ENV_STR."\0\0";
+	# $buf = "$argc\0".join("\0", @ARGV).$ENV_STR."\0\0";
 	substr($buf, -2, 2, '') eq "\0\0" or  # s/\0\0\z//
 		return send($sock, 'request command truncated', MSG_EOR);
 	my ($argc, @argv) = split(/\0/, $buf, -1);

^ permalink raw reply related	[relevance 71%]

* [PATCH 4/9] lei: set PWD correctly for path expansion
  2021-01-27  9:42 71% [PATCH 0/9] lei completion, some small updates Eric Wong
  2021-01-27  9:42 71% ` [PATCH 2/9] lei: drop "git" command forwarding Eric Wong
  2021-01-27  9:42 71% ` [PATCH 3/9] lei: fix comment regarding client payload Eric Wong
@ 2021-01-27  9:42 52% ` Eric Wong
  2021-01-27  9:42 45% ` [PATCH 6/9] lei: complete option switch args Eric Wong
  2021-01-27  9:42 71% ` [PATCH 9/9] lei: dclose: fix typo Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-27  9:42 UTC (permalink / raw)
  To: meta

While commit d1b9582872d1824f166a038dcf32b6ae8c6dc735
("lei: pass FD to CWD via cmsg, use fchdir on server")
ensured things work properly to get the daemon in the
right directory, it forgot to deal with places where
we expand relative paths based on the current working
directory.
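
The idea, as a simplified sketch (the `trusted_pwd' and `rel2abs_pwd'
helpers are hypothetical names and do not handle the daemon's passed
CWD FD): PWD is only trusted if it refers to the same st_dev + st_ino
as the real working directory, otherwise getcwd() is used:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# hypothetical helper: return a PWD which is known to match the CWD
sub trusted_pwd {
	my ($env) = @_;
	my $cwd = getcwd() // die "getcwd: $!";
	my $pwd = $env->{PWD};
	if (defined $pwd) {
		my @st_pwd = stat($pwd); # empty if PWD is invalid
		my @st_cwd = stat($cwd) or die "stat($cwd): $!";
		# same device + inode: keep the (symlink-aware) PWD
		return $pwd if @st_pwd && "@st_pwd[0,1]" eq "@st_cwd[0,1]";
	}
	$env->{PWD} = $cwd; # stale or missing PWD: fall back to getcwd()
}

# hypothetical helper: expand $p relative to the trusted PWD
sub rel2abs_pwd {
	my ($env, $p) = @_;
	File::Spec->rel2abs($p, trusted_pwd($env));
}
```

A stale PWD inherited from the client environment is then replaced
rather than used as the base for relative paths.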
---
 lib/PublicInbox/LEI.pm | 56 ++++++++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index c017fd4e..0ce6a00b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -13,6 +13,7 @@ use parent qw(PublicInbox::DS PublicInbox::LeiExternal
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
 use Errno qw(EAGAIN EINTR ECONNREFUSED ENOENT ECONNRESET);
+use Cwd qw(getcwd);
 use POSIX ();
 use IO::Handle ();
 use Fcntl qw(SEEK_SET);
@@ -65,18 +66,37 @@ sub opt_dash ($$) {
 	($spec, '<>' => $cb, $GLP_PASS) # for Getopt::Long
 }
 
+sub rel2abs ($$) {
+	my ($self, $p) = @_;
+	return $p if index($p, '/') == 0; # already absolute
+	my $pwd = $self->{env}->{PWD};
+	if (defined $pwd) {
+		my $cwd = $self->{3} // getcwd() // die "getcwd(PWD=$pwd): $!";
+		if (my @st_pwd = stat($pwd)) {
+			my @st_cwd = stat($cwd) or die "stat($cwd): $!";
+			"@st_pwd[1,0]" eq "@st_cwd[1,0]" or
+				$self->{env}->{PWD} = $pwd = $cwd;
+		} else { # PWD was invalid
+			delete $self->{env}->{PWD};
+			undef $pwd;
+		}
+	}
+	$pwd //= $self->{env}->{PWD} = getcwd() // die "getcwd(PWD=$pwd): $!";
+	File::Spec->rel2abs($p, $pwd);
+}
+
 sub _store_path ($) {
-	my ($env) = @_;
-	File::Spec->rel2abs(($env->{XDG_DATA_HOME} //
-		($env->{HOME} // '/nonexistent').'/.local/share')
-		.'/lei/store', $env->{PWD});
+	my ($self) = @_;
+	rel2abs($self, ($self->{env}->{XDG_DATA_HOME} //
+		($self->{env}->{HOME} // '/nonexistent').'/.local/share')
+		.'/lei/store');
 }
 
 sub _config_path ($) {
-	my ($env) = @_;
-	File::Spec->rel2abs(($env->{XDG_CONFIG_HOME} //
-		($env->{HOME} // '/nonexistent').'/.config')
-		.'/lei/config', $env->{PWD});
+	my ($self) = @_;
+	rel2abs($self, ($self->{env}->{XDG_CONFIG_HOME} //
+		($self->{env}->{HOME} // '/nonexistent').'/.config')
+		.'/lei/config');
 }
 
 # TODO: generate shell completion + help using %CMD and %OPTDESC
@@ -295,7 +315,7 @@ sub atfork_prepare_wq {
 	my ($self, $wq) = @_;
 	my $tcafc = $wq->{-ipc_atfork_child_close} //= [ $listener // () ];
 	if (my $sock = $self->{sock}) {
-		push @$tcafc, @$self{qw(0 1 2)}, $sock;
+		push @$tcafc, @$self{qw(0 1 2 3)}, $sock;
 	}
 	if (my $pgr = $self->{pgr}) {
 		push @$tcafc, @$pgr[1,2];
@@ -345,7 +365,7 @@ sub atfork_parent_wq {
 		$ret->{dedupe} = $wq->deep_clone($dedupe);
 	}
 	$self->{env} = $env;
-	delete @$ret{qw(-lei_store cfg old_1 pgr lxs)}; # keep l2m
+	delete @$ret{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
 	my @io = delete @$ret{0..2};
 	$io[3] = delete($ret->{sock}) // $io[2];
 	my $l2m = $ret->{l2m};
@@ -362,7 +382,7 @@ sub _help ($;$) {
 	my @info = @{$CMD{$cmd} // [ '...', '...' ]};
 	my @top = ($cmd, shift(@info) // ());
 	my $cmd_desc = shift(@info);
-	$cmd_desc = $cmd_desc->($self->{env}) if ref($cmd_desc) eq 'CODE';
+	$cmd_desc = $cmd_desc->($self) if ref($cmd_desc) eq 'CODE';
 	my @opt_desc;
 	my $lpad = 2;
 	for my $sw (grep { !ref } @info) { # ("prio=s", "z", $GLP_PASS)
@@ -520,7 +540,7 @@ sub dispatch {
 
 sub _lei_cfg ($;$) {
 	my ($self, $creat) = @_;
-	my $f = _config_path($self->{env});
+	my $f = _config_path($self);
 	my @st = stat($f);
 	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
 	if (my $cfg = $PATH2CFG{$f}) { # reuse existing object in common case
@@ -550,8 +570,7 @@ sub _lei_store ($;$) {
 	$cfg->{-lei_store} //= do {
 		require PublicInbox::LeiStore;
 		my $dir = $cfg->{'leistore.dir'};
-		$dir //= _store_path($self->{env}) if $creat;
-		return unless $dir;
+		$dir //= $creat ? _store_path($self) : return;
 		PublicInbox::LeiStore->new($dir, { creat => $creat });
 	};
 }
@@ -587,9 +606,8 @@ sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
 	my $cur = $cfg->{'leistore.dir'};
-	my $env = $self->{env};
-	$dir //= _store_path($env);
-	$dir = File::Spec->rel2abs($dir, $env->{PWD}); # PWD is symlink-aware
+	$dir //= _store_path($self);
+	$dir = rel2abs($self, $dir);
 	my @cur = stat($cur) if defined($cur);
 	$cur = File::Spec->canonpath($cur // $dir);
 	my @dir = stat($dir);
@@ -601,7 +619,7 @@ sub lei_init {
 		}
 
 		# some folks like symlinks and bind mounts :P
-		if (@dir && "$cur[0] $cur[1]" eq "$dir[0] $dir[1]") {
+		if (@dir && "@cur[1,0]" eq "@dir[1,0]") {
 			lei_config($self, 'leistore.dir', $dir);
 			_lei_store($self, 1)->done;
 			return qerr($self, "$exists (as $cur)");
@@ -771,7 +789,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 	my ($argc, @argv) = split(/\0/, $buf, -1);
 	undef $buf;
 	my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
-	if (chdir(delete($self->{3}))) {
+	if (chdir($self->{3})) {
 		local %ENV = %env;
 		$self->{env} = \%env;
 		eval { dispatch($self, @argv) };

^ permalink raw reply related	[relevance 52%]

* [PATCH 6/9] lei: complete option switch args
  2021-01-27  9:42 71% [PATCH 0/9] lei completion, some small updates Eric Wong
                   ` (2 preceding siblings ...)
  2021-01-27  9:42 52% ` [PATCH 4/9] lei: set PWD correctly for path expansion Eric Wong
@ 2021-01-27  9:42 45% ` Eric Wong
  2021-01-27  9:42 71% ` [PATCH 9/9] lei: dclose: fix typo Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-27  9:42 UTC (permalink / raw)
  To: meta

Also add tests for existing completion cases.
---
 lib/PublicInbox/LEI.pm | 36 ++++++++++++++++++++++++---------
 t/lei.t                | 46 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0ce6a00b..d5d9cf1f 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -104,8 +104,9 @@ sub _config_path ($) {
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
-	sort|s=s reverse|r offset=i remote! local! external! pretty mua-cmd=s
-	torsocks=s no-torsocks verbose|v since|after=s until|before=s),
+	sort|s=s reverse|r offset=i remote! local! external! pretty
+	mua-cmd|mua=s no-torsocks torsocks=s verbose|v
+	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
@@ -200,7 +201,11 @@ my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 my %OPTDESC = (
 'help|h' => 'show this built-in help',
 'quiet|q' => 'be quiet',
+'verbose|v' => 'be more verbose',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
+'torsocks=s' => ['auto|no|yes',
+		'whether or not to wrap git and curl commands with torsocks'],
+'no-torsocks' => 'alias for --torsocks=no',
 'save-as=s' => ['NAME', 'save a search terms by given name'],
 
 'type=s' => [ 'any|mid|git', 'disambiguate type' ],
@@ -212,7 +217,7 @@ my %OPTDESC = (
 	'return all messages in the same thread as the actual match(es)',
 'augment|a' => 'augment --output destination instead of clobbering',
 
-'output|o=s' => [ 'DEST',
+'output|mfolder|o=s' => [ 'DEST',
 	"destination (e.g. `/path/to/Maildir', or `-' for stdout)" ],
 'mua-cmd|mua=s' => [ 'COMMAND',
 	"MUA to run on --output Maildir or mbox (e.g. `mutt -f %f'" ],
@@ -222,7 +227,7 @@ my %OPTDESC = (
 'mark	format|f=s' => $stdin_formats,
 'forget	format|f=s' => $stdin_formats,
 'q	format|f=s' => [
-	'OUT|maildir|mboxrd|mboxcl2|mboxcl|html|json|jsonl|concatjson',
+	'OUT|maildir|mboxrd|mboxcl2|mboxcl|mboxo|html|json|jsonl|concatjson',
 		'specify output format, default depends on --output'],
 'ls-query	format|f=s' => $ls_format,
 'ls-external	format|f=s' => $ls_format,
@@ -673,22 +678,33 @@ sub lei__complete {
 				get-color-name get-colorbool);
 			# fall-through
 		}
-		# TODO: arg support
 		puts $self, grep(/$re/, map { # generate short/long names
-			my $eq = '';
-			if (s/=.+\z//) { # required arg, e.g. output|o=i
-				$eq = '=';
-			} elsif (s/:.+\z//) { # optional arg, e.g. mid:s
+			if (s/[:=].+\z//) { # req/optional args, e.g output|o=i
 			} else { # negation: solve! => no-solve|solve
 				s/\A(.+)!\z/no-$1|$1/;
 			}
 			map {
-				length > 1 ? "--$_$eq" : "-$_"
+				my $x = length > 1 ? "--$_" : "-$_";
+				$x eq $cur ? () : $x;
 			} split(/\|/, $_, -1) # help|h
 		} grep { $OPTDESC{"$cmd\t$_"} || $OPTDESC{$_} } @spec);
 	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
 		puts $self, grep(/$re/, keys %CONFIG_KEYS);
 	}
+
+	# switch args (e.g. lei q -f mbox<TAB>)
+	if (($argv[-1] // $cur // '') =~ /\A--?([\w\-]+)\z/) {
+		my $opt = quotemeta $1;
+		puts $self, map {
+			my $v = $OPTDESC{$_};
+			$v = $v->[0] if ref($v);
+			my @v = split(/\|/, $v);
+			# get rid of ALL CAPS placeholder (e.g "OUT")
+			# (TODO: completion for external paths)
+			shift(@v) if uc($v[0]) eq $v[0];
+			@v;
+		} grep(/\A(?:$cmd\t|)(?:[\w-]+\|)*$opt\b/, keys %OPTDESC);
+	}
 	$cmd =~ tr/-/_/;
 	if (my $sub = $self->can("_complete_$cmd")) {
 		puts $self, $sub->($self, @argv, $cur);
diff --git a/t/lei.t b/t/lei.t
index 69338257..3f6702e6 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -216,6 +216,24 @@ my $test_external = sub {
 	like($out, qr/boost=0\n/s, 'ls-external has output');
 	ok($lei->(qw(add-external -q https://EXAMPLE.com/ibx)), 'add remote');
 	is($err, '', 'no warnings after add-external');
+
+	ok($lei->(qw(_complete lei forget-external)), 'complete for externals');
+	my %comp = map { $_ => 1 } split(/\s+/, $out);
+	ok($comp{'https://example.com/ibx/'}, 'forget external completion');
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		ok($comp{$ibx->{inboxdir}}, "local $ibx->{name} completion");
+	});
+	for my $u (qw(h http https https: https:/ https:// https://e
+			https://example https://example. https://example.co
+			https://example.com https://example.com/
+			https://example.com/i https://example.com/ibx)) {
+		ok($lei->(qw(_complete lei forget-external), $u),
+			"partial completion for URL $u");
+		is($out, "https://example.com/ibx/\n",
+			"completed partial URL $u");
+	}
+
 	$lei->('ls-external');
 	like($out, qr!https://example\.com/ibx/!s, 'added canonical URL');
 	is($err, '', 'no warnings on ls-external');
@@ -304,11 +322,39 @@ my $test_external = sub {
 	}
 };
 
+my $test_completion = sub {
+	ok($lei->(qw(_complete lei)), 'no errors on complete');
+	my %out = map { $_ => 1 } split(/\s+/s, $out);
+	ok($out{'q'}, "`lei q' offered as completion");
+	ok($out{'add-external'}, "`lei add-external' offered as completion");
+
+	ok($lei->(qw(_complete lei q)), 'complete q (no args)');
+	%out = map { $_ => 1 } split(/\s+/s, $out);
+	for my $sw (qw(-f --format -o --output --mfolder --augment -a
+			--mua --mua-cmd --no-local --local --verbose -v
+			--save-as --no-remote --remote --torsocks
+			--reverse -r )) {
+		ok($out{$sw}, "$sw offered as completion");
+	}
+
+	ok($lei->(qw(_complete lei q --form)), 'complete q --format');
+	is($out, "--format\n", 'complete lei q --format');
+	for my $sw (qw(-f --format)) {
+		ok($lei->(qw(_complete lei q), $sw), "complete q $sw ARG");
+		%out = map { $_ => 1 } split(/\s+/s, $out);
+		for my $f (qw(mboxrd mboxcl2 mboxcl mboxo json jsonl
+				concatjson maildir)) {
+			ok($out{$f}, "got $sw $f as output format");
+		}
+	}
+};
+
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
 	$test_init->();
 	$test_external->();
+	$test_completion->();
 };
 
 if ($ENV{TEST_LEI_ONESHOT}) {

^ permalink raw reply related	[relevance 45%]

* [PATCH 9/9] lei: dclose: fix typo
  2021-01-27  9:42 71% [PATCH 0/9] lei completion, some small updates Eric Wong
                   ` (3 preceding siblings ...)
  2021-01-27  9:42 45% ` [PATCH 6/9] lei: complete option switch args Eric Wong
@ 2021-01-27  9:42 71% ` Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-27  9:42 UTC (permalink / raw)
  To: meta

Oops :x
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index d5d9cf1f..f5413aab 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -820,7 +820,7 @@ sub dclose {
 	for my $f (qw(lxs l2m)) {
 		my $wq = delete $self->{$f} or next;
 		if ($wq->wq_kill) {
-			$self->wq_close
+			$wq->wq_close
 		} elsif ($wq->wq_kill_old) {
 			$wq->wq_wait_old;
 		}

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/7] lei: more half-baked updates
@ 2021-01-29  7:42 71% Eric Wong
  2021-01-29  7:42 37% ` [PATCH 4/7] lei: less error-prone FD mapping Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-01-29  7:42 UTC (permalink / raw)
  To: meta

I'm not sure if I want to keep 1/7.

4/7 is LONG overdue.

Still chasing down difficult-to-reproduce lei2mail worker
segfaults which seem related to LeiDedupe + SharedKV and weird
object lifetimes, which is preventing me from doing anything
else.  Worst case is we disable worker processes, but the
performance hit sucks.

Eric Wong (7):
  ipc: wq: support passing fields to workers
  lei_xsearch: drop repeated "Xapian" in error message
  ipc: more consistent behavior between worker types
  lei: less error-prone FD mapping
  git: synchronous cat_file may return type and OID
  ipc: move on_destroy scope to inside the eval
  shared_kv: simplify PID+object guard for cleanup

 lib/PublicInbox/Git.pm         |  9 ++---
 lib/PublicInbox/IPC.pm         | 46 +++++++++++++---------
 lib/PublicInbox/LEI.pm         | 56 ++++++++++++++++++++-------
 lib/PublicInbox/LeiOverview.pm |  9 ++---
 lib/PublicInbox/LeiToMail.pm   |  8 +---
 lib/PublicInbox/LeiXSearch.pm  | 70 +++++++++++++++-------------------
 lib/PublicInbox/SharedKV.pm    |  8 ++--
 lib/PublicInbox/Spawn.pm       |  2 +-
 t/git.t                        |  8 ++--
 t/shared_kv.t                  |  2 +-
 10 files changed, 119 insertions(+), 99 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 4/7] lei: less error-prone FD mapping
  2021-01-29  7:42 71% [PATCH 0/7] lei: more half-baked updates Eric Wong
@ 2021-01-29  7:42 37% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-01-29  7:42 UTC (permalink / raw)
  To: meta

Keeping track of non-standard FDs gets tricky, so make it easier
by relying on st_dev/st_ino mapping in the transmitted objects.

We'll keep using numbers for the standard FDs since we need to
be able to easily redirect them in the producer (main daemon)
process for (gzip|bzip2|xz) if writing to a compressed mbox.
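
A simplified, in-process sketch of the st_dev/st_ino mapping
(`extract_io' and `restore_io' are hypothetical names modeled on the
helpers in this patch; the real code passes the handles to another
process and transmits the object separately):

```perl
use strict;
use warnings;

# sender: remove named handles from $obj, remember each field name
# under a "dev=...,ino=..." key, and return the handles for passing
sub extract_io {
	my ($obj, @fields) = @_;
	my @io;
	for my $f (@fields) {
		my $io = delete $obj->{$f} or next;
		my @st = stat($io) or die "stat $f: $!";
		$obj->{"dev=$st[0],ino=$st[1]"} = $f;
		push @io, $io;
	}
	@io;
}

# receiver: match each received handle back to its field name
sub restore_io {
	my ($obj, @io) = @_;
	for my $io (@io) {
		my @st = stat($io) or die "stat: $!";
		my $f = delete $obj->{"dev=$st[0],ino=$st[1]"} // next;
		$obj->{$f} = $io;
	}
}
```

Since the lookup is keyed on device + inode, the receiver no longer
depends on the handles arriving at fixed positions.  Note that two
handles sharing an inode (e.g. both ends of one pipe on some systems)
cannot be told apart this way.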
---
 lib/PublicInbox/LEI.pm         | 56 +++++++++++++++++++++-------
 lib/PublicInbox/LeiOverview.pm |  9 ++---
 lib/PublicInbox/LeiToMail.pm   |  8 +---
 lib/PublicInbox/LeiXSearch.pm  | 68 +++++++++++++++-------------------
 lib/PublicInbox/Spawn.pm       |  2 +-
 5 files changed, 77 insertions(+), 66 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f5413aab..3ed330f9 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -335,14 +335,27 @@ sub atfork_prepare_wq {
 	}
 }
 
+sub io_restore ($$) {
+	my ($dst, $src) = @_;
+	for my $i (0..2) { # standard FDs
+		my $io = delete $src->{$i} or next;
+		$dst->{$i} = $io;
+	}
+	for my $i (3..9) { # named (non-standard) FDs
+		my $io = $src->{$i} or next;
+		my @st = stat($io) or die "stat $src.$i ($io): $!";
+		my $f = delete $dst->{"dev=$st[0],ino=$st[1]"} // next;
+		$dst->{$f} = $io;
+		delete $src->{$i};
+	}
+}
+
 # usage: my %sig = $lei->atfork_child_wq($wq);
 #	 local @SIG{keys %sig} = values %sig;
 sub atfork_child_wq {
 	my ($self, $wq) = @_;
-	my ($sock, $l2m_wq_s1);
-	(@$self{qw(0 1 2)}, $sock, $l2m_wq_s1) = delete(@$wq{0..4});
-	$self->{sock} = $sock if -S $sock;
-	$self->{l2m}->{-wq_s1} = $l2m_wq_s1 if $l2m_wq_s1 && -S $l2m_wq_s1;
+	io_restore($self, $wq);
+	io_restore($self->{l2m}, $wq);
 	%PATH2CFG = ();
 	undef $errors_log;
 	$quit = \&CORE::exit;
@@ -355,30 +368,45 @@ sub atfork_child_wq {
 			close(delete $self->{$i});
 		}
 		# trigger the LeiXSearch $done OpPipe:
-		syswrite($self->{0}, '!') if $self->{0} && -p $self->{0};
+		syswrite($self->{op_pipe}, '!') if $self->{op_pipe};
 		$SIG{PIPE} = 'DEFAULT';
 		die bless(\"$_[0]", 'PublicInbox::SIGPIPE'),
 	});
 }
 
+sub io_extract ($;@) {
+	my ($obj, @fields) = @_;
+	my @io;
+	for my $f (@fields) {
+		my $io = delete $obj->{$f} or next;
+		my @st = stat($io) or die "W: stat $obj.$f ($io): $!";
+		$obj->{"dev=$st[0],ino=$st[1]"} = $f;
+		push @io, $io;
+	}
+	@io
+}
+
 # usage: ($lei, @io) = $lei->atfork_parent_wq($wq);
 sub atfork_parent_wq {
 	my ($self, $wq) = @_;
 	my $env = delete $self->{env}; # env is inherited at fork
-	my $ret = bless { %$self }, ref($self);
-	if (my $dedupe = delete $ret->{dedupe}) {
-		$ret->{dedupe} = $wq->deep_clone($dedupe);
+	my $lei = bless { %$self }, ref($self);
+	if (my $dedupe = delete $lei->{dedupe}) {
+		$lei->{dedupe} = $wq->deep_clone($dedupe);
 	}
 	$self->{env} = $env;
-	delete @$ret{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
-	my @io = delete @$ret{0..2};
-	$io[3] = delete($ret->{sock}) // $io[2];
-	my $l2m = $ret->{l2m};
+	delete @$lei{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
+	my @io = (delete(@$lei{qw(0 1 2)}),
+			io_extract($lei, qw(sock op_pipe startq)));
+	my $l2m = $lei->{l2m};
 	if ($l2m && $l2m != $wq) { # $wq == lxs
-		$io[4] = $l2m->{-wq_s1} if $l2m->{-wq_s1};
+		if (my $wq_s1 = $l2m->{-wq_s1}) {
+			push @io, io_extract($l2m, '-wq_s1');
+			$l2m->{-wq_s1} = $wq_s1;
+		}
 		$l2m->wq_close(1);
 	}
-	($ret, @io);
+	($lei, @io);
 }
 
 sub _help ($;$) {
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index f9a28138..c67e2747 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -220,14 +220,13 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
-		# n.b. $io[0] = qry_status_wr, $io[1] = mbox|stdout,
-		# $io[4] becomes a notification pipe that triggers EOF
+		# $io[-1] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete
-		die "BUG: \$io[4] $io[4] unexpected" if $io[4];
-		pipe($l2m->{each_smsg_done}, $io[4]) or die "pipe: $!";
-		fcntl($io[4], 1031, 4096) if $^O eq 'linux';
+		pipe($l2m->{each_smsg_done}, $io[$#io + 1]) or die "pipe: $!";
+		fcntl($io[-1], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
 		delete @$lei_ipc{qw(l2m opt mset_opt cmd)};
+		$lei_ipc->{each_smsg_not_done} = $#io;
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git;
 		my $git_dir = $git->{git_dir};
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 08a1570d..61b546b5 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -460,7 +460,7 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $smsg, $lei) = @_;
-	my $not_done = delete $self->{4}; # write end of {each_smsg_done}
+	my $not_done = delete $self->{$lei->{each_smsg_not_done}};
 	my $wcb = $self->{wcb} //= do { # first message
 		my %sig = $lei->atfork_child_wq($self);
 		@SIG{keys %sig} = values %sig; # not local
@@ -471,12 +471,6 @@ sub write_mail { # via ->wq_do
 	$git->cat_async($smsg->{blob}, \&git_to_mail, [$wcb, $smsg, $not_done]);
 }
 
-sub ipc_atfork_prepare {
-	my ($self) = @_;
-	# FDs: (done_wr, stdout|mbox, stderr, 3: sock, 4: each_smsg_done_wr)
-	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
-}
-
 # We rely on OnDestroy to run this before ->DESTROY, since ->DESTROY
 # ordering is unstable at worker exit and may cause segfaults
 sub reap_gits {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 9ea2b5f3..e69b637c 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -109,9 +109,9 @@ sub wait_startq ($) {
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
-	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
+	my $startq = delete $lei->{startq};
 
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
 	unless ($srch && $over) {
@@ -145,9 +145,9 @@ sub query_thread_mset { # for --thread
 sub query_mset { # non-parallel for non-"--thread" users
 	my ($self, $lei) = @_;
 	local $0 = "$0 query_mset";
-	my $startq = delete $self->{5};
 	my %sig = $lei->atfork_child_wq($self);
 	local @SIG{keys %sig} = values %sig;
+	my $startq = delete $lei->{startq};
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	for my $loc (locals($self)) {
@@ -173,7 +173,7 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$smsg->parse_references($eml, mids($eml));
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
-	if (my $startq = delete($self->{5})) { wait_startq($startq) }
+	if (my $startq = delete($lei->{startq})) { wait_startq($startq) }
 	$each_smsg->($smsg, undef, $eml);
 }
 
@@ -352,11 +352,12 @@ sub query_prepare { # called by wq_do
 	my ($self, $lei) = @_;
 	local $0 = "$0 query_prepare";
 	my %sig = $lei->atfork_child_wq($self);
-	-p $lei->{0} or die "BUG: \$done pipe expected";
+	-p $lei->{op_pipe} or die "BUG: \$done pipe expected";
 	local @SIG{keys %sig} = values %sig;
+	delete $lei->{l2m}->{-wq_s1};
 	eval { $lei->{l2m}->do_augment($lei) };
 	$lei->fail($@) if $@;
-	syswrite($lei->{0}, '.') == 1 or die "do_post_augment trigger: $!";
+	syswrite($lei->{op_pipe}, '.') == 1 or die "do_post_augment trigger: $!"
 }
 
 sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
@@ -370,56 +371,45 @@ sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
 }
 
 sub do_query {
-	my ($self, $lei_orig) = @_;
-	my ($lei, @io) = $lei_orig->atfork_parent_wq($self);
-	$io[0] = undef;
-	pipe(my $done, $io[0]) or die "pipe $!";
-	$lei_orig->{1}->autoflush(1);
+	my ($self, $lei) = @_;
+	$lei->{1}->autoflush(1);
+	my ($au_done, $zpipe);
+	my $l2m = $lei->{l2m};
+	if ($l2m) {
+		pipe($lei->{startq}, $au_done) or die "pipe: $!";
+		# 1031: F_SETPIPE_SZ
+		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
+		$zpipe = $l2m->pre_augment($lei);
+	}
+	pipe(my $done, $lei->{op_pipe}) or die "pipe $!";
+	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
+	delete($lei->{op_pipe});
 
-	$lei_orig->event_step_init; # wait for shutdowns
+	$lei->event_step_init; # wait for shutdowns
 	my $done_op = {
-		'' => [ \&query_done, $lei_orig ],
-		'!' => [ \&sigpipe_handler, $lei_orig ]
+		'' => [ \&query_done, $lei ],
+		'!' => [ \&sigpipe_handler, $lei ]
 	};
-	my $in_loop = exists $lei_orig->{sock};
+	my $in_loop = exists $lei->{sock};
 	$done = PublicInbox::OpPipe->new($done, $done_op, $in_loop);
-	my $l2m = $lei->{l2m};
 	if ($l2m) {
-		# may redirect $lei->{1} for mbox
-		my $zpipe = $l2m->pre_augment($lei_orig);
-		$io[1] = $lei_orig->{1};
-		pipe(my ($startq, $au_done)) or die "pipe: $!";
-		$done_op->{'.'} = [ \&do_post_augment, $lei_orig,
-					$zpipe, $au_done ];
-		local $io[4] = *STDERR{GLOB}; # don't send l2m->{-wq_s1}
-		die "BUG: unexpected \$io[5]: $io[5]" if $io[5];
-		$self->wq_do('query_prepare', \@io, $lei);
-		fcntl($startq, 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
-		$io[5] = $startq;
+		$done_op->{'.'} = [ \&do_post_augment, $lei, $zpipe, $au_done ];
+		$self->wq_do('query_prepare', \@io, $lei_ipc);
 		$io[1] = $zpipe->[1] if $zpipe;
 	}
-	start_query($self, \@io, $lei);
+	start_query($self, \@io, $lei_ipc);
 	$self->wq_close(1);
 	unless ($in_loop) {
-		# for the $lei->atfork_child_wq PIPE handler:
+		# for the $lei_ipc->atfork_child_wq PIPE handler:
 		while ($done->{sock}) { $done->event_step }
 	}
 }
 
-sub ipc_atfork_prepare {
-	my ($self) = @_;
-	if (exists $self->{remotes}) {
-		require PublicInbox::MboxReader;
-		require IO::Uncompress::Gunzip;
-	}
-	# FDS: (0: done_wr, 1: stdout|mbox, 2: stderr,
-	#       3: sock, 4: $l2m->{-wq_s1}, 5: $startq)
-	$self->SUPER::ipc_atfork_prepare; # PublicInbox::IPC
-}
-
 sub add_uri {
 	my ($self, $uri) = @_;
 	if (my $curl = $self->{curl} //= which('curl') // 0) {
+		require PublicInbox::MboxReader;
+		require IO::Uncompress::Gunzip;
 		push @{$self->{remotes}}, $uri;
 	} else {
 		warn "curl missing, ignoring $uri\n";
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index ef4885c1..1842899c 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -209,7 +209,7 @@ my $fdpass = <<'FDPASS';
 #include <sys/socket.h>
 
 #if defined(CMSG_SPACE) && defined(CMSG_LEN)
-#define SEND_FD_CAPA 6
+#define SEND_FD_CAPA 10
 #define SEND_FD_SPACE (SEND_FD_CAPA * sizeof(int))
 union my_cmsg {
 	struct cmsghdr hdr;

^ permalink raw reply related	[relevance 37%]
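
The do_query changes above create the {startq} pipe before workers are forked, so a worker can block on it until the parent has finished preparing the output. The following is a minimal sketch of that start-gate idea, assuming a single forked worker; run_gated and the result pipe are hypothetical names for illustration and are not part of lei or PublicInbox::IPC.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not part of lei): fork a worker that stays idle
# until the parent opens the gate by closing its end of the pipe.
sub run_gated {
	my ($work) = @_;
	pipe(my $startq, my $au_done) or die "pipe: $!";
	# 1031: F_SETPIPE_SZ (Linux-only), as in the patch; failure is harmless
	fcntl($startq, 1031, 4096) if $^O eq 'linux';
	pipe(my $res_r, my $res_w) or die "pipe: $!";
	my $pid = fork() // die "fork: $!";
	if ($pid == 0) { # worker
		close $au_done; # or the worker would never see EOF
		close $res_r;
		# blocks until the parent closes $au_done (read returns 0)
		defined(sysread($startq, my $buf, 1)) or die "sysread: $!";
		my $out = $work->();
		syswrite($res_w, $out) == length($out) or die "syswrite: $!";
		POSIX::_exit(0); # skip END blocks and inherited stdio buffers
	}
	close $startq;
	close $res_w;
	# ... the parent would finish preparing the output here ...
	close $au_done; # open the gate
	my $res = do { local $/; <$res_r> };
	waitpid($pid, 0);
	$res;
}

print run_gated(sub { "worker started\n" });
```

Closing the write end is what releases the worker, so no byte ever has to be written to the gate pipe in this sketch.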

* [PATCH 0/2] doc: initial lei manpages
@ 2021-02-01  5:57 65% Kyle Meyer
  2021-02-01  5:57 27% ` [PATCH 1/2] doc: start manpages for lei commands Kyle Meyer
  2021-02-01  5:57 55% ` [PATCH 2/2] doc: add lei-overview(7) Kyle Meyer
  0 siblings, 2 replies; 200+ results
From: Kyle Meyer @ 2021-02-01  5:57 UTC (permalink / raw)
  To: meta

Prompted by <20210124120217.GA12880@dcvr>, here's my attempt to start
lei's manpages.  The first patch adds a manpage for lei and each of
its currently implemented subcommands.  The second patch adds an
overview/quickstart.

I'm not really sure this is in a good state.  I ran out of time to
give it a complete read-through, I feel like it may need a bit more
flesh just to be a good _start_, and I probably injected a good amount
of my own confusion into them.  Anyway, it still may be useful to get
feedback on, especially because I probably won't be able to work on it
in the next couple of days.

  [1/2] doc: start manpages for lei commands
  [2/2] doc: add lei-overview(7)

 Documentation/.gitignore              |   1 +
 Documentation/lei-add-external.pod    |  49 ++++++++++
 Documentation/lei-config.pod          |  26 ++++++
 Documentation/lei-daemon-kill.pod     |  28 ++++++
 Documentation/lei-daemon-pid.pod      |  28 ++++++
 Documentation/lei-forget-external.pod |  40 ++++++++
 Documentation/lei-init.pod            |  42 +++++++++
 Documentation/lei-ls-external.pod     |  38 ++++++++
 Documentation/lei-overview.pod        |  72 +++++++++++++++
 Documentation/lei-q.pod               | 127 ++++++++++++++++++++++++++
 Documentation/lei.pod                 |  90 ++++++++++++++++++
 Documentation/txt2pre                 |  12 ++-
 MANIFEST                              |  10 ++
 Makefile.PL                           |   7 +-
 14 files changed, 567 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/lei-add-external.pod
 create mode 100644 Documentation/lei-config.pod
 create mode 100644 Documentation/lei-daemon-kill.pod
 create mode 100644 Documentation/lei-daemon-pid.pod
 create mode 100644 Documentation/lei-forget-external.pod
 create mode 100644 Documentation/lei-init.pod
 create mode 100644 Documentation/lei-ls-external.pod
 create mode 100644 Documentation/lei-overview.pod
 create mode 100644 Documentation/lei-q.pod
 create mode 100644 Documentation/lei.pod


base-commit: dd1a1bceb56692722b1fb4a27391c80307403d86
-- 
2.30.0


^ permalink raw reply	[relevance 65%]

* [PATCH 1/2] doc: start manpages for lei commands
  2021-02-01  5:57 65% [PATCH 0/2] doc: initial lei manpages Kyle Meyer
@ 2021-02-01  5:57 27% ` Kyle Meyer
  2021-02-06  9:01 90%   ` lei-q doc thoughts... [was: doc: start manpages for lei commands] Eric Wong
  2021-02-01  5:57 55% ` [PATCH 2/2] doc: add lei-overview(7) Kyle Meyer
  1 sibling, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-01  5:57 UTC (permalink / raw)
  To: meta

Add manpages for lei and the currently implemented subcommands.  The
included options and their descriptions follow to a large degree the
--help output, dropping some options that are not currently wired up.
---
 Documentation/.gitignore              |   1 +
 Documentation/lei-add-external.pod    |  49 ++++++++++
 Documentation/lei-config.pod          |  26 ++++++
 Documentation/lei-daemon-kill.pod     |  28 ++++++
 Documentation/lei-daemon-pid.pod      |  28 ++++++
 Documentation/lei-forget-external.pod |  40 ++++++++
 Documentation/lei-init.pod            |  42 +++++++++
 Documentation/lei-ls-external.pod     |  38 ++++++++
 Documentation/lei-q.pod               | 127 ++++++++++++++++++++++++++
 Documentation/lei.pod                 |  90 ++++++++++++++++++
 Documentation/txt2pre                 |  11 ++-
 MANIFEST                              |   9 ++
 Makefile.PL                           |   5 +-
 13 files changed, 492 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/lei-add-external.pod
 create mode 100644 Documentation/lei-config.pod
 create mode 100644 Documentation/lei-daemon-kill.pod
 create mode 100644 Documentation/lei-daemon-pid.pod
 create mode 100644 Documentation/lei-forget-external.pod
 create mode 100644 Documentation/lei-init.pod
 create mode 100644 Documentation/lei-ls-external.pod
 create mode 100644 Documentation/lei-q.pod
 create mode 100644 Documentation/lei.pod

diff --git a/Documentation/.gitignore b/Documentation/.gitignore
index 92510039..142bce32 100644
--- a/Documentation/.gitignore
+++ b/Documentation/.gitignore
@@ -1,3 +1,4 @@
+/lei*.txt
 /public-inbox-*.txt
 /public-inbox.cgi.txt
 /standards.txt
diff --git a/Documentation/lei-add-external.pod b/Documentation/lei-add-external.pod
new file mode 100644
index 00000000..dd87be62
--- /dev/null
+++ b/Documentation/lei-add-external.pod
@@ -0,0 +1,49 @@
+=head1 NAME
+
+lei-add-external - add inbox or external index
+
+=head1 SYNOPSIS
+
+lei add-external [OPTIONS] URL_OR_PATHNAME
+
+=head1 DESCRIPTION
+
+Configure lei to search against an external (an inbox or external
+index).  When C<URL_OR_PATHNAME> is a local path, it should point to a
+directory that is a C<< publicinbox.<name>.inboxdir >> or
+C<< extindex.<name>.topdir >> value in ~/.public-inbox/config.
+
+=head1 OPTIONS
+
+=over
+
+=item --boost=NUMBER
+
+Set priority of a new or existing location.
+
+Default: 0
+
+=back
+
+=head1 FILES
+
+The configuration for lei resides at C<$XDG_CONFIG_HOME/lei/config>.
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-forget-external(1)>, L<lei-ls-external(1)>,
+L<public-inbox-index(1)>, L<public-inbox-extindex(1)>,
+L<public-inbox-extindex-format(5)>
diff --git a/Documentation/lei-config.pod b/Documentation/lei-config.pod
new file mode 100644
index 00000000..b6d8bfde
--- /dev/null
+++ b/Documentation/lei-config.pod
@@ -0,0 +1,26 @@
+=head1 NAME
+
+lei-config - git-config wrapper for lei configuration file
+
+=head1 SYNOPSIS
+
+lei config [OPTIONS]
+
+=head1 DESCRIPTION
+
+Call git-config(1) with C<$XDG_CONFIG_HOME/lei/config> as the
+configuration file.  All C<OPTIONS> are passed through, but those that
+override the configuration file are not permitted.
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
diff --git a/Documentation/lei-daemon-kill.pod b/Documentation/lei-daemon-kill.pod
new file mode 100644
index 00000000..b369d3b3
--- /dev/null
+++ b/Documentation/lei-daemon-kill.pod
@@ -0,0 +1,28 @@
+=head1 NAME
+
+lei-daemon-kill - signal the lei-daemon
+
+=head1 SYNOPSIS
+
+lei daemon-kill [-SIGNAL | -s SIGNAL | --signal SIGNAL]
+
+=head1 DESCRIPTION
+
+Send a signal to the lei-daemon.  C<SIGNAL> defaults to C<TERM>.
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-daemon-pid(1)>
diff --git a/Documentation/lei-daemon-pid.pod b/Documentation/lei-daemon-pid.pod
new file mode 100644
index 00000000..09de8b42
--- /dev/null
+++ b/Documentation/lei-daemon-pid.pod
@@ -0,0 +1,28 @@
+=head1 NAME
+
+lei-daemon-pid - show the PID of the lei-daemon
+
+=head1 SYNOPSIS
+
+lei daemon-pid
+
+=head1 DESCRIPTION
+
+Show the PID of the lei-daemon.
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-daemon-kill(1)>
diff --git a/Documentation/lei-forget-external.pod b/Documentation/lei-forget-external.pod
new file mode 100644
index 00000000..40287bd3
--- /dev/null
+++ b/Documentation/lei-forget-external.pod
@@ -0,0 +1,40 @@
+=head1 NAME
+
+lei-forget-external - forget external locations
+
+=head1 SYNOPSIS
+
+lei forget-external [OPTIONS] URL_OR_PATHNAME [URL_OR_PATHNAME...]
+
+=head1 DESCRIPTION
+
+Forget the specified externals by removing their entries from
+C<$XDG_CONFIG_HOME/lei/config>.  This excludes the locations from
+future search results.
+
+=head1 OPTIONS
+
+=over
+
+=item -q, --quiet
+
+Suppress feedback messages.
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-add-external(1)>, L<lei-ls-external(1)>
diff --git a/Documentation/lei-init.pod b/Documentation/lei-init.pod
new file mode 100644
index 00000000..8a8022fb
--- /dev/null
+++ b/Documentation/lei-init.pod
@@ -0,0 +1,42 @@
+=head1 NAME
+
+lei-init - initialize storage
+
+=head1 SYNOPSIS
+
+lei init [OPTIONS] [PATHNAME]
+
+=head1 DESCRIPTION
+
+Initialize local writable storage for L<lei(1)>.  If C<PATHNAME> is
+unspecified, the storage is created at C<$XDG_DATA_HOME/lei/store>.
+C<leistore.dir> in C<$XDG_CONFIG_HOME/lei/config> records this
+location.
+
+=head1 OPTIONS
+
+=over
+
+=item -q, --quiet
+
+Suppress feedback messages.
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+
+=head1 SEE ALSO
+
+L<lei-add-external(1)>
diff --git a/Documentation/lei-ls-external.pod b/Documentation/lei-ls-external.pod
new file mode 100644
index 00000000..1735faa9
--- /dev/null
+++ b/Documentation/lei-ls-external.pod
@@ -0,0 +1,38 @@
+=head1 NAME
+
+lei-ls-external - list inbox and external index locations
+
+=head1 SYNOPSIS
+
+lei ls-external [OPTIONS]
+
+=head1 DESCRIPTION
+
+List configured externals.
+
+=head1 OPTIONS
+
+=over
+
+=item -z, -0
+
+Use C<\0> (NUL) instead of newline (LF) to delimit lines.
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-add-external(1)>, L<lei-forget-external(1)>
diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
new file mode 100644
index 00000000..e307e020
--- /dev/null
+++ b/Documentation/lei-q.pod
@@ -0,0 +1,127 @@
+=head1 NAME
+
+lei-q - search for messages matching terms
+
+=head1 SYNOPSIS
+
+lei q [OPTIONS] TERM [TERM...]
+
+=head1 DESCRIPTION
+
+Search for messages across the lei store and externals.
+
+TODO: Give common prefixes, or at least a description/reference.
+
+=head1 OPTIONS
+
+=over
+
+=item -o PATH, --output=PATH, --mfolder=PATH
+
+Destination for results (e.g., C<path/to/Maildir> or - for stdout).
+
+Default: -
+
+=item -f FORMAT, --format=FORMAT
+
+Format of results: C<maildir>, C<mboxrd>, C<mboxcl2>, C<mboxcl>,
+C<mboxo>, C<json>, C<jsonl>, or C<concatjson>.  The default format
+used depends on C<--output>.
+
+TODO: Provide description of formats?
+
+=item --pretty
+
+Pretty print C<json> or C<concatjson> output.  If stdout is opened to
+a tty and used as the C<--output> destination, C<--pretty> is enabled
+by default.
+
+=item --mua-cmd=COMMAND, --mua=COMMAND
+
+A command to run on C<--output> Maildir or mbox (e.g., C<mutt -f %f>).
+For a subset of MUAs known to accept a mailbox via C<-f>, COMMAND can
+be abbreviated to the name of the program: C<mutt>, C<mailx>, C<mail>,
+or C<neomutt>.
+
+=item --augment
+
+Augment output destination instead of clobbering it.
+
+=item -t, --thread
+
+Return all messages in the same thread as the actual match(es).
+
+=item -d STRATEGY, --dedupe=STRATEGY
+
+Strategy for deduplicating messages: C<content>, C<oid>, C<mid>, or
+C<none>.
+
+Default: C<content>
+
+TODO: Provide description of strategies?
+
+=item --[no-]remote
+
+Whether to include results requiring network access.  When local
+externals are configured, C<--remote> must be explicitly passed to
+enable reporting of results from remote externals.
+
+=item --no-local
+
+Limit operations to those requiring network access.
+
+=item --no-external
+
+Don't include results from externals.
+
+=item -NUMBER, -n NUMBER, --limit=NUMBER
+
+Limit the number of matches.
+
+Default: 10000
+
+=item --offset=NUMBER
+
+Shift start of search results.
+
+Default: 0
+
+=item -r, --reverse
+
+Reverse the results.  Note that this applies before C<--limit>.
+
+=item -s KEY, --sort=KEY
+
+Order the results by KEY.  Valid keys are C<received>, C<relevance>,
+and C<docid>.
+
+Default: C<received>
+
+=item -v, --verbose
+
+Provide more feedback on stderr.
+
+=item --torsocks=auto|no|yes, --no-torsocks
+
+Whether to wrap L<git(1)> and L<curl(1)> commands with torsocks.
+
+Default: C<auto>
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-add-external(1)>
diff --git a/Documentation/lei.pod b/Documentation/lei.pod
new file mode 100644
index 00000000..e12a157d
--- /dev/null
+++ b/Documentation/lei.pod
@@ -0,0 +1,90 @@
+=head1 NAME
+
+lei - local email interface for public-inbox
+
+=head1 SYNOPSIS
+
+lei COMMAND
+
+=head1 DESCRIPTION
+
+Unlike the C10K-oriented L<public-inbox-daemon(8)>, lei is designed
+exclusively to handle trusted local clients with read/write access to
+the file system, using as many system resources as the local user has
+access to.  lei supports a local, writable store built on top of
+L<public-inbox-v2-format(5)> and L<public-inbox-extindex(1)>.
+L<lei-q(1)> provides an interface for querying messages across the lei
+store and read-only local and remote "externals" (inboxes and external
+indices).
+
+Available in public-inbox 1.7.0+.
+
+=head1 COMMANDS
+
+Subcommands for initializing and managing local, writable storage:
+
+=over
+
+=item * L<lei-init(1)>
+
+=back
+
+TODO: Add commands like lei-import once they're implemented.
+
+The following subcommands can be used to manage and inspect external
+locations:
+
+=over
+
+=item * L<lei-add-external(1)>
+
+=item * L<lei-forget-external(1)>
+
+=item * L<lei-ls-external(1)>
+
+=back
+
+Subcommands related to searching and inspecting messages from the lei
+store and configured externals are
+
+=over
+
+=item * L<lei-q(1)>
+
+=back
+
+TODO: Add lei-show (and perhaps others) once implemented.
+
+Other subcommands include
+
+=over
+
+=item * L<lei-config(1)>
+
+=item * L<lei-daemon-kill(1)>
+
+=item * L<lei-daemon-pid(1)>
+
+=back
+
+=head1 FILES
+
+By default storage is located at C<$XDG_DATA_HOME/lei/store>.  The
+configuration for lei resides at C<$XDG_CONFIG_HOME/lei/config>.
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<lei-overview(7)>
diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index 75e4725c..f69323e1 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -10,7 +10,16 @@ use warnings;
 use PublicInbox::Linkify;
 use PublicInbox::Hval qw(ascii_html);
 my %xurls;
-for (qw[public-inbox.cgi(1)
+for (qw[lei(1)
+	lei-add-external(1)
+	lei-config(1)
+	lei-daemon-kill(1)
+	lei-daemon-pid(1)
+	lei-forget-external(1)
+	lei-init(1)
+	lei-ls-external(1)
+	lei-q(1)
+	public-inbox.cgi(1)
 	public-inbox-compact(1)
 	public-inbox-config(5)
 	public-inbox-convert(1)
diff --git a/MANIFEST b/MANIFEST
index 2077ab12..4d6f8b43 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -21,6 +21,15 @@ Documentation/flow.ge
 Documentation/flow.txt
 Documentation/hosted.txt
 Documentation/include.mk
+Documentation/lei-add-external.pod
+Documentation/lei-config.pod
+Documentation/lei-daemon-kill.pod
+Documentation/lei-daemon-pid.pod
+Documentation/lei-forget-external.pod
+Documentation/lei-init.pod
+Documentation/lei-ls-external.pod
+Documentation/lei-q.pod
+Documentation/lei.pod
 Documentation/marketing.txt
 Documentation/mknews.perl
 Documentation/public-inbox-compact.pod
diff --git a/Makefile.PL b/Makefile.PL
index b2f3393d..7bfe1e4e 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -42,7 +42,10 @@ $v->{-m1} = [ map {
 			push @no_pod, $x;
 			();
 		}
-	} @EXE_FILES ];
+	} @EXE_FILES,
+	qw(
+	lei-add-external lei-config lei-daemon-kill lei-daemon-pid
+	lei-forget-external lei-init lei-ls-external lei-q)];
 $v->{-m5} = [ qw(public-inbox-config public-inbox-v1-format
 		public-inbox-v2-format public-inbox-extindex-format) ];
 $v->{-m7} = [ qw(public-inbox-overview public-inbox-tuning) ];
-- 
2.30.0


^ permalink raw reply related	[relevance 27%]
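
The -z option documented in lei-ls-external.pod above delimits records with NUL, which keeps paths containing newlines intact. A consumer might read such output roughly as sketched below; read_nul_records is a hypothetical helper and the sample records are made up, since the exact layout of the ls-external output is not shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: read NUL-delimited records from a filehandle
sub read_nul_records {
	my ($fh) = @_;
	local $/ = "\0"; # input record separator: NUL instead of LF
	my @rec;
	while (defined(my $line = <$fh>)) {
		chomp $line; # chomp removes the current $/, i.e. the NUL
		push @rec, $line;
	}
	@rec;
}

# made-up sample standing in for `lei ls-external -z' output
my $sample = "https://example.com/inbox/\0/path/with\nnewline\0";
open(my $fh, '<', \$sample) or die "open: $!";
my @locations = read_nul_records($fh);
printf "%d locations\n", scalar(@locations);
```

In real use the filehandle would come from a pipe to the command rather than an in-memory string.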

* [PATCH 2/2] doc: add lei-overview(7)
  2021-02-01  5:57 65% [PATCH 0/2] doc: initial lei manpages Kyle Meyer
  2021-02-01  5:57 27% ` [PATCH 1/2] doc: start manpages for lei commands Kyle Meyer
@ 2021-02-01  5:57 55% ` Kyle Meyer
  2021-02-01  6:40 71%   ` Eric Wong
  1 sibling, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-01  5:57 UTC (permalink / raw)
  To: meta

---
 Documentation/lei-overview.pod | 72 ++++++++++++++++++++++++++++++++++
 Documentation/txt2pre          |  1 +
 MANIFEST                       |  1 +
 Makefile.PL                    |  2 +-
 4 files changed, 75 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/lei-overview.pod

diff --git a/Documentation/lei-overview.pod b/Documentation/lei-overview.pod
new file mode 100644
index 00000000..988896ce
--- /dev/null
+++ b/Documentation/lei-overview.pod
@@ -0,0 +1,72 @@
+=head1 NAME
+
+lei-overview - an overview of lei
+
+=head1 DESCRIPTION
+
+L<lei(1)> is a local email interface for public-inbox.  This document
+provides some basic examples.
+
+=head1 LEI STORE
+
+L<lei-init(1)> initializes writable local storage based on
+L<public-inbox-v2-format(5)>.
+
+TODO: Extend when lei-import and friends are added.
+
+=head1 EXTERNALS
+
+In addition to the above store, lei can make read-only queries to
+"externals": inboxes and external indices.  An external can be
+registered by passing a URL or local path to L<lei-add-external(1)>.
+For local paths, the external needs to be indexed with
+L<public-inbox-index(1)> (in the case of a regular inbox) or
+L<public-inbox-extindex(1)> (in the case of an external index).
+
+=head2 EXAMPLES
+
+=over
+
+=item $ lei add-external https://public-inbox.org/meta/
+
+Add a remote external for public-inbox's inbox.
+
+=back
+
+=head1 SEARCHING
+
+The L<lei-q(1)> command searches the local store and externals.  The
+search prefixes match those available via L<public-inbox-httpd(1)>.
+
+=head2 EXAMPLES
+
+=over
+
+=item $ lei q s:lei s:skeleton
+
+Search for messages whose subject includes "lei" and "skeleton".
+
+=item $ lei q -t s:lei s:skeleton
+
+Do the same, but also report unmatched messages that are in the same
+thread as a matched message.
+
+=item $ lei q -t -o t.mbox --format mboxrd --mua=mutt s:lei s:skeleton
+
+Write mboxrd-formatted results to t.mbox and enter mutt to view the
+file by invoking C<mutt -f %f>.
+
+=back
+
+=head1 CONTACT
+
+Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
+
+The mail archives are hosted at L<https://public-inbox.org/meta/>
+and L<http://hjrcffqmbrq6wope.onion/meta/>
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index f69323e1..604490ef 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -18,6 +18,7 @@ for (qw[lei(1)
 	lei-forget-external(1)
 	lei-init(1)
 	lei-ls-external(1)
+	lei-overview(7)
 	lei-q(1)
 	public-inbox.cgi(1)
 	public-inbox-compact(1)
diff --git a/MANIFEST b/MANIFEST
index 4d6f8b43..56fde540 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -28,6 +28,7 @@ Documentation/lei-daemon-pid.pod
 Documentation/lei-forget-external.pod
 Documentation/lei-init.pod
 Documentation/lei-ls-external.pod
+Documentation/lei-overview.pod
 Documentation/lei-q.pod
 Documentation/lei.pod
 Documentation/marketing.txt
diff --git a/Makefile.PL b/Makefile.PL
index 7bfe1e4e..f1910c47 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -48,7 +48,7 @@ $v->{-m1} = [ map {
 	lei-forget-external lei-init lei-ls-external lei-q)];
 $v->{-m5} = [ qw(public-inbox-config public-inbox-v1-format
 		public-inbox-v2-format public-inbox-extindex-format) ];
-$v->{-m7} = [ qw(public-inbox-overview public-inbox-tuning) ];
+$v->{-m7} = [ qw(lei-overview public-inbox-overview public-inbox-tuning) ];
 $v->{-m8} = [ qw(public-inbox-daemon) ];
 my @sections = (1, 5, 7, 8);
 $v->{check_80} = [];
-- 
2.30.0


^ permalink raw reply related	[relevance 55%]
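
The last example in the overview relies on the --mua abbreviation described in lei-q.pod: a bare program name from a short list is treated as "PROGRAM -f %f", and %f stands for the output path. A rough sketch of that expansion follows, assuming naive whitespace splitting; expand_mua_cmd is a hypothetical name and this is not how lei itself is known to implement it.

```perl
use strict;
use warnings;

# MUAs the manpage lists as accepting a mailbox via -f
my %short = map { $_ => 1 } qw(mutt mailx mail neomutt);

# Hypothetical helper: turn a --mua value into an argv list
sub expand_mua_cmd {
	my ($cmd, $mfolder) = @_;
	$cmd = "$cmd -f %f" if $short{$cmd};
	# naive whitespace split; real code would need shell-like quoting
	map { s/%f/$mfolder/gr } split(' ', $cmd);
}

print join(' ', expand_mua_cmd('mutt', 't.mbox')), "\n";
```

Returning a list rather than a single string lets the caller exec the command without going through a shell.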

* Re: [PATCH 2/2] doc: add lei-overview(7)
  2021-02-01  5:57 55% ` [PATCH 2/2] doc: add lei-overview(7) Kyle Meyer
@ 2021-02-01  6:40 71%   ` Eric Wong
  2021-02-01 11:37 71%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-01  6:40 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> +=item $ lei q -t -o t.mbox --format mboxrd --mua=mutt s:lei s:skeleton
> 
> +Write mboxrd-formatted results to t.mbox and enter mutt to view the
> +file by invoking C<mutt -f %f>.

Thanks for this series.  I'll take a closer look later (or
tomorrow).

mutt actually uses mboxcl2, so it's probably better to use
mboxcl2 in examples involving mutt.  I would also prefer "-f" in
examples if the rest of the args are using short switches.

No need to resend just for that, I can fix up locally before
pushing.

^ permalink raw reply	[relevance 71%]
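
For readers unfamiliar with the mbox variants named above: mboxrd escapes body lines matching /^>*From / by prepending one ">", while mboxcl2 relies on a Content-Length header and leaves such lines alone. The mboxrd rule can be sketched as below; the helper names are hypothetical and this is not taken from PublicInbox::LeiToMail.

```perl
use strict;
use warnings;

# mboxrd: prefix '>' to every body line matching /^>*From /
sub mboxrd_escape {
	my ($body) = @_;
	$body =~ s/^(>*From )/>$1/gm;
	$body;
}

# reverse: strip exactly one '>' from such lines
sub mboxrd_unescape {
	my ($body) = @_;
	$body =~ s/^>(>*From )/$1/gm;
	$body;
}

my $body = "Hi,\nFrom here on it gets tricky\n>From quoted\n";
my $esc = mboxrd_escape($body);
print $esc;
print "round-trip ok\n" if mboxrd_unescape($esc) eq $body;
```

Because every matching line gains exactly one ">", the escaping is reversible, which is the property that distinguishes mboxrd from the older mboxo.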

* [PATCH 08/21] lei: keep $lei around until workers are reaped
                     ` (3 preceding siblings ...)
  2021-02-01  8:28 66% ` [PATCH 06/21] lei: remove syslog dependency Eric Wong
@ 2021-02-01  8:28 82% ` Eric Wong
  2021-02-01  8:28 71% ` [PATCH 11/21] lei: deep clone {ovv} for l2m workers Eric Wong
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

This prevents SharedKV->DESTROY in lei-daemon from triggering
before DB handles are closed in lei2mail processes.  The
{each_smsg_not_done} pipe was not sufficient in this case:
that gets closed at the end of the last git_to_mail callback
invocation.
---
 lib/PublicInbox/IPC.pm        | 10 +++++-----
 lib/PublicInbox/LEI.pm        |  2 +-
 lib/PublicInbox/LeiXSearch.pm |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 37f02944..689f32d0 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -137,7 +137,7 @@ sub ipc_worker_spawn {
 }
 
 sub ipc_worker_reap { # dwaitpid callback
-	my ($self, $pid) = @_;
+	my ($args, $pid) = @_;
 	return if !$?;
 	# TERM(15) is our default exit signal, PIPE(13) is likely w/ pager
 	my $s = $? & 127;
@@ -145,9 +145,9 @@ sub ipc_worker_reap { # dwaitpid callback
 }
 
 sub wq_wait_old {
-	my ($self) = @_;
+	my ($self, $args) = @_;
 	my $pids = delete $self->{"-wq_old_pids.$$"} or return;
-	dwaitpid($_, \&ipc_worker_reap, $self) for @$pids;
+	dwaitpid($_, \&ipc_worker_reap, [$self, $args]) for @$pids;
 }
 
 # for base class, override in sub classes
@@ -164,7 +164,7 @@ sub ipc_atfork_child {
 
 # idempotent, can be called regardless of whether worker is active or not
 sub ipc_worker_stop {
-	my ($self) = @_;
+	my ($self, $args) = @_;
 	my ($pid, $ppid) = delete(@$self{qw(-ipc_pid -ipc_ppid)});
 	my ($w_req, $r_res) = delete(@$self{qw(-ipc_req -ipc_res)});
 	if (!$w_req && !$r_res) {
@@ -175,7 +175,7 @@ sub ipc_worker_stop {
 	$w_req = $r_res = undef;
 
 	return if $$ != $ppid;
-	dwaitpid($pid, \&ipc_worker_reap, $self);
+	dwaitpid($pid, \&ipc_worker_reap, [$self, $args]);
 }
 
 # use this if we have multiple readers reading curl or "pigz -dc"
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index c0b90451..4f7ed171 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -860,7 +860,7 @@ sub dclose {
 		if ($wq->wq_kill) {
 			$wq->wq_close
 		} elsif ($wq->wq_kill_old) {
-			$wq->wq_wait_old;
+			$wq->wq_wait_old($self);
 		}
 	}
 	close(delete $self->{1}) if $self->{1}; # may reap_compress
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index de82a7da..b4a9b89d 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -283,7 +283,7 @@ sub query_done { # EOF callback
 	my $has_l2m = exists $lei->{l2m};
 	for my $f (qw(lxs l2m)) {
 		my $wq = delete $lei->{$f} or next;
-		$wq->wq_wait_old;
+		$wq->wq_wait_old($lei);
 	}
 	$lei->{ovv}->ovv_end($lei);
 	if ($has_l2m) { # close() calls LeiToMail reap_compress
@@ -359,7 +359,7 @@ sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
 	my ($lei) = @_;
 	my $lxs = delete $lei->{lxs};
 	if ($lxs && $lxs->wq_kill_old) { # is this the daemon?
-		$lxs->wq_wait_old;
+		$lxs->wq_wait_old($lei);
 	}
 	close(delete $lei->{1}) if $lei->{1};
 	$lei->x_it(13);

^ permalink raw reply related	[relevance 82%]
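
The patch above passes [$self, $args] to the dwaitpid callback so that $lei stays referenced until the worker is reaped. The effect on destructor ordering can be sketched with plain reference counting; defer_reap, reap_all and the Guard class are hypothetical stand-ins and do not reflect the PublicInbox::DS API.

```perl
use strict;
use warnings;

my @log;
package Guard {
	sub new { bless {}, shift }
	sub DESTROY { push @log, 'destroyed' }
}

my @pending; # stands in for the queue behind dwaitpid

# Hypothetical stand-ins, not the PublicInbox::DS API
sub defer_reap { my ($cb, $arg) = @_; push @pending, [ $cb, $arg ] }
sub reap_all {
	while (my $p = shift @pending) { $p->[0]->($p->[1]) }
}

my $wq = {}; # placeholder for the work queue object
my $lei = Guard->new;
defer_reap(sub { push @log, 'reaped' }, [ $wq, $lei ]);
undef $lei; # the pending callback argument still holds a reference
push @log, 'client gone';
reap_all();
print join(', ', @log), "\n";
```

Without the extra array element, the Guard object would be destroyed at the undef, before the callback runs.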

* [PATCH 03/21] lei: remove per-child SIG{__WARN__}
    2021-02-01  8:28 57% ` [PATCH 01/21] lei: more consistent dedupe and ovv_buf init Eric Wong
@ 2021-02-01  8:28 71% ` Eric Wong
  2021-02-01  8:28 28% ` [PATCH 04/21] lei: remove SIGPIPE handler Eric Wong
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

The top-level $SIG{__WARN__} handler using $current_lei already
does the job.
---
 lib/PublicInbox/LEI.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3ed330f9..ceba16e4 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -359,8 +359,7 @@ sub atfork_child_wq {
 	%PATH2CFG = ();
 	undef $errors_log;
 	$quit = \&CORE::exit;
-	(__WARN__ => sub { err($self, @_) },
-	PIPE => sub {
+	(PIPE => sub {
 		$self->x_it(13); # SIGPIPE = 13
 		# we need to close explicitly to avoid Perl warning on SIGPIPE
 		for my $i (1, 2) {

^ permalink raw reply related	[relevance 71%]
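
A single process-wide warning handler that consults a "current client" variable can replace per-child handlers, which is the idea the commit message relies on. The sketch below assumes the handler appends to an array per client and falls back to a daemon-level log; $current, handle_client and both arrays are hypothetical and only illustrate the dispatch, not the actual code in LEI.pm.

```perl
use strict;
use warnings;

our $current; # hypothetical stand-in for $current_lei
my @fallback; # stands in for a daemon-level error log

$SIG{__WARN__} = sub {
	my $dst = $current ? $current->{err} : \@fallback;
	push @$dst, @_;
};

sub handle_client {
	my ($client) = @_;
	local $current = $client; # restored when this sub returns
	warn "oops from $client->{name}\n";
}

my $c = { name => 'c1', err => [] };
handle_client($c);
warn "daemon-level warning\n";
print "client: @{$c->{err}}fallback: @fallback";
```

Using local keeps the dispatch correct even if handle_client dies, since the previous value is restored on scope exit.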

* [PATCH 11/21] lei: deep clone {ovv} for l2m workers
                     ` (4 preceding siblings ...)
  2021-02-01  8:28 82% ` [PATCH 08/21] lei: keep $lei around until workers are reaped Eric Wong
@ 2021-02-01  8:28 71% ` Eric Wong
  2021-02-01  8:28 65% ` [PATCH 13/21] lei: increase initial timeout Eric Wong
  2021-02-01  8:28 64% ` [PATCH 20/21] lei: avoid ETOOMANYREFS, cleanup imports Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

We don't need to send the temporary xsearch {git} object over to
workers, just the directory name.
---
 lib/PublicInbox/LEI.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 4f7ed171..08554932 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -401,8 +401,9 @@ sub atfork_parent_wq {
 	my ($self, $wq) = @_;
 	my $env = delete $self->{env}; # env is inherited at fork
 	my $lei = bless { %$self }, ref($self);
-	if (my $dedupe = delete $lei->{dedupe}) {
-		$lei->{dedupe} = $wq->deep_clone($dedupe);
+	for my $f (qw(dedupe ovv)) {
+		my $tmp = delete($lei->{$f}) or next;
+		$lei->{$f} = $wq->deep_clone($tmp);
 	}
 	$self->{env} = $env;
 	delete @$lei{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m

^ permalink raw reply related	[relevance 71%]

* [PATCH 06/21] lei: remove syslog dependency
                     ` (2 preceding siblings ...)
  2021-02-01  8:28 28% ` [PATCH 04/21] lei: remove SIGPIPE handler Eric Wong
@ 2021-02-01  8:28 66% ` Eric Wong
  2021-02-01  8:28 82% ` [PATCH 08/21] lei: keep $lei around until workers are reaped Eric Wong
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

It no longer seems necessary now that we redirect and write
messages to errors.log, which gets checked on every run.
---
 lib/PublicInbox/LEI.pm | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 22cd20f6..c0b90451 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -14,10 +14,9 @@ use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
 use Errno qw(EPIPE EAGAIN EINTR ECONNREFUSED ENOENT ECONNRESET);
 use Cwd qw(getcwd);
-use POSIX ();
+use POSIX qw(strftime);
 use IO::Handle ();
 use Fcntl qw(SEEK_SET);
-use Sys::Syslog qw(syslog openlog);
 use PublicInbox::Config;
 use PublicInbox::Syscall qw(SFD_NONBLOCK EPOLLIN EPOLLET);
 use PublicInbox::Sigfd;
@@ -1007,9 +1006,9 @@ sub lazy_start {
 				warn "$path dev/ino changed, quitting\n";
 				$path = undef;
 			}
-		} elsif (defined($path)) {
-			warn "stat($path): $!, quitting ...\n";
-			undef $path; # don't unlink
+		} elsif (defined($path)) { # ENOENT is common
+			warn "stat($path): $!, quitting ...\n" if $! != ENOENT;
+			undef $path;
 			$quit->();
 		}
 		return 1 if defined($path);
@@ -1029,18 +1028,14 @@ sub lazy_start {
 	# STDIN was redirected to /dev/null above, closing STDERR and
 	# STDOUT will cause the calling `lei' client process to finish
 	# reading the <$daemon> pipe.
-	openlog($path, 'pid', 'user');
 	local $SIG{__WARN__} = sub {
-		$current_lei ? err($current_lei, @_) : syslog('warning', "@_");
+		$current_lei ? err($current_lei, @_) : warn(
+		  strftime('%Y-%m-%dT%H:%M:%SZ', gmtime(time))," $$ ", @_);
 	};
-	my $on_destroy = PublicInbox::OnDestroy->new($$, sub {
-		syslog('crit', "$@") if $@;
-	});
 	open STDERR, '>&STDIN' or die "redirect stderr failed: $!";
 	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
 	# $daemon pipe to `lei' closed, main loop begins:
 	PublicInbox::DS->EventLoop;
-	@$on_destroy = (); # cancel on_destroy if we get here
 	exit($exit_code // 0);
 }
 

^ permalink raw reply related	[relevance 66%]
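
The replacement for syslog in the patch above can be illustrated in
isolation.  This is only a sketch, not the lei code: it assumes a
single top-level $SIG{__WARN__} handler and uses a hypothetical helper
named stamp() to prefix each warning with a UTC timestamp and the PID.
The path in the sample warning is made up.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# stamp is a hypothetical helper: prefix a message with a UTC
# timestamp and the current PID, similar in spirit to a syslog line
sub stamp {
	strftime('%Y-%m-%dT%H:%M:%SZ', gmtime(time)) . " $$ " . join('', @_);
}

# Route every warning through stamp(); calling warn() inside the
# handler is safe because Perl disables the hook while it runs.
$SIG{__WARN__} = sub { warn(stamp(@_)) };

warn "stat(/hypothetical/path): No such file or directory\n";
```

Since the result goes to STDERR, it ends up wherever STDERR was
redirected, so no separate logging dependency is needed.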

* [PATCH 01/21] lei: more consistent dedupe and ovv_buf init
  @ 2021-02-01  8:28 57% ` Eric Wong
  2021-02-01  8:28 71% ` [PATCH 03/21] lei: remove per-child SIG{__WARN__} Eric Wong
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

This fixes "--dedupe none" with Maildir where we don't
create the object at all.
---
 lib/PublicInbox/LeiDedupe.pm   |  4 ++--
 lib/PublicInbox/LeiOverview.pm | 18 ++++++++++--------
 lib/PublicInbox/LeiToMail.pm   |  3 +--
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 3f478aa4..e3ae8e33 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -103,8 +103,8 @@ sub new {
 	bless [ $skv, undef, undef, $m ], $cls;
 }
 
-# returns true on unseen messages according to the deduplication strategy,
-# returns false if seen
+# returns true on seen messages according to the deduplication strategy,
+# returns false if unseen
 sub is_dup {
 	my ($self, $eml, $oid) = @_;
 	!$self->[1]->($eml, $oid);
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index c67e2747..fa041457 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -92,13 +92,14 @@ sub new {
 			ovv_out_lk_init($self);
 		}
 	}
-	if (!$json) {
+	if ($json) {
+		$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
+	} else {
 		# default to the cheapest sort since MUA usually resorts
 		$lei->{opt}->{'sort'} //= 'docid' if $dst ne '/dev/stdout';
 		$lei->{l2m} = eval { PublicInbox::LeiToMail->new($lei) };
 		return $lei->fail($@) if $@;
 	}
-	$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
 	$self;
 }
 
@@ -201,15 +202,19 @@ sub _json_pretty {
 
 sub ovv_each_smsg_cb { # runs in wq worker usually
 	my ($self, $lei, $ibxish) = @_;
-	my $json;
+	my ($json, $dedupe);
 	$lei->{1}->autoflush(1);
-	my $dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
 	if (my $pkg = $self->{json}) {
 		$json = $pkg->new;
 		$json->utf8->canonical;
 		$json->ascii(1) if $lei->{opt}->{ascii};
 	}
-	my $l2m = $lei->{l2m} or $dedupe->prepare_dedupe;
+	my $l2m = $lei->{l2m};
+	if (!$l2m) {
+		$dedupe = $lei->{dedupe} // die 'BUG: {dedupe} missing';
+		$dedupe->prepare_dedupe;
+	}
+	$lei->{ovv_buf} = \(my $buf = '') if !$l2m;
 	if ($l2m && !$ibxish) { # remote https?:// mboxrd
 		delete $l2m->{-wq_s1};
 		my $g2m = $l2m->can('git_to_mail');
@@ -241,7 +246,6 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git; # for ovv_atexit_child
 		my $g2m = $l2m->can('git_to_mail');
-		$dedupe->prepare_dedupe;
 		sub {
 			my ($smsg, $mitem) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
@@ -249,7 +253,6 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		};
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
-		$lei->{ovv_buf} = \(my $buf = '');
 		sub { # DIY prettiness :P
 			my ($smsg, $mitem) = @_;
 			return if $dedupe->is_smsg_dup($smsg);
@@ -273,7 +276,6 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		}
 	} elsif ($json) {
 		my $ORS = $self->{fmt} eq 'json' ? ",\n" : "\n"; # JSONL
-		$lei->{ovv_buf} = \(my $buf = '');
 		sub {
 			my ($smsg, $mitem) = @_;
 			return if $dedupe->is_smsg_dup($smsg);
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 61b546b5..244bfb67 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -323,7 +323,7 @@ sub _buf2maildir {
 sub _maildir_write_cb ($$) {
 	my ($self, $lei) = @_;
 	my $dedupe = $lei->{dedupe};
-	$dedupe->prepare_dedupe;
+	$dedupe->prepare_dedupe if $dedupe;
 	my $dst = $lei->{ovv}->{dst};
 	sub { # for git_to_mail
 		my ($buf, $smsg, $eml) = @_;
@@ -464,7 +464,6 @@ sub write_mail { # via ->wq_do
 	my $wcb = $self->{wcb} //= do { # first message
 		my %sig = $lei->atfork_child_wq($self);
 		@SIG{keys %sig} = values %sig; # not local
-		$lei->{dedupe}->prepare_dedupe;
 		$self->write_cb($lei);
 	};
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);

^ permalink raw reply related	[relevance 57%]

* [PATCH 04/21] lei: remove SIGPIPE handler
    2021-02-01  8:28 57% ` [PATCH 01/21] lei: more consistent dedupe and ovv_buf init Eric Wong
  2021-02-01  8:28 71% ` [PATCH 03/21] lei: remove per-child SIG{__WARN__} Eric Wong
@ 2021-02-01  8:28 28% ` Eric Wong
  2021-02-01  8:28 66% ` [PATCH 06/21] lei: remove syslog dependency Eric Wong
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

It doesn't save us any code, and the action-at-a-distance
element was making it confusing to track down actual problems.
Another potential problem was keeping references alive too long.

So do as we would in a C100K server and check every write,
while still ensuring lei(1) exits with a proper SIGPIPE
iff needed.
---
 lib/PublicInbox/IPC.pm         | 10 +++---
 lib/PublicInbox/LEI.pm         | 56 +++++++++++++++++++++-------------
 lib/PublicInbox/LeiExternal.pm |  3 +-
 lib/PublicInbox/LeiOverview.pm | 33 ++++++++------------
 lib/PublicInbox/LeiToMail.pm   | 45 ++++++++++++---------------
 lib/PublicInbox/LeiXSearch.pm  | 17 ++++-------
 t/lei_to_mail.t                | 31 ++++++++++---------
 7 files changed, 96 insertions(+), 99 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 479c4377..172552b9 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -139,8 +139,10 @@ sub ipc_worker_spawn {
 
 sub ipc_worker_reap { # dwaitpid callback
 	my ($self, $pid) = @_;
-	# SIGTERM (15) is our default exit signal
-	warn "PID:$pid died with \$?=$?\n" if $? && ($? & 127) != 15;
+	return if !$?;
+	# TERM(15) is our default exit signal, PIPE(13) is likely w/ pager
+	my $s = $? & 127;
+	warn "PID:$pid died with \$?=$?\n" if $s != 15 && $s != 13;
 }
 
 sub wq_wait_old {
@@ -278,7 +280,7 @@ sub recv_and_run {
 	undef $buf;
 	my $sub = shift @$args;
 	eval { $self->$sub(@$args) };
-	warn "$$ wq_worker: $@" if $@ && ref($@) ne 'PublicInbox::SIGPIPE';
+	warn "$$ wq_worker: $@" if $@;
 	delete @$self{0..($nfd-1)};
 	$n;
 }
@@ -320,7 +322,7 @@ sub wq_do { # always async
 	} else {
 		@$self{0..$#$ios} = @$ios;
 		eval { $self->$sub(@args) };
-		warn "wq_do: $@" if $@ && ref($@) ne 'PublicInbox::SIGPIPE';
+		warn "wq_do: $@" if $@;
 		delete @$self{0..$#$ios}; # don't close
 	}
 }
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ceba16e4..b915bb0c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -12,7 +12,7 @@ use parent qw(PublicInbox::DS PublicInbox::LeiExternal
 	PublicInbox::LeiQuery);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
-use Errno qw(EAGAIN EINTR ECONNREFUSED ENOENT ECONNRESET);
+use Errno qw(EPIPE EAGAIN EINTR ECONNREFUSED ENOENT ECONNRESET);
 use Cwd qw(getcwd);
 use POSIX ();
 use IO::Handle ();
@@ -277,7 +277,11 @@ sub x_it ($$) {
 	dump_and_clear_log();
 	if (my $sock = $self->{sock}) {
 		send($sock, "x_it $code", MSG_EOR);
-	} elsif (!($code & 127)) { # oneshot, ignore signals
+	} elsif (my $signum = ($code & 127)) { # oneshot, usually SIGPIPE (13)
+		$SIG{PIPE} = 'DEFAULT'; # $SIG{$signum} doesn't work
+		kill $signum, $$;
+		sleep; # wait for signal
+	} else { # oneshot
 		# don't want to end up using $? from child processes
 		for my $f (qw(lxs l2m)) {
 			my $wq = delete $self->{$f} or next;
@@ -287,14 +291,15 @@ sub x_it ($$) {
 	}
 }
 
-sub puts ($;@) { print { shift->{1} } map { "$_\n" } @_ }
-
-sub out ($;@) { print { shift->{1} } @_ }
-
 sub err ($;@) {
 	my $self = shift;
-	my $err = $self->{2} // ($self->{pgr} // [])->[2] // *STDERR{IO};
-	print $err @_, (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
+	my $err = $self->{2} // ($self->{pgr} // [])->[2] // *STDERR{GLOB};
+	my $eor = (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
+	print $err @_, $eor and return;
+	my $old_err = delete $self->{2};
+	close($old_err) if $! == EPIPE && $old_err;;
+	$err = $self->{2} = ($self->{pgr} // [])->[2] // *STDERR{GLOB};
+	print $err @_, $eor or print STDERR @_, $eor;
 }
 
 sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
@@ -306,6 +311,17 @@ sub fail ($$;$) {
 	undef;
 }
 
+sub out ($;@) {
+	my $self = shift;
+	return if print { $self->{1} // return } @_; # likely
+	return note_sigpipe($self, 1) if $! == EPIPE;
+	my $err = "error writing to stdout: $!";
+	delete $self->{1};
+	fail($self, $err);
+}
+
+sub puts ($;@) { out(shift, map { "$_\n" } @_) }
+
 sub child_error { # passes non-fatal curl exit codes to user
 	my ($self, $child_error) = @_; # child_error is $?
 	if (my $sock = $self->{sock}) { # send to lei(1) client
@@ -350,27 +366,23 @@ sub io_restore ($$) {
 	}
 }
 
-# usage: my %sig = $lei->atfork_child_wq($wq);
-#	 local @SIG{keys %sig} = values %sig;
+# triggers sigpipe_handler
+sub note_sigpipe {
+	my ($self, $fd) = @_;
+	close(delete($self->{$fd})); # explicit close silences Perl warning
+	syswrite($self->{op_pipe}, '!') if $self->{op_pipe};
+	x_it($self, 13);
+}
+
 sub atfork_child_wq {
 	my ($self, $wq) = @_;
 	io_restore($self, $wq);
+	-p $self->{op_pipe} or die 'BUG: {op_pipe} expected';
 	io_restore($self->{l2m}, $wq);
 	%PATH2CFG = ();
 	undef $errors_log;
 	$quit = \&CORE::exit;
-	(PIPE => sub {
-		$self->x_it(13); # SIGPIPE = 13
-		# we need to close explicitly to avoid Perl warning on SIGPIPE
-		for my $i (1, 2) {
-			next unless $self->{$i} && (-p $self->{$i} || -S _);
-			close(delete $self->{$i});
-		}
-		# trigger the LeiXSearch $done OpPipe:
-		syswrite($self->{op_pipe}, '!') if $self->{op_pipe};
-		$SIG{PIPE} = 'DEFAULT';
-		die bless(\"$_[0]", 'PublicInbox::SIGPIPE'),
-	});
+	$current_lei = $self; # for SIG{__WARN__}
 }
 
 sub io_extract ($;@) {
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index bf07c41c..b1176824 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -31,11 +31,10 @@ sub _externals_each {
 
 sub lei_ls_external {
 	my ($self, @argv) = @_;
-	my $out = $self->{1};
 	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
 	$self->_externals_each(sub {
 		my ($loc, $boost_val) = @_;
-		print $out $loc, $OFS, 'boost=', $boost_val, $ORS;
+		$self->out($loc, $OFS, 'boost=', $boost_val, $ORS);
 	});
 }
 
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index fa041457..1d62ffe2 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -107,28 +107,22 @@ sub new {
 sub ovv_begin {
 	my ($self, $lei) = @_;
 	if ($self->{fmt} eq 'json') {
-		print { $lei->{1} } '[';
+		$lei->out('[');
 	} # TODO HTML/Atom/...
 }
 
 # called once by parent (via PublicInbox::EOFpipe)
 sub ovv_end {
 	my ($self, $lei) = @_;
-	my $out = $lei->{1} or return;
 	if ($self->{fmt} eq 'json') {
 		# JSON doesn't allow trailing commas, and preventing
 		# trailing commas is a PITA when parallelizing outputs
-		print $out "null]\n";
+		$lei->out("null]\n");
 	} elsif ($self->{fmt} eq 'concatjson') {
-		print $out "\n";
+		$lei->out("\n");
 	}
 }
 
-sub ovv_atfork_child {
-	my ($self) = @_;
-	# reopen dedupe here
-}
-
 # prepares an smsg for JSON
 sub _unbless_smsg {
 	my ($smsg, $mitem) = @_;
@@ -168,9 +162,8 @@ sub ovv_atexit_child {
 		$git->async_wait_all;
 	}
 	if (my $bref = delete $lei->{ovv_buf}) {
-		my $out = $lei->{1} or return;
 		my $lk = $self->lock_for_scope;
-		print $out $$bref;
+		$lei->out($$bref);
 	}
 }
 
@@ -268,11 +261,10 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 				}
 			} sort keys %$smsg);
 			$buf .= $EOR;
-			if (length($buf) > 65536) {
-				my $lk = $self->lock_for_scope;
-				print { $lei->{1} } $buf;
-				$buf = '';
-			}
+			return if length($buf) < 65536;
+			my $lk = $self->lock_for_scope;
+			$lei->out($buf);
+			$buf = '';
 		}
 	} elsif ($json) {
 		my $ORS = $self->{fmt} eq 'json' ? ",\n" : "\n"; # JSONL
@@ -280,11 +272,10 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			my ($smsg, $mitem) = @_;
 			return if $dedupe->is_smsg_dup($smsg);
 			$buf .= $json->encode(_unbless_smsg(@_)) . $ORS;
-			if (length($buf) > 65536) {
-				my $lk = $self->lock_for_scope;
-				print { $lei->{1} } $buf;
-				$buf = '';
-			}
+			return if length($buf) < 65536;
+			my $lk = $self->lock_for_scope;
+			$lei->out($buf);
+			$buf = '';
 		}
 	} # else { ...
 }
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 1f6c2a3b..01e7cec5 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -17,7 +17,7 @@ use PublicInbox::GitAsyncCat;
 use Symbol qw(gensym);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
-use Errno qw(EEXIST ESPIPE ENOENT);
+use Errno qw(EEXIST ESPIPE ENOENT EPIPE);
 
 # struggles with short-lived repos, Gcf2Client makes little sense with lei;
 # but we may use in-process libgit2 in the future.
@@ -68,14 +68,16 @@ sub _mbox_hdr_buf ($$$) {
 }
 
 sub atomic_append { # for on-disk destinations (O_APPEND, or O_EXCL)
-	my ($fh, $buf) = @_;
-	defined(my $w = syswrite($fh, $$buf)) or die "write: $!";
-	$w == length($$buf) or die "short write: $w != ".length($$buf);
-}
-
-sub _print_full {
-	my ($fh, $buf) = @_;
-	print $fh $$buf or die "print: $!";
+	my ($lei, $buf) = @_;
+	if (defined(my $w = syswrite($lei->{1} // return, $$buf))) {
+		return if $w == length($$buf);
+		$buf = "short atomic write: $w != ".length($$buf);
+	} elsif ($! == EPIPE) {
+		return $lei->note_sigpipe(1);
+	} else {
+		$buf = "atomic write: $!";
+	}
+	$lei->fail($buf);
 }
 
 sub eml2mboxrd ($;$) {
@@ -248,24 +250,19 @@ sub _mbox_write_cb ($$) {
 	my $ovv = $lei->{ovv};
 	my $m = 'eml2'.$ovv->{fmt};
 	my $eml2mbox = $self->can($m) or die "$self->$m missing";
-	my $out = $lei->{1} // die "no stdout ($m, $ovv->{dst})"; # redirected earlier
-	$out->autoflush(1);
-	my $write = $ovv->{lock_path} ? \&_print_full : \&atomic_append;
+	$lei->{1} // die "no stdout ($m, $ovv->{dst})"; # redirected earlier
+	$lei->{1}->autoflush(1);
+	my $atomic_append = !defined($ovv->{lock_path});
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe;
 	sub { # for git_to_mail
 		my ($buf, $smsg, $eml) = @_;
-		return unless $out;
 		$eml //= PublicInbox::Eml->new($buf);
-		if (!$dedupe->is_dup($eml, $smsg->{blob})) {
-			$buf = $eml2mbox->($eml, $smsg);
-			my $lk = $ovv->lock_for_scope;
-			eval { $write->($out, $buf) };
-			if ($@) {
-				die $@ if ref($@) ne 'PublicInbox::SIGPIPE';
-				undef $out
-			}
-		}
+		return if $dedupe->is_dup($eml, $smsg->{blob});
+		$buf = $eml2mbox->($eml, $smsg);
+		return atomic_append($lei, $buf) if $atomic_append;
+		my $lk = $ovv->lock_for_scope;
+		$lei->out($$buf);
 	}
 }
 
@@ -467,8 +464,7 @@ sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $smsg, $lei) = @_;
 	my $not_done = delete $self->{$lei->{each_smsg_not_done}};
 	my $wcb = $self->{wcb} //= do { # first message
-		my %sig = $lei->atfork_child_wq($self);
-		@SIG{keys %sig} = values %sig; # not local
+		$lei->atfork_child_wq($self);
 		$self->write_cb($lei);
 	};
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
@@ -483,7 +479,6 @@ sub wq_atexit_child {
 		$git->async_wait_all;
 	}
 	$SIG{__WARN__} = 'DEFAULT';
-	$SIG{PIPE} = 'DEFAULT';
 }
 
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e69b637c..de82a7da 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -109,8 +109,7 @@ sub wait_startq ($) {
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
-	my %sig = $lei->atfork_child_wq($self);
-	local @SIG{keys %sig} = values %sig;
+	$lei->atfork_child_wq($self);
 	my $startq = delete $lei->{startq};
 
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
@@ -145,8 +144,7 @@ sub query_thread_mset { # for --thread
 sub query_mset { # non-parallel for non-"--thread" users
 	my ($self, $lei) = @_;
 	local $0 = "$0 query_mset";
-	my %sig = $lei->atfork_child_wq($self);
-	local @SIG{keys %sig} = values %sig;
+	$lei->atfork_child_wq($self);
 	my $startq = delete $lei->{startq};
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
@@ -187,8 +185,7 @@ sub kill_reap {
 sub query_remote_mboxrd {
 	my ($self, $lei, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
-	my %sig = $lei->atfork_child_wq($self); # keep $self->{5} startq
-	local @SIG{keys %sig} = values %sig;
+	$lei->atfork_child_wq($self);
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
@@ -351,9 +348,7 @@ sub start_query { # always runs in main (lei-daemon) process
 sub query_prepare { # called by wq_do
 	my ($self, $lei) = @_;
 	local $0 = "$0 query_prepare";
-	my %sig = $lei->atfork_child_wq($self);
-	-p $lei->{op_pipe} or die "BUG: \$done pipe expected";
-	local @SIG{keys %sig} = values %sig;
+	$lei->atfork_child_wq($self);
 	delete $lei->{l2m}->{-wq_s1};
 	eval { $lei->{l2m}->do_augment($lei) };
 	$lei->fail($@) if $@;
@@ -363,11 +358,11 @@ sub query_prepare { # called by wq_do
 sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
 	my ($lei) = @_;
 	my $lxs = delete $lei->{lxs};
-	if ($lxs && $lxs->wq_kill_old) {
-		kill 'PIPE', $$;
+	if ($lxs && $lxs->wq_kill_old) { # is this the daemon?
 		$lxs->wq_wait_old;
 	}
 	close(delete $lei->{1}) if $lei->{1};
+	$lei->x_it(13);
 }
 
 sub do_query {
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 47c0e3d4..f7535687 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -12,6 +12,7 @@ use List::Util qw(shuffle);
 require_mods(qw(DBD::SQLite));
 require PublicInbox::MboxReader;
 require PublicInbox::LeiOverview;
+require PublicInbox::LEI;
 use_ok 'PublicInbox::LeiToMail';
 my $from = "Content-Length: 10\nSubject: x\n\nFrom hell\n";
 my $noeol = "Subject: x\n\nFrom hell";
@@ -73,7 +74,11 @@ for my $mbox (@MBOX) {
 my ($tmpdir, $for_destroy) = tmpdir();
 local $ENV{TMPDIR} = $tmpdir;
 open my $err, '>>', "$tmpdir/lei.err" or BAIL_OUT $!;
-my $lei = { 2 => $err };
+my $lei = bless { 2 => $err }, 'PublicInbox::LEI';
+my $commit = sub {
+	$_[0] = undef; # wcb
+	delete $lei->{1};
+};
 my $buf = <<'EOM';
 From: x@example.com
 Subject: x
@@ -98,9 +103,7 @@ my $wcb_get = sub {
 	my $zpipe = $l2m->pre_augment($lei);
 	$l2m->do_augment($lei);
 	$l2m->post_augment($lei, $zpipe);
-	my $cb = $l2m->write_cb($lei);
-	delete $lei->{1};
-	$cb;
+	$l2m->write_cb($lei);
 };
 
 my $deadbeef = { blob => 'deadbeef', kw => [ qw(seen) ] };
@@ -109,7 +112,7 @@ my $orig = do {
 	is(ref $wcb, 'CODE', 'write_cb returned callback');
 	ok(-f $fn && !-s _, 'empty file created');
 	$wcb->(\(my $dup = $buf), $deadbeef);
-	undef $wcb;
+	$commit->($wcb);
 	open my $fh, '<', $fn or BAIL_OUT $!;
 	my $raw = do { local $/; <$fh> };
 	like($raw, qr/^blah\n/sm, 'wrote content');
@@ -119,7 +122,7 @@ my $orig = do {
 	$wcb = $wcb_get->($mbox, $fn);
 	ok(-f $fn && !-s _, 'truncated mbox destination');
 	$wcb->(\($dup = $buf), $deadbeef);
-	undef $wcb;
+	$commit->($wcb);
 	open $fh, '<', $fn or BAIL_OUT $!;
 	is(do { local $/; <$fh> }, $raw, 'jobs > 1');
 	$raw;
@@ -134,7 +137,7 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 		my $f = "$fn.$zsfx";
 		my $wcb = $wcb_get->($mbox, $f);
 		$wcb->(\(my $dup = $buf), $deadbeef);
-		undef $wcb;
+		$commit->($wcb);
 		my $uncompressed = xqx([@$dc_cmd, $f]);
 		is($uncompressed, $orig, "$zsfx works unlocked");
 
@@ -142,13 +145,13 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 		unlink $f or BAIL_OUT "unlink $!";
 		$wcb = $wcb_get->($mbox, $f);
 		$wcb->(\($dup = $buf), $deadbeef);
-		undef $wcb;
+		$commit->($wcb);
 		is(xqx([@$dc_cmd, $f]), $orig, "$zsfx matches with lock");
 
 		local $lei->{opt} = { augment => 1 };
 		$wcb = $wcb_get->($mbox, $f);
 		$wcb->(\($dup = $buf . "\nx\n"), $deadbeef);
-		undef $wcb; # commit
+		$commit->($wcb);
 
 		my $cat = popen_rd([@$dc_cmd, $f]);
 		my @raw;
@@ -160,7 +163,7 @@ for my $zsfx (qw(gz bz2 xz)) { # XXX should we support zst, zz, lzo, lzma?
 		local $lei->{opt} = { augment => 1, jobs => 2 };
 		$wcb = $wcb_get->($mbox, $f);
 		$wcb->(\($dup = $buf . "\ny\n"), $deadbeef);
-		undef $wcb; # commit
+		$commit->($wcb);
 
 		my @raw3;
 		$cat = popen_rd([@$dc_cmd, $f]);
@@ -183,7 +186,7 @@ if ('default deduplication uses content_hash') {
 	my $wcb = $wcb_get->('mboxo', $fn);
 	$deadbeef->{kw} = [];
 	$wcb->(\(my $x = $buf), $deadbeef) for (1..2);
-	undef $wcb; # undef to commit changes
+	$commit->($wcb);
 	my $cmp = '';
 	open my $fh, '<', $fn or BAIL_OUT $!;
 	PublicInbox::MboxReader->mboxo($fh, sub { $cmp .= $as_orig->(@_) });
@@ -192,7 +195,7 @@ if ('default deduplication uses content_hash') {
 	local $lei->{opt} = { augment => 1 };
 	$wcb = $wcb_get->('mboxo', $fn);
 	$wcb->(\($x = $buf . "\nx\n"), $deadbeef) for (1..2);
-	undef $wcb; # undef to commit changes
+	$commit->($wcb);
 	open $fh, '<', $fn or BAIL_OUT $!;
 	my @x;
 	PublicInbox::MboxReader->mboxo($fh, sub { push @x, $as_orig->(@_) });
@@ -206,7 +209,7 @@ if ('default deduplication uses content_hash') {
 	local $lei->{1} = $tmp;
 	my $wcb = $wcb_get->('mboxrd', '/dev/stdout');
 	$wcb->(\(my $x = $buf), $deadbeef);
-	undef $wcb; # commit
+	$commit->($wcb);
 	seek($tmp, 0, SEEK_SET) or BAIL_OUT $!;
 	my $cmp = '';
 	PublicInbox::MboxReader->mboxrd($tmp, sub { $cmp .= $as_orig->(@_) });
@@ -220,7 +223,7 @@ SKIP: { # FIFO support
 	my $cat = popen_rd([which('cat'), $fn]);
 	my $wcb = $wcb_get->('mboxo', $fn);
 	$wcb->(\(my $x = $buf), $deadbeef);
-	undef $wcb; # commit
+	$commit->($wcb);
 	my $cmp = '';
 	PublicInbox::MboxReader->mboxo($cat, sub { $cmp .= $as_orig->(@_) });
 	is($cmp, $buf, 'message written to FIFO');

^ permalink raw reply related	[relevance 28%]
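
A minimal sketch of the "check every write" approach from the commit
message above, not the lei implementation itself.  It assumes SIGPIPE
is ignored, so that writing to a closed pipe fails with EPIPE and the
caller decides what to do next.  The helper name checked_write is
hypothetical.

```perl
use strict;
use warnings;
use Errno qw(EPIPE);

# Ignore SIGPIPE so a write to a closed pipe fails with EPIPE
# instead of killing the process outright.
$SIG{PIPE} = 'IGNORE';

# checked_write is a hypothetical helper, not part of PublicInbox::LEI.
# Returns 'ok' on a full write, 'epipe' if the reader went away,
# and dies on any other error or on a short write.
sub checked_write {
	my ($fh, $data) = @_;
	my $n = syswrite($fh, $data);
	if (defined $n) {
		return 'ok' if $n == length($data);
		die "short write: $n != ".length($data)."\n";
	}
	return 'epipe' if $! == EPIPE;
	die "write: $!\n";
}

pipe(my $r, my $w) or die "pipe: $!";
close($r) or die "close: $!"; # simulate a pager that quit early
print checked_write($w, "hello\n"), "\n";
```

Handling the error at the call site keeps the control flow local,
which is the point the commit message makes about avoiding
action-at-a-distance from a signal handler.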

* [PATCH 13/21] lei: increase initial timeout
                     ` (5 preceding siblings ...)
  2021-02-01  8:28 71% ` [PATCH 11/21] lei: deep clone {ovv} for l2m workers Eric Wong
@ 2021-02-01  8:28 65% ` Eric Wong
  2021-02-01  8:28 64% ` [PATCH 20/21] lei: avoid ETOOMANYREFS, cleanup imports Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

PublicInbox::Listener unconditionally sets O_NONBLOCK upon
accept(), so we need a larger timeout under heavy load since
there's no "dataready" accept filter on the listener.

With O_NONBLOCK already set, we don't have to set it in
->event_step_init.
---
 lib/PublicInbox/LEI.pm | 7 ++++---
 script/lei             | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 08554932..e2f22a75 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -824,7 +824,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 	$sock->autoflush(1);
 	my $self = bless { sock => $sock }, __PACKAGE__;
 	vec(my $rvec = '', fileno($sock), 1) = 1;
-	select($rvec, undef, undef, 1) or
+	select($rvec, undef, undef, 60) or
 		return send($sock, 'timed out waiting to recv FDs', MSG_EOR);
 	my @fds = $recv_cmd->($sock, my $buf, 4096 * 33); # >MAX_ARG_STRLEN
 	if (scalar(@fds) == 4) {
@@ -834,7 +834,9 @@ sub accept_dispatch { # Listener {post_accept} callback
 			send($sock, "open(+<&=$fd) (FD=$i): $!", MSG_EOR);
 		}
 	} else {
-		return send($sock, "recv_cmd failed: $!", MSG_EOR);
+		my $msg = "recv_cmd failed: $!";
+		warn $msg;
+		return send($sock, $msg, MSG_EOR);
 	}
 	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	# $ENV_STR = join('', map { "\0$_=$ENV{$_}" } keys %ENV);
@@ -898,7 +900,6 @@ sub event_step {
 sub event_step_init {
 	my ($self) = @_;
 	if (my $sock = $self->{sock}) { # using DS->EventLoop
-		$sock->blocking(0);
 		$self->SUPER::new($sock, EPOLLIN|EPOLLET);
 	}
 }
diff --git a/script/lei b/script/lei
index 006c1180..f92dd302 100755
--- a/script/lei
+++ b/script/lei
@@ -79,7 +79,8 @@ Falling back to (slow) one-shot mode
 	my $buf = join("\0", scalar(@ARGV), @ARGV);
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
-	$send_cmd->($sock, [ 0, 1, 2, fileno($dh) ], $buf, MSG_EOR);
+	$send_cmd->($sock, [ 0, 1, 2, fileno($dh) ], $buf, MSG_EOR) or
+		die "sendmsg: $!";
 	my $x_it_code = 0;
 	while (1) {
 		my (@fds) = $recv_cmd->($sock, $buf, 4096 * 33);

^ permalink raw reply related	[relevance 65%]
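
The select() timeout that the patch above raises from 1 to 60 seconds
can be tried outside of lei.  This is a sketch under assumptions: it
uses a stream socketpair instead of the SOCK_SEQPACKET listener, and
wait_readable is a hypothetical helper name.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# wait_readable is a hypothetical helper: true if $fh becomes
# readable within $timeout seconds, false on timeout or error
sub wait_readable {
	my ($fh, $timeout) = @_;
	vec(my $rvec = '', fileno($fh), 1) = 1;
	my $n = select($rvec, undef, undef, $timeout);
	$n > 0;
}

socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
	or die "socketpair: $!";

# nothing sent yet, so a short wait runs into the timeout
print "before send: ",
	(wait_readable($server, 0.1) ? 'ready' : 'timed out'), "\n";

syswrite($client, "request") // die "syswrite: $!";

# data is pending, so this returns without waiting the full 60s
print "after send: ",
	(wait_readable($server, 60) ? 'ready' : 'timed out'), "\n";
```

A larger timeout only costs time when the peer never sends anything;
a client that sends promptly is not slowed down by it.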

* [PATCH 20/21] lei: avoid ETOOMANYREFS, cleanup imports
                     ` (6 preceding siblings ...)
  2021-02-01  8:28 65% ` [PATCH 13/21] lei: increase initial timeout Eric Wong
@ 2021-02-01  8:28 64% ` Eric Wong
  7 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01  8:28 UTC (permalink / raw)
  To: meta

As with PublicInbox::IPC, we'll attempt to bump RLIMIT_NOFILE
and transparently work around ETOOMANYREFS.  If that fails,
we'll give the user a hint to bump RLIMIT_NOFILE since
ETOOMANYREFS is an uncommon error which users may be unfamiliar
with.

Found while stress testing for segfaults.
---
 script/lei | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/script/lei b/script/lei
index f92dd302..58f0dbe9 100755
--- a/script/lei
+++ b/script/lei
@@ -4,10 +4,9 @@
 use strict;
 use v5.10.1;
 use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR pack_sockaddr_un);
-use Errno qw(EINTR ECONNRESET);
 use PublicInbox::CmdIPC4;
 my $narg = 5;
-my ($sock, $pwd);
+my $sock;
 my $recv_cmd = PublicInbox::CmdIPC4->can('recv_cmd4');
 my $send_cmd = PublicInbox::CmdIPC4->can('send_cmd4') // do {
 	require PublicInbox::Spawn; # takes ~50ms even if built *sigh*
@@ -73,20 +72,32 @@ connect($path): $! (after attempted daemon start)
 Falling back to (slow) one-shot mode
 
 	}
-	1;
-}) { # (Socket::MsgHdr|Inline::C), $sock, $pwd are all available:
+	# (Socket::MsgHdr|Inline::C), $sock are all available:
 	open my $dh, '<', '.' or die "open(.) $!";
 	my $buf = join("\0", scalar(@ARGV), @ARGV);
 	while (my ($k, $v) = each %ENV) { $buf .= "\0$k=$v" }
 	$buf .= "\0\0";
-	$send_cmd->($sock, [ 0, 1, 2, fileno($dh) ], $buf, MSG_EOR) or
-		die "sendmsg: $!";
+	my $n = $send_cmd->($sock, [0, 1, 2, fileno($dh)], $buf, MSG_EOR);
+	if (!$n && $!{ETOOMANYREFS} && eval { require BSD::Resource }) {
+		my $NOFILE = BSD::Resource::RLIMIT_NOFILE();
+		my ($s, $h) = BSD::Resource::getrlimit($NOFILE);
+		if ($s < $h && BSD::Resource::setrlimit($NOFILE, $h, $h)) {
+			$n = $send_cmd->($sock, [0, 1, 2, fileno($dh)],
+					$buf, MSG_EOR);
+		}
+	}
+	if (!$n) {
+		die "sendmsg: $! (check RLIMIT_NOFILE)\n" if $!{ETOOMANYREFS};
+		die "sendmsg: $!\n";
+	}
+	1;
+}) { # connected and request sent to lei-daemon, wait for responses or EOF
 	my $x_it_code = 0;
 	while (1) {
-		my (@fds) = $recv_cmd->($sock, $buf, 4096 * 33);
+		my (@fds) = $recv_cmd->($sock, my $buf, 4096 * 33);
 		if (scalar(@fds) == 1 && !defined($fds[0])) {
-			last if $! == ECONNRESET;
-			next if $! == EINTR;
+			next if $!{EINTR};
+			last if $!{ECONNRESET};
 			die "recvmsg: $!";
 		}
 		last if $buf eq '';

^ permalink raw reply related	[relevance 64%]
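
The retry logic in the patch above follows a general shape: try, fix
the likely cause, try once more.  The sketch below isolates that
shape.  The names try_fix_retry and bump_nofile are hypothetical, the
send step is a stub rather than a real sendmsg, and BSD::Resource is
treated as optional, as in the patch.

```perl
use strict;
use warnings;

# try_fix_retry is a hypothetical helper: run $try; if it returns
# false and $fix succeeds, run $try exactly once more.
sub try_fix_retry {
	my ($try, $fix) = @_;
	my $n = $try->();
	$n = $try->() if !$n && $fix->();
	$n;
}

# Raise the soft RLIMIT_NOFILE to the hard limit.  Returns false if
# BSD::Resource is not installed or the limit cannot be raised.
sub bump_nofile {
	eval { require BSD::Resource } or return 0;
	my $NOFILE = BSD::Resource::RLIMIT_NOFILE();
	my ($soft, $hard) = BSD::Resource::getrlimit($NOFILE);
	$soft < $hard && BSD::Resource::setrlimit($NOFILE, $hard, $hard);
}

# stub standing in for a send that fails on the first attempt
my $attempts = 0;
my $sent = try_fix_retry(sub { ++$attempts > 1 }, sub { 1 });
print "attempts=$attempts sent=", ($sent ? 1 : 0), "\n";
```

In real use, bump_nofile would be passed as the $fix callback, and
the caller would still report the original error if the retry fails.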

* Re: [PATCH 2/2] doc: add lei-overview(7)
  2021-02-01  6:40 71%   ` Eric Wong
@ 2021-02-01 11:37 71%     ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-01 11:37 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Kyle Meyer <kyle@kyleam.com> wrote:
> > +=item $ lei q -t -o t.mbox --format mboxrd --mua=mutt s:lei s:skeleton
> > 
> > +Write mboxrd-formatted results to t.mbox and enter mutt to view the
> > +file by invoking C<mutt -f %f>.
> 
> Thanks for this series.  I'll take a closer look later (or
> tomorrow)

It seems fine, pushed as commit e49cf9c629c0fd3024bdb63b5c5e84b590814c4e
Thanks again

> mutt actually uses mboxcl2, so it's probably better to use
> mboxcl2 in examples involving mutt.  I would also prefer "-f" in
> examples if the rest of the args are using short switches.
> 
> No need to resend just for that, I can fix up locally before
> pushing.

diff --git a/Documentation/lei-overview.pod b/Documentation/lei-overview.pod
index 988896ce..d1903045 100644
--- a/Documentation/lei-overview.pod
+++ b/Documentation/lei-overview.pod
@@ -51,9 +51,9 @@ Search for messages whose subject includes "lei" and "skeleton".
 Do the same, but also report unmatched messages that are in the same
 thread as a matched message.
 
-=item $ lei q -t -o t.mbox --format mboxrd --mua=mutt s:lei s:skeleton
+=item $ lei q -t -o t.mbox -f mboxcl2 --mua=mutt s:lei s:skeleton
 
-Write mboxrd-formatted results to t.mbox and enter mutt to view the
+Write mboxcl2-formatted results to t.mbox and enter mutt to view the
 file by invoking C<mutt -f %f>.
 
 =back

^ permalink raw reply related	[relevance 71%]

* can lei require Inline::C?
@ 2021-02-02 10:09 71% Eric Wong
  2021-02-03  0:02 71% ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-02 10:09 UTC (permalink / raw)
  To: meta

Performance and interactivity suck without being able to use FD
passing (especially tab completions).  And having to maintain
separate code paths is a huge time sink...

Inline::C is packaged by every relevant distro (unlike
Socket::Msghdr), and I figure anybody who uses lei at this stage
will have a C compiler...

It would let us use io-uring and maybe some other things
more easily, too (and I miss hacking in C).

^ permalink raw reply	[relevance 71%]

* [PATCH 00/16] lei: -I/--include and more
@ 2021-02-02 11:46 66% Eric Wong
  2021-02-02 11:46 42% ` [PATCH 01/16] lei: switch to use SEQPACKET socketpair instead of pipe Eric Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

We're further embracing SOCK_SEQPACKET for progress reporting.
There are numerous cleanups for the oneshot case, but that's still
using worker processes.  Worker-less oneshot seems pretty broken
at the moment, but 16/16 will let us work on it more easily.

Eric Wong (16):
  lei: switch to use SEQPACKET socketpair instead of pipe
  lei_query: default to 10000 messages as documented
  lei q: emit progress and counting via PktOp
  lei q: support --only, --include and --exclude
  lei: complete: do not complete non-arg options w/ help text
  lei: q: shell completion for --(include|exclude|only)
  lei_xsearch: truncate curl stderr after reading it
  lib: explicitly distinguish oneshot use
  lei q: do not leave temporary files after oneshot exit
  cmd_ipc4: fix comments and formatting
  pktop: fix potential undefined var
  lei_xsearch: ensure curl.err and tail(1) cleanup happens
  doc: lei-q: note "-a" and link to Xapian QueryParser
  lei_overview: avoid unnecessary {l2m} delete
  lei q: tidy up progress reporting
  lei q: support --jobs [SEARCHERS],[WRITERS]

 Documentation/lei-q.pod        |  5 +-
 MANIFEST                       |  2 +-
 lib/PublicInbox/CmdIPC4.pm     |  7 ++-
 lib/PublicInbox/IPC.pm         | 42 +++++++++++++----
 lib/PublicInbox/LEI.pm         | 60 +++++++++++++++---------
 lib/PublicInbox/LeiExternal.pm | 12 ++---
 lib/PublicInbox/LeiOverview.pm | 15 +++---
 lib/PublicInbox/LeiQuery.pm    | 77 ++++++++++++++++++++++++-------
 lib/PublicInbox/LeiXSearch.pm  | 83 ++++++++++++++++++++++++----------
 lib/PublicInbox/OpPipe.pm      | 41 -----------------
 lib/PublicInbox/PktOp.pm       | 69 ++++++++++++++++++++++++++++
 lib/PublicInbox/V2Writable.pm  | 22 +--------
 t/lei.t                        | 14 ++++--
 t/lei_external.t               |  2 +-
 xt/lei-sigpipe.t               | 29 ++++++++++--
 15 files changed, 318 insertions(+), 162 deletions(-)
 delete mode 100644 lib/PublicInbox/OpPipe.pm
 create mode 100644 lib/PublicInbox/PktOp.pm


^ permalink raw reply	[relevance 66%]

* [PATCH 04/16] lei q: support --only, --include and --exclude
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
  2021-02-02 11:46 42% ` [PATCH 01/16] lei: switch to use SEQPACKET socketpair instead of pipe Eric Wong
  2021-02-02 11:46 32% ` [PATCH 03/16] lei q: emit progress and counting via PktOp Eric Wong
@ 2021-02-02 11:46 51% ` Eric Wong
  2021-02-02 11:46 71% ` [PATCH 05/16] lei: complete: do not complete non-arg options w/ help text Eric Wong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

-I is short for --include since it's standard for C compilers
(along with Perl and Ruby).  There are no single-character
shortcuts for --exclude or --only, since I don't expect
--exclude to be used very often and --only is already short (and
will support shell completion).
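
A minimal sketch of how the three switches might combine, assuming
the precedence visible in the diff below: --only replaces the
configured externals entirely, while --include adds one-off locations
and --exclude filters the configured ones.  select_sources() and the
example paths are hypothetical; canonicalization and the local store
are left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of lei): decide which locations a
# query would search, given the configured externals and the values
# of --only, --include (-I) and --exclude.
sub select_sources {
	my ($configured, $opt) = @_;
	my @only = @{$opt->{only} // []};
	return @only if @only; # --only ignores everything else
	my %exclude = map { $_ => 1 } @{$opt->{exclude} // []};
	my @ret = @{$opt->{include} // []}; # one-off additions
	push @ret, grep { !$exclude{$_} } @$configured;
	@ret;
}

my @configured = qw(/srv/inbox-a /srv/inbox-b https://example.com/inbox-c/);
print join(' ', select_sources(\@configured,
	{ exclude => [ '/srv/inbox-b' ], include => [ '/tmp/extra' ] })), "\n";
print join(' ', select_sources(\@configured,
	{ only => [ '/srv/inbox-a' ] })), "\n";
```

Note that in this sketch --exclude only filters configured externals,
it does not remove a location given via --include.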
---
 lib/PublicInbox/LEI.pm         |  1 +
 lib/PublicInbox/LeiExternal.pm | 12 +++++-----
 lib/PublicInbox/LeiQuery.pm    | 42 ++++++++++++++++++++++++----------
 t/lei_external.t               |  2 +-
 4 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 6c2515dc..ffbc2503 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -104,6 +104,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
+	include|I=s@ exclude=s@ only=s@
 	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index b1176824..3853cfc1 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -9,7 +9,7 @@ use parent qw(Exporter);
 our @EXPORT = qw(lei_ls_external lei_add_external lei_forget_external);
 use PublicInbox::Config;
 
-sub _externals_each {
+sub externals_each {
 	my ($self, $cb, @arg) = @_;
 	my $cfg = $self->_lei_cfg(0);
 	my %boost;
@@ -32,14 +32,14 @@ sub _externals_each {
 sub lei_ls_external {
 	my ($self, @argv) = @_;
 	my ($OFS, $ORS) = $self->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
-	$self->_externals_each(sub {
+	externals_each($self, sub {
 		my ($loc, $boost_val) = @_;
 		$self->out($loc, $OFS, 'boost=', $boost_val, $ORS);
 	});
 }
 
-sub _canonicalize {
-	my ($location) = @_;
+sub ext_canonicalize {
+	my ($location) = $_[-1];
 	if ($location !~ m!\Ahttps?://!) {
 		PublicInbox::Config::rel2abs_collapsed($location);
 	} else {
@@ -56,7 +56,7 @@ sub lei_add_external {
 	my ($self, $location) = @_;
 	my $cfg = $self->_lei_cfg(1);
 	my $new_boost = $self->{opt}->{boost} // 0;
-	$location = _canonicalize($location);
+	$location = ext_canonicalize($location);
 	if ($location !~ m!\Ahttps?://! && !-d $location) {
 		return $self->fail("$location not a directory");
 	}
@@ -74,7 +74,7 @@ sub lei_forget_external {
 	my %seen;
 	for my $loc (@locations) {
 		my (@unset, @not_found);
-		for my $l ($loc, _canonicalize($loc)) {
+		for my $l ($loc, ext_canonicalize($loc)) {
 			next if $seen{$l}++;
 			my $key = "external.$l.boost";
 			delete($cfg->{$key});
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index dea04c13..fd8a3bca 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -7,6 +7,11 @@ use strict;
 use v5.10.1;
 use PublicInbox::DS qw(dwaitpid);
 
+sub prep_ext { # externals_each callback
+	my ($lxs, $exclude, $loc) = @_;
+	$lxs->prepare_external($loc) unless $exclude->{$loc};
+}
+
 # the main "lei q SEARCH_TERMS" method
 sub lei_q {
 	my ($self, @argv) = @_;
@@ -14,22 +19,35 @@ sub lei_q {
 	require PublicInbox::LeiOverview;
 	PublicInbox::Config->json; # preload before forking
 	my $opt = $self->{opt};
+	# prepare any number of LeiXSearch || LeiSearch || Inbox || URL
 	my $lxs = $self->{lxs} = PublicInbox::LeiXSearch->new;
-	# any number of LeiXSearch || LeiSearch || Inbox
-	if ($opt->{'local'} //= 1) { # --local is enabled by default
+	my @only = @{$opt->{only} // []};
+	# --local is enabled by default unless --only is used
+	# we'll allow "--only $LOCATION --local"
+	if ($opt->{'local'} //= scalar(@only) ? 0 : 1) {
 		my $sto = $self->_lei_store(1);
 		$lxs->prepare_external($sto->search);
 	}
-
-	# --external is enabled by default, but allow --no-external
-	if ($opt->{external} //= 1) {
-		my $cb = $lxs->can('prepare_external');
-		my $ne = $self->_externals_each($cb, $lxs);
-		$opt->{remote} //= $ne == $lxs->remotes;
-		if ($opt->{'local'}) {
-			delete($lxs->{remotes}) if !$opt->{remote};
-		} else {
-			delete($lxs->{locals});
+	if (@only) {
+		for my $loc (@only) {
+			$lxs->prepare_external($self->ext_canonicalize($loc));
+		}
+	} else {
+		for my $loc (@{$opt->{include} // []}) {
+			$lxs->prepare_external($self->ext_canonicalize($loc));
+		}
+		# --external is enabled by default, but allow --no-external
+		if ($opt->{external} //= 1) {
+			my %x = map {;
+				($self->ext_canonicalize($_), 1)
+			} @{$opt->{exclude} // []};
+			my $ne = $self->externals_each(\&prep_ext, $lxs, \%x);
+			$opt->{remote} //= !($lxs->locals - $opt->{'local'});
+			if ($opt->{'local'}) {
+				delete($lxs->{remotes}) if !$opt->{remote};
+			} else {
+				delete($lxs->{locals});
+			}
 		}
 	}
 	unless ($lxs->locals || $lxs->remotes) {
diff --git a/t/lei_external.t b/t/lei_external.t
index 1f0048a1..587990db 100644
--- a/t/lei_external.t
+++ b/t/lei_external.t
@@ -4,7 +4,7 @@ use v5.10.1;
 use Test::More;
 my $cls = 'PublicInbox::LeiExternal';
 require_ok $cls;
-my $canon = $cls->can('_canonicalize');
+my $canon = $cls->can('ext_canonicalize');
 my $exp = 'https://example.com/my-inbox/';
 is($canon->('https://example.com/my-inbox'), $exp, 'trailing slash added');
 is($canon->('https://example.com/my-inbox//'), $exp, 'trailing slash removed');

^ permalink raw reply related	[relevance 51%]

* [PATCH 01/16] lei: switch to use SEQPACKET socketpair instead of pipe
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
@ 2021-02-02 11:46 42% ` Eric Wong
  2021-02-02 11:46 32% ` [PATCH 03/16] lei q: emit progress and counting via PktOp Eric Wong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

This will allow us to send larger messages and to accumulate
progress reports in the main daemon.
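
For readers unfamiliar with SOCK_SEQPACKET, here is a rough sketch of
the idea, assuming a platform that supports it for AF_UNIX (Linux and
FreeBSD do).  Every send() arrives as one whole message, so an op no
longer has to fit in a single byte.  dispatch_one() and the handlers
are hypothetical, simplified stand-ins for what PktOp does in the
diff below.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_SEQPACKET MSG_EOR);

# Each send() on a SOCK_SEQPACKET socket is received as one message,
# so message boundaries are preserved without any extra framing.
socketpair(my $consumer, my $producer, AF_UNIX, SOCK_SEQPACKET, 0)
	or die "socketpair: $!";

my @seen;
my %ops = ( # message => [ sub, @operands ]
	'.' => [ sub { push @seen, "augment done (@_)" }, 'demo' ],
	'!' => [ sub { push @seen, 'failure reported' } ],
	'' => [ sub { push @seen, 'EOF' } ],
);

sub dispatch_one { # hypothetical, simplified stand-in for event_step
	my ($sock, $ops) = @_;
	defined(recv($sock, my $msg, 4096, 0)) or die "recv: $!";
	my $op = $ops->{$msg} or die "unknown message: `$msg'";
	my ($sub, @args) = @$op;
	$sub->(@args);
	$msg ne ''; # false once every producer has closed
}

send($producer, '.', MSG_EOR) // die "send: $!";
send($producer, '!', MSG_EOR) // die "send: $!";
close $producer; # the consumer then sees a zero-length message (EOF)
1 while dispatch_one($consumer, \%ops);
print "$_\n" for @seen;
```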
---
 MANIFEST                      |  2 +-
 lib/PublicInbox/LEI.pm        |  8 ++--
 lib/PublicInbox/LeiXSearch.pm | 27 ++++++------
 lib/PublicInbox/OpPipe.pm     | 41 ------------------
 lib/PublicInbox/PktOp.pm      | 79 +++++++++++++++++++++++++++++++++++
 5 files changed, 98 insertions(+), 59 deletions(-)
 delete mode 100644 lib/PublicInbox/OpPipe.pm
 create mode 100644 lib/PublicInbox/PktOp.pm

diff --git a/MANIFEST b/MANIFEST
index 017dc7f2..bcb9d08e 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -205,9 +205,9 @@ lib/PublicInbox/NNTPD.pm
 lib/PublicInbox/NNTPdeflate.pm
 lib/PublicInbox/NewsWWW.pm
 lib/PublicInbox/OnDestroy.pm
-lib/PublicInbox/OpPipe.pm
 lib/PublicInbox/Over.pm
 lib/PublicInbox/OverIdx.pm
+lib/PublicInbox/PktOp.pm
 lib/PublicInbox/ProcessPipe.pm
 lib/PublicInbox/Qspawn.pm
 lib/PublicInbox/Reply.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17ad18b9..737db1e1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -306,7 +306,7 @@ sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
 sub fail ($$;$) {
 	my ($self, $buf, $exit_code) = @_;
 	err($self, $buf) if defined $buf;
-	syswrite($self->{op_pipe}, '!') if $self->{op_pipe}; # fail_handler
+	send($self->{pkt_op}, '!', MSG_EOR) if $self->{pkt_op}; # fail_handler
 	x_it($self, ($exit_code // 1) << 8);
 	undef;
 }
@@ -369,14 +369,14 @@ sub io_restore ($$) {
 sub note_sigpipe { # triggers sigpipe_handler
 	my ($self, $fd) = @_;
 	close(delete($self->{$fd})); # explicit close silences Perl warning
-	syswrite($self->{op_pipe}, '|') if $self->{op_pipe};
+	send($self->{pkt_op}, '|', MSG_EOR) if $self->{pkt_op};
 	x_it($self, 13);
 }
 
 sub atfork_child_wq {
 	my ($self, $wq) = @_;
 	io_restore($self, $wq);
-	-p $self->{op_pipe} or die 'BUG: {op_pipe} expected';
+	-S $self->{pkt_op} or die 'BUG: {pkt_op} expected';
 	io_restore($self->{l2m}, $wq);
 	%PATH2CFG = ();
 	undef $errors_log;
@@ -408,7 +408,7 @@ sub atfork_parent_wq {
 	$self->{env} = $env;
 	delete @$lei{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
 	my @io = (delete(@$lei{qw(0 1 2)}),
-			io_extract($lei, qw(sock op_pipe startq)));
+			io_extract($lei, qw(sock pkt_op startq)));
 	my $l2m = $lei->{l2m};
 	if ($l2m && $l2m != $wq) { # $wq == lxs
 		if (my $wq_s1 = $l2m->{-wq_s1}) {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index f630e79a..e577ab09 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -9,10 +9,11 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
 use PublicInbox::DS qw(dwaitpid);
-use PublicInbox::OpPipe;
+use PublicInbox::PktOp;
 use PublicInbox::Import;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
+use Socket qw(MSG_EOR);
 use PublicInbox::Search qw(xap_terms);
 use PublicInbox::Spawn qw(popen_rd spawn which);
 use PublicInbox::MID qw(mids);
@@ -353,7 +354,8 @@ sub query_prepare { # called by wq_do
 	delete $lei->{l2m}->{-wq_s1};
 	eval { $lei->{l2m}->do_augment($lei) };
 	$lei->fail($@) if $@;
-	syswrite($lei->{op_pipe}, '.') == 1 or die "do_post_augment trigger: $!"
+	send($lei->{pkt_op}, '.', MSG_EOR) == 1 or
+		die "do_post_augment trigger: $!"
 }
 
 sub fail_handler ($;$$) {
@@ -380,20 +382,19 @@ sub do_query {
 		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
 		$zpipe = $l2m->pre_augment($lei);
 	}
-	pipe(my $done, $lei->{op_pipe}) or die "pipe $!";
+	my $in_loop = exists $lei->{sock};
+	my $ops = {
+		'|' => [ \&sigpipe_handler, $lei ],
+		'!' => [ \&fail_handler, $lei ],
+		'.' => [ \&do_post_augment, $lei, $zpipe, $au_done ],
+		'' => [ \&query_done, $lei ],
+	};
+	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops, $in_loop);
 	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
-	delete($lei->{op_pipe});
+	delete($lei->{pkt_op});
 
 	$lei->event_step_init; # wait for shutdowns
-	my $done_op = {
-		'' => [ \&query_done, $lei ],
-		'|' => [ \&sigpipe_handler, $lei ],
-		'!' => [ \&fail_handler, $lei ]
-	};
-	my $in_loop = exists $lei->{sock};
-	$done = PublicInbox::OpPipe->new($done, $done_op, $in_loop);
 	if ($l2m) {
-		$done_op->{'.'} = [ \&do_post_augment, $lei, $zpipe, $au_done ];
 		$self->wq_do('query_prepare', \@io, $lei_ipc);
 		$io[1] = $zpipe->[1] if $zpipe;
 	}
@@ -401,7 +402,7 @@ sub do_query {
 	$self->wq_close(1);
 	unless ($in_loop) {
 		# for the $lei_ipc->atfork_child_wq PIPE handler:
-		while ($done->{sock}) { $done->event_step }
+		while ($op->{sock}) { $op->event_step }
 	}
 }
 
diff --git a/lib/PublicInbox/OpPipe.pm b/lib/PublicInbox/OpPipe.pm
deleted file mode 100644
index 295a8aa5..00000000
--- a/lib/PublicInbox/OpPipe.pm
+++ /dev/null
@@ -1,41 +0,0 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
-# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-
-# bytecode dispatch pipe, reads a byte, runs a sub
-# byte => [ sub, @operands ]
-package PublicInbox::OpPipe;
-use strict;
-use v5.10.1;
-use parent qw(PublicInbox::DS);
-use PublicInbox::Syscall qw(EPOLLIN);
-
-sub new {
-	my ($cls, $rd, $op_map, $in_loop) = @_;
-	my $self = bless { sock => $rd, op_map => $op_map }, $cls;
-	# 1031: F_SETPIPE_SZ, 4096: page size
-	fcntl($rd, 1031, 4096) if $^O eq 'linux';
-	if ($in_loop) { # iff using DS->EventLoop
-		$rd->blocking(0);
-		$self->SUPER::new($rd, EPOLLIN);
-	}
-	$self;
-}
-
-sub event_step {
-	my ($self) = @_;
-	my $rd = $self->{sock};
-	my $byte;
-	until (defined(sysread($rd, $byte, 1))) {
-		return if $!{EAGAIN};
-		next if $!{EINTR};
-		die "read \$rd: $!";
-	}
-	my $op = $self->{op_map}->{$byte} or die "BUG: unknown byte `$byte'";
-	if ($byte eq '') { # close on EOF
-		$rd->blocking ? delete($self->{sock}) : $self->close;
-	}
-	my ($sub, @args) = @$op;
-	$sub->(@args);
-}
-
-1;
diff --git a/lib/PublicInbox/PktOp.pm b/lib/PublicInbox/PktOp.pm
new file mode 100644
index 00000000..d5b95a73
--- /dev/null
+++ b/lib/PublicInbox/PktOp.pm
@@ -0,0 +1,79 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# op dispatch socket, reads a message, runs a sub
+# There may be multiple producers, but (for now) only one consumer
+# Used for lei_xsearch and maybe other things
+# "literal" => [ sub, @operands ]
+# /regexp/ => [ sub, @operands ]
+package PublicInbox::PktOp;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::DS);
+use Errno qw(EAGAIN EINTR);
+use PublicInbox::Syscall qw(EPOLLIN EPOLLET);
+use Socket qw(AF_UNIX MSG_EOR SOCK_SEQPACKET);
+
+sub new {
+	my ($cls, $r, $ops, $in_loop) = @_;
+	my $self = bless { sock => $r, ops => $ops, re => [] }, $cls;
+	if (ref($ops) eq 'ARRAY') {
+		my %ops;
+		for my $op (@$ops) {
+			if (ref($op->[0])) {
+				push @{$self->{re}}, $op;
+			} else {
+				$ops{$op->[0]} = $op->[1];
+			}
+		}
+		$self->{ops} = \%ops;
+	}
+	if ($in_loop) { # iff using DS->EventLoop
+		$r->blocking(0);
+		$self->SUPER::new($r, EPOLLIN|EPOLLET);
+	}
+	$self;
+}
+
+# returns a blessed object as the consumer, and a GLOB/IO for the producer
+sub pair {
+	my ($cls, $ops, $in_loop) = @_;
+	my ($c, $p);
+	socketpair($c, $p, AF_UNIX, SOCK_SEQPACKET, 0) or die "socketpair: $!";
+	(new($cls, $c, $ops, $in_loop), $p);
+}
+
+sub close {
+	my ($self) = @_;
+	my $c = $self->{sock} or return;
+	$c->blocking ? delete($self->{sock}) : $self->SUPER::close;
+}
+
+sub event_step {
+	my ($self) = @_;
+	my $c = $self->{sock};
+	my $msg;
+	do {
+		my $n = recv($c, $msg, 128, 0);
+		unless (defined $n) {
+			return if $! == EAGAIN;
+			next if $! == EINTR;
+			$self->close;
+			die "recv: $!";
+		}
+		my $op = $self->{ops}->{$msg};
+		unless ($op) {
+			for my $re_op (@{$self->{re}}) {
+				$msg =~ $re_op->[0] or next;
+				$op = $re_op->[1];
+				last;
+			}
+		}
+		die "BUG: unknown message: `$msg'" unless $op;
+		my ($sub, @args) = @$op;
+		$sub->(@args);
+		return $self->close if $msg eq ''; # close on EOF
+	} while (1);
+}
+
+1;

^ permalink raw reply related	[relevance 42%]

* [PATCH 03/16] lei q: emit progress and counting via PktOp
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
  2021-02-02 11:46 42% ` [PATCH 01/16] lei: switch to use SEQPACKET socketpair instead of pipe Eric Wong
@ 2021-02-02 11:46 32% ` Eric Wong
  2021-02-02 11:46 51% ` [PATCH 04/16] lei q: support --only, --include and --exclude Eric Wong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

It can be confusing when "lei q" finishes writing to a
Maildir|mbox without indicating whether it did anything.  So show
some per-external progress and stats.

These can be disabled via the new --quiet/-q switch.

We differ slightly from mairix(1) here, as we use stderr
instead of stdout for reporting totals (and we support
parallel queries from various sources).
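
A rough sketch of the message framing this relies on, assuming the
layout used by pkt_do in the diff below: the command name, a NUL
byte, then the serialized argument list.  pack_op() and unpack_op()
are hypothetical names; the sketch uses Storable only, whereas the
real code may use Sereal, and it skips the socket entirely.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical stand-ins for the packing done by the producer and
# the unpacking done by the consumer.
sub pack_op {
	my ($cmd, @args) = @_;
	@args ? "$cmd\0" . freeze(\@args) : $cmd;
}

sub unpack_op {
	my ($msg) = @_;
	# limit of 2: the serialized payload may itself contain NUL bytes
	my ($cmd, $pargs) = split(/\0/, $msg, 2);
	($cmd // '', defined($pargs) ? thaw($pargs) : undef);
}

my %total;
my %ops = (
	mset_progress => sub {
		my ($desc, $size, $estimated) = @{$_[0]};
		$total{matches} += $size;
		"# $desc $size/$estimated";
	},
);

my $msg = pack_op('mset_progress', '/srv/inbox-a', 25, 100);
my ($cmd, $pargs) = unpack_op($msg);
print STDERR $ops{$cmd}->($pargs), "\n"; # progress goes to stderr
```

Keeping the command name outside the serialized payload means
argument-less ops such as '.' stay as cheap as they were before.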
---
 lib/PublicInbox/IPC.pm        | 23 +++++++++-------
 lib/PublicInbox/LEI.pm        |  2 +-
 lib/PublicInbox/LeiXSearch.pm | 51 ++++++++++++++++++++++++++---------
 lib/PublicInbox/PktOp.pm      | 36 +++++++++----------------
 t/lei.t                       |  8 +++---
 xt/lei-sigpipe.t              |  2 +-
 6 files changed, 71 insertions(+), 51 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 689f32d0..50de1bed 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -10,6 +10,7 @@
 package PublicInbox::IPC;
 use strict;
 use v5.10.1;
+use parent qw(Exporter);
 use Carp qw(confess croak);
 use PublicInbox::DS qw(dwaitpid);
 use PublicInbox::Spawn;
@@ -18,6 +19,7 @@ use PublicInbox::WQWorker;
 use Socket qw(AF_UNIX MSG_EOR SOCK_STREAM);
 my $SEQPACKET = eval { Socket::SOCK_SEQPACKET() }; # portable enough?
 use constant PIPE_BUF => $^O eq 'linux' ? 4096 : POSIX::_POSIX_PIPE_BUF();
+our @EXPORT_OK = qw(ipc_freeze ipc_thaw);
 my $WQ_MAX_WORKERS = 4096;
 my ($enc, $dec);
 # ->imports at BEGIN turns sereal_*_with_object into custom ops on 5.14+
@@ -33,12 +35,13 @@ BEGIN {
 };
 
 if ($enc && $dec) { # should be custom ops
-	*freeze = sub ($) { sereal_encode_with_object $enc, $_[0] };
-	*thaw = sub ($) { sereal_decode_with_object $dec, $_[0], my $ret };
+	*ipc_freeze = sub ($) { sereal_encode_with_object $enc, $_[0] };
+	*ipc_thaw = sub ($) { sereal_decode_with_object $dec, $_[0], my $ret };
 } else {
 	eval { # some distros have Storable as a separate package from Perl
 		require Storable;
-		Storable->import(qw(freeze thaw));
+		*ipc_freeze = \&Storable::freeze;
+		*ipc_thaw = \&Storable::thaw;
 		$enc = 1;
 	} // warn("Storable (part of Perl) missing: $@\n");
 }
@@ -56,12 +59,12 @@ sub _get_rec ($) {
 	chop($len) eq "\n" or croak "no LF byte in $len";
 	defined(my $n = read($r, my $buf, $len)) or croak "read error: $!";
 	$n == $len or croak "short read: $n != $len";
-	thaw($buf);
+	ipc_thaw($buf);
 }
 
 sub _pack_rec ($) {
 	my ($ref) = @_;
-	my $buf = freeze($ref);
+	my $buf = ipc_freeze($ref);
 	length($buf) . "\n" . $buf;
 }
 
@@ -275,7 +278,7 @@ sub recv_and_run {
 		$n = length($buf);
 	}
 	# Sereal dies on truncated data, Storable returns undef
-	my $args = thaw($buf) // die "thaw error on buffer of size: $n";
+	my $args = ipc_thaw($buf) // die "thaw error on buffer of size: $n";
 	undef $buf;
 	my $sub = shift @$args;
 	eval { $self->$sub(@$args) };
@@ -301,15 +304,15 @@ sub wq_do { # always async
 	my ($self, $sub, $ios, @args) = @_;
 	if (my $s1 = $self->{-wq_s1}) { # run in worker
 		my $fds = [ map { fileno($_) } @$ios ];
-		my $n = $send_cmd->($s1, $fds, freeze([$sub, @args]), MSG_EOR);
+		my $buf = ipc_freeze([$sub, @args]);
+		my $n = $send_cmd->($s1, $fds, $buf, MSG_EOR);
 		return if defined($n); # likely
 		croak "sendmsg: $! (check RLIMIT_NOFILE)" if $!{ETOOMANYREFS};
 		croak "sendmsg: $!" if !$!{EMSGSIZE};
 		socketpair(my $r, my $w, AF_UNIX, SOCK_STREAM, 0) or
 			croak "socketpair: $!";
-		my $buf = freeze([$sub, @args]);
 		$n = $send_cmd->($s1, [ fileno($r) ],
-				freeze(['do_sock_stream', length($buf)]),
+				ipc_freeze(['do_sock_stream', length($buf)]),
 				MSG_EOR) // croak "sendmsg: $!";
 		undef $r;
 		$n = $send_cmd->($w, $fds, $buf, 0) // croak "sendmsg: $!";
@@ -461,6 +464,6 @@ sub DESTROY {
 }
 
 # Sereal doesn't have dclone
-sub deep_clone { thaw(freeze($_[-1])) }
+sub deep_clone { ipc_thaw(ipc_freeze($_[-1])) }
 
 1;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 737db1e1..6c2515dc 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -104,7 +104,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
-	mua-cmd|mua=s no-torsocks torsocks=s verbose|v
+	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e577ab09..95862306 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -8,12 +8,11 @@ package PublicInbox::LeiXSearch;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::LeiSearch PublicInbox::IPC);
-use PublicInbox::DS qw(dwaitpid);
-use PublicInbox::PktOp;
+use PublicInbox::DS qw(dwaitpid now);
+use PublicInbox::PktOp qw(pkt_do);
 use PublicInbox::Import;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
-use Socket qw(MSG_EOR);
 use PublicInbox::Search qw(xap_terms);
 use PublicInbox::Spawn qw(popen_rd spawn which);
 use PublicInbox::MID qw(mids);
@@ -97,7 +96,7 @@ sub over {}
 sub _mset_more ($$) {
 	my ($mset, $mo) = @_;
 	my $size = $mset->size;
-	$size && (($mo->{offset} += $size) < ($mo->{limit} // 10000));
+	$size >= $mo->{limit} && (($mo->{offset} += $size) < $mo->{limit});
 }
 
 # $startq will EOF when query_prepare is done augmenting and allow
@@ -115,16 +114,15 @@ sub query_thread_mset { # for --thread
 	my $startq = delete $lei->{startq};
 
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
-	unless ($srch && $over) {
-		my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
-		warn "$desc not indexed by Xapian\n";
-		return;
-	}
+	my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
+	return warn("$desc not indexed by Xapian\n") unless ($srch && $over);
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
+		pkt_do($lei->{pkt_op}, 'mset_progress', $desc, $mset->size,
+				$mset->get_matches_estimated);
 		my $ids = $srch->mset_to_artnums($mset, $mo);
 		my $ctx = { ids => $ids };
 		my $i = 0;
@@ -156,6 +154,8 @@ sub query_mset { # non-parallel for non-"--thread" users
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $self);
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
+		pkt_do($lei->{pkt_op}, 'mset_progress', 'xsearch',
+				$mset->size, $mset->get_matches_estimated);
 		for my $mitem ($mset->items) {
 			my $smsg = smsg_for($self, $mitem) or next;
 			wait_startq($startq) if $startq;
@@ -174,6 +174,16 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
 	if (my $startq = delete($lei->{startq})) { wait_startq($startq) }
+	++$lei->{-nr_remote_eml};
+	if (!$lei->{opt}->{quiet}) {
+		my $now = now();
+		my $next = $lei->{-next_progress} //= ($now + 1);
+		if ($now > $next) {
+			$lei->{-next_progress} = $now + 1;
+			my $nr = $lei->{-nr_remote_eml};
+			$lei->err("# $lei->{-current_url} $nr/?");
+		}
+	}
 	$each_smsg->($smsg, undef, $eml);
 }
 
@@ -223,6 +233,8 @@ sub query_remote_mboxrd {
 	my $tor = $opt->{torsocks} //= 'auto';
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	for my $uri (@$uris) {
+		$lei->{-current_url} = $uri->as_string;
+		$lei->{-nr_remote_eml} = 0;
 		$uri->query_form(@qform);
 		my $cmd = [ @cmd, $uri->as_string ];
 		if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
@@ -246,7 +258,12 @@ sub query_remote_mboxrd {
 							$lei, $each_smsg);
 		};
 		return $lei->fail("E: @$cmd: $@") if $@;
-		next unless $?;
+		if ($? == 0) {
+			my $nr = $lei->{-nr_remote_eml};
+			pkt_do($lei->{pkt_op}, 'mset_progress',
+				$lei->{-current_url}, $nr, $nr);
+			next;
+		}
 		seek($cerr, $coff, SEEK_SET) or warn "seek(curl stderr): $!\n";
 		my $e = do { local $/; <$cerr> } //
 				die "read(curl stderr): $!\n";
@@ -299,9 +316,19 @@ Error closing $lei->{ovv}->{dst}: $!
 		}
 		$lei->start_mua;
 	}
+	$lei->{opt}->{quiet} or
+		$lei->err('# ', $lei->{-mset_total} // 0, " matches");
 	$lei->dclose;
 }
 
+sub mset_progress { # called via pkt_op/pkt_do from workers
+	my ($lei, $pargs) = @_;
+	my ($desc, $mset_size, $mset_total_est) = @$pargs;
+	return if $lei->{opt}->{quiet};
+	$lei->{-mset_total} += $mset_size;
+	$lei->err("# $desc $mset_size/$mset_total_est");
+}
+
 sub do_post_augment {
 	my ($lei, $zpipe, $au_done) = @_;
 	my $l2m = $lei->{l2m} or die 'BUG: no {l2m}';
@@ -354,8 +381,7 @@ sub query_prepare { # called by wq_do
 	delete $lei->{l2m}->{-wq_s1};
 	eval { $lei->{l2m}->do_augment($lei) };
 	$lei->fail($@) if $@;
-	send($lei->{pkt_op}, '.', MSG_EOR) == 1 or
-		die "do_post_augment trigger: $!"
+	pkt_do($lei->{pkt_op}, '.') == 1 or die "do_post_augment trigger: $!"
 }
 
 sub fail_handler ($;$$) {
@@ -388,6 +414,7 @@ sub do_query {
 		'!' => [ \&fail_handler, $lei ],
 		'.' => [ \&do_post_augment, $lei, $zpipe, $au_done ],
 		'' => [ \&query_done, $lei ],
+		'mset_progress' => [ \&mset_progress, $lei ],
 	};
 	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops, $in_loop);
 	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
diff --git a/lib/PublicInbox/PktOp.pm b/lib/PublicInbox/PktOp.pm
index d5b95a73..12839e71 100644
--- a/lib/PublicInbox/PktOp.pm
+++ b/lib/PublicInbox/PktOp.pm
@@ -9,25 +9,16 @@
 package PublicInbox::PktOp;
 use strict;
 use v5.10.1;
-use parent qw(PublicInbox::DS);
+use parent qw(PublicInbox::DS Exporter);
 use Errno qw(EAGAIN EINTR);
 use PublicInbox::Syscall qw(EPOLLIN EPOLLET);
 use Socket qw(AF_UNIX MSG_EOR SOCK_SEQPACKET);
+use PublicInbox::IPC qw(ipc_freeze ipc_thaw);
+our @EXPORT_OK = qw(pkt_do);
 
 sub new {
 	my ($cls, $r, $ops, $in_loop) = @_;
 	my $self = bless { sock => $r, ops => $ops, re => [] }, $cls;
-	if (ref($ops) eq 'ARRAY') {
-		my %ops;
-		for my $op (@$ops) {
-			if (ref($op->[0])) {
-				push @{$self->{re}}, $op;
-			} else {
-				$ops{$op->[0]} = $op->[1];
-			}
-		}
-		$self->{ops} = \%ops;
-	}
 	if ($in_loop) { # iff using DS->EventLoop
 		$r->blocking(0);
 		$self->SUPER::new($r, EPOLLIN|EPOLLET);
@@ -43,6 +34,11 @@ sub pair {
 	(new($cls, $c, $ops, $in_loop), $p);
 }
 
+sub pkt_do { # for the producer to trigger event_step in consumer
+	my ($producer, $cmd, @args) = @_;
+	send($producer, @args ? "$cmd\0".ipc_freeze(\@args) : $cmd, MSG_EOR);
+}
+
 sub close {
 	my ($self) = @_;
 	my $c = $self->{sock} or return;
@@ -54,24 +50,18 @@ sub event_step {
 	my $c = $self->{sock};
 	my $msg;
 	do {
-		my $n = recv($c, $msg, 128, 0);
+		my $n = recv($c, $msg, 4096, 0);
 		unless (defined $n) {
 			return if $! == EAGAIN;
 			next if $! == EINTR;
 			$self->close;
 			die "recv: $!";
 		}
-		my $op = $self->{ops}->{$msg};
-		unless ($op) {
-			for my $re_op (@{$self->{re}}) {
-				$msg =~ $re_op->[0] or next;
-				$op = $re_op->[1];
-				last;
-			}
-		}
-		die "BUG: unknown message: `$msg'" unless $op;
+		my ($cmd, $pargs) = split(/\0/, $msg, 2);
+		my $op = $self->{ops}->{$cmd // $msg};
+		die "BUG: unknown message: `$cmd'" unless $op;
 		my ($sub, @args) = @$op;
-		$sub->(@args);
+		$sub->(@args, $pargs ? ipc_thaw($pargs) : ());
 		return $self->close if $msg eq ''; # close on EOF
 	} while (1);
 }
diff --git a/t/lei.t b/t/lei.t
index 3f6702e6..a46e46f2 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -174,11 +174,11 @@ SKIP: {
 	}
 	$lei->('add-external', $url);
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
-	ok($lei->('q', "m:$mid"), "query $url");
+	ok($lei->('q', '-q', "m:$mid"), "query $url");
 	is($err, '', "no errors on $url");
 	my $res = $json->decode($out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
-	ok($lei->('q', "m:$mid", 'd:..20101002'), 'no results, no error');
+	ok($lei->('q', '-q', "m:$mid", 'd:..20101002'), 'no results, no error');
 	is($err, '', 'no output on 404, matching local FS behavior');
 	is($out, "[null]\n", 'got null results');
 	$lei->('forget-external', $url);
@@ -291,12 +291,12 @@ my $test_external = sub {
 		my @s = grep(/^Subject:/, $cat->());
 		is(scalar(@s), 1, "1 result in mbox$sfx");
 		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
-		is($err, '', 'no errors from augment');
+		is(grep(!/^#/, $err), 0, 'no errors from augment');
 		@s = grep(/^Subject:/, my @wtf = $cat->());
 		is(scalar(@s), 2, "2 results in mbox$sfx");
 
 		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
-		is($err, '', "no errors on no results ($sfx)");
+		is(grep(!/^#/, $err), 0, "no errors on no results ($sfx)");
 
 		my @s2 = grep(/^Subject:/, $cat->());
 		is_deeply(\@s2, \@s,
diff --git a/xt/lei-sigpipe.t b/xt/lei-sigpipe.t
index 448bd7db..1aa9ed07 100644
--- a/xt/lei-sigpipe.t
+++ b/xt/lei-sigpipe.t
@@ -15,7 +15,7 @@ my $do_test = sub {
 		pipe(my ($r, $w)) or BAIL_OUT $!;
 		open my $err, '+>', undef or BAIL_OUT $!;
 		my $opt = { run_mode => 0, 1 => $w, 2 => $err };
-		my $cmd = [qw(lei q -t), @$out, 'bytes:1..'];
+		my $cmd = [qw(lei q -q -t), @$out, 'bytes:1..'];
 		my $tp = start_script($cmd, $env, $opt);
 		close $w;
 		sysread($r, my $buf, 1);

^ permalink raw reply related	[relevance 32%]

* [PATCH 05/16] lei: complete: do not complete non-arg options w/ help text
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-02 11:46 51% ` [PATCH 04/16] lei q: support --only, --include and --exclude Eric Wong
@ 2021-02-02 11:46 71% ` Eric Wong
  2021-02-02 11:46 64% ` [PATCH 06/16] lei: q: shell completion for --(include|exclude|only) Eric Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

Some of our command-line switches take no arguments, and need
no completion for those arguments.
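
A small sketch of the distinction, assuming the two shapes of
description entry that the diff below tells apart: a plain string is
help text for a switch which takes no argument, while an array ref
starts with 'PLACEHOLDER|choice|...'.  The table contents and
arg_choices() are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical description table with both kinds of entry
my %optdesc = (
	'quiet' => 'suppress feedback messages',
	'sort' => [ 'VAL|received|relevance|docid', 'order of results' ],
);

sub arg_choices { # hypothetical helper, not lei's actual code
	my ($name) = @_;
	my $v = $optdesc{$name};
	# only array refs describe an argument; help text is never split
	my @v = ref($v) ? split(/\|/, $v->[0]) : ();
	# drop an ALL CAPS placeholder such as "VAL"
	shift(@v) if @v && uc($v[0]) eq $v[0];
	@v;
}

print "sort: @{[ arg_choices('sort') ]}\n";
print "quiet: @{[ arg_choices('quiet') ]}\n";
```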
---
 lib/PublicInbox/LEI.pm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ffbc2503..b0a8358a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -737,8 +737,7 @@ sub lei__complete {
 		my $opt = quotemeta $1;
 		puts $self, map {
 			my $v = $OPTDESC{$_};
-			$v = $v->[0] if ref($v);
-			my @v = split(/\|/, $v);
+			my @v = ref($v) ? split(/\|/, $v->[0]) : ();
 			# get rid of ALL CAPS placeholder (e.g "OUT")
 			# (TODO: completion for external paths)
 			shift(@v) if uc($v[0]) eq $v[0];

^ permalink raw reply related	[relevance 71%]

* [PATCH 13/16] doc: lei-q: note "-a" and link to Xapian QueryParser
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-02 11:46 52% ` [PATCH 09/16] lei q: do not leave temporary files after oneshot exit Eric Wong
@ 2021-02-02 11:46 71% ` Eric Wong
  2021-02-02 11:47 56% ` [PATCH 15/16] lei q: tidy up progress reporting Eric Wong
  2021-02-02 11:47 48% ` [PATCH 16/16] lei q: support --jobs [SEARCHERS],[WRITERS] Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

"-a" is supported by mairix, too.  We should also note somewhere
the query parsing features supported by Xapian.
---
 Documentation/lei-q.pod | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index e307e020..5c0ca843 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -43,7 +43,7 @@ For a subset of MUAs known to accept a mailbox via C<-f>, COMMAND can
 be abbreviated to the name of the program: C<mutt>, C<mailx>, C<mail>,
 or C<neomutt>.
 
-=item --augment
+=item -a, --augment
 
 Augment output destination instead of clobbering it.
 
@@ -124,4 +124,5 @@ License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
 
 =head1 SEE ALSO
 
-L<lei-add-external(1)>
+L<lei-add-external(1)>,
+L<Xapian::QueryParser Syntax|https://xapian.org/docs/queryparser.html>

^ permalink raw reply related	[relevance 71%]

* [PATCH 06/16] lei: q: shell completion for --(include|exclude|only)
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-02 11:46 71% ` [PATCH 05/16] lei: complete: do not complete non-arg options w/ help text Eric Wong
@ 2021-02-02 11:46 64% ` Eric Wong
  2021-02-02 11:46 52% ` [PATCH 09/16] lei q: do not leave temporary files after oneshot exit Eric Wong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

Because .onion URL names are long!
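
A simplified sketch of how the words to complete can be found,
assuming a hypothetical words_after_ext_switch helper: walk the
argument list from the end until one of the external-selecting
switches shows up, and collect the words typed after it.

```perl
use strict;
use warnings;

# Matches the switches whose argument is an external URL or pathname.
my $ext = qr/\A(?:-I|--(?:include|exclude|only))\z/;

# Hypothetical helper: return the words typed after the last
# -I/--include/--exclude/--only switch, or an empty list if there
# is no such switch on the command line.
sub words_after_ext_switch {
	my @argv = @_;
	my @cur;
	while (@argv) {
		return @cur if $argv[-1] =~ $ext;
		unshift(@cur, pop @argv);
	}
	();
}
```

The words returned would then be handed to whatever completes
external names.  Unlike this sketch, the patch keeps scanning
earlier arguments when a switch yields no candidates.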
---
 lib/PublicInbox/LEI.pm      |  7 +++++++
 lib/PublicInbox/LeiQuery.pm | 16 ++++++++++++++++
 t/lei.t                     |  6 ++++++
 3 files changed, 29 insertions(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b0a8358a..bb7efd59 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -229,6 +229,13 @@ my %OPTDESC = (
 'q	format|f=s' => [
 	'OUT|maildir|mboxrd|mboxcl2|mboxcl|mboxo|html|json|jsonl|concatjson',
 		'specify output format, default depends on --output'],
+'q	exclude=s@' => [ 'URL_OR_PATHNAME',
+		'exclude specified external(s) from search' ],
+'q	include|I=s@' => [ 'URL_OR_PATHNAME',
+		'include specified external(s) in search' ],
+'q	only=s@' => [ 'URL_OR_PATHNAME',
+		'only use specified external(s) for search' ],
+
 'ls-query	format|f=s' => $ls_format,
 'ls-external	format|f=s' => $ls_format,
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index fd8a3bca..7c1e3606 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -94,6 +94,22 @@ sub lei_q {
 	$lxs->do_query($self);
 }
 
+# shell completion helper called by lei__complete
+sub _complete_q {
+	my ($self, @argv) = @_;
+	my $ext = qr/\A(?:-I|(?:--(?:include|exclude|only)))\z/;
+	# $argv[-1] =~ $ext and return $self->_complete_forget_external;
+	my @cur;
+	while (@argv) {
+		if ($argv[-1] =~ $ext) {
+			my @c = $self->_complete_forget_external(@cur);
+			return @c if @c;
+		}
+		unshift(@cur, pop @argv);
+	}
+	();
+}
+
 # Stuff we may pass through to curl (as of 7.64.0), see curl manpage for
 # details, so most options which make sense for HTTP/HTTPS (including proxy
 # support for Tor and other methods of getting past weird networks).
diff --git a/t/lei.t b/t/lei.t
index a46e46f2..33f47ae4 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -232,6 +232,12 @@ my $test_external = sub {
 			"partial completion for URL $u");
 		is($out, "https://example.com/ibx/\n",
 			"completed partial URL $u");
+		for my $qo (qw(-I --include --exclude --only)) {
+			ok($lei->(qw(_complete lei q), $qo, $u),
+				"partial completion for URL q $qo $u");
+			is($out, "https://example.com/ibx/\n",
+				"completed partial URL $u on q $qo");
+		}
 	}
 
 	$lei->('ls-external');

^ permalink raw reply related	[relevance 64%]

* [PATCH 15/16] lei q: tidy up progress reporting
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
                   ` (6 preceding siblings ...)
  2021-02-02 11:46 71% ` [PATCH 13/16] doc: lei-q: note "-a" and link to Xapian QueryParser Eric Wong
@ 2021-02-02 11:47 56% ` Eric Wong
  2021-02-02 11:47 48% ` [PATCH 16/16] lei q: support --jobs [SEARCHERS],[WRITERS] Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:47 UTC (permalink / raw)
  To: meta

We won't be reporting progress when output is going to stdout
since it can clutter up the terminal unless stderr != stdout,
which probably isn't worth checking.

We'll also use a more agnostic mset_progress which may
make it easier to support worker-less invocations.
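
A rough sketch of the two pieces, with hypothetical names
(want_progress, progress_line) and a plain hash standing in for the
options: one decision made up front, and one formatter which accepts
either a flat argument list or a single array ref, as an IPC layer
might deliver it.

```perl
use strict;
use warnings;

# Decide once whether progress lines are wanted.  Assumption: no
# progress when results go to stdout or when --quiet is given.
sub want_progress {
	my ($opt, $dst_is_stdout) = @_;
	(!$dst_is_stdout && !$opt->{quiet}) ? 1 : 0;
}

# Format one progress line from (desc, size, estimated total),
# passed either as a flat list or wrapped in one array ref.
sub progress_line {
	my @args = ref($_[-1]) eq 'ARRAY' ? @{$_[-1]} : @_;
	my ($desc, $size, $total_est) = @args;
	"# $desc $size/$total_est";
}
```

Checking a single flag everywhere avoids repeating the
stdout-or-quiet test at every call site.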
---
 lib/PublicInbox/LEI.pm         |  1 +
 lib/PublicInbox/LeiOverview.pm |  2 ++
 lib/PublicInbox/LeiXSearch.pm  | 34 +++++++++++++++++++---------------
 3 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 44afced3..2c512c5e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -871,6 +871,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 
 sub dclose {
 	my ($self) = @_;
+	delete $self->{-progress};
 	for my $f (qw(lxs l2m)) {
 		my $wq = delete $self->{$f} or next;
 		if ($wq->wq_kill) {
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index ff15d295..52da225d 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -90,6 +90,8 @@ sub new {
 		} else {
 			ovv_out_lk_init($self);
 		}
+	} elsif (!$opt->{quiet}) {
+		$lei->{-progress} = 1;
 	}
 	if ($json) {
 		$lei->{dedupe} //= PublicInbox::LeiDedupe->new($lei);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e207f0fc..57a18075 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -107,6 +107,19 @@ sub wait_startq ($) {
 	read($startq, my $query_prepare_done, 1);
 }
 
+sub mset_progress {
+	my $lei = shift;
+	return unless $lei->{-progress};
+	if ($lei->{pkt_op}) { # called via pkt_op/pkt_do from workers
+		pkt_do($lei->{pkt_op}, 'mset_progress', @_);
+	} else { # single lei-daemon consumer
+		my @args = ref($_[-1]) eq 'ARRAY' ? @{$_[-1]} : @_;
+		my ($desc, $mset_size, $mset_total_est) = @args;
+		$lei->{-mset_total} += $mset_size;
+		$lei->err("# $desc $mset_size/$mset_total_est");
+	}
+}
+
 sub query_thread_mset { # for --thread
 	my ($self, $lei, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
@@ -121,7 +134,7 @@ sub query_thread_mset { # for --thread
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
-		pkt_do($lei->{pkt_op}, 'mset_progress', $desc, $mset->size,
+		mset_progress($lei, $desc, $mset->size,
 				$mset->get_matches_estimated);
 		my $ids = $srch->mset_to_artnums($mset, $mo);
 		my $ctx = { ids => $ids };
@@ -154,7 +167,7 @@ sub query_mset { # non-parallel for non-"--thread" users
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $self);
 	do {
 		$mset = $self->mset($mo->{qstr}, $mo);
-		pkt_do($lei->{pkt_op}, 'mset_progress', 'xsearch',
+		mset_progress($lei, 'xsearch',
 				$mset->size, $mset->get_matches_estimated);
 		for my $mitem ($mset->items) {
 			my $smsg = smsg_for($self, $mitem) or next;
@@ -174,8 +187,8 @@ sub each_eml { # callback for MboxReader->mboxrd
 	$smsg->{$_} //= '' for qw(from to cc ds subject references mid);
 	delete @$smsg{qw(From Subject -ds -ts)};
 	if (my $startq = delete($lei->{startq})) { wait_startq($startq) }
-	++$lei->{-nr_remote_eml};
-	if (!$lei->{opt}->{quiet}) {
+	if ($lei->{-progress}) {
+		++$lei->{-nr_remote_eml};
 		my $now = now();
 		my $next = $lei->{-next_progress} //= ($now + 1);
 		if ($now > $next) {
@@ -261,8 +274,7 @@ sub query_remote_mboxrd {
 		return $lei->fail("E: @$cmd: $@") if $@;
 		if ($? == 0) {
 			my $nr = $lei->{-nr_remote_eml};
-			pkt_do($lei->{pkt_op}, 'mset_progress',
-				$lei->{-current_url}, $nr, $nr);
+			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
 		}
 		seek($cerr, $coff, SEEK_SET) or warn "seek(curl stderr): $!\n";
@@ -318,19 +330,11 @@ Error closing $lei->{ovv}->{dst}: $!
 		}
 		$lei->start_mua;
 	}
-	$lei->{opt}->{quiet} or
+	$lei->{-progress} and
 		$lei->err('# ', $lei->{-mset_total} // 0, " matches");
 	$lei->dclose;
 }
 
-sub mset_progress { # called via pkt_op/pkt_do from workers
-	my ($lei, $pargs) = @_;
-	my ($desc, $mset_size, $mset_total_est) = @$pargs;
-	return if $lei->{opt}->{quiet};
-	$lei->{-mset_total} += $mset_size;
-	$lei->err("# $desc $mset_size/$mset_total_est");
-}
-
 sub do_post_augment {
 	my ($lei, $zpipe, $au_done) = @_;
 	my $l2m = $lei->{l2m} or die 'BUG: no {l2m}';

^ permalink raw reply related	[relevance 56%]

* [PATCH 09/16] lei q: do not leave temporary files after oneshot exit
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-02 11:46 64% ` [PATCH 06/16] lei: q: shell completion for --(include|exclude|only) Eric Wong
@ 2021-02-02 11:46 52% ` Eric Wong
  2021-02-02 11:46 71% ` [PATCH 13/16] doc: lei-q: note "-a" and link to Xapian QueryParser Eric Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:46 UTC (permalink / raw)
  To: meta

Avoid on-stack shortcuts which may prevent destructors from
firing since we're not inside the event loop.  We'll also tidy
up the unlink mechanism in LeiOverview while we're at it.
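
A small illustration of why a lexical shortcut matters, using a
hypothetical TmpHolder class in place of the real objects: Perl runs
DESTROY when the last reference goes away, so deleting a hash slot
only cleans up if nothing else still points at the object.

```perl
use strict;
use warnings;

my @cleaned; # names of objects whose destructor has run

{
	package TmpHolder; # hypothetical class owning a temporary file
	sub new { my ($class, $name) = @_; bless { name => $name }, $class }
	sub DESTROY { push @cleaned, $_[0]->{name} }
}

my $self = {};

# Held only by the hash: deleting the key runs DESTROY immediately.
$self->{dedupe} = TmpHolder->new('dedupe');
delete $self->{dedupe};
my $after_hash_only = scalar(@cleaned);

# Also held by a lexical: deleting the key is not enough.
my $ovv = $self->{ovv} = TmpHolder->new('ovv');
delete $self->{ovv};
my $after_shortcut = scalar(@cleaned);

undef $ovv; # drop the last reference so the destructor finally runs
my $after_undef = scalar(@cleaned);
```

If the process exits via a signal while $ovv is still live, that
destructor never gets a chance to unlink anything.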
---
 lib/PublicInbox/LEI.pm         | 20 +++++++++++---------
 lib/PublicInbox/LeiOverview.pm |  7 +++----
 lib/PublicInbox/LeiQuery.pm    |  4 ++--
 lib/PublicInbox/LeiXSearch.pm  |  5 +++--
 xt/lei-sigpipe.t               | 27 +++++++++++++++++++++++++--
 5 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index d6fa814c..44afced3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -284,20 +284,22 @@ sub x_it ($$) {
 	dump_and_clear_log();
 	if (my $sock = $self->{sock}) {
 		send($sock, "x_it $code", MSG_EOR);
-	} elsif (!$self->{oneshot}) {
-		return; # client disconnected, noop
-	} elsif (my $signum = ($code & 127)) { # usually SIGPIPE (13)
-		$SIG{PIPE} = 'DEFAULT'; # $SIG{$signum} doesn't work
-		kill $signum, $$;
-		sleep; # wait for signal
-	} else {
+	} elsif ($self->{oneshot}) {
 		# don't want to end up using $? from child processes
 		for my $f (qw(lxs l2m)) {
 			my $wq = delete $self->{$f} or next;
 			$wq->DESTROY;
 		}
-		$quit->($code >> 8);
-	}
+		# cleanup anything that has tempfiles
+		delete @$self{qw(ovv dedupe)};
+		if (my $signum = ($code & 127)) { # usually SIGPIPE (13)
+			$SIG{PIPE} = 'DEFAULT'; # $SIG{$signum} doesn't work
+			kill $signum, $$;
+			sleep; # wait for signal
+		} else {
+			$quit->($code >> 8);
+		}
+	} # else ignore if client disconnected
 }
 
 sub err ($;@) {
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 1d62ffe2..31cc67f1 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -26,16 +26,15 @@ sub _iso8601 ($) { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) }
 # we open this in the parent process before ->wq_do handoff
 sub ovv_out_lk_init ($) {
 	my ($self) = @_;
-	$self->{tmp_lk_id} = "$self.$$";
 	my $tmp = File::Temp->new("lei-ovv.dst.$$.lock-XXXXXX",
 					TMPDIR => 1, UNLINK => 0);
-	$self->{lock_path} = $tmp->filename;
+	$self->{"lk_id.$self.$$"} = $self->{lock_path} = $tmp->filename;
 }
 
 sub ovv_out_lk_cancel ($) {
 	my ($self) = @_;
-	($self->{tmp_lk_id}//'') eq "$self.$$" and
-		unlink(delete($self->{lock_path}));
+	my $lock_path = delete $self->{"lk_id.$self.$$"} or return;
+	unlink($lock_path);
 }
 
 sub detect_fmt ($$) {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 7c1e3606..ca214ca1 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -54,7 +54,7 @@ sub lei_q {
 		return $self->fail('no local or remote inboxes to search');
 	}
 	my $xj = $lxs->concurrency($opt);
-	my $ovv = PublicInbox::LeiOverview->new($self) or return;
+	PublicInbox::LeiOverview->new($self) or return;
 	$self->atfork_prepare_wq($lxs);
 	$lxs->wq_workers_start('lei_xsearch', $xj, $self->oldset);
 	delete $lxs->{-ipc_atfork_child_close};
@@ -90,7 +90,7 @@ sub lei_q {
 	# descending docid order
 	$mset_opt{relevance} //= -2 if $opt->{thread};
 	$self->{mset_opt} = \%mset_opt;
-	$ovv->ovv_begin($self);
+	$self->{ovv}->ovv_begin($self);
 	$lxs->do_query($self);
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e997431f..b3cace74 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -387,8 +387,9 @@ sub query_prepare { # called by wq_do
 
 sub fail_handler ($;$$) {
 	my ($lei, $code, $io) = @_;
-	if (my $lxs = delete $lei->{lxs}) {
-		$lxs->wq_wait_old($lei) if $lxs->wq_kill_old; # lei-daemon
+	for my $f (qw(lxs l2m)) {
+		my $wq = delete $lei->{$f} or next;
+		$wq->wq_wait_old($lei) if $wq->wq_kill_old; # lei-daemon
 	}
 	close($io) if $io; # needed to avoid warnings on SIGPIPE
 	$lei->x_it($code // (1 >> 8));
diff --git a/xt/lei-sigpipe.t b/xt/lei-sigpipe.t
index 1aa9ed07..ba2d23c8 100644
--- a/xt/lei-sigpipe.t
+++ b/xt/lei-sigpipe.t
@@ -29,7 +29,30 @@ my $do_test = sub {
 	}
 };
 
-$do_test->();
-$do_test->({XDG_RUNTIME_DIR => '/dev/null'});
+my ($tmp, $for_destroy) = tmpdir();
+my $pid;
+my $opt = { run_mode => 0, 1 => \(my $out = '') };
+if (run_script([qw(lei daemon-pid)], undef, $opt)) {
+	chomp($pid = $out);
+	mkdir "$tmp/d" or BAIL_OUT $!;
+	local $ENV{TMPDIR} = "$tmp/d";
+	$do_test->();
+	$out = '';
+	ok(run_script([qw(lei daemon-pid)], undef, $opt), 'daemon-pid again');
+	chomp($out);
+	is($out, $pid, 'daemon-pid unchanged');
+	ok(kill(0, $pid), 'daemon still running');
+	$out = '';
+}
+{
+	mkdir "$tmp/1" or BAIL_OUT $!;
+	local $ENV{TMPDIR} = "$tmp/1";
+	$do_test->({XDG_RUNTIME_DIR => '/dev/null'});
+	is(unlink(glob("$tmp/1/*")), 0, 'nothing left over w/ oneshot');
+}
+
+# the one-shot test should be slow enough that the daemon has cleaned
+# up in the background:
+is_deeply([glob("$tmp/d/*")], [], 'nothing left over with daemon');
 
 done_testing;

^ permalink raw reply related	[relevance 52%]

* [PATCH 16/16] lei q: support --jobs [SEARCHERS],[WRITERS]
  2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
                   ` (7 preceding siblings ...)
  2021-02-02 11:47 56% ` [PATCH 15/16] lei q: tidy up progress reporting Eric Wong
@ 2021-02-02 11:47 48% ` Eric Wong
  8 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-02 11:47 UTC (permalink / raw)
  To: meta

This comma-delimited parameter allows controlling the number of
lei_xsearch and lei2mail worker processes.  With the change
to make IPC wq_* work use the event loop, it's now safe to
run fewer worker processes for searching with no risk of
deadlocks.

MAX_PER_HOST isn't configurable yet for remote hosts,
and maybe it shouldn't be due to potential for abuse.
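
A standalone sketch of parsing such a parameter; parse_jobs is a
hypothetical helper, not a function in this series.  Either half may
be omitted (e.g. ",4" sets only the writer count), and anything which
is not a positive integer is rejected.

```perl
use strict;
use warnings;

# Hypothetical parser for a "[SEARCHERS][,WRITERS]" string.  Returns
# ($searchers, $writers); either may be undef to mean "use a default".
# Dies on anything which is not a positive integer.
sub parse_jobs {
	my ($jobs) = @_;
	my ($xj, $mj) = split(/,/, $jobs // '');
	for my $j ($xj, $mj) {
		$j = undef if defined($j) && $j eq '';
		next unless defined $j;
		$j =~ /\A[1-9][0-9]*\z/ or die "`$j' jobs must be >= 1\n";
	}
	($xj, $mj);
}
```

A caller would then fill in undef values from its own defaults and
could clamp them to the number of online processors.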
---
 lib/PublicInbox/IPC.pm        | 19 +++++++++++++++++++
 lib/PublicInbox/LEI.pm        |  5 ++++-
 lib/PublicInbox/LeiQuery.pm   | 14 ++++++++++++--
 lib/PublicInbox/LeiXSearch.pm |  1 -
 lib/PublicInbox/V2Writable.pm | 22 ++--------------------
 5 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 50de1bed..3873649b 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -466,4 +466,23 @@ sub DESTROY {
 # Sereal doesn't have dclone
 sub deep_clone { ipc_thaw(ipc_freeze($_[-1])) }
 
+sub detect_nproc () {
+	# _SC_NPROCESSORS_ONLN = 84 on both Linux glibc and musl
+	return POSIX::sysconf(84) if $^O eq 'linux';
+	return POSIX::sysconf(58) if $^O eq 'freebsd';
+	# TODO: more OSes
+
+	# getconf(1) is POSIX, but *NPROCESSORS* vars are not
+	for (qw(_NPROCESSORS_ONLN NPROCESSORS_ONLN)) {
+		`getconf $_ 2>/dev/null` =~ /^(\d+)$/ and return $1;
+	}
+	for my $nproc (qw(nproc gnproc)) { # GNU coreutils nproc
+		`$nproc 2>/dev/null` =~ /^(\d+)$/ and return $1;
+	}
+
+	# should we bother with `sysctl hw.ncpu`?  Those only give
+	# us total processor count, not online processor count.
+	undef
+}
+
 1;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2c512c5e..9afc90cf 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -104,7 +104,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
-	include|I=s@ exclude=s@ only=s@
+	include|I=s@ exclude=s@ only=s@ jobs|j=s
 	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
@@ -236,6 +236,9 @@ my %OPTDESC = (
 'q	only=s@' => [ 'URL_OR_PATHNAME',
 		'only use specified external(s) for search' ],
 
+'q	jobs=s'	=> [ '[SEARCH_JOBS][,WRITER_JOBS]',
+		'control number of search and writer jobs' ],
+
 'ls-query	format|f=s' => $ls_format,
 'ls-external	format|f=s' => $ls_format,
 
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index ca214ca1..72a67c24 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -17,6 +17,7 @@ sub lei_q {
 	my ($self, @argv) = @_;
 	require PublicInbox::LeiXSearch;
 	require PublicInbox::LeiOverview;
+	require PublicInbox::V2Writable;
 	PublicInbox::Config->json; # preload before forking
 	my $opt = $self->{opt};
 	# prepare any number of LeiXSearch || LeiSearch || Inbox || URL
@@ -53,13 +54,22 @@ sub lei_q {
 	unless ($lxs->locals || $lxs->remotes) {
 		return $self->fail('no local or remote inboxes to search');
 	}
-	my $xj = $lxs->concurrency($opt);
+	my ($xj, $mj) = split(/,/, $opt->{jobs} // '');
+	if (defined($xj) && $xj ne '' && $xj !~ /\A[1-9][0-9]*\z/) {
+		return $self->fail("`$xj' search jobs must be >= 1");
+	}
+	$xj ||= $lxs->concurrency($opt); # allow: "--jobs ,$WRITER_ONLY"
+	my $nproc = $lxs->detect_nproc; # don't memoize, schedtool(1) exists
+	$xj = $nproc if $xj > $nproc;
 	PublicInbox::LeiOverview->new($self) or return;
 	$self->atfork_prepare_wq($lxs);
 	$lxs->wq_workers_start('lei_xsearch', $xj, $self->oldset);
 	delete $lxs->{-ipc_atfork_child_close};
 	if (my $l2m = $self->{l2m}) {
-		my $mj = 4; # TODO: configurable
+		if (defined($mj) && $mj !~ /\A[1-9][0-9]*\z/) {
+			return $self->fail("`$mj' writer jobs must be >= 1");
+		}
+		$mj //= $nproc;
 		$self->atfork_prepare_wq($l2m);
 		$l2m->wq_workers_start('lei2mail', $mj, $self->oldset);
 		delete $l2m->{-ipc_atfork_child_close};
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 57a18075..37bd233e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -350,7 +350,6 @@ sub do_post_augment {
 }
 
 my $MAX_PER_HOST = 4;
-sub MAX_PER_HOST { $MAX_PER_HOST }
 
 sub concurrency {
 	my ($self, $opt) = @_;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 35b7fe30..cbd4f003 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -8,6 +8,7 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::Lock);
 use PublicInbox::SearchIdxShard;
+use PublicInbox::IPC;
 use PublicInbox::Eml;
 use PublicInbox::Git;
 use PublicInbox::Import;
@@ -35,32 +36,13 @@ our $PACKING_FACTOR = 0.4;
 # to increase Xapian shards
 our $NPROC_MAX_DEFAULT = 4;
 
-sub detect_nproc () {
-	# _SC_NPROCESSORS_ONLN = 84 on both Linux glibc and musl
-	return POSIX::sysconf(84) if $^O eq 'linux';
-	return POSIX::sysconf(58) if $^O eq 'freebsd';
-	# TODO: more OSes
-
-	# getconf(1) is POSIX, but *NPROCESSORS* vars are not
-	for (qw(_NPROCESSORS_ONLN NPROCESSORS_ONLN)) {
-		`getconf $_ 2>/dev/null` =~ /^(\d+)$/ and return $1;
-	}
-	for my $nproc (qw(nproc gnproc)) { # GNU coreutils nproc
-		`$nproc 2>/dev/null` =~ /^(\d+)$/ and return $1;
-	}
-
-	# should we bother with `sysctl hw.ncpu`?  Those only give
-	# us total processor count, not online processor count.
-	undef
-}
-
 sub nproc_shards ($) {
 	my ($creat_opt) = @_;
 	my $n = $creat_opt->{nproc} if ref($creat_opt) eq 'HASH';
 	$n //= $ENV{NPROC};
 	if (!$n) {
 		# assume 2 cores if not detectable or zero
-		state $NPROC_DETECTED = detect_nproc() || 2;
+		state $NPROC_DETECTED = PublicInbox::IPC::detect_nproc() || 2;
 		$n = $NPROC_DETECTED;
 		$n = $NPROC_MAX_DEFAULT if $n > $NPROC_MAX_DEFAULT;
 	}

^ permalink raw reply related	[relevance 48%]

* Re: can lei require Inline::C?
  2021-02-02 10:09 71% can lei require Inline::C? Eric Wong
@ 2021-02-03  0:02 71% ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2021-02-03  0:02 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Performance and interactivity suck without being able to use FD
> passing (especially tab completions).  And having to maintain
> separate code paths is a huge time sink...
>
> Inline::C is packaged by every relevant distro (unlike
> Socket::Msghdr), and I figure anybody who uses lei at this stage
> will have a C compiler...

Yeah, I'd figure the same, so my uninformed guess/opinion is that the
separate code paths aren't worth the trouble.

^ permalink raw reply	[relevance 71%]

* [PATCH 02/11] lei: further reduce lei2mail FD pressure
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
  2021-02-03  8:11 67% ` [PATCH 01/11] lei: reduce FD pressure from lei2mail worker Eric Wong
@ 2021-02-03  8:11 71% ` Eric Wong
  2021-02-03  8:11 71% ` [PATCH 04/11] lei: err: avoid uninitialized variable warnings Eric Wong
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

We don't need to be sending errors directly to the client, but
instead go through lei-daemon or the top-level one-shot process.
---
 lib/PublicInbox/LeiOverview.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 88034ada..366af8b2 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -216,7 +216,9 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
+		my $sock = delete $lei->{sock}; # lei2mail doesn't need it
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
+		$lei->{sock} = $sock if $sock;
 		# $io[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete

^ permalink raw reply related	[relevance 71%]

* [PATCH 04/11] lei: err: avoid uninitialized variable warnings
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
  2021-02-03  8:11 67% ` [PATCH 01/11] lei: reduce FD pressure from lei2mail worker Eric Wong
  2021-02-03  8:11 71% ` [PATCH 02/11] lei: further reduce lei2mail FD pressure Eric Wong
@ 2021-02-03  8:11 71% ` Eric Wong
  2021-02-03  8:11 41% ` [PATCH 05/11] lei: propagate curl errors, improve internal consistency Eric Wong
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

---
 lib/PublicInbox/LEI.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9afc90cf..9b4d4e0b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -308,12 +308,12 @@ sub x_it ($$) {
 sub err ($;@) {
 	my $self = shift;
 	my $err = $self->{2} // ($self->{pgr} // [])->[2] // *STDERR{GLOB};
-	my $eor = (substr($_[-1], -1, 1) eq "\n" ? () : "\n");
-	print $err @_, $eor and return;
+	my @eor = (substr($_[-1]//'', -1, 1) eq "\n" ? () : ("\n"));
+	print $err @_, @eor and return;
 	my $old_err = delete $self->{2};
-	close($old_err) if $! == EPIPE && $old_err;;
+	close($old_err) if $! == EPIPE && $old_err;
 	$err = $self->{2} = ($self->{pgr} // [])->[2] // *STDERR{GLOB};
-	print $err @_, $eor or print STDERR @_, $eor;
+	print $err @_, @eor or print STDERR @_, @eor;
 }
 
 sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }

^ permalink raw reply related	[relevance 71%]

* [PATCH 07/11] lei: complete basenames for include|exclude|only
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-03  8:11 54% ` [PATCH 06/11] lei q: -I/--exclude/--only support globs and basenames Eric Wong
@ 2021-02-03  8:11 71% ` Eric Wong
  2021-02-03  8:11 71% ` [PATCH 08/11] lei: help starts pager Eric Wong
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

This will make it even easier for RSI-afflicted users to use,
since many externals may share a common prefix.
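
The basename matching can be sketched on its own, assuming a
hypothetical complete_basename helper and a plain list of locations:
only basenames which belong to exactly one external are offered, so
a completed name can never be ambiguous.

```perl
use strict;
use warnings;

# Hypothetical helper: given a typed prefix and a list of external
# locations, return the basenames which start with the prefix and
# which belong to exactly one location.
sub complete_basename {
	my ($prefix, @locations) = @_;
	my %bn;
	for my $loc (@locations) {
		# split drops trailing empty fields, so a trailing
		# slash does not hide the last path component
		my $bn = (split(m!/!, $loc))[-1];
		++$bn{$bn};
	}
	my @c = sort(grep { $bn{$_} == 1 && /\A\Q$prefix\E/ } keys %bn);
	@c;
}
```

For example, with externals "https://example.com/meta/" and
"/srv/mirror/meta", the basename "meta" is shared and is therefore
never suggested.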
---
 lib/PublicInbox/LeiQuery.pm | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 10b8d6fa..8015ecec 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -112,11 +112,22 @@ sub lei_q {
 sub _complete_q {
 	my ($self, @argv) = @_;
 	my $ext = qr/\A(?:-I|(?:--(?:include|exclude|only)))\z/;
-	# $argv[-1] =~ $ext and return $self->_complete_forget_external;
 	my @cur;
 	while (@argv) {
 		if ($argv[-1] =~ $ext) {
 			my @c = $self->_complete_forget_external(@cur);
+			# try basename match:
+			if (scalar(@cur) == 1 && index($cur[0], '/') < 0) {
+				my $all = $self->externals_each;
+				my %bn;
+				for my $loc (keys %$all) {
+					my $bn = (split(m!/!, $loc))[-1];
+					++$bn{$bn};
+				}
+				push @c, grep {
+					$bn{$_} == 1 && /\A\Q$cur[0]/
+				} keys %bn;
+			}
 			return @c if @c;
 		}
 		unshift(@cur, pop @argv);

^ permalink raw reply related	[relevance 71%]

* [PATCH 08/11] lei: help starts pager
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-03  8:11 71% ` [PATCH 07/11] lei: complete basenames for include|exclude|only Eric Wong
@ 2021-02-03  8:11 71% ` Eric Wong
  2021-02-03  8:11 56% ` [PATCH 09/11] lei add-external: completion for existing URL basenames Eric Wong
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

Because some commands have many options which take up
multiple screens.
---
 lib/PublicInbox/LEI.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3cb7a327..005f6f7a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -507,7 +507,9 @@ EOF
 		$msg .= $rhs;
 		$msg .= "\n";
 	}
-	print { $self->{$errmsg ? 2 : 1} } $msg;
+	my $out = $self->{$errmsg ? 2 : 1};
+	start_pager($self) if -t $out;
+	print $out $msg;
 	x_it($self, $errmsg ? 1 << 8 : 0); # stderr => failure
 	undef;
 }

^ permalink raw reply related	[relevance 71%]

* [PATCH 01/11] lei: reduce FD pressure from lei2mail worker
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
@ 2021-02-03  8:11 67% ` Eric Wong
  2021-02-03  8:11 71% ` [PATCH 02/11] lei: further reduce lei2mail FD pressure Eric Wong
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

lei2mail doesn't need stdin anymore, so we can use the [0] slot
for the $not_done keepalive purposes.
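
The keepalive relies on ordinary pipe semantics, which a
single-process sketch can show (the second write handle here merely
stands in for a copy held by a worker): the reader only sees EOF once
every copy of the write end has been closed.

```perl
use strict;
use warnings;

# $r sees EOF only once every copy of the write end is closed.
pipe(my $r, my $w) or die "pipe: $!";
open(my $w2, '>&', $w) or die "dup: $!"; # a second holder, e.g. a worker
close($w) or die "close: $!";
# One holder remains, so a blocking read here would not return yet.
close($w2) or die "close: $!";
my $n = sysread($r, my $buf, 1); # returns 0 at EOF
defined($n) or die "sysread: $!";
```

Nothing is ever written to the pipe; the EOF alone is the
notification.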
---
 lib/PublicInbox/LeiOverview.pm | 8 ++++----
 lib/PublicInbox/LeiToMail.pm   | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 52da225d..88034ada 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -217,13 +217,13 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
-		# $io[-1] becomes a notification pipe that triggers EOF
+		# $io[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete
-		pipe($l2m->{each_smsg_done}, $io[$#io + 1]) or die "pipe: $!";
-		fcntl($io[-1], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
+		$io[0] = undef;
+		pipe($l2m->{each_smsg_done}, $io[0]) or die "pipe: $!";
+		fcntl($io[0], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
 		delete @$lei_ipc{qw(l2m opt mset_opt cmd)};
-		$lei_ipc->{each_smsg_not_done} = $#io;
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git;
 		my $git_dir = $git->{git_dir};
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index c6c5f84b..c704dc2a 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -464,7 +464,7 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $smsg, $lei) = @_;
-	my $not_done = delete $self->{$lei->{each_smsg_not_done}};
+	my $not_done = delete $self->{0} // die 'BUG: $not_done missing';
 	my $wcb = $self->{wcb} //= do { # first message
 		$lei->atfork_child_wq($self);
 		$self->write_cb($lei);

^ permalink raw reply related	[relevance 67%]

* [PATCH 00/11] lei q --stdin, shortcut names, etc
@ 2021-02-03  8:11 66% Eric Wong
  2021-02-03  8:11 67% ` [PATCH 01/11] lei: reduce FD pressure from lei2mail worker Eric Wong
                   ` (9 more replies)
  0 siblings, 10 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

Since externals tend to have common URL or pathname prefixes,
it's now possible to use -I/--only/--exclude with just the
basename of a URL or directory if that's unambiguous.

Wildcard matches are also supported with -I/--only/--exclude.

forget-external still requires the full path, but that's
rarely-used.

add-external bash completion now supports URL hostnames
and common base names.

"lei q" also supports reading queries from stdin.
FD use is slightly reduced, but still far from ideal
(it's bad when I have to bump "ulimit -n" to reattach
 screen(1) while I'm running stress tests).

Eric Wong (11):
  lei: reduce FD pressure from lei2mail worker
  lei: further reduce lei2mail FD pressure
  pkt_op: rely on DS::in_loop global
  lei: err: avoid uninitialized variable warnings
  lei: propagate curl errors, improve internal consistency
  lei q: -I/--exclude/--only support globs and basenames
  lei: complete basenames for include|exclude|only
  lei: help starts pager
  lei add-external: completion for existing URL basenames
  lei: use sleep(1) loop for infinite sleep
  lei q: support reading queries from stdin

 MANIFEST                               |  1 +
 contrib/completion/lei-completion.bash |  6 ++
 lib/PublicInbox/InputPipe.pm           | 37 ++++++++++++
 lib/PublicInbox/LEI.pm                 | 37 +++++++-----
 lib/PublicInbox/LeiExternal.pm         | 82 +++++++++++++++++++++-----
 lib/PublicInbox/LeiOverview.pm         |  9 ++-
 lib/PublicInbox/LeiQuery.pm            | 59 ++++++++++++++----
 lib/PublicInbox/LeiToMail.pm           |  2 +-
 lib/PublicInbox/LeiXSearch.pm          | 20 +++----
 lib/PublicInbox/PktOp.pm               | 25 +++++---
 script/lei                             |  2 +-
 t/lei.t                                | 51 ++++++++++++----
 12 files changed, 248 insertions(+), 83 deletions(-)
 create mode 100644 lib/PublicInbox/InputPipe.pm


^ permalink raw reply	[relevance 66%]

* [PATCH 06/11] lei q: -I/--exclude/--only support globs and basenames
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-03  8:11 41% ` [PATCH 05/11] lei: propagate curl errors, improve internal consistency Eric Wong
@ 2021-02-03  8:11 54% ` Eric Wong
  2021-02-03  8:11 71% ` [PATCH 07/11] lei: complete basenames for include|exclude|only Eric Wong
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

We can do basename matching when it's unambiguous.  Since '*?[]'
characters are rare in URLs and pathnames, we'll do glob
matching by default and support a (curl-inspired) --globoff/-g
option to disable globbing.

And fix --exclude while we're at it: it was reading
$self->{exclude} instead of $opt->{exclude}.
---
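A standalone sketch of the glob translation, for readers who want to
try it outside of lei.  The table and glob2pat() mirror the hunk below;
the @externals list is made up for illustration.  Note that the
resulting pattern is matched unanchored, as get_externals() does.

```perl
use strict;
use warnings;

# Same translation table as the patch: '*' and '?' never cross a '/',
# and '[' / ']' pass through so character ranges keep working.
my %patmap = ('*' => '[^/]*?', '?' => '[^/]', '[' => '[', ']' => ']');

sub glob2pat {
	my ($glob) = @_;
	# every other character is escaped so it matches literally
	$glob =~ s!(.)!$patmap{$1} || "\Q$1"!ge;
	$glob;
}

# hypothetical externals, for illustration only
my @externals = qw(
	https://example.com/meta/
	https://example.com/git/
	/home/user/mail/meta-archive
);

my $re = glob2pat('example.com/*');
my @hits = grep(m!$re!, @externals);
print "$_\n" for @hits; # the two example.com URLs
```

Because the match is unanchored, a glob only needs to cover a
distinctive part of the location, not the whole URL or path.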
 lib/PublicInbox/LEI.pm         |  3 ++-
 lib/PublicInbox/LeiExternal.pm | 38 +++++++++++++++++++++++++++++++++-
 lib/PublicInbox/LeiQuery.pm    | 14 ++++++++-----
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 05a39cad..3cb7a327 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -104,7 +104,7 @@ our %CMD = ( # sorted in order of importance/use:
 'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
-	include|I=s@ exclude=s@ only=s@ jobs|j=s
+	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g
 	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
@@ -201,6 +201,7 @@ my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 my %OPTDESC = (
 'help|h' => 'show this built-in help',
 'quiet|q' => 'be quiet',
+'globoff|g' => "do not match locations using '*?' wildcards and '[]' ranges",
 'verbose|v' => 'be more verbose',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
 'torsocks=s' => ['auto|no|yes',
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 3853cfc1..6b4c7fb0 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -39,7 +39,7 @@ sub lei_ls_external {
 }
 
 sub ext_canonicalize {
-	my ($location) = $_[-1];
+	my ($location) = @_;
 	if ($location !~ m!\Ahttps?://!) {
 		PublicInbox::Config::rel2abs_collapsed($location);
 	} else {
@@ -52,6 +52,42 @@ sub ext_canonicalize {
 	}
 }
 
+my %patmap = ('*' => '[^/]*?', '?' => '[^/]', '[' => '[', ']' => ']');
+sub glob2pat {
+	my ($glob) = @_;
+        $glob =~ s!(.)!$patmap{$1} || "\Q$1"!ge;
+        $glob;
+}
+
+sub get_externals {
+	my ($self, $loc, $exclude) = @_;
+	return (ext_canonicalize($loc)) if -e $loc;
+
+	my @m;
+	my @cur = externals_each($self);
+	my $do_glob = !$self->{opt}->{globoff}; # glob by default
+	if ($do_glob && ($loc =~ /[\*\?]/s || $loc =~ /\[.*\]/s)) {
+		my $re = glob2pat($loc);
+		@m = grep(m!$re!, @cur);
+		return @m if scalar(@m);
+	} elsif (index($loc, '/') < 0) { # exact basename match:
+		@m = grep(m!/\Q$loc\E/?\z!, @cur);
+		return @m if scalar(@m) == 1;
+	} elsif ($exclude) { # URL, maybe:
+		my $canon = ext_canonicalize($loc);
+		@m = grep(m!\A\Q$canon\E\z!, @cur);
+		return @m if scalar(@m) == 1;
+	} else { # URL:
+		return (ext_canonicalize($loc));
+	}
+	if (scalar(@m) == 0) {
+		$self->fail("`$loc' is unknown");
+	} else {
+		$self->fail("`$loc' is ambiguous:\n", map { "\t$_\n" } @m);
+	}
+	();
+}
+
 sub lei_add_external {
 	my ($self, $location) = @_;
 	my $cfg = $self->_lei_cfg(1);
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 72a67c24..10b8d6fa 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -31,17 +31,21 @@ sub lei_q {
 	}
 	if (@only) {
 		for my $loc (@only) {
-			$lxs->prepare_external($self->ext_canonicalize($loc));
+			my @loc = $self->get_externals($loc) or return;
+			$lxs->prepare_external($_) for @loc;
 		}
 	} else {
 		for my $loc (@{$opt->{include} // []}) {
-			$lxs->prepare_external($self->ext_canonicalize($loc));
+			my @loc = $self->get_externals($loc) or return;
+			$lxs->prepare_external($_) for @loc;
 		}
 		# --external is enabled by default, but allow --no-external
 		if ($opt->{external} //= 1) {
-			my %x = map {;
-				($self->ext_canonicalize($_), 1)
-			} @{$self->{exclude} // []};
+			my %x;
+			for my $loc (@{$opt->{exclude} // []}) {
+				my @l = $self->get_externals($loc, 1) or return;
+				$x{$_} = 1 for @l;
+			}
 			my $ne = $self->externals_each(\&prep_ext, $lxs, \%x);
 			$opt->{remote} //= !($lxs->locals - $opt->{'local'});
 			if ($opt->{'local'}) {

^ permalink raw reply related	[relevance 54%]

* [PATCH 05/11] lei: propagate curl errors, improve internal consistency
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-03  8:11 71% ` [PATCH 04/11] lei: err: avoid uninitialized variable warnings Eric Wong
@ 2021-02-03  8:11 41% ` Eric Wong
  2021-02-03  8:11 54% ` [PATCH 06/11] lei q: -I/--exclude/--only support globs and basenames Eric Wong
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

IO::Uncompress::Gunzip seems to be losing $? when closing
PublicInbox::ProcessPipe.  To work around this, do a synchronous
waitpid ourselves to force proper $? reporting.  Update tests to
use the new --only feature for testing invalid URLs.

This improves internal code consistency by having {pkt_op}
parse the same ASCII-only protocol script/lei understands.

We no longer pass {sock} to worker processes at all,
further reducing FD pressure on per-user limits.
---
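A rough sketch of the two message shapes {pkt_op} accepts after this
change: "cmd\0<serialized args>" from workers, and the space-separated
ASCII form script/lei understands.  Storable stands in for ipc_thaw
here (an assumption; the real code may use Sereal), and parse_msg and
the %ops table are illustrative names, not part of the patch.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Split one datagram into a command name and its arguments.
# "cmd\0<frozen array>" carries structured args; anything else is
# treated as the space-separated ASCII form.
sub parse_msg {
	my ($msg) = @_;
	my ($cmd, @pargs);
	if (index($msg, "\0") > 0) {
		($cmd, my $frozen) = split(/\0/, $msg, 2);
		@pargs = @{thaw($frozen)};
	} else {
		($cmd, @pargs) = split(/ /, $msg);
	}
	($cmd, @pargs);
}

# hypothetical dispatch table, for illustration only
my %ops = (
	child_error => sub { "child_error=$_[0]" },
	mset_progress => sub { join('/', @_) },
);

for my $msg ("child_error 768",
		"mset_progress\0" . freeze(['url', 3, 10])) {
	my ($cmd, @pargs) = parse_msg($msg);
	my $sub = $ops{$cmd} or die "unknown message: $cmd";
	print $sub->(@pargs), "\n";
}
```

The ASCII form needs no serializer on the sending side, which is why
script/lei can avoid loading Sereal or Storable at startup.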
 lib/PublicInbox/LEI.pm         | 15 ++++++++-------
 lib/PublicInbox/LeiOverview.pm |  2 --
 lib/PublicInbox/LeiXSearch.pm  | 16 +++++++---------
 lib/PublicInbox/PktOp.pm       | 15 +++++++++++----
 t/lei.t                        | 29 ++++++++++++++++-------------
 5 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9b4d4e0b..05a39cad 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -285,8 +285,8 @@ sub x_it ($$) {
 	# make sure client sees stdout before exit
 	$self->{1}->autoflush(1) if $self->{1};
 	dump_and_clear_log();
-	if (my $sock = $self->{sock}) {
-		send($sock, "x_it $code", MSG_EOR);
+	if (my $s = $self->{pkt_op} // $self->{sock}) {
+		send($s, "x_it $code", MSG_EOR);
 	} elsif ($self->{oneshot}) {
 		# don't want to end up using $? from child processes
 		for my $f (qw(lxs l2m)) {
@@ -339,9 +339,10 @@ sub puts ($;@) { out(shift, map { "$_\n" } @_) }
 
 sub child_error { # passes non-fatal curl exit codes to user
 	my ($self, $child_error) = @_; # child_error is $?
-	if (my $sock = $self->{sock}) { # send to lei(1) client
-		send($sock, "child_error $child_error", MSG_EOR);
-	} elsif ($self->{oneshot}) {
+	if (my $s = $self->{pkt_op} // $self->{sock}) {
+		# send to the parent lei-daemon or to lei(1) client
+		send($s, "child_error $child_error", MSG_EOR);
+	} elsif (!$PublicInbox::DS::in_loop) {
 		$self->{child_error} = $child_error;
 	} # else noop if client disconnected
 }
@@ -420,9 +421,9 @@ sub atfork_parent_wq {
 		$lei->{$f} = $wq->deep_clone($tmp);
 	}
 	$self->{env} = $env;
-	delete @$lei{qw(3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
+	delete @$lei{qw(sock 3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
 	my @io = (delete(@$lei{qw(0 1 2)}),
-			io_extract($lei, qw(sock pkt_op startq)));
+			io_extract($lei, qw(pkt_op startq)));
 	my $l2m = $lei->{l2m};
 	if ($l2m && $l2m != $wq) { # $wq == lxs
 		if (my $wq_s1 = $l2m->{-wq_s1}) {
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 366af8b2..88034ada 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -216,9 +216,7 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
-		my $sock = delete $lei->{sock}; # lei2mail doesn't need it
 		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
-		$lei->{sock} = $sock if $sock;
 		# $io[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 23a9c020..d33064bb 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -113,8 +113,7 @@ sub mset_progress {
 	if ($lei->{pkt_op}) { # called via pkt_op/pkt_do from workers
 		pkt_do($lei->{pkt_op}, 'mset_progress', @_);
 	} else { # single lei-daemon consumer
-		my @args = ref($_[-1]) eq 'ARRAY' ? @{$_[-1]} : @_;
-		my ($desc, $mset_size, $mset_total_est) = @args;
+		my ($desc, $mset_size, $mset_total_est) = @_;
 		$lei->{-mset_total} += $mset_size;
 		$lei->err("# $desc $mset_size/$mset_total_est");
 	}
@@ -264,14 +263,11 @@ sub query_remote_mboxrd {
 		shift(@$cmd) if !$cmd->[0];
 
 		$lei->err("# @$cmd") if $verbose;
-		$? = 0;
-		my $fh = popen_rd($cmd, $env, $rdr);
+		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
 		$fh = IO::Uncompress::Gunzip->new($fh);
-		eval {
-			PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
-							$lei, $each_smsg);
-		};
-		return $lei->fail("E: @$cmd: $@") if $@;
+		PublicInbox::MboxReader->mboxrd($fh, \&each_eml, $self,
+						$lei, $each_smsg);
+		waitpid($pid, 0) == $pid or die "BUG: waitpid (curl): $!";
 		if ($? == 0) {
 			my $nr = $lei->{-nr_remote_eml};
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
@@ -420,6 +416,8 @@ sub do_query {
 		'.' => [ \&do_post_augment, $lei, $zpipe, $au_done ],
 		'' => [ \&query_done, $lei ],
 		'mset_progress' => [ \&mset_progress, $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
 	};
 	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops);
 	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
diff --git a/lib/PublicInbox/PktOp.pm b/lib/PublicInbox/PktOp.pm
index 40c7262a..10d76da0 100644
--- a/lib/PublicInbox/PktOp.pm
+++ b/lib/PublicInbox/PktOp.pm
@@ -4,8 +4,7 @@
 # op dispatch socket, reads a message, runs a sub
 # There may be multiple producers, but (for now) only one consumer
 # Used for lei_xsearch and maybe other things
-# "literal" => [ sub, @operands ]
-# /regexp/ => [ sub, @operands ]
+# "command" => [ $sub, @fixed_operands ]
 package PublicInbox::PktOp;
 use strict;
 use v5.10.1;
@@ -57,11 +56,19 @@ sub event_step {
 			$self->close;
 			die "recv: $!";
 		}
-		my ($cmd, $pargs) = split(/\0/, $msg, 2);
+		my ($cmd, @pargs);
+		if (index($msg, "\0") > 0) {
+			($cmd, my $pargs) = split(/\0/, $msg, 2);
+			@pargs = @{ipc_thaw($pargs)};
+		} else {
+			# for compatibility with the script/lei in client mode,
+			# it doesn't load Sereal||Storable for startup speed
+			($cmd, @pargs) = split(/ /, $msg);
+		}
 		my $op = $self->{ops}->{$cmd //= $msg};
 		die "BUG: unknown message: `$cmd'" unless $op;
 		my ($sub, @args) = @$op;
-		$sub->(@args, $pargs ? ipc_thaw($pargs) : ());
+		$sub->(@args, @pargs);
 		return $self->close if $msg eq ''; # close on EOF
 	} while (1);
 }
diff --git a/t/lei.t b/t/lei.t
index 33f47ae4..461669a8 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -14,6 +14,7 @@ require_mods(qw(json DBD::SQLite Search::Xapian));
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my ($home, $for_destroy) = tmpdir();
 my $err_filter;
+my $curl = which('curl');
 my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
 	http://czquwvybam4bgbro.onion/meta/
 	http://ou63pmih66umazou.onion/meta/);
@@ -39,7 +40,7 @@ local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 local $ENV{HOME} = $home;
 local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
-my $home_trash = [ "$home/.local", "$home/.config" ];
+my $home_trash = [ "$home/.local", "$home/.config", "$home/junk" ];
 my $cleanup = sub { rmtree([@$home_trash, @_]) };
 my $config_file = "$home/.config/lei/config";
 my $store_dir = "$home/.local/share/lei";
@@ -162,26 +163,19 @@ my $setup_publicinboxes = sub {
 my $test_external_remote = sub {
 	my ($url, $k) = @_;
 SKIP: {
-	my $nr = 4;
+	my $nr = 5;
 	skip "$k unset", $nr if !$url;
-	which('curl') or skip 'no curl', $nr;
+	$curl or skip 'no curl', $nr;
 	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
-	$lei->('ls-external');
-	for my $e (split(/^/ms, $out)) {
-		$e =~ s/\s+boost.*//s;
-		$lei->('forget-external', '-q', $e) or
-			fail "error forgetting $e: $err"
-	}
-	$lei->('add-external', $url);
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
-	ok($lei->('q', '-q', "m:$mid"), "query $url");
+	my @cmd = ('q', '--only', $url, '-q', "m:$mid");
+	ok($lei->(@cmd), "query $url");
 	is($err, '', "no errors on $url");
 	my $res = $json->decode($out);
 	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
-	ok($lei->('q', '-q', "m:$mid", 'd:..20101002'), 'no results, no error');
+	ok($lei->(@cmd, 'd:..20101002'), 'no results, no error');
 	is($err, '', 'no output on 404, matching local FS behavior');
 	is($out, "[null]\n", 'got null results');
-	$lei->('forget-external', $url);
 } # /SKIP
 }; # /sub
 
@@ -355,12 +349,21 @@ my $test_completion = sub {
 	}
 };
 
+my $test_fail = sub {
+	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
+	is($? >> 8, 3, 'got curl exit for bogus URL');
+	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
+	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir');
+	is($out, '', 'no output');
+};
+
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
 	$test_init->();
 	$test_external->();
 	$test_completion->();
+	$test_fail->();
 };
 
 if ($ENV{TEST_LEI_ONESHOT}) {

^ permalink raw reply related	[relevance 41%]

* [PATCH 10/11] lei: use sleep(1) loop for infinite sleep
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (7 preceding siblings ...)
  2021-02-03  8:11 56% ` [PATCH 09/11] lei add-external: completion for existing URL basenames Eric Wong
@ 2021-02-03  8:11 71% ` Eric Wong
  2021-02-03  8:11 43% ` [PATCH 11/11] lei q: support reading queries from stdin Eric Wong
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

Perl may internally race and miss signals due to a lack of
self-pipe / eventfd / signalfd / EVFILT_SIGNAL usage.  While our
event loop paths avoid these problems by using signalfd or
EVFILT_SIGNAL, these sleep() calls are not within the event loop.
---
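A small sketch of the idea, not the code in the patch: lei re-raises
the signal with the default disposition and expects to die, so it
loops forever, while this variant installs a handler so the loop can
terminate and be observed.  $got is an illustrative name.

```perl
use strict;
use warnings;

my $got;
$SIG{USR1} = sub { $got = $_[0] }; # handler receives the signal name
kill 'USR1', $$;

# A bare sleep() may block indefinitely if the signal was already
# consumed before the call began; a bounded sleep(1) re-checks the
# condition at least once per second.
sleep(1) until defined $got;
print "got SIG$got\n";
```

The worst case with the bounded loop is a delay of about one second
rather than a process that never wakes up.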
 lib/PublicInbox/LEI.pm | 2 +-
 script/lei             | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 005f6f7a..28dce0c5 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -299,7 +299,7 @@ sub x_it ($$) {
 		if (my $signum = ($code & 127)) { # usually SIGPIPE (13)
 			$SIG{PIPE} = 'DEFAULT'; # $SIG{$signum} doesn't work
 			kill $signum, $$;
-			sleep; # wait for signal
+			sleep(1) while 1; # wait for signal
 		} else {
 			$quit->($code >> 8);
 		}
diff --git a/script/lei b/script/lei
index 58f0dbe9..40c21ad8 100755
--- a/script/lei
+++ b/script/lei
@@ -116,7 +116,7 @@ Falling back to (slow) one-shot mode
 	sigchld();
 	if (my $sig = ($x_it_code & 127)) {
 		kill $sig, $$;
-		sleep;
+		sleep(1) while 1;
 	}
 	exit($x_it_code >> 8);
 } else { # for systems lacking Socket::MsgHdr or Inline::C

^ permalink raw reply related	[relevance 71%]

* [PATCH 09/11] lei add-external: completion for existing URL basenames
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (6 preceding siblings ...)
  2021-02-03  8:11 71% ` [PATCH 08/11] lei: help starts pager Eric Wong
@ 2021-02-03  8:11 56% ` Eric Wong
  2021-02-03  8:11 71% ` [PATCH 10/11] lei: use sleep(1) loop for infinite sleep Eric Wong
  2021-02-03  8:11 43% ` [PATCH 11/11] lei q: support reading queries from stdin Eric Wong
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

Given the presence of one external on a certain host or prefix
path, it's logical other inboxes would share a common prefix.
For bash users, attempt to complete that using the "-o nospace"
option of bash.
---
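To illustrate what "common prefix" means here, a dependency-free
sketch follows.  The patch itself uses the URI module; url_parent is
a hypothetical helper that applies the same path regexp with plain
string handling.

```perl
use strict;
use warnings;
use v5.10.1;

# Return the URL one path component up, or "scheme://host/" when
# there is nothing left to strip.
sub url_parent {
	my ($url) = @_;
	my ($origin, $path) = ($url =~ m!\A(https?://[^/]+)(/.*)?\z!)
		or return;
	$path //= '/';
	# same path regexp as _complete_add_external in the patch
	my ($base) = ($path =~ m!((?:/?.*)?/)[^/]+/?\z!);
	$origin . ($base // '/');
}

print url_parent($_), "\n" for qw(
	https://example.com/ibx/
	https://example.com/a/b/
);
```

With an existing external at https://example.com/ibx/, completion can
then offer https://example.com/ and leave the cursor right after the
slash, which is what "-o nospace" is for.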
 contrib/completion/lei-completion.bash |  6 ++++
 lib/PublicInbox/LeiExternal.pm         | 44 ++++++++++++++++++--------
 t/lei.t                                |  3 ++
 3 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/contrib/completion/lei-completion.bash b/contrib/completion/lei-completion.bash
index 0b82b109..fbda474c 100644
--- a/contrib/completion/lei-completion.bash
+++ b/contrib/completion/lei-completion.bash
@@ -4,6 +4,12 @@
 # preliminary bash completion support for lei (Local Email Interface)
 # Needs a lot of work, see `lei__complete' in lib/PublicInbox::LEI.pm
 _lei() {
+	case ${COMP_WORDS[@]} in
+	*' add-external http'*)
+		compopt -o nospace
+		;;
+	*) compopt +o nospace ;; # the default
+	esac
 	COMPREPLY=($(compgen -W "$(lei _complete ${COMP_WORDS[@]})" \
 			-- "${COMP_WORDS[COMP_CWORD]}"))
 	return 0
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 6b4c7fb0..accacf1a 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -133,17 +133,15 @@ sub lei_forget_external {
 	}
 }
 
-# shell completion helper called by lei__complete
-sub _complete_forget_external {
-	my ($self, @argv) = @_;
-	my $cfg = $self->_lei_cfg(0);
-	my $cur = pop @argv;
+sub _complete_url_common ($) {
+	my ($argv) = @_;
 	# Workaround bash word-splitting URLs to ['https', ':', '//' ...]
 	# Maybe there's a better way to go about this in
 	# contrib/completion/lei-completion.bash
 	my $re = '';
-	if (@argv) {
-		my @x = @argv;
+	my $cur = pop @$argv;
+	if (@$argv) {
+		my @x = @$argv;
 		if ($cur eq ':' && @x) {
 			push @x, $cur;
 			$cur = '';
@@ -154,10 +152,18 @@ sub _complete_forget_external {
 		if (@x >= 2) { # qw(https : hostname : 443) or qw(http :)
 			$re = join('', @x);
 		} else { # just filter out the flags and hope for the best
-			$re = join('', grep(!/^-/, @argv));
+			$re = join('', grep(!/^-/, @$argv));
 		}
 		$re = quotemeta($re);
 	}
+	($cur, $re);
+}
+
+# shell completion helper called by lei__complete
+sub _complete_forget_external {
+	my ($self, @argv) = @_;
+	my $cfg = $self->_lei_cfg(0);
+	my ($cur, $re) = _complete_url_common(\@argv);
 	# FIXME: bash completion off "http:" or "https:" when the last
 	# character is a colon doesn't work properly even if we're
 	# returning "//$HTTP_HOST/$PATH_INFO/", not sure why, could
@@ -165,13 +171,23 @@ sub _complete_forget_external {
 	map {
 		my $x = substr($_, length('external.'));
 		# only return the part specified on the CLI
-		if ($x =~ /\A$re(\Q$cur\E.*)/) {
-			# don't duplicate if already 100% completed
-			$cur eq $1 ? () : $1;
-		} else {
-			();
-		}
+		# don't duplicate if already 100% completed
+		$x =~ /\A$re(\Q$cur\E.*)/ ? ($cur eq $1 ? () : $1) : ();
 	} grep(/\Aexternal\.$re\Q$cur/, @{$cfg->{-section_order}});
 }
 
+sub _complete_add_external { # for bash, this relies on "compopt -o nospace"
+	my ($self, @argv) = @_;
+	my $cfg = $self->_lei_cfg(0);
+	my ($cur, $re) = _complete_url_common(\@argv);
+	require URI;
+	map {
+		my $u = URI->new(substr($_, length('external.')));
+		my ($base) = ($u->path =~ m!((?:/?.*)?/)[^/]+/?\z!);
+		$u->path($base);
+		$u = $u->as_string;
+		$u =~ /\A$re(\Q$cur\E.*)/ ? ($cur eq $1 ? () : $1) : ();
+	} grep(m!\Aexternal\.https?://!, @{$cfg->{-section_order}});
+}
+
 1;
diff --git a/t/lei.t b/t/lei.t
index 461669a8..03bbb078 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -233,6 +233,9 @@ my $test_external = sub {
 				"completed partial URL $u on q $qo");
 		}
 	}
+	ok($lei->(qw(_complete lei add-external), 'https://'),
+		'add-external hostname completion');
+	is($out, "https://example.com/\n", 'completed up to hostname');
 
 	$lei->('ls-external');
 	like($out, qr!https://example\.com/ibx/!s, 'added canonical URL');

^ permalink raw reply related	[relevance 56%]

* [PATCH 11/11] lei q: support reading queries from stdin
  2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
                   ` (8 preceding siblings ...)
  2021-02-03  8:11 71% ` [PATCH 10/11] lei: use sleep(1) loop for infinite sleep Eric Wong
@ 2021-02-03  8:11 43% ` Eric Wong
  9 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-03  8:11 UTC (permalink / raw)
  To: meta

This will be useful on shared machines when a user doesn't want
search queries visible to other users looking at the ps(1)
output or similar.
---
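A blocking-only sketch of the callback convention used by
InputPipe::consume and qstr_add below: the callback gets each chunk,
then '' at EOF, or undef on a read error.  The event loop path
(EPOLLIN|EPOLLET, EAGAIN) is left out, and $state is a hypothetical
stand-in for the lei object.

```perl
use strict;
use warnings;

# Read $in to EOF, invoking $cb with each chunk, then '' at EOF,
# or undef on a read error.
sub consume {
	my ($in, $cb, @args) = @_;
	my ($r, $rbuf);
	while (($r = sysread($in, $rbuf, 65536))) {
		$cb->(@args, $rbuf);
	}
	$cb->(@args, defined($r) ? '' : undef);
}

my $state = { qstr => '' }; # hypothetical stand-in for the lei object
my $add = sub {
	my ($st, $buf) = @_;
	if (!defined($buf)) {
		$st->{error} = "error reading stdin: $!";
	} elsif ($buf eq '') {
		$st->{done} = 1; # the real code starts the query here
	} else {
		$st->{qstr} .= $buf;
	}
};

pipe(my ($r, $w)) or die "pipe: $!";
print $w 's:use' or die "print: $!";
close $w or die "close: $!";
consume($r, $add, $state);
print "query: $state->{qstr}\n";
```

Accumulating until EOF means the query only runs once the writer
closes its end, so a shell here-document or a pipe both work.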
 MANIFEST                       |  1 +
 lib/PublicInbox/InputPipe.pm   | 37 ++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LEI.pm         |  7 ++++---
 lib/PublicInbox/LeiOverview.pm |  1 -
 lib/PublicInbox/LeiQuery.pm    | 32 ++++++++++++++++++++++-------
 lib/PublicInbox/LeiXSearch.pm  |  2 ++
 t/lei.t                        | 19 +++++++++++++++++
 7 files changed, 88 insertions(+), 11 deletions(-)
 create mode 100644 lib/PublicInbox/InputPipe.pm

diff --git a/MANIFEST b/MANIFEST
index bcb9d08e..6922f9b1 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -173,6 +173,7 @@ lib/PublicInbox/In2Tie.pm
 lib/PublicInbox/Inbox.pm
 lib/PublicInbox/InboxIdle.pm
 lib/PublicInbox/InboxWritable.pm
+lib/PublicInbox/InputPipe.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
diff --git a/lib/PublicInbox/InputPipe.pm b/lib/PublicInbox/InputPipe.pm
new file mode 100644
index 00000000..a8bdf031
--- /dev/null
+++ b/lib/PublicInbox/InputPipe.pm
@@ -0,0 +1,37 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# for reading pipes and sockets off the DS event loop
+package PublicInbox::InputPipe;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::DS);
+use PublicInbox::Syscall qw(EPOLLIN EPOLLET);
+
+sub consume {
+	my ($in, $cb, @args) = @_;
+	my $self = bless { cb => $cb, sock => $in, args => \@args },__PACKAGE__;
+	if ($PublicInbox::DS::in_loop) {
+		eval { $self->SUPER::new($in, EPOLLIN|EPOLLET) };
+		return $in->blocking(0) unless $@; # regular file sets $@
+	}
+	event_step($self) while $self->{sock};
+}
+
+sub event_step {
+	my ($self) = @_;
+	my ($r, $rbuf);
+	while (($r = sysread($self->{sock}, $rbuf, 65536))) {
+		$self->{cb}->(@{$self->{args} // []}, $rbuf);
+	}
+	if (defined($r)) { # EOF
+		$self->{cb}->(@{$self->{args} // []}, '');
+	} elsif ($!{EAGAIN}) {
+		return;
+	} else {
+		$self->{cb}->(@{$self->{args} // []}, undef)
+	}
+	$self->{sock}->blocking ? delete($self->{sock}) : $self->close
+}
+
+1;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 28dce0c5..49deed13 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -101,10 +101,10 @@ sub _config_path ($) {
 # TODO: generate shell completion + help using %CMD and %OPTDESC
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
-'q' => [ 'SEARCH_TERMS...', 'search for messages matching terms', qw(
+'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
-	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g
+	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
 	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
@@ -554,12 +554,13 @@ sub optparse ($$$) {
 		} elsif ($var =~ /\A\[-?$POS_ARG\]\z/) { # one optional arg
 			$i++;
 		} elsif ($var =~ /\A.+?\|/) { # required FOO|--stdin
+			$inf = 1 if index($var, '...') > 0;
 			my @or = split(/\|/, $var);
 			my $ok;
 			for my $o (@or) {
 				if ($o =~ /\A--([a-z0-9\-]+)/) {
 					$ok = defined($OPT->{$1});
-					last;
+					last if $ok;
 				} elsif (defined($argv->[$i])) {
 					$ok = 1;
 					$i++;
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 88034ada..e33d63a2 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -81,7 +81,6 @@ sub new {
 	my ($isatty, $seekable);
 	if ($dst eq '/dev/stdout') {
 		$isatty = -t $lei->{1};
-		$lei->start_pager if $isatty;
 		$opt->{pretty} //= $isatty;
 		if (!$isatty && -f _) {
 			my $fl = fcntl($lei->{1}, F_GETFL, 0) //
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 8015ecec..4fe40400 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -12,6 +12,16 @@ sub prep_ext { # externals_each callback
 	$lxs->prepare_external($loc) unless $exclude->{$loc};
 }
 
+sub qstr_add { # for --stdin
+	my ($self) = @_; # $_[1] = $rbuf
+	if (defined($_[1])) {
+		return eval { $self->{lxs}->do_query($self) } if $_[1] eq '';
+		$self->{mset_opt}->{qstr} .= $_[1];
+	} else {
+		$self->fail("error reading stdin: $!");
+	}
+}
+
 # the main "lei q SEARCH_TERMS" method
 sub lei_q {
 	my ($self, @argv) = @_;
@@ -84,12 +94,6 @@ sub lei_q {
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
 	$mset_opt{limit} //= 10000;
-	$mset_opt{qstr} = join(' ', map {;
-		# Consider spaces in argv to be for phrase search in Xapian.
-		# In other words, the users should need only care about
-		# normal shell quotes and not have to learn Xapian quoting.
-		/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
-	} @argv);
 	if (defined(my $sort = $opt->{'sort'})) {
 		if ($sort eq 'relevance') {
 			$mset_opt{relevance} = 1;
@@ -104,7 +108,21 @@ sub lei_q {
 	# descending docid order
 	$mset_opt{relevance} //= -2 if $opt->{thread};
 	$self->{mset_opt} = \%mset_opt;
-	$self->{ovv}->ovv_begin($self);
+
+	if ($opt->{stdin}) {
+		return $self->fail(<<'') if @argv;
+no query allowed on command-line with --stdin
+
+		require PublicInbox::InputPipe;
+		PublicInbox::InputPipe::consume($self->{0}, \&qstr_add, $self);
+		return;
+	}
+	# Consider spaces in argv to be for phrase search in Xapian.
+	# In other words, the users should need only care about
+	# normal shell quotes and not have to learn Xapian quoting.
+	$mset_opt{qstr} = join(' ', map {;
+		/\s/ ? (s/\A(\w+:)// ? qq{$1"$_"} : qq{"$_"}) : $_
+	} @argv);
 	$lxs->do_query($self);
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index d33064bb..965617b5 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -402,6 +402,8 @@ sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
 sub do_query {
 	my ($self, $lei) = @_;
 	$lei->{1}->autoflush(1);
+	$lei->start_pager if -t $lei->{1};
+	$lei->{ovv}->ovv_begin($lei);
 	my ($au_done, $zpipe);
 	my $l2m = $lei->{l2m};
 	if ($l2m) {
diff --git a/t/lei.t b/t/lei.t
index 03bbb078..01eed1da 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -275,6 +275,25 @@ my $test_external = sub {
 	my $pretty = $json->decode($out);
 	is_deeply($res, $pretty, '--pretty is identical after decode');
 
+	{
+		open my $fh, '+>', undef or BAIL_OUT $!;
+		$fh->autoflush(1);
+		print $fh 's:use' or BAIL_OUT $!;
+		seek($fh, 0, SEEK_SET) or BAIL_OUT $!;
+		ok($lei->([qw(q -q --stdin)], undef, { %$opt, 0 => $fh }),
+				'--stdin on regular file works');
+		like($out, qr/use boolean prefix/, '--stdin on regular file');
+	}
+	{
+		pipe(my ($r, $w)) or BAIL_OUT $!;
+		print $w 's:use' or BAIL_OUT $!;
+		close $w or BAIL_OUT $!;
+		ok($lei->([qw(q -q --stdin)], undef, { %$opt, 0 => $r }),
+				'--stdin on pipe file works');
+		like($out, qr/use boolean prefix/, '--stdin on pipe');
+	}
+	ok(!$lei->(qw(q -q --stdin s:use)), "--stdin and argv don't mix");
+
 	for my $fmt (qw(ldjson ndjson jsonl)) {
 		$lei->('q', '-f', $fmt, 's:use boolean prefix');
 		is($out, $json->encode($pretty->[0])."\n", "-f $fmt");

^ permalink raw reply related	[relevance 43%]

* [PATCH] t/lei: skip "lei q" tests on missing dependencies
@ 2021-02-04  2:10 83% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  2:10 UTC (permalink / raw)
  To: meta

... for now.  It's probably possible to just use send() and
recv() without CMSG_* eventually.
---
 t/lei.t | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/t/lei.t b/t/lei.t
index 01eed1da..a08a6d0d 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -9,6 +9,9 @@ use PublicInbox::Config;
 use File::Path qw(rmtree);
 use Fcntl qw(SEEK_SET);
 use PublicInbox::Spawn qw(which);
+my $req_sendcmd = 'Socket::MsgHdr or Inline::C missing or unconfigured';
+undef($req_sendcmd) if PublicInbox::Spawn->can('send_cmd4');
+eval { require Socket::MsgHdr; undef $req_sendcmd };
 require_git 2.6;
 require_mods(qw(json DBD::SQLite Search::Xapian));
 my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
@@ -165,6 +168,7 @@ my $test_external_remote = sub {
 SKIP: {
 	my $nr = 5;
 	skip "$k unset", $nr if !$url;
+	skip $req_sendcmd, $nr if $req_sendcmd;
 	$curl or skip 'no curl', $nr;
 	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
 	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
@@ -245,6 +249,8 @@ my $test_external = sub {
 	$lei->('ls-external');
 	unlike($out, qr!https://example\.com/ibx/!s, 'removed canonical URL');
 
+SKIP: {
+	skip $req_sendcmd, 52 if $req_sendcmd;
 	ok(!$lei->(qw(q s:prefix -o /dev/null -f maildir)), 'bad maildir');
 	like($err, qr!/dev/null exists and is not a directory!,
 		'error shown');
@@ -342,6 +348,7 @@ my $test_external = sub {
 		$url = $e{$k} if $url eq '1';
 		$test_external_remote->($url, $k);
 	}
+	}; # /SKIP
 };
 
 my $test_completion = sub {
@@ -372,11 +379,14 @@ my $test_completion = sub {
 };
 
 my $test_fail = sub {
+SKIP: {
+	skip $req_sendcmd, 3 if $req_sendcmd;
 	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
 	is($? >> 8, 3, 'got curl exit for bogus URL');
 	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
 	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir');
 	is($out, '', 'no output');
+}; # /SKIP
 };
 
 my $test_lei_common = sub {
@@ -397,10 +407,7 @@ if ($ENV{TEST_LEI_ONESHOT}) {
 	$test_lei_common->();
 } else {
 SKIP: { # real socket
-	eval { require Socket::MsgHdr; 1 } // do {
-		require PublicInbox::Spawn;
-		PublicInbox::Spawn->can('send_cmd4');
-	} // skip 'Socket::MsgHdr or Inline::C missing or unconfigured', 115;
+	skip $req_sendcmd, 115 if $req_sendcmd;
 	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
 	my $err_log = "$ENV{XDG_RUNTIME_DIR}/lei/errors.log";

^ permalink raw reply related	[relevance 83%]

* [PATCH 00/10] lei: cleanups + initial import support
@ 2021-02-04  9:59 68% Eric Wong
  2021-02-04  9:59 65% ` [PATCH 01/10] lei q: delay worker spawn Eric Wong
                   ` (6 more replies)
  0 siblings, 7 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

Still some ways to go, but changes to the "lei q" backend
should make future work far easier.  I went a bit overboard
with the FD passing in earlier iterations :x  Maybe Inline::C
won't have to be a hard requirement for lei after all...

The PktOp package is nice and works out for "lei import", too.

Eric Wong (10):
  lei q: delay worker spawn
  ipc: localize fields assignment to prevent circular refs
  lei q: reorder internals to reduce FD passing
  lei q: only start pager if output is to stdout
  lei q: reinstate early MUA spawn for Maildir
  eml: handle warning ignores for lei
  lei q: eliminate $not_done temporary git dir hack
  lei_query: remove unneeded dwaitpid import
  lei_xsearch: drop unused imports
  lei import: initial implementation

 MANIFEST                         |   1 +
 lib/PublicInbox/Admin.pm         |   7 +-
 lib/PublicInbox/Eml.pm           |  19 ++++
 lib/PublicInbox/IPC.pm           |  10 +--
 lib/PublicInbox/InboxWritable.pm |  24 +-----
 lib/PublicInbox/LEI.pm           | 144 +++++++++++++------------------
 lib/PublicInbox/LeiImport.pm     | 106 +++++++++++++++++++++++
 lib/PublicInbox/LeiOverview.pm   |  44 ++--------
 lib/PublicInbox/LeiQuery.pm      |  20 ++---
 lib/PublicInbox/LeiStore.pm      |  18 ++++
 lib/PublicInbox/LeiToMail.pm     |  43 ++++++---
 lib/PublicInbox/LeiXSearch.pm    | 143 +++++++++++++++---------------
 lib/PublicInbox/Watch.pm         |  14 ++-
 t/lei.t                          |  15 ++++
 14 files changed, 345 insertions(+), 263 deletions(-)
 create mode 100644 lib/PublicInbox/LeiImport.pm

^ permalink raw reply	[relevance 68%]

* [PATCH 04/10] lei q: only start pager if output is to stdout
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
  2021-02-04  9:59 65% ` [PATCH 01/10] lei q: delay worker spawn Eric Wong
  2021-02-04  9:59 29% ` [PATCH 03/10] lei q: reorder internals to reduce FD passing Eric Wong
@ 2021-02-04  9:59 71% ` Eric Wong
  2021-02-04  9:59 61% ` [PATCH 05/10] lei q: reinstate early MUA spawn for Maildir Eric Wong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

There is no need to start a pager if we're writing to a regular file.
---
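A minimal sketch of the record-once, consume-once flow, not the
actual lei code: note_output() and maybe_start_pager() are
hypothetical names, and only the {need_pager} key and the
"-t" + "delete" combination come from the diff below.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-ins for the lei object and its pager; only the
# record-once, consume-once flow mirrors the patch.
sub note_output {
	my ($lei, $fh) = @_;
	$lei->{need_pager} = (-t $fh) ? 1 : 0; # 1 only for a terminal
}

sub maybe_start_pager {
	my ($lei) = @_;
	# delete() returns the stored value and clears the flag, so a
	# second call can never start a second pager
	delete($lei->{need_pager}) ? 'pager' : 'no pager';
}

my ($fh, $path) = tempfile(UNLINK => 1); # a regular file, not a tty
my $lei = {};
note_output($lei, $fh);
print maybe_start_pager($lei), "\n";
```

The tty check happens where the destination is known, and the later
stage only consumes the flag instead of repeating "-t" on a handle
which may have been replaced in the meantime.
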
 lib/PublicInbox/LeiOverview.pm | 3 +--
 lib/PublicInbox/LeiXSearch.pm  | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index e6bf4f2a..3125f015 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -78,9 +78,8 @@ sub new {
 	if ($fmt =~ /\A($JSONL|(?:concat)?json)\z/) {
 		$json = $self->{json} = ref(PublicInbox::Config->json);
 	}
-	my ($isatty, $seekable);
 	if ($dst eq '/dev/stdout') {
-		$isatty = -t $lei->{1};
+		my $isatty = $lei->{need_pager} = -t $lei->{1};
 		$opt->{pretty} //= $isatty;
 		if (!$isatty && -f _) {
 			my $fl = fcntl($lei->{1}, F_GETFL, 0) //
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e41d899e..0ca871ea 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -414,7 +414,7 @@ sub do_query {
 	};
 	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
 	$lei->{1}->autoflush(1);
-	$lei->start_pager if -t $lei->{1};
+	$lei->start_pager if delete $lei->{need_pager};
 	$lei->{ovv}->ovv_begin($lei);
 	my $l2m = $lei->{l2m};
 	if ($l2m) {

^ permalink raw reply related	[relevance 71%]

* [PATCH 01/10] lei q: delay worker spawn
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
@ 2021-02-04  9:59 65% ` Eric Wong
  2021-02-04  9:59 29% ` [PATCH 03/10] lei q: reorder internals to reduce FD passing Eric Wong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

Now that --stdin support is sorted, we can delay spawning
workers until we know the query is ready-to-run.
---
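Roughly, the idea is "validate everything, then fork".  A small
sketch under that assumption follows; plan_jobs() and run_workers()
are hypothetical helpers, and only the job-count regexp is taken
from the diff below.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: settle the job count before anything forks.
sub plan_jobs {
	my ($mj, $nproc) = @_;
	if (defined($mj) && $mj !~ /\A[1-9][0-9]*\z/) {
		return; # invalid; the caller reports it, nothing was forked
	}
	$mj // $nproc; # fall back to the detected processor count
}

# Hypothetical helper: fork $n do-nothing workers and reap them.
sub run_workers {
	my ($n) = @_;
	my @pids;
	for (1..$n) {
		my $pid = fork // die "fork: $!";
		POSIX::_exit(0) if $pid == 0; # child: real work would go here
		push @pids, $pid;
	}
	waitpid($_, 0) for @pids;
	scalar @pids;
}

my $jobs = plan_jobs('2', 4) // die "bad job count\n";
print "reaped ", run_workers($jobs), " workers\n";
```

With this ordering, a bad option never leaves half-started workers
behind which would need to be killed and reaped.
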
 lib/PublicInbox/LeiQuery.pm   | 19 +++++--------------
 lib/PublicInbox/LeiXSearch.pm |  6 ++++++
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 4fe40400..6b1aa40c 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -75,21 +75,12 @@ sub lei_q {
 	$xj ||= $lxs->concurrency($opt); # allow: "--jobs ,$WRITER_ONLY"
 	my $nproc = $lxs->detect_nproc; # don't memoize, schedtool(1) exists
 	$xj = $nproc if $xj > $nproc;
-	PublicInbox::LeiOverview->new($self) or return;
-	$self->atfork_prepare_wq($lxs);
-	$lxs->wq_workers_start('lei_xsearch', $xj, $self->oldset);
-	delete $lxs->{-ipc_atfork_child_close};
-	if (my $l2m = $self->{l2m}) {
-		if (defined($mj) && $mj !~ /\A[1-9][0-9]*\z/) {
-			return $self->fail("`$mj' writer jobs must be >= 1");
-		}
-		$mj //= $nproc;
-		$self->atfork_prepare_wq($l2m);
-		$l2m->wq_workers_start('lei2mail', $mj, $self->oldset);
-		delete $l2m->{-ipc_atfork_child_close};
+	$lxs->{jobs} = $xj;
+	if (defined($mj) && $mj !~ /\A[1-9][0-9]*\z/) {
+		return $self->fail("`$mj' writer jobs must be >= 1");
 	}
-
-	# no forking workers after this
+	$self->{l2m}->{jobs} = ($mj // $nproc) if $self->{l2m};
+	PublicInbox::LeiOverview->new($self) or return;
 
 	my %mset_opt = map { $_ => $opt->{$_} } qw(thread limit offset);
 	$mset_opt{asc} = $opt->{'reverse'} ? 1 : 0;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 965617b5..ab66717c 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -406,7 +406,13 @@ sub do_query {
 	$lei->{ovv}->ovv_begin($lei);
 	my ($au_done, $zpipe);
 	my $l2m = $lei->{l2m};
+	$lei->atfork_prepare_wq($self);
+	$self->wq_workers_start('lei_xsearch', $self->{jobs}, $lei->oldset);
+	delete $self->{-ipc_atfork_child_close};
 	if ($l2m) {
+		$lei->atfork_prepare_wq($l2m);
+		$l2m->wq_workers_start('lei2mail', $l2m->{jobs}, $lei->oldset);
+		delete $l2m->{-ipc_atfork_child_close};
 		pipe($lei->{startq}, $au_done) or die "pipe: $!";
 		# 1031: F_SETPIPE_SZ
 		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';

^ permalink raw reply related	[relevance 65%]

* [PATCH 05/10] lei q: reinstate early MUA spawn for Maildir
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-04  9:59 71% ` [PATCH 04/10] lei q: only start pager if output is to stdout Eric Wong
@ 2021-02-04  9:59 61% ` Eric Wong
  2021-02-04  9:59 51% ` [PATCH 06/10] eml: handle warning ignores for lei Eric Wong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

Once all files are written, we can use utime() to poke Maildirs
to wake up MUAs that fail to account for nanosecond timestamp
resolution.
---
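The poke itself is small.  A self-contained sketch using a
throwaway directory instead of a real Maildir; poke_dir() is a
hypothetical name, while the "time + 1" and the "cur" subdirectory
follow poke_dst in the diff below.

```perl
use strict;
use warnings;
use File::Temp ();

my $tmp = File::Temp->newdir; # throwaway stand-in for a Maildir
my $cur = "$tmp/cur";
mkdir($cur) or die "mkdir $cur: $!";

# Hypothetical helper: bump atime and mtime one second into the
# future so that a whole-second comparison sees a change.
sub poke_dir {
	my ($dir) = @_;
	my $t = time + 1;
	utime($t, $t, $dir) or die "utime $dir: $!";
	$t;
}

my $t = poke_dir($cur);
my $mtime = (stat($cur))[9];
print "mtime bumped\n" if $mtime == $t;
```

An MUA which only compares whole seconds could otherwise miss
files delivered within the same second as its last scan.
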
 lib/PublicInbox/LEI.pm        |  1 +
 lib/PublicInbox/LeiToMail.pm  | 13 +++++++++++++
 lib/PublicInbox/LeiXSearch.pm | 15 +++++++++------
 3 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 0d4b1c11..24efb494 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -739,6 +739,7 @@ sub start_mua {
 	} elsif ($self->{oneshot}) {
 		$self->{"mua.pid.$self.$$"} = spawn(\@cmd);
 	}
+	delete $self->{-progress};
 }
 
 # caller needs to "-t $self->{1}" to check if tty
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index f9250860..5a6f18fb 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -365,6 +365,7 @@ sub new {
 	} else {
 		die "bad mail --format=$fmt\n";
 	}
+	$self->{dst} = $dst;
 	$lei->{dedupe} = PublicInbox::LeiDedupe->new($lei);
 	$self;
 }
@@ -474,6 +475,18 @@ sub ipc_atfork_child {
 	$self->SUPER::ipc_atfork_child;
 }
 
+sub lock_free {
+	$_[0]->{base_type} =~ /\A(?:maildir|mh|imap|jmap)\z/ ? 1 : 0;
+}
+
+sub poke_dst {
+	my ($self) = @_;
+	if ($self->{base_type} eq 'maildir') {
+		my $t = time + 1;
+		utime($t, $t, "$self->{dst}/cur");
+	}
+}
+
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $smsg) = @_;
 	my $not_done = delete $self->{0} // die 'BUG: $not_done missing';
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 0ca871ea..e7f0ef63 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -308,13 +308,13 @@ sub git {
 
 sub query_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	my $has_l2m = exists $lei->{l2m};
-	for my $f (qw(lxs l2m)) {
-		my $wq = delete $lei->{$f} or next;
-		$wq->wq_wait_old($lei);
+	my $l2m = delete $lei->{l2m};
+	$l2m->wq_wait_old($lei) if $l2m;
+	if (my $lxs = delete $lei->{lxs}) {
+		$lxs->wq_wait_old($lei);
 	}
 	$lei->{ovv}->ovv_end($lei);
-	if ($has_l2m) { # close() calls LeiToMail reap_compress
+	if ($l2m) { # close() calls LeiToMail reap_compress
 		if (my $out = delete $lei->{old_1}) {
 			if (my $mbout = $lei->{1}) {
 				close($mbout) or return $lei->fail(<<"");
@@ -323,7 +323,7 @@ Error closing $lei->{ovv}->{dst}: $!
 			}
 			$lei->{1} = $out;
 		}
-		$lei->start_mua;
+		$l2m->lock_free ? $l2m->poke_dst : $lei->start_mua;
 	}
 	$lei->{-progress} and
 		$lei->err('# ', $lei->{-mset_total} // 0, " matches");
@@ -355,6 +355,9 @@ sub concurrency {
 
 sub start_query { # always runs in main (lei-daemon) process
 	my ($self, $lei) = @_;
+	if (my $l2m = $lei->{l2m}) {
+		$lei->start_mua if $l2m->lock_free;
+	}
 	if ($lei->{opt}->{thread}) {
 		for my $ibxish (locals($self)) {
 			$self->wq_do('query_thread_mset', [], $ibxish);

^ permalink raw reply related	[relevance 61%]

* [PATCH 07/10] lei q: eliminate $not_done temporary git dir hack
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-04  9:59 51% ` [PATCH 06/10] eml: handle warning ignores for lei Eric Wong
@ 2021-02-04  9:59 56% ` Eric Wong
  2021-02-04  9:59 36% ` [PATCH 10/10] lei import: initial implementation Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

Another step towards simplifying lei internals.

None of our current uses of ->wq_do involve FD passing, and the
plan is to rely on FD passing only between lei-daemon and lei(1).
Internally, it ought to be possible to order lei-daemon's internal
bits properly so that they do not need FD passing.
---
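The diff below stores the temporary git directory in both
$self->{git_tmp} and $lei->{git_tmp} so it outlives the workers.
A sketch of why an extra reference is enough, assuming only the
documented File::Temp->newdir behavior (the directory is removed
when the last reference to the object goes away); the %lei hash
and the template name here are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp ();

my $tmp = File::Temp->newdir('example_git-XXXX', TMPDIR => 1);
my $dir = $tmp->dirname;
my %lei = (git_tmp => $tmp); # second reference, as in the patch

undef $tmp; # first reference gone
my $kept = -d $dir ? 1 : 0; # directory is still there

delete $lei{git_tmp}; # last reference gone, directory is removed
my $gone = -d $dir ? 0 : 1;
print "kept=$kept gone=$gone\n";
```

Holding the object in a hash which is only cleared after workers
are reaped replaces the old notification pipe for this purpose.
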
 lib/PublicInbox/LeiOverview.pm | 23 ++---------------------
 lib/PublicInbox/LeiToMail.pm   |  3 +--
 lib/PublicInbox/LeiXSearch.pm  | 16 ++++++++++++----
 3 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 3125f015..d3df4faa 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -147,17 +147,6 @@ sub _unbless_smsg {
 
 sub ovv_atexit_child {
 	my ($self, $lei) = @_;
-	if (my $l2m = $lei->{l2m}) {
-		# wait for ->write_mail work we submitted to lei2mail
-		if (my $rd = delete $l2m->{each_smsg_done}) {
-			read($rd, my $buf, 1); # wait for EOF
-		}
-	}
-	# order matters, git->{-tmp}->DESTROY must not fire until
-	# {each_smsg_done} hits EOF above
-	if (my $git = delete $self->{git}) {
-		$git->async_wait_all;
-	}
 	if (my $bref = delete $lei->{ovv_buf}) {
 		my $lk = $self->lock_for_scope;
 		$lei->out($$bref);
@@ -213,19 +202,11 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
-		# $io->[0] becomes a notification pipe that triggers EOF
-		# in this wq worker when all outstanding ->write_mail
-		# calls are complete
-		my $io = [];
-		pipe($l2m->{each_smsg_done}, $io->[0]) or die "pipe: $!";
-		fcntl($io->[0], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
-		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
-		$self->{git} = $git;
-		my $git_dir = $git->{git_dir};
+		my $git_dir = $ibxish->git->{git_dir};
 		sub {
 			my ($smsg, $mitem) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
-			$l2m->wq_do('write_mail', $io, $git_dir, $smsg);
+			$l2m->wq_do('write_mail', [], $git_dir, $smsg);
 		}
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 1f815e40..4f847221 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -490,10 +490,9 @@ sub poke_dst {
 
 sub write_mail { # via ->wq_do
 	my ($self, $git_dir, $smsg) = @_;
-	my $not_done = delete $self->{0} // die 'BUG: $not_done missing';
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
 	git_async_cat($git, $smsg->{blob}, \&git_to_mail,
-				[$self->{wcb}, $smsg, $not_done]);
+				[$self->{wcb}, $smsg]);
 }
 
 sub wq_atexit_child {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e7f0ef63..2dc44414 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -287,12 +287,15 @@ sub query_remote_mboxrd {
 	$lei->{ovv}->ovv_atexit_child($lei);
 }
 
-sub git {
+# called by LeiOverview::each_smsg_cb
+sub git { $_[0]->{git_tmp} // die 'BUG: caller did not set {git_tmp}' }
+
+sub git_tmp ($) {
 	my ($self) = @_;
 	my (%seen, @dirs);
-	my $tmp = File::Temp->newdir('lei_xsrch_git-XXXXXXXX', TMPDIR => 1);
-	for my $ibx (@{$self->{shard2ibx} // []}) {
-		my $d = File::Spec->canonpath($ibx->git->{git_dir});
+	my $tmp = File::Temp->newdir("lei_xsearch_git.$$-XXXX", TMPDIR => 1);
+	for my $ibxish (locals($self)) {
+		my $d = File::Spec->canonpath($ibxish->git->{git_dir});
 		$seen{$d} //= push @dirs, "$d/objects\n"
 	}
 	my $git_dir = $tmp->dirname;
@@ -428,6 +431,11 @@ sub do_query {
 		# 1031: F_SETPIPE_SZ
 		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
 	}
+	if (!$lei->{opt}->{thread} && locals($self)) { # for query_mset
+		# lei->{git_tmp} is set for wq_wait_old so we don't
+		# delete until all lei2mail + lei_xsearch workers are reaped
+		$lei->{git_tmp} = $self->{git_tmp} = git_tmp($self);
+	}
 	$self->wq_workers_start('lei_xsearch', $self->{jobs},
 				$lei->oldset, { lei => $lei });
 	my $op = delete $lei->{pkt_op_c};

^ permalink raw reply related	[relevance 56%]

* [PATCH 06/10] eml: handle warning ignores for lei
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-04  9:59 61% ` [PATCH 05/10] lei q: reinstate early MUA spawn for Maildir Eric Wong
@ 2021-02-04  9:59 51% ` Eric Wong
  2021-02-04  9:59 56% ` [PATCH 07/10] lei q: eliminate $not_done temporary git dir hack Eric Wong
  2021-02-04  9:59 36% ` [PATCH 10/10] lei import: initial implementation Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

There's nothing we can do about bad emails in our search
results, so quiet things down and don't fight the MUA for
the terminal.
---
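A standalone sketch of the chaining pattern used by warn_ignore_cb
in the diff below; ignorable() and filter_cb() are hypothetical
names, and only one of the patterns from the patch is reproduced.

```perl
use strict;
use warnings;

# one pattern copied from the patch, the rest omitted for brevity
sub ignorable { "@_" =~ /^bad Date: .+? in / }

# Must be the RHS of "local $SIG{__WARN__} = ...": the right side
# is evaluated first, so $cb captures the handler being replaced.
sub filter_cb {
	my $cb = $SIG{__WARN__} // sub { print STDERR @_ };
	sub { $cb->(@_) unless ignorable(@_) };
}

my @seen;
local $SIG{__WARN__} = sub { push @seen, "@_" }; # outer handler
{
	local $SIG{__WARN__} = filter_cb();
	warn "bad Date: yesterday in some-message\n"; # filtered out
	warn "disk full\n"; # passed on to the outer handler
}
print "outer handler saw: $_" for @seen;
```

Because the previous handler is captured rather than replaced,
anything not matched by the filter still reaches whatever the
caller had installed.
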
 lib/PublicInbox/Admin.pm         |  7 +++----
 lib/PublicInbox/Eml.pm           | 19 +++++++++++++++++++
 lib/PublicInbox/InboxWritable.pm | 24 +-----------------------
 lib/PublicInbox/LeiToMail.pm     |  1 +
 lib/PublicInbox/Watch.pm         | 14 ++++++--------
 5 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/lib/PublicInbox/Admin.pm b/lib/PublicInbox/Admin.pm
index f96397ea..3b38a5a3 100644
--- a/lib/PublicInbox/Admin.pm
+++ b/lib/PublicInbox/Admin.pm
@@ -10,6 +10,7 @@ our @EXPORT_OK = qw(setup_signals);
 use PublicInbox::Config;
 use PublicInbox::Inbox;
 use PublicInbox::Spawn qw(popen_rd);
+use PublicInbox::Eml;
 *rel2abs_collapsed = \&PublicInbox::Config::rel2abs_collapsed;
 
 sub setup_signals {
@@ -241,12 +242,10 @@ sub index_inbox {
 	}
 	local %SIG = %SIG;
 	setup_signals(\&index_terminate, $ibx);
-	my $warn_cb = $SIG{__WARN__} // \&CORE::warn;
 	my $idx = { current_info => $ibx->{inboxdir} };
-	my $warn_ignore = PublicInbox::InboxWritable->can('warn_ignore');
 	local $SIG{__WARN__} = sub {
-		return if $warn_ignore->(@_);
-		$warn_cb->($idx->{current_info}, ': ', @_);
+		return if PublicInbox::Eml::warn_ignore(@_);
+		warn($idx->{current_info}, ': ', @_);
 	};
 	if (ref($ibx) && $ibx->version == 2) {
 		eval { require PublicInbox::V2Writable };
diff --git a/lib/PublicInbox/Eml.pm b/lib/PublicInbox/Eml.pm
index bd27f19b..f7f62e7b 100644
--- a/lib/PublicInbox/Eml.pm
+++ b/lib/PublicInbox/Eml.pm
@@ -477,6 +477,25 @@ sub charset_set {
 
 sub crlf { $_[0]->{crlf} // "\n" }
 
+# warnings to ignore when handling spam mailboxes and maybe other places
+sub warn_ignore {
+	my $s = "@_";
+	# Email::Address::XS warnings
+	$s =~ /^Argument contains empty address at /
+	|| $s =~ /^Element at index [0-9]+ contains /
+	# PublicInbox::MsgTime
+	|| $s =~ /^bogus TZ offset: .+?, ignoring and assuming \+0000/
+	|| $s =~ /^bad Date: .+? in /
+	# Encode::Unicode::UTF7
+	|| $s =~ /^Bad UTF7 data escape at /
+}
+
+# this expects to be RHS in this assignment: "local $SIG{__WARN__} = ..."
+sub warn_ignore_cb {
+	my $cb = $SIG{__WARN__} // \&CORE::warn;
+	sub { $cb->(@_) unless warn_ignore(@_) }
+}
+
 sub willneed { re_memo($_) for @_ }
 
 willneed(qw(From To Cc Date Subject Content-Type In-Reply-To References
diff --git a/lib/PublicInbox/InboxWritable.pm b/lib/PublicInbox/InboxWritable.pm
index 982ad6e5..3a4012cd 100644
--- a/lib/PublicInbox/InboxWritable.pm
+++ b/lib/PublicInbox/InboxWritable.pm
@@ -9,7 +9,7 @@ use parent qw(PublicInbox::Inbox Exporter);
 use PublicInbox::Import;
 use PublicInbox::Filter::Base qw(REJECT);
 use Errno qw(ENOENT);
-our @EXPORT_OK = qw(eml_from_path warn_ignore_cb);
+our @EXPORT_OK = qw(eml_from_path);
 
 use constant {
 	PERM_UMASK => 0,
@@ -277,28 +277,6 @@ sub cleanup ($) {
 	delete @{$_[0]}{qw(over mm git search)};
 }
 
-# warnings to ignore when handling spam mailboxes and maybe other places
-sub warn_ignore {
-	my $s = "@_";
-	# Email::Address::XS warnings
-	$s =~ /^Argument contains empty address at /
-	|| $s =~ /^Element at index [0-9]+ contains /
-	# PublicInbox::MsgTime
-	|| $s =~ /^bogus TZ offset: .+?, ignoring and assuming \+0000/
-	|| $s =~ /^bad Date: .+? in /
-	# Encode::Unicode::UTF7
-	|| $s =~ /^Bad UTF7 data escape at /
-}
-
-# this expects to be RHS in this assignment: "local $SIG{__WARN__} = ..."
-sub warn_ignore_cb {
-	my $cb = $SIG{__WARN__} // \&CORE::warn;
-	sub {
-		return if warn_ignore(@_);
-		$cb->(@_);
-	}
-}
-
 # v2+ only, XXX: maybe we can just rely on ->max_git_epoch and remove
 sub git_dir_latest {
 	my ($self, $max) = @_;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 5a6f18fb..1f815e40 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -472,6 +472,7 @@ sub ipc_atfork_child {
 		close $zpipe->[0];
 	}
 	$self->{wcb} = $self->write_cb($lei);
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index 2b44ba43..185e5da8 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -7,7 +7,7 @@ package PublicInbox::Watch;
 use strict;
 use v5.10.1;
 use PublicInbox::Eml;
-use PublicInbox::InboxWritable qw(eml_from_path warn_ignore_cb);
+use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::Filter::Base qw(REJECT);
 use PublicInbox::Spamcheck;
 use PublicInbox::Sigfd;
@@ -174,7 +174,7 @@ sub _remove_spam {
 	# path must be marked as (S)een
 	$path =~ /:2,[A-R]*S[T-Za-z]*\z/ or return;
 	my $eml = eml_from_path($path) or return;
-	local $SIG{__WARN__} = warn_ignore_cb();
+	local $SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->{pi_cfg}->each_inbox(\&remove_eml_i, $self, $eml, $path);
 }
 
@@ -414,13 +414,11 @@ sub imap_import_msg ($$$$$) {
 			import_eml($self, $ibx, $eml);
 		}
 	} elsif ($inboxes eq 'watchspam') {
-		# we don't remove unseen messages
-		if ($flags =~ /\\Seen\b/) {
-			local $SIG{__WARN__} = warn_ignore_cb();
-			my $eml = PublicInbox::Eml->new($raw);
-			$self->{pi_cfg}->each_inbox(\&remove_eml_i,
+		return if $flags !~ /\\Seen\b/; # don't remove unseen messages
+		local $SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
+		my $eml = PublicInbox::Eml->new($raw);
+		$self->{pi_cfg}->each_inbox(\&remove_eml_i,
 						$self, $eml, "$url UID:$uid");
-		}
 	} else {
 		die "BUG: destination unknown $inboxes";
 	}

^ permalink raw reply related	[relevance 51%]

* [PATCH 03/10] lei q: reorder internals to reduce FD passing
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
  2021-02-04  9:59 65% ` [PATCH 01/10] lei q: delay worker spawn Eric Wong
@ 2021-02-04  9:59 29% ` Eric Wong
  2021-02-04  9:59 71% ` [PATCH 04/10] lei q: only start pager if output is to stdout Eric Wong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

While FD passing is critical for script/lei <=> lei-daemon,
lei-daemon doesn't need to use it internally if FDs are
created in the proper order before forking.
---
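The general technique, sketched outside of lei: a descriptor
created before fork() is inherited by the child, so nothing needs
to be sent over a socket afterwards.  child_says() is a
hypothetical helper; none of this is lei's actual worker code.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: create the pipe first, then fork, so the
# child already holds the write end without any FD passing.
sub child_says {
	my ($msg) = @_;
	pipe(my ($r, $w)) or die "pipe: $!";
	my $pid = fork // die "fork: $!";
	if ($pid == 0) { # child: keep only the writer
		close $r;
		syswrite($w, "$msg\n");
		POSIX::_exit(0);
	}
	close $w; # parent: keep only the reader
	my $line = <$r>;
	waitpid($pid, 0);
	$line;
}

print "parent read: ", child_says('hello');
```

Each side closes the end it does not use, which is the same chore
lei_atfork_child takes over in the diff below.
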
 lib/PublicInbox/IPC.pm         |  3 --
 lib/PublicInbox/LEI.pm         | 99 +++++++---------------------------
 lib/PublicInbox/LeiOverview.pm | 28 +++-------
 lib/PublicInbox/LeiToMail.pm   | 28 ++++++----
 lib/PublicInbox/LeiXSearch.pm  | 97 ++++++++++++++++-----------------
 5 files changed, 92 insertions(+), 163 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 078aaa2c..7f5a3f6f 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -464,9 +464,6 @@ sub DESTROY {
 	ipc_worker_stop($self);
 }
 
-# Sereal doesn't have dclone
-sub deep_clone { ipc_thaw(ipc_freeze($_[-1])) }
-
 sub detect_nproc () {
 	# _SC_NPROCESSORS_ONLN = 84 on both Linux glibc and musl
 	return POSIX::sysconf(84) if $^O eq 'linux';
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 49deed13..0d4b1c11 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -286,7 +286,7 @@ sub x_it ($$) {
 	# make sure client sees stdout before exit
 	$self->{1}->autoflush(1) if $self->{1};
 	dump_and_clear_log();
-	if (my $s = $self->{pkt_op} // $self->{sock}) {
+	if (my $s = $self->{pkt_op_p} // $self->{sock}) {
 		send($s, "x_it $code", MSG_EOR);
 	} elsif ($self->{oneshot}) {
 		# don't want to end up using $? from child processes
@@ -322,7 +322,8 @@ sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
 sub fail ($$;$) {
 	my ($self, $buf, $exit_code) = @_;
 	err($self, $buf) if defined $buf;
-	send($self->{pkt_op}, '!', MSG_EOR) if $self->{pkt_op}; # fail_handler
+	# calls fail_handler:
+	send($self->{pkt_op_p}, '!', MSG_EOR) if $self->{pkt_op_p};
 	x_it($self, ($exit_code // 1) << 8);
 	undef;
 }
@@ -340,7 +341,7 @@ sub puts ($;@) { out(shift, map { "$_\n" } @_) }
 
 sub child_error { # passes non-fatal curl exit codes to user
 	my ($self, $child_error) = @_; # child_error is $?
-	if (my $s = $self->{pkt_op} // $self->{sock}) {
+	if (my $s = $self->{pkt_op_p} // $self->{sock}) {
 		# send to the parent lei-daemon or to lei(1) client
 		send($s, "child_error $child_error", MSG_EOR);
 	} elsif (!$PublicInbox::DS::in_loop) {
@@ -348,94 +349,34 @@ sub child_error { # passes non-fatal curl exit codes to user
 	} # else noop if client disconnected
 }
 
-sub atfork_prepare_wq {
-	my ($self, $wq) = @_;
-	my $tcafc = $wq->{-ipc_atfork_child_close} //= [ $listener // () ];
-	if (my $sock = $self->{sock}) {
-		push @$tcafc, @$self{qw(0 1 2 3)}, $sock;
-	}
-	if (my $pgr = $self->{pgr}) {
-		push @$tcafc, @$pgr[1,2];
-	}
-	if (my $old_1 = $self->{old_1}) {
-		push @$tcafc, $old_1;
-	}
-	for my $f (qw(lxs l2m)) {
-		my $ipc = $self->{$f} or next;
-		push @$tcafc, grep { defined }
-				@$ipc{qw(-wq_s1 -wq_s2 -ipc_req -ipc_res)};
-	}
-}
-
-sub io_restore ($$) {
-	my ($dst, $src) = @_;
-	for my $i (0..2) { # standard FDs
-		my $io = delete $src->{$i} or next;
-		$dst->{$i} = $io;
-	}
-	for my $i (3..9) { # named (non-standard) FDs
-		my $io = $src->{$i} or next;
-		my @st = stat($io) or die "stat $src.$i ($io): $!";
-		my $f = delete $dst->{"dev=$st[0],ino=$st[1]"} // next;
-		$dst->{$f} = $io;
-		delete $src->{$i};
-	}
-}
-
 sub note_sigpipe { # triggers sigpipe_handler
 	my ($self, $fd) = @_;
 	close(delete($self->{$fd})); # explicit close silences Perl warning
-	send($self->{pkt_op}, '|', MSG_EOR) if $self->{pkt_op};
+	send($self->{pkt_op_p}, '|', MSG_EOR) if $self->{pkt_op_p};
 	x_it($self, 13);
 }
 
-sub atfork_child_wq {
-	my ($self, $wq) = @_;
-	io_restore($self, $wq);
-	-S $self->{pkt_op} or die 'BUG: {pkt_op} expected';
-	io_restore($self->{l2m}, $wq);
+sub lei_atfork_child {
+	my ($self) = @_;
+	# we need to explicitly close things which are on stack
+	delete $self->{0};
+	for (delete @$self{qw(3 sock old_1 au_done)}) {
+		close($_) if defined($_);
+	}
+	if (my $op_c = delete $self->{pkt_op_c}) {
+		close(delete $op_c->{sock});
+	}
+	if (my $pgr = delete $self->{pgr}) {
+		close($_) for (@$pgr[1,2]);
+	}
+	close $listener if $listener;
+	undef $listener;
 	%PATH2CFG = ();
 	undef $errors_log;
 	$quit = \&CORE::exit;
 	$current_lei = $self; # for SIG{__WARN__}
 }
 
-sub io_extract ($;@) {
-	my ($obj, @fields) = @_;
-	my @io;
-	for my $f (@fields) {
-		my $io = delete $obj->{$f} or next;
-		my @st = stat($io) or die "W: stat $obj.$f ($io): $!";
-		$obj->{"dev=$st[0],ino=$st[1]"} = $f;
-		push @io, $io;
-	}
-	@io
-}
-
-# usage: ($lei, @io) = $lei->atfork_parent_wq($wq);
-sub atfork_parent_wq {
-	my ($self, $wq) = @_;
-	my $env = delete $self->{env}; # env is inherited at fork
-	my $lei = bless { %$self }, ref($self);
-	for my $f (qw(dedupe ovv)) {
-		my $tmp = delete($lei->{$f}) or next;
-		$lei->{$f} = $wq->deep_clone($tmp);
-	}
-	$self->{env} = $env;
-	delete @$lei{qw(sock 3 -lei_store cfg old_1 pgr lxs)}; # keep l2m
-	my @io = (delete(@$lei{qw(0 1 2)}),
-			io_extract($lei, qw(pkt_op startq)));
-	my $l2m = $lei->{l2m};
-	if ($l2m && $l2m != $wq) { # $wq == lxs
-		if (my $wq_s1 = $l2m->{-wq_s1}) {
-			push @io, io_extract($l2m, '-wq_s1');
-			$l2m->{-wq_s1} = $wq_s1;
-		}
-		$l2m->wq_close(1);
-	}
-	($lei, @io);
-}
-
 sub _help ($;$) {
 	my ($self, $errmsg) = @_;
 	my $cmd = $self->{cmd} // 'COMMAND';
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index e33d63a2..e6bf4f2a 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -207,7 +207,6 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 	}
 	$lei->{ovv_buf} = \(my $buf = '') if !$l2m;
 	if ($l2m && !$ibxish) { # remote https?:// mboxrd
-		delete $l2m->{-wq_s1};
 		my $g2m = $l2m->can('git_to_mail');
 		my $wcb = $l2m->write_cb($lei);
 		sub {
@@ -215,33 +214,20 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$wcb->(undef, $smsg, $eml);
 		};
 	} elsif ($l2m && $l2m->{-wq_s1}) {
-		my ($lei_ipc, @io) = $lei->atfork_parent_wq($l2m);
-		# $io[0] becomes a notification pipe that triggers EOF
+		# $io->[0] becomes a notification pipe that triggers EOF
 		# in this wq worker when all outstanding ->write_mail
 		# calls are complete
-		$io[0] = undef;
-		pipe($l2m->{each_smsg_done}, $io[0]) or die "pipe: $!";
-		fcntl($io[0], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
-		delete @$lei_ipc{qw(l2m opt mset_opt cmd)};
+		my $io = [];
+		pipe($l2m->{each_smsg_done}, $io->[0]) or die "pipe: $!";
+		fcntl($io->[0], 1031, 4096) if $^O eq 'linux'; # F_SETPIPE_SZ
 		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
 		$self->{git} = $git;
 		my $git_dir = $git->{git_dir};
 		sub {
 			my ($smsg, $mitem) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
-			$l2m->wq_do('write_mail', \@io, $git_dir, $smsg,
-					$lei_ipc);
+			$l2m->wq_do('write_mail', $io, $git_dir, $smsg);
 		}
-	} elsif ($l2m) {
-		my $wcb = $l2m->write_cb($lei);
-		my $git = $ibxish->git; # (LeiXSearch|Inbox|ExtSearch)->git
-		$self->{git} = $git; # for ovv_atexit_child
-		my $g2m = $l2m->can('git_to_mail');
-		sub {
-			my ($smsg, $mitem) = @_;
-			$smsg->{pct} = get_pct($mitem) if $mitem;
-			$git->cat_async($smsg->{blob}, $g2m, [ $wcb, $smsg ]);
-		};
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
 		sub { # DIY prettiness :P
@@ -275,7 +261,9 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 			$lei->out($buf);
 			$buf = '';
 		}
-	} # else { ...
+	} else {
+		die "TODO: unhandled case $self->{fmt}"
+	}
 }
 
 no warnings 'once';
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index c704dc2a..f9250860 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -211,10 +211,10 @@ sub zsfx2cmd ($$$) {
 }
 
 sub _post_augment_mbox { # open a compressor process
-	my ($self, $lei, $zpipe) = @_;
+	my ($self, $lei) = @_;
 	my $zsfx = $self->{zsfx} or return;
 	my $cmd = zsfx2cmd($zsfx, undef, $lei);
-	my ($r, $w) = splice(@$zpipe, 0, 2);
+	my ($r, $w) = @{delete $lei->{zpipe}};
 	my $rdr = { 0 => $r, 1 => $lei->{1}, 2 => $lei->{2} };
 	my $pid = spawn($cmd, $lei->{env}, $rdr);
 	my $pp = gensym;
@@ -407,7 +407,7 @@ sub _pre_augment_mbox {
 			$! == ENOENT or die "unlink($dst): $!";
 		}
 		open my $out, $mode, $dst or die "open($dst): $!";
-		$lei->{old_1} = $lei->{1};
+		$lei->{old_1} = $lei->{1}; # keep for spawning MUA
 		$lei->{1} = $out;
 	}
 	# Perl does SEEK_END even with O_APPEND :<
@@ -418,7 +418,7 @@ sub _pre_augment_mbox {
 	state $zsfx_allow = join('|', keys %zsfx2cmd);
 	($self->{zsfx}) = ($dst =~ /\.($zsfx_allow)\z/) or return;
 	pipe(my ($r, $w)) or die "pipe: $!";
-	[ $r, $w ];
+	$lei->{zpipe} = [ $r, $w ];
 }
 
 sub _do_augment_mbox {
@@ -462,16 +462,24 @@ sub post_augment { # fast (spawn compressor or mkdir), runs in main daemon
 	$self->$m($lei, @args);
 }
 
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = delete $self->{lei};
+	$lei->lei_atfork_child;
+	if (my $zpipe = delete $lei->{zpipe}) {
+		$lei->{1} = $zpipe->[1];
+		close $zpipe->[0];
+	}
+	$self->{wcb} = $self->write_cb($lei);
+	$self->SUPER::ipc_atfork_child;
+}
+
 sub write_mail { # via ->wq_do
-	my ($self, $git_dir, $smsg, $lei) = @_;
+	my ($self, $git_dir, $smsg) = @_;
 	my $not_done = delete $self->{0} // die 'BUG: $not_done missing';
-	my $wcb = $self->{wcb} //= do { # first message
-		$lei->atfork_child_wq($self);
-		$self->write_cb($lei);
-	};
 	my $git = $self->{"$$\0$git_dir"} //= PublicInbox::Git->new($git_dir);
 	git_async_cat($git, $smsg->{blob}, \&git_to_mail,
-				[$wcb, $smsg, $not_done]);
+				[$self->{wcb}, $smsg, $not_done]);
 }
 
 sub wq_atexit_child {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index ab66717c..e41d899e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -110,8 +110,8 @@ sub wait_startq ($) {
 sub mset_progress {
 	my $lei = shift;
 	return unless $lei->{-progress};
-	if ($lei->{pkt_op}) { # called via pkt_op/pkt_do from workers
-		pkt_do($lei->{pkt_op}, 'mset_progress', @_);
+	if ($lei->{pkt_op_p}) {
+		pkt_do($lei->{pkt_op_p}, 'mset_progress', @_);
 	} else { # single lei-daemon consumer
 		my ($desc, $mset_size, $mset_total_est) = @_;
 		$lei->{-mset_total} += $mset_size;
@@ -120,11 +120,10 @@ sub mset_progress {
 }
 
 sub query_thread_mset { # for --thread
-	my ($self, $lei, $ibxish) = @_;
+	my ($self, $ibxish) = @_;
 	local $0 = "$0 query_thread_mset";
-	$lei->atfork_child_wq($self);
+	my $lei = $self->{lei};
 	my $startq = delete $lei->{startq};
-
 	my ($srch, $over) = ($ibxish->search, $ibxish->over);
 	my $desc = $ibxish->{inboxdir} // $ibxish->{topdir};
 	return warn("$desc not indexed by Xapian\n") unless ($srch && $over);
@@ -154,9 +153,9 @@ sub query_thread_mset { # for --thread
 }
 
 sub query_mset { # non-parallel for non-"--thread" users
-	my ($self, $lei) = @_;
+	my ($self) = @_;
 	local $0 = "$0 query_mset";
-	$lei->atfork_child_wq($self);
+	my $lei = $self->{lei};
 	my $startq = delete $lei->{startq};
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
@@ -207,10 +206,10 @@ sub kill_reap {
 }
 
 sub query_remote_mboxrd {
-	my ($self, $lei, $uris) = @_;
+	my ($self, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
-	$lei->atfork_child_wq($self);
 	local $SIG{TERM} = sub { exit(0) }; # for DESTROY (File::Temp, $reap)
+	my $lei = $self->{lei};
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
@@ -307,7 +306,7 @@ sub git {
 	$git;
 }
 
-sub query_done { # EOF callback
+sub query_done { # EOF callback for main daemon
 	my ($lei) = @_;
 	my $has_l2m = exists $lei->{l2m};
 	for my $f (qw(lxs l2m)) {
@@ -332,9 +331,8 @@ Error closing $lei->{ovv}->{dst}: $!
 }
 
 sub do_post_augment {
-	my ($lei, $zpipe, $au_done) = @_;
-	my $l2m = $lei->{l2m} or die 'BUG: no {l2m}';
-	eval { $l2m->post_augment($lei, $zpipe) };
+	my ($lei) = @_;
+	eval { $lei->{l2m}->post_augment($lei) };
 	if (my $err = $@) {
 		if (my $lxs = delete $lei->{lxs}) {
 			$lxs->wq_kill;
@@ -342,7 +340,7 @@ sub do_post_augment {
 		}
 		$lei->fail("$err");
 	}
-	close $au_done; # triggers wait_startq
+	close(delete $lei->{au_done}); # triggers wait_startq
 }
 
 my $MAX_PER_HOST = 4;
@@ -356,13 +354,13 @@ sub concurrency {
 }
 
 sub start_query { # always runs in main (lei-daemon) process
-	my ($self, $io, $lei) = @_;
+	my ($self, $lei) = @_;
 	if ($lei->{opt}->{thread}) {
 		for my $ibxish (locals($self)) {
-			$self->wq_do('query_thread_mset', $io, $lei, $ibxish);
+			$self->wq_do('query_thread_mset', [], $ibxish);
 		}
 	} elsif (locals($self)) {
-		$self->wq_do('query_mset', $io, $lei);
+		$self->wq_do('query_mset', []);
 	}
 	my $i = 0;
 	my $q = [];
@@ -370,19 +368,23 @@ sub start_query { # always runs in main (lei-daemon) process
 		push @{$q->[$i++ % $MAX_PER_HOST]}, $uri;
 	}
 	for my $uris (@$q) {
-		$self->wq_do('query_remote_mboxrd', $io, $lei, $uris);
+		$self->wq_do('query_remote_mboxrd', [], $uris);
 	}
-	@$io = ();
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	$self->{lei}->lei_atfork_child;
+	$self->SUPER::ipc_atfork_child;
 }
 
 sub query_prepare { # called by wq_do
-	my ($self, $lei) = @_;
+	my ($self) = @_;
 	local $0 = "$0 query_prepare";
-	$lei->atfork_child_wq($self);
-	delete $lei->{l2m}->{-wq_s1};
+	my $lei = $self->{lei};
 	eval { $lei->{l2m}->do_augment($lei) };
 	$lei->fail($@) if $@;
-	pkt_do($lei->{pkt_op}, '.') == 1 or die "do_post_augment trigger: $!"
+	pkt_do($lei->{pkt_op_p}, '.') == 1 or die "do_post_augment trigger: $!"
 }
 
 sub fail_handler ($;$$) {
@@ -401,45 +403,38 @@ sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
 
 sub do_query {
 	my ($self, $lei) = @_;
-	$lei->{1}->autoflush(1);
-	$lei->start_pager if -t $lei->{1};
-	$lei->{ovv}->ovv_begin($lei);
-	my ($au_done, $zpipe);
-	my $l2m = $lei->{l2m};
-	$lei->atfork_prepare_wq($self);
-	$self->wq_workers_start('lei_xsearch', $self->{jobs}, $lei->oldset);
-	delete $self->{-ipc_atfork_child_close};
-	if ($l2m) {
-		$lei->atfork_prepare_wq($l2m);
-		$l2m->wq_workers_start('lei2mail', $l2m->{jobs}, $lei->oldset);
-		delete $l2m->{-ipc_atfork_child_close};
-		pipe($lei->{startq}, $au_done) or die "pipe: $!";
-		# 1031: F_SETPIPE_SZ
-		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
-		$zpipe = $l2m->pre_augment($lei);
-	}
 	my $ops = {
 		'|' => [ \&sigpipe_handler, $lei ],
 		'!' => [ \&fail_handler, $lei ],
-		'.' => [ \&do_post_augment, $lei, $zpipe, $au_done ],
+		'.' => [ \&do_post_augment, $lei ],
 		'' => [ \&query_done, $lei ],
 		'mset_progress' => [ \&mset_progress, $lei ],
 		'x_it' => [ $lei->can('x_it'), $lei ],
 		'child_error' => [ $lei->can('child_error'), $lei ],
 	};
-	(my $op, $lei->{pkt_op}) = PublicInbox::PktOp->pair($ops);
-	my ($lei_ipc, @io) = $lei->atfork_parent_wq($self);
-	delete($lei->{pkt_op});
-
-	$lei->event_step_init; # wait for shutdowns
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$lei->{1}->autoflush(1);
+	$lei->start_pager if -t $lei->{1};
+	$lei->{ovv}->ovv_begin($lei);
+	my $l2m = $lei->{l2m};
 	if ($l2m) {
-		$self->wq_do('query_prepare', \@io, $lei_ipc);
-		$io[1] = $zpipe->[1] if $zpipe;
+		$l2m->pre_augment($lei);
+		$l2m->wq_workers_start('lei2mail', $l2m->{jobs},
+					$lei->oldset, { lei => $lei });
+		pipe($lei->{startq}, $lei->{au_done}) or die "pipe: $!";
+		# 1031: F_SETPIPE_SZ
+		fcntl($lei->{startq}, 1031, 4096) if $^O eq 'linux';
 	}
-	start_query($self, \@io, $lei_ipc);
-	$self->wq_close(1);
+	$self->wq_workers_start('lei_xsearch', $self->{jobs},
+				$lei->oldset, { lei => $lei });
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$l2m->wq_close(1) if $l2m;
+	$lei->event_step_init; # wait for shutdowns
+	$self->wq_do('query_prepare', []) if $l2m;
+	start_query($self, $lei);
+	$self->wq_close(1); # lei_xsearch workers stop when done
 	if ($lei->{oneshot}) {
-		# for the $lei_ipc->atfork_child_wq PIPE handler:
 		while ($op->{sock}) { $op->event_step }
 	}
 }

^ permalink raw reply related	[relevance 29%]

* [PATCH 10/10] lei import: initial implementation
  2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-04  9:59 56% ` [PATCH 07/10] lei q: eliminate $not_done temporary git dir hack Eric Wong
@ 2021-02-04  9:59 36% ` Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-04  9:59 UTC (permalink / raw)
  To: meta

Only tested with .eml files so far, but Maildir + IMAP
will be supported.
---
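The format handling in _import_fh can be sketched in isolation as
follows.  This is only an illustration under stated assumptions:
Sketch::MboxReader and import_fh are hypothetical names, and the
splitter is deliberately naive (no ">From " unescaping), unlike the
real PublicInbox::MboxReader.  It shows the two paths the patch
takes: "eml" slurps the whole handle as one message, anything else
is looked up as a reader method via ->can.

```perl
use strict;
use warnings;

# Hypothetical stand-in for PublicInbox::MboxReader: one method per format
package Sketch::MboxReader;
sub mboxo { # naive: split on "From " separator lines, no unescaping
	my (undef, $fh, $cb, @arg) = @_;
	my $buf = do { local $/; <$fh> } // die "read: $!";
	for my $msg (split(/^From [^\n]*\n/m, $buf)) {
		next if $msg eq ''; # leading empty field before first "From "
		$cb->($msg, @arg);
	}
}

package main;

# Dispatch on --format: 'eml' is one message per handle,
# any other value must name a reader method.
sub import_fh {
	my ($fmt, $fh, $cb, @arg) = @_;
	if ($fmt eq 'eml') {
		my $buf = do { local $/; <$fh> } // die "read: $!";
		$cb->($buf, @arg);
	} elsif (my $reader = Sketch::MboxReader->can($fmt)) {
		$reader->(undef, $fh, $cb, @arg);
	} else {
		die "--format $fmt unsupported\n";
	}
}
```

The callback receives each raw message, so the same import_fh works
for a single .eml file and for an mbox holding several messages.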
 MANIFEST                      |   1 +
 lib/PublicInbox/IPC.pm        |   4 +-
 lib/PublicInbox/LEI.pm        |  48 ++++++++++++---
 lib/PublicInbox/LeiImport.pm  | 106 ++++++++++++++++++++++++++++++++++
 lib/PublicInbox/LeiStore.pm   |  18 ++++++
 lib/PublicInbox/LeiXSearch.pm |  18 +-----
 t/lei.t                       |  15 +++++
 7 files changed, 184 insertions(+), 26 deletions(-)
 create mode 100644 lib/PublicInbox/LeiImport.pm

diff --git a/MANIFEST b/MANIFEST
index 6922f9b1..a11d4106 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -179,6 +179,7 @@ lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
+lib/PublicInbox/LeiImport.pm
 lib/PublicInbox/LeiOverview.pm
 lib/PublicInbox/LeiQuery.pm
 lib/PublicInbox/LeiSearch.pm
diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 7f5a3f6f..a0e6bfee 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -101,7 +101,7 @@ sub ipc_worker_loop ($$$) {
 
 # starts a worker if Sereal or Storable is installed
 sub ipc_worker_spawn {
-	my ($self, $ident, $oldset) = @_;
+	my ($self, $ident, $oldset, $fields) = @_;
 	return unless $enc; # no Sereal or Storable
 	return if ($self->{-ipc_ppid} // -1) == $$; # idempotent
 	delete(@$self{qw(-ipc_req -ipc_res -ipc_ppid -ipc_pid)});
@@ -123,6 +123,8 @@ sub ipc_worker_spawn {
 		# ensure we properly exit even if warn() dies:
 		my $end = PublicInbox::OnDestroy->new($$, sub { exit(!!$@) });
 		eval {
+			$fields //= {};
+			local @$self{keys %$fields} = values(%$fields);
 			my $on_destroy = $self->ipc_atfork_child;
 			local %SIG = %SIG;
 			ipc_worker_loop($self, $r_req, $w_res);
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 24efb494..682d1bd1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -160,9 +160,10 @@ our %CMD = ( # sorted in order of importance/use:
 'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
 	qw(prune) ],
 
-'import' => [ 'URL_OR_PATHNAME|--stdin',
-	'one-shot import/update from URL or filesystem',
-	qw(stdin| offset=i recursive|r exclude=s include=s !flags),
+'import' => [ 'URLS_OR_PATHNAMES...|--stdin',
+	'one-time import/update from URL or filesystem',
+	qw(stdin| offset=i recursive|r exclude=s include|I=s
+	format|f=s flags!),
 	],
 
 'config' => [ '[...]', sub {
@@ -194,8 +195,8 @@ our %CMD = ( # sorted in order of importance/use:
 # $spec => [@ALLOWED_VALUES (default is first), $description],
 # $spec => $description
 # "$SUB_COMMAND TAB $spec" => as above
-my $stdin_formats = [ 'IN|auto|raw|mboxrd|mboxcl2|mboxcl|mboxo',
-		'specify message input format' ];
+my $stdin_formats = [ 'MAIL_FORMAT|eml|mboxrd|mboxcl2|mboxcl|mboxo',
+			'specify message input format' ];
 my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 
 my %OPTDESC = (
@@ -240,6 +241,8 @@ my %OPTDESC = (
 'q	jobs=s'	=> [ '[SEARCH_JOBS][,WRITER_JOBS]',
 		'control number of search and writer jobs' ],
 
+'import format|f=s' => $stdin_formats,
+
 'ls-query	format|f=s' => $ls_format,
 'ls-external	format|f=s' => $ls_format,
 
@@ -319,6 +322,20 @@ sub err ($;@) {
 
 sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
 
+sub fail_handler ($;$$) {
+	my ($lei, $code, $io) = @_;
+	for my $f (qw(imp lxs l2m)) {
+		my $wq = delete $lei->{$f} or next;
+		$wq->wq_wait_old($lei) if $wq->wq_kill_old; # lei-daemon
+	}
+	close($io) if $io; # needed to avoid warnings on SIGPIPE
+	$lei->x_it($code // (1 >> 8));
+}
+
+sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
+	fail_handler($_[0], 13, delete $_[0]->{1});
+}
+
 sub fail ($$;$) {
 	my ($self, $buf, $exit_code) = @_;
 	err($self, $buf) if defined $buf;
@@ -340,7 +357,8 @@ sub out ($;@) {
 sub puts ($;@) { out(shift, map { "$_\n" } @_) }
 
 sub child_error { # passes non-fatal curl exit codes to user
-	my ($self, $child_error) = @_; # child_error is $?
+	my ($self, $child_error, $msg) = @_; # child_error is $?
+	$self->err($msg) if $msg;
 	if (my $s = $self->{pkt_op_p} // $self->{sock}) {
 		# send to the parent lei-daemon or to lei(1) client
 		send($s, "child_error $child_error", MSG_EOR);
@@ -357,9 +375,16 @@ sub note_sigpipe { # triggers sigpipe_handler
 }
 
 sub lei_atfork_child {
-	my ($self) = @_;
+	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
-	delete $self->{0};
+	if ($persist) {
+		my @io = delete @$self{0,1,2};
+		unless ($self->{oneshot}) {
+			close($_) for @io;
+		}
+	} else {
+		delete $self->{0};
+	}
 	for (delete @$self{qw(3 sock old_1 au_done)}) {
 		close($_) if defined($_);
 	}
@@ -374,7 +399,7 @@ sub lei_atfork_child {
 	%PATH2CFG = ();
 	undef $errors_log;
 	$quit = \&CORE::exit;
-	$current_lei = $self; # for SIG{__WARN__}
+	$current_lei = $persist ? undef : $self; # for SIG{__WARN__}
 }
 
 sub _help ($;$) {
@@ -606,6 +631,11 @@ sub lei_config {
 	x_it($self, $?) if $?;
 }
 
+sub lei_import {
+	require PublicInbox::LeiImport;
+	PublicInbox::LeiImport->call(@_);
+}
+
 sub lei_init {
 	my ($self, $dir) = @_;
 	my $cfg = _lei_cfg($self, 1);
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
new file mode 100644
index 00000000..4a9af8a7
--- /dev/null
+++ b/lib/PublicInbox/LeiImport.pm
@@ -0,0 +1,106 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# front-end for the "lei import" sub-command
+package PublicInbox::LeiImport;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use PublicInbox::MboxReader;
+use PublicInbox::Eml;
+
+sub _import_eml { # MboxReader callback
+	my ($eml, $sto, $set_kw) = @_;
+	$sto->ipc_do('set_eml', $eml, $set_kw ? $sto->mbox_keywords($eml) : ());
+}
+
+sub import_done { # EOF callback for main daemon
+	my ($lei) = @_;
+	my $imp = delete $lei->{imp};
+	$imp->wq_wait_old($lei) if $imp;
+	my $wait = $lei->{sto}->ipc_do('done');
+	$lei->dclose;
+}
+
+sub call { # the main "lei import" method
+	my ($cls, $lei, @argv) = @_;
+	my $sto = $lei->_lei_store(1);
+	$sto->write_prepare($lei);
+	$lei->{opt}->{flags} //= 1;
+	my $fmt = $lei->{opt}->{'format'};
+	my $self = $lei->{imp} = bless {}, $cls;
+	return $lei->fail('--format unspecified') if !$fmt;
+	$self->{0} = $lei->{0} if $lei->{opt}->{stdin};
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'' => [ \&import_done, $lei ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	my $j = $lei->{opt}->{jobs} // scalar(@argv) || 1;
+	my $nproc = $self->detect_nproc;
+	$j = $nproc if $j > $nproc;
+	$self->wq_workers_start('lei_import', $j, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_do('import_stdin', []) if $self->{0};
+	for my $x (@argv) {
+		$self->wq_do('import_path_url', [], $x);
+	}
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	$self->{lei}->lei_atfork_child;
+	$self->SUPER::ipc_atfork_child;
+}
+
+sub _import_fh {
+	my ($lei, $fh, $x) = @_;
+	my $set_kw = $lei->{opt}->{flags};
+	my $fmt = $lei->{opt}->{'format'};
+	eval {
+		if ($fmt eq 'eml') {
+			my $buf = do { local $/; <$fh> } //
+				return $lei->child_error(1 << 8, <<"");
+		error reading $x: $!
+
+			my $eml = PublicInbox::Eml->new(\$buf);
+			_import_eml($eml, $lei->{sto}, $set_kw);
+		} else { # some mbox
+			my $cb = PublicInbox::MboxReader->can($fmt);
+			$cb or return $lei->child_error(1 << 8, <<"");
+	--format $fmt unsupported for $x
+
+			$cb->(undef, $fh, \&_import_eml, $lei->{sto}, $set_kw);
+		}
+	};
+	$lei->child_error(1 << 8, "$x: $@") if $@;
+}
+
+sub import_path_url {
+	my ($self, $x) = @_;
+	my $lei = $self->{lei};
+	# TODO auto-detect?
+	if (-f $x) {
+		open my $fh, '<', $x or return $lei->child_error(1 << 8, <<"");
+unable to open $x: $!
+
+		_import_fh($lei, $fh, $x);
+	} else {
+		$lei->fail("$x unsupported (TODO)");
+	}
+}
+
+sub import_stdin {
+	my ($self) = @_;
+	_import_fh($self->{lei}, $self->{0}, '<stdin>');
+}
+
+1;
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index a7d7d953..3a215973 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -17,6 +17,7 @@ use PublicInbox::V2Writable;
 use PublicInbox::ContentHash qw(content_hash content_digest);
 use PublicInbox::MID qw(mids mids_in);
 use PublicInbox::LeiSearch;
+use PublicInbox::MDA;
 use List::Util qw(max);
 
 sub new {
@@ -237,4 +238,21 @@ sub done {
 	die $err if $err;
 }
 
+sub ipc_atfork_child {
+	my ($self) = @_;
+	my $lei = delete $self->{lei};
+	$lei->lei_atfork_child(1) if $lei;
+	$self->SUPER::ipc_atfork_child;
+}
+
+sub write_prepare {
+	my ($self, $lei) = @_;
+	$self->ipc_lock_init;
+	# Mail we import into lei is private, so headers filtered out
+	# by -mda for public mail are not appropriate
+	local @PublicInbox::MDA::BAD_HEADERS = ();
+	$self->ipc_worker_spawn('lei_store', $lei->oldset, { lei => $lei });
+	$lei->{sto} = $self;
+}
+
 1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index daf42098..f8068362 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -392,25 +392,11 @@ sub query_prepare { # called by wq_do
 	pkt_do($lei->{pkt_op_p}, '.') == 1 or die "do_post_augment trigger: $!"
 }
 
-sub fail_handler ($;$$) {
-	my ($lei, $code, $io) = @_;
-	for my $f (qw(lxs l2m)) {
-		my $wq = delete $lei->{$f} or next;
-		$wq->wq_wait_old($lei) if $wq->wq_kill_old; # lei-daemon
-	}
-	close($io) if $io; # needed to avoid warnings on SIGPIPE
-	$lei->x_it($code // (1 >> 8));
-}
-
-sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
-	fail_handler($_[0], 13, delete $_[0]->{1});
-}
-
 sub do_query {
 	my ($self, $lei) = @_;
 	my $ops = {
-		'|' => [ \&sigpipe_handler, $lei ],
-		'!' => [ \&fail_handler, $lei ],
+		'|' => [ $lei->can('sigpipe_handler'), $lei ],
+		'!' => [ $lei->can('fail_handler'), $lei ],
 		'.' => [ \&do_post_augment, $lei ],
 		'' => [ \&query_done, $lei ],
 		'mset_progress' => [ \&mset_progress, $lei ],
diff --git a/t/lei.t b/t/lei.t
index a08a6d0d..eb824a30 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -389,6 +389,20 @@ SKIP: {
 }; # /SKIP
 };
 
+my $test_import = sub {
+	$cleanup->();
+	ok($lei->(qw(q s:boolean)), 'search miss before import');
+	unlike($out, qr/boolean/i, 'no results, yet');
+	open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
+	ok($lei->([qw(import -f eml -)], undef, { %$opt, 0 => $fh }),
+		'import single file from stdin');
+	close $fh;
+	ok($lei->(qw(q s:boolean)), 'search hit after import');
+	ok($lei->(qw(import -f eml), 't/data/message_embed.eml'),
+		'import single file by path');
+	$cleanup->();
+};
+
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
@@ -396,6 +410,7 @@ my $test_lei_common = sub {
 	$test_external->();
 	$test_completion->();
 	$test_fail->();
+	$test_import->();
 };
 
 if ($ENV{TEST_LEI_ONESHOT}) {

^ permalink raw reply related	[relevance 36%]

* lei-q doc thoughts... [was: doc: start manpages for lei commands]
  2021-02-01  5:57 27% ` [PATCH 1/2] doc: start manpages for lei commands Kyle Meyer
@ 2021-02-06  9:01 90%   ` Eric Wong
  2021-02-06 19:57 90%     ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2021-02-06  9:01 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> +=item --mua-cmd=COMMAND, --mua=COMMAND

On second thought:  is the long "--mua-cmd" even worth having or
supporting given "--mua=" exists?  I will likely remove it from
the documentation and filter it out from the help text.

Technically "mua-cmd" is more descriptive since it's a command
with a %f placeholder, but I can't imagine anybody wanting to
type "--mua-cmd" over "--mua".

> +=item -t, --thread
> +
> +Return all messages in the same thread as the actual match(es).

Heh, it turns out mairix uses "--threads" (plural).  I never
knew that since I always used "-t".  Not sure if it's worth
pluralizing on our end...
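
For reference, a rough sketch of how cheap aliases are with
Getopt::Long-style specs.  The spec strings and the parse_q_opts
name below are assumptions for illustration, not lei's actual
option table.  Whichever alias is typed, the value is stored under
the first name in the spec.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option table: the first name in each spec is the hash key
sub parse_q_opts {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt,
			'mua|mua-cmd=s',	# --mua=CMD or --mua-cmd=CMD
			'thread|threads|t',	# -t, --thread or --threads
		) or die "bad options\n";
	(\%opt, \@argv); # parsed switches, remaining search terms
}
```

Dropping "--mua-cmd" from the documentation would then not require
dropping it from the parser.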

^ permalink raw reply	[relevance 90%]

* [PATCH 04/17] lei: abort lei_import worker on client abort
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
  2021-02-06 12:18 55% ` [PATCH 02/17] lei: favor "keywords" over "flags", test --no-kw Eric Wong
  2021-02-06 12:18 63% ` [PATCH 03/17] lei: fix completion of --no-kw / --no-keywords Eric Wong
@ 2021-02-06 12:18 67% ` Eric Wong
  2021-02-06 12:18 41% ` [PATCH 07/17] tests: add test_lei wrapper, split out t/lei-import.t Eric Wong
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

We'll stuff all the common wq key fields into the
@WQ_KEYS array so it's easier to keep track of what
to kill or reap.
---
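A minimal sketch of the idea, with a hypothetical Sketch::Worker
class and stop_workers sub standing in for the real wq objects and
their wq_kill/wq_wait_old calls.  The point is only that one shared
key list drives every loop, so a newly added worker type such as
"imp" cannot be forgotten in one of them.

```perl
use strict;
use warnings;

my @WQ_KEYS = qw(lxs l2m imp); # same list as in the patch

# Hypothetical worker object; real workers are PublicInbox::IPC subclasses
package Sketch::Worker;
sub new { bless { stopped => 0 }, $_[0] }
sub stop { $_[0]->{stopped} = 1 }

package main;

# Detach and stop every worker slot that is populated; returns the
# names of the slots which were actually in use.
sub stop_workers {
	my ($lei) = @_;
	my @stopped;
	for my $f (@WQ_KEYS) {
		my $wq = delete $lei->{$f} or next;
		$wq->stop;
		push @stopped, $f;
	}
	@stopped;
}
```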
 lib/PublicInbox/LEI.pm | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8d5a921e..28ad88e7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -286,6 +286,8 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
+my @WQ_KEYS = qw(lxs l2m imp); # internal workers
+
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
 	my ($self, $code) = @_;
@@ -296,7 +298,7 @@ sub x_it ($$) {
 		send($s, "x_it $code", MSG_EOR);
 	} elsif ($self->{oneshot}) {
 		# don't want to end up using $? from child processes
-		for my $f (qw(lxs l2m)) {
+		for my $f (@WQ_KEYS) {
 			my $wq = delete $self->{$f} or next;
 			$wq->DESTROY;
 		}
@@ -327,7 +329,7 @@ sub qerr ($;@) { $_[0]->{opt}->{quiet} or err(shift, @_) }
 
 sub fail_handler ($;$$) {
 	my ($lei, $code, $io) = @_;
-	for my $f (qw(imp lxs l2m)) {
+	for my $f (@WQ_KEYS) {
 		my $wq = delete $lei->{$f} or next;
 		$wq->wq_wait_old($lei) if $wq->wq_kill_old; # lei-daemon
 	}
@@ -335,7 +337,7 @@ sub fail_handler ($;$$) {
 	$lei->x_it($code // (1 >> 8));
 }
 
-sub sigpipe_handler { # handles SIGPIPE from l2m/lxs workers
+sub sigpipe_handler { # handles SIGPIPE from @WQ_KEYS workers
 	fail_handler($_[0], 13, delete $_[0]->{1});
 }
 
@@ -856,7 +858,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 sub dclose {
 	my ($self) = @_;
 	delete $self->{-progress};
-	for my $f (qw(lxs l2m)) {
+	for my $f (@WQ_KEYS) {
 		my $wq = delete $self->{$f} or next;
 		if ($wq->wq_kill) {
 			$wq->wq_close

^ permalink raw reply related	[relevance 67%]

* [PATCH 03/17] lei: fix completion of --no-kw / --no-keywords
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
  2021-02-06 12:18 55% ` [PATCH 02/17] lei: favor "keywords" over "flags", test --no-kw Eric Wong
@ 2021-02-06 12:18 63% ` Eric Wong
  2021-02-06 12:18 67% ` [PATCH 04/17] lei: abort lei_import worker on client abort Eric Wong
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

We did not complete --no-* switches properly when an option
has multiple names (e.g. "kw|keywords|flags!").
---
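The substitutions touched by this patch can be tried standalone.
The sub names below are hypothetical wrappers; the regexps are
lifted from the diff.  With a multi-name spec such as
"kw|keywords|flags!", the old completion rule only negates the
first name, while the new rules negate every alias.

```perl
use strict;
use warnings;

# Old completion rule: assumes a single name before the "!"
sub complete_old {
	my ($spec) = @_;
	(my $x = $spec) =~ s/\A(.+)!\z/no-$1|$1/;
	map { length($_) > 1 ? "--$_" : "-$_" } split(/\|/, $x);
}

# New completion rule: offer NAME and no-NAME for every alias
sub complete_new {
	my ($spec) = @_;
	my $x = $spec;
	$x =~ s/([\w\-]+)/$1|no-$1/g if $x =~ s/!\z//;
	map { length($_) > 1 ? "--$_" : "-$_" } split(/\|/, $x);
}

# New --help rule: prefix every alias with "no-"
sub help_negated {
	my ($spec) = @_;
	(my $x = $spec) =~ s/!\z//;
	$x =~ s/(\A|\|)/$1no-/g;
	$x;
}
```

Single-name specs such as "solve!" still yield both --solve and
--no-solve under the new completion rule.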
 lib/PublicInbox/LEI.pm | 9 ++++++---
 t/lei.t                | 8 +++++++-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b058b533..8d5a921e 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -274,6 +274,8 @@ my %OPTDESC = (
 'by-mid|mid:s' => [ 'MID', 'match only by Message-ID, ignoring contents' ],
 'jobs:i' => 'set parallelism level',
 
+'kw|keywords|flags!' => 'disable/enable importing flags',
+
 # xargs, env, use "-0", git(1) uses "-z".  We support z|0 everywhere
 'z|0' => 'use NUL \\0 instead of newline (CR) to delimit lines',
 
@@ -425,7 +427,7 @@ sub _help ($;$) {
 		my (@vals, @s, @l);
 		my $x = $sw;
 		if ($x =~ s/!\z//) { # solve! => --no-solve
-			$x = "no-$x";
+			$x =~ s/(\A|\|)/$1no-/g
 		} elsif ($x =~ s/:.+//) { # optional args: $x = "mid:s"
 			@vals = (' [', undef, ']');
 		} elsif ($x =~ s/=.+//) { # required arg: $x = "type=s"
@@ -710,8 +712,9 @@ sub lei__complete {
 		}
 		puts $self, grep(/$re/, map { # generate short/long names
 			if (s/[:=].+\z//) { # req/optional args, e.g output|o=i
-			} else { # negation: solve! => no-solve|solve
-				s/\A(.+)!\z/no-$1|$1/;
+			} elsif (s/!\z//) {
+				# negation: solve! => no-solve|solve
+				s/([\w\-]+)/$1|no-$1/g
 			}
 			map {
 				my $x = length > 1 ? "--$_" : "-$_";
diff --git a/t/lei.t b/t/lei.t
index 41d854e8..df333957 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -363,7 +363,7 @@ my $test_completion = sub {
 			--mua --mua-cmd --no-local --local --verbose -v
 			--save-as --no-remote --remote --torsocks
 			--reverse -r )) {
-		ok($out{$sw}, "$sw offered as completion");
+		ok($out{$sw}, "$sw offered as `lei q' completion");
 	}
 
 	ok($lei->(qw(_complete lei q --form)), 'complete q --format');
@@ -376,6 +376,12 @@ my $test_completion = sub {
 			ok($out{$f}, "got $sw $f as output format");
 		}
 	}
+	ok($lei->(qw(_complete lei import)), 'complete import');
+	%out = map { $_ => 1 } split(/\s+/s, $out);
+	for my $sw (qw(--flags --no-flags --no-kw --kw --no-keywords
+			--keywords)) {
+		ok($out{$sw}, "$sw offered as `lei import' completion");
+	}
 };
 
 my $test_fail = sub {

^ permalink raw reply related	[relevance 63%]

* [PATCH 00/17] lei: more random updates
@ 2021-02-06 12:18 60% Eric Wong
  2021-02-06 12:18 55% ` [PATCH 02/17] lei: favor "keywords" over "flags", test --no-kw Eric Wong
                   ` (10 more replies)
  0 siblings, 11 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

"lei add-external --mirror $URL $DESTDIR" works.
Tests are split out further and hopefully easier to manage
going forward (they are slowing down, though more use of the
common setup_public_inboxes() may help).

The curl(1) short options are gone to avoid conflicts.
--help looks a bit nicer, now.

Eric Wong (17):
  lei_overview: drop unnecessary autoflush call
  lei: favor "keywords" over "flags", test --no-kw
  lei: fix completion of --no-kw / --no-keywords
  lei: abort lei_import worker on client abort
  init: lowercase -j for --jobs
  lei_query: trim curl options
  tests: add test_lei wrapper, split out t/lei-import.t
  t/lei-externals: split out into separate test
  t/tests: split out setup_public_inboxes sub
  tests: split out lei-daemon.t from lei.t
  treewide: replace confess with croak
  script/lei: avoid waitpid(-1, ...) to keep tests fast
  lei: add-external --mirror support
  lei help: split out into separate file
  lei add-external: reject index and remote opts w/o mirror
  lei_curl: replace -K/--config with --curl-config
  lei: remove short switch support for curl(1) options

 MANIFEST                               |  11 +-
 Makefile.PL                            |   3 +
 contrib/completion/lei-completion.bash |   2 +-
 lib/PublicInbox/Admin.pm               |   7 +-
 lib/PublicInbox/DS.pm                  |  10 +-
 lib/PublicInbox/Eml.pm                 |   4 +-
 lib/PublicInbox/IPC.pm                 |   2 +-
 lib/PublicInbox/LEI.pm                 | 200 +++++-------
 lib/PublicInbox/LeiCurl.pm             |  72 +++++
 lib/PublicInbox/LeiExternal.pm         |  46 ++-
 lib/PublicInbox/LeiHelp.pm             | 100 ++++++
 lib/PublicInbox/LeiImport.pm           |   4 +-
 lib/PublicInbox/LeiMirror.pm           | 288 +++++++++++++++++
 lib/PublicInbox/LeiOverview.pm         |   1 -
 lib/PublicInbox/LeiQuery.pm            |  24 +-
 lib/PublicInbox/LeiXSearch.pm          |  33 +-
 lib/PublicInbox/OverIdx.pm             |   2 +-
 lib/PublicInbox/TestCommon.pm          | 142 ++++++++-
 script/lei                             |  28 +-
 script/public-inbox-init               |   2 +-
 t/home1/.gitignore                     |   5 +
 t/home1/Makefile                       |   7 +
 t/home1/README                         |   8 +
 t/lei-daemon.t                         |  63 ++++
 t/lei-externals.t                      | 200 ++++++++++++
 t/lei-import.t                         |  39 +++
 t/lei-mirror.t                         |  30 ++
 t/lei-oneshot.t                        |   8 -
 t/lei.t                                | 424 +++----------------------
 29 files changed, 1180 insertions(+), 585 deletions(-)
 create mode 100644 lib/PublicInbox/LeiCurl.pm
 create mode 100644 lib/PublicInbox/LeiHelp.pm
 create mode 100644 lib/PublicInbox/LeiMirror.pm
 create mode 100644 t/home1/.gitignore
 create mode 100644 t/home1/Makefile
 create mode 100644 t/home1/README
 create mode 100644 t/lei-daemon.t
 create mode 100644 t/lei-externals.t
 create mode 100644 t/lei-import.t
 create mode 100644 t/lei-mirror.t
 delete mode 100644 t/lei-oneshot.t

^ permalink raw reply	[relevance 60%]

* [PATCH 02/17] lei: favor "keywords" over "flags", test --no-kw
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
@ 2021-02-06 12:18 55% ` Eric Wong
  2021-02-06 12:18 63% ` [PATCH 03/17] lei: fix completion of --no-kw / --no-keywords Eric Wong
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

JMAP brain says "keywords", IMAP brain says "flags";
JMAP brain wins today.

Since "keywords" is a bit long, support "kw" as a shortcut since
there's no conflict and "kw:" will be our search prefix for
looking up messages by keyword.
---
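As a rough illustration of what the new test checks ("Status: RO"
yields the `seen' keyword, and --no-kw yields none), here is a
self-contained sketch.  The header_keywords name and both lookup
tables are assumptions based on common mbox conventions, not a copy
of what lei's mbox_keywords does.

```perl
use strict;
use warnings;

# Assumed letter-to-keyword tables (illustrative only)
my %STATUS_KW = (R => 'seen'); # "O" (old) has no keyword
my %XSTATUS_KW = (A => 'answered', F => 'flagged', T => 'draft');

# Returns a sorted arrayref of keywords found in a raw header string,
# or an empty arrayref when keyword import is disabled (--no-kw).
sub header_keywords {
	my ($raw_hdr, $want_kw) = @_;
	return [] unless $want_kw;
	my %kw;
	for my $pair ([ qr/^Status:[ \t]*(\S*)/m, \%STATUS_KW ],
			[ qr/^X-Status:[ \t]*(\S*)/m, \%XSTATUS_KW ]) {
		my ($re, $tbl) = @$pair;
		my ($letters) = ($raw_hdr =~ $re) or next;
		for my $c (split(//, $letters)) {
			my $k = $tbl->{$c} // next; # unknown letters are ignored
			$kw{$k} = 1;
		}
	}
	[ sort keys %kw ];
}
```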
 lib/PublicInbox/LEI.pm       |  7 ++++---
 lib/PublicInbox/LeiImport.pm |  4 ++--
 t/lei.t                      | 21 ++++++++++++++++++++-
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 682d1bd1..b058b533 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -131,7 +131,7 @@ our %CMD = ( # sorted in order of importance/use:
 	'exclude mail matching From: or thread from non-Message-ID searches',
 	qw(stdin| thread|t from|f=s mid=s oid=s) ],
 'mark' => [ 'MESSAGE_FLAGS...',
-	'set/unset flags on message(s) from stdin',
+	'set/unset keywords on message(s) from stdin',
 	qw(stdin| oid=s exact by-mid|mid:s) ],
 'forget' => [ '[--stdin|--oid=OID|--by-mid=MID]',
 	"exclude message(s) on stdin from `q' search results",
@@ -152,7 +152,8 @@ our %CMD = ( # sorted in order of importance/use:
 
 'add-watch' => [ '[URL_OR_PATHNAME]',
 		'watch for new messages and flag changes',
-	qw(import! flags! interval=s recursive|r exclude=s include=s) ],
+	qw(import! kw|keywords|flags! interval=s recursive|r
+	exclude=s include=s) ],
 'ls-watch' => [ '[FILTER...]', 'list active watches with numbers and status',
 		qw(format|f=s z) ],
 'pause-watch' => [ '[WATCH_NUMBER_OR_FILTER]', qw(all local remote) ],
@@ -163,7 +164,7 @@ our %CMD = ( # sorted in order of importance/use:
 'import' => [ 'URLS_OR_PATHNAMES...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
-	format|f=s flags!),
+	format|f=s kw|keywords|flags!),
 	],
 
 'config' => [ '[...]', sub {
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 4a9af8a7..2c7cbf2b 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -26,7 +26,7 @@ sub call { # the main "lei import" method
 	my ($cls, $lei, @argv) = @_;
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
-	$lei->{opt}->{flags} //= 1;
+	$lei->{opt}->{kw} //= 1;
 	my $fmt = $lei->{opt}->{'format'};
 	my $self = $lei->{imp} = bless {}, $cls;
 	return $lei->fail('--format unspecified') if !$fmt;
@@ -63,7 +63,7 @@ sub ipc_atfork_child {
 
 sub _import_fh {
 	my ($lei, $fh, $x) = @_;
-	my $set_kw = $lei->{opt}->{flags};
+	my $set_kw = $lei->{opt}->{kw};
 	my $fmt = $lei->{opt}->{'format'};
 	eval {
 		if ($fmt eq 'eml') {
diff --git a/t/lei.t b/t/lei.t
index eb824a30..41d854e8 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -400,7 +400,26 @@ my $test_import = sub {
 	ok($lei->(qw(q s:boolean)), 'search hit after import');
 	ok($lei->(qw(import -f eml), 't/data/message_embed.eml'),
 		'import single file by path');
-	$cleanup->();
+
+	my $str = <<'';
+From: a@b
+Message-ID: <x@y>
+Status: RO
+
+	ok($lei->([qw(import -f eml -)], undef, { %$opt, 0 => \$str }),
+		'import single file with keywords from stdin');
+	$lei->(qw(q m:x@y));
+	my $res = $json->decode($out);
+	is($res->[1], undef, 'only one result');
+	is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
+
+	$str =~ tr/x/v/; # v@y
+	ok($lei->([qw(import --no-kw -f eml -)], undef, { %$opt, 0 => \$str }),
+		'import single file with --no-kw from stdin');
+	$lei->(qw(q m:v@y));
+	$res = $json->decode($out);
+	is($res->[1], undef, 'only one result');
+	is_deeply($res->[0]->{kw}, [], 'no keywords set');
 };
 
 my $test_lei_common = sub {

^ permalink raw reply related	[relevance 55%]

* [PATCH 12/17] script/lei: avoid waitpid(-1, ...) to keep tests fast
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (5 preceding siblings ...)
  2021-02-06 12:18 46% ` [PATCH 10/17] tests: split out lei-daemon.t from lei.t Eric Wong
@ 2021-02-06 12:18 66% ` Eric Wong
  2021-02-06 12:18 21% ` [PATCH 13/17] lei: add-external --mirror support Eric Wong
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

We only spawn one process to be reaped at the moment.  Tests
will run the contents of script/* in the same process if
possible, so any test scripts which spawn -httpd or other
read-only daemons can cause us to stall with waitpid(-1, ...).
---
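The approach can be sketched outside of script/lei like this.  The
spawn_tracked and reap_tracked names are hypothetical; the pattern
(remember the PIDs we forked, then waitpid on each of them instead
of on -1) is the one used in the diff below.  Children started by
anything else in the same process are left alone.

```perl
use strict;
use warnings;
use POSIX ();

my %pids; # only children we started ourselves

# Fork a child which exits immediately with $exit_code; track its PID
sub spawn_tracked {
	my ($exit_code) = @_;
	my $pid = fork // die "fork: $!";
	POSIX::_exit($exit_code) if $pid == 0; # child
	$pids{$pid} = 1;
	$pid;
}

# Reap tracked children only.  $flags is 0 to block, or
# POSIX::WNOHANG() when called from a SIGCHLD handler.
# Returns a hashref of PID => wait status for reaped children.
sub reap_tracked {
	my ($flags) = @_;
	my %status;
	for my $pid (keys %pids) {
		next unless waitpid($pid, $flags) == $pid;
		$status{$pid} = $?;
		delete $pids{$pid};
	}
	\%status;
}
```

The stored wait status follows the usual convention: the exit code
is in the high byte ($status >> 8) and a terminating signal is in
the low seven bits ($status & 127).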
 script/lei | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/script/lei b/script/lei
index 40c21ad8..b7f21f14 100755
--- a/script/lei
+++ b/script/lei
@@ -14,13 +14,15 @@ my $send_cmd = PublicInbox::CmdIPC4->can('send_cmd4') // do {
 	PublicInbox::Spawn->can('send_cmd4');
 };
 
-sub sigchld {
-	my ($sig) = @_;
-	my $flags = $sig ? POSIX::WNOHANG() : 0;
-	while (waitpid(-1, $flags) > 0) {}
-}
+my %pids;
+my $sigchld = sub {
+	my $flags = scalar(@_) ? POSIX::WNOHANG() : 0;
+	for my $pid (keys %pids) {
+		delete($pids{$pid}) if waitpid($pid, $flags) == $pid;
+	}
+};
 
-sub exec_cmd {
+my $exec_cmd = sub {
 	my ($fds, $argc, @argv) = @_;
 	my @old = (*STDIN{IO}, *STDOUT{IO}, *STDERR{IO});
 	my @rdr;
@@ -29,7 +31,7 @@ sub exec_cmd {
 		push @rdr, shift(@old), $tmpfh;
 	}
 	require POSIX; # WNOHANG
-	$SIG{CHLD} = \&sigchld;
+	$SIG{CHLD} = $sigchld;
 	my $pid = fork // die "fork: $!";
 	if ($pid == 0) {
 		my %env = map { split(/=/, $_, 2) } splice(@argv, $argc);
@@ -38,9 +40,11 @@ sub exec_cmd {
 		}
 		%ENV = (%ENV, %env);
 		exec(@argv);
-		die "exec: @argv: $!";
+		warn "exec: @argv: $!\n";
+		POSIX::_exit(1);
 	}
-}
+	$pids{$pid} = 1;
+};
 
 if ($send_cmd && eval {
 	my $path = do {
@@ -107,13 +111,13 @@ Falling back to (slow) one-shot mode
 		} elsif ($buf =~ /\Achild_error ([0-9]+)\z/) {
 			$x_it_code = $1 + 0;
 		} elsif ($buf =~ /\Aexec (.+)\z/) {
-			exec_cmd(\@fds, split(/\0/, $1));
+			$exec_cmd->(\@fds, split(/\0/, $1));
 		} else {
-			sigchld();
+			$sigchld->();
 			die $buf;
 		}
 	}
-	sigchld();
+	$sigchld->();
 	if (my $sig = ($x_it_code & 127)) {
 		kill $sig, $$;
 		sleep(1) while 1;

^ permalink raw reply related	[relevance 66%]

* [PATCH 10/17] tests: split out lei-daemon.t from lei.t
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (4 preceding siblings ...)
  2021-02-06 12:18 24% ` [PATCH 08/17] t/lei-externals: split out into separate test Eric Wong
@ 2021-02-06 12:18 46% ` Eric Wong
  2021-02-06 12:18 66% ` [PATCH 12/17] script/lei: avoid waitpid(-1, ...) to keep tests fast Eric Wong
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

This makes it easier for hackers to find daemon-specific
tests and forces us to always test both daemon and
oneshot mode.
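
As a rough sketch of the "run everything in both modes" idea (not the
actual test_lei code), a wrapper can invoke the same callback twice,
each time with its own localized $HOME.  The name run_in_both_modes
and the directory layout below are assumptions for illustration only:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Run $cb once per mode, each with a private HOME (hypothetical helper)
sub run_in_both_modes {
	my ($cb) = @_;
	my $tmp = tempdir(CLEANUP => 1);
	my @seen;
	for my $mode (qw(daemon oneshot)) {
		my $home = "$tmp/lei-$mode";
		mkdir($home, 0700) or die "mkdir: $!";
		local $ENV{HOME} = $home; # restored after each iteration
		push @seen, $cb->($mode);
	}
	@seen;
}

# the callback sees a different HOME on each run
my @homes = run_in_both_modes(sub { $ENV{HOME} });
```

Because the environment is localized per iteration, state left behind
by one mode cannot leak into the other.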
---
 MANIFEST                      |   2 +-
 lib/PublicInbox/TestCommon.pm |   8 +-
 t/lei-daemon.t                |  63 ++++++++++++
 t/lei-oneshot.t               |   8 --
 t/lei.t                       | 177 ++++++++--------------------------
 5 files changed, 107 insertions(+), 151 deletions(-)
 create mode 100644 t/lei-daemon.t
 delete mode 100644 t/lei-oneshot.t

diff --git a/MANIFEST b/MANIFEST
index 000834cc..52dea385 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -354,9 +354,9 @@ t/init.t
 t/ipc.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-daemon.t
 t/lei-externals.t
 t/lei-import.t
-t/lei-oneshot.t
 t/lei.t
 t/lei_dedupe.t
 t/lei_external.t
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index bb2cd7e6..c861dc5d 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -456,13 +456,15 @@ SKIP: {
 	require PublicInbox::Spawn;
 	state $lei_daemon = PublicInbox::Spawn->can('send_cmd4') ||
 				eval { require Socket::MsgHdr; 1 };
+	# XXX fix and move this inside daemon-only before 1.7 release
+	skip <<'EOM', 1 unless $lei_daemon;
+Socket::MsgHdr missing or Inline::C is unconfigured/missing
+EOM
 	$lei_opt = { 1 => \$lei_out, 2 => \$lei_err };
 	my $daemon_pid;
 	my ($tmpdir, $for_destroy) = tmpdir();
 	SKIP: {
-		skip <<'EOM', 1 unless $lei_daemon;
-Socket::MsgHdr missing or Inline::C is unconfigured/missing
-EOM
+		skip 'TEST_LEI_ONESHOT set', 1 if $ENV{TEST_LEI_ONESHOT};
 		my $home = "$tmpdir/lei-daemon";
 		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
 		local $ENV{HOME} = $home;
diff --git a/t/lei-daemon.t b/t/lei-daemon.t
new file mode 100644
index 00000000..c55ba86c
--- /dev/null
+++ b/t/lei-daemon.t
@@ -0,0 +1,63 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+
+test_lei({ daemon_only => 1 }, sub {
+	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
+	my $err_log = "$ENV{XDG_RUNTIME_DIR}/lei/errors.log";
+	ok($lei->('daemon-pid'), 'daemon-pid');
+	is($lei_err, '', 'no error from daemon-pid');
+	like($lei_out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
+	chomp(my $pid = $lei_out);
+	ok(kill(0, $pid), 'pid is valid');
+	ok(-S $sock, 'sock created');
+	is(-s $err_log, 0, 'nothing in errors.log');
+	open my $efh, '>>', $err_log or BAIL_OUT $!;
+	print $efh "phail\n" or BAIL_OUT $!;
+	close $efh or BAIL_OUT $!;
+
+	ok($lei->('daemon-pid'), 'daemon-pid');
+	chomp(my $pid_again = $lei_out);
+	is($pid, $pid_again, 'daemon-pid idempotent');
+	like($lei_err, qr/phail/, 'got mock "phail" error previous run');
+
+	ok($lei->(qw(daemon-kill)), 'daemon-kill');
+	is($lei_out, '', 'no output from daemon-kill');
+	is($lei_err, '', 'no error from daemon-kill');
+	for (0..100) {
+		kill(0, $pid) or last;
+		tick();
+	}
+	ok(-S $sock, 'sock still exists');
+	ok(!kill(0, $pid), 'pid gone after stop');
+
+	ok($lei->(qw(daemon-pid)), 'daemon-pid');
+	chomp(my $new_pid = $lei_out);
+	ok(kill(0, $new_pid), 'new pid is running');
+	ok(-S $sock, 'sock still exists');
+
+	for my $sig (qw(-0 -CHLD)) {
+		ok($lei->('daemon-kill', $sig), "handles $sig");
+	}
+	is($lei_out.$lei_err, '', 'no output on innocuous signals');
+	ok($lei->('daemon-pid'), 'daemon-pid');
+	chomp $lei_out;
+	is($lei_out, $new_pid, 'PID unchanged after -0/-CHLD');
+
+	if ('socket inaccessible') {
+		chmod 0000, $sock or BAIL_OUT "chmod 0000: $!";
+		ok($lei->('help'), 'connect fail, one-shot fallback works');
+		like($lei_err, qr/\bconnect\(/, 'connect error noted');
+		like($lei_out, qr/^usage: /, 'help output works');
+		chmod 0700, $sock or BAIL_OUT "chmod 0700: $!";
+	}
+	unlink $sock or BAIL_OUT "unlink($sock) $!";
+	for (0..100) {
+		kill('CHLD', $new_pid) or last;
+		tick();
+	}
+	ok(!kill(0, $new_pid), 'daemon exits after unlink');
+});
+
+done_testing;
diff --git a/t/lei-oneshot.t b/t/lei-oneshot.t
deleted file mode 100644
index 7688da5b..00000000
--- a/t/lei-oneshot.t
+++ /dev/null
@@ -1,8 +0,0 @@
-#!perl -w
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
-# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict;
-use v5.10.1;
-use PublicInbox::TestCommon;
-local $ENV{TEST_LEI_ONESHOT} = '1';
-require './t/lei.t';
diff --git a/t/lei.t b/t/lei.t
index cfcdafb9..f789f63a 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -1,87 +1,56 @@
 #!perl -w
 # Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict;
-use v5.10.1;
-use Test::More;
-use PublicInbox::TestCommon;
-use PublicInbox::Config;
+use strict; use v5.10.1; use PublicInbox::TestCommon;
 use File::Path qw(rmtree);
 use PublicInbox::Spawn qw(which);
-my $req_sendcmd = 'Socket::MsgHdr or Inline::C missing or unconfigured';
-undef($req_sendcmd) if PublicInbox::Spawn->can('send_cmd4');
-eval { require Socket::MsgHdr; undef $req_sendcmd };
-require_git 2.6;
-require_mods(qw(json DBD::SQLite Search::Xapian));
-my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
-my ($home, $for_destroy) = tmpdir();
-my $err_filter;
-my $curl = which('curl');
-my $json = ref(PublicInbox::Config->json)->new->utf8->canonical;
-my $lei = sub {
-	my ($cmd, $env, $xopt) = @_;
-	$out = $err = '';
-	if (!ref($cmd)) {
-		($env, $xopt) = grep { (!defined) || ref } @_;
-		$cmd = [ grep { defined && !ref } @_ ];
-	}
-	my $res = run_script(['lei', @$cmd], $env, $xopt // $opt);
-	$err_filter and
-		$err = join('', grep(!/$err_filter/, split(/^/m, $err)));
-	$res;
-};
 
-delete local $ENV{XDG_DATA_HOME};
-delete local $ENV{XDG_CONFIG_HOME};
-local $ENV{GIT_COMMITTER_EMAIL} = 'lei@example.com';
-local $ENV{GIT_COMMITTER_NAME} = 'lei user';
-local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
-local $ENV{HOME} = $home;
-mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
-my $home_trash = [ "$home/.local", "$home/.config", "$home/junk" ];
+# this only tests the basic help/config/init/completion bits of lei;
+# actual functionality is tested in other t/lei-*.t tests
+my $curl = which('curl');
+my $home;
+my $home_trash = [];
 my $cleanup = sub { rmtree([@$home_trash, @_]) };
-my $config_file = "$home/.config/lei/config";
-my $store_dir = "$home/.local/share/lei";
 
 my $test_help = sub {
 	ok(!$lei->(), 'no args fails');
 	is($? >> 8, 1, '$? is 1');
-	is($out, '', 'nothing in stdout');
-	like($err, qr/^usage:/sm, 'usage in stderr');
+	is($lei_out, '', 'nothing in stdout');
+	like($lei_err, qr/^usage:/sm, 'usage in stderr');
 
 	for my $arg (['-h'], ['--help'], ['help'], [qw(daemon-pid --help)]) {
 		ok($lei->($arg), "lei @$arg");
-		like($out, qr/^usage:/sm, "usage in stdout (@$arg)");
-		is($err, '', "nothing in stderr (@$arg)");
+		like($lei_out, qr/^usage:/sm, "usage in stdout (@$arg)");
+		is($lei_err, '', "nothing in stderr (@$arg)");
 	}
 
 	for my $arg ([''], ['--halp'], ['halp'], [qw(daemon-pid --halp)]) {
 		ok(!$lei->($arg), "lei @$arg");
 		is($? >> 8, 1, '$? set correctly');
-		isnt($err, '', 'something in stderr');
-		is($out, '', 'nothing in stdout');
+		isnt($lei_err, '', 'something in stderr');
+		is($lei_out, '', 'nothing in stdout');
 	}
 	ok($lei->(qw(init -h)), 'init -h');
-	like($out, qr! \Q$home\E/\.local/share/lei/store\b!,
+	like($lei_out, qr! \Q$home\E/\.local/share/lei/store\b!,
 		'actual path shown in init -h');
 	ok($lei->(qw(init -h), { XDG_DATA_HOME => '/XDH' }),
 		'init with XDG_DATA_HOME');
-	like($out, qr! /XDH/lei/store\b!, 'XDG_DATA_HOME in init -h');
-	is($err, '', 'no errors from init -h');
+	like($lei_out, qr! /XDH/lei/store\b!, 'XDG_DATA_HOME in init -h');
+	is($lei_err, '', 'no errors from init -h');
 
 	ok($lei->(qw(config -h)), 'config-h');
-	like($out, qr! \Q$home\E/\.config/lei/config\b!,
+	like($lei_out, qr! \Q$home\E/\.config/lei/config\b!,
 		'actual path shown in config -h');
 	ok($lei->(qw(config -h), { XDG_CONFIG_HOME => '/XDC' }),
 		'config with XDG_CONFIG_HOME');
-	like($out, qr! /XDC/lei/config\b!, 'XDG_CONFIG_HOME in config -h');
-	is($err, '', 'no errors from config -h');
+	like($lei_out, qr! /XDC/lei/config\b!, 'XDG_CONFIG_HOME in config -h');
+	is($lei_err, '', 'no errors from config -h');
 };
 
 my $ok_err_info = sub {
 	my ($msg) = @_;
-	is(grep(!/^I:/, split(/^/, $err)), 0, $msg) or
-		diag "$msg: err=$err";
+	is(grep(!/^I:/, split(/^/, $lei_err)), 0, $msg) or
+		diag "$msg: err=$lei_err";
 };
 
 my $test_init = sub {
@@ -92,7 +61,7 @@ my $test_init = sub {
 	$ok_err_info->('after idempotent init w/o args');
 
 	ok(!$lei->('init', "$home/x"), 'init conflict');
-	is(grep(/^E:/, split(/^/, $err)), 1, 'got error on conflict');
+	is(grep(/^E:/, split(/^/, $lei_err)), 1, 'got error on conflict');
 	ok(!-e "$home/x", 'nothing created on conflict');
 	$cleanup->();
 
@@ -104,36 +73,36 @@ my $test_init = sub {
 	$cleanup->("$home/x");
 
 	ok(!$lei->('init', "$home/x", "$home/2"), 'too many args fails');
-	like($err, qr/too many/, 'noted excessive');
+	like($lei_err, qr/too many/, 'noted excessive');
 	ok(!-e "$home/x", 'x not created on excessive');
 	for my $d (@$home_trash) {
 		my $base = (split(m!/!, $d))[-1];
 		ok(!-d $d, "$base not created");
 	}
-	is($out, '', 'nothing in stdout on init failure');
+	is($lei_out, '', 'nothing in stdout on init failure');
 };
 
 my $test_config = sub {
 	$cleanup->();
 	ok($lei->(qw(config a.b c)), 'config set var');
-	is($out.$err, '', 'no output on var set');
+	is($lei_out.$lei_err, '', 'no output on var set');
 	ok($lei->(qw(config -l)), 'config -l');
-	is($err, '', 'no errors on listing');
-	is($out, "a.b=c\n", 'got expected output');
+	is($lei_err, '', 'no errors on listing');
+	is($lei_out, "a.b=c\n", 'got expected output');
 	ok(!$lei->(qw(config -f), "$home/.config/f", qw(x.y z)),
 			'config set var with -f fails');
-	like($err, qr/not supported/, 'not supported noted');
+	like($lei_err, qr/not supported/, 'not supported noted');
 	ok(!-f "$home/config/f", 'no file created');
 };
 
 my $test_completion = sub {
 	ok($lei->(qw(_complete lei)), 'no errors on complete');
-	my %out = map { $_ => 1 } split(/\s+/s, $out);
+	my %out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	ok($out{'q'}, "`lei q' offered as completion");
 	ok($out{'add-external'}, "`lei add-external' offered as completion");
 
 	ok($lei->(qw(_complete lei q)), 'complete q (no args)');
-	%out = map { $_ => 1 } split(/\s+/s, $out);
+	%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	for my $sw (qw(-f --format -o --output --mfolder --augment -a
 			--mua --mua-cmd --no-local --local --verbose -v
 			--save-as --no-remote --remote --torsocks
@@ -142,17 +111,17 @@ my $test_completion = sub {
 	}
 
 	ok($lei->(qw(_complete lei q --form)), 'complete q --format');
-	is($out, "--format\n", 'complete lei q --format');
+	is($lei_out, "--format\n", 'complete lei q --format');
 	for my $sw (qw(-f --format)) {
 		ok($lei->(qw(_complete lei q), $sw), "complete q $sw ARG");
-		%out = map { $_ => 1 } split(/\s+/s, $out);
+		%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 		for my $f (qw(mboxrd mboxcl2 mboxcl mboxo json jsonl
 				concatjson maildir)) {
 			ok($out{$f}, "got $sw $f as output format");
 		}
 	}
 	ok($lei->(qw(_complete lei import)), 'complete import');
-	%out = map { $_ => 1 } split(/\s+/s, $out);
+	%out = map { $_ => 1 } split(/\s+/s, $lei_out);
 	for my $sw (qw(--flags --no-flags --no-kw --kw --no-keywords
 			--keywords)) {
 		ok($out{$sw}, "$sw offered as `lei import' completion");
@@ -161,93 +130,23 @@ my $test_completion = sub {
 
 my $test_fail = sub {
 SKIP: {
-	skip $req_sendcmd, 3 if $req_sendcmd;
+	skip 'no curl', 3 unless which('curl');
 	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m));
 	is($? >> 8, 3, 'got curl exit for bogus URL');
 	$lei->(qw(q --only http://127.0.0.1:99999/bogus/ t:m -o), "$home/junk");
 	is($? >> 8, 3, 'got curl exit for bogus URL with Maildir');
-	is($out, '', 'no output');
+	is($lei_out, '', 'no output');
 }; # /SKIP
 };
 
-my $test_lei_common = sub {
+test_lei(sub {
+	$home = $ENV{HOME};
+	$home_trash = [ "$home/.local", "$home/.config", "$home/junk" ];
 	$test_help->();
 	$test_config->();
 	$test_init->();
 	$test_completion->();
 	$test_fail->();
-};
-
-if ($ENV{TEST_LEI_ONESHOT}) {
-	require_ok 'PublicInbox::LEI';
-	# force sun_path[108] overflow, ($lei->() filters out this path)
-	my $xrd = "$home/1shot-test".('.sun_path' x 108);
-	local $ENV{XDG_RUNTIME_DIR} = $xrd;
-	$err_filter = qr!\Q$xrd!;
-	$test_lei_common->();
-} else {
-SKIP: { # real socket
-	skip $req_sendcmd, 115 if $req_sendcmd;
-	local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
-	my $sock = "$ENV{XDG_RUNTIME_DIR}/lei/5.seq.sock";
-	my $err_log = "$ENV{XDG_RUNTIME_DIR}/lei/errors.log";
-
-	ok($lei->('daemon-pid'), 'daemon-pid');
-	is($err, '', 'no error from daemon-pid');
-	like($out, qr/\A[0-9]+\n\z/s, 'pid returned') or BAIL_OUT;
-	chomp(my $pid = $out);
-	ok(kill(0, $pid), 'pid is valid');
-	ok(-S $sock, 'sock created');
-
-	$test_lei_common->();
-	is(-s $err_log, 0, 'nothing in errors.log');
-	open my $efh, '>>', $err_log or BAIL_OUT $!;
-	print $efh "phail\n" or BAIL_OUT $!;
-	close $efh or BAIL_OUT $!;
-
-	ok($lei->('daemon-pid'), 'daemon-pid');
-	chomp(my $pid_again = $out);
-	is($pid, $pid_again, 'daemon-pid idempotent');
-	like($err, qr/phail/, 'got mock "phail" error previous run');
-
-	ok($lei->(qw(daemon-kill)), 'daemon-kill');
-	is($out, '', 'no output from daemon-kill');
-	is($err, '', 'no error from daemon-kill');
-	for (0..100) {
-		kill(0, $pid) or last;
-		tick();
-	}
-	ok(-S $sock, 'sock still exists');
-	ok(!kill(0, $pid), 'pid gone after stop');
-
-	ok($lei->(qw(daemon-pid)), 'daemon-pid');
-	chomp(my $new_pid = $out);
-	ok(kill(0, $new_pid), 'new pid is running');
-	ok(-S $sock, 'sock still exists');
-
-	for my $sig (qw(-0 -CHLD)) {
-		ok($lei->('daemon-kill', $sig), "handles $sig");
-	}
-	is($out.$err, '', 'no output on innocuous signals');
-	ok($lei->('daemon-pid'), 'daemon-pid');
-	chomp $out;
-	is($out, $new_pid, 'PID unchanged after -0/-CHLD');
-
-	if ('socket inaccessible') {
-		chmod 0000, $sock or BAIL_OUT "chmod 0000: $!";
-		ok($lei->('help'), 'connect fail, one-shot fallback works');
-		like($err, qr/\bconnect\(/, 'connect error noted');
-		like($out, qr/^usage: /, 'help output works');
-		chmod 0700, $sock or BAIL_OUT "chmod 0700: $!";
-	}
-	unlink $sock or BAIL_OUT "unlink($sock) $!";
-	for (0..100) {
-		kill('CHLD', $new_pid) or last;
-		tick();
-	}
-	ok(!kill(0, $new_pid), 'daemon exits after unlink');
-	# success over socket, can't test without
-}; # SKIP
-} # else
+});
 
 done_testing;

^ permalink raw reply related	[relevance 46%]

* [PATCH 07/17] tests: add test_lei wrapper, split out t/lei-import.t
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-06 12:18 67% ` [PATCH 04/17] lei: abort lei_import worker on client abort Eric Wong
@ 2021-02-06 12:18 41% ` Eric Wong
  2021-02-06 12:18 24% ` [PATCH 08/17] t/lei-externals: split out into separate test Eric Wong
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

This will make it easier to maintain and test lei going forward.
We need to be testing against existing read-only daemons.  We'll
also save ourselves some boilerplate by exporting all the
Test::More methods directly in TestCommon.

We'll start using this by splitting out the latest "lei import"
tests into their own file.
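
For readers unfamiliar with the re-export trick, here is a minimal
sketch.  The package name My::TestHelper is hypothetical, and this
version uses glob assignment where the patch itself builds the
aliases with a string eval; the effect should be the same.  Note
that Test::More's export list contains '$TODO', so non-sub names
are filtered out first:

```perl
package My::TestHelper; # hypothetical stand-in for TestCommon
use strict;
use warnings;
use parent 'Exporter';
our @EXPORT;
BEGIN {
	require Test::More;
	# keep only plain sub names (drops '$TODO')
	my @subs = grep { !/\W/ } @Test::More::EXPORT;
	no strict 'refs';
	# alias each Test::More sub into this package
	*{__PACKAGE__."::$_"} = \&{"Test::More::$_"} for @subs;
	@EXPORT = @subs;
}

package main;
BEGIN { My::TestHelper->import } # same effect as "use My::TestHelper"

# callers now get is(), ok(), BAIL_OUT(), etc. from a single import
my $has_ok = defined(&main::ok) ? 1 : 0;
my $has_bail = defined(&main::BAIL_OUT) ? 1 : 0;
```

A test script then only needs one "use" line for both the helper
functions and the Test::More API.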
---
 MANIFEST                      |  1 +
 lib/PublicInbox/TestCommon.pm | 93 ++++++++++++++++++++++++++++++++---
 t/lei-import.t                | 39 +++++++++++++++
 t/lei.t                       | 35 -------------
 4 files changed, 127 insertions(+), 41 deletions(-)
 create mode 100644 t/lei-import.t

diff --git a/MANIFEST b/MANIFEST
index a11d4106..3bece258 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -351,6 +351,7 @@ t/init.t
 t/ipc.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-import.t
 t/lei-oneshot.t
 t/lei.t
 t/lei_dedupe.t
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 40c2dc9e..2b78731b 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -9,14 +9,17 @@ use v5.10.1;
 use Fcntl qw(FD_CLOEXEC F_SETFD F_GETFD :seek);
 use POSIX qw(dup2);
 use IO::Socket::INET;
-our @EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
-	run_script start_script key2sub xsys xsys_e xqx eml_load tick
-	have_xapian_compact);
+our @EXPORT;
 BEGIN {
+	@EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
+		run_script start_script key2sub xsys xsys_e xqx eml_load tick
+		have_xapian_compact json_utf8
+		test_lei $lei $lei_out $lei_err $lei_opt);
 	require Test::More;
-	*BAIL_OUT = \&Test::More::BAIL_OUT;
-	*plan = \&Test::More::plan;
-	*skip = \&Test::More::skip;
+	my @methods = grep(!/\W/, @Test::More::EXPORT);
+	eval(join('', map { "*$_=\\&Test::More::$_;" } @methods));
+	die $@ if $@;
+	push @EXPORT, @methods;
 }
 
 sub eml_load ($) {
@@ -419,6 +422,84 @@ sub have_xapian_compact () {
 	PublicInbox::Spawn::which($ENV{XAPIAN_COMPACT} || 'xapian-compact');
 }
 
+our ($err_skip, $lei_opt, $lei_out, $lei_err);
+our $lei = sub {
+	my ($cmd, $env, $xopt) = @_;
+	$lei_out = $lei_err = '';
+	if (!ref($cmd)) {
+		($env, $xopt) = grep { (!defined) || ref } @_;
+		$cmd = [ grep { defined && !ref } @_ ];
+	}
+	my $res = run_script(['lei', @$cmd], $env, $xopt // $lei_opt);
+	$err_skip and
+		$lei_err = join('', grep(!/$err_skip/, split(/^/m, $lei_err)));
+	$res;
+};
+
+sub json_utf8 () {
+	state $x = ref(PublicInbox::Config->json)->new->utf8->canonical;
+}
+
+sub test_lei {
+SKIP: {
+	my ($cb) = pop @_;
+	my $test_opt = shift // {};
+	require_git(2.6) or skip('git 2.6+ required for lei test', 2);
+	require_mods(qw(json DBD::SQLite Search::Xapian), 2);
+	require PublicInbox::Config;
+	delete local $ENV{XDG_DATA_HOME};
+	delete local $ENV{XDG_CONFIG_HOME};
+	local $ENV{GIT_COMMITTER_EMAIL} = 'lei@example.com';
+	local $ENV{GIT_COMMITTER_NAME} = 'lei user';
+	my (undef, $fn, $lineno) = caller(0);
+	my $t = "$fn:$lineno";
+	require PublicInbox::Spawn;
+	state $lei_daemon = PublicInbox::Spawn->can('send_cmd4') ||
+				eval { require Socket::MsgHdr; 1 };
+	$lei_opt = { 1 => \$lei_out, 2 => \$lei_err };
+	my $daemon_pid;
+	my ($tmpdir, $for_destroy) = tmpdir();
+	SKIP: {
+		skip <<'EOM', 1 unless $lei_daemon;
+Socket::MsgHdr missing or Inline::C is unconfigured/missing
+EOM
+		my $home = "$tmpdir/lei-daemon";
+		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
+		local $ENV{HOME} = $home;
+		my $xrd = "$home/xdg_run";
+		mkdir($xrd, 0700) or BAIL_OUT "mkdir: $!";
+		local $ENV{XDG_RUNTIME_DIR} = $xrd;
+		$cb->();
+		ok($lei->(qw(daemon-pid)), "daemon-pid after $t");
+		chomp($daemon_pid = $lei_out);
+		if ($daemon_pid) {
+			ok(kill(0, $daemon_pid), "daemon running after $t");
+			ok($lei->(qw(daemon-kill)), "daemon-kill after $t");
+		} else {
+			fail("daemon not running after $t");
+		}
+	}; # SKIP for lei_daemon
+	unless ($test_opt->{daemon_only}) {
+		require_ok 'PublicInbox::LEI';
+		my $home = "$tmpdir/lei-oneshot";
+		mkdir($home, 0700) or BAIL_OUT "mkdir: $!";
+		local $ENV{HOME} = $home;
+		# force sun_path[108] overflow:
+		my $xrd = "$home/1shot-test".('.sun_path' x 108);
+		local $err_skip = qr!\Q$xrd!; # for $lei->() filtering
+		local $ENV{XDG_RUNTIME_DIR} = $xrd;
+		$cb->();
+	}
+	if ($daemon_pid) {
+		for (0..10) {
+			kill(0, $daemon_pid) or last;
+			tick;
+		}
+		ok(!kill(0, $daemon_pid), "$t daemon stopped after oneshot");
+	}
+}; # SKIP if missing git 2.6+ || Xapian || SQLite || json
+}
+
 package PublicInboxTestProcess;
 use strict;
 
diff --git a/t/lei-import.t b/t/lei-import.t
new file mode 100644
index 00000000..709d89fa
--- /dev/null
+++ b/t/lei-import.t
@@ -0,0 +1,39 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+test_lei(sub {
+
+ok($lei->(qw(q s:boolean)), 'search miss before import');
+unlike($lei_out, qr/boolean/i, 'no results, yet');
+open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
+ok($lei->([qw(import -f eml -)], undef, { %$lei_opt, 0 => $fh }),
+	'import single file from stdin');
+close $fh;
+ok($lei->(qw(q s:boolean)), 'search hit after import');
+ok($lei->(qw(import -f eml), 't/data/message_embed.eml'),
+	'import single file by path');
+
+my $str = <<'';
+From: a@b
+Message-ID: <x@y>
+Status: RO
+
+my $opt = { %$lei_opt, 0 => \$str };
+ok($lei->([qw(import -f eml -)], undef, $opt),
+	'import single file with keywords from stdin');
+$lei->(qw(q m:x@y));
+my $res = json_utf8->decode($lei_out);
+is($res->[1], undef, 'only one result');
+is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
+
+$str =~ tr/x/v/; # v@y
+ok($lei->([qw(import --no-kw -f eml -)], undef, $opt),
+	'import single file with --no-kw from stdin');
+$lei->(qw(q m:v@y));
+$res = json_utf8->decode($lei_out);
+is($res->[1], undef, 'only one result');
+is_deeply($res->[0]->{kw}, [], 'no keywords set');
+
+});
+done_testing;
diff --git a/t/lei.t b/t/lei.t
index df333957..9f92d895 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -41,7 +41,6 @@ local $ENV{GIT_COMMITTER_EMAIL} = 'lei@example.com';
 local $ENV{GIT_COMMITTER_NAME} = 'lei user';
 local $ENV{XDG_RUNTIME_DIR} = "$home/xdg_run";
 local $ENV{HOME} = $home;
-local $ENV{FOO} = 'BAR';
 mkdir "$home/xdg_run", 0700 or BAIL_OUT "mkdir: $!";
 my $home_trash = [ "$home/.local", "$home/.config", "$home/junk" ];
 my $cleanup = sub { rmtree([@$home_trash, @_]) };
@@ -395,39 +394,6 @@ SKIP: {
 }; # /SKIP
 };
 
-my $test_import = sub {
-	$cleanup->();
-	ok($lei->(qw(q s:boolean)), 'search miss before import');
-	unlike($out, qr/boolean/i, 'no results, yet');
-	open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
-	ok($lei->([qw(import -f eml -)], undef, { %$opt, 0 => $fh }),
-		'import single file from stdin');
-	close $fh;
-	ok($lei->(qw(q s:boolean)), 'search hit after import');
-	ok($lei->(qw(import -f eml), 't/data/message_embed.eml'),
-		'import single file by path');
-
-	my $str = <<'';
-From: a@b
-Message-ID: <x@y>
-Status: RO
-
-	ok($lei->([qw(import -f eml -)], undef, { %$opt, 0 => \$str }),
-		'import single file with keywords from stdin');
-	$lei->(qw(q m:x@y));
-	my $res = $json->decode($out);
-	is($res->[1], undef, 'only one result');
-	is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
-
-	$str =~ tr/x/v/; # v@y
-	ok($lei->([qw(import --no-kw -f eml -)], undef, { %$opt, 0 => \$str }),
-		'import single file with --no-kw from stdin');
-	$lei->(qw(q m:v@y));
-	$res = $json->decode($out);
-	is($res->[1], undef, 'only one result');
-	is_deeply($res->[0]->{kw}, [], 'no keywords set');
-};
-
 my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
@@ -435,7 +401,6 @@ my $test_lei_common = sub {
 	$test_external->();
 	$test_completion->();
 	$test_fail->();
-	$test_import->();
 };
 
 if ($ENV{TEST_LEI_ONESHOT}) {

^ permalink raw reply related	[relevance 41%]

* [PATCH 08/17] t/lei-externals: split out into separate test
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (3 preceding siblings ...)
  2021-02-06 12:18 41% ` [PATCH 07/17] tests: add test_lei wrapper, split out t/lei-import.t Eric Wong
@ 2021-02-06 12:18 24% ` Eric Wong
  2021-02-06 12:18 46% ` [PATCH 10/17] tests: split out lei-daemon.t from lei.t Eric Wong
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

This is still overloaded with "lei q" stuff, but that's
somewhat inevitable.
---
 MANIFEST          |   1 +
 t/lei-externals.t | 231 ++++++++++++++++++++++++++++++++++++++++++++++
 t/lei.t           | 225 --------------------------------------------
 3 files changed, 232 insertions(+), 225 deletions(-)
 create mode 100644 t/lei-externals.t

diff --git a/MANIFEST b/MANIFEST
index 3bece258..c7fe4fb5 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -351,6 +351,7 @@ t/init.t
 t/ipc.t
 t/iso-2202-jp.eml
 t/kqnotify.t
+t/lei-externals.t
 t/lei-import.t
 t/lei-oneshot.t
 t/lei.t
diff --git a/t/lei-externals.t b/t/lei-externals.t
new file mode 100644
index 00000000..739f779d
--- /dev/null
+++ b/t/lei-externals.t
@@ -0,0 +1,231 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use Fcntl qw(SEEK_SET);
+use PublicInbox::Spawn qw(which);
+
+my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
+	http://czquwvybam4bgbro.onion/meta/
+	http://ou63pmih66umazou.onion/meta/);
+
+# TODO share this across tests, it takes ~300ms
+my $setup_publicinboxes = sub {
+	my ($home) = @_;
+	use PublicInbox::InboxWritable;
+	for my $V (1, 2) {
+		run_script([qw(-init), "-V$V", "t$V",
+				'--newsgroup', "t.$V",
+				"$home/t$V", "http://example.com/t$V",
+				"t$V\@example.com" ]) or BAIL_OUT "init v$V";
+	}
+	my $cfg = PublicInbox::Config->new;
+	my $seen = 0;
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		my $im = PublicInbox::InboxWritable->new($ibx)->importer(0);
+		my $V = $ibx->version;
+		my @eml = (glob('t/*.eml'), 't/data/0001.patch');
+		for (@eml) {
+			next if $_ eq 't/psgi_v2-old.eml'; # dup mid
+			$im->add(eml_load($_)) or BAIL_OUT "v$V add $_";
+			$seen++;
+		}
+		$im->done;
+		if ($V == 1) {
+			run_script(['-index', $ibx->{inboxdir}]) or
+				BAIL_OUT 'index v1';
+		}
+	});
+	$seen || BAIL_OUT 'no imports';
+};
+
+my $test_external_remote = sub {
+	my ($url, $k) = @_;
+SKIP: {
+	my $nr = 5;
+	skip "$k unset", $nr if !$url;
+	which('curl') or skip 'no curl', $nr;
+	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
+	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
+	my @cmd = ('q', '--only', $url, '-q', "m:$mid");
+	ok($lei->(@cmd), "query $url");
+	is($lei_err, '', "no errors on $url");
+	my $res = json_utf8->decode($lei_out);
+	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
+	ok($lei->(@cmd, 'd:..20101002'), 'no results, no error');
+	is($lei_err, '', 'no output on 404, matching local FS behavior');
+	is($lei_out, "[null]\n", 'got null results');
+} # /SKIP
+}; # /sub
+
+test_lei(sub {
+	my $home = $ENV{HOME};
+	$setup_publicinboxes->($home);
+	my $config_file = "$home/.config/lei/config";
+	my $store_dir = "$home/.local/share/lei";
+	ok($lei->('ls-external'), 'ls-external works');
+	is($lei_out.$lei_err, '', 'ls-external no output, yet');
+	ok(!-e $config_file && !-e $store_dir,
+		'nothing created by ls-external');
+
+	ok(!$lei->('add-external', "$home/nonexistent"),
+		"fails on non-existent dir");
+	ok($lei->('ls-external'), 'ls-external works after add failure');
+	is($lei_out.$lei_err, '', 'ls-external still has no output');
+	my $cfg = PublicInbox::Config->new;
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		ok($lei->(qw(add-external -q), $ibx->{inboxdir}),
+			'added external');
+		is($lei_out.$lei_err, '', 'no output');
+	});
+	ok(-s $config_file && -e $store_dir,
+		'add-external created config + store');
+	my $lcfg = PublicInbox::Config->new($config_file);
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		is($lcfg->{"external.$ibx->{inboxdir}.boost"}, 0,
+			"configured boost on $ibx->{name}");
+	});
+	$lei->('ls-external');
+	like($lei_out, qr/boost=0\n/s, 'ls-external has output');
+	ok($lei->(qw(add-external -q https://EXAMPLE.com/ibx)), 'add remote');
+	is($lei_err, '', 'no warnings after add-external');
+
+	ok($lei->(qw(_complete lei forget-external)), 'complete for externals');
+	my %comp = map { $_ => 1 } split(/\s+/, $lei_out);
+	ok($comp{'https://example.com/ibx/'}, 'forget external completion');
+	$cfg->each_inbox(sub {
+		my ($ibx) = @_;
+		ok($comp{$ibx->{inboxdir}}, "local $ibx->{name} completion");
+	});
+	for my $u (qw(h http https https: https:/ https:// https://e
+			https://example https://example. https://example.co
+			https://example.com https://example.com/
+			https://example.com/i https://example.com/ibx)) {
+		ok($lei->(qw(_complete lei forget-external), $u),
+			"partial completion for URL $u");
+		is($lei_out, "https://example.com/ibx/\n",
+			"completed partial URL $u");
+		for my $qo (qw(-I --include --exclude --only)) {
+			ok($lei->(qw(_complete lei q), $qo, $u),
+				"partial completion for URL q $qo $u");
+			is($lei_out, "https://example.com/ibx/\n",
+				"completed partial URL $u on q $qo");
+		}
+	}
+	ok($lei->(qw(_complete lei add-external), 'https://'),
+		'add-external hostname completion');
+	is($lei_out, "https://example.com/\n", 'completed up to hostname');
+
+	$lei->('ls-external');
+	like($lei_out, qr!https://example\.com/ibx/!s, 'added canonical URL');
+	is($lei_err, '', 'no warnings on ls-external');
+	ok($lei->(qw(forget-external -q https://EXAMPLE.com/ibx)),
+		'forget');
+	$lei->('ls-external');
+	unlike($lei_out, qr!https://example\.com/ibx/!s,
+		'removed canonical URL');
+SKIP: {
+	ok(!$lei->(qw(q s:prefix -o /dev/null -f maildir)), 'bad maildir');
+	like($lei_err, qr!/dev/null exists and is not a directory!,
+		'error shown');
+	is($? >> 8, 1, 'errored out with exit 1');
+
+	ok(!$lei->(qw(q s:prefix -f mboxcl2 -o), $home), 'bad mbox');
+	like($lei_err, qr!\Q$home\E exists and is not a writable file!,
+		'error shown');
+	is($? >> 8, 1, 'errored out with exit 1');
+
+	ok(!$lei->(qw(q s:prefix -o /dev/stdout -f Mbox2)), 'bad format');
+	like($lei_err, qr/bad mbox --format=mbox2/, 'error shown');
+	is($? >> 8, 1, 'errored out with exit 1');
+
+	# note, on a Bourne shell users should be able to use either:
+	#	s:"use boolean prefix"
+	#	"s:use boolean prefix"
+	# or use single quotes, it should not matter.  Users only need
+	# to know shell quoting rules, not Xapian quoting rules.
+	# No double-quoting should be imposed on users on the CLI
+	$lei->('q', 's:use boolean prefix');
+	like($lei_out, qr/search: use boolean prefix/,
+		'phrase search got result');
+	my $res = json_utf8->decode($lei_out);
+	is(scalar(@$res), 2, 'only 2 element array (1 result)');
+	is($res->[1], undef, 'final element is undef'); # XXX should this be?
+	is(ref($res->[0]), 'HASH', 'first element is hashref');
+	$lei->('q', '--pretty', 's:use boolean prefix');
+	my $pretty = json_utf8->decode($lei_out);
+	is_deeply($res, $pretty, '--pretty is identical after decode');
+
+	{
+		open my $fh, '+>', undef or BAIL_OUT $!;
+		$fh->autoflush(1);
+		print $fh 's:use' or BAIL_OUT $!;
+		seek($fh, 0, SEEK_SET) or BAIL_OUT $!;
+		ok($lei->([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $fh }),
+				'--stdin on regular file works');
+		like($lei_out, qr/use boolean/, '--stdin on regular file');
+	}
+	{
+		pipe(my ($r, $w)) or BAIL_OUT $!;
+		print $w 's:use' or BAIL_OUT $!;
+		close $w or BAIL_OUT $!;
+		ok($lei->([qw(q -q --stdin)], undef, { %$lei_opt, 0 => $r }),
+				'--stdin on pipe file works');
+		like($lei_out, qr/use boolean prefix/, '--stdin on pipe');
+	}
+	ok(!$lei->(qw(q -q --stdin s:use)), "--stdin and argv don't mix");
+
+	for my $fmt (qw(ldjson ndjson jsonl)) {
+		$lei->('q', '-f', $fmt, 's:use boolean prefix');
+		is($lei_out, json_utf8->encode($pretty->[0])."\n", "-f $fmt");
+	}
+
+	require IO::Uncompress::Gunzip;
+	for my $sfx ('', '.gz') {
+		my $f = "$home/mbox$sfx";
+		$lei->('q', '-o', "mboxcl2:$f", 's:use boolean prefix');
+		my $cat = $sfx eq '' ? sub {
+			open my $mb, '<', $f or fail "no mbox: $!";
+			<$mb>
+		} : sub {
+			my $z = IO::Uncompress::Gunzip->new($f, MultiStream=>1);
+			<$z>;
+		};
+		my @s = grep(/^Subject:/, $cat->());
+		is(scalar(@s), 1, "1 result in mbox$sfx");
+		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
+		is(grep(!/^#/, $lei_err), 0, 'no errors from augment');
+		@s = grep(/^Subject:/, my @wtf = $cat->());
+		is(scalar(@s), 2, "2 results in mbox$sfx");
+
+		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
+		is(grep(!/^#/, $lei_err), 0, "no errors on no results ($sfx)");
+
+		my @s2 = grep(/^Subject:/, $cat->());
+		is_deeply(\@s2, \@s,
+			"same 2 old results w/ --augment and bad search $sfx");
+
+		$lei->('q', '-o', "mboxcl2:$f", 's:nonexistent');
+		my @res = $cat->();
+		is_deeply(\@res, [], "clobber w/o --augment $sfx");
+	}
+	ok(!$lei->('q', '-o', "$home/mbox", 's:nope'),
+			'fails if mbox format unspecified');
+	ok(!$lei->(qw(q --no-local s:see)), '--no-local');
+	is($? >> 8, 1, 'proper exit code');
+	like($lei_err, qr/no local or remote.+? to search/, 'no inbox');
+	my %e = (
+		TEST_LEI_EXTERNAL_HTTPS => 'https://public-inbox.org/meta/',
+		TEST_LEI_EXTERNAL_ONION => $onions[int(rand(scalar(@onions)))],
+	);
+	for my $k (keys %e) {
+		my $url = $ENV{$k} // '';
+		$url = $e{$k} if $url eq '1';
+		$test_external_remote->($url, $k);
+	}
+	}; # /SKIP
+}); # test_lei
+done_testing;
diff --git a/t/lei.t b/t/lei.t
index 9f92d895..cfcdafb9 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -7,7 +7,6 @@ use Test::More;
 use PublicInbox::TestCommon;
 use PublicInbox::Config;
 use File::Path qw(rmtree);
-use Fcntl qw(SEEK_SET);
 use PublicInbox::Spawn qw(which);
 my $req_sendcmd = 'Socket::MsgHdr or Inline::C missing or unconfigured';
 undef($req_sendcmd) if PublicInbox::Spawn->can('send_cmd4');
@@ -18,9 +17,6 @@ my $opt = { 1 => \(my $out = ''), 2 => \(my $err = '') };
 my ($home, $for_destroy) = tmpdir();
 my $err_filter;
 my $curl = which('curl');
-my @onions = qw(http://hjrcffqmbrq6wope.onion/meta/
-	http://czquwvybam4bgbro.onion/meta/
-	http://ou63pmih66umazou.onion/meta/);
 my $json = ref(PublicInbox::Config->json)->new->utf8->canonical;
 my $lei = sub {
 	my ($cmd, $env, $xopt) = @_;
@@ -130,226 +126,6 @@ my $test_config = sub {
 	ok(!-f "$home/config/f", 'no file created');
 };
 
-my $setup_publicinboxes = sub {
-	state $done = '';
-	return if $done eq $home;
-	use PublicInbox::InboxWritable;
-	for my $V (1, 2) {
-		run_script([qw(-init), "-V$V", "t$V",
-				'--newsgroup', "t.$V",
-				"$home/t$V", "http://example.com/t$V",
-				"t$V\@example.com" ]) or BAIL_OUT "init v$V";
-	}
-	my $cfg = PublicInbox::Config->new;
-	my $seen = 0;
-	$cfg->each_inbox(sub {
-		my ($ibx) = @_;
-		my $im = PublicInbox::InboxWritable->new($ibx)->importer(0);
-		my $V = $ibx->version;
-		my @eml = (glob('t/*.eml'), 't/data/0001.patch');
-		for (@eml) {
-			next if $_ eq 't/psgi_v2-old.eml'; # dup mid
-			$im->add(eml_load($_)) or BAIL_OUT "v$V add $_";
-			$seen++;
-		}
-		$im->done;
-		if ($V == 1) {
-			run_script(['-index', $ibx->{inboxdir}]) or
-				BAIL_OUT 'index v1';
-		}
-	});
-	$done = $home;
-	$seen || BAIL_OUT 'no imports';
-};
-
-my $test_external_remote = sub {
-	my ($url, $k) = @_;
-SKIP: {
-	my $nr = 5;
-	skip "$k unset", $nr if !$url;
-	skip $req_sendcmd, $nr if $req_sendcmd;
-	$curl or skip 'no curl', $nr;
-	which('torsocks') or skip 'no torsocks', $nr if $url =~ m!\.onion/!;
-	my $mid = '20140421094015.GA8962@dcvr.yhbt.net';
-	my @cmd = ('q', '--only', $url, '-q', "m:$mid");
-	ok($lei->(@cmd), "query $url");
-	is($err, '', "no errors on $url");
-	my $res = $json->decode($out);
-	is($res->[0]->{'m'}, "<$mid>", "got expected mid from $url");
-	ok($lei->(@cmd, 'd:..20101002'), 'no results, no error');
-	is($err, '', 'no output on 404, matching local FS behavior');
-	is($out, "[null]\n", 'got null results');
-} # /SKIP
-}; # /sub
-
-my $test_external = sub {
-	$setup_publicinboxes->();
-	$cleanup->();
-	$lei->('ls-external');
-	is($out.$err, '', 'ls-external no output, yet');
-	ok(!-e $config_file && !-e $store_dir,
-		'nothing created by ls-external');
-
-	ok(!$lei->('add-external', "$home/nonexistent"),
-		"fails on non-existent dir");
-	$lei->('ls-external');
-	is($out.$err, '', 'ls-external still has no output');
-	my $cfg = PublicInbox::Config->new;
-	$cfg->each_inbox(sub {
-		my ($ibx) = @_;
-		ok($lei->(qw(add-external -q), $ibx->{inboxdir}),
-			'added external');
-		is($out.$err, '', 'no output');
-	});
-	ok(-s $config_file && -e $store_dir,
-		'add-external created config + store');
-	my $lcfg = PublicInbox::Config->new($config_file);
-	$cfg->each_inbox(sub {
-		my ($ibx) = @_;
-		is($lcfg->{"external.$ibx->{inboxdir}.boost"}, 0,
-			"configured boost on $ibx->{name}");
-	});
-	$lei->('ls-external');
-	like($out, qr/boost=0\n/s, 'ls-external has output');
-	ok($lei->(qw(add-external -q https://EXAMPLE.com/ibx)), 'add remote');
-	is($err, '', 'no warnings after add-external');
-
-	ok($lei->(qw(_complete lei forget-external)), 'complete for externals');
-	my %comp = map { $_ => 1 } split(/\s+/, $out);
-	ok($comp{'https://example.com/ibx/'}, 'forget external completion');
-	$cfg->each_inbox(sub {
-		my ($ibx) = @_;
-		ok($comp{$ibx->{inboxdir}}, "local $ibx->{name} completion");
-	});
-	for my $u (qw(h http https https: https:/ https:// https://e
-			https://example https://example. https://example.co
-			https://example.com https://example.com/
-			https://example.com/i https://example.com/ibx)) {
-		ok($lei->(qw(_complete lei forget-external), $u),
-			"partial completion for URL $u");
-		is($out, "https://example.com/ibx/\n",
-			"completed partial URL $u");
-		for my $qo (qw(-I --include --exclude --only)) {
-			ok($lei->(qw(_complete lei q), $qo, $u),
-				"partial completion for URL q $qo $u");
-			is($out, "https://example.com/ibx/\n",
-				"completed partial URL $u on q $qo");
-		}
-	}
-	ok($lei->(qw(_complete lei add-external), 'https://'),
-		'add-external hostname completion');
-	is($out, "https://example.com/\n", 'completed up to hostname');
-
-	$lei->('ls-external');
-	like($out, qr!https://example\.com/ibx/!s, 'added canonical URL');
-	is($err, '', 'no warnings on ls-external');
-	ok($lei->(qw(forget-external -q https://EXAMPLE.com/ibx)),
-		'forget');
-	$lei->('ls-external');
-	unlike($out, qr!https://example\.com/ibx/!s, 'removed canonical URL');
-
-SKIP: {
-	skip $req_sendcmd, 52 if $req_sendcmd;
-	ok(!$lei->(qw(q s:prefix -o /dev/null -f maildir)), 'bad maildir');
-	like($err, qr!/dev/null exists and is not a directory!,
-		'error shown');
-	is($? >> 8, 1, 'errored out with exit 1');
-
-	ok(!$lei->(qw(q s:prefix -f mboxcl2 -o), $home), 'bad mbox');
-	like($err, qr!\Q$home\E exists and is not a writable file!,
-		'error shown');
-	is($? >> 8, 1, 'errored out with exit 1');
-
-	ok(!$lei->(qw(q s:prefix -o /dev/stdout -f Mbox2)), 'bad format');
-	like($err, qr/bad mbox --format=mbox2/, 'error shown');
-	is($? >> 8, 1, 'errored out with exit 1');
-
-	# note, on a Bourne shell users should be able to use either:
-	#	s:"use boolean prefix"
-	#	"s:use boolean prefix"
-	# or use single quotes, it should not matter.  Users only need
-	# to know shell quoting rules, not Xapian quoting rules.
-	# No double-quoting should be imposed on users on the CLI
-	$lei->('q', 's:use boolean prefix');
-	like($out, qr/search: use boolean prefix/, 'phrase search got result');
-	my $res = $json->decode($out);
-	is(scalar(@$res), 2, 'only 2 element array (1 result)');
-	is($res->[1], undef, 'final element is undef'); # XXX should this be?
-	is(ref($res->[0]), 'HASH', 'first element is hashref');
-	$lei->('q', '--pretty', 's:use boolean prefix');
-	my $pretty = $json->decode($out);
-	is_deeply($res, $pretty, '--pretty is identical after decode');
-
-	{
-		open my $fh, '+>', undef or BAIL_OUT $!;
-		$fh->autoflush(1);
-		print $fh 's:use' or BAIL_OUT $!;
-		seek($fh, 0, SEEK_SET) or BAIL_OUT $!;
-		ok($lei->([qw(q -q --stdin)], undef, { %$opt, 0 => $fh }),
-				'--stdin on regular file works');
-		like($out, qr/use boolean prefix/, '--stdin on regular file');
-	}
-	{
-		pipe(my ($r, $w)) or BAIL_OUT $!;
-		print $w 's:use' or BAIL_OUT $!;
-		close $w or BAIL_OUT $!;
-		ok($lei->([qw(q -q --stdin)], undef, { %$opt, 0 => $r }),
-				'--stdin on pipe file works');
-		like($out, qr/use boolean prefix/, '--stdin on pipe');
-	}
-	ok(!$lei->(qw(q -q --stdin s:use)), "--stdin and argv don't mix");
-
-	for my $fmt (qw(ldjson ndjson jsonl)) {
-		$lei->('q', '-f', $fmt, 's:use boolean prefix');
-		is($out, $json->encode($pretty->[0])."\n", "-f $fmt");
-	}
-
-	require IO::Uncompress::Gunzip;
-	for my $sfx ('', '.gz') {
-		my $f = "$home/mbox$sfx";
-		$lei->('q', '-o', "mboxcl2:$f", 's:use boolean prefix');
-		my $cat = $sfx eq '' ? sub {
-			open my $mb, '<', $f or fail "no mbox: $!";
-			<$mb>
-		} : sub {
-			my $z = IO::Uncompress::Gunzip->new($f, MultiStream=>1);
-			<$z>;
-		};
-		my @s = grep(/^Subject:/, $cat->());
-		is(scalar(@s), 1, "1 result in mbox$sfx");
-		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:see attachment');
-		is(grep(!/^#/, $err), 0, 'no errors from augment');
-		@s = grep(/^Subject:/, my @wtf = $cat->());
-		is(scalar(@s), 2, "2 results in mbox$sfx");
-
-		$lei->('q', '-a', '-o', "mboxcl2:$f", 's:nonexistent');
-		is(grep(!/^#/, $err), 0, "no errors on no results ($sfx)");
-
-		my @s2 = grep(/^Subject:/, $cat->());
-		is_deeply(\@s2, \@s,
-			"same 2 old results w/ --augment and bad search $sfx");
-
-		$lei->('q', '-o', "mboxcl2:$f", 's:nonexistent');
-		my @res = $cat->();
-		is_deeply(\@res, [], "clobber w/o --augment $sfx");
-	}
-	ok(!$lei->('q', '-o', "$home/mbox", 's:nope'),
-			'fails if mbox format unspecified');
-	ok(!$lei->(qw(q --no-local s:see)), '--no-local');
-	is($? >> 8, 1, 'proper exit code');
-	like($err, qr/no local or remote.+? to search/, 'no inbox');
-	my %e = (
-		TEST_LEI_EXTERNAL_HTTPS => 'https://public-inbox.org/meta/',
-		TEST_LEI_EXTERNAL_ONION => $onions[int(rand(scalar(@onions)))],
-	);
-	for my $k (keys %e) {
-		my $url = $ENV{$k} // '';
-		$url = $e{$k} if $url eq '1';
-		$test_external_remote->($url, $k);
-	}
-	}; # /SKIP
-};
-
 my $test_completion = sub {
 	ok($lei->(qw(_complete lei)), 'no errors on complete');
 	my %out = map { $_ => 1 } split(/\s+/s, $out);
@@ -398,7 +174,6 @@ my $test_lei_common = sub {
 	$test_help->();
 	$test_config->();
 	$test_init->();
-	$test_external->();
 	$test_completion->();
 	$test_fail->();
 };

^ permalink raw reply related	[relevance 24%]

* [PATCH 13/17] lei: add-external --mirror support
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (6 preceding siblings ...)
  2021-02-06 12:18 66% ` [PATCH 12/17] script/lei: avoid waitpid(-1, ...) to keep tests fast Eric Wong
@ 2021-02-06 12:18 21% ` Eric Wong
  2021-02-06 12:18 26% ` [PATCH 14/17] lei help: split out into separate file Eric Wong
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

This can be useful for users who want to clone and
mirror an existing public-inbox.  This doesn't have
update support, yet, so users will need to run
"git fetch && public-inbox-index" for now.
---
 MANIFEST                               |   3 +
 contrib/completion/lei-completion.bash |   2 +-
 lib/PublicInbox/Admin.pm               |   7 +-
 lib/PublicInbox/LEI.pm                 |  17 +-
 lib/PublicInbox/LeiCurl.pm             |  65 ++++++
 lib/PublicInbox/LeiExternal.pm         |  28 ++-
 lib/PublicInbox/LeiMirror.pm           | 288 +++++++++++++++++++++++++
 lib/PublicInbox/LeiXSearch.pm          |  33 +--
 lib/PublicInbox/TestCommon.pm          |   5 +-
 t/lei-mirror.t                         |  24 +++
 10 files changed, 427 insertions(+), 45 deletions(-)
 create mode 100644 lib/PublicInbox/LeiCurl.pm
 create mode 100644 lib/PublicInbox/LeiMirror.pm
 create mode 100644 t/lei-mirror.t

diff --git a/MANIFEST b/MANIFEST
index 52dea385..4236f87c 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -177,9 +177,11 @@ lib/PublicInbox/InputPipe.pm
 lib/PublicInbox/Isearch.pm
 lib/PublicInbox/KQNotify.pm
 lib/PublicInbox/LEI.pm
+lib/PublicInbox/LeiCurl.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
 lib/PublicInbox/LeiImport.pm
+lib/PublicInbox/LeiMirror.pm
 lib/PublicInbox/LeiOverview.pm
 lib/PublicInbox/LeiQuery.pm
 lib/PublicInbox/LeiSearch.pm
@@ -357,6 +359,7 @@ t/kqnotify.t
 t/lei-daemon.t
 t/lei-externals.t
 t/lei-import.t
+t/lei-mirror.t
 t/lei.t
 t/lei_dedupe.t
 t/lei_external.t
diff --git a/contrib/completion/lei-completion.bash b/contrib/completion/lei-completion.bash
index fbda474c..619805fb 100644
--- a/contrib/completion/lei-completion.bash
+++ b/contrib/completion/lei-completion.bash
@@ -5,7 +5,7 @@
 # Needs a lot of work, see `lei__complete' in lib/PublicInbox::LEI.pm
 _lei() {
 	case ${COMP_WORDS[@]} in
-	*' add-external http'*)
+	*' add-external h'* | *' --mirror h'*)
 		compopt -o nospace
 		;;
 	*) compopt +o nospace ;; # the default
diff --git a/lib/PublicInbox/Admin.pm b/lib/PublicInbox/Admin.pm
index 3b38a5a3..b21fb241 100644
--- a/lib/PublicInbox/Admin.pm
+++ b/lib/PublicInbox/Admin.pm
@@ -273,8 +273,8 @@ EOM
 	$idx->{nidx} // 0; # returns number processed
 }
 
-sub progress_prepare ($) {
-	my ($opt) = @_;
+sub progress_prepare ($;$) {
+	my ($opt, $dst) = @_;
 
 	# public-inbox-index defaults to quiet, -xcpdb and -compact do not
 	if (defined($opt->{quiet}) && $opt->{quiet} < 0) {
@@ -286,7 +286,8 @@ sub progress_prepare ($) {
 		$opt->{1} = $null; # suitable for spawn() redirect
 	} else {
 		$opt->{verbose} ||= 1;
-		$opt->{-progress} = sub { print STDERR @_ };
+		$dst //= *STDERR{GLOB};
+		$opt->{-progress} = sub { print $dst @_ };
 	}
 }
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 28ad88e7..bdeab7e3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -98,6 +98,13 @@ sub _config_path ($) {
 		.'/lei/config');
 }
 
+sub index_opt {
+	# TODO: drop underscore variants everywhere, they're undocumented
+	qw(fsync|sync! jobs|j=i indexlevel|index-level|L=s compact+
+	max_size|max-size=s sequential_shard|sequential-shard
+	batch_size|batch-size=s skip-docdata quiet|q verbose|v+)
+}
+
 # TODO: generate shell completion + help using %CMD and %OPTDESC
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
@@ -105,7 +112,7 @@ our %CMD = ( # sorted in order of importance/use:
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	mua-cmd|mua=s no-torsocks torsocks=s verbose|v quiet|q
+	mua-cmd|mua=s no-torsocks torsocks=s verbose|v+ quiet|q
 	received-after=s received-before=s sent-after=s sent-since=s),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
@@ -115,7 +122,8 @@ our %CMD = ( # sorted in order of importance/use:
 
 'add-external' => [ 'URL_OR_PATHNAME',
 	'add/set priority of a publicinbox|extindex for extra matches',
-	qw(boost=i quiet|q) ],
+	qw(boost=i c=s@ mirror=s no-torsocks torsocks=s inbox-version=i),
+	index_opt(), PublicInbox::LeiQuery::curl_opt() ],
 'ls-external' => [ '[FILTER...]', 'list publicinbox|extindex locations',
 	qw(format|f=s z|0 local remote quiet|q) ],
 'forget-external' => [ 'URL_OR_PATHNAME...|--prune',
@@ -204,7 +212,7 @@ my %OPTDESC = (
 'help|h' => 'show this built-in help',
 'quiet|q' => 'be quiet',
 'globoff|g' => "do not match locations using '*?' wildcards and '[]' ranges",
-'verbose|v' => 'be more verbose',
+'verbose|v+' => 'be more verbose',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
 'torsocks=s' => ['auto|no|yes',
 		'whether or not to wrap git and curl commands with torsocks'],
@@ -286,7 +294,7 @@ my %CONFIG_KEYS = (
 	'leistore.dir' => 'top-level storage location',
 );
 
-my @WQ_KEYS = qw(lxs l2m imp); # internal workers
+my @WQ_KEYS = qw(lxs l2m imp mrr); # internal workers
 
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
@@ -714,6 +722,7 @@ sub lei__complete {
 		}
 		puts $self, grep(/$re/, map { # generate short/long names
 			if (s/[:=].+\z//) { # req/optional args, e.g output|o=i
+			} elsif (s/\+\z//) { # verbose|v+
 			} elsif (s/!\z//) {
 				# negation: solve! => no-solve|solve
 				s/([\w\-]+)/$1|no-$1/g
diff --git a/lib/PublicInbox/LeiCurl.pm b/lib/PublicInbox/LeiCurl.pm
new file mode 100644
index 00000000..c8747d4f
--- /dev/null
+++ b/lib/PublicInbox/LeiCurl.pm
@@ -0,0 +1,65 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# common option and torsocks(1) wrapping for curl(1)
+package PublicInbox::LeiCurl;
+use strict;
+use v5.10.1;
+use PublicInbox::Spawn qw(which);
+use PublicInbox::Config;
+
+# prepares a common command for curl(1) based on $lei command
+sub new {
+	my ($cls, $lei, $curl) = @_;
+	$curl //= which('curl') // return $lei->fail('curl not found');
+	my $opt = $lei->{opt};
+	my @cmd = ($curl, qw(-Sf));
+	$cmd[-1] .= 's' if $opt->{quiet}; # already the default for "lei q"
+	$cmd[-1] .= 'v' if $opt->{verbose}; # we use ourselves, too
+	for my $o ($lei->curl_opt) {
+		$o =~ s/\|[a-z0-9]\b//i; # remove single char short option
+		if ($o =~ s/=[is]@\z//) {
+			my $ary = $opt->{$o} or next;
+			push @cmd, map { ("--$o", $_) } @$ary;
+		} elsif ($o =~ s/=[is]\z//) {
+			my $val = $opt->{$o} // next;
+			push @cmd, "--$o", $val;
+		} elsif ($opt->{$o}) {
+			push @cmd, "--$o";
+		}
+	}
+	push @cmd, '-v' if $opt->{verbose}; # lei uses this itself
+	bless \@cmd, $cls;
+}
+
+sub torsocks { # useful for "git clone" and "git fetch", too
+	my ($self, $lei, $uri) = @_;
+	my $opt = $lei->{opt};
+	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
+	my $torsocks = $opt->{torsocks} //= 'auto';
+	if ($torsocks eq 'auto' && substr($uri->host, -6) eq '.onion' &&
+			(($lei->{env}->{LD_PRELOAD}//'') !~ /torsocks/)) {
+		# "auto" continues anyways if torsocks is missing;
+		# a proxy may be specified via CLI, curlrc,
+		# environment variable, or even firewall rule
+		[ ($lei->{torsocks} //= which('torsocks')) // () ]
+	} elsif (PublicInbox::Config::git_bool($torsocks)) {
+		my $x = $lei->{torsocks} //= which('torsocks');
+		$x or return $lei->fail(<<EOM);
+--torsocks=yes specified but torsocks not found in PATH=$ENV{PATH}
+EOM
+		[ $x ];
+	} else { # the common case for current Internet :<
+		[];
+	}
+}
+
+# completes the command built by new() for $uri
+sub for_uri {
+	my ($self, $lei, $uri) = @_;
+	my $pfx = torsocks($self, $lei, $uri) or return; # error
+	[ @$pfx, @$self, substr($uri->path, -3) eq '.gz' ? () : '--compressed',
+		$uri->as_string ]
+}
+
+1;
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index accacf1a..6a5c2517 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -88,19 +88,35 @@ sub get_externals {
 	();
 }
 
-sub lei_add_external {
+sub add_external_finish {
 	my ($self, $location) = @_;
 	my $cfg = $self->_lei_cfg(1);
 	my $new_boost = $self->{opt}->{boost} // 0;
-	$location = ext_canonicalize($location);
-	if ($location !~ m!\Ahttps?://! && !-d $location) {
-		return $self->fail("$location not a directory");
-	}
 	my $key = "external.$location.boost";
 	my $cur_boost = $cfg->{$key};
 	return if defined($cur_boost) && $cur_boost == $new_boost; # idempotent
 	$self->lei_config($key, $new_boost);
-	$self->_lei_store(1)->done; # just create the store
+}
+
+sub lei_add_external {
+	my ($self, $location) = @_;
+	$self->_lei_store(1)->write_prepare($self);
+	my $new_boost = $self->{opt}->{boost} // 0;
+	$location = ext_canonicalize($location);
+	my $mirror = $self->{opt}->{mirror};
+	if (defined($mirror) && -d $location) {
+		return $self->fail(<<""); # TODO: did you mean "update-external?"
+--mirror destination `$location' already exists
+
+	}
+	if ($location !~ m!\Ahttps?://! && !-d $location) {
+		$mirror // return $self->fail("$location not a directory");
+		$mirror = ext_canonicalize($mirror);
+		require PublicInbox::LeiMirror;
+		PublicInbox::LeiMirror->start($self, $mirror => $location);
+	} else {
+		add_external_finish($self, $location);
+	}
 }
 
 sub lei_forget_external {
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
new file mode 100644
index 00000000..bb172e6a
--- /dev/null
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -0,0 +1,288 @@
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# "lei add-external --mirror" support
+package PublicInbox::LeiMirror;
+use strict;
+use v5.10.1;
+use parent qw(PublicInbox::IPC);
+use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
+use PublicInbox::Spawn qw(popen_rd spawn);
+use PublicInbox::PktOp;
+
+sub mirror_done { # EOF callback for main daemon
+	my ($lei) = @_;
+	my $mrr = delete $lei->{mrr};
+	$mrr->wq_wait_old($lei) if $mrr;
+	# FIXME: check $? before finish
+	$lei->add_external_finish($mrr->{dst});
+	$lei->dclose;
+}
+
+# for old installations without manifest.js.gz
+sub try_scrape {
+	my ($self) = @_;
+	my $uri = URI->new($self->{src});
+	my $lei = $self->{lei};
+	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
+	my $cmd = $curl->for_uri($lei, $uri);
+	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
+	my $fh = popen_rd($cmd, $lei->{env}, $opt);
+	my $html = do { local $/; <$fh> } // die "read(curl $uri): $!";
+	close($fh) or return $lei->child_error($?, "@$cmd failed");
+
+	# we grep with URL below, we don't want Subject/From headers
+	# making us clone random URLs
+	my @urls = ($html =~ m!\bgit clone --mirror ([a-z\+]+://\S+)!g);
+	my $url = $uri->as_string;
+	chop($url) eq '/' or die "BUG: $uri not canonicalized";
+
+	# since this is for old instances w/o manifest.js.gz, try v1 first
+	return clone_v1($self) if grep(m!\A\Q$url\E/*\z!, @urls);
+	if (my @v2_urls = grep(m!\A\Q$url\E/[0-9]+\z!, @urls)) {
+		my %v2_uris = map { $_ => URI->new($_) } @v2_urls; # uniq
+		return clone_v2($self, [ values %v2_uris ]);
+	}
+
+	# filter out common URLs served by WWW (e.g /$MSGID/T/)
+	if (@urls && $url =~ s!/+[^/]+\@[^/]+/.*\z!! &&
+			grep(m!\A\Q$url\E/*\z!, @urls)) {
+		die <<"";
+E: confused by scraping <$uri>, did you mean <$url>?
+
+	}
+	@urls and die <<"";
+E: confused by scraping <$uri>, got ambiguous results:
+@urls
+
+	die "E: scraping <$uri> revealed nothing\n";
+}
+
+sub clone_cmd {
+	my ($lei) = @_;
+	my @cmd = qw(git);
+	# we support "-c $key=$val" for arbitrary git config options
+	# e.g.: git -c http.proxy=socks5h://127.0.0.1:9050
+	push(@cmd, '-c', $_) for @{$lei->{opt}->{c} // []};
+	push @cmd, qw(clone --mirror);
+	push @cmd, '-q' if $lei->{opt}->{quiet};
+	push @cmd, '-v' if $lei->{opt}->{verbose};
+	# XXX any other options to support?
+	# --reference is tricky with multiple epochs...
+	@cmd;
+}
+
+# tries the relatively new /$INBOX/_/text/config/raw endpoint
+sub _try_config {
+	my ($self) = @_;
+	my $dst = $self->{dst};
+	if (!-d $dst && !mkdir($dst)) {
+		require File::Path;
+		File::Path::mkpath($dst);
+		-d $dst or die "mkpath($dst): $!\n";
+	}
+	my $uri = URI->new($self->{src});
+	my $lei = $self->{lei};
+	my $path = $uri->path;
+	chop($path) eq '/' or die "BUG: $uri not canonicalized";
+	$uri->path($path . '/_/text/config/raw');
+	my $cmd = $self->{curl}->for_uri($lei, $uri);
+	push @$cmd, '--compressed'; # curl decompresses for us
+	my $ce = "$dst/inbox.config.example";
+	my $f = "$ce-$$.tmp";
+	open(my $fh, '+>', $f) or return $lei->err("open $f: $! (non-fatal)");
+	my $opt = { 0 => $lei->{0}, 1 => $fh, 2 => $lei->{2} };
+	$lei->qerr("# @$cmd");
+	my $pid = spawn($cmd, $lei->{env}, $opt);
+	waitpid($pid, 0) == $pid or return $lei->err("waitpid @$cmd: $!");
+	if (($? >> 8) == 22) { # 404 missing
+		unlink($f) if -s $fh == 0;
+		return;
+	}
+	return $lei->err("# @$cmd failed (non-fatal)") if $?;
+	rename($f, $ce) or return $lei->err("rename($f, $ce): $! (non-fatal)");
+	my $cfg = PublicInbox::Config::git_config_dump($ce);
+	my $ibx = $self->{ibx} = {};
+	for my $sec (grep(/\Apublicinbox\./, @{$cfg->{-section_order}})) {
+		for (qw(address newsgroup nntpmirror)) {
+			$ibx->{$_} = $cfg->{"$sec.$_"};
+		}
+	}
+}
+
+sub index_cloned_inbox {
+	my ($self, $iv) = @_;
+	my $ibx = delete($self->{ibx}) // {
+		address => [ 'lei@example.com' ],
+		version => $iv,
+	};
+	$ibx->{inboxdir} = $self->{dst};
+	PublicInbox::Inbox->new($ibx);
+	PublicInbox::InboxWritable->new($ibx);
+	my $opt = {};
+	my $lei = $self->{lei};
+	for my $sw ($lei->index_opt) {
+		my ($k) = ($sw =~ /\A([\w-]+)/);
+		$opt->{$k} = $lei->{opt}->{$k};
+	}
+	# force synchronous dwaitpid for v2:
+	local $PublicInbox::DS::in_loop = 0;
+	my $cfg = PublicInbox::Config->new;
+	my $env = PublicInbox::Admin::index_prepare($opt, $cfg);
+	local %ENV = (%ENV, %$env) if $env;
+	PublicInbox::Admin::progress_prepare($opt, $lei->{2});
+	PublicInbox::Admin::index_inbox($ibx, undef, $opt);
+}
+
+sub clone_v1 {
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
+	my $uri = URI->new($self->{src});
+	my $pfx = $curl->torsocks($lei, $uri) or return;
+	my $cmd = [ @$pfx, clone_cmd($lei), $uri->as_string, $self->{dst} ];
+	$lei->qerr("# @$cmd");
+	my $pid = spawn($cmd, $lei->{env}, $lei);
+	waitpid($pid, 0) == $pid or die "BUG: waitpid @$cmd: $!";
+	$? == 0 or return $lei->child_error($?, "@$cmd failed");
+	_try_config($self);
+	index_cloned_inbox($self, 1);
+}
+
+sub clone_v2 {
+	my ($self, $v2_uris) = @_;
+	my $lei = $self->{lei};
+	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
+	my $pfx = $curl->torsocks($lei, $v2_uris->[0]) or return;
+	my @epochs;
+	my $dst = $self->{dst};
+	my @src_edst;
+	for my $uri (@$v2_uris) {
+		my $src = $uri->as_string;
+		my $edst = $dst;
+		$src =~ m!/([0-9]+)(?:\.git)?\z! or die <<"";
+failed to extract epoch number from $src
+
+		my $nr = $1 + 0;
+		$edst .= "/git/$nr.git";
+		push @src_edst, [ $src, $edst ];
+	}
+	my $lk = bless { lock_path => "$dst/inbox.lock" }, 'PublicInbox::Lock';
+	_try_config($self);
+	my $on_destroy = $lk->lock_for_scope($$);
+	my @cmd = clone_cmd($lei);
+	while (my $pair = shift(@src_edst)) {
+		my $cmd = [ @$pfx, @cmd, @$pair ];
+		$lei->qerr("# @$cmd");
+		my $pid = spawn($cmd, $lei->{env}, $lei);
+		waitpid($pid, 0) == $pid or die "BUG: waitpid @$cmd: $!";
+		$? == 0 or return $lei->child_error($?, "@$cmd failed");
+	}
+	undef $on_destroy; # unlock
+	index_cloned_inbox($self, 2);
+}
+
+sub try_manifest {
+	my ($self) = @_;
+	my $uri = URI->new($self->{src});
+	my $lei = $self->{lei};
+	my $curl = $self->{curl} //= PublicInbox::LeiCurl->new($lei) or return;
+	my $path = $uri->path;
+	chop($path) eq '/' or die "BUG: $uri not canonicalized";
+	$uri->path($path . '/manifest.js.gz');
+	my $cmd = $curl->for_uri($lei, $uri);
+	$lei->qerr("# @$cmd");
+	my $opt = { 0 => $lei->{0}, 2 => $lei->{2} };
+	my $fh = popen_rd($cmd, $lei->{env}, $opt);
+	my $gz = do { local $/; <$fh> } // die "read(curl $uri): $!";
+	unless (close $fh) {
+		return try_scrape($self) if ($? >> 8) == 22; # 404 missing
+		return $lei->child_error($?, "@$cmd failed");
+	}
+	my $js;
+	gunzip(\$gz => \$js, MultiStream => 1) or
+		die "gunzip($uri): $GunzipError";
+	my $m = eval { PublicInbox::Config->json->decode($js) };
+	die "$uri: error decoding `$js': $@" if $@;
+	ref($m) eq 'HASH' or die "$uri unknown type: ".ref($m);
+
+	my $v1_bare = $m->{$path};
+	my @v2_epochs = grep(m!\A\Q$path\E/git/[0-9]+\.git\z!, keys %$m);
+	if (@v2_epochs) {
+		# It may be possible to have v1 + v2 in parallel someday:
+		$lei->err(<<EOM) if defined $v1_bare;
+# `$v1_bare' appears to be a v1 inbox while v2 epochs exist:
+# @v2_epochs
+# ignoring $v1_bare (use --inbox-version=1 to force v1 instead)
+EOM
+		@v2_epochs = map { $uri->path($_); $uri->clone } @v2_epochs;
+		clone_v2($self, \@v2_epochs);
+	} elsif ($v1_bare) {
+		clone_v1($self);
+	} elsif (my @maybe = grep(m!\Q$path\E!, keys %$m)) {
+		die "E: confused by <$uri>, possible matches:\n@maybe";
+	} else {
+		die "E: confused by <$uri>";
+	}
+}
+
+sub start_clone_url {
+	my ($self) = @_;
+	return try_manifest($self) if $self->{src} =~ m!\Ahttps?://!;
+	die "TODO: non-HTTP/HTTPS clone of $self->{src} not supported, yet";
+}
+
+sub do_mirror { # via wq_do
+	my ($self) = @_;
+	my $lei = $self->{lei};
+	eval {
+		my $iv = $lei->{opt}->{'inbox-version'};
+		if (defined $iv) {
+			return clone_v1($self) if $iv == 1;
+			return try_scrape($self) if $iv == 2;
+			die "bad --inbox-version=$iv\n";
+		}
+		return start_clone_url($self) if $self->{src} =~ m!://!;
+		die "TODO: cloning local directories not supported, yet";
+	};
+	return $lei->fail($@) if $@;
+	$lei->qerr("# mirrored $self->{src} => $self->{dst}");
+}
+
+sub start {
+	my ($cls, $lei, $src, $dst) = @_;
+	my $self = bless { lei => $lei, src => $src, dst => $dst }, $cls;
+	$lei->{mrr} = $self;
+	if ($src =~ m!https?://!) {
+		require URI;
+		require PublicInbox::LeiCurl;
+	}
+	require PublicInbox::Lock;
+	require PublicInbox::Inbox;
+	require PublicInbox::Admin;
+	require PublicInbox::InboxWritable;
+	my $ops = {
+		'!' => [ $lei->can('fail_handler'), $lei ],
+		'x_it' => [ $lei->can('x_it'), $lei ],
+		'child_error' => [ $lei->can('child_error'), $lei ],
+		'' => [ \&mirror_done, $lei ],
+	};
+	($lei->{pkt_op_c}, $lei->{pkt_op_p}) = PublicInbox::PktOp->pair($ops);
+	$self->wq_workers_start('lei_mirror', 1, $lei->oldset, {lei => $lei});
+	my $op = delete $lei->{pkt_op_c};
+	delete $lei->{pkt_op_p};
+	$self->wq_do('do_mirror', []);
+	$self->wq_close(1);
+	$lei->event_step_init; # wait for shutdowns
+	if ($lei->{oneshot}) {
+		while ($op->{sock}) { $op->event_step }
+	}
+}
+
+sub ipc_atfork_child {
+	my ($self) = @_;
+	$self->{lei}->lei_atfork_child;
+	$self->SUPER::ipc_atfork_child;
+}
+
+1;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index f8068362..1e5d7ca6 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -212,7 +212,6 @@ sub query_remote_mboxrd {
 	my ($opt, $env) = @$lei{qw(opt env)};
 	my @qform = (q => $lei->{mset_opt}->{qstr}, x => 'm');
 	push(@qform, t => 1) if $opt->{thread};
-	my @cmd = ($self->{curl}, qw(-sSf -d), '');
 	my $verbose = $opt->{verbose};
 	my $reap;
 	my $cerr = File::Temp->new(TEMPLATE => 'curl.err-XXXX', TMPDIR => 1);
@@ -223,43 +222,18 @@ sub query_remote_mboxrd {
 		# spawn a process to force line-buffering, otherwise curl
 		# will write 1 character at-a-time and parallel outputs
 		# mmmaaayyy llloookkk llliiikkkeee ttthhhiiisss
-		push @cmd, '-v';
 		my $o = { 1 => $lei->{2}, 2 => $lei->{2} };
 		my $pid = spawn(['tail', '-f', $cerr->filename], undef, $o);
 		$reap = PublicInbox::OnDestroy->new(\&kill_reap, $pid);
 	}
-	for my $o ($lei->curl_opt) {
-		$o =~ s/\|[a-z0-9]\b//i; # remove single char short option
-		if ($o =~ s/=[is]@\z//) {
-			my $ary = $opt->{$o} or next;
-			push @cmd, map { ("--$o", $_) } @$ary;
-		} elsif ($o =~ s/=[is]\z//) {
-			my $val = $opt->{$o} // next;
-			push @cmd, "--$o", $val;
-		} elsif ($opt->{$o}) {
-			push @cmd, "--$o";
-		}
-	}
-	$opt->{torsocks} = 'false' if $opt->{'no-torsocks'};
-	my $tor = $opt->{torsocks} //= 'auto';
+	my $curl = PublicInbox::LeiCurl->new($lei, $self->{curl}) or return;
+	push @$curl, '-s', '-d', '';
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	for my $uri (@$uris) {
 		$lei->{-current_url} = $uri->as_string;
 		$lei->{-nr_remote_eml} = 0;
 		$uri->query_form(@qform);
-		my $cmd = [ @cmd, $uri->as_string ];
-		if ($tor eq 'auto' && substr($uri->host, -6) eq '.onion' &&
-				(($env->{LD_PRELOAD}//'') !~ /torsocks/)) {
-			unshift @$cmd, which('torsocks');
-		} elsif (PublicInbox::Config::git_bool($tor)) {
-			unshift @$cmd, which('torsocks');
-		}
-
-		# continue anyways if torsocks is missing; a proxy may be
-		# specified via CLI, curlrc, environment variable, or even
-		# firewall rule
-		shift(@$cmd) if !$cmd->[0];
-
+		my $cmd = $curl->for_uri($lei, $uri);
 		$lei->err("# @$cmd") if $verbose;
 		my ($fh, $pid) = popen_rd($cmd, $env, $rdr);
 		$fh = IO::Uncompress::Gunzip->new($fh);
@@ -440,6 +414,7 @@ sub add_uri {
 	if (my $curl = $self->{curl} //= which('curl') // 0) {
 		require PublicInbox::MboxReader;
 		require IO::Uncompress::Gunzip;
+		require PublicInbox::LeiCurl;
 		push @{$self->{remotes}}, $uri;
 	} else {
 		warn "curl missing, ignoring $uri\n";
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index c861dc5d..5cce44e4 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -461,8 +461,9 @@ SKIP: {
 Socket::MsgHdr missing or Inline::C is unconfigured/missing
 EOM
 	$lei_opt = { 1 => \$lei_out, 2 => \$lei_err };
-	my $daemon_pid;
-	my ($tmpdir, $for_destroy) = tmpdir();
+	my ($daemon_pid, $for_destroy);
+	my $tmpdir = $test_opt->{tmpdir};
+	($tmpdir, $for_destroy) = tmpdir unless $tmpdir;
 	SKIP: {
 		skip 'TEST_LEI_ONESHOT set', 1 if $ENV{TEST_LEI_ONESHOT};
 		my $home = "$tmpdir/lei-daemon";
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
new file mode 100644
index 00000000..cf34c7ae
--- /dev/null
+++ b/t/lei-mirror.t
@@ -0,0 +1,24 @@
+#!perl -w
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+my $sock = tcp_server();
+my ($tmpdir, $for_destroy) = tmpdir();
+my $http = 'http://'.$sock->sockhost.':'.$sock->sockport.'/';
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+my $cmd = [ qw(-httpd -W0), "--stdout=$tmpdir/out", "--stderr=$tmpdir/err" ];
+my $td = start_script($cmd, { PI_CONFIG => $cfg_path }, { 3 => $sock });
+test_lei({ tmpdir => $tmpdir }, sub {
+	my $home = $ENV{HOME};
+	my $t1 = "$home/t1-mirror";
+	ok($lei->('add-external', $t1, '--mirror', "$http/t1/"), '--mirror v1');
+	ok(-f "$t1/public-inbox/msgmap.sqlite3", 't1-mirror indexed');
+	my $t2 = "$home/t2-mirror";
+	ok($lei->('add-external', $t2, '--mirror', "$http/t2/"), '--mirror v2');
+	ok(-f "$t2/msgmap.sqlite3", 't2-mirror indexed');
+});
+
+ok($td->kill, 'killed -httpd');
+$td->join;
+
+done_testing;
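
The torsocks decision removed from query_remote_mboxrd in this patch can be
sketched stand-alone as follows.  This is only an illustration under stated
assumptions: maybe_torsocks and its arguments are hypothetical names, the
boolean parsing is a simplification of git-style booleans, and the body of
PublicInbox::LeiCurl->for_uri (where the logic now lives) is not shown in
this diff.

```perl
use strict;
use v5.10.1;

# Hypothetical helper, not part of public-inbox: decide whether a curl
# command should be prefixed with torsocks.
#   $cmd      - arrayref of the command to run
#   $mode     - 'auto', or a boolean-like string such as 'yes' or 'no'
#   $host     - hostname of the URL being fetched
#   $env      - hashref of the environment for the child process
#   $torsocks - path to torsocks, or undef if it is not installed
sub maybe_torsocks {
	my ($cmd, $mode, $host, $env, $torsocks) = @_;
	my $want;
	if ($mode eq 'auto') {
		# only wrap .onion hosts, and only when torsocks is not
		# already preloaded into the child environment
		$want = $host =~ /\.onion\z/ &&
			($env->{LD_PRELOAD} // '') !~ /torsocks/;
	} else {
		# simplified stand-in for git-style boolean parsing
		$want = $mode =~ /\A(?:yes|true|on|1)\z/i;
	}
	# continue anyways if torsocks is missing; a proxy may be
	# configured by other means
	($want && defined $torsocks) ? [ $torsocks, @$cmd ] : [ @$cmd ];
}
```

Returning a new arrayref leaves the caller's base command untouched, so it
can be reused for every URI in a loop.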

^ permalink raw reply related	[relevance 21%]

* [PATCH 15/17] lei add-external: reject index and remote opts w/o mirror
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (8 preceding siblings ...)
  2021-02-06 12:18 26% ` [PATCH 14/17] lei help: split out into separate file Eric Wong
@ 2021-02-06 12:18 66% ` Eric Wong
  2021-02-06 12:18 64% ` [PATCH 17/17] lei: remove short switch support for curl(1) options Eric Wong
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

Option combinations which make no sense should fail
to prevent misunderstandings and avoid surprises.
---
 lib/PublicInbox/LeiExternal.pm | 22 ++++++++++++++++++++--
 t/lei-mirror.t                 |  6 ++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 6a5c2517..b65dc87c 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -101,9 +101,27 @@ sub add_external_finish {
 sub lei_add_external {
 	my ($self, $location) = @_;
 	$self->_lei_store(1)->write_prepare($self);
-	my $new_boost = $self->{opt}->{boost} // 0;
+	my $opt = $self->{opt};
+	my $mirror = $opt->{mirror} // do {
+		my @fail;
+		for my $sw ($self->index_opt, $self->curl_opt,
+				qw(c no-torsocks torsocks inbox-version)) {
+			my ($f) = ($sw =~ /\A([^|=:!+]+)/); # primary option name
+			next unless defined $opt->{$f};
+			$f = length($f) == 1 ? "-$f" : "--$f";
+			push @fail, $f;
+		}
+		if (scalar(@fail) == 1) {
+			return $self->fail("@fail requires --mirror");
+		} elsif (@fail) {
+			my $last = pop @fail;
+			my $fail = join(', ', @fail);
+			return $self->fail("$fail and $last require --mirror");
+		}
+		undef;
+	};
+	my $new_boost = $opt->{boost} // 0;
 	$location = ext_canonicalize($location);
-	my $mirror = $self->{opt}->{mirror};
 	if (defined($mirror) && -d $location) {
 		$self->fail(<<""); # TODO: did you mean "update-external?"
 --mirror destination `$location' already exists
diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index cf34c7ae..6af49678 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -16,6 +16,12 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	my $t2 = "$home/t2-mirror";
 	ok($lei->('add-external', $t2, '--mirror', "$http/t2/"), '--mirror v2');
 	ok(-f "$t2/msgmap.sqlite3", 't2-mirror indexed');
+
+	ok(!$lei->('add-external', $t2, '--mirror', "$http/t2/"),
+		'--mirror fails if reused');
+
+	ok(!$lei->('add-external', "$t2-fail", '-Lmedium'), '-L fails w/o --mirror');
+	ok(!-d "$t2-fail", 'destination not created on failure');
 });
 
 ok($td->kill, 'killed -httpd');
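
The rejection logic in this patch can be exercised outside of lei with a
small stand-alone sketch.  The names switches_set and mirror_errmsg are
hypothetical and exist only for illustration; the sketch assumes options
were parsed by Getopt::Long into a hash keyed by each spec's primary
(first) name.

```perl
use strict;
use v5.10.1;

# Hypothetical helper: given parsed options and Getopt::Long specs,
# return the switches which were actually set by the user.
sub switches_set {
	my ($opt, @specs) = @_;
	my @set;
	for my $sw (@specs) {
		# "indexlevel|L=s" => "indexlevel", "torsocks=s" => "torsocks"
		my ($f) = ($sw =~ /\A([^|=:!+]+)/) or next;
		next unless defined $opt->{$f};
		push @set, length($f) == 1 ? "-$f" : "--$f";
	}
	@set;
}

# Hypothetical helper: build the error message for switches which
# make no sense without --mirror; returns nothing if @set is empty.
sub mirror_errmsg {
	my (@set) = @_;
	return unless @set;
	return "@set requires --mirror" if @set == 1;
	my $last = pop @set;
	join(', ', @set) . " and $last require --mirror";
}
```

With `{ indexlevel => 'medium' }` as the parsed options, only
`--indexlevel` is reported, which matches the `-Lmedium` case in the test
above.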

^ permalink raw reply related	[relevance 66%]

* [PATCH 17/17] lei: remove short switch support for curl(1) options
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (9 preceding siblings ...)
  2021-02-06 12:18 66% ` [PATCH 15/17] lei add-external: reject index and remote opts w/o mirror Eric Wong
@ 2021-02-06 12:18 64% ` Eric Wong
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

In particular, -U and -u switches may conflict with diff(1)
options we may need for "lei show" which will use solver
remotely or locally.
---
 lib/PublicInbox/LeiQuery.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 63945d53..0346498f 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -164,7 +164,7 @@ sub curl_opt { qw(
 	connect-timeout=s connect-to=s cookie-jar=s cookie=s crlfile=s
 	digest disable dns-interface=s dns-ipv4-addr=s dns-ipv6-addr=s
 	dns-servers=s doh-url=s egd-file=s engine=s false-start
-	happy-eyeballs-timeout-ms=s haproxy-protocol header|H=s@
+	happy-eyeballs-timeout-ms=s haproxy-protocol header=s@
 	http2-prior-knowledge http2 insecure
 	interface=s ipv4 ipv6 junk-session-cookies
 	key-type=s key=s limit-rate=s local-port=s location-trusted location
@@ -177,7 +177,7 @@ sub curl_opt { qw(
 	proxy-key-type=s proxy-key proxy-negotiate proxy-ntlm proxy-pass=s
 	proxy-pinnedpubkey=s proxy-service-name=s proxy-ssl-allow-beast
 	proxy-tls13-ciphers=s proxy-tlsauthtype=s proxy-tlspassword=s
-	proxy-tlsuser=s proxy-tlsv1 proxy-user|U=s proxy=s
+	proxy-tlsuser=s proxy-tlsv1 proxy-user=s proxy=s
 	proxytunnel=s pubkey=s random-file=s referer=s resolve=s
 	retry-connrefused retry-delay=s retry-max-time=s retry=i
 	sasl-ir service-name=s socks4=s socks4a=s socks5-basic
@@ -186,7 +186,7 @@ sub curl_opt { qw(
 	suppress-connect-headers tcp-fastopen tls-max=s
 	tls13-ciphers=s tlsauthtype=s tlspassword=s tlsuser=s
 	tlsv1 trace-ascii=s trace-time trace=s
-	unix-socket=s user-agent|A=s user|u=s
+	unix-socket=s user-agent=s user=s
 )
 }
 

^ permalink raw reply related	[relevance 64%]
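
With the short aliases gone, every curl_opt spec is a plain long name plus
an optional type suffix, so mapping parsed options back to curl(1)
arguments stays simple.  A minimal sketch, modeled on the loop removed from
query_remote_mboxrd earlier in this series; curl_args is a hypothetical
name and the real code in PublicInbox::LeiCurl may differ:

```perl
use strict;
use v5.10.1;

# Hypothetical helper: turn options parsed via Getopt::Long specs
# back into long-form curl(1) arguments.
sub curl_args {
	my ($opt, @specs) = @_;
	my @args;
	for my $o (@specs) { # aliases @specs, which is our private copy
		if ($o =~ s/=[is]\@\z//) { # repeatable, e.g. "header=s@"
			my $ary = $opt->{$o} or next;
			push @args, map { ("--$o", $_) } @$ary;
		} elsif ($o =~ s/=[is]\z//) { # single value, e.g. "user=s"
			my $val = $opt->{$o} // next;
			push @args, "--$o", $val;
		} elsif ($opt->{$o}) { # boolean switch, e.g. "insecure"
			push @args, "--$o";
		}
	}
	@args;
}
```

Options which were never set are skipped, so the resulting list only
contains what the user explicitly asked for.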

* [PATCH 14/17] lei help: split out into separate file
  2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
                   ` (7 preceding siblings ...)
  2021-02-06 12:18 21% ` [PATCH 13/17] lei: add-external --mirror support Eric Wong
@ 2021-02-06 12:18 26% ` Eric Wong
  2021-02-06 12:18 66% ` [PATCH 15/17] lei add-external: reject index and remote opts w/o mirror Eric Wong
  2021-02-06 12:18 64% ` [PATCH 17/17] lei: remove short switch support for curl(1) options Eric Wong
  10 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-06 12:18 UTC (permalink / raw)
  To: meta

We'll reword and improve formatting with non-breaking spaces
("\xa0"), which are only replaced with SP after wrapping.

Some terminology is shortened (e.g. "URL_OR_PATHNAME" => "LOCATION")
to improve formatting.

This also enables completion for -h/--help and lets us
prioritize favored switch names while attempting to
satisfy users relying on muscle memory from other tools.
---
 MANIFEST                   |   1 +
 lib/PublicInbox/LEI.pm     | 167 +++++++++++++------------------------
 lib/PublicInbox/LeiHelp.pm | 100 ++++++++++++++++++++++
 3 files changed, 160 insertions(+), 108 deletions(-)
 create mode 100644 lib/PublicInbox/LeiHelp.pm

diff --git a/MANIFEST b/MANIFEST
index 4236f87c..521f1f68 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -180,6 +180,7 @@ lib/PublicInbox/LEI.pm
 lib/PublicInbox/LeiCurl.pm
 lib/PublicInbox/LeiDedupe.pm
 lib/PublicInbox/LeiExternal.pm
+lib/PublicInbox/LeiHelp.pm
 lib/PublicInbox/LeiImport.pm
 lib/PublicInbox/LeiMirror.pm
 lib/PublicInbox/LeiOverview.pm
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index bdeab7e3..3098ade7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -23,7 +23,6 @@ use PublicInbox::Sigfd;
 use PublicInbox::DS qw(now dwaitpid);
 use PublicInbox::Spawn qw(spawn popen_rd);
 use PublicInbox::OnDestroy;
-use Text::Wrap qw(wrap);
 use Time::HiRes qw(stat); # ctime comparisons for config cache
 use File::Path qw(mkpath);
 use File::Spec;
@@ -100,33 +99,34 @@ sub _config_path ($) {
 
 sub index_opt {
 	# TODO: drop underscore variants everywhere, they're undocumented
-	qw(fsync|sync! jobs|j=i indexlevel|index-level|L=s compact+
+	qw(fsync|sync! jobs|j=i indexlevel|L=s compact
 	max_size|max-size=s sequential_shard|sequential-shard
-	batch_size|batch-size=s skip-docdata quiet|q verbose|v+)
+	batch_size|batch-size=s skip-docdata)
 }
 
-# TODO: generate shell completion + help using %CMD and %OPTDESC
+# we generate shell completion + help using %CMD and %OPTDESC,
+# see lei__complete() and PublicInbox::LeiHelp
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
 	save-as=s output|mfolder|o=s format|f=s dedupe|d=s thread|t augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
-	mua-cmd|mua=s no-torsocks torsocks=s verbose|v+ quiet|q
-	received-after=s received-before=s sent-after=s sent-since=s),
+	mua-cmd|mua=s no-torsocks torsocks=s verbose|v+ quiet|q),
 	PublicInbox::LeiQuery::curl_opt(), opt_dash('limit|n=i', '[0-9]+') ],
 
 'show' => [ 'MID|OID', 'show a given object (Message-ID or object ID)',
 	qw(type=s solve! format|f=s dedupe|d=s thread|t remote local!),
 	pass_through('git show') ],
 
-'add-external' => [ 'URL_OR_PATHNAME',
+'add-external' => [ 'LOCATION',
 	'add/set priority of a publicinbox|extindex for extra matches',
 	qw(boost=i c=s@ mirror=s no-torsocks torsocks=s inbox-version=i),
+	qw(quiet|q verbose|v+),
 	index_opt(), PublicInbox::LeiQuery::curl_opt() ],
 'ls-external' => [ '[FILTER...]', 'list publicinbox|extindex locations',
 	qw(format|f=s z|0 local remote quiet|q) ],
-'forget-external' => [ 'URL_OR_PATHNAME...|--prune',
+'forget-external' => [ 'LOCATION...|--prune',
 	'exclude further results from a publicinbox|extindex',
 	qw(prune quiet|q) ],
 
@@ -145,21 +145,20 @@ our %CMD = ( # sorted in order of importance/use:
 	"exclude message(s) on stdin from `q' search results",
 	qw(stdin| oid=s exact by-mid|mid:s quiet|q) ],
 
-'purge-mailsource' => [ 'URL_OR_PATHNAME|--all',
+'purge-mailsource' => [ 'LOCATION|--all',
 	'remove imported messages from IMAP, Maildirs, and MH',
 	qw(exact! all jobs:i indexed) ],
 
 # code repos are used for `show' to solve blobs from patch mails
-'add-coderepo' => [ 'PATHNAME', 'add or set priority of a git code repo',
+'add-coderepo' => [ 'DIRNAME', 'add or set priority of a git code repo',
 	qw(boost=i) ],
 'ls-coderepo' => [ '[FILTER_TERMS...]',
 		'list known code repos', qw(format|f=s z) ],
-'forget-coderepo' => [ 'PATHNAME',
+'forget-coderepo' => [ 'DIRNAME',
 	'stop using repo to solve blobs from patches',
 	qw(prune) ],
 
-'add-watch' => [ '[URL_OR_PATHNAME]',
-		'watch for new messages and flag changes',
+'add-watch' => [ 'LOCATION', 'watch for new messages and flag changes',
 	qw(import! kw|keywords|flags! interval=s recursive|r
 	exclude=s include=s) ],
 'ls-watch' => [ '[FILTER...]', 'list active watches with numbers and status',
@@ -169,7 +168,7 @@ our %CMD = ( # sorted in order of importance/use:
 'forget-watch' => [ '{WATCH_NUMBER|--prune}', 'stop and forget a watch',
 	qw(prune) ],
 
-'import' => [ 'URLS_OR_PATHNAMES...|--stdin',
+'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
 	format|f=s kw|keywords|flags!),
@@ -179,8 +178,8 @@ our %CMD = ( # sorted in order of importance/use:
 		'git-config(1) wrapper for '._config_path($_[0]);
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
 	pass_through('git config') ],
-'init' => [ '[PATHNAME]', sub {
-		'initialize storage, default: '._store_path($_[0]);
+'init' => [ '[DIRNAME]', sub {
+	"initialize storage, default: "._store_path($_[0]);
 	}, qw(quiet|q) ],
 'daemon-kill' => [ '[-SIGNAL]', 'signal the lei-daemon',
 	opt_dash('signal|s=s', '[0-9]+|(?:[A-Z][A-Z0-9]+)') ],
@@ -208,43 +207,66 @@ my $stdin_formats = [ 'MAIL_FORMAT|eml|mboxrd|mboxcl2|mboxcl|mboxo',
 			'specify message input format' ];
 my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
 
+# we use \x{a0} (non-breaking SP) to avoid wrapping in PublicInbox::LeiHelp
 my %OPTDESC = (
 'help|h' => 'show this built-in help',
 'quiet|q' => 'be quiet',
-'globoff|g' => "do not match locations using '*?' wildcards and '[]' ranges",
+'globoff|g' => "do not match locations using '*?' wildcards ".
+		"and\xa0'[]'\x{a0}ranges",
 'verbose|v+' => 'be more verbose',
 'solve!' => 'do not attempt to reconstruct blobs from emails',
-'torsocks=s' => ['auto|no|yes',
+'torsocks=s' => ['VAL|auto|no|yes',
 		'whether or not to wrap git and curl commands with torsocks'],
 'no-torsocks' => 'alias for --torsocks=no',
 'save-as=s' => ['NAME', 'save a search terms by given name'],
 
 'type=s' => [ 'any|mid|git', 'disambiguate type' ],
 
-'dedupe|d=s' => ['STRAT|content|oid|mid|none',
+'dedupe|d=s' => ['STRATEGY|content|oid|mid|none',
 		'deduplication strategy'],
 'show	thread|t' => 'display entire thread a message belongs to',
 'q	thread|t' =>
 	'return all messages in the same thread as the actual match(es)',
 'augment|a' => 'augment --output destination instead of clobbering',
 
-'output|mfolder|o=s' => [ 'DEST',
-	"destination (e.g. `/path/to/Maildir', or `-' for stdout)" ],
-'mua-cmd|mua=s' => [ 'COMMAND',
-	"MUA to run on --output Maildir or mbox (e.g. `mutt -f %f'" ],
+'output|mfolder|o=s' => [ 'MFOLDER',
+	"destination (e.g.\xa0`/path/to/Maildir', ".
+	"or\xa0`-'\x{a0}for\x{a0}stdout)" ],
+'mua-cmd|mua=s' => [ 'CMD',
+	"MUA to run on --output Maildir or mbox (e.g.\xa0`mutt\xa0-f\xa0%f')" ],
 
 'show	format|f=s' => [ 'OUT|plain|raw|html|mboxrd|mboxcl2|mboxcl',
 			'message/object output format' ],
 'mark	format|f=s' => $stdin_formats,
 'forget	format|f=s' => $stdin_formats,
+
+'add-external	inbox-version=i' => [ 'NUM|1|2',
+		'force a public-inbox version with --mirror'],
+'add-external	mirror=s' => [ 'URL', 'mirror a public-inbox'],
+
+# public-inbox-index options
+'add-external	jobs|j=i' => 'set parallelism when indexing after --mirror',
+'fsync!' => 'speed up indexing after --mirror, risk index corruption',
+'compact' => 'run compact index after mirroring',
+'indexlevel|L=s' => [ 'LEVEL|full|medium|basic',
+	"indexlevel with --mirror (default: full)" ],
+'max_size|max-size=s' => [ 'SIZE',
+	'do not index messages larger than SIZE (default: infinity)' ],
+'batch_size|batch-size=s' => [ 'SIZE',
+	'flush changes to OS after given number of bytes (default: 1m)' ],
+'sequential_shard|sequential-shard' =>
+	'index Xapian shards sequentially for slow storage',
+'skip-docdata' =>
+	'drop compatibility w/ public-inbox <1.6 to save ~1.5% space',
+
 'q	format|f=s' => [
 	'OUT|maildir|mboxrd|mboxcl2|mboxcl|mboxo|html|json|jsonl|concatjson',
 		'specify output format, default depends on --output'],
-'q	exclude=s@' => [ 'URL_OR_PATHNAME',
+'q	exclude=s@' => [ 'LOCATION',
 		'exclude specified external(s) from search' ],
-'q	include|I=s@' => [ 'URL_OR_PATHNAME',
+'q	include|I=s@' => [ 'LOCATION',
 		'include specified external(s) in search' ],
-'q	only=s@' => [ 'URL_OR_PATHNAME',
+'q	only=s@' => [ 'LOCATION',
 		'only use specified external(s) for search' ],
 
 'q	jobs=s'	=> [ '[SEARCH_JOBS][,WRITER_JOBS]',
@@ -258,9 +280,9 @@ my %OPTDESC = (
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
-'sort|s=s' => [ 'VAL|received,relevance,docid',
-		"order of results `--output'-dependent"],
-'reverse|r' => [ 'reverse search results' ], # like sort(1)
+'sort|s=s' => [ 'VAL|received|relevance|docid',
+		"order of results is `--output'-dependent"],
+'reverse|r' => 'reverse search results', # like sort(1)
 
 'boost=i' => 'increase/decrease priority of results (default: 0)',
 
@@ -280,7 +302,6 @@ my %OPTDESC = (
 'exact!' => 'rely on content match instead of exact header matches',
 
 'by-mid|mid:s' => [ 'MID', 'match only by Message-ID, ignoring contents' ],
-'jobs:i' => 'set parallelism level',
 
 'kw|keywords|flags!' => 'disable/enable importing flags',
 
@@ -415,86 +436,15 @@ sub lei_atfork_child {
 	$current_lei = $persist ? undef : $self; # for SIG{__WARN__}
 }
 
-sub _help ($;$) {
-	my ($self, $errmsg) = @_;
-	my $cmd = $self->{cmd} // 'COMMAND';
-	my @info = @{$CMD{$cmd} // [ '...', '...' ]};
-	my @top = ($cmd, shift(@info) // ());
-	my $cmd_desc = shift(@info);
-	$cmd_desc = $cmd_desc->($self) if ref($cmd_desc) eq 'CODE';
-	my @opt_desc;
-	my $lpad = 2;
-	for my $sw (grep { !ref } @info) { # ("prio=s", "z", $GLP_PASS)
-		my $desc = $OPTDESC{"$cmd\t$sw"} // $OPTDESC{$sw} // next;
-		my $arg_vals = '';
-		($arg_vals, $desc) = @$desc if ref($desc) eq 'ARRAY';
-
-		# lower-case is a keyword (e.g. `content', `oid'),
-		# ALL_CAPS is a string description (e.g. `PATH')
-		if ($desc !~ /default/ && $arg_vals =~ /\b([a-z]+)[,\|]/) {
-			$desc .= "\ndefault: `$1'";
-		}
-		my (@vals, @s, @l);
-		my $x = $sw;
-		if ($x =~ s/!\z//) { # solve! => --no-solve
-			$x =~ s/(\A|\|)/$1no-/g
-		} elsif ($x =~ s/:.+//) { # optional args: $x = "mid:s"
-			@vals = (' [', undef, ']');
-		} elsif ($x =~ s/=.+//) { # required arg: $x = "type=s"
-			@vals = (' ', undef);
-		} # else: no args $x = 'thread|t'
-		for (split(/\|/, $x)) { # help|h
-			length($_) > 1 ? push(@l, "--$_") : push(@s, "-$_");
-		}
-		if (!scalar(@vals)) { # no args 'thread|t'
-		} elsif ($arg_vals =~ s/\A([A-Z_]+)\b//) { # "NAME"
-			$vals[1] = $1;
-		} else {
-			$vals[1] = uc(substr($l[0], 2)); # "--type" => "TYPE"
-		}
-		if ($arg_vals =~ /([,\|])/) {
-			my $sep = $1;
-			my @allow = split(/\Q$sep\E/, $arg_vals);
-			my $must = $sep eq '|' ? 'Must' : 'Can';
-			@allow = map { "`$_'" } @allow;
-			my $last = pop @allow;
-			$desc .= "\n$must be one of: " .
-				join(', ', @allow) . " or $last";
-		}
-		my $lhs = join(', ', @s, @l) . join('', @vals);
-		if ($x =~ /\|\z/) { # "stdin|" or "clear|"
-			$lhs =~ s/\A--/- , --/;
-		} else {
-			$lhs =~ s/\A--/    --/; # pad if no short options
-		}
-		$lpad = length($lhs) if length($lhs) > $lpad;
-		push @opt_desc, $lhs, $desc;
-	}
-	my $msg = $errmsg ? "E: $errmsg\n" : '';
-	$msg .= <<EOF;
-usage: lei @top
-  $cmd_desc
-
-EOF
-	$lpad += 2;
-	local $Text::Wrap::columns = 78 - $lpad;
-	my $padding = ' ' x ($lpad + 2);
-	while (my ($lhs, $rhs) = splice(@opt_desc, 0, 2)) {
-		$msg .= '  '.pack("A$lpad", $lhs);
-		$rhs = wrap('', '', $rhs);
-		$rhs =~ s/\n/\n$padding/sg; # LHS pad continuation lines
-		$msg .= $rhs;
-		$msg .= "\n";
-	}
-	my $out = $self->{$errmsg ? 2 : 1};
-	start_pager($self) if -t $out;
-	print $out $msg;
-	x_it($self, $errmsg ? 1 << 8 : 0); # stderr => failure
-	undef;
+sub _help {
+	require PublicInbox::LeiHelp;
+	PublicInbox::LeiHelp::call($_[0], $_[1], \%CMD, \%OPTDESC);
 }
 
 sub optparse ($$$) {
 	my ($self, $cmd, $argv) = @_;
+	# allow _complete --help to complete, not show help
+	return 1 if substr($cmd, 0, 1) eq '_';
 	$self->{cmd} = $cmd;
 	$OPT = $self->{opt} = {};
 	my $info = $CMD{$cmd} // [ '[...]' ];
@@ -720,7 +670,8 @@ sub lei__complete {
 				get-color-name get-colorbool);
 			# fall-through
 		}
-		puts $self, grep(/$re/, map { # generate short/long names
+		# generate short/long names from Getopt::Long specs
+		puts $self, grep(/$re/, qw(--help -h), map {
 			if (s/[:=].+\z//) { # req/optional args, e.g output|o=i
 			} elsif (s/\+\z//) { # verbose|v+
 			} elsif (s/!\z//) {
@@ -730,7 +681,7 @@ sub lei__complete {
 			map {
 				my $x = length > 1 ? "--$_" : "-$_";
 				$x eq $cur ? () : $x;
-			} split(/\|/, $_, -1) # help|h
+			} grep(!/_/, split(/\|/, $_, -1)) # help|h
 		} grep { $OPTDESC{"$cmd\t$_"} || $OPTDESC{$_} } @spec);
 	} elsif ($cmd eq 'config' && !@argv && !$CONFIG_KEYS{$cur}) {
 		puts $self, grep(/$re/, keys %CONFIG_KEYS);
diff --git a/lib/PublicInbox/LeiHelp.pm b/lib/PublicInbox/LeiHelp.pm
new file mode 100644
index 00000000..43414ab4
--- /dev/null
+++ b/lib/PublicInbox/LeiHelp.pm
@@ -0,0 +1,100 @@
+# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# -h/--help support for lei
+package PublicInbox::LeiHelp;
+use strict;
+use v5.10.1;
+use Text::Wrap qw(wrap);
+
+my %NOHELP = map { $_ => 1 } qw(mua-cmd mfolder);
+
+sub call {
+	my ($self, $errmsg, $CMD, $OPTDESC) = @_;
+	my $cmd = $self->{cmd} // 'COMMAND';
+	my @info = @{$CMD->{$cmd} // [ '...', '...' ]};
+	my @top = ($cmd, shift(@info) // ());
+	my $cmd_desc = shift(@info);
+	$cmd_desc = $cmd_desc->($self) if ref($cmd_desc) eq 'CODE';
+	$cmd_desc =~ s/default: /default:\xa0/;
+	my @opt_desc;
+	my $lpad = 2;
+	for my $sw (grep { !ref } @info) { # ("prio=s", "z", $GLP_PASS)
+		my $desc = $OPTDESC->{"$cmd\t$sw"} // $OPTDESC->{$sw} // next;
+		my $arg_vals = '';
+		($arg_vals, $desc) = @$desc if ref($desc) eq 'ARRAY';
+
+		# lower-case is a keyword (e.g. `content', `oid'),
+		# ALL_CAPS is a string description (e.g. `PATH')
+		if ($desc !~ /default/ && $arg_vals =~ /\b([a-z]+)[,\|]/) {
+			$desc .= " (default:\xa0`$1')";
+		} else {
+			$desc =~ s/default: /default:\xa0/;
+		}
+		my (@vals, @s, @l);
+		my $x = $sw;
+		if ($x =~ s/!\z//) { # solve! => --no-solve
+			$x =~ s/(\A|\|)/$1no-/g
+		} elsif ($x =~ s/\+\z//) { # verbose|v+
+		} elsif ($x =~ s/:.+//) { # optional args: $x = "mid:s"
+			@vals = (' [', undef, ']');
+		} elsif ($x =~ s/=.+//) { # required arg: $x = "type=s"
+			@vals = (' ', undef);
+		} # else: no args $x = 'thread|t'
+
+		# we support underscore options from public-inbox-* commands;
+		# but they've never been documented and will likely go away.
+		# $x = help|h
+		for (grep { !/_/ && !$NOHELP{$_} } split(/\|/, $x)) {
+			length($_) > 1 ? push(@l, "--$_") : push(@s, "-$_");
+		}
+		if (!scalar(@vals)) { # no args 'thread|t'
+		} elsif ($arg_vals =~ s/\A([A-Z_]+)\b//) { # "NAME"
+			$vals[1] = $1;
+		} else {
+			$vals[1] = uc(substr($l[0], 2)); # "--type" => "TYPE"
+		}
+		if ($arg_vals =~ /([,\|])/) {
+			my $sep = $1;
+			my @allow = split(/\Q$sep\E/, $arg_vals);
+			my $must = $sep eq '|' ? 'Must' : 'Can';
+			@allow = map { length $_ ? "`$_'" : () } @allow;
+			my $last = pop @allow;
+			$desc .= "\n$must be one of: " .
+				join(', ', @allow) . " or $last";
+		}
+		my $lhs = join(', ', @s, @l) . join('', @vals);
+		if ($x =~ /\|\z/) { # "stdin|" or "clear|"
+			$lhs =~ s/\A--/- , --/;
+		} else {
+			$lhs =~ s/\A--/    --/; # pad if no short options
+		}
+		$lpad = length($lhs) if length($lhs) > $lpad;
+		push @opt_desc, $lhs, $desc;
+	}
+	my $msg = $errmsg ? "E: $errmsg\n" : '';
+	$msg .= <<EOF;
+usage: lei @top
+$cmd_desc
+
+EOF
+	$lpad += 2;
+	local $Text::Wrap::columns = 78 - $lpad;
+	# TODO: set $Text::Wrap::break to never break on nbsp (\xa0)
+	my $padding = ' ' x ($lpad + 2);
+	while (my ($lhs, $rhs) = splice(@opt_desc, 0, 2)) {
+		$msg .= '  '.pack("A$lpad", $lhs);
+		$rhs = wrap('', '', $rhs);
+		$rhs =~ s/\n/\n$padding/sg; # LHS pad continuation lines
+		$msg .= $rhs;
+		$msg .= "\n";
+	}
+	my $fd = $errmsg ? 2 : 1;
+	$self->start_pager if -t $self->{$fd};
+	$msg =~ s/\xa0/ /gs; # convert NBSP to SP
+	print { $self->{$fd} } $msg;
+	$self->x_it($errmsg ? (1 << 8) : 0); # stderr => failure
+	undef;
+}
+
+1;
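
The non-breaking space trick used by LeiHelp can be tried in isolation.
The following is a sketch under stated assumptions, not the code from this
patch: wrap_nbsp is a hypothetical name, and it sets $Text::Wrap::break
explicitly to ASCII SP (which the patch leaves commented out) so the result
does not depend on whether \s matches \xa0 in a given Perl.

```perl
use strict;
use v5.10.1;
use Text::Wrap qw(wrap);

# Hypothetical helper: wrap $text to fit within $cols columns while
# keeping phrases joined by NBSP (\xa0) together on one line.
sub wrap_nbsp {
	my ($text, $cols) = @_;
	local $Text::Wrap::columns = $cols;
	local $Text::Wrap::unexpand = 0; # never turn spaces into tabs
	# assumption: break only on ASCII SP so \xa0 is never a break point
	local $Text::Wrap::break = '[ ]';
	my $out = wrap('', '', $text);
	$out =~ tr/\xa0/ /; # convert NBSP to SP only after wrapping
	$out;
}

my $desc = "MUA to run on --output Maildir or mbox " .
	"(e.g.\xa0`mutt\xa0-f\xa0%f')";
print wrap_nbsp($desc, 30), "\n";
```

The example command stays on a single output line because the wrapper
never sees a plain space inside it; the NBSP bytes only become spaces once
the line breaks are fixed.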

^ permalink raw reply related	[relevance 26%]

* Re: lei-q doc thoughts... [was: doc: start manpages for lei commands]
  2021-02-06  9:01 90%   ` lei-q doc thoughts... [was: doc: start manpages for lei commands] Eric Wong
@ 2021-02-06 19:57 90%     ` Kyle Meyer
  2021-02-07  3:33 90%       ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Kyle Meyer @ 2021-02-06 19:57 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Kyle Meyer <kyle@kyleam.com> wrote:
>> +=item --mua-cmd=COMMAND, --mua=COMMAND
>
> On second thought:  is the long "--mua-cmd" even worth having or
> supporting given "--mua=" exists?  I will likely remove it from
> the documentation and filter it out from the help text.
>
> Technically "mua-cmd" is more descriptive since it's a command
> with a %f placeholder, but I can't imagine anybody wanting to
> type "--mua-cmd" over "--mua".

No, I can't either.  Dropping --mua-cmd makes sense.

>> +=item -t, --thread
>> +
>> +Return all messages in the same thread as the actual match(es).
>
> Heh, it turns out mairix uses "--threads" (plural).  I never
> knew that since I always used "-t".  Not sure if it's worth
> pluralizing on our end...

I'd vote for following mairix here.  I guess most people will use -t,
but it's just one less thing to get tripped up on.

By the way, if you'd like, I'd be happy to do a round (or more) of lei
manpage updates for new additions whenever you think things are in a
good spot for it.

^ permalink raw reply	[relevance 90%]

* Re: lei-q doc thoughts... [was: doc: start manpages for lei commands]
  2021-02-06 19:57 90%     ` Kyle Meyer
@ 2021-02-07  3:33 90%       ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2021-02-07  3:33 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> 
> No, I can't either.  Dropping --mua-cmd makes sense.

> Eric Wong writes:
> > Heh, it turns out mairix uses "--threads" (plural).  I never
> > knew that since I always used "-t".  Not sure if it's worth
> > pluralizing on our end...
> 
> I'd vote for following mairix here.  I guess most people will use -t,
> but it's just one less thing to get tripped up on.

Thanks, I'll queue up necessary patches for those.

> By the way, if you'd like, I'd be happy to do a round (or more) of lei
> manpage updates for new additions whenever you think things are in a
> good spot for it.

Yes, much appreciated, thanks in advance :>

Anything covered by tests should be considered
ready-for-documentation, I think.  That can help us flush out
any strange/unexpected behaviors before it's finalized in a
release.

Fwiw, I'm likely to apply+push any doc-only patches as quickly as
I see them.

^ permalink raw reply	[relevance 90%]

2020-12-15 11:47 63% [PATCH/RFC 0/7] lei - Local Email Interface skeleton Eric Wong
2020-12-15 11:47 28% ` [RFC 3/7] lei: FD-passing and IPC basics Eric Wong
2020-12-15 11:47 43% ` [RFC 4/7] lei: proposed command-listing and options Eric Wong
2020-12-26 11:26 71%   ` "extinbox" term - was: [RFC 4/7] lei: proposed command-listing Eric Wong
2020-12-28 15:29 71%     ` Kyle Meyer
2020-12-28 21:55 71%       ` Eric Wong
2020-12-29  3:01 71%         ` Kyle Meyer
2020-12-15 11:47 61% ` [RFC 7/7] lei: use spawn (vfork + execve) for lazy start Eric Wong
2020-12-15 12:05     more considerations in UI/UX Eric Wong
2020-12-23  5:42     ` Kyle Meyer
2020-12-26 11:13 55%   ` [RFC] lei: rename proposed "query" command to "q", add JSON output Eric Wong
2020-12-18 12:09 55% [PATCH 00/26] lei: basic UI + IPC work Eric Wong
2020-12-18 12:09 28% ` [PATCH 01/26] lei: FD-passing and IPC basics Eric Wong
2020-12-18 12:09 45% ` [PATCH 02/26] lei: proposed command-listing and options Eric Wong
2020-12-18 12:09 61% ` [PATCH 05/26] lei: use spawn (vfork + execve) for lazy start Eric Wong
2020-12-18 12:09 20% ` [PATCH 06/26] lei: refine help/option parsing, implement "init" Eric Wong
2020-12-18 12:09 49% ` [PATCH 07/26] t/lei-oneshot: standalone oneshot (non-socket) test Eric Wong
2020-12-18 12:09 68% ` [PATCH 08/26] lei: ensure we run a restrictive umask Eric Wong
2020-12-18 12:09 44% ` [PATCH 09/26] lei: support `daemon-env' for modifying long-lived env Eric Wong
2020-12-18 12:09 60% ` [PATCH 12/26] rename LeiDaemon package to PublicInbox::LEI Eric Wong
2020-12-18 12:09 66% ` [PATCH 13/26] lei: support pass-through for `lei config' Eric Wong
2020-12-18 12:09 47% ` [PATCH 14/26] lei: help: show actual paths being operated on Eric Wong
2020-12-18 12:09 37% ` [PATCH 15/26] lei: rename $client => $self and bless Eric Wong
2020-12-18 12:09 50% ` [PATCH 16/26] lei: micro-optimize startup time Eric Wong
2020-12-18 12:09 64% ` [PATCH 20/26] lei: restore default __DIE__ handler for event loop Eric Wong
2020-12-18 12:09 38% ` [PATCH 21/26] lei: drop $SIG{__DIE__}, add oneshot fallbacks Eric Wong
2020-12-18 12:09 53% ` [PATCH 22/26] lei: start working on bash completion Eric Wong
2020-12-18 12:09 67% ` [PATCH 23/26] build: add lei.sh + "make symlink-install" target Eric Wong
2020-12-18 12:09 42% ` [PATCH 24/26] lei: support for -$DIGIT and -$SIG CLI switches Eric Wong
2020-12-18 12:09 64% ` [PATCH 25/26] lei: revise output routines Eric Wong
2020-12-18 12:09 42% ` [PATCH 26/26] lei: extinbox: start implementing in config file Eric Wong
2020-12-18 20:23 71%   ` Eric Wong
2020-12-31 13:51 51% [PATCH 00/36] another round of lei stuff Eric Wong
2020-12-31 13:51 37% ` [PATCH 10/36] lei: implement various deduplication strategies Eric Wong
2020-12-31 13:51 44% ` [PATCH 17/36] lei: rename "extinbox" => "external" Eric Wong
2020-12-31 13:51 71% ` [PATCH 23/36] lei: add --mfolder as an --output alias Eric Wong
2020-12-31 13:51 56% ` [PATCH 33/36] lei: avoid Spawn package when starting daemon Eric Wong
2021-01-01  5:47     [PATCH 0/4] TEST_RUN_MODE=0 fixes Eric Wong
2021-01-01  5:47 57% ` [PATCH 2/4] t/lei: fix TEST_RUN_MODE=0, simplify oneshot fallback Eric Wong
2021-01-03  9:48 70% [PATCH 0/3] lei-related test fixes Eric Wong
2021-01-03  9:48 60% ` [PATCH 1/3] t/lei: use $lei->() callback wrapper Eric Wong
2021-01-03 11:24     [PATCH 0/2] fix race from stdout buffering in FD pass exit Eric Wong
2021-01-03 11:24 70% ` [PATCH 2/2] lei: fix output race in client/daemon mode Eric Wong
2021-01-03 20:58 52% [PATCH] lei: prefer IO::FDPass over our Inline::C recv_3fds Eric Wong
2021-01-04  4:16 71% [PATCH 0/2] lei: some usage bits Eric Wong
2021-01-04  4:16 67% ` [PATCH 1/2] lei: fix opt_dash to pass non-dash args to @argv Eric Wong
2021-01-04  4:16 71% ` [PATCH 2/2] lei: improve idempotent "init" error message Eric Wong
2021-01-05  9:04 71% [PATCH 0/4] more lei usability stuff Eric Wong
2021-01-05  9:04 71% ` [PATCH 1/4] lei: completion: fix filename completion Eric Wong
2021-01-05  9:04 65% ` [PATCH 2/4] lei: automatic pager support Eric Wong
2021-01-05  9:04 51% ` [PATCH 3/4] lei: use client env as-is, drop daemon-env command Eric Wong
2021-01-05  9:04 50% ` [PATCH 4/4] address: pairs: new helper for JMAP (and maybe lei) Eric Wong
2021-01-05  9:24 70%   ` JSON pretty-printing [was: [4/4] ... (and maybe lei)] Eric Wong
2021-01-10 12:14 57% [PATCH 00/22] lei query overview views Eric Wong
2021-01-10 12:14 29% ` [PATCH 01/22] lei query + pagination sorta working Eric Wong
2021-01-10 12:14 50% ` [PATCH 02/22] lei q: deduplicate smsg Eric Wong
2021-01-10 12:15 71% ` [PATCH 12/22] lei: rename $w to $wpager for warning message Eric Wong
2021-01-10 12:15 71% ` [PATCH 13/22] lei: fix oneshot TTY detection by passing STD*{GLOB} Eric Wong
2021-01-10 12:15 33% ` [PATCH 14/22] lei: query: ensure pager exit is instantaneous Eric Wong
2021-01-10 12:15 62% ` [PATCH 18/22] lei: get rid of client {pid} field Eric Wong
2021-01-10 12:15 41% ` [PATCH 19/22] lei: fork + FD cleanup Eric Wong
2021-01-10 12:15 52% ` [PATCH 20/22] lei: run pager in client script Eric Wong
2021-01-10 12:15 33% ` [PATCH 22/22] lei: query: restore JSON output overview Eric Wong
2021-01-14  7:06 64% [PATCH 00/14] lei: another pile of changes Eric Wong
2021-01-14  7:06 23% ` [PATCH 02/14] lei: test SIGPIPE, stop xsearch workers on client abort Eric Wong
2021-01-14  7:06 69% ` [PATCH 04/14] lei: do not unlink socket path at exit Eric Wong
2021-01-14  7:06 68% ` [PATCH 05/14] lei: reduce live FD references in wq child Eric Wong
2021-01-14  7:06 66% ` [PATCH 06/14] lei: rely on localized $current_lei for warnings Eric Wong
2021-01-14  7:06 62% ` [PATCH 08/14] lei q: reinstate smsg dedupe Eric Wong
2021-01-14  7:06 49% ` [PATCH 11/14] lei: q: lock stdout on overview output Eric Wong
2021-01-15  0:18 71%   ` Eric Wong
2021-01-14  7:06 71% ` [PATCH 13/14] lei: remove temporary var on open Eric Wong
2021-01-14  7:06 48% ` [PATCH 14/14] lei: pass FD to CWD via cmsg, use fchdir on server Eric Wong
2021-01-16 11:36 71% [PATCH 0/4] lei q: outputs to Maildir and mbox* working Eric Wong
2021-01-16 11:36 22% ` [PATCH 3/4] lei: q: results output " Eric Wong
2021-01-16 11:36 71% ` [PATCH 4/4] lei: pager: pass correct env in oneshot mode Eric Wong
2021-01-17  8:52 50% [PATCH] lei q: add --mua-cmd switch Eric Wong
2021-01-17 10:19 71% ` Eric Wong
2021-01-17 10:28 71%   ` Eric Wong
2021-01-18 10:30 71% [PATCH 0/2] lei q: write faster, mutt does less work Eric Wong
2021-01-18 10:30 35% ` [PATCH 1/2] lei q: parallelize Maildir and mbox writing Eric Wong
2021-01-18 21:19 71%   ` Eric Wong
2021-01-19  9:34 71% [PATCH 0/9] lei bugfixes and error handling Eric Wong
2021-01-19  9:34 47% ` [PATCH 1/9] lei q: start ->mset while query_prepare runs Eric Wong
2021-01-19  9:34 48% ` [PATCH 2/9] lei q: fix SIGPIPE handling from lei2mail workers Eric Wong
2021-01-19  9:34 69% ` [PATCH 3/9] lei q: do not spawn MUA early Eric Wong
2021-01-19  9:34 71% ` [PATCH 4/9] lei: write daemon errors to the sock directory Eric Wong
2021-01-19  9:34 41% ` [PATCH 5/9] lei q: fix augment of compressed mailboxes Eric Wong
2021-01-19  9:34 71% ` [PATCH 6/9] lei_overview: do not write if $lei->{1} is gone Eric Wong
2021-01-19  9:34 69% ` [PATCH 7/9] t/lei: fix double-running of socket test with oneshot Eric Wong
2021-01-19  9:34 58% ` [PATCH 8/9] lei: test some likely errors due to misuse Eric Wong
2021-01-20  5:04 71% ` [PATCH 0/7] lei: fixes piled higher and deeper Eric Wong
2021-01-20  5:04 63% ` [PATCH 1/7] lei: allow more mbox inode types Eric Wong
2021-01-20  5:04 71% ` [PATCH 2/7] lei: exit code in oneshot mode Eric Wong
2021-01-20  5:04 67% ` [PATCH 4/7] lei q: cleanup store initialization Eric Wong
2021-01-20  5:04 53% ` [PATCH 5/7] lei: dump and clear errors.log in daemon mode Eric Wong
2021-01-21 19:46 70% [PATCH 00/12] lei: another dump Eric Wong
2021-01-21 19:46 52% ` [PATCH 02/12] lei q: retrieve keywords for local, non-external messages Eric Wong
2021-01-21 19:46 31% ` [PATCH 04/12] lei: show {pct} and {oid} in From_ lines and filenames Eric Wong
2021-01-21 19:46 45% ` [PATCH 05/12] lei: fix inadvertant FD sharing Eric Wong
2021-01-21 19:46 71% ` [PATCH 07/12] lei: oneshot: use client $io[2] for placeholder Eric Wong
2021-01-21 19:46 67% ` [PATCH 08/12] lei: remove INT/QUIT/TERM handlers, fix daemon EOF Eric Wong
2021-01-21 19:46 58% ` [PATCH 10/12] lei: remove @TO_CLOSE_ATFORK_CHILD Eric Wong
2021-01-21 19:46 43% ` [PATCH 11/12] lei: forget-external support with canonicalization Eric Wong
2021-01-21 19:46 69% ` [PATCH 12/12] lei forget-external: bash completion support Eric Wong
2021-01-23 10:27 71% [PATCH 00/10] lei: externals more stuff Eric Wong
2021-01-23 10:27 47% ` [PATCH 01/10] lei: move external vivification to xsearch Eric Wong
2021-01-23 10:27 35% ` [PATCH 02/10] lei: support remote externals Eric Wong
2021-01-24  6:01 62%   ` Kyle Meyer
2021-01-24 12:02 70%     ` Eric Wong
2021-01-24 12:12 71%       ` Eric Wong
2021-01-24 22:11 71%       ` Kyle Meyer
2021-01-25 18:37 71%         ` Eric Wong
2021-01-23 10:27 66% ` [PATCH 04/10] lei: oneshot: preserve stdout if writing mbox Eric Wong
2021-01-23 10:27 68% ` [PATCH 05/10] lei: default "-f $mfolder" args for common MUAs Eric Wong
2021-01-23 10:27 67% ` [PATCH 06/10] lei completion: handle URLs with port numbers Eric Wong
2021-01-23 10:27 71% ` [PATCH 07/10] lei forget-external: just show the location Eric Wong
2021-01-23 10:27 47% ` [PATCH 08/10] lei q: support a bunch of curl(1) options Eric Wong
2021-01-23 10:27 71% ` [PATCH 09/10] lei forget-external: don't show redundant "not found" Eric Wong
2021-01-23 10:27 71% ` [PATCH 10/10] lei add-external: don't allow non-existent directories Eric Wong
2021-01-24 11:46 71% [PATCH 0/9] lei remotes fixes and updates Eric Wong
2021-01-24 11:46 53% ` [PATCH 1/9] lei q: limit concurrency to 4 remote connections Eric Wong
2021-01-24 11:46 67% ` [PATCH 4/9] lei q: disable remote externals if locals exist Eric Wong
2021-01-24 11:46 65% ` [PATCH 5/9] lei q: honor --no-local to force remote searches Eric Wong
2021-01-24 12:31 71%   ` exit codes [was: [PATCH 5/9] lei q: honor --no-local to force remote searches] Eric Wong
2021-01-24 11:46 54% ` [PATCH 7/9] lei q: fix JSON overview with remote externals Eric Wong
2021-01-24 12:37 71%   ` Eric Wong
2021-01-25  1:18 71% [PATCH 0/5] lei: more fixes and usability enhancement Eric Wong
2021-01-25  1:18 55% ` [PATCH 1/5] lei: reinstate JSON smsg output deduplication Eric Wong
2021-01-25  1:18 69% ` [PATCH 2/5] lei q: drop "oid" output format Eric Wong
2021-01-25  1:18 55% ` [PATCH 3/5] lei q: demangle and quiet curl output Eric Wong
2021-01-25  1:18 65% ` [PATCH 4/5] lei q: reject remotes early if curl(1) is missing Eric Wong
2021-01-25  1:18 71% ` [PATCH 5/5] lei q: continue remote search if torsocks(1) " Eric Wong
2021-01-25  6:41     [PATCH 0/4] miscidx: lazy transactions to fix tests Eric Wong
2021-01-25  6:41 71% ` [PATCH 1/4] lei: use Time::HiRes stat for nanosecond resolution Eric Wong
2021-01-25  7:33 61% RFC: lei q --include/-I and similar switch names Eric Wong
2021-01-27  2:04 71% ` Kyle Meyer
2021-01-27  9:42 71% [PATCH 0/9] lei completion, some small updates Eric Wong
2021-01-27  9:42 71% ` [PATCH 2/9] lei: drop "git" command forwarding Eric Wong
2021-01-27  9:42 71% ` [PATCH 3/9] lei: fix comment regarding client payload Eric Wong
2021-01-27  9:42 52% ` [PATCH 4/9] lei: set PWD correctly for path expansion Eric Wong
2021-01-27  9:42 45% ` [PATCH 6/9] lei: complete option switch args Eric Wong
2021-01-27  9:42 71% ` [PATCH 9/9] lei: dclose: fix typo Eric Wong
2021-01-29  7:42 71% [PATCH 0/7] lei: more half-baked updates Eric Wong
2021-01-29  7:42 37% ` [PATCH 4/7] lei: less error-prone FD mapping Eric Wong
2021-02-01  5:57 65% [PATCH 0/2] doc: initial lei manpages Kyle Meyer
2021-02-01  5:57 27% ` [PATCH 1/2] doc: start manpages for lei commands Kyle Meyer
2021-02-06  9:01 90%   ` lei-q doc thoughts... [was: doc: start manpages for lei commands] Eric Wong
2021-02-06 19:57 90%     ` Kyle Meyer
2021-02-07  3:33 90%       ` Eric Wong
2021-02-01  5:57 55% ` [PATCH 2/2] doc: add lei-overview(7) Kyle Meyer
2021-02-01  6:40 71%   ` Eric Wong
2021-02-01 11:37 71%     ` Eric Wong
2021-02-01  8:28     [PATCH 00/21] lei2mail worker segfault finally fixed Eric Wong
2021-02-01  8:28 57% ` [PATCH 01/21] lei: more consistent dedupe and ovv_buf init Eric Wong
2021-02-01  8:28 71% ` [PATCH 03/21] lei: remove per-child SIG{__WARN__} Eric Wong
2021-02-01  8:28 28% ` [PATCH 04/21] lei: remove SIGPIPE handler Eric Wong
2021-02-01  8:28 66% ` [PATCH 06/21] lei: remove syslog dependency Eric Wong
2021-02-01  8:28 82% ` [PATCH 08/21] lei: keep $lei around until workers are reaped Eric Wong
2021-02-01  8:28 71% ` [PATCH 11/21] lei: deep clone {ovv} for l2m workers Eric Wong
2021-02-01  8:28 65% ` [PATCH 13/21] lei: increase initial timeout Eric Wong
2021-02-01  8:28 64% ` [PATCH 20/21] lei: avoid ETOOMANYREFS, cleanup imports Eric Wong
2021-02-02 10:09 71% can lei require Inline::C? Eric Wong
2021-02-03  0:02 71% ` Kyle Meyer
2021-02-02 11:46 66% [PATCH 00/16] lei: -I/--include and more Eric Wong
2021-02-02 11:46 42% ` [PATCH 01/16] lei: switch to use SEQPACKET socketpair instead of pipe Eric Wong
2021-02-02 11:46 32% ` [PATCH 03/16] lei q: emit progress and counting via PktOp Eric Wong
2021-02-02 11:46 51% ` [PATCH 04/16] lei q: support --only, --include and --exclude Eric Wong
2021-02-02 11:46 71% ` [PATCH 05/16] lei: complete: do not complete non-arg options w/ help text Eric Wong
2021-02-02 11:46 64% ` [PATCH 06/16] lei: q: shell completion for --(include|exclude|only) Eric Wong
2021-02-02 11:46 52% ` [PATCH 09/16] lei q: do not leave temporary files after oneshot exit Eric Wong
2021-02-02 11:46 71% ` [PATCH 13/16] doc: lei-q: note "-a" and link to Xapian QueryParser Eric Wong
2021-02-02 11:47 56% ` [PATCH 15/16] lei q: tidy up progress reporting Eric Wong
2021-02-02 11:47 48% ` [PATCH 16/16] lei q: support --jobs [SEARCHERS],[WRITERS] Eric Wong
2021-02-03  8:11 66% [PATCH 00/11] lei q --stdin, shortcut names, etc Eric Wong
2021-02-03  8:11 67% ` [PATCH 01/11] lei: reduce FD pressure from lei2mail worker Eric Wong
2021-02-03  8:11 71% ` [PATCH 02/11] lei: further reduce lei2mail FD pressure Eric Wong
2021-02-03  8:11 71% ` [PATCH 04/11] lei: err: avoid uninitialized variable warnings Eric Wong
2021-02-03  8:11 41% ` [PATCH 05/11] lei: propagate curl errors, improve internal consistency Eric Wong
2021-02-03  8:11 54% ` [PATCH 06/11] lei q: -I/--exclude/--only support globs and basenames Eric Wong
2021-02-03  8:11 71% ` [PATCH 07/11] lei: complete basenames for include|exclude|only Eric Wong
2021-02-03  8:11 71% ` [PATCH 08/11] lei: help starts pager Eric Wong
2021-02-03  8:11 56% ` [PATCH 09/11] lei add-external: completion for existing URL basenames Eric Wong
2021-02-03  8:11 71% ` [PATCH 10/11] lei: use sleep(1) loop for infinite sleep Eric Wong
2021-02-03  8:11 43% ` [PATCH 11/11] lei q: support reading queries from stdin Eric Wong
2021-02-04  2:10 83% [PATCH] t/lei: skip "lei q" tests on missing dependencies Eric Wong
2021-02-04  9:59 68% [PATCH 00/10] lei: cleanups + initial import support Eric Wong
2021-02-04  9:59 65% ` [PATCH 01/10] lei q: delay worker spawn Eric Wong
2021-02-04  9:59 29% ` [PATCH 03/10] lei q: reorder internals to reduce FD passing Eric Wong
2021-02-04  9:59 71% ` [PATCH 04/10] lei q: only start pager if output is to stdout Eric Wong
2021-02-04  9:59 61% ` [PATCH 05/10] lei q: reinstate early MUA spawn for Maildir Eric Wong
2021-02-04  9:59 51% ` [PATCH 06/10] eml: handle warning ignores for lei Eric Wong
2021-02-04  9:59 56% ` [PATCH 07/10] lei q: eliminate $not_done temporary git dir hack Eric Wong
2021-02-04  9:59 36% ` [PATCH 10/10] lei import: initial implementation Eric Wong
2021-02-06 12:18 60% [PATCH 00/17] lei: more random updates Eric Wong
2021-02-06 12:18 55% ` [PATCH 02/17] lei: favor "keywords" over "flags", test --no-kw Eric Wong
2021-02-06 12:18 63% ` [PATCH 03/17] lei: fix completion of --no-kw / --no-keywords Eric Wong
2021-02-06 12:18 67% ` [PATCH 04/17] lei: abort lei_import worker on client abort Eric Wong
2021-02-06 12:18 41% ` [PATCH 07/17] tests: add test_lei wrapper, split out t/lei-import.t Eric Wong
2021-02-06 12:18 24% ` [PATCH 08/17] t/lei-externals: split out into separate test Eric Wong
2021-02-06 12:18 46% ` [PATCH 10/17] tests: split out lei-daemon.t from lei.t Eric Wong
2021-02-06 12:18 66% ` [PATCH 12/17] script/lei: avoid waitpid(-1, ...) to keep tests fast Eric Wong
2021-02-06 12:18 21% ` [PATCH 13/17] lei: add-external --mirror support Eric Wong
2021-02-06 12:18 26% ` [PATCH 14/17] lei help: split out into separate file Eric Wong
2021-02-06 12:18 66% ` [PATCH 15/17] lei add-external: reject index and remote opts w/o mirror Eric Wong
2021-02-06 12:18 64% ` [PATCH 17/17] lei: remove short switch support for curl(1) options Eric Wong
