unofficial mirror of meta@public-inbox.org
* Re: Sharing lei searches
  2024-04-18 10:26 71% ` Eric Wong
@ 2024-04-18 15:17 71%   ` Gonsolo
  0 siblings, 0 replies; 200+ results
From: Gonsolo @ 2024-04-18 15:17 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Hi!

> I do the above for some simple searches.

Ok.

> > Is there a better way?

> You can also make `lei up --all' run as cronjob on a server
> and output to IMAP folders; then use IMAP normally w/o needing
> lei on the other machines.

Ok.

I thought about something like lei export-searches/import-searches but
I guess I will never suffer enough to try to implement it.

Thanks for the help!

-- 
g

^ permalink raw reply	[relevance 71%]

* Re: Sharing lei searches
  2024-04-18  6:24 71% Sharing lei searches Gonsolo
@ 2024-04-18 10:26 71% ` Eric Wong
  2024-04-18 15:17 71%   ` Gonsolo
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-04-18 10:26 UTC (permalink / raw)
  To: Gonsolo; +Cc: meta

Gonsolo <gonsolo@gmail.com> wrote:
> Hi!
> 
> Is there an easy way to share a lei configuration for a different computer?
> Right now I'm relying on the following (clumsy) workflow:
> 
> 1. lei edit-search on computer A, copy and mail to myself
> 2. Dummy lei q on computer B.
> 3. lei edit-search on computer B, use email from 1.
> 4. Do this for all searches in lei ls-search

I do the above for some simple searches.

> Is there a better way?

You can also make `lei up --all' run as cronjob on a server
and output to IMAP folders; then use IMAP normally w/o needing
lei on the other machines.

^ permalink raw reply	[relevance 71%]

* Sharing lei searches
@ 2024-04-18  6:24 71% Gonsolo
  2024-04-18 10:26 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Gonsolo @ 2024-04-18  6:24 UTC (permalink / raw)
  To: meta

Hi!

Is there an easy way to share a lei configuration for a different computer?
Right now I'm relying on the following (clumsy) workflow:

1. lei edit-search on computer A, copy and mail to myself
2. Dummy lei q on computer B.
3. lei edit-search on computer B, use email from 1.
4. Do this for all searches in lei ls-search

Is there a better way?

Thanks
-- 
g

^ permalink raw reply	[relevance 71%]

* [PATCH v2 3/4] lei/store: stop shard workers + cat-file on idle
  2024-04-16 20:56 63% ` [PATCH 3/4] lei/store: stop shard workers + cat-file on idle Eric Wong
@ 2024-04-17  9:34 60%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-17  9:34 UTC (permalink / raw)
  To: meta

Schedule a timer to stop shard workers and the git-cat-file
process after a `barrier' command.  This allows us to save some
memory again when the lei-daemon is idle but preserves the fork
overhead reduction when issuing many commands in parallel or in
quick succession.
---
  v2 fixes an incorrect call to add_uniq_timer.  Sometimes I wish Perl
  could have more static type||arg checking, but it's probably still
  better than other scripting languages...
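
  For illustration only (not part of the patch): a minimal sketch of the
  kind of runtime argument check that would have caught the v1 call.
  It assumes a timer API shaped like ($name, $secs, $cb, @args);
  checked_add_timer and %timers are hypothetical names, and this is not
  the real add_uniq_timer.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for a timer API taking ($name, $secs, $cb, @args);
# it only records the request and never fires anything.
my %timers;
sub checked_add_timer {
	my ($name, $secs, $cb, @args) = @_;
	looks_like_number($secs) or
		die "BUG: seconds must be numeric, got ", ($secs // 'undef'), "\n";
	ref($cb) eq 'CODE' or die "BUG: callback must be a CODE ref\n";
	$timers{$name} //= [ $secs, $cb, @args ];
}

sub check_done { 'done' }

# v1-style call: the delay is missing, so the coderef lands in $secs
my $bad = eval { checked_add_timer('x-check_done', \&check_done, {}); 1 };
print $bad ? "accepted\n" : "rejected: $@";

# v2-style call: the 5 second delay is passed explicitly
my $ok = eval { checked_add_timer('x-check_done', 5, \&check_done, {}); 1 };
print $ok ? "accepted\n" : "rejected: $@";
```

  The check costs a little on every call, so it is probably better suited
  to a debugging build than to a hot path.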

Interdiff against v1:
  diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
  index a054f649..b2da2bc3 100644
  --- a/lib/PublicInbox/LeiStore.pm
  +++ b/lib/PublicInbox/LeiStore.pm
  @@ -574,7 +574,7 @@ sub set_xvmd {
   sub check_done {
   	my ($self) = @_;
   	$self->git->_active ?
  -		add_uniq_timer("$self-check_done", \&check_done, $self) :
  +		add_uniq_timer("$self-check_done", 5, \&check_done, $self) :
   		done($self);
   }
   

 lib/PublicInbox/LeiStore.pm | 46 ++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 162c915f..b2da2bc3 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -571,21 +571,11 @@ sub set_xvmd {
 	sto_export_kw($self, $smsg->{num}, $vmd);
 }
 
-sub barrier {
+sub check_done {
 	my ($self) = @_;
-	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_barrier_request
-	my @err;
-	if ($self->{im}) {
-		eval { $self->{im}->barrier };
-		push(@err, "E: import barrier: $@\n") if $@;
-	}
-	delete $self->{lms};
-	eval { $self->{priv_eidx}->barrier };
-	push(@err, "E: priv_eidx barrier: $@\n") if $@;
-	print { $errfh // \*STDERR } @err;
-	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
-	xchg_stderr($self);
-	die @err if @err;
+	$self->git->_active ?
+		add_uniq_timer("$self-check_done", 5, \&check_done, $self) :
+		done($self);
 }
 
 sub xchg_stderr {
@@ -602,23 +592,33 @@ sub xchg_stderr {
 	undef;
 }
 
-sub done {
-	my ($self) = @_;
-	my ($errfh, $lei_sock) = @$self{0, 1};
+sub _commit ($$) {
+	my ($self, $cmd) = @_; # cmd is 'done' or 'barrier'
+	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_barrier_request
 	my @err;
-	if (my $im = delete($self->{im})) {
-		eval { $im->done };
-		push(@err, "E: import done: $@\n") if $@;
+	if ($self->{im}) {
+		eval { $self->{im}->$cmd };
+		push(@err, "E: import $cmd: $@\n") if $@;
 	}
 	delete $self->{lms};
-	eval { $self->{priv_eidx}->done }; # V2Writable::done
-	push(@err, "E: priv_eidx done: $@\n") if $@;
-	print { $errfh // *STDERR{GLOB} } @err;
+	eval { $self->{priv_eidx}->$cmd };
+	push(@err, "E: priv_eidx $cmd: $@\n") if $@;
+	print { $errfh // \*STDERR } @err;
 	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
 	xchg_stderr($self);
 	die @err if @err;
+	# $lei_sock goes out-of-scope and script/lei can terminate
+}
+
+sub barrier {
+	my ($self) = @_;
+	_commit $self, 'barrier';
+	add_uniq_timer("$self-check_done", 5, \&check_done, $self);
+	undef;
 }
 
+sub done { _commit $_[0], 'done' }
+
 sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};

^ permalink raw reply related	[relevance 60%]

* [PATCH 1/4] v2 + lei/store: always wait for fast-import checkpoint
  2024-04-16 20:56 71% [PATCH 0/4] lei parallelism fixes Eric Wong
@ 2024-04-16 20:56 71% ` Eric Wong
  2024-04-16 20:56 65% ` [PATCH 2/4] lei: use ->barrier to commit to lei/store Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-16 20:56 UTC (permalink / raw)
  To: meta

Since data going to git is the most important, always ensure
data is written to git before attempting to write anything to
SQLite or Xapian.
---
 lib/PublicInbox/LeiStore.pm   | 4 +---
 lib/PublicInbox/V2Writable.pm | 8 +-------
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 2eb09eca..0df2352c 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -573,9 +573,7 @@ sub set_xvmd {
 
 sub checkpoint {
 	my ($self, $wait) = @_;
-	if (my $im = $self->{im}) {
-		$wait ? $im->barrier : $im->checkpoint;
-	}
+	$self->{im}->barrier if $self->{im};
 	delete $self->{lms};
 	$self->{priv_eidx}->checkpoint($wait);
 }
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index fb259396..43f37f60 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -507,13 +507,7 @@ sub set_last_commits ($) { # this is NOT for ExtSearchIdx
 sub checkpoint ($;$) {
 	my ($self, $wait) = @_;
 
-	if (my $im = $self->{im}) {
-		if ($wait) {
-			$im->barrier;
-		} else {
-			$im->checkpoint;
-		}
-	}
+	$self->{im}->barrier if $self->{im};
 	my $shards = $self->{idx_shards};
 	if ($shards) {
 		my $dbh = $self->{mm}->{dbh} if $self->{mm};

^ permalink raw reply related	[relevance 71%]

* [PATCH 2/4] lei: use ->barrier to commit to lei/store
  2024-04-16 20:56 71% [PATCH 0/4] lei parallelism fixes Eric Wong
  2024-04-16 20:56 71% ` [PATCH 1/4] v2 + lei/store: always wait for fast-import checkpoint Eric Wong
@ 2024-04-16 20:56 65% ` Eric Wong
  2024-04-16 20:56 63% ` [PATCH 3/4] lei/store: stop shard workers + cat-file on idle Eric Wong
  2024-04-16 20:56 53% ` [PATCH 4/4] lei: use async barrier for --import-before Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-16 20:56 UTC (permalink / raw)
  To: meta

barrier (synchronous checkpoint) is better than ->done with
parallel lei commands being issued (via '&' or different
terminals), since repeatedly stopping and restarting processes
doesn't play nicely with expensive tasks like `lei reindex'.

This introduces a slight regression in maintaining more
processes (and thus resource use) when lei is idle, but that'll
be fixed in the next commit.
---
 lib/PublicInbox/ExtSearchIdx.pm       |  1 +
 lib/PublicInbox/LEI.pm                |  6 +++---
 lib/PublicInbox/LeiInput.pm           |  2 +-
 lib/PublicInbox/LeiRefreshMailSync.pm |  2 +-
 lib/PublicInbox/LeiRemote.pm          |  4 ++--
 lib/PublicInbox/LeiStore.pm           | 26 ++++++++++++++++++--------
 lib/PublicInbox/LeiToMail.pm          |  3 ++-
 lib/PublicInbox/LeiXSearch.pm         |  4 ++--
 t/lei-store-fail.t                    |  2 +-
 9 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index ebbffffc..763a124c 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -1424,5 +1424,6 @@ no warnings 'once';
 *idx_shard = \&PublicInbox::V2Writable::idx_shard;
 *reindex_checkpoint = \&PublicInbox::V2Writable::reindex_checkpoint;
 *checkpoint = \&PublicInbox::V2Writable::checkpoint;
+*barrier = \&PublicInbox::V2Writable::barrier;
 
 1;
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 5b46686a..e9a0de6c 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1443,7 +1443,7 @@ sub wq_eof { # EOF callback for main daemon
 	my ($lei, $wq_fld) = @_;
 	local $current_lei = $lei;
 	my $wq = delete $lei->{$wq_fld // 'wq1'};
-	$lei->sto_done_request($wq);
+	$lei->sto_barrier_request($wq);
 	$wq // $lei->fail; # already failed
 }
 
@@ -1548,7 +1548,7 @@ sub lms {
 	(-f $f || $creat) ? PublicInbox::LeiMailSync->new($f) : undef;
 }
 
-sub sto_done_request {
+sub sto_barrier_request {
 	my ($lei, $wq) = @_;
 	return unless $lei->{sto} && $lei->{sto}->{-wq_s1};
 	local $current_lei = $lei;
@@ -1558,7 +1558,7 @@ sub sto_done_request {
 		my $s = ($wq ? $wq->{lei_sock} : undef) // $lei->{sock};
 		my $errfh = $lei->{2} // *STDERR{GLOB};
 		my @io = $s ? ($errfh, $s) : ($errfh);
-		eval { $lei->{sto}->wq_io_do('done', \@io) };
+		eval { $lei->{sto}->wq_io_do('barrier', \@io, 1) };
 	}
 	warn($@) if $@;
 }
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index d003d983..c388f7dc 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -499,7 +499,7 @@ sub process_inputs {
 	}
 	# always commit first, even on error partial work is acceptable for
 	# lei <import|tag|convert>
-	$self->{lei}->sto_done_request;
+	$self->{lei}->sto_barrier_request;
 	$self->{lei}->fail($err) if $err;
 }
 
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index a60a9a5e..dde23274 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -60,7 +60,7 @@ sub input_path_url { # overrides PublicInbox::LeiInput::input_path_url
 			$self->folder_missing($$uri);
 		}
 	} else { die "BUG: $input not supported" }
-	$self->{lei}->sto_done_request;
+	$self->{lei}->sto_barrier_request;
 }
 
 sub lei_refresh_mail_sync {
diff --git a/lib/PublicInbox/LeiRemote.pm b/lib/PublicInbox/LeiRemote.pm
index ddcaf2c9..d6fc40a4 100644
--- a/lib/PublicInbox/LeiRemote.pm
+++ b/lib/PublicInbox/LeiRemote.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # Make remote externals HTTP(S) inboxes behave like
@@ -51,7 +51,7 @@ sub mset {
 	$fh = IO::Uncompress::Gunzip->new($fh, MultiStream=>1, AutoClose=>1);
 	eval { PublicInbox::MboxReader->mboxrd($fh, \&each_mboxrd_eml, $self) };
 	my $err = $@ ? ": $@" : '';
-	my $wait = $self->{lei}->{sto}->wq_do('done');
+	my $wait = $self->{lei}->{sto}->wq_do('barrier');
 	$lei->child_error($?, "@$cmd failed$err") if $err || $?;
 	$self; # we are the mset (and $ibx, and $self)
 }
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 0df2352c..162c915f 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -81,7 +81,7 @@ sub importer {
 		delete $self->{im};
 		$im->done;
 		undef $im;
-		$self->checkpoint;
+		$self->barrier;
 		$max = $self->{priv_eidx}->{mg}->git_epochs + 1;
 	}
 	my (undef, $tl) = eidx_init($self); # acquire lock
@@ -118,7 +118,7 @@ sub cat_blob {
 
 sub schedule_commit {
 	my ($self, $sec) = @_;
-	add_uniq_timer($self->{priv_eidx}->{topdir}, $sec, \&done, $self);
+	add_uniq_timer($self->{priv_eidx}->{topdir}, $sec, \&barrier, $self);
 }
 
 # follows the stderr file
@@ -391,7 +391,7 @@ sub reindex_done {
 	my ($self) = @_;
 	my ($eidx, $tl) = eidx_init($self);
 	$eidx->git->async_wait_all;
-	# ->done to be called via sto_done_request
+	# ->done to be called via sto_barrier_request
 }
 
 sub add_eml {
@@ -571,11 +571,21 @@ sub set_xvmd {
 	sto_export_kw($self, $smsg->{num}, $vmd);
 }
 
-sub checkpoint {
-	my ($self, $wait) = @_;
-	$self->{im}->barrier if $self->{im};
+sub barrier {
+	my ($self) = @_;
+	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_barrier_request
+	my @err;
+	if ($self->{im}) {
+		eval { $self->{im}->barrier };
+		push(@err, "E: import barrier: $@\n") if $@;
+	}
 	delete $self->{lms};
-	$self->{priv_eidx}->checkpoint($wait);
+	eval { $self->{priv_eidx}->barrier };
+	push(@err, "E: priv_eidx barrier: $@\n") if $@;
+	print { $errfh // \*STDERR } @err;
+	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
+	xchg_stderr($self);
+	die @err if @err;
 }
 
 sub xchg_stderr {
@@ -594,7 +604,7 @@ sub xchg_stderr {
 
 sub done {
 	my ($self) = @_;
-	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_done_request
+	my ($errfh, $lei_sock) = @$self{0, 1};
 	my @err;
 	if (my $im = delete($self->{im})) {
 		eval { $im->done };
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index dfae29e9..593547f6 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -724,8 +724,9 @@ sub post_augment {
 	my ($self, $lei, @args) = @_;
 	$self->{-au_noted}++ and $lei->qerr("# writing to $self->{dst} ...");
 
+	# FIXME: this synchronous wait can be slow w/ parallel callers
 	my $wait = $lei->{opt}->{'import-before'} ?
-			$lei->{sto}->wq_do('checkpoint', 1) : 0;
+			$lei->{sto}->wq_do('barrier') : 0;
 	# _post_augment_mbox
 	my $m = $self->can("_post_augment_$self->{base_type}") or return;
 	$m->($self, $lei, @args);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index d4f34733..5a5a1adc 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -363,7 +363,7 @@ print STDERR $_;
 						$self, $lei, $each_smsg);
 		};
 		my ($exc, $code) = ($@, $?);
-		$lei->sto_done_request if delete($self->{-sto_imported});
+		$lei->sto_barrier_request if delete($self->{-sto_imported});
 		die "E: $exc" if $exc && !$code;
 		my $nr = delete $lei->{-nr_remote_eml} // 0;
 		if (!$code) { # don't update if no results, maybe MTA is down
@@ -399,7 +399,7 @@ sub query_done { # EOF callback for main daemon
 	delete $lei->{lxs};
 	($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
 		warn "BUG: {sto} missing with --mail-sync";
-	$lei->sto_done_request;
+	$lei->sto_barrier_request;
 	$lei->{ovv}->ovv_end($lei);
 	if ($l2m) { # close() calls LeiToMail reap_compress
 		$l2m->finish_output($lei);
diff --git a/t/lei-store-fail.t b/t/lei-store-fail.t
index c2f03148..1e83e383 100644
--- a/t/lei-store-fail.t
+++ b/t/lei-store-fail.t
@@ -39,7 +39,7 @@ EOM
 	lei_ok qw(q m:testmessage@example.com);
 	is($lei_out, "[null]\n", 'delayed commit is unindexed');
 
-	# make immediate ->sto_done_request fail from mboxrd import:
+	# make immediate ->sto_barrier_request fail from mboxrd import:
 	remove_tree("$ENV{HOME}/.local/share/lei/store");
 	# subsequent lei commands are undefined behavior,
 	# but we need to make sure the current lei command fails:

^ permalink raw reply related	[relevance 65%]

* [PATCH 3/4] lei/store: stop shard workers + cat-file on idle
  2024-04-16 20:56 71% [PATCH 0/4] lei parallelism fixes Eric Wong
  2024-04-16 20:56 71% ` [PATCH 1/4] v2 + lei/store: always wait for fast-import checkpoint Eric Wong
  2024-04-16 20:56 65% ` [PATCH 2/4] lei: use ->barrier to commit to lei/store Eric Wong
@ 2024-04-16 20:56 63% ` Eric Wong
  2024-04-17  9:34 60%   ` [PATCH v2 " Eric Wong
  2024-04-16 20:56 53% ` [PATCH 4/4] lei: use async barrier for --import-before Eric Wong
  3 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-04-16 20:56 UTC (permalink / raw)
  To: meta

Schedule a timer to stop shard workers and the git-cat-file
process after a `barrier' command.  This allows us to save some
memory again when the lei-daemon is idle but preserves the fork
overhead reduction when issuing many commands in parallel or in
quick succession.
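
The idle behaviour described above can be sketched with a toy timer
queue and a fake clock.  Everything here is an assumption made for
illustration: the names, the worker count, and the rule that a pending
timer of the same name is kept as-is.  It is not the lei-daemon event
loop.

```perl
use strict;
use warnings;

# Toy timer queue keyed by name, driven by a fake clock ($now).
my (%uniq, $now);
$now = 0;
sub add_uniq_timer {
	my ($name, $secs, $cb, @args) = @_;
	# assumption: an already-pending timer of the same name is kept as-is
	$uniq{$name} //= [ $now + $secs, $cb, @args ];
}

sub run_until { # advance the fake clock and fire expired timers
	my ($t) = @_;
	$now = $t;
	for my $name (sort keys %uniq) {
		next if $uniq{$name}->[0] > $now;
		my ($exp, $cb, @args) = @{delete $uniq{$name}};
		$cb->(@args);
	}
}

sub done { $_[0]->{workers} = 0 } # stop workers to release memory

sub check_done {
	my ($sto) = @_;
	$sto->{active} ?
		add_uniq_timer("$sto-check_done", 5, \&check_done, $sto) :
		done($sto);
}

sub barrier { # commit, but keep workers alive for the next command
	my ($sto) = @_;
	$sto->{workers} ||= 4;
	add_uniq_timer("$sto-check_done", 5, \&check_done, $sto);
}

my $sto = { workers => 0, active => 0 };
barrier($sto);		# t=0: first command starts workers
run_until(3);
barrier($sto);		# t=3: second command reuses the running workers
print "t=3 workers=$sto->{workers}\n";
$sto->{active} = 1;
run_until(5);		# timer fires, but work is in flight: re-armed
print "t=5 workers=$sto->{workers}\n";
$sto->{active} = 0;
run_until(10);		# idle now, so workers are stopped
print "t=10 workers=$sto->{workers}\n";
```

Commands arriving in quick succession never pay the fork cost twice,
while an idle daemon gives the memory back after the delay.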
---
 lib/PublicInbox/LeiStore.pm | 46 ++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 162c915f..a054f649 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -571,21 +571,11 @@ sub set_xvmd {
 	sto_export_kw($self, $smsg->{num}, $vmd);
 }
 
-sub barrier {
+sub check_done {
 	my ($self) = @_;
-	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_barrier_request
-	my @err;
-	if ($self->{im}) {
-		eval { $self->{im}->barrier };
-		push(@err, "E: import barrier: $@\n") if $@;
-	}
-	delete $self->{lms};
-	eval { $self->{priv_eidx}->barrier };
-	push(@err, "E: priv_eidx barrier: $@\n") if $@;
-	print { $errfh // \*STDERR } @err;
-	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
-	xchg_stderr($self);
-	die @err if @err;
+	$self->git->_active ?
+		add_uniq_timer("$self-check_done", \&check_done, $self) :
+		done($self);
 }
 
 sub xchg_stderr {
@@ -602,23 +592,33 @@ sub xchg_stderr {
 	undef;
 }
 
-sub done {
-	my ($self) = @_;
-	my ($errfh, $lei_sock) = @$self{0, 1};
+sub _commit ($$) {
+	my ($self, $cmd) = @_; # cmd is 'done' or 'barrier'
+	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_barrier_request
 	my @err;
-	if (my $im = delete($self->{im})) {
-		eval { $im->done };
-		push(@err, "E: import done: $@\n") if $@;
+	if ($self->{im}) {
+		eval { $self->{im}->$cmd };
+		push(@err, "E: import $cmd: $@\n") if $@;
 	}
 	delete $self->{lms};
-	eval { $self->{priv_eidx}->done }; # V2Writable::done
-	push(@err, "E: priv_eidx done: $@\n") if $@;
-	print { $errfh // *STDERR{GLOB} } @err;
+	eval { $self->{priv_eidx}->$cmd };
+	push(@err, "E: priv_eidx $cmd: $@\n") if $@;
+	print { $errfh // \*STDERR } @err;
 	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
 	xchg_stderr($self);
 	die @err if @err;
+	# $lei_sock goes out-of-scope and script/lei can terminate
+}
+
+sub barrier {
+	my ($self) = @_;
+	_commit $self, 'barrier';
+	add_uniq_timer("$self-check_done", 5, \&check_done, $self);
+	undef;
 }
 
+sub done { _commit $_[0], 'done' }
+
 sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};

^ permalink raw reply related	[relevance 63%]

* [PATCH 4/4] lei: use async barrier for --import-before
  2024-04-16 20:56 71% [PATCH 0/4] lei parallelism fixes Eric Wong
                   ` (2 preceding siblings ...)
  2024-04-16 20:56 63% ` [PATCH 3/4] lei/store: stop shard workers + cat-file on idle Eric Wong
@ 2024-04-16 20:56 53% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-16 20:56 UTC (permalink / raw)
  To: meta

Write barriers can take a long time to finish, especially when
commands are issued in parallel.  So handle them asynchronously
without blocking lei-daemon; this requires making EOFpipe a little
more flexible by supporting arguments to the callback function.

This is another step towards improving parallel use of lei.
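
As a rough sketch of the "callback plus arguments on EOF" idea (not
the real EOFpipe, which is driven by the event loop instead of a
blocking read), assuming hypothetical names ToyEOFpipe, wait_eof and
after_barrier:

```perl
use strict;
use warnings;

# Hypothetical miniature: remember a callback and its arguments, then
# invoke them once the read end of a pipe reports EOF.
package ToyEOFpipe;
sub new {
	my ($class, $rd, @cb_args) = @_; # callback first, then its arguments
	bless { rd => $rd, cb_args => \@cb_args }, $class;
}

sub wait_eof { # block until every writer has closed its end
	my ($self) = @_;
	my $n = sysread($self->{rd}, my $buf, 1);
	defined($n) or die "sysread: $!";
	return if $n; # unexpected data, not EOF
	my ($cb, @args) = @{delete $self->{cb_args}};
	$cb->(@args);
}

package main;
my @log;
sub after_barrier { push @log, join(' ', @_) }

pipe(my $r, my $w) or die "pipe: $!";
my $ep = ToyEOFpipe->new($r, \&after_barrier, 'l2m', 'post_augment');
close($w) or die "close: $!"; # stands in for the worker finishing a barrier
$ep->wait_eof;
print "$log[0]\n";
```

Passing the arguments alongside the coderef avoids allocating a
closure for every caller.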
---
 lib/PublicInbox/EOFpipe.pm    |  7 ++++---
 lib/PublicInbox/LeiToMail.pm  | 29 ++++++++++++++++++++++-------
 lib/PublicInbox/LeiXSearch.pm | 13 +++++++++----
 3 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/EOFpipe.pm b/lib/PublicInbox/EOFpipe.pm
index 3474874f..77b699a2 100644
--- a/lib/PublicInbox/EOFpipe.pm
+++ b/lib/PublicInbox/EOFpipe.pm
@@ -7,8 +7,8 @@ use parent qw(PublicInbox::DS);
 use PublicInbox::Syscall qw(EPOLLIN EPOLLONESHOT $F_SETPIPE_SZ);
 
 sub new {
-	my (undef, $rd, $cb) = @_;
-	my $self = bless { cb => $cb }, __PACKAGE__;
+	my (undef, $rd, @cb_args) = @_;
+	my $self = bless { cb_args => \@cb_args }, __PACKAGE__;
 	# 4096: page size
 	fcntl($rd, $F_SETPIPE_SZ, 4096) if $F_SETPIPE_SZ;
 	$self->SUPER::new($rd, EPOLLIN|EPOLLONESHOT);
@@ -17,7 +17,8 @@ sub new {
 sub event_step {
 	my ($self) = @_;
 	if ($self->do_read(my $buf, 1) == 0) { # auto-closed
-		$self->{cb}->();
+		my ($cb, @args) = @{delete $self->{cb_args}};
+		$cb->(@args);
 	}
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 593547f6..5481b5e4 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -14,7 +14,7 @@ use PublicInbox::Import;
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
 use PublicInbox::Syscall qw(rename_noreplace);
-use autodie qw(open seek close);
+use autodie qw(pipe open seek close);
 use Carp qw(croak);
 
 my %kw2char = ( # Maildir characters
@@ -605,7 +605,7 @@ sub _pre_augment_mbox {
 			$lei->{dedupe} && $lei->{dedupe}->can('reset_dedupe');
 	}
 	if ($self->{zsfx} = PublicInbox::MboxReader::zsfx($dst)) {
-		pipe(my ($r, $w)) or die "pipe: $!";
+		pipe(my $r, my $w);
 		$lei->{zpipe} = [ $r, $w ];
 		$lei->{ovv}->{lock_path} and
 			die 'BUG: unexpected {ovv}->{lock_path}';
@@ -719,17 +719,32 @@ sub do_augment { # slow, runs in wq worker
 	$m->($self, $lei);
 }
 
+sub post_augment_call ($$$$) {
+	my ($self, $lei, $m, $post_augment_done) = @_;
+	eval { $m->($self, $lei) };
+	$lei->{post_augment_err} = $@ if $@; # for post_augment_done
+}
+
 # fast (spawn compressor or mkdir), runs in same process as pre_augment
 sub post_augment {
-	my ($self, $lei, @args) = @_;
+	my ($self, $lei, $post_augment_done) = @_;
 	$self->{-au_noted}++ and $lei->qerr("# writing to $self->{dst} ...");
 
-	# FIXME: this synchronous wait can be slow w/ parallel callers
-	my $wait = $lei->{opt}->{'import-before'} ?
-			$lei->{sto}->wq_do('barrier') : 0;
 	# _post_augment_mbox
 	my $m = $self->can("_post_augment_$self->{base_type}") or return;
-	$m->($self, $lei, @args);
+
+	# --import-before is only for lei-(q|lcat), not lei-convert
+	$lei->{opt}->{'import-before'} or
+		return post_augment_call $self, $lei, $m, $post_augment_done;
+
+	# we can't deal with post_augment until import-before commits:
+	require PublicInbox::EOFpipe;
+	my @io = @$lei{qw(2 sock)};
+	pipe(my $r, $io[2]);
+	PublicInbox::EOFpipe->new($r, \&post_augment_call,
+				$self, $lei, $m, $post_augment_done);
+	$lei->{sto}->wq_io_do('barrier', \@io);
+	# _post_augment_* && post_augment_done run when barrier is complete
 }
 
 # called by every single l2m worker process
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 5a5a1adc..43dedd10 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -22,6 +22,7 @@ use PublicInbox::ContentHash qw(git_sha);
 use POSIX qw(strftime);
 use autodie qw(close open read seek truncate);
 use PublicInbox::Syscall qw($F_SETPIPE_SZ);
+use PublicInbox::OnDestroy;
 
 sub new {
 	my ($class) = @_;
@@ -428,11 +429,9 @@ sub query_done { # EOF callback for main daemon
 	$lei->dclose;
 }
 
-sub do_post_augment {
+sub post_augment_done { # via on_destroy in top-level lei-daemon
 	my ($lei) = @_;
-	my $l2m = $lei->{l2m} or return; # client disconnected
-	eval { $l2m->post_augment($lei) };
-	my $err = $@;
+	my $err = delete $lei->{post_augment_err};
 	if ($err) {
 		if (my $lxs = delete $lei->{lxs}) {
 			$lxs->wq_kill(-POSIX::SIGTERM());
@@ -447,6 +446,12 @@ sub do_post_augment {
 	close(delete $lei->{au_done}); # trigger wait_startq if start_mua didn't
 }
 
+sub do_post_augment {
+	my ($lei) = @_;
+	my $l2m = $lei->{l2m} or return; # client disconnected
+	$l2m->post_augment($lei, on_destroy(\&post_augment_done, $lei));
+}
+
 sub incr_post_augment { # called whenever an l2m shard finishes augment
 	my ($lei) = @_;
 	my $l2m = $lei->{l2m} or return; # client disconnected

^ permalink raw reply related	[relevance 53%]

* [PATCH 0/4] lei parallelism fixes
@ 2024-04-16 20:56 71% Eric Wong
  2024-04-16 20:56 71% ` [PATCH 1/4] v2 + lei/store: always wait for fast-import checkpoint Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2024-04-16 20:56 UTC (permalink / raw)
  To: meta

This series allows `lei reindex' to run in parallel with other
lei commands which write to lei/store.

Eric Wong (4):
  v2 + lei/store: always wait for fast-import checkpoint
  lei: use ->barrier to commit to lei/store
  lei/store: stop shard workers + cat-file on idle
  lei: use async barrier for --import-before

 lib/PublicInbox/EOFpipe.pm            |  7 ++--
 lib/PublicInbox/ExtSearchIdx.pm       |  1 +
 lib/PublicInbox/LEI.pm                |  6 ++--
 lib/PublicInbox/LeiInput.pm           |  2 +-
 lib/PublicInbox/LeiRefreshMailSync.pm |  2 +-
 lib/PublicInbox/LeiRemote.pm          |  4 +--
 lib/PublicInbox/LeiStore.pm           | 46 ++++++++++++++++-----------
 lib/PublicInbox/LeiToMail.pm          | 28 ++++++++++++----
 lib/PublicInbox/LeiXSearch.pm         | 17 ++++++----
 lib/PublicInbox/V2Writable.pm         |  8 +----
 t/lei-store-fail.t                    |  2 +-
 11 files changed, 74 insertions(+), 49 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 3/3] lei: remove leftover debugging message
  2024-04-12 18:04 71% [PATCH 0/3] some lei fixes Eric Wong
@ 2024-04-12 18:04 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-12 18:04 UTC (permalink / raw)
  To: meta

Noticed while working on other things...

Fixes: 299aac294ec3 (lei: do label/keyword parsing in optparse, 2023-10-02)
---
 lib/PublicInbox/LEI.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 06592358..5b46686a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -728,8 +728,6 @@ sub optparse ($$$) {
 		require PublicInbox::LeiInput;
 		my @err = PublicInbox::LeiInput::vmd_mod_extract($self, $argv);
 		return $self->fail(join("\n", @err)) if @err;
-	} else {
-		warn "proto $proto\n" if $cmd =~ /(add-watch|tag|index)/;
 	}
 
 	my $i = 0;

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/3] some lei fixes
@ 2024-04-12 18:04 71% Eric Wong
  2024-04-12 18:04 71% ` [PATCH 3/3] lei: remove leftover debugging message Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-04-12 18:04 UTC (permalink / raw)
  To: meta

Some trivial fixes I noticed while (still) working on getting
lei to use checkpoint to improve parallelism.

Eric Wong (3):
  lei_remote: solver supports uncommitted blobs
  io: avoid redundant waitpid in DESTROY
  lei: remove leftover debugging message

 lib/PublicInbox/IO.pm        | 10 +++++-----
 lib/PublicInbox/LEI.pm       |  2 --
 lib/PublicInbox/LeiRemote.pm | 13 ++++++++++---
 3 files changed, 15 insertions(+), 10 deletions(-)

^ permalink raw reply	[relevance 71%]

* Re: [PATCH] lei q: support --thread-id=$MSGID || -T $MSGID
  2024-04-12  8:03 71% ` Štěpán Němec
@ 2024-04-12  9:43 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-12  9:43 UTC (permalink / raw)
  To: Štěpán Němec; +Cc: meta

Štěpán Němec <stepnem@smrk.net> wrote:
> Eric Wong wrote:
> > +		is $lei_out, '', 'no results on unlrelated thread';
>                                                   ^
> s/unlrelated/unrelated/

Thanks, squashed:

diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 56a6ae8e..d5c328f0 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -105,7 +105,7 @@ my $test_lei_q_threadid = sub {
 	my ($u) = @_;
 	test_lei(sub {
 		lei_ok qw(q -f text --only), $u, qw(-T t@1 s:unrelated);
-		is $lei_out, '', 'no results on unlrelated thread';
+		is $lei_out, '', 'no results on unrelated thread';
 		lei_ok qw(q -f text --only), $u, qw(-T t@1 dt:19931002000300..);
 		my @m = ($lei_out =~ m!^Message-ID: <([^>]+)>\n!gms);
 		is_deeply \@m, ['t@3'], 'got expected result from -T MSGID';

And pushed as commit 873066744d1b105da4cfafb1c7312ca11b579864

^ permalink raw reply related	[relevance 71%]

* Re: [PATCH] lei q: support --thread-id=$MSGID || -T $MSGID
  2024-04-12  2:01 51% [PATCH] lei q: support --thread-id=$MSGID || -T $MSGID Eric Wong
@ 2024-04-12  8:03 71% ` Štěpán Němec
  2024-04-12  9:43 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Štěpán Němec @ 2024-04-12  8:03 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta


Typo squad alert!

On Fri, 12 Apr 2024 02:01:03 +0000
Eric Wong wrote:

> diff --git a/t/psgi_v2.t b/t/psgi_v2.t
> index 54faae9b..56a6ae8e 100644
> --- a/t/psgi_v2.t
> +++ b/t/psgi_v2.t
> @@ -101,6 +101,19 @@ EOM
>  	}
>  };
>  
> +my $test_lei_q_threadid = sub {
> +	my ($u) = @_;
> +	test_lei(sub {
> +		lei_ok qw(q -f text --only), $u, qw(-T t@1 s:unrelated);
> +		is $lei_out, '', 'no results on unlrelated thread';
                                                  ^
s/unlrelated/unrelated/

Thanks,

  Štěpán

^ permalink raw reply	[relevance 71%]

* Re: lei-up doesn't output replies to matching thread
  2024-04-11 22:46 65% lei-up doesn't output replies to matching thread Josh Steadmon
@ 2024-04-12  2:07 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-12  2:07 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: meta

Josh Steadmon <steadmon@google.com> wrote:
> Hello again,
> 
> I'm having trouble where `lei up --all` is not outputting threaded
> replies, despite the fact that the saved search requests them. I noticed
> the problem on this Git thread:

I think this needs the notmuch-style subquery support I mentioned here:

https://public-inbox.org/meta/20230412201743.M20097@dcvr/T/#u

The external process part works nowadays (for new -cindex), but isn't
wired up to old lei and WWW code, yet...  And the subquery code will
probably be stealing C++ from notmuch.



Slightly related, I just posted a patch to finish off a per-thread
subscription feature but it requires recent HTTP(S) endpoints if
using HTTP(S):
https://public-inbox.org/meta/20240412020103.2665237-1-e@80x24.org/

(I've been sidetracked badly this year dealing with offline things :<)

^ permalink raw reply	[relevance 71%]

* [PATCH] lei q: support --thread-id=$MSGID || -T $MSGID
@ 2024-04-12  2:01 51% Eric Wong
  2024-04-12  8:03 71% ` Štěpán Němec
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-04-12  2:01 UTC (permalink / raw)
  To: meta

This adds support for the "POST /$INBOX/$MSGID/?x=m&q=..." endpoint
added last year to support per-thread searches in
764035c83 (www: support POST /$INBOX/$MSGID/?x=m&q=, 2023-03-30).
This only supports instances of public-inbox since 764035c83,
but unfortunately there hasn't been a release since then.
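
A rough sketch of how such a per-thread URL could be put together,
for illustration only.  escape_mid and thread_search_url are
hypothetical helpers (escape_mid is not PublicInbox::MID::mid_escape
and may escape a different set of characters), the example.com URL is
made up, and inputs are assumed to be byte strings.

```perl
use strict;
use warnings;

# Hypothetical percent-escaping for a Message-ID used as a path component
sub escape_mid {
	my ($mid) = @_;
	$mid =~ s/([^A-Za-z0-9._~\@-])/sprintf('%%%02X', ord($1))/ge;
	$mid;
}

# Append the escaped Message-ID and the query to the inbox URL
sub thread_search_url {
	my ($inbox_url, $mid, $query) = @_;
	$inbox_url .= '/' unless $inbox_url =~ m!/\z!;
	(my $q = $query) =~ s/([^A-Za-z0-9._~-])/sprintf('%%%02X', ord($1))/ge;
	$inbox_url . escape_mid($mid) . '/?x=m&q=' . $q;
}

print thread_search_url('https://example.com/inbox', 't@1', 's:unrelated'), "\n";
```

On the command line, the equivalent search from the test in this
patch is "lei q -T t@1 s:unrelated" against the chosen external.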
---
 Documentation/lei-q.pod       |  9 +++++++++
 lib/PublicInbox/LEI.pm        |  1 +
 lib/PublicInbox/LeiXSearch.pm | 16 ++++++++++++----
 t/psgi_v2.t                   | 16 ++++++++++++++++
 4 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 4476a806..79156750 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -129,6 +129,15 @@ lei/store unless an MUA unflags it!  (Behavior undecided)
 Caveat: C<-tt> only works on locally-indexed messages at the
 moment, and not on remote (HTTP(S)) endpoints.
 
+=item --thread-id=MSGID
+
+=item -T MSGID
+
+Only search messages in the same thread as the given Message-ID.
+
+For HTTP(S) externals, this only works on instances running
+public-inbox 2.0+ (UNRELEASED).
+
 =item --jobs=QUERY_WORKERS[,WRITE_WORKERS]
 
 =item --jobs=,WRITE_WORKERS
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7c31ab43..06592358 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -177,6 +177,7 @@ our %CMD = ( # sorted in order of importance/use:
 	'stdin|', # /|\z/ must be first for lone dash
 	@lxs_opt, @net_opt,
 	qw(save! output|mfolder|o=s format|f=s dedupe|d=s threads|t+
+	thread-id|T=s
 	sort|s=s reverse|r offset=i pretty jobs|j=s globoff|g augment|a
 	import-before! lock=s@ rsyncable alert=s@ mua=s verbose|v+
 	shared color! mail-sync!), @c_opt, opt_dash('limit|n=i', '[0-9]+') ],
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index fc95d401..d4f34733 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -13,7 +13,7 @@ use File::Temp 0.19 (); # 0.19 for ->newdir
 use File::Spec ();
 use PublicInbox::Search qw(xap_terms);
 use PublicInbox::Spawn qw(popen_rd popen_wr which);
-use PublicInbox::MID qw(mids);
+use PublicInbox::MID qw(mids mid_escape);
 use PublicInbox::Smsg;
 use PublicInbox::Eml;
 use PublicInbox::LEI;
@@ -160,6 +160,8 @@ sub query_one_mset { # for --threads and l2m w/o sort
 	my $can_kw = !!$ibxish->can('msg_keywords');
 	my $threads = $lei->{opt}->{threads} // 0;
 	my $fl = $threads > 1 ? 1 : undef;
+	my $mid = $lei->{opt}->{'thread-id'};
+	$mo->{threadid} = $over->mid2tid($mid) if defined $mid;
 	my $lss = $lei->{lss};
 	my $maxk = "external.$dir.maxuid"; # max of previous, so our min
 	my $min = $lss ? ($lss->{-cfg}->{$maxk} // 0) : 0;
@@ -339,6 +341,12 @@ print STDERR $_;
 	push @$curl, '-s', '-d', '';
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	$self->{import_sto} = $lei->{sto} if $lei->{opt}->{'import-remote'};
+	if (defined(my $mid = $opt->{'thread-id'})) {
+		$mid = mid_escape($mid);
+		for my $uri (@$uris) {
+			$uri->path($uri->path.$mid.'/');
+		}
+	}
 	for my $uri (@$uris) {
 		$lei->{-current_url} = $uri->as_string;
 		my $start = time;
@@ -459,7 +467,9 @@ sub concurrency {
 sub start_query ($$) { # always runs in main (lei-daemon) process
 	my ($self, $lei) = @_;
 	local $PublicInbox::LEI::current_lei = $lei;
-	if ($self->{opt_threads} || ($lei->{l2m} && !$self->{opt_sort})) {
+	if ($lei->{opt}->{threads} ||
+			defined($lei->{opt}->{'thread-id'}) ||
+			($lei->{l2m} && !$lei->{opt}->{'sort'})) {
 		for my $ibxish (locals($self)) {
 			$self->wq_io_do('query_one_mset', [], $ibxish);
 		}
@@ -546,8 +556,6 @@ sub do_query {
 	my $op_c = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
 	@$end = ();
-	$self->{opt_threads} = $lei->{opt}->{threads};
-	$self->{opt_sort} = $lei->{opt}->{'sort'};
 	$self->{-do_lcat} = !!(delete $lei->{lcat_todo});
 	if ($l2m) {
 		$l2m->net_merge_all_done($lei) unless $lei->{auth};
diff --git a/t/psgi_v2.t b/t/psgi_v2.t
index 54faae9b..56a6ae8e 100644
--- a/t/psgi_v2.t
+++ b/t/psgi_v2.t
@@ -101,6 +101,19 @@ EOM
 	}
 };
 
+my $test_lei_q_threadid = sub {
+	my ($u) = @_;
+	test_lei(sub {
+		lei_ok qw(q -f text --only), $u, qw(-T t@1 s:unrelated);
+		is $lei_out, '', 'no results on unrelated thread';
+		lei_ok qw(q -f text --only), $u, qw(-T t@1 dt:19931002000300..);
+		my @m = ($lei_out =~ m!^Message-ID: <([^>]+)>\n!gms);
+		is_deeply \@m, ['t@3'], 'got expected result from -T MSGID';
+	});
+};
+
+$test_lei_q_threadid->($m2t->{inboxdir});
+
 my $cfgpath = "$ibx->{inboxdir}/pi_config";
 {
 	open my $fh, '>', $cfgpath or BAIL_OUT $!;
@@ -374,6 +387,9 @@ my $client3 = sub {
 
 	$res = $cb->(POST("/m2t/t\@1/?q=s:unrelated&x=m"));
 	is($res->code, 404, '404 on cross-thread search');
+
+	my $rmt = $ENV{PLACK_TEST_EXTERNALSERVER_URI};
+	$rmt and $test_lei_q_threadid->("$rmt/m2t/");
 };
 test_psgi(sub { $www->call(@_) }, $client3);
 test_httpd($env, $client3, 4);
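
For illustration only, the request shape named in the commit message
can be sketched in shell.  thread_search_url is a hypothetical helper
(not part of lei), and it assumes the Message-ID and query are already
URL-safe; lei itself escapes the Message-ID via mid_escape before
appending it to the inbox path.

```shell
# Hypothetical helper: build the per-thread search URL for an HTTP(S) inbox.
# Assumes $2 (Message-ID) and $3 (query) need no further URL escaping.
thread_search_url() {
	base=${1%/}	# tolerate a trailing slash on the inbox URL
	printf '%s/%s/?x=m&q=%s\n' "$base" "$2" "$3"
}

# The endpoint expects a POST; an empty body sent with curl should do, e.g.:
#   curl -s -d '' "$(thread_search_url "$INBOX_URL" 't@1' 's:unrelated')"
```

Per the test above, a query which matches nothing in the given thread
is expected to get a 404 from the server.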

^ permalink raw reply related	[relevance 51%]

* lei-up doesn't output replies to matching thread
@ 2024-04-11 22:46 65% Josh Steadmon
  2024-04-12  2:07 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Josh Steadmon @ 2024-04-11 22:46 UTC (permalink / raw)
  To: meta

Hello again,

I'm having trouble where `lei up --all` is not outputting threaded
replies, despite the fact that the saved search requests them. I noticed
the problem on this Git thread:
https://lore.kernel.org/git/520da361-1b80-4ba3-87b2-86d6fdfc18b5@web.de/

My saved search is as follows:

{
  "output": "maildir:/usr/local/google/home/steadmon/.mail/lei-git-unit-tests",
  "q": ["(dfn:t/unit-tests OR s:unit OR ((nq:bug OR nq:regression) AND nq:unit)) AND rt:2.weeks.ago.."],
  "external": "1",
  "local": "1",
  "remote": "1",
  "threads": "1"
}

Lei is aware of the replies, because I can see them with lei-q if I copy
the query from the saved search:

$ lei q -t "(dfn:t/unit-tests OR s:unit OR ((nq:bug OR nq:regression) AND nq:unit)) AND rt:2.weeks.ago.." \
  | grep "t-prio-queue: simplify using compound literals" \
  | wc -l
11


In my output maildir, I only have the initial message that started the
thread:

$ cd ~/.mail/lei-git-unit-tests
$ grep -Rl "Subject:.*t-prio-queue: simplify using compound literals" \
  | wc -l
1


And lei-up does not seem to see those matches:

$ lei up ~/.mail/lei-git-unit-tests
# /usr/local/google/home/steadmon/.local/share/lei/store 10/10
# 0 written to /usr/local/google/home/steadmon/.mail/lei-git-unit-tests/ (10 matches)


I have not noticed the problem for other threads, but I am not 100% sure
that it's limited to just this one, either. If you have any advice for
further debugging, I'd appreciate the help.

Thanks!

^ permalink raw reply	[relevance 65%]

* [PATCH] lei blob: fix attachment extraction for unimported||inflight
@ 2024-04-11 18:58 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-04-11 18:58 UTC (permalink / raw)
  To: meta

Noticed while trying to make other reliability improvements to
lei...
---
 lib/PublicInbox/LeiBlob.pm | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiBlob.pm b/lib/PublicInbox/LeiBlob.pm
index 127cc81e..00697097 100644
--- a/lib/PublicInbox/LeiBlob.pm
+++ b/lib/PublicInbox/LeiBlob.pm
@@ -119,14 +119,17 @@ sub lei_blob {
 		} else {
 			open $rdr->{2}, '>', '/dev/null' or die "open: $!";
 		}
-		my $cmd = [ 'git', '--git-dir='.$lei->ale->git->{git_dir},
-				'cat-file', 'blob', $blob ];
+		my $cmd = $lei->ale->git->cmd('cat-file', 'blob', $blob);
+		my $cerr;
 		if (defined $lei->{-attach_idx}) {
 			my $buf = run_qx($cmd, $lei->{env}, $rdr);
 			return extract_attach($lei, $blob, \$buf) unless $?;
+			$cerr = $?;
+		} else {
+			$rdr->{1} = $lei->{1}; # write directly to client
+			$cerr = run_wait($cmd, $lei->{env}, $rdr) or return;
 		}
-		$rdr->{1} = $lei->{1};
-		my $cerr = run_wait($cmd, $lei->{env}, $rdr) or return;
+		# fall back to unimported ('lei index') and inflight blobs
 		my $lms = $lei->lms;
 		my $bref = ($lms ? $lms->local_blob($blob, 1) : undef) // do {
 			my $sto = $lei->{sto} // $lei->_lei_store;

^ permalink raw reply related	[relevance 71%]

* Re: v1.9.0 : `ls-search' is not an lei command
  2024-04-04 19:55 71% ` Eric Wong
@ 2024-04-09 19:59 71%   ` Josh Steadmon
  0 siblings, 0 replies; 200+ results
From: Josh Steadmon @ 2024-04-09 19:59 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On 2024.04.04 19:55, Eric Wong wrote:
> Josh Steadmon <steadmon@google.com> wrote:
> > Hello all,
> > 
> > I recently had to reinstall my work machine, and after doing so I now
> > get an error when running `lei ls-search`:
> > 
> > `ls-search' is not an lei command
> > 
> > This happens with both version 1.9.0-1+build1 of the Debian "lei"
> > package, and with version 1.9.0 of the Nix "public-inbox" package.
> 
> Are you sure the `lei' in $PATH is pointed to the right
> installation?
> 
> `lei sucks' should give diagnostic info (or be similarly broken if
> the lei-daemon is running on an uninstalled version).

Thanks for the pointer to `lei sucks'. It reported that lei was for some
reason trying to load packages from a prior attempt at installing
public-inbox from source, even after deleting the relevant directory.
I'm not sure how that happened, since I installed via apt and AFAICT
there's nothing in my environment that would tell Perl to look in that
location.

In any case, `apt-get purge lei` followed by a reinstall worked after
deleting the old source directory.

Thanks again!

^ permalink raw reply	[relevance 71%]

* Re: v1.9.0 : `ls-search' is not an lei command
  2024-04-04 19:04 71% v1.9.0 : `ls-search' is not an lei command Josh Steadmon
@ 2024-04-04 19:55 71% ` Eric Wong
  2024-04-09 19:59 71%   ` Josh Steadmon
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-04-04 19:55 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: meta

Josh Steadmon <steadmon@google.com> wrote:
> Hello all,
> 
> I recently had to reinstall my work machine, and after doing so I now
> get an error when running `lei ls-search`:
> 
> `ls-search' is not an lei command
> 
> This happens with both version 1.9.0-1+build1 of the Debian "lei"
> package, and with version 1.9.0 of the Nix "public-inbox" package.

Are you sure the `lei' in $PATH is pointed to the right
installation?

`lei sucks' should give diagnostic info (or be similarly broken if
the lei-daemon is running on an uninstalled version).

`lei daemon-kill' should always work and rerunning any `lei'
should restart it.

Check for a LeiLsSearch.pm on your machine and see if it lines
up with your expected installation location (and there should be
a bunch of Lei*.pm siblings in the same directory).
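
The LeiLsSearch.pm check above could be scripted roughly as follows.
This is only a sketch: find_lei_modules is a hypothetical helper (not
part of lei) which reads candidate directories on stdin and prints each
one containing PublicInbox/LeiLsSearch.pm, followed by the number of
Lei*.pm files found next to it.

```shell
# Hypothetical helper: report which directories hold a lei installation.
# Reads one directory per line on stdin; prints "DIR COUNT" for each hit.
find_lei_modules() {
	while IFS= read -r dir; do
		[ -f "$dir/PublicInbox/LeiLsSearch.pm" ] || continue
		# count the Lei*.pm siblings via the shell glob
		set -- "$dir"/PublicInbox/Lei*.pm
		printf '%s %s\n' "$dir" "$#"
	done
}

# Feed it Perl's module search path, e.g.:
#   perl -e 'print "$_\n" for @INC' | find_lei_modules
```

More than one line of output would suggest a stale copy is shadowing
the packaged one.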

^ permalink raw reply	[relevance 71%]

* v1.9.0 : `ls-search' is not an lei command
@ 2024-04-04 19:04 71% Josh Steadmon
  2024-04-04 19:55 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Josh Steadmon @ 2024-04-04 19:04 UTC (permalink / raw)
  To: meta

Hello all,

I recently had to reinstall my work machine, and after doing so I now
get an error when running `lei ls-search`:

`ls-search' is not an lei command

This happens with both version 1.9.0-1+build1 of the Debian "lei"
package, and with version 1.9.0 of the Nix "public-inbox" package.

Unfortunately I'm not sure which version I had installed prior to wiping
my machine.

Any advice is appreciated, thanks!
Josh

^ permalink raw reply	[relevance 71%]

* Re: Lei exception
  2024-03-14 14:46 71%     ` Eric Wong
@ 2024-03-15 17:36 71%       ` Gonsolo
  0 siblings, 0 replies; 200+ results
From: Gonsolo @ 2024-03-15 17:36 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

> please keep meta@public-inbox.org in the Cc:

Ok.

> No, please don't ask Debian maintainers to package unreleased
> versions.  The non-lei feature: codesearch, is not ready to be
> supported in a 2.0 release.
>
> The only good news is 2.0 will probably be the last release
> involving large data model additions/changes.

Ok. Thanks. I'll just wait for 2.0.


-- 
g

^ permalink raw reply	[relevance 71%]

* Re: Lei exception
       [not found]       ` <CANL0fFSMQ1YL1a8PEpU39pYQ7d6vmmndughvJVue=SWNYNdqGQ@mail.gmail.com>
@ 2024-03-14 14:46 71%     ` Eric Wong
  2024-03-15 17:36 71%       ` Gonsolo
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-03-14 14:46 UTC (permalink / raw)
  To: Gonsolo; +Cc: meta

please keep meta@public-inbox.org in the Cc:

Gonsolo <gonsolo@gmail.com> wrote:
> > Which version of public-inbox is this?  It looks like 1.9 based
> > on the line number[1].
> 
> Package from Ubuntu/Debian.
> 
> lei 1.9.0-1 (Ubuntu)

<snip>

>   936c275178dfc6908577487ce97d3a83c58c5449 PublicInbox/LeiSearch.pm

OK.

> > Any info you can share about the queries you use?  (https, local
> > public-inbox clones).
> 
> lei edit-search:

<snip>

> [lei "q"]
>         include = https://lore.kernel.org/all/
>         external = 1
>         local = 1
>         remote = 1
>         threads = 1

OK, yeah.  I seem to recall http(s) externals being racy; they
were made more reliable with the use of OnDestroy callbacks in
public-inbox.git

> > I've seen it a few times in the past, but IIRC couldn't reliably
> > reproduce it and haven't seen it in a while.  I'm always running
> > latest <https://80x24.org/public-inbox.git>, though, and that
> > has numerous reliability improvements over 1.9.
> 
> Ok. Maybe I'm going to ping the Debian maintainer.

No, please don't ask Debian maintainers to package unreleased
versions.  The non-lei feature: codesearch, is not ready to be
supported in a 2.0 release.

The only good news is 2.0 will probably be the last release
involving large data model additions/changes.

^ permalink raw reply	[relevance 71%]

* Re: Lei exception
  2024-03-13 15:35 71% Lei exception Gonsolo
@ 2024-03-13 19:20 71% ` Eric Wong
       [not found]       ` <CANL0fFSMQ1YL1a8PEpU39pYQ7d6vmmndughvJVue=SWNYNdqGQ@mail.gmail.com>
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-03-13 19:20 UTC (permalink / raw)
  To: Gonsolo; +Cc: meta

Gonsolo <gonsolo@gmail.com> wrote:
> Hi!
> 
> For a few days now I've been getting the following error when running "lei up --all":
> 
> 10770 lei2mail 1 (nshard=3) 8b214638a3a05e3d9f2345a632a5eed0de7f9ab6:
> Exception: Document 22720 not found at
> /usr/share/perl5/PublicInbox/LeiSearch.pm line 68.
> 
> Is there anything I can do?

Which version of public-inbox is this?  It looks like 1.9 based
on the line number[1].

Any info you can share about the queries you use?  (https, local
public-inbox clones).

I've seen it a few times in the past, but IIRC couldn't reliably
reproduce it and haven't seen it in a while.  I'm always running
latest <https://80x24.org/public-inbox.git>, though, and that
has numerous reliability improvements over 1.9.

In any case, there's no mail data loss from xsmsg_vmd, only
metadata (missed flags) and I think it could've been a data
synchronization problem that was fixed in public-inbox.git.


[1] Running "lei sucks" should show all relevant version info.  The blob OID
    for LeiSearch.pm would certainly help confirm that.
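
If "lei sucks" itself is unusable, the blob OID of an installed file
can also be computed by hand.  A minimal sketch, assuming a SHA-1
object format and that sha1sum is available; blob_oid is a
hypothetical helper, and its output should match
`git hash-object --no-filters FILE':

```shell
# Hypothetical helper: print the git blob OID (SHA-1) of a file.
# A blob OID is the SHA-1 of "blob <size>\0" followed by the file contents.
blob_oid() {
	size=$(wc -c <"$1" | tr -d ' ')
	{ printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# e.g.: blob_oid /usr/share/perl5/PublicInbox/LeiSearch.pm
```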


^ permalink raw reply	[relevance 71%]

* Lei exception
@ 2024-03-13 15:35 71% Gonsolo
  2024-03-13 19:20 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Gonsolo @ 2024-03-13 15:35 UTC (permalink / raw)
  To: meta

Hi!

For a few days now I've been getting the following error when running "lei up --all":

10770 lei2mail 1 (nshard=3) 8b214638a3a05e3d9f2345a632a5eed0de7f9ab6:
Exception: Document 22720 not found at
/usr/share/perl5/PublicInbox/LeiSearch.pm line 68.

Is there anything I can do?

Best regards


-- 
g

^ permalink raw reply	[relevance 71%]

* [PATCH 1/2] lei: prevent empty {bytes} field in saved search
  @ 2024-03-08 21:05 65% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-03-08 21:05 UTC (permalink / raw)
  To: meta

Noticed while tracking down fast-import crash bug report.

Link: https://public-inbox.org/meta/CAL_JsqK7P4gjLPyvzxNEcYmxT4j6Ah5f3Pz1RqDHxmysTg3aEg@mail.gmail.com/
---
 lib/PublicInbox/LeiSearch.pm | 2 ++
 lib/PublicInbox/LeiToMail.pm | 1 +
 lib/PublicInbox/OverIdx.pm   | 6 +++++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiSearch.pm b/lib/PublicInbox/LeiSearch.pm
index 29e3213f..684668c5 100644
--- a/lib/PublicInbox/LeiSearch.pm
+++ b/lib/PublicInbox/LeiSearch.pm
@@ -103,6 +103,8 @@ sub xoids_for {
 		for my $o (@overs) {
 			my ($id, $prev);
 			while (my $cur = $o->next_by_mid($mid, \$id, \$prev)) {
+				# {bytes} may be '' from old bug
+				$cur->{bytes} = 1 if $cur->{bytes} eq '';
 				next if $cur->{bytes} == 0 ||
 					$xoids->{$cur->{blob}};
 				$git->cat_async($cur->{blob}, \&_cmp_1st,
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a816df6c..dfae29e9 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -149,6 +149,7 @@ sub git_to_mail { # git->cat_async callback
 						"W: $oid is $type (!= blob)");
 		$size or return $self->{lei}->child_error(0,"E: $oid is empty");
 		$smsg->{blob} eq $oid or die "BUG: expected=$smsg->{blob}";
+		$smsg->{bytes} ||= $size;
 		$self->{wcb}->($bref, $smsg);
 	};
 	$self->{lei}->fail("$@ (oid=$oid)") if $@;
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index c9c25828..4f8533f7 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -17,6 +17,7 @@ use PublicInbox::MID qw/id_compress mids_for_index references/;
 use PublicInbox::Smsg qw(subject_normalized);
 use Compress::Zlib qw(compress);
 use Carp qw(croak);
+use bytes (); # length
 
 sub dbh_new {
 	my ($self) = @_;
@@ -263,7 +264,10 @@ sub ddd_for ($) {
 
 sub add_overview {
 	my ($self, $eml, $smsg) = @_;
-	$smsg->{lines} = $eml->body_raw =~ tr!\n!\n!;
+	my $raw = $eml->body_raw;
+	$smsg->{lines} = $raw =~ tr!\n!\n!;
+	$smsg->{bytes} //= bytes::length $raw;
+	undef $raw;
 	my $mids = mids_for_index($eml);
 	my $refs = $smsg->parse_references($eml, $mids);
 	$mids->[0] //= do {

^ permalink raw reply related	[relevance 65%]

* Re: lei up can't fetch new thread messages when searching by mid
  2024-02-09 17:35 71% ` Eric Wong
@ 2024-02-12 11:11 71%   ` Pratyush Yadav
  0 siblings, 0 replies; 200+ results
From: Pratyush Yadav @ 2024-02-12 11:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: Pratyush Yadav, meta

Hi,

On Fri, Feb 09 2024, Eric Wong wrote:

> Pratyush Yadav <me@yadavpratyush.com> wrote:
>
> <snip> yeah, known problem, whole thread from last year...
>
> 	https://public-inbox.org/meta/20230330112951.M493025@dcvr/T/
>
>> Is there a way to properly subscribe to an email thread? I suppose I can
>> query the subject instead but that can also run into some corner
>> cases like multiple threads with the same subject, or if someone changes
>> the subject when replying to the thread.
>
> The infrastructure exists for it, but hasn't been wired into
> lei, yet:
>
> 	https://public-inbox.org/meta/20230330112951.M493025@dcvr/
>
> I also don't know if lore is running it, yet, but the
> 80x24.org/lore mirror is still alive...  And I've still yet to
> make a release.  Hopefully coming soon, can't wrap my head
> around coderepo <=> inbox mapping parts and also dealing with
> extreme anxiety around making releases :<

I see. Thanks for the pointer. I will use the subject to query for now
and wait until this comes to lei :-)

-- 
Regards,
Pratyush Yadav

^ permalink raw reply	[relevance 71%]

* Re: lei up can't fetch new thread messages when searching by mid
  2024-02-09 16:06 70% lei up can't fetch new thread messages when searching by mid Pratyush Yadav
@ 2024-02-09 17:35 71% ` Eric Wong
  2024-02-12 11:11 71%   ` Pratyush Yadav
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-02-09 17:35 UTC (permalink / raw)
  To: Pratyush Yadav; +Cc: meta

Pratyush Yadav <me@yadavpratyush.com> wrote:

<snip> yeah, known problem, whole thread from last year...

	https://public-inbox.org/meta/20230330112951.M493025@dcvr/T/

> Is there a way to properly subscribe to an email thread? I suppose I can
> query the subject instead but that can also run into some corner
> cases like multiple threads with the same subject, or if someone changes
> the subject when replying to the thread.

The infrastructure exists for it, but hasn't been wired into
lei, yet:

	https://public-inbox.org/meta/20230330112951.M493025@dcvr/

I also don't know if lore is running it, yet, but the
80x24.org/lore mirror is still alive...  And I've still yet to
make a release.  Hopefully coming soon, can't wrap my head
around coderepo <=> inbox mapping parts and also dealing with
extreme anxiety around making releases :<

^ permalink raw reply	[relevance 71%]

* lei up can't fetch new thread messages when searching by mid
@ 2024-02-09 16:06 70% Pratyush Yadav
  2024-02-09 17:35 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Pratyush Yadav @ 2024-02-09 16:06 UTC (permalink / raw)
  To: meta

Hi,

I have set up a lei query for an email thread I am interested in. I
initially imported the messages by querying for the Message-Id via
"mid:<message-id>" and set -t to get all messages in the thread. That
works great.

Now when I run lei up, it does not fetch new messages in the thread.
From what I understand, this happens because lei adds an additional
parameter to the search that only looks for messages since the last time
lei up ran. This of course does not include the email with the
Message-Id I am interested in so its replies also do not show up in the
query results.

Is there a way to properly subscribe to an email thread? I suppose I can
query the subject instead but that can also run into some corner
cases like multiple threads with the same subject, or if someone changes
the subject when replying to the thread.

-- 
Regards,
Pratyush Yadav

^ permalink raw reply	[relevance 70%]

* [PATCH 5/5] lei: sort MH inputs sequentially by default
    2024-01-31 10:20 65% ` [PATCH 1/5] lei convert: explicitly allow --sort for inputs Eric Wong
  2024-01-31 10:20 55% ` [PATCH 4/5] scripts/import_*: update usage to include lei tips Eric Wong
@ 2024-01-31 10:20 59% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-01-31 10:20 UTC (permalink / raw)
  To: meta

MH sequence numbers can be analogous to IMAP UIDs and NNTP
article numbers (or more like IMAP MSNs with clients which
pack).  In any case, sort them numerically by default to avoid
surprising users who treat NNTP spools and mlmmj archives as MH
folders.  This gives more coherent git history and resulting
NNTP/IMAP numbering when round-tripping MH -> v2 -> (NNTP|IMAP) -> MH
---
 lib/PublicInbox/LeiInput.pm |  2 +-
 lib/PublicInbox/MHreader.pm |  3 ++-
 t/mh_reader.t               | 14 ++++++++++----
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 947a7a79..d003d983 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -242,7 +242,7 @@ sub input_path_url {
 		}
 	} elsif (-d _ && $ifmt eq 'mh') {
 		my $mhr = PublicInbox::MHreader->new($input.'/', $lei->{3});
-		$mhr->{sort} = $lei->{opt}->{sort};
+		$mhr->{sort} = $lei->{opt}->{sort} // [ 'sequence'];
 		$mhr->mh_each_eml($self->can('input_mh_cb'), $self, @args);
 	} elsif (-d _ && $ifmt =~ /\A(?:v1|v2)\z/) {
 		my $ibx = PublicInbox::Inbox->new({inboxdir => $input});
diff --git a/lib/PublicInbox/MHreader.pm b/lib/PublicInbox/MHreader.pm
index 033aa740..3e7bbd5c 100644
--- a/lib/PublicInbox/MHreader.pm
+++ b/lib/PublicInbox/MHreader.pm
@@ -54,7 +54,8 @@ sub mh_each_file {
 	opendir(my $dh, my $dir = $self->{dir});
 	my $restore = PublicInbox::OnDestroy->new($$, \&chdir, $self->{cwdfh});
 	chdir($dh);
-	if (defined(my $sort = $self->{sort})) {
+	my $sort = $self->{sort};
+	if (defined $sort && "@$sort" ne 'none') {
 		my @sort = map {
 			my @tmp = $_ eq '' ? ('sequence') : split(/[, ]/);
 			# sorting by name alphabetically makes no sense for MH:
diff --git a/t/mh_reader.t b/t/mh_reader.t
index 711fc8aa..c81df32e 100644
--- a/t/mh_reader.t
+++ b/t/mh_reader.t
@@ -7,6 +7,7 @@ use PublicInbox::IO qw(write_file);
 use PublicInbox::Lock;
 use PublicInbox::OnDestroy;
 use PublicInbox::Eml;
+use File::Path qw(remove_tree);
 use autodie;
 opendir my $cwdfh, '.';
 
@@ -103,12 +104,17 @@ test_lei(sub {
 	like $lei_out, qr/^Subject: msg 4\nStatus: RO\n\n\n/ms,
 		"message retrieved after `lei index'";
 
+	lei_ok qw(convert -s none -f text), "mh:$for_sort", \'--sort=none';
+
 	# ensure sort works for _input_ when output disallows sort
 	my $v2out = "$ENV{HOME}/v2-out";
-	lei_ok qw(convert -s sequence), "mh:$for_sort", '-o', "v2:$v2out";
-	my $git = PublicInbox::Git->new("$v2out/git/0.git");
-	chomp(my @l = $git->qx(qw(log --pretty=oneline --format=%s)));
-	is_xdeeply \@l, [1, 22, 333], 'sequence order preserved for v2';
+	for my $sort (['--sort=sequence'], []) { # sequence is the default
+		lei_ok qw(convert), @$sort, "mh:$for_sort", '-o', "v2:$v2out";
+		my $g = PublicInbox::Git->new("$v2out/git/0.git");
+		chomp(my @l = $g->qx(qw(log --pretty=oneline --format=%s)));
+		is_xdeeply \@l, [1, 22, 333], 'sequence order preserved for v2';
+		File::Path::remove_tree $v2out;
+	}
 });
 
 done_testing;

^ permalink raw reply related	[relevance 59%]

* [PATCH 1/5] lei convert: explicitly allow --sort for inputs
  @ 2024-01-31 10:20 65% ` Eric Wong
  2024-01-31 10:20 55% ` [PATCH 4/5] scripts/import_*: update usage to include lei tips Eric Wong
  2024-01-31 10:20 59% ` [PATCH 5/5] lei: sort MH inputs sequentially by default Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-01-31 10:20 UTC (permalink / raw)
  To: meta

LeiToMail can't sort v2 output, but sorting MH input (and
NNTP spool + mlmmj archives) numerically makes sense.
---
 lib/PublicInbox/LeiConvert.pm | 1 +
 lib/PublicInbox/LeiToMail.pm  | 2 ++
 t/mh_reader.t                 | 9 ++++++++-
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 17a952f2..4d4fceb2 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -52,6 +52,7 @@ sub lei_convert { # the main "lei convert" method
 	my ($lei, @inputs) = @_;
 	$lei->{opt}->{kw} //= 1;
 	$lei->{opt}->{dedupe} //= 'none';
+	$lei->{input_opt}->{sort} = 1; # for LeiToMail conflict check
 	my $self = bless {}, __PACKAGE__;
 	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
 	$lei->{l2m} or return
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 9197bb44..a816df6c 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -451,6 +451,8 @@ EOM
 		(-d $dst || (-e _ && !-w _)) and die
 			"$dst exists and is not a writable file\n";
 	}
+	$lei->{input_opt} and # lei_convert sets this
+		@conflict = grep { !$lei->{input_opt}->{$_} } @conflict;
 	my @err = map { defined($lei->{opt}->{$_}) ? "--$_" : () } @conflict;
 	die "@err incompatible with $fmt\n" if @err;
 	$self->{dst} = $dst;
diff --git a/t/mh_reader.t b/t/mh_reader.t
index e8f69fa8..711fc8aa 100644
--- a/t/mh_reader.t
+++ b/t/mh_reader.t
@@ -101,7 +101,14 @@ test_lei(sub {
 	lei_ok qw(index), 'mh:'.$stale;
 	lei qw(q -f mboxrd), 's:msg 4';
 	like $lei_out, qr/^Subject: msg 4\nStatus: RO\n\n\n/ms,
-		"message retrieved after `lei index'"
+		"message retrieved after `lei index'";
+
+	# ensure sort works for _input_ when output disallows sort
+	my $v2out = "$ENV{HOME}/v2-out";
+	lei_ok qw(convert -s sequence), "mh:$for_sort", '-o', "v2:$v2out";
+	my $git = PublicInbox::Git->new("$v2out/git/0.git");
+	chomp(my @l = $git->qx(qw(log --pretty=oneline --format=%s)));
+	is_xdeeply \@l, [1, 22, 333], 'sequence order preserved for v2';
 });
 
 done_testing;

^ permalink raw reply related	[relevance 65%]

* [PATCH 4/5] scripts/import_*: update usage to include lei tips
    2024-01-31 10:20 65% ` [PATCH 1/5] lei convert: explicitly allow --sort for inputs Eric Wong
@ 2024-01-31 10:20 55% ` Eric Wong
  2024-01-31 10:20 59% ` [PATCH 5/5] lei: sort MH inputs sequentially by default Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-01-31 10:20 UTC (permalink / raw)
  To: meta

These scripts probably don't offer anything useful now that
lei has fleshed out read-only MH support and v2 outputs.
---
 scripts/import_maildir        | 20 ++++++++++++++------
 scripts/import_slrnspool      | 26 ++++++++++++++++++--------
 scripts/import_vger_from_mbox |  6 +++---
 3 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/scripts/import_maildir b/scripts/import_maildir
index 269f2550..7228a3ad 100755
--- a/scripts/import_maildir
+++ b/scripts/import_maildir
@@ -1,21 +1,29 @@
 #!/usr/bin/perl -w
-# Copyright (C) 2014, Eric Wong <e@80x24.org> and all contributors
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-#
-# Script to import a Maildir into a public-inbox
 =begin usage
+Ancient script to import a Maildir into a v1 public-inbox
+
+	# this is only if you want a v1 inbox
 	export GIT_DIR=/path/to/your/repo.git
 	export GIT_AUTHOR_EMAIL='list@example.com'
 	export GIT_AUTHOR_NAME='list name'
 	./import_maildir /path/to/maildir/
+
+For v2 (strongly recommended), use:
+
+	lei convert /path/to/maildir -o /path/to/v2-inbox
+	# (and `lei daemon-kill' if you don't want the daemon to linger)
 =cut
-use strict;
-use warnings;
+use v5.12;
 use Date::Parse qw/str2time/;
 use PublicInbox::Eml;
 use PublicInbox::Git;
 use PublicInbox::Import;
-sub usage { "Usage:\n".join('', grep(/\t/, `head -n 24 $0`)) }
+sub usage {
+	open my $fh, '<', __FILE__;
+	("Usage:\n", grep { /^=begin usage/../^=cut/ and !/^=/m } <$fh>);
+}
 my $dir = shift @ARGV or die usage();
 my $git_dir = `git rev-parse --git-dir`;
 chomp $git_dir;
diff --git a/scripts/import_slrnspool b/scripts/import_slrnspool
index d9a35dfd..81df6c2e 100755
--- a/scripts/import_slrnspool
+++ b/scripts/import_slrnspool
@@ -1,20 +1,30 @@
 #!/usr/bin/perl -w
-# Copyright (C) 2015-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-#
-# Incremental (or one-shot) importer of a slrnpull news spool
 =begin usage
+Incremental (or one-shot) importer of a slrnpull news spool.
+
+Since the news spool can appear as an MH folder, you may also use
+lei from public-inbox 2.0+ to convert it:
+
+	lei convert mh:$SLRNPULL_ROOT/news/foo/bar -o v2:/path/to/inbox/
+	# (and `lei daemon-kill' if you don't want the daemon to linger)
+
+But if you want to use this script:
+
 	export ORIGINAL_RECIPIENT=address@example.com
-	public-inbox-init $INBOX $GIT_DIR $HTTP_URL $ORIGINAL_RECIPIENT
-	./import_slrnspool SLRNPULL_ROOT/news/foo/bar
+	public-inbox-init -V2 $INBOX $INBOX_DIR $HTTP_URL $ORIGINAL_RECIPIENT
+	./import_slrnspool $SLRNPULL_ROOT/news/foo/bar
 =cut
-use strict;
-use warnings;
+use v5.12;
 use PublicInbox::Config;
 use PublicInbox::Eml;
 use PublicInbox::Import;
 use PublicInbox::Git;
-sub usage { "Usage:\n".join('',grep(/\t/, `head -n 10 $0`)) }
+sub usage {
+	open my $fh, '<', __FILE__;
+	("Usage:\n", grep { /^=begin usage/../^=cut/ and !/^=/m } <$fh>);
+}
 my $exit = 0;
 my $sighandler = sub { $exit = 1 };
 $SIG{INT} = $sighandler;
diff --git a/scripts/import_vger_from_mbox b/scripts/import_vger_from_mbox
index c33e42e4..40ccf50b 100644
--- a/scripts/import_vger_from_mbox
+++ b/scripts/import_vger_from_mbox
@@ -1,8 +1,8 @@
 #!/usr/bin/perl -w
-# Copyright (C) 2016-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict;
-use warnings;
+# consider `lei convert' instead since it handles more formats
+use v5.12;
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
 use PublicInbox::InboxWritable;
 my $usage = "usage: $0 NAME EMAIL DIR <MBOX\n";

^ permalink raw reply related	[relevance 55%]

* [PATCH 2/2] doc/lei-mail-formats: update MH read-only status
  2024-01-30  6:31 71% [PATCH 0/2] watch: add MH support + lei doc Eric Wong
@ 2024-01-30  6:31 69% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-01-30  6:31 UTC (permalink / raw)
  To: meta

I'm not looking forward to dealing with synchronization
problems if we end up dealing with writes...
---
 Documentation/lei-mail-formats.pod | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/Documentation/lei-mail-formats.pod b/Documentation/lei-mail-formats.pod
index 930c5d76..618bada2 100644
--- a/Documentation/lei-mail-formats.pod
+++ b/Documentation/lei-mail-formats.pod
@@ -83,9 +83,19 @@ mbox.
 
 =head1 MH
 
-Not yet supported, locking semantics (or lack thereof) appear to
-make it unsuitable for parallel access.  It is widely-supported
-by a variety of MUAs and mailing list managers, however.
+Preliminary support for reads as of 2.0.0.  Locking semantics differ
+incompatibly amongst existing writers: Python and nmh appear
+compatible with each other, while mutt appears racy and unsuitable
+for parallel access due to rename(2) potentially clobbering the
+C<.mh_sequences> file.  More info about other clients is greatly
+appreciated.
+
+Sequence numbers may be packed and reused by some writers, so lei
+users may need to run L<lei-refresh-mail-sync(1)> if inotify|kevent
+missed packing while L<lei-daemon(8)> wasn't running.
+
+lei is safe for reading mlmmj archives as MH since mlmmj neither
+packs nor uses a .mh_sequences file to store state.
 
 =head1 MMDF
 

^ permalink raw reply related	[relevance 69%]
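
For readers unfamiliar with the C<.mh_sequences> file mentioned in
the doc above, here is a minimal sketch of reading one.  It assumes the
nmh-style "name: 1-3 7" line format; parse_mh_sequences is a
hypothetical helper written for illustration and is not the parser in
PublicInbox::MHreader.

```perl
use v5.12;

# Hypothetical helper: turn .mh_sequences text into
# { sequence_name => [ message numbers ] }.
# Assumes nmh-style lines such as "unseen: 1-3 7".
sub parse_mh_sequences {
	my ($text) = @_;
	my %seq;
	for my $line (split /\n/, $text) {
		# skip anything which isn't "name: numbers"
		my ($name, $nums) = $line =~ /\A([^:\s]+):\s*(.*)\z/ or next;
		for my $tok (split ' ', $nums) {
			if ($tok =~ /\A([0-9]+)-([0-9]+)\z/) { # range
				push @{$seq{$name}}, ($1 + 0)..($2 + 0);
			} elsif ($tok =~ /\A[0-9]+\z/) { # single message
				push @{$seq{$name}}, $tok + 0;
			}
		}
	}
	\%seq;
}

my $seq = parse_mh_sequences("unseen: 1-3 7\nreplied: 4\n");
say "$_: @{$seq->{$_}}" for sort keys %$seq;
```

Since the numbers refer to filenames, a writer packing the folder
changes what each number means, which is why the doc suggests
L<lei-refresh-mail-sync(1)> after a missed pack.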

* [PATCH 0/2] watch: add MH support + lei doc
@ 2024-01-30  6:31 71% Eric Wong
  2024-01-30  6:31 69% ` [PATCH 2/2] doc/lei-mail-formats: update MH read-only status Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-01-30  6:31 UTC (permalink / raw)
  To: meta

I'm not sure if I want to deal with supporting MH writes;
but reads should be mostly safe (I hope...)

Eric Wong (2):
  watch: support incremental updates from MH
  doc/lei-mail-formats: update MH read-only status

 Documentation/lei-mail-formats.pod   |  16 +++-
 Documentation/public-inbox-watch.pod |  16 ++--
 MANIFEST                             |   1 +
 lib/PublicInbox/Watch.pm             |  92 ++++++++++++++------
 t/watch_maildir.t                    |   2 +-
 t/watch_mh.t                         | 120 +++++++++++++++++++++++++++
 6 files changed, 211 insertions(+), 36 deletions(-)
 create mode 100644 t/watch_mh.t

^ permalink raw reply	[relevance 71%]

* [PATCH 2/3] lei+net_reader: show NNTP message in more failures
  2024-01-10 11:18 71% [PATCH 0/3] lei NNTP + error handling fixes Eric Wong
@ 2024-01-10 11:18 60% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-01-10 11:18 UTC (permalink / raw)
  To: meta

Showing absolutely nothing when hitting a server requiring
authentication is a very bad user experience.  While we're
at it, use Net::Cmd->message in more places where we experience
failure, too.
---
 lib/PublicInbox/LeiLsMailSource.pm |  6 +++++-
 lib/PublicInbox/NetReader.pm       | 19 +++++++++++--------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LeiLsMailSource.pm b/lib/PublicInbox/LeiLsMailSource.pm
index 4b427b26..ab6c1e60 100644
--- a/lib/PublicInbox/LeiLsMailSource.pm
+++ b/lib/PublicInbox/LeiLsMailSource.pm
@@ -42,7 +42,11 @@ sub input_path_url { # overrides LeiInput version
 		my $uri = PublicInbox::URInntps->new($url);
 		my $nn = $lei->{net}->nn_get($uri) or
 			return $lei->err("E: $uri");
-		my $l = $nn->newsgroups($uri->group); # name => description
+		# $l = name => description
+		my $l = $nn->newsgroups($uri->group) // return $lei->err(<<EOM);
+E: $uri LIST NEWSGROUPS: ${\($lei->{net}->ndump($nn->message))}
+E: login may be required, try adding `-c nntp.debug' to your command
+EOM
 		my $sec = $lei->{net}->can('uri_section')->($uri);
 		if ($json) {
 			my $all = $nn->list;
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 751043e9..ec18818b 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -14,7 +14,7 @@ our @EXPORT = qw(uri_section imap_uri nntp_uri);
 
 sub ndump {
 	require Data::Dumper;
-	Data::Dumper->new(\@_)->Useqq(1)->Terse(1)->Dump;
+	Data::Dumper->new([ $_[-1] ])->Useqq(1)->Terse(1)->Dump;
 }
 
 # returns the git config section name, e.g [imap "imaps://user@example.com"]
@@ -240,19 +240,19 @@ sub nn_new ($$$$) {
 				try_starttls($nn_arg->{Host})) {
 			# soft fail by default
 			$nn->starttls or warn <<"";
-W: <$uri> STARTTLS tried and failed (not requested)
+W: <$uri> STARTTLS tried and failed (not requested): ${\(ndump($nn->message))}
 
 		} elsif ($nntp_cfg->{starttls}) {
 			# hard fail if explicitly configured
 			$nn->starttls or die <<"";
-E: <$uri> STARTTLS requested and failed
+E: <$uri> STARTTLS requested and failed: ${\(ndump($nn->message))}
 
 		}
 	} elsif ($nntp_cfg->{starttls}) {
 		$nn->can('starttls') or
 			die "E: <$uri> Net::NNTP too old for STARTTLS\n";
 		$nn->starttls or die <<"";
-E: <$uri> STARTTLS requested and failed
+E: <$uri> STARTTLS requested and failed: ${\(ndump($nn->message))}
 
 	}
 	$nn;
@@ -298,18 +298,21 @@ sub nn_for ($$$$) { # nn = Net::NNTP
 		if ($nn->authinfo($u, $p)) {
 			push @{$nntp_cfg->{-postconn}}, [ 'authinfo', $u, $p ];
 		} else {
-			warn "E: <$uri> AUTHINFO $u XXXX failed\n";
+			warn <<EOM;
+E: <$uri> AUTHINFO $u XXXX: ${\(ndump($nn->message))}
+EOM
 			$nn = undef;
 		}
 	}
-
-	if ($nntp_cfg->{compress}) {
+	if ($nn && $nntp_cfg->{compress}) {
 		# https://rt.cpan.org/Ticket/Display.html?id=129967
 		if ($nn->can('compress')) {
 			if ($nn->compress) {
 				push @{$nntp_cfg->{-postconn}}, [ 'compress' ];
 			} else {
-				warn "W: <$uri> COMPRESS failed\n";
+				warn <<EOM;
+W: <$uri> COMPRESS: ${\(ndump($nn->message))}
+EOM
 			}
 		} else {
 			delete $nntp_cfg->{compress};

^ permalink raw reply related	[relevance 60%]
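
To illustrate the ndump() change in the patch above: as far as I can
tell, Net::Cmd->message returns one element per line when called in
list context, so dumping only $_[-1] keeps the error to the final
line of the response.  The sketch below copies the patched sub body;
the response lines are made up for the example.

```perl
use v5.12;
use Data::Dumper ();

# Same body as the patched ndump(): only the last argument is dumped,
# with Useqq so that control characters such as \r\n stay visible.
sub ndump {
	Data::Dumper->new([ $_[-1] ])->Useqq(1)->Terse(1)->Dump;
}

# Hypothetical multi-line server response (not real server output)
my @lines = ("first line\n", "Authentication required\r\n");
print ndump(@lines);
```

Terse(1) drops the usual "$VAR1 = " prefix, which keeps the dump
short enough to interpolate into a one-line warning.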

* [PATCH 0/3] lei NNTP + error handling fixes
@ 2024-01-10 11:18 71% Eric Wong
  2024-01-10 11:18 60% ` [PATCH 2/3] lei+net_reader: show NNTP message in more failures Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2024-01-10 11:18 UTC (permalink / raw)
  To: meta

These apply to public-inbox-watch, too, actually.  I also
just noticed 3/3 while testing something after 2/3.

Eric Wong (3):
  net_reader: fix NNTP credential use
  lei+net_reader: show NNTP message in more failures
  lei_to_mail: show supported mbox formats on error

 lib/PublicInbox/LeiLsMailSource.pm |  6 +++++-
 lib/PublicInbox/LeiToMail.pm       |  4 +++-
 lib/PublicInbox/NetReader.pm       | 24 +++++++++++++++---------
 3 files changed, 23 insertions(+), 11 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: MH: support inotify to detect updates
@ 2024-01-03 10:23 32% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2024-01-03 10:23 UTC (permalink / raw)
  To: meta

This should help us deal with MH sequence number packing, which
invalidates entries in mail_sync.sqlite3.
---
 lib/PublicInbox/LEI.pm          | 133 +++++++++++++++++---------------
 lib/PublicInbox/LeiMailSync.pm  |  10 ++-
 lib/PublicInbox/LeiNoteEvent.pm |  22 +++++-
 lib/PublicInbox/LeiWatch.pm     |   7 +-
 lib/PublicInbox/MHreader.pm     |   2 +-
 t/lei-watch.t                   |  12 ++-
 6 files changed, 112 insertions(+), 74 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e0cfd55a..81f940fe 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -28,7 +28,7 @@ use PublicInbox::IPC;
 use Time::HiRes qw(stat); # ctime comparisons for config cache
 use File::Path ();
 use File::Spec;
-use Carp ();
+use Carp qw(carp);
 use Sys::Syslog qw(openlog syslog closelog);
 our $quit = \&CORE::exit;
 our ($current_lei, $errors_log, $listener, $oldset, $dir_idle);
@@ -38,7 +38,7 @@ my $GLP_PASS = Getopt::Long::Parser->new;
 $GLP_PASS->configure(qw(gnu_getopt no_ignore_case auto_abbrev pass_through));
 
 our (%PATH2CFG, # persistent for socket daemon
-$MDIR2CFGPATH, # /path/to/maildir => { /path/to/config => [ ino watches ] }
+$MDIR2CFGPATH, # location => { /path/to/config => [ ino watches ] }
 $OPT, # shared between optparse and opt_dash callback (for Getopt::Long)
 $daemon_pid
 );
@@ -606,7 +606,7 @@ sub _lei_atfork_child {
 	$dir_idle->force_close if $dir_idle;
 	undef $dir_idle;
 	%PATH2CFG = ();
-	$MDIR2CFGPATH = {};
+	$MDIR2CFGPATH = undef;
 	eval 'no warnings; undef $PublicInbox::LeiNoteEvent::to_flush';
 	undef $errors_log;
 	$quit = \&CORE::exit;
@@ -1252,32 +1252,43 @@ sub cfg2lei ($) {
 	$lei;
 }
 
+sub note_event ($@) { # runs lei_note_event for a given config file
+	my ($cfg_f, @args) = @_;
+	my $cfg = $PATH2CFG{$cfg_f} // return;
+	eval { cfg2lei($cfg)->dispatch('note-event', @args) };
+	carp "E: note-event $cfg_f: $@\n" if $@;
+}
+
 sub dir_idle_handler ($) { # PublicInbox::DirIdle callback
 	my ($ev) = @_; # Linux::Inotify2::Event or duck type
 	my $fn = $ev->fullname;
 	if ($fn =~ m!\A(.+)/(new|cur)/([^/]+)\z!) { # Maildir file
-		my ($mdir, $nc, $bn) = ($1, $2, $3);
-		$nc = '' if $ev->IN_DELETE || $ev->IN_MOVED_FROM;
-		for my $f (keys %{$MDIR2CFGPATH->{$mdir} // {}}) {
-			my $cfg = $PATH2CFG{$f} // next;
-			eval {
-				my $lei = cfg2lei($cfg);
-				$lei->dispatch('note-event',
-						"maildir:$mdir", $nc, $bn, $fn);
-			};
-			warn "E: note-event $f: $@\n" if $@;
+		my ($loc, $new_cur, $bn) = ("maildir:$1", $2, $3);
+		$new_cur = '' if $ev->IN_DELETE || $ev->IN_MOVED_FROM;
+		for my $cfg_f (keys %{$MDIR2CFGPATH->{$loc} // {}}) {
+			note_event($cfg_f, $loc, $new_cur, $bn, $fn);
 		}
-	}
+	} elsif ($fn =~ m!\A(.+)/([0-9]+)\z!) { # MH mail message file
+		my ($loc, $n, $new_cur) = ("mh:$1", $2, '+');
+		$new_cur = '' if $ev->IN_DELETE || $ev->IN_MOVED_FROM;
+		for my $cfg_f (keys %{$MDIR2CFGPATH->{$loc} // {}}) {
+			note_event($cfg_f, $loc, $new_cur, $n, $fn);
+		}
+	} elsif ($fn =~ m!\A(.+)/\.mh_sequences\z!) { # reread flags
+		my $loc = "mh:$1";
+		for my $cfg_f (keys %{$MDIR2CFGPATH->{$loc} // {}}) {
+			note_event($cfg_f, $loc, '.mh_sequences')
+		}
+	} # else we don't care
 	if ($ev->can('cancel') && ($ev->IN_IGNORE || $ev->IN_UNMOUNT)) {
 		$ev->cancel;
 	}
 	if ($fn =~ m!\A(.+)/(?:new|cur)\z! && !-e $fn) {
-		delete $MDIR2CFGPATH->{$1};
+		delete $MDIR2CFGPATH->{"maildir:$1"};
 	}
-	if (!-e $fn) { # config file or Maildir gone
-		for my $cfgpaths (values %$MDIR2CFGPATH) {
-			delete $cfgpaths->{$fn};
-		}
+	if (!-e $fn) { # config file, Maildir, or MH dir gone
+		delete $_->{$fn} for values %$MDIR2CFGPATH; # config file
+		delete @$MDIR2CFGPATH{"maildir:$fn", "mh:$fn"};
 		delete $PATH2CFG{$fn};
 	}
 }
@@ -1442,19 +1453,22 @@ sub watch_state_ok ($) {
 	$state =~ /\Apause|(?:import|index|tag)-(?:ro|rw)\z/;
 }
 
-sub cancel_maildir_watch ($$) {
-	my ($d, $cfg_f) = @_;
-	my $w = delete $MDIR2CFGPATH->{$d}->{$cfg_f};
-	scalar(keys %{$MDIR2CFGPATH->{$d}}) or
-		delete $MDIR2CFGPATH->{$d};
-	for my $x (@{$w // []}) { $x->cancel }
+sub cancel_dir_watch ($$$) {
+	my ($type, $d, $cfg_f) = @_;
+	my $loc = "$type:".canonpath_harder($d);
+	my $w = delete $MDIR2CFGPATH->{$loc}->{$cfg_f};
+	delete $MDIR2CFGPATH->{$loc} if !(keys %{$MDIR2CFGPATH->{$loc}});
+	$_->cancel for @$w;
 }
 
-sub add_maildir_watch ($$) {
-	my ($d, $cfg_f) = @_;
-	if (!exists($MDIR2CFGPATH->{$d}->{$cfg_f})) {
-		my @w = $dir_idle->add_watches(["$d/cur", "$d/new"], 1);
-		push @{$MDIR2CFGPATH->{$d}->{$cfg_f}}, @w if @w;
+sub add_dir_watch ($$$) {
+	my ($type, $d, $cfg_f) = @_;
+	$d = canonpath_harder($d);
+	my $loc = "$type:$d";
+	my @dirs = $type eq 'mh' ? ($d) : ("$d/cur", "$d/new");
+	if (!exists($MDIR2CFGPATH->{$loc}->{$cfg_f})) {
+		my @w = $dir_idle->add_watches(\@dirs, 1);
+		push @{$MDIR2CFGPATH->{$loc}->{$cfg_f}}, @w if @w;
 	}
 }
 
@@ -1467,24 +1481,20 @@ sub refresh_watches {
 	my %seen;
 	my $cfg_f = $cfg->{'-f'};
 	for my $w (grep(/\Awatch\..+\.state\z/, keys %$cfg)) {
-		my $url = substr($w, length('watch.'), -length('.state'));
+		my $loc = substr($w, length('watch.'), -length('.state'));
 		require PublicInbox::LeiWatch;
-		$watches->{$url} //= PublicInbox::LeiWatch->new($url);
-		$seen{$url} = undef;
-		my $state = $cfg->get_1("watch.$url.state");
+		$watches->{$loc} //= PublicInbox::LeiWatch->new($loc);
+		$seen{$loc} = undef;
+		my $state = $cfg->get_1("watch.$loc.state");
 		if (!watch_state_ok($state)) {
-			warn("watch.$url.state=$state not supported\n");
-			next;
-		}
-		if ($url =~ /\Amaildir:(.+)/i) {
-			my $d = canonpath_harder($1);
-			if ($state eq 'pause') {
-				cancel_maildir_watch($d, $cfg_f);
-			} else {
-				add_maildir_watch($d, $cfg_f);
-			}
+			warn("watch.$loc.state=$state not supported\n");
+		} elsif ($loc =~ /\A(maildir|mh):(.+)\z/i) {
+			my ($type, $d) = ($1, $2);
+			$state eq 'pause' ?
+				cancel_dir_watch($type, $d, $cfg_f) :
+				add_dir_watch($type, $d, $cfg_f);
 		} else { # TODO: imap/nntp/jmap
-			$lei->child_error(0, "E: watch $url not supported, yet")
+			$lei->child_error(0, "E: watch $loc not supported, yet")
 		}
 	}
 
@@ -1492,29 +1502,28 @@ sub refresh_watches {
 	my $lms = $lei->lms;
 	if ($lms) {
 		$lms->lms_write_prepare;
-		for my $d ($lms->folders('maildir:')) {
-			substr($d, 0, length('maildir:')) = '';
-
+		for my $loc ($lms->folders(qr/\A(?:maildir|mh):/)) {
+			my $old = $loc;
+			my ($type, $d) = split /:/, $loc, 2;
 			# fixup old bugs while we're iterating:
-			my $cd = canonpath_harder($d);
-			my $f = "maildir:$cd";
-			$lms->rename_folder("maildir:$d", $f) if $d ne $cd;
-			next if $watches->{$f}; # may be set to pause
+			$d = canonpath_harder($d);
+			$loc = "$type:$d";
+			$lms->rename_folder($old, $loc) if $old ne $loc;
+			next if $watches->{$loc}; # may be set to pause
 			require PublicInbox::LeiWatch;
-			$watches->{$f} = PublicInbox::LeiWatch->new($f);
-			$seen{$f} = undef;
-			add_maildir_watch($cd, $cfg_f);
+			$watches->{$loc} = PublicInbox::LeiWatch->new($loc);
+			$seen{$loc} = undef;
+			add_dir_watch($type, $d, $cfg_f);
 		}
 	}
 	if ($old) { # cull old non-existent entries
-		for my $url (keys %$old) {
-			next if exists $seen{$url};
-			delete $old->{$url};
-			if ($url =~ /\Amaildir:(.+)/i) {
-				my $d = canonpath_harder($1);
-				cancel_maildir_watch($d, $cfg_f);
+		for my $loc (keys %$old) {
+			next if exists $seen{$loc};
+			delete $old->{$loc};
+			if ($loc =~ /\A(maildir|mh):(.+)\z/i) {
+				cancel_dir_watch($1, $2, $cfg_f);
 			} else { # TODO: imap/nntp/jmap
-				$lei->child_error(0, "E: watch $url TODO");
+				$lei->child_error(0, "E: watch $loc TODO");
 			}
 		}
 	}
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 593715dc..c498421c 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -425,9 +425,13 @@ sub folders {
 	my $re;
 	if (defined($pfx[0])) {
 		$sql .= ' WHERE loc REGEXP ?'; # DBD::SQLite uses perlre
-		$re = !!$pfx[1] ? '.*' : '';
-		$re .= quotemeta($pfx[0]);
-		$re .= '.*';
+		if (ref($pfx[0])) { # assume qr// "Regexp"
+			$re = $pfx[0];
+		} else {
+			$re = !!$pfx[1] ? '.*' : '';
+			$re .= quotemeta($pfx[0]);
+			$re .= '.*';
+		}
 	}
 	my $sth = ($self->{dbh} //= dbh_new($self))->prepare($sql);
 	$sth->bind_param(1, $re) if defined($re);
diff --git a/lib/PublicInbox/LeiNoteEvent.pm b/lib/PublicInbox/LeiNoteEvent.pm
index 8581bd9a..8d900d0c 100644
--- a/lib/PublicInbox/LeiNoteEvent.pm
+++ b/lib/PublicInbox/LeiNoteEvent.pm
@@ -60,6 +60,18 @@ sub maildir_event { # via wq_nonblock_do
 	} # else: eml_from_path already warns
 }
 
+sub _mh_cb { # mh_read_one cb
+	my ($dir, $bn, $kw, $eml, $self, $state) = @_;
+}
+
+sub mh_event { # via wq_nonblock_do
+	my ($self, $folder, $bn, $state) = @_;
+	my $dir = substr($folder, 3);
+	require PublicInbox::MHreader; # if we forked early
+	my $mhr = PublicInbox::MHreader->new($dir, $self->{lei}->{3});
+	$mhr->mh_read_one($bn, \&_mh_cb, $self, $state);
+}
+
 sub lei_note_event {
 	my ($lei, $folder, $new_cur, $bn, $fn, @rest) = @_;
 	die "BUG: unexpected: @rest" if @rest;
@@ -72,11 +84,14 @@ sub lei_note_event {
 	$lms->arg2folder($lei, [ $folder ]);
 	my $state = $cfg->get_1("watch.$folder.state") // 'tag-rw';
 	return if $state eq 'pause';
-	return $lms->clear_src($folder, \$bn) if $new_cur eq '';
+	if ($new_cur eq '') {
+		my $id = $folder =~ /\Amaildir:/ ? \$bn : $bn + 0;
+		return $lms->clear_src($folder, $id);
+	}
 	$lms->lms_pause;
 	$lei->ale; # prepare
 	$sto->write_prepare($lei);
-	require PublicInbox::MdirReader;
+	require PublicInbox::MHreader if $folder =~ /\Amh:/; # optimistic
 	my $self = $cfg->{-lei_note_event} //= do {
 		my $wq = bless { lms => $lms }, __PACKAGE__;
 		# MUAs such as mutt can trigger massive rename() storms so
@@ -91,12 +106,15 @@ sub lei_note_event {
 		$lei->{lne} = $wq;
 	};
 	if ($folder =~ /\Amaildir:/i) {
+		require PublicInbox::MdirReader;
 		my $fl = PublicInbox::MdirReader::maildir_basename_flags($bn)
 			// return;
 		return if index($fl, 'T') >= 0;
 		my $kw = PublicInbox::MdirReader::flags2kw($fl);
 		my $vmd = { kw => $kw, sync_info => [ $folder, \$bn ] };
 		$self->wq_nonblock_do('maildir_event', $fn, $vmd, $state);
+	} elsif ($folder =~ /\Amh:/) {
+		$self->wq_nonblock_do('mh_event', $folder, $bn, $state);
 	} # else: TODO: imap
 }
 
diff --git a/lib/PublicInbox/LeiWatch.pm b/lib/PublicInbox/LeiWatch.pm
index 35267b58..b30e5152 100644
--- a/lib/PublicInbox/LeiWatch.pm
+++ b/lib/PublicInbox/LeiWatch.pm
@@ -1,13 +1,12 @@
 # Copyright all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
-# represents a Maildir or IMAP "watch" item
+# represents a Maildir, MH or IMAP "watch" item
 package PublicInbox::LeiWatch;
-use strict;
-use v5.10.1;
+use v5.12;
 use parent qw(PublicInbox::IPC);
 
-# "url" may be something like "maildir:/path/to/dir"
+# "url" may be something like "maildir:/path/to/dir" or "mh:/path/to/dir"
 sub new { bless { url => $_[1] }, $_[0] }
 
 1;
diff --git a/lib/PublicInbox/MHreader.pm b/lib/PublicInbox/MHreader.pm
index 673e3e06..033aa740 100644
--- a/lib/PublicInbox/MHreader.pm
+++ b/lib/PublicInbox/MHreader.pm
@@ -82,7 +82,7 @@ sub kw_for ($$) {
 	\@kw;
 }
 
-sub _file2eml { # mh_each_file cb
+sub _file2eml { # mh_each_file / mh_read_one cb
 	my ($dir, $n, $self, $ucb, @arg) = @_;
 	my $eml = eml_from_path($n);
 	$ucb->($dir, $n, kw_for($self, $n), $eml, @arg) if $eml;
diff --git a/t/lei-watch.t b/t/lei-watch.t
index 7b357ee0..8ad50d13 100644
--- a/t/lei-watch.t
+++ b/t/lei-watch.t
@@ -3,6 +3,7 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 use File::Path qw(make_path remove_tree);
+use PublicInbox::IO qw(write_file);
 plan skip_all => "TEST_FLAKY not enabled for $0" if !$ENV{TEST_FLAKY};
 require_mods('lei');
 my $have_fast_inotify = eval { require PublicInbox::Inotify } ||
@@ -13,7 +14,7 @@ $have_fast_inotify or
 
 my ($ro_home, $cfg_path) = setup_public_inboxes;
 test_lei(sub {
-	my $md = "$ENV{HOME}/md";
+	my ($md, $mh1, $mh2) = map { "$ENV{HOME}/$_" } qw(md mh1 mh2);
 	my $cfg_f = "$ENV{HOME}/.config/lei/config";
 	my $md2 = $md.'2';
 	lei_ok 'ls-watch';
@@ -45,13 +46,14 @@ test_lei(sub {
 	}
 
 	# first, make sure tag-ro works
-	make_path("$md/new", "$md/cur", "$md/tmp");
+	make_path("$md/new", "$md/cur", "$md/tmp", $mh1, $mh2);
 	lei_ok qw(add-watch --state=tag-ro), $md;
 	lei_ok 'ls-watch';
 	like($lei_out, qr/^\Qmaildir:$md\E$/sm, 'maildir shown');
 	lei_ok qw(q mid:testmessage@example.com -o), $md, '-I', "$ro_home/t1";
 	my @f = glob("$md/cur/*:2,");
 	is(scalar(@f), 1, 'got populated maildir with one result');
+
 	rename($f[0], "$f[0]S") or xbail "rename $!"; # set (S)een
 	tick($have_fast_inotify ? 0.2 : 2.2); # always needed for 1 CPU systems
 	lei_ok qw(note-event done); # flushes immediately (instead of 5s)
@@ -94,6 +96,12 @@ test_lei(sub {
 		my $cmp = [ <$fh> ];
 		is_xdeeply($cmp, $ino_contents, 'inotify Maildir watches gone');
 	};
+
+	write_file '>', "$mh1/.mh_sequences";
+	lei_ok qw(add-watch --state=tag-ro), $mh1, "mh:$mh2";
+	lei_ok 'ls-watch', \'refresh watches';
+	like $lei_out, qr/^\Qmh:$mh1\E$/sm, 'MH 1 shown';
+	like $lei_out, qr/^\Qmh:$mh2\E$/sm, 'MH 2 shown';
 });
 
 done_testing;

^ permalink raw reply related	[relevance 32%]
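
The path matching added to dir_idle_handler in the patch above can be
sketched in isolation.  classify_event_path is a hypothetical name
used only here; the three patterns are copied from the patch, and the
return order mirrors the (location, new_cur, basename) arguments
passed to note_event.

```perl
use v5.12;

# Hypothetical standalone version of the dir_idle_handler dispatch.
# Returns an empty list for paths we don't care about.
sub classify_event_path {
	my ($fn) = @_;
	if ($fn =~ m!\A(.+)/(new|cur)/([^/]+)\z!) { # Maildir file
		return ("maildir:$1", $2, $3);
	} elsif ($fn =~ m!\A(.+)/([0-9]+)\z!) { # MH mail message file
		return ("mh:$1", '+', $2);
	} elsif ($fn =~ m!\A(.+)/\.mh_sequences\z!) { # reread flags
		return ("mh:$1", '.mh_sequences');
	}
	return; # else we don't care
}

for my $fn (qw(/home/u/md/cur/123 /home/u/mh/42 /home/u/mh/.mh_sequences)) {
	say join(' ', classify_event_path($fn));
}
```

The Maildir pattern is tried first, so an all-digit filename inside
new/ or cur/ is still treated as Maildir rather than MH.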

* [PATCH v2] lei: support reading MH for convert+import+index
  2023-12-16 13:09 20% [PATCH] lei: support reading MH for convert+import+index Eric Wong
  2023-12-16 16:15 71% ` Konstantin Ryabitsev
@ 2023-12-29 18:05 19% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-12-29 18:05 UTC (permalink / raw)
  To: meta

The MH format is widely supported and used by various MUAs such
as mutt and sylpheed, and an MH-like format is used by mlmmj for
archives, as well.  Locking implementations for writes are
inconsistent, so this commit doesn't support writes, yet.

inotify|EVFILT_VNODE watches aren't supported, yet, but that'll
have to come since MH allows packing unused integers and
renaming files.
---
v2 fixes:
* uses Perl REGEXP match via DBD::SQLite for folder filtering
* unconditionally verify blob contents
* eliminate unused $tmpdir in test
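
For context on the "verify blob contents" item: since MH numbers can
be reused after packing, a file found at a recorded number may no
longer hold the expected message.  A rough sketch of the check
follows, assuming SHA-1 object IDs only and using Digest::SHA
directly; git_blob_sha1 and blob_changed are hypothetical names, not
the git_sha/blob_mismatch subs from the patch.

```perl
use v5.12;
use Digest::SHA qw(sha1_hex);

# git hashes a blob as "blob <size>\0<content>"; SHA-1 assumed here
sub git_blob_sha1 {
	my ($rawref) = @_;
	sha1_hex('blob '.length($$rawref)."\0".$$rawref);
}

# Returns true (and warns) if the bytes read from $f no longer
# match the object ID we recorded for it.
sub blob_changed {
	my ($f, $oidhex, $rawref) = @_;
	my $got = git_blob_sha1($rawref);
	return if $got eq $oidhex;
	warn "$f changed $oidhex => $got\n";
	1;
}

my $raw = "Subject: replied a\n\n";
my $oid = git_blob_sha1(\$raw);
say blob_changed('mh/3', $oid, \$raw) ? 'changed' : 'unchanged';
```

On a mismatch the caller can move on to the next candidate location
instead of returning the wrong message.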

diff -u b/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
--- b/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -471,19 +471,20 @@
 		}
 	}
 
+	# MH, except `uid' is not always unique (can be packed)
 	$b2n = $dbh->prepare(<<'');
 SELECT f.loc,b.uid FROM blob2num b
 LEFT JOIN folders f ON b.fid = f.fid
-WHERE b.oidbin = ? /* AND f.loc LIKE 'mh:/%' */
+WHERE b.oidbin = ? AND f.loc REGEXP '^mh:/'
 
 	$b2n->bind_param(1, $oidbin, SQL_BLOB);
 	$b2n->execute;
-	while (my ($d, $n) = $b2n->fetchrow_array) {
-		substr($d, 0, length('mh:')) = '';
-		my $f = "$d/$n";
+	while (my ($f, $n) = $b2n->fetchrow_array) {
+		$f =~ s/\Amh://s or die "BUG: not MH: $f";
+		$f .= "/$n";
 		open my $fh, '<', $f or next;
 		my $raw = read_all($fh, -s $fh // next);
-		next if $vrfy && blob_mismatch $f, $oidhex, \$raw;
+		next if blob_mismatch $f, $oidhex, \$raw;
 		return \$raw;
 	}
 	undef;
diff -u b/t/mh_reader.t b/t/mh_reader.t
--- b/t/mh_reader.t
+++ b/t/mh_reader.t
@@ -10,7 +10,6 @@
 use autodie;
 opendir my $cwdfh, '.';
 
-my $tmpdir = tmpdir;
 my $normal = create_dir 'normal', sub {
 	write_file '>', 3, "Subject: replied a\n\n";
 	write_file '>', 4, "Subject: replied b\n\n";

 MANIFEST                       |   3 +
 lib/PublicInbox/LEI.pm         |  13 ++--
 lib/PublicInbox/LeiConvert.pm  |   5 ++
 lib/PublicInbox/LeiImport.pm   |  23 +++++++
 lib/PublicInbox/LeiImportKw.pm |   2 +-
 lib/PublicInbox/LeiIndex.pm    |   2 +-
 lib/PublicInbox/LeiInput.pm    |  52 +++++++++++++---
 lib/PublicInbox/LeiMailSync.pm |  40 ++++++++----
 lib/PublicInbox/LeiToMail.pm   |   5 ++
 lib/PublicInbox/MHreader.pm    | 103 +++++++++++++++++++++++++++++++
 lib/PublicInbox/MdirReader.pm  |   2 +-
 lib/PublicInbox/MdirSort.pm    |  46 ++++++++++++++
 lib/PublicInbox/TestCommon.pm  |  22 ++++---
 t/mh_reader.t                  | 107 +++++++++++++++++++++++++++++++++
 14 files changed, 392 insertions(+), 33 deletions(-)
 create mode 100644 lib/PublicInbox/MHreader.pm
 create mode 100644 lib/PublicInbox/MdirSort.pm
 create mode 100644 t/mh_reader.t

diff --git a/MANIFEST b/MANIFEST
index 109ce88a..051cd6f9 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -296,6 +296,7 @@ lib/PublicInbox/Linkify.pm
 lib/PublicInbox/Listener.pm
 lib/PublicInbox/Lock.pm
 lib/PublicInbox/MDA.pm
+lib/PublicInbox/MHreader.pm
 lib/PublicInbox/MID.pm
 lib/PublicInbox/MIME.pm
 lib/PublicInbox/MailDiff.pm
@@ -305,6 +306,7 @@ lib/PublicInbox/MboxGz.pm
 lib/PublicInbox/MboxLock.pm
 lib/PublicInbox/MboxReader.pm
 lib/PublicInbox/MdirReader.pm
+lib/PublicInbox/MdirSort.pm
 lib/PublicInbox/MiscIdx.pm
 lib/PublicInbox/MiscSearch.pm
 lib/PublicInbox/MsgIter.pm
@@ -547,6 +549,7 @@ t/mda-mime.eml
 t/mda.t
 t/mda_filter_rubylang.t
 t/mdir_reader.t
+t/mh_reader.t
 t/mid.t
 t/mime.t
 t/miscsearch.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17431518..e0cfd55a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -267,7 +267,7 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s new-only
 	lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!
-	commit-delay=i),
+	commit-delay=i sort|s:s@),
 	@net_opt, @c_opt ],
 'forget-mail-sync' => [ 'LOCATION...',
 	'forget sync information for a mail folder', @c_opt ],
@@ -280,7 +280,7 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
 	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s lock=s@ kw!
-		rsyncable),
+		rsyncable sort|s:s@),
 	@net_opt, @c_opt ],
 'p2q' => [ 'LOCATION_OR_COMMIT...|--stdin',
 	"use a patch to generate a query for `lei q --stdin'",
@@ -321,6 +321,9 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 my $stdin_formats = [ 'MAIL_FORMAT|eml|mboxrd|mboxcl2|mboxcl|mboxo',
 			'specify message input format' ];
 my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
+my $sort_out = [ 'VAL|received|relevance|docid',
+		"order of results is `--output'-dependent"];
+my $sort_in = [ 'sequence|mtime|size', 'sort input (format-dependent)' ];
 
 # we use \x{a0} (non-breaking SP) to avoid wrapping in PublicInbox::LeiHelp
 my %OPTDESC = (
@@ -428,8 +431,10 @@ my %OPTDESC = (
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
-'sort|s=s' => [ 'VAL|received|relevance|docid',
-		"order of results is `--output'-dependent"],
+'sort|s=s	q' => $sort_out,
+'sort|s=s	lcat' => $sort_out,
+'sort|s:s@	convert' => $sort_in,
+'sort|s:s@	import' => $sort_in,
 'reverse|r' => 'reverse search results', # like sort(1)
 
 'boost=i' => 'increase/decrease priority of results (default: 0)',
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 8f628562..17a952f2 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -28,6 +28,11 @@ sub input_maildir_cb {
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
+sub input_mh_cb {
+	my ($dn, $bn, $kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
 sub process_inputs { # via wq_do
 	my ($self) = @_;
 	local $PublicInbox::DS::in_loop = 0; # force synchronous awaitpid
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index c2552bf0..5521188c 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -53,6 +53,29 @@ sub pmdir_cb { # called via wq_io_do from LeiPmdir->each_mdir_fn
 	}
 }
 
+sub input_mh_cb {
+	my ($mhdir, $n, $kw, $eml, $self) = @_;
+	substr($mhdir, 0, 0) = 'mh:'; # add prefix
+	my $lse = $self->{lse} //= $self->{lei}->{sto}->search;
+	my $lms = $self->{-lms_rw} //= $self->{lei}->lms; # may be 0 or undef
+	my @oidbin = $lms ? $lms->num_oidbin($mhdir, $n) : ();
+	@oidbin > 1 and warn("W: $mhdir/$n not unique:\n",
+				map { "\t".unpack('H*', $_)."\n" } @oidbin);
+	my @docids = sort { $a <=> $b } uniqstr
+			map { $lse->over->oidbin_exists($_) } @oidbin;
+	if (scalar @docids) {
+		$lse->kw_changed(undef, $kw, \@docids) or return;
+	}
+	if (defined $eml) {
+		my $vmd = $self->{-import_kw} ? { kw => $kw } : undef;
+		$vmd->{sync_info} = [ $mhdir, $n + 0 ] if $self->{-mail_sync};
+		$self->input_eml_cb($eml, $vmd);
+	}
+	# TODO:
+	# elsif (my $ikw = $self->{lei}->{ikw}) { # old message, kw only
+	#	$ikw->wq_io_do('ck_update_kw', [], "mh:$dir", $uid, $kw);
+}
+
 sub input_net_cb { # imap_each / nntp_each
 	my ($uri, $uid, $kw, $eml, $self) = @_;
 	if (defined $eml) {
diff --git a/lib/PublicInbox/LeiImportKw.pm b/lib/PublicInbox/LeiImportKw.pm
index 4b8e69fb..765e23cd 100644
--- a/lib/PublicInbox/LeiImportKw.pm
+++ b/lib/PublicInbox/LeiImportKw.pm
@@ -36,7 +36,7 @@ sub ipc_atfork_child {
 sub ck_update_kw { # via wq_io_do
 	my ($self, $url, $uid, $kw) = @_;
 	my @oidbin = $self->{-lms_rw}->num_oidbin($url, $uid);
-	my $uid_url = "$url/;UID=$uid";
+	my $uid_url = index($url, 'mh:') == 0 ? $url.$uid : "$url/;UID=$uid";
 	@oidbin > 1 and warn("W: $uid_url not unique:\n",
 				map { "\t".unpack('H*', $_)."\n" } @oidbin);
 	my @docids = sort { $a <=> $b } uniqstr
diff --git a/lib/PublicInbox/LeiIndex.pm b/lib/PublicInbox/LeiIndex.pm
index b3f3e1a0..0e329e58 100644
--- a/lib/PublicInbox/LeiIndex.pm
+++ b/lib/PublicInbox/LeiIndex.pm
@@ -35,7 +35,7 @@ sub lei_index {
 
 no warnings 'once';
 no strict 'refs';
-for my $m (qw(pmdir_cb input_net_cb)) {
+for my $m (qw(pmdir_cb input_net_cb input_mh_cb)) {
 	*$m = PublicInbox::LeiImport->can($m);
 }
 
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index daba9a8e..947a7a79 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -69,6 +69,11 @@ sub input_maildir_cb {
 	$self->input_eml_cb($eml);
 }
 
+sub input_mh_cb {
+	my ($dn, $n, $kw, $eml, $self) = @_;
+	$self->input_eml_cb($eml);
+}
+
 sub input_net_cb { # imap_each, nntp_each cb
 	my ($url, $uid, $kw, $eml, $self) = @_;
 	$self->input_eml_cb($eml);
@@ -190,7 +195,7 @@ sub input_path_url {
 		$ifmt = lc($1);
 	} elsif ($input =~ /\.(?:patch|eml)\z/i) {
 		$ifmt = 'eml';
-	} elsif (-f $input && $input =~ m{\A(?:.+)/(?:new|cur)/([^/]+)\z}) {
+	} elsif ($input =~ m{\A(?:.+)/(?:new|cur)/([^/]+)\z} && -f $input) {
 		my $bn = $1;
 		my $fl = PublicInbox::MdirReader::maildir_basename_flags($bn);
 		return if index($fl, 'T') >= 0;
@@ -204,6 +209,10 @@ sub input_path_url {
 	my $devfd = $lei->path_to_fd($input) // return;
 	if ($devfd >= 0) {
 		$self->input_fh($ifmt, $lei->{$devfd}, $input, @args);
+	} elsif ($devfd < 0 && $input =~ m{\A(.+/)([0-9]+)\z} && -f $input) {
+		my ($dn, $n) = ($1, $2);
+		my $mhr = PublicInbox::MHreader->new($dn, $lei->{3});
+		$mhr->mh_read_one($n, $self->can('input_mh_cb'), $self);
 	} elsif (-f $input && $ifmt eq 'eml') {
 		open my $fh, '<', $input or
 					return $lei->fail("open($input): $!");
@@ -231,6 +240,10 @@ sub input_path_url {
 						$self->can('input_maildir_cb'),
 						$self, @args);
 		}
+	} elsif (-d _ && $ifmt eq 'mh') {
+		my $mhr = PublicInbox::MHreader->new($input.'/', $lei->{3});
+		$mhr->{sort} = $lei->{opt}->{sort};
+		$mhr->mh_each_eml($self->can('input_mh_cb'), $self, @args);
 	} elsif (-d _ && $ifmt =~ /\A(?:v1|v2)\z/) {
 		my $ibx = PublicInbox::Inbox->new({inboxdir => $input});
 		each_ibx_eml($self, $ibx, @args);
@@ -354,13 +367,15 @@ sub prepare_inputs { # returns undef on error
 				PublicInbox::MboxReader->reads($ifmt) or return
 					$lei->fail("$ifmt not supported");
 			} elsif (-d $input_path) { # TODO extindex
-				$ifmt =~ /\A(?:maildir|v1|v2|extindex)\z/ or
+				$ifmt =~ /\A(?:maildir|mh|v1|v2|extindex)\z/ or
 					return$lei->fail("$ifmt not supported");
 				$input = $input_path;
 				add_dir $lei, $istate, $ifmt, \$input;
-			} elsif ($self->{missing_ok} && !-e _) {
+			} elsif ($self->{missing_ok} &&
+					$ifmt =~ /\A(?:maildir|mh)\z/ &&
+					!-e $input_path) {
 				# for "lei rm-watch" on missing Maildir
-				$may_sync and $input = 'maildir:'.
+				$may_sync and $input = "$ifmt:".
 						$lei->abs_path($input_path);
 			} else {
 				my $m = "Unable to handle $input";
@@ -373,7 +388,7 @@ sub prepare_inputs { # returns undef on error
 $input is `eml', not --in-format=$in_fmt
 
 			push @{$sync->{no}}, $input if $sync;
-		} elsif (-f $input && $input =~ m{\A(.+)/(new|cur)/([^/]+)\z}) {
+		} elsif ($input =~ m{\A(.+)/(new|cur)/([^/]+)\z} && -f $input) {
 			# single file in a Maildir
 			my ($mdir, $nc, $bn) = ($1, $2, $3);
 			my $other = $mdir . ($nc eq 'new' ? '/cur' : '/new');
@@ -385,12 +400,24 @@ $input is `eml', not --in-format=$in_fmt
 
 			if ($sync) {
 				$input = $lei->abs_path($mdir) . "/$nc/$bn";
-				push @{$sync->{ok}}, $input if $sync;
+				push @{$sync->{ok}}, $input;
 			}
 			require PublicInbox::MdirReader;
 		} else {
 			my $devfd = $lei->path_to_fd($input) // return;
-			if ($devfd >= 0 || -f $input || -p _) {
+			if ($devfd < 0 && $input =~ m{\A(.+)/([0-9]+)\z} &&
+					-f $input) { # single file in MH dir
+				my ($mh, $n) = ($1, $2);
+				lc($in_fmt//'eml') eq 'eml' or
+						return $lei->fail(<<"");
+$input is `eml', not --in-format=$in_fmt
+
+				if ($sync) {
+					$input = $lei->abs_path($mh)."/$n";
+					push @{$sync->{ok}}, $input;
+				}
+				require PublicInbox::MHreader;
+			} elsif ($devfd >= 0 || -f $input || -p _) {
 				push @{$sync->{no}}, $input if $sync;
 				push @f, $input;
 			} elsif (-d "$input/new" && -d "$input/cur") {
@@ -401,10 +428,13 @@ $input is `eml', not --in-format=$in_fmt
 				add_dir $lei, $istate, 'v1', \$input;
 			} elsif (-e "$input/ei.lock") {
 				add_dir $lei, $istate, 'extindex', \$input;
+			} elsif (-f "$input/.mh_sequences") {
+				add_dir $lei, $istate, 'mh', \$input;
 			} elsif ($self->{missing_ok} && !-e $input) {
 				if ($lei->{cmd} eq 'p2q') {
 					# will run "git format-patch"
 				} elsif ($may_sync) { # for lei rm-watch
+					# FIXME: support MH, here
 					$input = 'maildir:'.
 						$lei->abs_path($input);
 				}
@@ -446,6 +476,14 @@ $input is `eml', not --in-format=$in_fmt
 			$lei->refresh_watches;
 		}
 	}
+	if (my $mh = $istate->{mh}) {
+		require PublicInbox::MHreader;
+		grep(!m!\Amh:!i, @$mh) and die "BUG: @$mh (no pfx)";
+		if ($may_sync && $lei->{sto}) {
+			$lei->lms(1)->lms_write_prepare->add_folders(@$mh);
+			# $lei->refresh_watches; TODO
+		}
+	}
 	require PublicInbox::ExtSearch if $istate->{extindex};
 	$self->{inputs} = $inputs;
 }
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 17254a82..593715dc 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -435,15 +435,24 @@ sub folders {
 	map { $_->[0] } @{$sth->fetchall_arrayref};
 }
 
+sub blob_mismatch ($$$) {
+	my ($f, $oidhex, $rawref) = @_;
+	my $sha = $HEXLEN2SHA{length($oidhex)};
+	my $got = git_sha($sha, $rawref)->hexdigest;
+	$got eq $oidhex ? undef : warn("$f changed $oidhex => $got\n");
+}
+
 sub local_blob {
 	my ($self, $oidhex, $vrfy) = @_;
 	my $dbh = $self->{dbh} //= dbh_new($self);
+	my $oidbin = pack('H*', $oidhex);
+
 	my $b2n = $dbh->prepare(<<'');
 SELECT f.loc,b.name FROM blob2name b
 LEFT JOIN folders f ON b.fid = f.fid
 WHERE b.oidbin = ?
 
-	$b2n->bind_param(1, pack('H*', $oidhex), SQL_BLOB);
+	$b2n->bind_param(1, $oidbin, SQL_BLOB);
 	$b2n->execute;
 	while (my ($d, $n) = $b2n->fetchrow_array) {
 		substr($d, 0, length('maildir:')) = '';
@@ -456,19 +465,28 @@ WHERE b.oidbin = ?
 			my $f = "$d/$x/$n";
 			open my $fh, '<', $f or next;
 			# some (buggy) Maildir writers are non-atomic:
-			next unless -s $fh;
-			my $raw = read_all($fh, -s _);
-			if ($vrfy) {
-				my $sha = $HEXLEN2SHA{length($oidhex)};
-				my $got = git_sha($sha, \$raw)->hexdigest;
-				if ($got ne $oidhex) {
-					warn "$f changed $oidhex => $got\n";
-					next;
-				}
-			}
+			my $raw = read_all($fh, -s $fh // next);
+			next if $vrfy && blob_mismatch $f, $oidhex, \$raw;
 			return \$raw;
 		}
 	}
+
+	# MH, except `uid' is not always unique (can be packed)
+	$b2n = $dbh->prepare(<<'');
+SELECT f.loc,b.uid FROM blob2num b
+LEFT JOIN folders f ON b.fid = f.fid
+WHERE b.oidbin = ? AND f.loc REGEXP '^mh:/'
+
+	$b2n->bind_param(1, $oidbin, SQL_BLOB);
+	$b2n->execute;
+	while (my ($f, $n) = $b2n->fetchrow_array) {
+		$f =~ s/\Amh://s or die "BUG: not MH: $f";
+		$f .= "/$n";
+		open my $fh, '<', $f or next;
+		my $raw = read_all($fh, -s $fh // next);
+		next if blob_mismatch $f, $oidhex, \$raw;
+		return \$raw;
+	}
 	undef;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 071ba113..de75e99e 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -400,6 +400,11 @@ sub new {
 				"$dst exists and is not a directory\n";
 		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
 		$lei->{opt}->{save} //= \1 if $lei->{cmd} eq 'q';
+	} elsif ($fmt eq 'mh') {
+		-e $dst && !-d _ and die
+				"$dst exists and is not a directory\n";
+		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
+		$lei->{opt}->{save} //= \1 if $lei->{cmd} eq 'q';
 	} elsif (substr($fmt, 0, 4) eq 'mbox') {
 		require PublicInbox::MboxReader;
 		$self->can("eml2$fmt") or die "bad mbox format: $fmt\n";
diff --git a/lib/PublicInbox/MHreader.pm b/lib/PublicInbox/MHreader.pm
new file mode 100644
index 00000000..673e3e06
--- /dev/null
+++ b/lib/PublicInbox/MHreader.pm
@@ -0,0 +1,103 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# MH reader, based on Lib/mailbox.py in cpython source
+package PublicInbox::MHreader;
+use v5.12;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::OnDestroy;
+use PublicInbox::IO qw(try_cat);
+use PublicInbox::MdirSort;
+use Carp qw(carp);
+use autodie qw(chdir closedir opendir);
+
+my %FL2OFF = ( # mh_sequences key => our keyword
+	replied => 0,
+	flagged => 1,
+	unseen => 2, # negate
+);
+my @OFF2KW = qw(answered flagged); # [2] => unseen (negated)
+
+sub new {
+	my ($cls, $dir, $cwdfh) = @_;
+	if (substr($dir, -1) ne '/') { # TODO: do this earlier
+		carp "W: appending `/' to `$dir' (fix caller)\n";
+		$dir .= '/';
+	}
+	bless { dir => $dir, cwdfh => $cwdfh }, $cls;
+}
+
+sub read_mh_sequences ($) { # caller must chdir($self->{dir})
+	my ($self) = @_;
+	my ($fl, $off, @n);
+	my @seq = ('', '', '');
+	for (split /\n+/s, try_cat('.mh_sequences')) {
+		($fl, @n) = split /[: \t]+/;
+		$off = $FL2OFF{$fl} // do { warn <<EOM;
+W: unknown `$fl' in $self->{dir}.mh_sequences (ignoring)
+EOM
+			next;
+		};
+		@n = grep /\A[0-9]+\z/s, @n; # don't stat, yet
+		if (@n) {
+			@n = sort { $b <=> $a } @n; # to avoid resize
+			my $buf = '';
+			vec($buf, $_, 1) = 1 for @n;
+			$seq[$off] = $buf;
+		}
+	}
+	\@seq;
+}
+
+sub mh_each_file {
+	my ($self, $efcb, @arg) = @_;
+	opendir(my $dh, my $dir = $self->{dir});
+	my $restore = PublicInbox::OnDestroy->new($$, \&chdir, $self->{cwdfh});
+	chdir($dh);
+	if (defined(my $sort = $self->{sort})) {
+		my @sort = map {
+			my @tmp = $_ eq '' ? ('sequence') : split(/[, ]/);
+			# sorting by name alphabetically makes no sense for MH:
+			for my $k (@tmp) {
+				$k =~ s/\A(\-|\+|)(?:name|)\z/$1sequence/;
+			}
+			@tmp;
+		} @$sort;
+		my @n = grep /\A[0-9]+\z/s, readdir $dh;
+		mdir_sort \@n, \@sort;
+		$efcb->($dir, $_, $self, @arg) for @n;
+	} else {
+		while (readdir $dh) { # perl v5.12+ to set $_ on readdir
+			$efcb->($dir, $_, $self, @arg) if /\A[0-9]+\z/s;
+		}
+	}
+	closedir $dh; # may die
+}
+
+sub kw_for ($$) {
+	my ($self, $n) = @_;
+	my $seq = $self->{mh_seq} //= read_mh_sequences($self);
+	my @kw = map { vec($seq->[$_], $n, 1) ? $OFF2KW[$_] : () } (0, 1);
+	vec($seq->[2], $n, 1) or push @kw, 'seen';
+	\@kw;
+}
+
+sub _file2eml { # mh_each_file cb
+	my ($dir, $n, $self, $ucb, @arg) = @_;
+	my $eml = eml_from_path($n);
+	$ucb->($dir, $n, kw_for($self, $n), $eml, @arg) if $eml;
+}
+
+sub mh_each_eml {
+	my ($self, $ucb, @arg) = @_;
+	mh_each_file($self, \&_file2eml, $ucb, @arg);
+}
+
+sub mh_read_one {
+	my ($self, $n, $ucb, @arg) = @_;
+	my $restore = PublicInbox::OnDestroy->new($$, \&chdir, $self->{cwdfh});
+	chdir(my $dir = $self->{dir});
+	_file2eml($dir, $n, $self, $ucb, @arg);
+}
+
+1;
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index db5f4545..2981b058 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -1,7 +1,7 @@
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
-# Maildirs for now, MH eventually
+# Maildirs only (PublicInbox::MHreader exists, now)
 # ref: https://cr.yp.to/proto/maildir.html
 #	https://wiki2.dovecot.org/MailboxFormat/Maildir
 package PublicInbox::MdirReader;
diff --git a/lib/PublicInbox/MdirSort.pm b/lib/PublicInbox/MdirSort.pm
new file mode 100644
index 00000000..6bd9fb6c
--- /dev/null
+++ b/lib/PublicInbox/MdirSort.pm
@@ -0,0 +1,46 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# used for sorting MH (and (TODO) Maildir) names
+# TODO: consider sort(1) to parallelize sorting of gigantic directories
+package PublicInbox::MdirSort;
+use v5.12;
+use Time::HiRes ();
+use parent qw(Exporter);
+use Fcntl qw(S_ISREG);
+our @EXPORT = qw(mdir_sort);
+my %ST = (sequence => 0, size => 1, atime => 2, mtime => 3, ctime => 4);
+
+sub mdir_sort ($$;$) {
+	my ($ent, $sort, $max) = @_;
+	my @st;
+	my @ent = map {
+		@st = Time::HiRes::stat $_;
+		# name, size, {a,m,c}time
+		S_ISREG($st[2]) ? [ $_, @st[7..10] ] : ();
+	} @$ent;
+	@ent = grep { $_->[1] <= $max } @ent if $max;
+	use sort 'stable';
+	for my $s (@$sort) {
+		if ($s =~ /\A(\-|\+|)name\z/) {
+			if ($1 eq '-') {
+				@ent = sort { $b->[0] cmp $a->[0] } @ent;
+			} else {
+				@ent = sort { $a->[0] cmp $b->[0] } @ent;
+			}
+		} elsif ($s =~ /\A(\-|\+|)
+				(sequence|size|ctime|mtime|atime)\z/x) {
+			my $key = $ST{$2};
+			if ($1 eq '-') {
+				@ent = sort { $b->[$key] <=> $a->[$key] } @ent;
+			} else {
+				@ent = sort { $a->[$key] <=> $b->[$key] } @ent;
+			}
+		} else {
+			die "E: unrecognized sort parameter: `$s'";
+		}
+	}
+	@$ent = map { $_->[0] } @ent;
+}
+
+1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index b0f28e16..d20bff28 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -24,6 +24,7 @@ BEGIN {
 	@EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes create_inbox
+		create_dir
 		create_coderepo require_bsd kernel_version check_broken_tmpfs
 		quit_waiter_pipe wait_for_eof require_git_http_backend
 		tcp_host_port test_lei lei lei_ok $lei_out $lei_err $lei_opt
@@ -843,26 +844,24 @@ sub my_sum {
 	substr PublicInbox::SHA::sha256_hex(join('', @l)), 0, 8;
 }
 
-sub create_coderepo ($$;@) {
-	my $ident = shift;
-	my $cb = pop;
+sub create_dir (@) {
+	my ($ident, $cb) = (shift, pop);
 	my %opt = @_;
 	require PublicInbox::Lock;
 	require PublicInbox::Import;
-	my ($base) = ($0 =~ m!\b([^/]+)\.[^\.]+\z!);
-	my ($db) = (PublicInbox::Import::default_branch() =~ m!([^/]+)\z!);
 	my $tmpdir = delete $opt{tmpdir};
-	my $dir = "t/data-gen/$base.$ident-".my_sum($db, $cb, \%opt);
+	my ($base) = ($0 =~ m!\b([^/]+)\.[^\.]+\z!);
+	my $dir = "t/data-gen/$base.$ident-".my_sum($cb, \%opt);
 	require File::Path;
 	my $new = File::Path::make_path($dir);
 	my $lk = PublicInbox::Lock->new("$dir/creat.lock");
 	my $scope = $lk->lock_for_scope;
 	if (!-f "$dir/creat.stamp") {
-		opendir(my $dfh, '.');
+		opendir(my $cwd, '.');
 		chdir($dir);
 		local %ENV = (%ENV, %COMMIT_ENV);
 		$cb->($dir);
-		chdir($dfh);
+		chdir($cwd); # some $cb chdir around
 		open my $s, '>', "$dir/creat.stamp";
 	}
 	return $dir if !defined($tmpdir);
@@ -870,6 +869,13 @@ sub create_coderepo ($$;@) {
 	$tmpdir;
 }
 
+sub create_coderepo (@) {
+	my $ident = shift;
+	require PublicInbox::Import;
+	my ($db) = (PublicInbox::Import::default_branch() =~ m!([^/]+)\z!);
+	create_dir "$ident-$db", @_;
+}
+
 sub create_inbox ($;@) {
 	my $ident = shift;
 	my $cb = pop;
diff --git a/t/mh_reader.t b/t/mh_reader.t
new file mode 100644
index 00000000..e8f69fa8
--- /dev/null
+++ b/t/mh_reader.t
@@ -0,0 +1,107 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use PublicInbox::TestCommon;
+require_ok 'PublicInbox::MHreader';
+use PublicInbox::IO qw(write_file);
+use PublicInbox::Lock;
+use PublicInbox::OnDestroy;
+use PublicInbox::Eml;
+use autodie;
+opendir my $cwdfh, '.';
+
+my $normal = create_dir 'normal', sub {
+	write_file '>', 3, "Subject: replied a\n\n";
+	write_file '>', 4, "Subject: replied b\n\n";
+	write_file '>', 1, "Subject: unseen\n\n";
+	write_file '>', 2, "Subject: unseen flagged\n\n";
+	write_file '>', '.mh_sequences', <<EOM;
+unseen: 1 2
+flagged: 2
+replied: 3 4
+EOM
+};
+
+my $for_sort = create_dir 'size', sub {
+	for (1..3) {
+		my $name = 10 - $_;
+		write_file '>', $name, "Subject: ".($_ x $_)."\n\n";
+	}
+};
+
+my $stale = create_dir 'stale', sub {
+	write_file '>', 4, "Subject: msg 4\n\n";
+	write_file '>', '.mh_sequences', <<EOM;
+unseen: 1 2
+EOM
+};
+
+{
+	my $mhr = PublicInbox::MHreader->new("$normal/", $cwdfh);
+	$mhr->{sort} = [ '' ];
+	my @res;
+	$mhr->mh_each_eml(sub { push @res, \@_; }, [ 'bogus' ]);
+	is scalar(@res), 4, 'got 4 messages' or diag explain(\@res);
+	is_deeply [map { $_->[1] } @res], [1, 2, 3, 4],
+		'got messages in expected order';
+	is scalar(grep { $_->[4]->[0] eq 'bogus' } @res), scalar(@res),
+		'cb arg passed to all messages' or diag explain(\@res);
+
+	$mhr = PublicInbox::MHreader->new("$stale/", $cwdfh);
+	@res = ();
+	$mhr->mh_each_eml(sub { push @res, \@_; });
+	is scalar(@res), 1, 'ignored stale messages';
+}
+
+test_lei(sub {
+	lei_ok qw(convert -f mboxrd), $normal;
+	my @msgs = grep /\S/s, split /^From .[^\n]+\n/sm, $lei_out;
+	my @eml = map { PublicInbox::Eml->new($_) } @msgs;
+	my $h = 'Subject';
+	@eml = sort { $a->header_raw($h) cmp $b->header_raw($h) } @eml;
+	my @has = map { scalar $_->header_raw($h) } @eml;
+	is_xdeeply \@has,
+		[ 'replied a', 'replied b', 'unseen', 'unseen flagged' ],
+		'subjects sorted';
+	$h = 'X-Status';
+	@has = map { scalar $_->header_raw($h) } @eml;
+	is_xdeeply \@has, [ 'A', 'A', undef, 'F' ], 'answered and flagged kw';
+	$h = 'Status';
+	@has = map { scalar $_->header_raw($h) } @eml;
+	is_xdeeply \@has, ['RO', 'RO', 'O', 'O'], 'read and old';
+	lei_ok qw(import +L:normal), $normal;
+	lei_ok qw(q L:normal -f mboxrd);
+	@msgs = grep /\S/s, split /^From .[^\n]+\n/sm, $lei_out;
+	my @eml2 = map { PublicInbox::Eml->new($_) } @msgs;
+	$h = 'Subject';
+	@eml2 = sort { $a->header_raw($h) cmp $b->header_raw($h) } @eml2;
+	is_xdeeply \@eml2, \@eml, 'import preserved kw';
+
+	lei_ok 'ls-mail-sync';
+	is $lei_out, 'mh:'.File::Spec->rel2abs($normal)."\n",
+		'mail sync stored';
+
+	lei_ok qw(convert -s size -f mboxrd), "mh:$for_sort";
+	chomp(my @s = grep /^Subject:/, split(/^/sm, $lei_out));
+	s/^Subject: // for @s;
+	is_xdeeply \@s, [ 1, 22, 333 ], 'sorted by size';
+
+	for my $s ([], [ 'name' ], [ 'sequence' ]) {
+		lei_ok qw(convert -f mboxrd), "mh:$for_sort", '-s', @$s;
+		chomp(@s = grep /^Subject:/, split(/^/sm, $lei_out));
+		s/^Subject: // for @s;
+		my $desc = "@$s" || '(default)';
+		is_xdeeply \@s, [ 333, 22, 1 ], "sorted by: $desc";
+	}
+
+	lei_ok qw(import +L:sorttest), "MH:$for_sort";
+	lei_ok 'ls-mail-sync', $for_sort;
+	is $lei_out, 'mh:'.File::Spec->rel2abs($for_sort)."\n",
+		"mail sync stored with `MH' normalized to `mh'";
+	lei_ok qw(index), 'mh:'.$stale;
+	lei qw(q -f mboxrd), 's:msg 4';
+	like $lei_out, qr/^Subject: msg 4\nStatus: RO\n\n\n/ms,
+		"message retrieved after `lei index'"
+});
+
+done_testing;

^ permalink raw reply related	[relevance 19%]

* Re: [PATCH] lei: support reading MH for convert+import+index
  2023-12-16 18:17 71%   ` Eric Wong
@ 2023-12-17  7:59 69%     ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-12-17  7:59 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Eric Wong <e@80x24.org> wrote:
> Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> > Nice, so eventually we should be able to specify the following instead of
> > faking out a maildir?
> > 
> > watch=mh:/var/spool/mlmmj/list.name/archive
> 
> Yes, that's the plan.

Well, reading /usr/lib/python*/mailbox.py on my system makes me cry:

    def pack(self):
        """Re-name messages to eliminate numbering gaps. Invalidates keys."""

That's for the Python stdlib MH class where I was looking for a
non-racy write implementation.

And checking the nmh source[1] reveals packing happens there, too...

Packing makes sense for a memory-efficient representation of
.mh_sequences without resorting to a tree or hash table; but it
invalidates `lei index' and forces -watch to do a full rescan if
anybody uses pack.  Ugh...

Fortunately, this doesn't seem to be the default behavior of nmh
(`nopack' appears to be the default).

[1] https://git.savannah.gnu.org/git/nmh.git sbr/folder_pack.c
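
To illustrate why packing invalidates anything keyed by message
number, here is a minimal sketch.  The folder is modeled as an
in-memory hash and `pack_folder' is a hypothetical helper; it is
not code from nmh, mailbox.py, or public-inbox:

```perl
use v5.12;
use warnings;

# Hypothetical model of an MH folder: message number => content.
# Packing renumbers messages to 1..N in ascending order, closing gaps.
sub pack_folder {
	my ($msgs) = @_;
	my (%packed, %renamed);
	my $next = 1;
	for my $old (sort { $a <=> $b } keys %$msgs) {
		$packed{$next} = $msgs->{$old};
		$renamed{$old} = $next if $old != $next;
		$next++;
	}
	(\%packed, \%renamed);
}

my %folder = (1 => 'msg A', 3 => 'msg B', 7 => 'msg C');
my ($packed, $renamed) = pack_folder(\%folder);

# A number recorded before packing may now be missing,
# or may name a different message entirely.
for my $n (sort { $a <=> $b } keys %folder) {
	my $now = $packed->{$n} // '(missing)';
	say "$n: was `$folder{$n}', now `$now'";
}
```

Anything that stored (folder, number) pairs before the pack can no
longer trust them afterwards, hence the full rescan.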

> > > inotify|EVFILT_VNODE watches aren't supported, yet, either.

At least lei should have a reasonably fast way to handle this
using mail_sync.sqlite3 to compare SHA-(1|256) without having
to decode MIME/QP/Base-64 to get comparisons... I suppose
-watch needs to start using that, too...
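
A rough sketch of that kind of comparison, assuming the stored OID
is a plain git blob hash of the raw file contents.  `blob_oid_hex'
is a hypothetical helper built on Digest::SHA, not the git_sha
function used in the patch:

```perl
use v5.12;
use warnings;
use Digest::SHA ();

# Hypothetical helper: hex git blob OID of raw bytes.
# $alg is 1 (SHA-1) or 256 (SHA-256); no MIME decoding is needed
# since git hashes a "blob <len>\0" header plus the raw content.
sub blob_oid_hex {
	my ($alg, $rawref) = @_;
	my $dig = Digest::SHA->new($alg);
	$dig->add('blob '.length($$rawref)."\0");
	$dig->add($$rawref);
	$dig->hexdigest;
}

my $raw = "hello\n";
# should agree with `git hash-object' run on the same bytes
say blob_oid_hex(1, \$raw);
```

If the hex digest matches what was recorded for a file, the message
is unchanged and parsing can be skipped.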

> > In the case of mlmmj it's sufficient to watch the
> > /var/spool/mlmmj/list.name/index file for updates, but I don't know how well
> > this lends itself to other implementations (I am not at all familiar with MH).
> 
> Just watching the directory itself is sufficient (like Maildir)
> and will report new files.  We just have to check /\A[0-9]+\z/

At least mlmmj won't pack because it's an archive (or at
least it shouldn't....)

^ permalink raw reply	[relevance 69%]

* Re: [PATCH] lei: support reading MH for convert+import+index
  2023-12-16 16:15 71% ` Konstantin Ryabitsev
@ 2023-12-16 18:17 71%   ` Eric Wong
  2023-12-17  7:59 69%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-12-16 18:17 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Nice, so eventually we should be able to specify the following instead of
> faking out a maildir?
> 
> watch=mh:/var/spool/mlmmj/list.name/archive

Yes, that's the plan.

> > inotify|EVFILT_VNODE watches aren't supported, yet, either.
> 
> In the case of mlmmj it's sufficient to watch the
> /var/spool/mlmmj/list.name/index file for updates, but I don't know how well
> this lends itself to other implementations (I am not at all familiar with MH).

Just watching the directory itself is sufficient (like Maildir)
and will report new files.  We just have to check /\A[0-9]+\z/
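
A minimal sketch of that name check, independent of any inotify or
kevent plumbing.  `mh_msg_numbers' is a hypothetical helper for
illustration, not part of the patch:

```perl
use v5.12;
use warnings;
use autodie qw(opendir closedir);

# Hypothetical helper: list message numbers in an MH-style directory.
# Only all-digit names count; .mh_sequences, mlmmj's index file and
# anything else non-numeric are skipped.
sub mh_msg_numbers {
	my ($dir) = @_;
	opendir(my $dh, $dir);
	my @n = sort { $a <=> $b } grep /\A[0-9]+\z/, readdir $dh;
	closedir $dh;
	@n;
}

say for mh_msg_numbers($ARGV[0] // '.');
```

A watcher would apply the same regexp to each name reported by a
directory event instead of to readdir results.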

^ permalink raw reply	[relevance 71%]

* Re: [PATCH] lei: support reading MH for convert+import+index
  2023-12-16 13:09 20% [PATCH] lei: support reading MH for convert+import+index Eric Wong
@ 2023-12-16 16:15 71% ` Konstantin Ryabitsev
  2023-12-16 18:17 71%   ` Eric Wong
  2023-12-29 18:05 19% ` [PATCH v2] " Eric Wong
  1 sibling, 1 reply; 200+ results
From: Konstantin Ryabitsev @ 2023-12-16 16:15 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sat, Dec 16, 2023 at 01:09:32PM +0000, Eric Wong wrote:
> The MH format is widely supported and used by various MUAs such
> as mutt and sylpheed, and an MH-like format is used by mlmmj for
> archives, as well.  Locking implementations for writes are
> inconsistent, so this commit doesn't support writes, yet.

Nice, so eventually we should be able to specify the following instead of
faking out a maildir?

watch=mh:/var/spool/mlmmj/list.name/archive

> inotify|EVFILT_VNODE watches aren't supported, yet, either.

In the case of mlmmj it's sufficient to watch the
/var/spool/mlmmj/list.name/index file for updates, but I don't know how well
this lends itself to other implementations (I am not at all familiar with MH).

-K

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: support reading MH for convert+import+index
@ 2023-12-16 13:09 20% Eric Wong
  2023-12-16 16:15 71% ` Konstantin Ryabitsev
  2023-12-29 18:05 19% ` [PATCH v2] " Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2023-12-16 13:09 UTC (permalink / raw)
  To: meta

The MH format is widely supported and used by various MUAs such
as mutt and sylpheed, and an MH-like format is used by mlmmj for
archives, as well.  Locking implementations for writes are
inconsistent, so this commit doesn't support writes, yet.

inotify|EVFILT_VNODE watches aren't supported, yet, either.
---
 MANIFEST                       |   3 +
 lib/PublicInbox/LEI.pm         |  13 ++--
 lib/PublicInbox/LeiConvert.pm  |   5 ++
 lib/PublicInbox/LeiImport.pm   |  23 +++++++
 lib/PublicInbox/LeiImportKw.pm |   2 +-
 lib/PublicInbox/LeiIndex.pm    |   2 +-
 lib/PublicInbox/LeiInput.pm    |  52 +++++++++++++---
 lib/PublicInbox/LeiMailSync.pm |  39 ++++++++----
 lib/PublicInbox/LeiToMail.pm   |   5 ++
 lib/PublicInbox/MHreader.pm    | 103 +++++++++++++++++++++++++++++++
 lib/PublicInbox/MdirReader.pm  |   2 +-
 lib/PublicInbox/MdirSort.pm    |  46 ++++++++++++++
 lib/PublicInbox/TestCommon.pm  |  22 ++++---
 t/mh_reader.t                  | 108 +++++++++++++++++++++++++++++++++
 14 files changed, 392 insertions(+), 33 deletions(-)
 create mode 100644 lib/PublicInbox/MHreader.pm
 create mode 100644 lib/PublicInbox/MdirSort.pm
 create mode 100644 t/mh_reader.t

diff --git a/MANIFEST b/MANIFEST
index e22674b7..8bcc3179 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -293,6 +293,7 @@ lib/PublicInbox/Linkify.pm
 lib/PublicInbox/Listener.pm
 lib/PublicInbox/Lock.pm
 lib/PublicInbox/MDA.pm
+lib/PublicInbox/MHreader.pm
 lib/PublicInbox/MID.pm
 lib/PublicInbox/MIME.pm
 lib/PublicInbox/MailDiff.pm
@@ -302,6 +303,7 @@ lib/PublicInbox/MboxGz.pm
 lib/PublicInbox/MboxLock.pm
 lib/PublicInbox/MboxReader.pm
 lib/PublicInbox/MdirReader.pm
+lib/PublicInbox/MdirSort.pm
 lib/PublicInbox/MiscIdx.pm
 lib/PublicInbox/MiscSearch.pm
 lib/PublicInbox/MsgIter.pm
@@ -543,6 +545,7 @@ t/mda-mime.eml
 t/mda.t
 t/mda_filter_rubylang.t
 t/mdir_reader.t
+t/mh_reader.t
 t/mid.t
 t/mime.t
 t/miscsearch.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17431518..e0cfd55a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -267,7 +267,7 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s new-only
 	lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!
-	commit-delay=i),
+	commit-delay=i sort|s:s@),
 	@net_opt, @c_opt ],
 'forget-mail-sync' => [ 'LOCATION...',
 	'forget sync information for a mail folder', @c_opt ],
@@ -280,7 +280,7 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
 	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s lock=s@ kw!
-		rsyncable),
+		rsyncable sort|s:s@),
 	@net_opt, @c_opt ],
 'p2q' => [ 'LOCATION_OR_COMMIT...|--stdin',
 	"use a patch to generate a query for `lei q --stdin'",
@@ -321,6 +321,9 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 my $stdin_formats = [ 'MAIL_FORMAT|eml|mboxrd|mboxcl2|mboxcl|mboxo',
 			'specify message input format' ];
 my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
+my $sort_out = [ 'VAL|received|relevance|docid',
+		"order of results is `--output'-dependent"];
+my $sort_in = [ 'sequence|mtime|size', 'sort input (format-dependent)' ];
 
 # we use \x{a0} (non-breaking SP) to avoid wrapping in PublicInbox::LeiHelp
 my %OPTDESC = (
@@ -428,8 +431,10 @@ my %OPTDESC = (
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 10000)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
-'sort|s=s' => [ 'VAL|received|relevance|docid',
-		"order of results is `--output'-dependent"],
+'sort|s=s	q' => $sort_out,
+'sort|s=s	lcat' => $sort_out,
+'sort|s:s@	convert' => $sort_in,
+'sort|s:s@	import' => $sort_in,
 'reverse|r' => 'reverse search results', # like sort(1)
 
 'boost=i' => 'increase/decrease priority of results (default: 0)',
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 8f628562..17a952f2 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -28,6 +28,11 @@ sub input_maildir_cb {
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
+sub input_mh_cb {
+	my ($dn, $bn, $kw, $eml, $self) = @_;
+	$self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
 sub process_inputs { # via wq_do
 	my ($self) = @_;
 	local $PublicInbox::DS::in_loop = 0; # force synchronous awaitpid
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index c2552bf0..5521188c 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -53,6 +53,29 @@ sub pmdir_cb { # called via wq_io_do from LeiPmdir->each_mdir_fn
 	}
 }
 
+sub input_mh_cb {
+	my ($mhdir, $n, $kw, $eml, $self) = @_;
+	substr($mhdir, 0, 0) = 'mh:'; # add prefix
+	my $lse = $self->{lse} //= $self->{lei}->{sto}->search;
+	my $lms = $self->{-lms_rw} //= $self->{lei}->lms; # may be 0 or undef
+	my @oidbin = $lms ? $lms->num_oidbin($mhdir, $n) : ();
+	@oidbin > 1 and warn("W: $mhdir/$n not unique:\n",
+				map { "\t".unpack('H*', $_)."\n" } @oidbin);
+	my @docids = sort { $a <=> $b } uniqstr
+			map { $lse->over->oidbin_exists($_) } @oidbin;
+	if (scalar @docids) {
+		$lse->kw_changed(undef, $kw, \@docids) or return;
+	}
+	if (defined $eml) {
+		my $vmd = $self->{-import_kw} ? { kw => $kw } : undef;
+		$vmd->{sync_info} = [ $mhdir, $n + 0 ] if $self->{-mail_sync};
+		$self->input_eml_cb($eml, $vmd);
+	}
+	# TODO:
+	# elsif (my $ikw = $self->{lei}->{ikw}) { # old message, kw only
+	#	$ikw->wq_io_do('ck_update_kw', [], "mh:$dir", $uid, $kw);
+}
+
 sub input_net_cb { # imap_each / nntp_each
 	my ($uri, $uid, $kw, $eml, $self) = @_;
 	if (defined $eml) {
diff --git a/lib/PublicInbox/LeiImportKw.pm b/lib/PublicInbox/LeiImportKw.pm
index 4b8e69fb..765e23cd 100644
--- a/lib/PublicInbox/LeiImportKw.pm
+++ b/lib/PublicInbox/LeiImportKw.pm
@@ -36,7 +36,7 @@ sub ipc_atfork_child {
 sub ck_update_kw { # via wq_io_do
 	my ($self, $url, $uid, $kw) = @_;
 	my @oidbin = $self->{-lms_rw}->num_oidbin($url, $uid);
-	my $uid_url = "$url/;UID=$uid";
+	my $uid_url = index($url, 'mh:') == 0 ? $url.$uid : "$url/;UID=$uid";
 	@oidbin > 1 and warn("W: $uid_url not unique:\n",
 				map { "\t".unpack('H*', $_)."\n" } @oidbin);
 	my @docids = sort { $a <=> $b } uniqstr
diff --git a/lib/PublicInbox/LeiIndex.pm b/lib/PublicInbox/LeiIndex.pm
index b3f3e1a0..0e329e58 100644
--- a/lib/PublicInbox/LeiIndex.pm
+++ b/lib/PublicInbox/LeiIndex.pm
@@ -35,7 +35,7 @@ sub lei_index {
 
 no warnings 'once';
 no strict 'refs';
-for my $m (qw(pmdir_cb input_net_cb)) {
+for my $m (qw(pmdir_cb input_net_cb input_mh_cb)) {
 	*$m = PublicInbox::LeiImport->can($m);
 }
 
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index daba9a8e..947a7a79 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -69,6 +69,11 @@ sub input_maildir_cb {
 	$self->input_eml_cb($eml);
 }
 
+sub input_mh_cb {
+	my ($dn, $n, $kw, $eml, $self) = @_;
+	$self->input_eml_cb($eml);
+}
+
 sub input_net_cb { # imap_each, nntp_each cb
 	my ($url, $uid, $kw, $eml, $self) = @_;
 	$self->input_eml_cb($eml);
@@ -190,7 +195,7 @@ sub input_path_url {
 		$ifmt = lc($1);
 	} elsif ($input =~ /\.(?:patch|eml)\z/i) {
 		$ifmt = 'eml';
-	} elsif (-f $input && $input =~ m{\A(?:.+)/(?:new|cur)/([^/]+)\z}) {
+	} elsif ($input =~ m{\A(?:.+)/(?:new|cur)/([^/]+)\z} && -f $input) {
 		my $bn = $1;
 		my $fl = PublicInbox::MdirReader::maildir_basename_flags($bn);
 		return if index($fl, 'T') >= 0;
@@ -204,6 +209,10 @@ sub input_path_url {
 	my $devfd = $lei->path_to_fd($input) // return;
 	if ($devfd >= 0) {
 		$self->input_fh($ifmt, $lei->{$devfd}, $input, @args);
+	} elsif ($devfd < 0 && $input =~ m{\A(.+/)([0-9]+)\z} && -f $input) {
+		my ($dn, $n) = ($1, $2);
+		my $mhr = PublicInbox::MHreader->new($dn, $lei->{3});
+		$mhr->mh_read_one($n, $self->can('input_mh_cb'), $self);
 	} elsif (-f $input && $ifmt eq 'eml') {
 		open my $fh, '<', $input or
 					return $lei->fail("open($input): $!");
@@ -231,6 +240,10 @@ sub input_path_url {
 						$self->can('input_maildir_cb'),
 						$self, @args);
 		}
+	} elsif (-d _ && $ifmt eq 'mh') {
+		my $mhr = PublicInbox::MHreader->new($input.'/', $lei->{3});
+		$mhr->{sort} = $lei->{opt}->{sort};
+		$mhr->mh_each_eml($self->can('input_mh_cb'), $self, @args);
 	} elsif (-d _ && $ifmt =~ /\A(?:v1|v2)\z/) {
 		my $ibx = PublicInbox::Inbox->new({inboxdir => $input});
 		each_ibx_eml($self, $ibx, @args);
@@ -354,13 +367,15 @@ sub prepare_inputs { # returns undef on error
 				PublicInbox::MboxReader->reads($ifmt) or return
 					$lei->fail("$ifmt not supported");
 			} elsif (-d $input_path) { # TODO extindex
-				$ifmt =~ /\A(?:maildir|v1|v2|extindex)\z/ or
+				$ifmt =~ /\A(?:maildir|mh|v1|v2|extindex)\z/ or
 					return$lei->fail("$ifmt not supported");
 				$input = $input_path;
 				add_dir $lei, $istate, $ifmt, \$input;
-			} elsif ($self->{missing_ok} && !-e _) {
+			} elsif ($self->{missing_ok} &&
+					$ifmt =~ /\A(?:maildir|mh)\z/ &&
+					!-e $input_path) {
 				# for "lei rm-watch" on missing Maildir
-				$may_sync and $input = 'maildir:'.
+				$may_sync and $input = "$ifmt:".
 						$lei->abs_path($input_path);
 			} else {
 				my $m = "Unable to handle $input";
@@ -373,7 +388,7 @@ sub prepare_inputs { # returns undef on error
 $input is `eml', not --in-format=$in_fmt
 
 			push @{$sync->{no}}, $input if $sync;
-		} elsif (-f $input && $input =~ m{\A(.+)/(new|cur)/([^/]+)\z}) {
+		} elsif ($input =~ m{\A(.+)/(new|cur)/([^/]+)\z} && -f $input) {
 			# single file in a Maildir
 			my ($mdir, $nc, $bn) = ($1, $2, $3);
 			my $other = $mdir . ($nc eq 'new' ? '/cur' : '/new');
@@ -385,12 +400,24 @@ $input is `eml', not --in-format=$in_fmt
 
 			if ($sync) {
 				$input = $lei->abs_path($mdir) . "/$nc/$bn";
-				push @{$sync->{ok}}, $input if $sync;
+				push @{$sync->{ok}}, $input;
 			}
 			require PublicInbox::MdirReader;
 		} else {
 			my $devfd = $lei->path_to_fd($input) // return;
-			if ($devfd >= 0 || -f $input || -p _) {
+			if ($devfd < 0 && $input =~ m{\A(.+)/([0-9]+)\z} &&
+					-f $input) { # single file in MH dir
+				my ($mh, $n) = ($1, $2);
+				lc($in_fmt//'eml') eq 'eml' or
+						return $lei->fail(<<"");
+$input is `eml', not --in-format=$in_fmt
+
+				if ($sync) {
+					$input = $lei->abs_path($mh)."/$n";
+					push @{$sync->{ok}}, $input;
+				}
+				require PublicInbox::MHreader;
+			} elsif ($devfd >= 0 || -f $input || -p _) {
 				push @{$sync->{no}}, $input if $sync;
 				push @f, $input;
 			} elsif (-d "$input/new" && -d "$input/cur") {
@@ -401,10 +428,13 @@ $input is `eml', not --in-format=$in_fmt
 				add_dir $lei, $istate, 'v1', \$input;
 			} elsif (-e "$input/ei.lock") {
 				add_dir $lei, $istate, 'extindex', \$input;
+			} elsif (-f "$input/.mh_sequences") {
+				add_dir $lei, $istate, 'mh', \$input;
 			} elsif ($self->{missing_ok} && !-e $input) {
 				if ($lei->{cmd} eq 'p2q') {
 					# will run "git format-patch"
 				} elsif ($may_sync) { # for lei rm-watch
+					# FIXME: support MH, here
 					$input = 'maildir:'.
 						$lei->abs_path($input);
 				}
@@ -446,6 +476,14 @@ $input is `eml', not --in-format=$in_fmt
 			$lei->refresh_watches;
 		}
 	}
+	if (my $mh = $istate->{mh}) {
+		require PublicInbox::MHreader;
+		grep(!m!\Amh:!i, @$mh) and die "BUG: @$mh (no pfx)";
+		if ($may_sync && $lei->{sto}) {
+			$lei->lms(1)->lms_write_prepare->add_folders(@$mh);
+			# $lei->refresh_watches; TODO
+		}
+	}
 	require PublicInbox::ExtSearch if $istate->{extindex};
 	$self->{inputs} = $inputs;
 }
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 17254a82..8d00d1fa 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -435,15 +435,24 @@ sub folders {
 	map { $_->[0] } @{$sth->fetchall_arrayref};
 }
 
+sub blob_mismatch ($$$) {
+	my ($f, $oidhex, $rawref) = @_;
+	my $sha = $HEXLEN2SHA{length($oidhex)};
+	my $got = git_sha($sha, $rawref)->hexdigest;
+	$got eq $oidhex ? undef : warn("$f changed $oidhex => $got\n");
+}
+
 sub local_blob {
 	my ($self, $oidhex, $vrfy) = @_;
 	my $dbh = $self->{dbh} //= dbh_new($self);
+	my $oidbin = pack('H*', $oidhex);
+
 	my $b2n = $dbh->prepare(<<'');
 SELECT f.loc,b.name FROM blob2name b
 LEFT JOIN folders f ON b.fid = f.fid
 WHERE b.oidbin = ?
 
-	$b2n->bind_param(1, pack('H*', $oidhex), SQL_BLOB);
+	$b2n->bind_param(1, $oidbin, SQL_BLOB);
 	$b2n->execute;
 	while (my ($d, $n) = $b2n->fetchrow_array) {
 		substr($d, 0, length('maildir:')) = '';
@@ -456,19 +465,27 @@ WHERE b.oidbin = ?
 			my $f = "$d/$x/$n";
 			open my $fh, '<', $f or next;
 			# some (buggy) Maildir writers are non-atomic:
-			next unless -s $fh;
-			my $raw = read_all($fh, -s _);
-			if ($vrfy) {
-				my $sha = $HEXLEN2SHA{length($oidhex)};
-				my $got = git_sha($sha, \$raw)->hexdigest;
-				if ($got ne $oidhex) {
-					warn "$f changed $oidhex => $got\n";
-					next;
-				}
-			}
+			my $raw = read_all($fh, -s $fh // next);
+			next if $vrfy && blob_mismatch $f, $oidhex, \$raw;
 			return \$raw;
 		}
 	}
+
+	$b2n = $dbh->prepare(<<'');
+SELECT f.loc,b.uid FROM blob2num b
+LEFT JOIN folders f ON b.fid = f.fid
+WHERE b.oidbin = ? /* AND f.loc LIKE 'mh:/%' */
+
+	$b2n->bind_param(1, $oidbin, SQL_BLOB);
+	$b2n->execute;
+	while (my ($d, $n) = $b2n->fetchrow_array) {
+		substr($d, 0, length('mh:')) = '';
+		my $f = "$d/$n";
+		open my $fh, '<', $f or next;
+		my $raw = read_all($fh, -s $fh // next);
+		next if $vrfy && blob_mismatch $f, $oidhex, \$raw;
+		return \$raw;
+	}
 	undef;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 071ba113..de75e99e 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -400,6 +400,11 @@ sub new {
 				"$dst exists and is not a directory\n";
 		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
 		$lei->{opt}->{save} //= \1 if $lei->{cmd} eq 'q';
+	} elsif ($fmt eq 'mh') {
+		-e $dst && !-d _ and die
+				"$dst exists and is not a directory\n";
+		$lei->{ovv}->{dst} = $dst .= '/' if substr($dst, -1) ne '/';
+		$lei->{opt}->{save} //= \1 if $lei->{cmd} eq 'q';
 	} elsif (substr($fmt, 0, 4) eq 'mbox') {
 		require PublicInbox::MboxReader;
 		$self->can("eml2$fmt") or die "bad mbox format: $fmt\n";
diff --git a/lib/PublicInbox/MHreader.pm b/lib/PublicInbox/MHreader.pm
new file mode 100644
index 00000000..673e3e06
--- /dev/null
+++ b/lib/PublicInbox/MHreader.pm
@@ -0,0 +1,103 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# MH reader, based on Lib/mailbox.py in cpython source
+package PublicInbox::MHreader;
+use v5.12;
+use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::OnDestroy;
+use PublicInbox::IO qw(try_cat);
+use PublicInbox::MdirSort;
+use Carp qw(carp);
+use autodie qw(chdir closedir opendir);
+
+my %FL2OFF = ( # mh_sequences key => our keyword
+	replied => 0,
+	flagged => 1,
+	unseen => 2, # negate
+);
+my @OFF2KW = qw(answered flagged); # [2] => unseen (negated)
+
+sub new {
+	my ($cls, $dir, $cwdfh) = @_;
+	if (substr($dir, -1) ne '/') { # TODO: do this earlier
+		carp "W: appending `/' to `$dir' (fix caller)\n";
+		$dir .= '/';
+	}
+	bless { dir => $dir, cwdfh => $cwdfh }, $cls;
+}
+
+sub read_mh_sequences ($) { # caller must chdir($self->{dir})
+	my ($self) = @_;
+	my ($fl, $off, @n);
+	my @seq = ('', '', '');
+	for (split /\n+/s, try_cat('.mh_sequences')) {
+		($fl, @n) = split /[: \t]+/;
+		$off = $FL2OFF{$fl} // do { warn <<EOM;
+W: unknown `$fl' in $self->{dir}.mh_sequences (ignoring)
+EOM
+			next;
+		};
+		@n = grep /\A[0-9]+\z/s, @n; # don't stat, yet
+		if (@n) {
+			@n = sort { $b <=> $a } @n; # to avoid resize
+			my $buf = '';
+			vec($buf, $_, 1) = 1 for @n;
+			$seq[$off] = $buf;
+		}
+	}
+	\@seq;
+}
+
+sub mh_each_file {
+	my ($self, $efcb, @arg) = @_;
+	opendir(my $dh, my $dir = $self->{dir});
+	my $restore = PublicInbox::OnDestroy->new($$, \&chdir, $self->{cwdfh});
+	chdir($dh);
+	if (defined(my $sort = $self->{sort})) {
+		my @sort = map {
+			my @tmp = $_ eq '' ? ('sequence') : split(/[, ]/);
+			# sorting by name alphabetically makes no sense for MH:
+			for my $k (@tmp) {
+				$k =~ s/\A(\-|\+|)(?:name|)\z/$1sequence/;
+			}
+			@tmp;
+		} @$sort;
+		my @n = grep /\A[0-9]+\z/s, readdir $dh;
+		mdir_sort \@n, \@sort;
+		$efcb->($dir, $_, $self, @arg) for @n;
+	} else {
+		while (readdir $dh) { # perl v5.12+ to set $_ on readdir
+			$efcb->($dir, $_, $self, @arg) if /\A[0-9]+\z/s;
+		}
+	}
+	closedir $dh; # may die
+}
+
+sub kw_for ($$) {
+	my ($self, $n) = @_;
+	my $seq = $self->{mh_seq} //= read_mh_sequences($self);
+	my @kw = map { vec($seq->[$_], $n, 1) ? $OFF2KW[$_] : () } (0, 1);
+	vec($seq->[2], $n, 1) or push @kw, 'seen';
+	\@kw;
+}
+
+sub _file2eml { # mh_each_file cb
+	my ($dir, $n, $self, $ucb, @arg) = @_;
+	my $eml = eml_from_path($n);
+	$ucb->($dir, $n, kw_for($self, $n), $eml, @arg) if $eml;
+}
+
+sub mh_each_eml {
+	my ($self, $ucb, @arg) = @_;
+	mh_each_file($self, \&_file2eml, $ucb, @arg);
+}
+
+sub mh_read_one {
+	my ($self, $n, $ucb, @arg) = @_;
+	my $restore = PublicInbox::OnDestroy->new($$, \&chdir, $self->{cwdfh});
+	chdir(my $dir = $self->{dir});
+	_file2eml($dir, $n, $self, $ucb, @arg);
+}
+
+1;
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index db5f4545..2981b058 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -1,7 +1,7 @@
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
-# Maildirs for now, MH eventually
+# Maildirs only (PublicInbox::MHreader exists, now)
 # ref: https://cr.yp.to/proto/maildir.html
 #	https://wiki2.dovecot.org/MailboxFormat/Maildir
 package PublicInbox::MdirReader;
diff --git a/lib/PublicInbox/MdirSort.pm b/lib/PublicInbox/MdirSort.pm
new file mode 100644
index 00000000..6bd9fb6c
--- /dev/null
+++ b/lib/PublicInbox/MdirSort.pm
@@ -0,0 +1,46 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+
+# used for sorting MH (and (TODO) Maildir) names
+# TODO: consider sort(1) to parallelize sorting of gigantic directories
+package PublicInbox::MdirSort;
+use v5.12;
+use Time::HiRes ();
+use parent qw(Exporter);
+use Fcntl qw(S_ISREG);
+our @EXPORT = qw(mdir_sort);
+my %ST = (sequence => 0, size => 1, atime => 2, mtime => 3, ctime => 4);
+
+sub mdir_sort ($$;$) {
+	my ($ent, $sort, $max) = @_;
+	my @st;
+	my @ent = map {
+		@st = Time::HiRes::stat $_;
+		# name, size, {a,m,c}time
+		S_ISREG($st[2]) ? [ $_, @st[7..10] ] : ();
+	} @$ent;
+	@ent = grep { $_->[1] <= $max } @ent if $max;
+	use sort 'stable';
+	for my $s (@$sort) {
+		if ($s =~ /\A(\-|\+|)name\z/) {
+			if ($1 eq '-') {
+				@ent = sort { $b->[0] cmp $a->[0] } @ent;
+			} else {
+				@ent = sort { $a->[0] cmp $b->[0] } @ent;
+			}
+		} elsif ($s =~ /\A(\-|\+|)
+				(sequence|size|ctime|mtime|atime)\z/x) {
+			my $key = $ST{$2};
+			if ($1 eq '-') {
+				@ent = sort { $b->[$key] <=> $a->[$key] } @ent;
+			} else {
+				@ent = sort { $a->[$key] <=> $b->[$key] } @ent;
+			}
+		} else {
+			die "E: unrecognized sort parameter: `$s'";
+		}
+	}
+	@$ent = map { $_->[0] } @ent;
+}
+
+1;
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 22c50675..64fe09fa 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -24,6 +24,7 @@ BEGIN {
 	@EXPORT = qw(tmpdir tcp_server tcp_connect require_git require_mods
 		run_script start_script key2sub xsys xsys_e xqx eml_load tick
 		have_xapian_compact json_utf8 setup_public_inboxes create_inbox
+		create_dir
 		create_coderepo require_bsd kernel_version check_broken_tmpfs
 		quit_waiter_pipe wait_for_eof require_git_http_backend
 		tcp_host_port test_lei lei lei_ok $lei_out $lei_err $lei_opt
@@ -843,26 +844,24 @@ sub my_sum {
 	substr PublicInbox::SHA::sha256_hex(join('', @l)), 0, 8;
 }
 
-sub create_coderepo ($$;@) {
-	my $ident = shift;
-	my $cb = pop;
+sub create_dir (@) {
+	my ($ident, $cb) = (shift, pop);
 	my %opt = @_;
 	require PublicInbox::Lock;
 	require PublicInbox::Import;
-	my ($base) = ($0 =~ m!\b([^/]+)\.[^\.]+\z!);
-	my ($db) = (PublicInbox::Import::default_branch() =~ m!([^/]+)\z!);
 	my $tmpdir = delete $opt{tmpdir};
-	my $dir = "t/data-gen/$base.$ident-".my_sum($db, $cb, \%opt);
+	my ($base) = ($0 =~ m!\b([^/]+)\.[^\.]+\z!);
+	my $dir = "t/data-gen/$base.$ident-".my_sum($cb, \%opt);
 	require File::Path;
 	my $new = File::Path::make_path($dir);
 	my $lk = PublicInbox::Lock->new("$dir/creat.lock");
 	my $scope = $lk->lock_for_scope;
 	if (!-f "$dir/creat.stamp") {
-		opendir(my $dfh, '.');
+		opendir(my $cwd, '.');
 		chdir($dir);
 		local %ENV = (%ENV, %COMMIT_ENV);
 		$cb->($dir);
-		chdir($dfh);
+		chdir($cwd); # some $cb chdir around
 		open my $s, '>', "$dir/creat.stamp";
 	}
 	return $dir if !defined($tmpdir);
@@ -870,6 +869,13 @@ sub create_coderepo ($$;@) {
 	$tmpdir;
 }
 
+sub create_coderepo (@) {
+	my $ident = shift;
+	require PublicInbox::Import;
+	my ($db) = (PublicInbox::Import::default_branch() =~ m!([^/]+)\z!);
+	create_dir "$ident-$db", @_;
+}
+
 sub create_inbox ($;@) {
 	my $ident = shift;
 	my $cb = pop;
diff --git a/t/mh_reader.t b/t/mh_reader.t
new file mode 100644
index 00000000..4bc77c1e
--- /dev/null
+++ b/t/mh_reader.t
@@ -0,0 +1,108 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use PublicInbox::TestCommon;
+require_ok 'PublicInbox::MHreader';
+use PublicInbox::IO qw(write_file);
+use PublicInbox::Lock;
+use PublicInbox::OnDestroy;
+use PublicInbox::Eml;
+use autodie;
+opendir my $cwdfh, '.';
+
+my $tmpdir = tmpdir;
+my $normal = create_dir 'normal', sub {
+	write_file '>', 3, "Subject: replied a\n\n";
+	write_file '>', 4, "Subject: replied b\n\n";
+	write_file '>', 1, "Subject: unseen\n\n";
+	write_file '>', 2, "Subject: unseen flagged\n\n";
+	write_file '>', '.mh_sequences', <<EOM;
+unseen: 1 2
+flagged: 2
+replied: 3 4
+EOM
+};
+
+my $for_sort = create_dir 'size', sub {
+	for (1..3) {
+		my $name = 10 - $_;
+		write_file '>', $name, "Subject: ".($_ x $_)."\n\n";
+	}
+};
+
+my $stale = create_dir 'stale', sub {
+	write_file '>', 4, "Subject: msg 4\n\n";
+	write_file '>', '.mh_sequences', <<EOM;
+unseen: 1 2
+EOM
+};
+
+{
+	my $mhr = PublicInbox::MHreader->new("$normal/", $cwdfh);
+	$mhr->{sort} = [ '' ];
+	my @res;
+	$mhr->mh_each_eml(sub { push @res, \@_; }, [ 'bogus' ]);
+	is scalar(@res), 4, 'got 4 messages' or diag explain(\@res);
+	is_deeply [map { $_->[1] } @res], [1, 2, 3, 4],
+		'got messages in expected order';
+	is scalar(grep { $_->[4]->[0] eq 'bogus' } @res), scalar(@res),
+		'cb arg passed to all messages' or diag explain(\@res);
+
+	$mhr = PublicInbox::MHreader->new("$stale/", $cwdfh);
+	@res = ();
+	$mhr->mh_each_eml(sub { push @res, \@_; });
+	is scalar(@res), 1, 'ignored stale messages';
+}
+
+test_lei(sub {
+	lei_ok qw(convert -f mboxrd), $normal;
+	my @msgs = grep /\S/s, split /^From .[^\n]+\n/sm, $lei_out;
+	my @eml = map { PublicInbox::Eml->new($_) } @msgs;
+	my $h = 'Subject';
+	@eml = sort { $a->header_raw($h) cmp $b->header_raw($h) } @eml;
+	my @has = map { scalar $_->header_raw($h) } @eml;
+	is_xdeeply \@has,
+		[ 'replied a', 'replied b', 'unseen', 'unseen flagged' ],
+		'subjects sorted';
+	$h = 'X-Status';
+	@has = map { scalar $_->header_raw($h) } @eml;
+	is_xdeeply \@has, [ 'A', 'A', undef, 'F' ], 'answered and flagged kw';
+	$h = 'Status';
+	@has = map { scalar $_->header_raw($h) } @eml;
+	is_xdeeply \@has, ['RO', 'RO', 'O', 'O'], 'read and old';
+	lei_ok qw(import +L:normal), $normal;
+	lei_ok qw(q L:normal -f mboxrd);
+	@msgs = grep /\S/s, split /^From .[^\n]+\n/sm, $lei_out;
+	my @eml2 = map { PublicInbox::Eml->new($_) } @msgs;
+	$h = 'Subject';
+	@eml2 = sort { $a->header_raw($h) cmp $b->header_raw($h) } @eml2;
+	is_xdeeply \@eml2, \@eml, 'import preserved kw';
+
+	lei_ok 'ls-mail-sync';
+	is $lei_out, 'mh:'.File::Spec->rel2abs($normal)."\n",
+		'mail sync stored';
+
+	lei_ok qw(convert -s size -f mboxrd), "mh:$for_sort";
+	chomp(my @s = grep /^Subject:/, split(/^/sm, $lei_out));
+	s/^Subject: // for @s;
+	is_xdeeply \@s, [ 1, 22, 333 ], 'sorted by size';
+
+	for my $s ([], [ 'name' ], [ 'sequence' ]) {
+		lei_ok qw(convert -f mboxrd), "mh:$for_sort", '-s', @$s;
+		chomp(@s = grep /^Subject:/, split(/^/sm, $lei_out));
+		s/^Subject: // for @s;
+		my $desc = "@$s" || '(default)';
+		is_xdeeply \@s, [ 333, 22, 1 ], "sorted by: $desc";
+	}
+
+	lei_ok qw(import +L:sorttest), "MH:$for_sort";
+	lei_ok 'ls-mail-sync', $for_sort;
+	is $lei_out, 'mh:'.File::Spec->rel2abs($for_sort)."\n",
+		"mail sync stored with `MH' normalized to `mh'";
+	lei_ok qw(index), 'mh:'.$stale;
+	lei qw(q -f mboxrd), 's:msg 4';
+	like $lei_out, qr/^Subject: msg 4\nStatus: RO\n\n\n/ms,
+		"message retrieved after `lei index'"
+});
+
+done_testing;

^ permalink raw reply related	[relevance 20%]
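
The bit-vector handling of `.mh_sequences' in the patch above can be
tried in isolation.  What follows is a minimal sketch, not the
PublicInbox::MHreader code itself: `parse_sequences' and `keywords_for'
are hypothetical names, the input is a string rather than a file, and
(like the patch) only plain message numbers are recognized, so ranges
such as `1-3' are ignored.

```perl
use v5.12;
use warnings;

# mh_sequences key => slot in the result array (mirrors %FL2OFF above)
my %FL2OFF = (replied => 0, flagged => 1, unseen => 2);

# hypothetical helper: turn .mh_sequences text into three bit vectors
sub parse_sequences {
	my ($text) = @_;
	my @seq = ('', '', '');
	for my $line (split /\n+/, $text) {
		my ($fl, @n) = split /[: \t]+/, $line;
		next unless defined $fl;
		my $off = $FL2OFF{$fl} // next; # skip unknown sequences
		@n = grep /\A[0-9]+\z/, @n; # plain numbers only
		next unless @n;
		my $buf = '';
		# set the highest bit first so the string is sized once
		vec($buf, $_, 1) = 1 for sort { $b <=> $a } @n;
		$seq[$off] = $buf;
	}
	\@seq;
}

# hypothetical helper: keywords for message number $n
sub keywords_for {
	my ($seq, $n) = @_;
	my @kw;
	push @kw, 'answered' if vec($seq->[0], $n, 1);
	push @kw, 'flagged' if vec($seq->[1], $n, 1);
	# `unseen' is negated: absence from the sequence means `seen'
	push @kw, 'seen' unless vec($seq->[2], $n, 1);
	\@kw;
}

my $seq = parse_sequences("unseen: 1 2\nflagged: 2\nreplied: 3 4\n");
say "$_: @{keywords_for($seq, $_)}" for 1..4;
```

Reading a bit past the end of a vec() string yields 0, so message
numbers missing from a sequence need no special casing.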

* [PATCH 0/2] lei bugfixes
@ 2023-12-16 11:13 71% Eric Wong
  2023-12-16 11:13 68% ` [PATCH 1/2] lei index: support +L: labels Eric Wong
  2023-12-16 11:13 64% ` [PATCH 2/2] lei: use ->child_error API properly Eric Wong
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2023-12-16 11:13 UTC (permalink / raw)
  To: meta

Eric Wong (2):
  lei index: support +L: labels
  lei: use ->child_error API properly

 lib/PublicInbox/LEI.pm         | 2 +-
 lib/PublicInbox/LeiExportKw.pm | 4 ++--
 lib/PublicInbox/LeiMirror.pm   | 2 +-
 lib/PublicInbox/LeiToMail.pm   | 4 ++--
 t/lei-index.t                  | 3 ++-
 5 files changed, 8 insertions(+), 7 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 1/2] lei index: support +L: labels
  2023-12-16 11:13 71% [PATCH 0/2] lei bugfixes Eric Wong
@ 2023-12-16 11:13 68% ` Eric Wong
  2023-12-16 11:13 64% ` [PATCH 2/2] lei: use ->child_error API properly Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-12-16 11:13 UTC (permalink / raw)
  To: meta

`lei index' should be capable of labeling the same way
`lei import' does, but without the importing.  I only noticed
this omission while developing a new feature.
---
 lib/PublicInbox/LEI.pm | 2 +-
 t/lei-index.t          | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index a89bdc51..17431518 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -259,7 +259,7 @@ tag => [ 'KEYWORDS... LOCATION...|--stdin',
 
 'reindex' => [ '', 'reindex all locally-indexed messages', @c_opt ],
 
-'index' => [ 'LOCATION...', 'one-time index from URL or filesystem',
+'index' => [ 'LOCATION... [LABELS...]', 'one-time index from URL or filesystem',
 	qw(in-format|F=s kw! offset=i recursive|r exclude=s include|I=s
 	verbose|v+ incremental!), @net_opt, # mainly for --proxy=
 	 @c_opt ],
diff --git a/t/lei-index.t b/t/lei-index.t
index c31b1c3c..2b28f1be 100644
--- a/t/lei-index.t
+++ b/t/lei-index.t
@@ -48,9 +48,10 @@ symlink(File::Spec->rel2abs('t/mda-mime.eml'), "$tmpdir/md1/cur/x:2,S") or
 test_lei({ tmpdir => $tmpdir }, sub {
 	my $store_path = "$ENV{HOME}/.local/share/lei/store/";
 
-	lei_ok('index', "$tmpdir/md");
+	lei_ok qw(index +L:md), "$tmpdir/md";
 	lei_ok(qw(q mid:qp@example.com));
 	my $res_a = json_utf8->decode($lei_out);
+	is_deeply $res_a->[0]->{L}, [ 'md' ], 'label set on index';
 	my $blob = $res_a->[0]->{'blob'};
 	like($blob, qr/\A[0-9a-f]{40,}\z/, 'got blob from qp@example');
 	lei_ok(qw(-C / blob), $blob);

^ permalink raw reply related	[relevance 68%]

* [PATCH 2/2] lei: use ->child_error API properly
  2023-12-16 11:13 71% [PATCH 0/2] lei bugfixes Eric Wong
  2023-12-16 11:13 68% ` [PATCH 1/2] lei index: support +L: labels Eric Wong
@ 2023-12-16 11:13 64% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-12-16 11:13 UTC (permalink / raw)
  To: meta

I noticed this bug while developing another feature and tests
were getting SIGHUP (since SIGHUP == 1 on most systems).
---
 lib/PublicInbox/LeiExportKw.pm | 4 ++--
 lib/PublicInbox/LeiMirror.pm   | 2 +-
 lib/PublicInbox/LeiToMail.pm   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiExportKw.pm b/lib/PublicInbox/LeiExportKw.pm
index d2396fa7..16f069da 100644
--- a/lib/PublicInbox/LeiExportKw.pm
+++ b/lib/PublicInbox/LeiExportKw.pm
@@ -38,7 +38,7 @@ sub export_kw_md { # LeiMailSync->each_src callback
 		} elsif ($! == EEXIST) { # lost race with lei/store?
 			return;
 		} elsif ($! != ENOENT) {
-			$lei->child_error(1,
+			$lei->child_error(0,
 				"E: rename_noreplace($src -> $dst): $!");
 		} # else loop @try
 	}
@@ -46,7 +46,7 @@ sub export_kw_md { # LeiMailSync->each_src callback
 	# both tries failed
 	my $oidhex = unpack('H*', $oidbin);
 	my $src = "$mdir/{".join(',', @try)."}/$$id";
-	$lei->child_error(1, "rename_noreplace($src -> $dst) ($oidhex): $e");
+	$lei->child_error(0, "rename_noreplace($src -> $dst) ($oidhex): $e");
 	for (@try) { return if -e "$mdir/$_/$$id" }
 	$self->{lms}->clear_src("maildir:$mdir", $id);
 }
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 0c77a8b5..5353ae61 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -1175,7 +1175,7 @@ sub try_manifest {
 	local $self->{-local_manifest} = load_current_manifest($self);
 	local $self->{-new_symlinks} = [];
 	my ($path_pfx, $n, $multi) = multi_inbox($self, \$path, $m);
-	return $lei->child_error(1, $multi) if !ref($multi);
+	return $lei->child_error(0, $multi) if !ref($multi);
 	my $v2 = delete $multi->{v2};
 	if ($v2) {
 		for my $name (sort keys %$v2) {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a930fc30..071ba113 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -147,9 +147,9 @@ sub git_to_mail { # git->cat_async callback
 			$type = 'blob';
 			$size = length($$bref);
 		}
-		$type eq 'blob' or return $self->{lei}->child_error(1,
+		$type eq 'blob' or return $self->{lei}->child_error(0,
 						"W: $oid is $type (!= blob)");
-		$size or return $self->{lei}->child_error(1,"E: $oid is empty");
+		$size or return $self->{lei}->child_error(0,"E: $oid is empty");
 		$smsg->{blob} eq $oid or die "BUG: expected=$smsg->{blob}";
 		$self->{wcb}->($bref, $smsg);
 	};

^ permalink raw reply related	[relevance 64%]
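
The SIGHUP remark in the commit message above follows from how a Unix
wait status ($? in Perl) is laid out: the low 7 bits hold the
terminating signal and the high byte holds the exit code.  A value of
1 passed where a wait status is expected therefore reads as "killed by
signal 1".  The sketch below only decodes such values;
`describe_wait_status' is a hypothetical name, and how
`->child_error' treats a 0 argument is not shown in this patch.

```perl
use v5.12;
use warnings;

# hypothetical helper: describe a $?-style wait status
sub describe_wait_status {
	my ($st) = @_;
	my $sig = $st & 127; # low 7 bits: terminating signal
	my $code = $st >> 8; # high byte: exit code
	return "killed by signal $sig" if $sig;
	"exited with status $code";
}

say describe_wait_status(1);      # what the old callers passed
say describe_wait_status(1 << 8); # an ordinary `exit(1)'
```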

* [PATCH 14/14] t/lei-import: relax EIO regexp
    2023-12-13  0:50 71% ` [PATCH 05/14] lei inspect: drop unneeded strftime import Eric Wong
@ 2023-12-13  0:50 70% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-12-13  0:50 UTC (permalink / raw)
  To: meta

musl uses "I/O error" while glibc uses "Input/output error".
I wish something like strerrorname_np(3) were portable
and built into Perl so we could just match on /EIO/.
---
 t/lei-import.t | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/lei-import.t b/t/lei-import.t
index b4446b56..89eb1492 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -172,12 +172,13 @@ SKIP: {
 	tick; # wait for strace to attach
 	ok(!lei(qw(import -F eml t/plack-qp.eml)),
 		'-F eml import fails on pathname error injection');
-	like($lei_err, qr!error reading t/plack-qp\.eml: .*Input/output error!,
+	my $IO = '[Ii](?:nput)?/[Oo](?:utput)?';
+	like($lei_err, qr!error reading t/plack-qp\.eml: .*?$IO error!,
 		'EIO noted in stderr');
 	open $fh, '<', 't/plack-qp.eml';
 	ok(!lei(qw(import -F eml -), undef, { %$lei_opt, 0 => $fh }),
 		'-F eml import fails on stdin error injection');
-	like($lei_err, qr!error reading .*?: .*Input/output error!,
+	like($lei_err, qr!error reading .*?: .*?$IO error!,
 		'EIO noted in stderr');
 }
 

^ permalink raw reply related	[relevance 70%]
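
The relaxed pattern from the patch above can be checked against both
libc spellings outside the test suite.  This is a small sketch; the
sample stderr lines are made up for illustration and `matches_eio' is
a hypothetical helper name.

```perl
use v5.12;
use warnings;

# same pattern fragment as in the patch
my $IO = '[Ii](?:nput)?/[Oo](?:utput)?';
my $re = qr!error reading .*?: .*?$IO error!;

# hypothetical helper: true if the line looks like an EIO report
sub matches_eio { $_[0] =~ $re ? 1 : 0 }

# illustrative (made-up) stderr lines
my @lines = (
	'error reading t/plack-qp.eml: Input/output error', # glibc
	'error reading t/plack-qp.eml: I/O error',          # musl
	'error reading t/plack-qp.eml: Permission denied',  # not EIO
);
say matches_eio($_), " $_" for @lines;
```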

* [PATCH 05/14] lei inspect: drop unneeded strftime import
  @ 2023-12-13  0:50 71% ` Eric Wong
  2023-12-13  0:50 70% ` [PATCH 14/14] t/lei-import: relax EIO regexp Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-12-13  0:50 UTC (permalink / raw)
  To: meta

`lei inspect' uses the `iso8601' sub from LeiOverview.
---
 lib/PublicInbox/LeiInspect.pm | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 88d7949c..576ab2c7 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -12,7 +12,6 @@ use parent qw(PublicInbox::IPC);
 use PublicInbox::Config;
 use PublicInbox::MID qw(mids);
 use PublicInbox::NetReader qw(imap_uri nntp_uri);
-use POSIX qw(strftime);
 use PublicInbox::LeiOverview;
 *iso8601 = \&PublicInbox::LeiOverview::iso8601;
 

^ permalink raw reply related	[relevance 71%]

* Re: lei up without creating xapian database
  2023-12-12 11:17 71% lei up without creating xapian database Aneesh Kumar K.V (IBM)
@ 2023-12-12 11:41 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-12-12 11:41 UTC (permalink / raw)
  To: Aneesh Kumar K.V (IBM); +Cc: meta

"Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> wrote:
> I currently use notmuch to manage my emails, and I already have a xapian
> index created for all the emails on my system. I want to switch to using
> lei to download emails from lore.kernel.org based on query patterns, and
> I want to index these emails with notmuch. However, I noticed that lei
> also creates an index for these emails, which is not necessary for my
> workflow. Is there a way to use lei to download emails to maildir from a
> public-inbox server without creating the xapian index on the local disk?

Not yet, unfortunately.  I have plans to make lei independent of
Xapian (SQLite will still be necessary), but it hasn't been a
priority this year, maybe next year...

The (hopefully) coming lei FUSE3 backend may be helpful for notmuch
users, too, since it'll allow storing compressed/deltafied/deduped
messages in git w/o eating millions of real inodes.

^ permalink raw reply	[relevance 71%]

* lei up without creating xapian database
@ 2023-12-12 11:17 71% Aneesh Kumar K.V (IBM)
  2023-12-12 11:41 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Aneesh Kumar K.V (IBM) @ 2023-12-12 11:17 UTC (permalink / raw)
  To: meta


Hi,

I currently use notmuch to manage my emails, and I already have a xapian
index created for all the emails on my system. I want to switch to using
lei to download emails from lore.kernel.org based on query patterns, and
I want to index these emails with notmuch. However, I noticed that lei
also creates an index for these emails, which is not necessary for my
workflow. Is there a way to use lei to download emails to maildir from a
public-inbox server without creating the xapian index on the local disk?

-aneesh

^ permalink raw reply	[relevance 71%]

* [PATCH 1/4] lei q: fix --no-import-before completion + docs
  @ 2023-11-28 17:36 59% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-28 17:36 UTC (permalink / raw)
  To: meta

--no-import-before skips importing entire messages, not just
keywords, so it can cause permanent data loss if -o is pointed
to precious data.
---
 Documentation/lei-q.pod |  5 +++--
 lib/PublicInbox/LEI.pm  |  1 +
 t/lei-q-kw.t            | 19 ++++++++++++++++---
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 4862ce78..95f3f702 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -108,8 +108,9 @@ Augment output destination instead of clobbering it.
 
 =item --no-import-before
 
-Do not import keywords before writing to an existing output
-destination.
+Do not import messages before writing to an existing output destination.
+Be certain you do not need existing data in your output before using
+this, it permanently erases data unless C<--augment> is used.
 
 =item --threads
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 86b71fcd..a89bdc51 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -353,6 +353,7 @@ my %OPTDESC = (
 'no-torsocks' => 'alias for --torsocks=no',
 'save!' =>  "do not save a search for `lei up'",
 'import-remote!' => 'do not memoize remote messages into local store',
+'import-before!' => 'do not import before writing to output (DANGEROUS)',
 
 'type=s' => [ 'any|mid|git', 'disambiguate type' ],
 
diff --git a/t/lei-q-kw.t b/t/lei-q-kw.t
index 06e1df6c..63e46037 100644
--- a/t/lei-q-kw.t
+++ b/t/lei-q-kw.t
@@ -9,6 +9,8 @@ use IO::Compress::Gzip qw(gzip);
 use PublicInbox::MboxReader;
 use PublicInbox::LeiToMail;
 use PublicInbox::Spawn qw(popen_rd);
+use File::Path qw(make_path);
+use PublicInbox::IO qw(write_file);
 my $exp = {
 	'<qp@example.com>' => eml_load('t/plack-qp.eml'),
 	'<testmessage@example.com>' => eml_load('t/utf8.eml'),
@@ -42,6 +44,19 @@ lei_ok(qw(q -o), "maildir:$o", qw(m:qp@example.com));
 @fn = glob("$o/cur/*:2,S");
 is(scalar(@fn), 1, "`seen' flag (but not `replied') set on Maildir file");
 
+{
+	$o = "$ENV{HOME}/dst-existing";
+	make_path(map { "$o/$_" } qw(new cur tmp));
+	my $bp = eml_load('t/data/binary.patch');
+	write_file '>', "$o/cur/binary-patch:2,S", $bp->as_string;
+	lei_ok qw(q --no-import-before m:qp@example.com -o), $o;
+	my @g = glob("$o/*/*");
+	is scalar(@g), 1, 'only newly imported message left';
+	is eml_load($g[0])->header_raw('Message-ID'), '<qp@example.com>';
+	lei qw(q m:binary-patch-test@example);
+	is $lei_out, "[null]\n", 'old message not imported';
+}
+
 SKIP: {
 	$o = "$ENV{HOME}/fifo";
 	mkfifo($o, 0600) or skip("mkfifo not supported: $!", 1);
@@ -80,9 +95,7 @@ my $write_file = sub {
 	if ($_[0] =~ /\.gz\z/) {
 		gzip(\($_[1]), $_[0]) or BAIL_OUT 'gzip';
 	} else {
-		open my $fh, '>', $_[0] or BAIL_OUT $!;
-		print $fh $_[1] or BAIL_OUT $!;
-		close $fh or BAIL_OUT;
+		write_file '>', $_[0], $_[1];
 	}
 };
 

^ permalink raw reply related	[relevance 59%]

* [PATCH 4/4] lei q|up|convert: common finish_output to detect errors
  2023-11-15  9:21 71% [PATCH 0/4] lei convert: support idempotent v2 outputs Eric Wong
                   ` (2 preceding siblings ...)
  2023-11-15  9:21 54% ` [PATCH 3/4] lei: avoid extra fork for v2 outputs Eric Wong
@ 2023-11-15  9:21 64% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-15  9:21 UTC (permalink / raw)
  To: meta

We need to consistently check the exit code of pigz|gzip|xz|bzip2
when writing to compressed mboxes (or bad storage).
---
 lib/PublicInbox/LeiConvert.pm |  4 ++--
 lib/PublicInbox/LeiToMail.pm  | 11 +++++++++++
 lib/PublicInbox/LeiXSearch.pm |  9 +--------
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 9d2479b0..8f628562 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -33,9 +33,9 @@ sub process_inputs { # via wq_do
 	local $PublicInbox::DS::in_loop = 0; # force synchronous awaitpid
 	$self->SUPER::process_inputs;
 	my $lei = $self->{lei};
-	delete $lei->{1};
 	my $l2m = delete $lei->{l2m};
-	delete $self->{wcb}; # commit
+	delete $self->{wcb}; # may close connections
+	$l2m->finish_output($lei) if $l2m;
 	if (my $v2w = delete $lei->{v2w}) { $v2w->done } # may die
 	my $nr_w = delete($l2m->{-nr_write}) // 0;
 	my $d = (delete($l2m->{-nr_seen}) // 0) - $nr_w;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 0d62888d..007191bb 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -609,6 +609,17 @@ sub _pre_augment_mbox {
 	undef;
 }
 
+sub finish_output {
+	my ($self, $lei) = @_;
+	my $out = delete $lei->{1} // die 'BUG: no lei->{1}';
+	my $old = delete $lei->{old_1};
+	$lei->{1} = $old if $old;
+	return if $out->close; # reaps gzip|pigz|xz|bzip2
+	my $msg = "E: Error closing $lei->{ovv}->{dst}";
+	$? ? $lei->child_error($?) : ($msg .= " ($!)");
+	die $msg;
+}
+
 sub _do_augment_mbox {
 	my ($self, $lei) = @_;
 	return unless $self->{seekable};
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 5e36c11a..cee3ad07 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -393,14 +393,7 @@ sub query_done { # EOF callback for main daemon
 	$lei->sto_done_request;
 	$lei->{ovv}->ovv_end($lei);
 	if ($l2m) { # close() calls LeiToMail reap_compress
-		if (my $out = delete $lei->{old_1}) {
-			if (my $mbout = $lei->{1}) { # compressor pipe process
-				$mbout->close or die <<"";
-Error closing $lei->{ovv}->{dst}: \$!=$! \$?=$?
-
-			}
-			$lei->{1} = $out;
-		}
+		$l2m->finish_output($lei);
 		if ($l2m->lock_free) {
 			$l2m->poke_dst;
 			$lei->poke_mua;

^ permalink raw reply related	[relevance 64%]
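
The `$? ? ... : ...' branch in finish_output above relies on documented
close() behavior for piped handles: close returns false if the child
exits non-zero, and when that is the only problem, $! is 0 and $?
carries the wait status.  The sketch below shows this with a stand-in
child (the running perl, which drains stdin and exits 3) instead of
gzip|pigz|xz|bzip2; `close_checked' and the destination name are
hypothetical.

```perl
use v5.12;
use warnings;

# hypothetical helper: close a handle and report why it failed
sub close_checked {
	my ($fh, $dst) = @_;
	return 'ok' if close($fh); # also reaps a piped child
	# non-zero $? means the child failed; otherwise $! has the cause
	$? ? "E: Error closing $dst (child wait status $?)"
	   : "E: Error closing $dst ($!)";
}

# stand-in for a compressor: read all of stdin, then fail
open(my $out, '|-', $^X, '-e', '1 while <STDIN>; exit 3')
	or die "open: $!";
print $out "hello\n";
my $res = close_checked($out, 'example.mbox.gz');
say $res;
```

Checking only print() would miss this failure, since buffered data is
flushed and the child is reaped at close time.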

* [PATCH 3/4] lei: avoid extra fork for v2 outputs
  2023-11-15  9:21 71% [PATCH 0/4] lei convert: support idempotent v2 outputs Eric Wong
  2023-11-15  9:21 71% ` [PATCH 1/4] lei: fix idempotent STDERR redirect in workers Eric Wong
  2023-11-15  9:21 47% ` [PATCH 2/4] lei convert: fix repeat and idempotent v2 output Eric Wong
@ 2023-11-15  9:21 54% ` Eric Wong
  2023-11-15  9:21 64% ` [PATCH 4/4] lei q|up|convert: common finish_output to detect errors Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-15  9:21 UTC (permalink / raw)
  To: meta

We've always forced LeiToMail to only have one process for v2
outputs anyways since v2 has its own sharding and IPC.  Thus we
can use the single LeiToMail process directly to avoid extra IPC
overhead.
---
 lib/PublicInbox/LeiConvert.pm |  7 ++-----
 lib/PublicInbox/LeiToMail.pm  | 19 +++++++++----------
 lib/PublicInbox/LeiXSearch.pm |  6 +-----
 lib/PublicInbox/V2Writable.pm |  2 --
 4 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 4a1f8323..9d2479b0 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -35,12 +35,9 @@ sub process_inputs { # via wq_do
 	my $lei = $self->{lei};
 	delete $lei->{1};
 	my $l2m = delete $lei->{l2m};
-	my $nr_w = delete($l2m->{-nr_write}) // 0;
 	delete $self->{wcb}; # commit
-	if (my $v2w = delete $lei->{v2w}) {
-		$nr_w = $v2w->wq_do('done'); # may die
-		$v2w->wq_close;
-	}
+	if (my $v2w = delete $lei->{v2w}) { $v2w->done } # may die
+	my $nr_w = delete($l2m->{-nr_write}) // 0;
 	my $d = (delete($l2m->{-nr_seen}) // 0) - $nr_w;
 	$d = $d ? " ($d duplicates)" : '';
 	$lei->qerr("# converted $nr_w messages$d");
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 2d9b7061..0d62888d 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -369,12 +369,14 @@ sub _v2_write_cb ($$) {
 	my ($self, $lei) = @_;
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe if $dedupe;
+	# only call in worker
+	$PublicInbox::Import::DROP_UNIQUE_UNSUB = $lei->{-drop_unique_unsub};
 	sub { # for git_to_mail
 		my ($bref, $smsg, $eml) = @_;
 		$eml //= PublicInbox::Eml->new($bref);
 		++$self->{-nr_seen};
 		return if $dedupe && $dedupe->is_dup($eml, $smsg);
-		$lei->{v2w}->wq_do('add', $eml); # V2Writable->add
+		$lei->{v2w}->add($eml) and ++$self->{-nr_write};
 	}
 }
 
@@ -647,11 +649,6 @@ sub _do_augment_mbox {
 	$dedupe->pause_dedupe if $dedupe;
 }
 
-sub v2w_done_wait { # awaitpid cb
-	my ($pid, $v2w, $lei) = @_;
-	$lei->child_error($?, "error for $v2w->{ibx}->{inboxdir}") if $?;
-}
-
 sub _pre_augment_v2 {
 	my ($self, $lei) = @_;
 	my $dir = $self->{dst};
@@ -677,11 +674,9 @@ sub _pre_augment_v2 {
 		$lei->x_it(shift);
 		die "E: can't write v2 inbox with broken config\n";
 	});
+	$lei->{-drop_unique_unsub} = $PublicInbox::Import::DROP_UNIQUE_UNSUB;
 	$ibx->init_inbox if @creat;
-	my $v2w = $ibx->importer;
-	$v2w->wq_workers_start("lei/v2w $dir", 1, $lei->oldset, {lei => $lei},
-				\&v2w_done_wait, $lei);
-	$lei->{v2w} = $v2w;
+	$lei->{v2w} = $ibx->importer;
 	return if !$lei->{opt}->{shared};
 	my $d = "$lei->{ale}->{git}->{git_dir}/objects";
 	open my $fh, '+>>', my $f = "$dir/git/0.git/objects/info/alternates";
@@ -806,6 +801,10 @@ sub wq_atexit_child {
 	my $lei = $self->{lei};
 	$lei->{ale}->git->async_wait_all;
 	my ($nr_w, $nr_s) = delete(@$self{qw(-nr_write -nr_seen)});
+	if (my $v2w = delete $lei->{v2w}) {
+		eval { $v2w->done };
+		$lei->child_error($?, "E: $@ ($v2w->{ibx}->{inboxdir})") if $@;
+	}
 	delete $self->{wcb};
 	(($nr_w //= 0) + ($nr_s //= 0)) or return;
 	return if $lei->{early_mua} || !$lei->{-progress} || !$lei->{pkt_op_p};
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 7eda6f9e..5e36c11a 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -391,11 +391,6 @@ sub query_done { # EOF callback for main daemon
 	($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
 		warn "BUG: {sto} missing with --mail-sync";
 	$lei->sto_done_request;
-	my $nr_w = delete($lei->{-nr_write}) // 0;
-	if (my $v2w = delete $lei->{v2w}) {
-		$nr_w = $v2w->wq_do('done'); # may die
-		$v2w->wq_close;
-	}
 	$lei->{ovv}->ovv_end($lei);
 	if ($l2m) { # close() calls LeiToMail reap_compress
 		if (my $out = delete $lei->{old_1}) {
@@ -413,6 +408,7 @@ Error closing $lei->{ovv}->{dst}: \$!=$! \$?=$?
 			delete $l2m->{mbl}; # drop dotlock
 		}
 	}
+	my $nr_w = delete($lei->{-nr_write}) // 0;
 	my $nr_dup = (delete($lei->{-nr_seen}) // 0) - $nr_w;
 	if ($lei->{-progress}) {
 		my $tot = $lei->{-mset_total} // 0;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 231ed516..fb259396 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -135,7 +135,6 @@ sub add {
 	if (do_idx($self, $mime, $smsg)) {
 		$self->checkpoint;
 	}
-	++$self->{-nr_add}; # for lei convert
 	$cmt;
 }
 
@@ -611,7 +610,6 @@ sub done {
 	$self->lock_release(!!$nbytes) if $shards;
 	$self->git->cleanup;
 	die $err if $err;
-	delete $self->{-nr_add}; # for lei-convert
 }
 
 sub importer {

^ permalink raw reply related	[relevance 54%]

* [PATCH 2/4] lei convert: fix repeat and idempotent v2 output
  2023-11-15  9:21 71% [PATCH 0/4] lei convert: support idempotent v2 outputs Eric Wong
  2023-11-15  9:21 71% ` [PATCH 1/4] lei: fix idempotent STDERR redirect in workers Eric Wong
@ 2023-11-15  9:21 47% ` Eric Wong
  2023-11-15  9:21 54% ` [PATCH 3/4] lei: avoid extra fork for v2 outputs Eric Wong
  2023-11-15  9:21 64% ` [PATCH 4/4] lei q|up|convert: common finish_output to detect errors Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-15  9:21 UTC (permalink / raw)
  To: meta

We should be able to treat v2 outputs just like any other mail
format, with the exception that content dedupe is always
enforced by the v2 format.

This allows users hosting v2 public-inboxes to catch up broken
synchronization from alternate archives such as the mbox
archives hosted by https://lists.gnu.org/
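
As a rough illustration, a catch-up run could look like the sketch
below.  The paths are placeholders, and the catch_up helper and the
LEI override (useful for a dry run with LEI=echo) are hypothetical
and not part of lei.  Only the `lei convert -o v2:DIR mboxrd:FILE'
form is taken from the test added in this patch.

```shell
# usage: catch_up INBOXDIR MBOXRD_FILE...
# Feeds each mboxrd archive into an existing v2 inbox.  Since v2
# always dedupes on content, repeating a run should add nothing new.
catch_up() {
    inbox=$1
    shift
    for mbox in "$@"; do
        # LEI is a hypothetical override; it defaults to the real lei
        ${LEI:-lei} convert -o "v2:$inbox" "mboxrd:$mbox" || return
    done
}
```

Something like `catch_up /path/to/v2-inbox /path/to/archive.mboxrd'
would then run one `lei convert' per archive.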

Link: https://public-inbox.org/meta/20231114-hypersonic-papaya-starling-e1cfc8@nitro/
---
 lib/PublicInbox/LeiConvert.pm  |  8 ++++++--
 lib/PublicInbox/LeiOverview.pm |  4 ++--
 lib/PublicInbox/LeiToMail.pm   |  3 +--
 lib/PublicInbox/LeiXSearch.pm  |  4 ++--
 lib/PublicInbox/V2Writable.pm  |  3 ++-
 t/lei-convert.t                | 31 ++++++++++++++++++++++++++++++-
 6 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 22aba81a..4a1f8323 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -34,9 +34,13 @@ sub process_inputs { # via wq_do
 	$self->SUPER::process_inputs;
 	my $lei = $self->{lei};
 	delete $lei->{1};
-	my $l2m = delete $self->{l2m};
-	delete $self->{wcb}; # commit
+	my $l2m = delete $lei->{l2m};
 	my $nr_w = delete($l2m->{-nr_write}) // 0;
+	delete $self->{wcb}; # commit
+	if (my $v2w = delete $lei->{v2w}) {
+		$nr_w = $v2w->wq_do('done'); # may die
+		$v2w->wq_close;
+	}
 	my $d = (delete($l2m->{-nr_seen}) // 0) - $nr_w;
 	$d = $d ? " ($d duplicates)" : '';
 	$lei->qerr("# converted $nr_w messages$d");
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 129dabf8..0529bbe4 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -41,8 +41,8 @@ sub detect_fmt ($) {
 	my ($dst) = @_;
 	if ($dst =~ m!\A([:/]+://)!) {
 		die "$1 support not implemented, yet\n";
-	} elsif (!-e $dst || -d _) {
-		'maildir'; # the default TODO: MH?
+	} elsif (!-e $dst || -d _) { # maildir is the default TODO: MH
+		-e "$dst/inbox.lock" ? 'v2' : 'maildir';
 	} elsif (-f _ || -p _) {
 		die "unable to determine mbox family of $dst\n";
 	} else {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 2928be45..2d9b7061 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -375,7 +375,6 @@ sub _v2_write_cb ($$) {
 		++$self->{-nr_seen};
 		return if $dedupe && $dedupe->is_dup($eml, $smsg);
 		$lei->{v2w}->wq_do('add', $eml); # V2Writable->add
-		++$self->{-nr_write};
 	}
 }
 
@@ -435,7 +434,7 @@ sub new {
 			($lei->{opt}->{dedupe}//'') eq 'oid';
 		$self->{base_type} = 'v2';
 		$self->{-wq_nr_workers} = 1; # v2 has shards
-		$lei->{opt}->{save} = \1;
+		$lei->{opt}->{save} //= \1 if $lei->{cmd} eq 'q';
 		$dst = $lei->{ovv}->{dst} = $lei->abs_path($dst);
 		@conflict = qw(mua sort);
 	} else {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index e85fd3c4..7eda6f9e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -391,8 +391,9 @@ sub query_done { # EOF callback for main daemon
 	($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
 		warn "BUG: {sto} missing with --mail-sync";
 	$lei->sto_done_request;
+	my $nr_w = delete($lei->{-nr_write}) // 0;
 	if (my $v2w = delete $lei->{v2w}) {
-		my $wait = $v2w->wq_do('done'); # may die
+		$nr_w = $v2w->wq_do('done'); # may die
 		$v2w->wq_close;
 	}
 	$lei->{ovv}->ovv_end($lei);
@@ -412,7 +413,6 @@ Error closing $lei->{ovv}->{dst}: \$!=$! \$?=$?
 			delete $l2m->{mbl}; # drop dotlock
 		}
 	}
-	my $nr_w = delete($lei->{-nr_write}) // 0;
 	my $nr_dup = (delete($lei->{-nr_seen}) // 0) - $nr_w;
 	if ($lei->{-progress}) {
 		my $tot = $lei->{-mset_total} // 0;
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 4d606dfe..231ed516 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -135,7 +135,7 @@ sub add {
 	if (do_idx($self, $mime, $smsg)) {
 		$self->checkpoint;
 	}
-
+	++$self->{-nr_add}; # for lei convert
 	$cmt;
 }
 
@@ -611,6 +611,7 @@ sub done {
 	$self->lock_release(!!$nbytes) if $shards;
 	$self->git->cleanup;
 	die $err if $err;
+	delete $self->{-nr_add}; # for lei-convert
 }
 
 sub importer {
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 84b57f81..6aff80bb 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -8,7 +8,8 @@ use PublicInbox::NetReader;
 use PublicInbox::Eml;
 use IO::Uncompress::Gunzip;
 use File::Path qw(remove_tree);
-use PublicInbox::Spawn qw(which);
+use PublicInbox::Spawn qw(which run_qx);
+use File::Compare;
 use autodie qw(open);
 require_mods(qw(lei -imapd -nntpd Mail::IMAPClient Net::NNTP));
 my ($tmpdir, $for_destroy) = tmpdir;
@@ -28,8 +29,36 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	my $d = $ENV{HOME};
 	lei_ok('convert', '-o', "mboxrd:$d/foo.mboxrd",
 		"imap://$imap_host_port/t.v2.0");
+	my ($nc0) = ($lei_err =~ /converted (\d+) messages/);
 	ok(-f "$d/foo.mboxrd", 'mboxrd created from imap://');
 
+	lei_ok qw(convert -o), "v2:$d/v2-test", "mboxrd:$d/foo.mboxrd";
+	my ($nc) = ($lei_err =~ /converted (\d+) messages/);
+	is $nc, $nc0, 'converted all messages';
+	lei_ok qw(q z:0.. -f jsonl --only), "$d/v2-test";
+	is(scalar(split(/^/sm, $lei_out)), $nc, 'got all messages in v2-test');
+
+	lei_ok qw(convert -o), "mboxrd:$d/from-v2.mboxrd", "$d/v2-test";
+	like $lei_err, qr/converted $nc messages/;
+	is(compare("$d/foo.mboxrd", "$d/from-v2.mboxrd"), 0,
+		'convert mboxrd -> v2 ->mboxrd roundtrip') or
+			diag run_qx([qw(git diff --no-index),
+					"$d/foo.mboxrd", "$d/from-v2.mboxrd"]);
+
+	lei_ok [qw(convert -F eml -o), "$d/v2-test"], undef,
+		{ 0 => \<<'EOM', %$lei_opt };
+From: f@example.com
+To: t@example.com
+Subject: append-to-v2-on-convert
+Message-ID: <append-to-v2-on-convert@example>
+Date: Fri, 02 Oct 1993 00:00:00 +0000
+EOM
+	like $lei_err, qr/converted 1 messages/, 'only one message added';
+	lei_ok qw(q z:0.. -f jsonl --only), "$d/v2-test";
+	is(scalar(split(/^/sm, $lei_out)), $nc + 1,
+		'got expected number of messages after append convert');
+	like $lei_out, qr/append-to-v2-on-convert/;
+
 	lei_ok('convert', '-o', "mboxrd:$d/nntp.mboxrd",
 		"nntp://$nntp_host_port/t.v2");
 	ok(-f "$d/nntp.mboxrd", 'mboxrd created from nntp://');

^ permalink raw reply related	[relevance 47%]

* [PATCH 1/4] lei: fix idempotent STDERR redirect in workers
  2023-11-15  9:21 71% [PATCH 0/4] lei convert: support idempotent v2 outputs Eric Wong
@ 2023-11-15  9:21 71% ` Eric Wong
  2023-11-15  9:21 47% ` [PATCH 2/4] lei convert: fix repeat and idempotent v2 output Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-15  9:21 UTC (permalink / raw)
  To: meta

This is needed to support forking from already-forked lei workers
where $lei->{2} is already STDERR.

Fixes: e015c3742f91 (lei: use autodie where appropriate, 2023-10-17)
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 460aed40..8d235b37 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -581,7 +581,7 @@ sub _lei_atfork_child {
 		close($_) for (grep(defined, delete @$self{qw(0 1 2 sock)}));
 		delete $cfg->{-lei_store};
 	} else { # worker, Net::NNTP (Net::Cmd) uses STDERR directly
-		open STDERR, '+>&', $self->{2};
+		open STDERR, '+>&='.fileno($self->{2}); # idempotent w/ fileno
 		STDERR->autoflush(1);
 		$self->{2} = \*STDERR;
 		POSIX::setpgid(0, $$) // die "setpgid(0, $$): $!";

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/4] lei convert: support idempotent v2 outputs
@ 2023-11-15  9:21 71% Eric Wong
  2023-11-15  9:21 71% ` [PATCH 1/4] lei: fix idempotent STDERR redirect in workers Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2023-11-15  9:21 UTC (permalink / raw)
  To: meta

This may make it easier for public-inbox admins to forcibly
inject missing messages from existing mbox*/maildir/IMAP/NNTP
archives.

1/4 was only needed to get 2/4 working, but 3/4 makes
it unnecessary with our current codebase (though we may
still need 1/4 in the future).

4/4 was noticed while working on 3/4.

Eric Wong (4):
  lei: fix idempotent STDERR redirect in workers
  lei convert: fix repeat and idempotent v2 output
  lei: avoid extra fork for v2 outputs
  lei q|up|convert: common finish_output to detect errors

 lib/PublicInbox/LEI.pm         |  2 +-
 lib/PublicInbox/LeiConvert.pm  |  9 ++++++---
 lib/PublicInbox/LeiOverview.pm |  4 ++--
 lib/PublicInbox/LeiToMail.pm   | 33 +++++++++++++++++++++------------
 lib/PublicInbox/LeiXSearch.pm  | 13 +------------
 lib/PublicInbox/V2Writable.pm  |  1 -
 t/lei-convert.t                | 31 ++++++++++++++++++++++++++++++-
 7 files changed, 61 insertions(+), 32 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 2/2] t/lei-import: account for more verbose error
    2023-11-15  1:04 70% ` [PATCH 1/2] lei: use -signal numbers for old Perl Eric Wong
@ 2023-11-15  1:04 71% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-11-15  1:04 UTC (permalink / raw)
  To: meta

Perl 5.16.3 on CentOS seems more verbose in one of the EIO
tests.  Relax the regexp so we can account for extra errors
reported by Perl.
---
 t/lei-import.t | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/lei-import.t b/t/lei-import.t
index bd562617..b4446b56 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -172,12 +172,12 @@ SKIP: {
 	tick; # wait for strace to attach
 	ok(!lei(qw(import -F eml t/plack-qp.eml)),
 		'-F eml import fails on pathname error injection');
-	like($lei_err, qr!error reading t/plack-qp\.eml: Input/output error!,
+	like($lei_err, qr!error reading t/plack-qp\.eml: .*Input/output error!,
 		'EIO noted in stderr');
 	open $fh, '<', 't/plack-qp.eml';
 	ok(!lei(qw(import -F eml -), undef, { %$lei_opt, 0 => $fh }),
 		'-F eml import fails on stdin error injection');
-	like($lei_err, qr!error reading .*?: Input/output error!,
+	like($lei_err, qr!error reading .*?: .*Input/output error!,
 		'EIO noted in stderr');
 }
 

^ permalink raw reply related	[relevance 71%]

* [PATCH 1/2] lei: use -signal numbers for old Perl
  @ 2023-11-15  1:04 70% ` Eric Wong
  2023-11-15  1:04 71% ` [PATCH 2/2] t/lei-import: account for more verbose error Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-11-15  1:04 UTC (permalink / raw)
  To: meta

Unlike modern Perls, Perl 5.16.3 on CentOS doesn't accept
negative string signals like "-TERM".

This only became a problem since commit b231d91f42d7
(treewide: enable warnings in all exec-ed processes)
made our code stricter by enabling more warnings.
In both cases, the kill is probably unnecessary and safe
to remove since we can rely on closing sockets to drop
processes.
---
 lib/PublicInbox/LEI.pm        | 2 +-
 lib/PublicInbox/LeiXSearch.pm | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 77acb5a1..69065ce7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -474,7 +474,7 @@ my @WQ_KEYS = qw(lxs l2m ikw pmd wq1 lne v2w); # internal workers
 sub _drop_wq {
 	my ($self) = @_;
 	for my $wq (grep(defined, delete(@$self{@WQ_KEYS}))) {
-		$wq->wq_kill('-TERM');
+		$wq->wq_kill(-POSIX::SIGTERM());
 		$wq->DESTROY;
 	}
 }
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index b09c2462..e85fd3c4 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -437,7 +437,7 @@ sub do_post_augment {
 	my $err = $@;
 	if ($err) {
 		if (my $lxs = delete $lei->{lxs}) {
-			$lxs->wq_kill('-TERM');
+			$lxs->wq_kill(-POSIX::SIGTERM());
 			$lxs->wq_close;
 		}
 		$lei->fail("$err");

^ permalink raw reply related	[relevance 70%]

* Re: [Bug] lei: extra quotes inserted into query with AND/OR
  2023-11-12 11:59 71%       ` Henrik Grimler
@ 2023-11-12 13:24 71%         ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-12 13:24 UTC (permalink / raw)
  To: Henrik Grimler; +Cc: meta

Henrik Grimler <henrik@grimler.se> wrote:
> Aha, I see, thanks for the explanation! Without the single quotes, and
> after escaping parentheses, lei works as expected.

Good to know.

> For the record, I read some old posts where query was '' quoted, and
> thought it was the way to do it (for example
> https://people.kernel.org/monsieuricon/lore-lei-part-1-getting-started
> and https://josefbacik.github.io/kernel/2021/10/18/lei-and-b4.html)

I'm not sure; I'm a little surprised that they do, but it may be
that the preprocessing done by the hacky approxidate wrapper (for
rt:1.month.ago..) somehow smooths things over a bit...

I'm sleepy now and head hurts from trying to figure out some
-cindex bugs :x, so maybe I missed some things.

I do recommend --stdin for more complex queries, and I just
posted a patch which should let it work better for terminals
(not just pipes/regular files):

https://public-inbox.org/meta/20231112131233.718614-1-e@80x24.org/

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: don't read --stdin terminals from daemon
@ 2023-11-12 13:12 58% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-12 13:12 UTC (permalink / raw)
  To: meta

We must use a foreground process to read from terminals
on stdin, otherwise weird things like lost keystrokes and
EIO can happen.  So take advantage of ->send_exec_cmd to
spawn `cat' in the same way we spawn MUAs, pagers,
`git config --edit' and `git credential' from script/lei.
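
The idea can be approximated in plain shell, purely as an analogy:
the real change is in Perl and hands the terminal to a `cat' spawned
by script/lei.  The read_query name below is made up for this
sketch; only the tty check and the prompt text mirror the patch.

```shell
# Prompt on stderr only when stdin is a terminal, then let a
# foreground `cat' do the actual reading.
read_query() {
    if [ -t 0 ]; then
        echo '# enter query, Ctrl-D when done' >&2
    fi
    cat
}
```

When stdin is a pipe or regular file, no prompt is printed and the
input passes through unchanged.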
---
 lib/PublicInbox/InputPipe.pm | 33 +--------------------------------
 lib/PublicInbox/LEI.pm       | 10 +++++++++-
 2 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/lib/PublicInbox/InputPipe.pm b/lib/PublicInbox/InputPipe.pm
index 232f20e8..ee5bda59 100644
--- a/lib/PublicInbox/InputPipe.pm
+++ b/lib/PublicInbox/InputPipe.pm
@@ -6,31 +6,6 @@ package PublicInbox::InputPipe;
 use v5.12;
 use parent qw(PublicInbox::DS);
 use PublicInbox::Syscall qw(EPOLLIN);
-use POSIX ();
-use Carp qw(croak carp);
-
-# I'm not sure what I'm doing w.r.t terminals.
-# FIXME needs non-interactive tests
-sub unblock_tty ($) {
-	my ($self) = @_;
-	my $fd = fileno($self->{sock});
-	my $t = POSIX::Termios->new;
-	$t->getattr($fd) or croak("tcgetattr($fd): $!");
-	return if $t->getlflag & POSIX::ICANON; # line-oriented, good
-
-	# make noncanonical mode TTYs behave like a O_NONBLOCK pipe.
-	# O_NONBLOCK itself isn't well-defined, here, so rely on VMIN + VTIME
-	my ($vmin, $vtime) = ($t->getcc(POSIX::VMIN), $t->getcc(POSIX::VTIME));
-	return if $vmin == 1 && $vtime == 0;
-
-	$t->setcc(POSIX::VMIN, 1); # 1 byte minimum
-	$t->setcc(POSIX::VTIME, 0); # no timeout
-	$t->setattr($fd, POSIX::TCSANOW) or croak("tcsetattr($fd): $!");
-
-	$t->setcc(POSIX::VMIN, $vmin);
-	$t->setcc(POSIX::VTIME, $vtime);
-	$self->{restore_termios} = $t;
-}
 
 sub consume {
 	my ($in, $cb, @args) = @_;
@@ -41,18 +16,12 @@ sub consume {
 		$self->requeue;
 	} elsif (-p _ || -S _) { # O_NONBLOCK for sockets and pipes
 		$in->blocking(0);
-	} elsif (-t $in) { # isatty(3) can't use `_' stat cache
-		unblock_tty($self);
 	}
 	$self;
 }
 
 sub close { # idempotent
 	my ($self) = @_;
-	if (my $t = delete($self->{restore_termios})) {
-		my $fd = fileno($self->{sock} // return);
-		$t->setattr($fd, POSIX::TCSANOW) or carp("tcsetattr($fd): $!")
-	}
 	$self->{-need_rq} ? delete($self->{sock}) : $self->SUPER::close
 }
 
@@ -67,7 +36,7 @@ sub event_step {
 			$self->{cb}->($self, @{$self->{args}}, '');
 			$self->close
 		} elsif ($!{EAGAIN}) { # rely on EPOLLIN
-		} elsif ($!{EINTR}) { # rely on EPOLLIN for sockets/pipes/tty
+		} elsif ($!{EINTR}) { # rely on EPOLLIN for sockets/pipes
 			$self->requeue if $self->{-need_rq};
 		} else { # another error
 			$self->{cb}->($self, @{$self->{args}}, undef);
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 681044c8..77acb5a1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1577,7 +1577,15 @@ sub _stdin_cb { # PublicInbox::InputPipe::consume callback for --stdin
 sub slurp_stdin {
 	my ($lei, $cb) = @_;
 	require PublicInbox::InputPipe;
-	PublicInbox::InputPipe::consume($lei->{0}, \&_stdin_cb, $lei, $cb);
+	my $in = $lei->{0};
+	if (-t $in) { # run cat via script/lei and read from it
+		$in = undef;
+		use autodie qw(pipe);
+		pipe($in, my $wr);
+		say { $lei->{2} } '# enter query, Ctrl-D when done';
+		send_exec_cmd($lei, [ $lei->{0}, $wr ], ['cat'], {});
+	}
+	PublicInbox::InputPipe::consume($in, \&_stdin_cb, $lei, $cb);
 }
 
 1;

^ permalink raw reply related	[relevance 58%]

* Re: [Bug] lei: extra quotes inserted into query with AND/OR
  2023-11-12  9:02 69%     ` Eric Wong
@ 2023-11-12 11:59 71%       ` Henrik Grimler
  2023-11-12 13:24 71%         ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Henrik Grimler @ 2023-11-12 11:59 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Hi Eric,

On Sun, Nov 12, 2023 at 09:02:49AM +0000, Eric Wong wrote:
> Henrik Grimler <henrik@grimler.se> wrote:
> > Hi Eric,
> > 
> > On Sun, Nov 12, 2023 at 12:10:50AM +0000, Eric Wong wrote:
> > > Henrik Grimler <henrik@grimler.se> wrote:
> > > > Hi,
> > > > 
> > > > I recently found out about lei and installed it through archlinux's
> > > > package manager and am trying out queries. When using AND/OR extra
> > > > quotes are inserted in the curl command which messes it up, for
> > > > example:
> > > > 
> > > > $ lei q -I https://lore.kernel.org/all/ -o ~/mail/foo 'dfn:COPYING OR dfn:Makefile'
> > > > # /home/grimler/.local/share/lei/store 0/0
> > > > # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=dfn%3A%22COPYING+OR+dfn%3AMakefile%22
> > > > # 0 written to /home/grimler/mail/foo/ (0 matches)
> > > > 
> > > > where it can be seen that it tries to search for 'dfn:"COPYING OR
> > > > dfn:Makefile"', and no hits are returned since there is no file named
> > > > "COPYING OR dfn:Makefile".
> > > 
> > > Don't use quotes unless you want a phrase search.
> > 
> > The quotes are added by lei (or some dependency) when query contains
> > space. Happens also if I search for a single file:
> >   lei q -I https://lore.kernel.org/all/ -o ~/mail/foo ' dfn:COPYING'
> > which results in this curl cmd:
> >   /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=+dfn%3A%22COPYING%22
> > where %22 then is "
> 
> Right, spaces require quotes in sh and lei inserts quotes when
> it sees spaces assuming it's a phrase search.  Most queries
> involving filenames don't have spaces, and your original query
> shouldn't have spaces.  It's 3 separate args in @argv of
> `lei_q': [ "dfn:COPYING", "OR", "dfn:Makefile" ]
> 
> In other words, no quotes or spaces are needed in your case at all:
> 
> $ lei q dfn:COPYING OR dfn:Makefile
> (I've omitted the -I and -o args for brevity)
> 
> Your original query only passes 1 arg due to single or double quotes
> handled in the shell (assuming POSIX-like sh or bash):
> 
> $ lei q 'dfn:COPYING OR dfn:Makefile' # don't do this
> $ lei q "dfn:COPYING OR dfn:Makefile" # don't do this, either
> 
> In both cases the `lei_q' subroutine would only see
> [ "dfn:COPYING OR dfn:Makefile" ] in its @argv.
> 
> If you have odd cases where you really need spaces in a single
> token and maybe not phrase search, --stdin can probably get what
> you want more reliably:
> 
> $ echo 'dfn:"some filename with spaces" AND something.else' | lei q --stdin
> 
> Hope that helps.

Aha, I see, thanks for the explanation! Without the single quotes, and
after escaping parentheses, lei works as expected.

For the record, I read some old posts where query was '' quoted, and
thought it was the way to do it (for example
https://people.kernel.org/monsieuricon/lore-lei-part-1-getting-started
and https://josefbacik.github.io/kernel/2021/10/18/lei-and-b4.html)

Best regards,
Henrik Grimler

^ permalink raw reply	[relevance 71%]

* Re: [Bug] lei: extra quotes inserted into query with AND/OR
  2023-11-12  8:23 71%   ` Henrik Grimler
@ 2023-11-12  9:02 69%     ` Eric Wong
  2023-11-12 11:59 71%       ` Henrik Grimler
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-11-12  9:02 UTC (permalink / raw)
  To: Henrik Grimler; +Cc: meta

Henrik Grimler <henrik@grimler.se> wrote:
> Hi Eric,
> 
> On Sun, Nov 12, 2023 at 12:10:50AM +0000, Eric Wong wrote:
> > Henrik Grimler <henrik@grimler.se> wrote:
> > > Hi,
> > > 
> > > I recently found out about lei and installed it through archlinux's
> > > package manager and am trying out queries. When using AND/OR extra
> > > quotes are inserted in the curl command which messes it up, for
> > > example:
> > > 
> > > $ lei q -I https://lore.kernel.org/all/ -o ~/mail/foo 'dfn:COPYING OR dfn:Makefile'
> > > # /home/grimler/.local/share/lei/store 0/0
> > > # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=dfn%3A%22COPYING+OR+dfn%3AMakefile%22
> > > # 0 written to /home/grimler/mail/foo/ (0 matches)
> > > 
> > > where it can be seen that it tries to search for 'dfn:"COPYING OR
> > > dfn:Makefile"', and no hits are returned since there is no file named
> > > "COPYING OR dfn:Makefile".
> > 
> > Don't use quotes unless you want a phrase search.
> 
> The quotes are added by lei (or some dependency) when query contains
> space. Happens also if I search for a single file:
>   lei q -I https://lore.kernel.org/all/ -o ~/mail/foo ' dfn:COPYING'
> which results in this curl cmd:
>   /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=+dfn%3A%22COPYING%22
> where %22 then is "

Right, spaces require quotes in sh and lei inserts quotes when
it sees spaces assuming it's a phrase search.  Most queries
involving filenames don't have spaces, and your original query
shouldn't have spaces.  It's 3 separate args in @argv of
`lei_q': [ "dfn:COPYING", "OR", "dfn:Makefile" ]

In other words, no quotes or spaces are needed in your case at all:

$ lei q dfn:COPYING OR dfn:Makefile
(I've omitted the -I and -o args for brevity)

Your original query only passes 1 arg due to single or double quotes
handled in the shell (assuming POSIX-like sh or bash):

$ lei q 'dfn:COPYING OR dfn:Makefile' # don't do this
$ lei q "dfn:COPYING OR dfn:Makefile" # don't do this, either

In both cases the `lei_q' subroutine would only see
[ "dfn:COPYING OR dfn:Makefile" ] in its @argv.
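
To see the splitting without involving lei at all, a tiny shell
function can stand in for `lei_q' and report how many arguments it
received.  count_args is a made-up name for this sketch:

```shell
# Prints the number of separate arguments the shell passed in,
# which is what a command would see in its argv.
count_args() {
    echo "$#"
}

count_args dfn:COPYING OR dfn:Makefile      # 3 args
count_args 'dfn:COPYING OR dfn:Makefile'    # 1 arg
count_args "dfn:COPYING OR dfn:Makefile"    # 1 arg
```

The unquoted form hands over three tokens; either kind of quoting
collapses them into a single argument containing spaces.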

If you have odd cases where you really need spaces in a single
token and maybe not phrase search, --stdin can probably get what
you want more reliably:

$ echo 'dfn:"some filename with spaces" AND something.else' | lei q --stdin

Hope that helps.

^ permalink raw reply	[relevance 69%]

* Re: [Bug] lei: extra quotes inserted into query with AND/OR
  2023-11-12  0:10 71% ` Eric Wong
@ 2023-11-12  8:23 71%   ` Henrik Grimler
  2023-11-12  9:02 69%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Henrik Grimler @ 2023-11-12  8:23 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Hi Eric,

On Sun, Nov 12, 2023 at 12:10:50AM +0000, Eric Wong wrote:
> Henrik Grimler <henrik@grimler.se> wrote:
> > Hi,
> > 
> > I recently found out about lei and installed it through archlinux's
> > package manager and am trying out queries. When using AND/OR extra
> > quotes are inserted in the curl command which messes it up, for
> > example:
> > 
> > $ lei q -I https://lore.kernel.org/all/ -o ~/mail/foo 'dfn:COPYING OR dfn:Makefile'
> > # /home/grimler/.local/share/lei/store 0/0
> > # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=dfn%3A%22COPYING+OR+dfn%3AMakefile%22
> > # 0 written to /home/grimler/mail/foo/ (0 matches)
> > 
> > where it can be seen that it tries to search for 'dfn:"COPYING OR
> > dfn:Makefile"', and no hits are returned since there is no file named
> > "COPYING OR dfn:Makefile".
> 
> Don't use quotes unless you want a phrase search.

The quotes are added by lei (or some dependency) when query contains
space. Happens also if I search for a single file:
  lei q -I https://lore.kernel.org/all/ -o ~/mail/foo ' dfn:COPYING'
which results in this curl cmd:
  /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=+dfn%3A%22COPYING%22
where %22 then is "

Without spaces in the query all is well:
  lei q -I https://lore.kernel.org/all/ -o ~/mail/foo 'dfn:COPYING'
which gives expected results
  /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=dfn%3ACOPYING

So, maybe there is an issue in some perl dependency on archlinux, any
suggestion where I should start digging?

Best regards,
Henrik Grimler

> Basically, I wanted the CLI and WWW search to feel the same.
> 
> > Is this a known issue, or am I doing something wrong?
> 
> I think you're the second user to add unnecessary quotes;
> is it learned behavior from another search tool?
> 
> In my experience, generic web search engines don't use quotes
> outside of phrase search, either...
> 
> My primary mail experience is from using mairix, so lei borrows
> heavily from it.  But IIRC, mairix doesn't support phrase search.
> 
> Anyways, thanks for the note and any future comments you provide :>

^ permalink raw reply	[relevance 71%]

* Re: [Bug] lei: extra quotes inserted into query with AND/OR
  2023-11-11 22:44 70% [Bug] lei: extra quotes inserted into query with AND/OR Henrik Grimler
@ 2023-11-12  0:10 71% ` Eric Wong
  2023-11-12  8:23 71%   ` Henrik Grimler
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-11-12  0:10 UTC (permalink / raw)
  To: Henrik Grimler; +Cc: meta

Henrik Grimler <henrik@grimler.se> wrote:
> Hi,
> 
> I recently found out about lei and installed it through archlinux's
> package manager and am trying out queries. When using AND/OR extra
> quotes are inserted in the curl command which messes it up, for
> example:
> 
> $ lei q -I https://lore.kernel.org/all/ -o ~/mail/foo 'dfn:COPYING OR dfn:Makefile'
> # /home/grimler/.local/share/lei/store 0/0
> # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=dfn%3A%22COPYING+OR+dfn%3AMakefile%22
> # 0 written to /home/grimler/mail/foo/ (0 matches)
> 
> where it can be seen that it tries to search for 'dfn:"COPYING OR
> dfn:Makefile"', and no hits are returned since there is no file named
> "COPYING OR dfn:Makefile".

Don't use quotes unless you want a phrase search.

Basically, I wanted the CLI and WWW search to feel the same.

> Is this a known issue, or am I doing something wrong?

I think you're the second user to add unnecessary quotes;
is it learned behavior from another search tool?

In my experience, generic web search engines don't use quotes
outside of phrase search, either...

My primary mail experience is from using mairix, so lei borrows
heavily from it.  But IIRC, mairix doesn't support phrase search.

Anyways, thanks for the note and any future comments you provide :>

^ permalink raw reply	[relevance 71%]

* [Bug] lei: extra quotes inserted into query with AND/OR
@ 2023-11-11 22:44 70% Henrik Grimler
  2023-11-12  0:10 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Henrik Grimler @ 2023-11-11 22:44 UTC (permalink / raw)
  To: meta

Hi,

I recently found out about lei and installed it through archlinux's
package manager and am trying out queries. When using AND/OR extra
quotes are inserted in the curl command which messes it up, for
example:

$ lei q -I https://lore.kernel.org/all/ -o ~/mail/foo 'dfn:COPYING OR dfn:Makefile'
# /home/grimler/.local/share/lei/store 0/0
# /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&q=dfn%3A%22COPYING+OR+dfn%3AMakefile%22
# 0 written to /home/grimler/mail/foo/ (0 matches)

where it can be seen that it tries to search for 'dfn:"COPYING OR
dfn:Makefile"', and no hits are returned since there is no file named
"COPYING OR dfn:Makefile".

lei and perl-publicinbox are reported to be version 1.9.0-2. I tried
out lei from git repo (commit 270715407b0e ("doc: update
README.unsubscribe")) and had the same issue with that version as
well.

Is this a known issue, or am I doing something wrong?

Best regards,
Henrik Grimler

^ permalink raw reply	[relevance 70%]

* [PATCH] t/lei-import: skip strace for restricted systems
@ 2023-11-10 22:26 62% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-10 22:26 UTC (permalink / raw)
  To: meta

Systems with Yama can restrict ptrace(2) (the underlying syscall
used by strace(1)) and make it difficult to test error handling
via error injection.  Just skip the tests on such systems since
it's probably not worth the effort to start using prctl(2) to
enable the test on such systems.
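
The same check can be tried from a shell, as a sketch rather than
what the test suite actually runs.  can_strace_attach is a made-up
name, and its optional argument (an alternate file to read) exists
only to make the sketch easy to exercise.  As in the patch, a
missing or unreadable file is treated as unrestricted.

```shell
# Succeeds when strace can likely attach to an already-running
# process, i.e. Yama is absent or ptrace_scope is 0.
can_strace_attach() {
    f=${1:-/proc/sys/kernel/yama/ptrace_scope}
    scope=0
    if [ -r "$f" ]; then
        read -r scope < "$f" || scope=0
    fi
    [ "$scope" = 0 ]
}

if can_strace_attach; then
    echo 'ptrace attach looks unrestricted'
else
    echo 'skipping: Yama restricts ptrace'
fi
```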
---
 lib/PublicInbox/TestCommon.pm | 18 +++++++++++++++---
 t/lei-import.t                |  7 +++----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 46e6a538..b84886a0 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -935,13 +935,25 @@ sub cfg_new ($;@) {
 }
 
 our $strace_cmd;
-sub strace () {
+sub strace (@) {
+	my ($for_daemon) = @_;
 	skip 'linux only test' if $^O ne 'linux';
+	if ($for_daemon) {
+		my $f = '/proc/sys/kernel/yama/ptrace_scope';
+		# TODO: we could fiddle with prctl in the daemon to make
+		# things work, but I'm not sure it's worth it...
+		state $ps = do {
+			my $fh;
+			CORE::open($fh, '<', $f) ? readline($fh) : 0;
+		};
+		chomp $ps;
+		skip "strace unusable on daemons\n$f is `$ps' (!= 0)" if $ps;
+	}
 	require_cmd('strace', 1);
 }
 
-sub strace_inject () {
-	my $cmd = strace;
+sub strace_inject (;$) {
+	my $cmd = strace(@_);
 	state $ver = do {
 		require PublicInbox::Spawn;
 		my $v = PublicInbox::Spawn::run_qx([$cmd, '--version']);
diff --git a/t/lei-import.t b/t/lei-import.t
index 6ad4c97b..1edd607d 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -155,19 +155,18 @@ do {
 like($lei_out, qr/\bbin\b/, 'commit-delay eventually commits');
 
 SKIP: {
-	my $strace = strace_inject; # skips if strace is old or non-Linux
+	my $strace = strace_inject(1); # skips if strace is old or non-Linux
 	my $tmpdir = tmpdir;
 	my $tr = "$tmpdir/tr";
-	my $cmd = [ $strace, "-o$tr", '-f',
+	my $cmd = [ $strace, '-q', "-o$tr", '-f',
 		"-P", File::Spec->rel2abs('t/plack-qp.eml'),
 		'-e', 'inject=readv,read:error=EIO'];
 	lei_ok qw(daemon-pid);
 	chomp(my $daemon_pid = $lei_out);
 	push @$cmd, '-p', $daemon_pid;
-	my $strace_opt = { 1 => \my $out, 2 => \my $err };
 	require PublicInbox::Spawn;
 	require PublicInbox::AutoReap;
-	my $pid = PublicInbox::Spawn::spawn($cmd, \%ENV, $strace_opt);
+	my $pid = PublicInbox::Spawn::spawn($cmd, \%ENV);
 	my $ar = PublicInbox::AutoReap->new($pid);
 	tick; # wait for strace to attach
 	ok(!lei(qw(import -F eml t/plack-qp.eml)),

^ permalink raw reply related	[relevance 62%]

* [PATCH 12/13] lei: get rid of autoreap usage
                     ` (2 preceding siblings ...)
  2023-11-09 10:09 70% ` [PATCH 06/13] lei ls-mail-source: gracefully handle network failures Eric Wong
@ 2023-11-09 10:09 55% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-09 10:09 UTC (permalink / raw)
  To: meta

We can rely on Process::IO->DESTROY to close and reap
in these cases.  This is the final step in eliminating
the wantarray invocations of popen_rd (and popen_wr).
---
 lib/PublicInbox/LeiInput.pm   | 13 +++++--------
 lib/PublicInbox/LeiRemote.pm  | 14 ++++++--------
 lib/PublicInbox/LeiXSearch.pm | 15 +++++++++------
 3 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index adb356c9..68c3c459 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -7,7 +7,6 @@ use v5.12;
 use PublicInbox::DS;
 use PublicInbox::Spawn qw(which popen_rd);
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::AutoReap;
 
 # JMAP RFC 8621 4.1.1
 # https://www.iana.org/assignments/imap-jmap-keywords/imap-jmap-keywords.xhtml
@@ -114,15 +113,13 @@ sub handle_http_input ($$@) {
 	push @$curl, '-s', @$curl_opt;
 	my $cmd = $curl->for_uri($lei, $uri);
 	$lei->qerr("# $cmd");
-	my ($fh, $pid) = popen_rd($cmd, undef, { 2 => $lei->{2} });
-	my $ar = PublicInbox::AutoReap->new($pid);
+	my $fh = popen_rd($cmd, undef, { 2 => $lei->{2} });
 	grep(/\A--compressed\z/, @$curl) or
-		$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
+		$fh = IO::Uncompress::Gunzip->new($fh,
+					MultiStream => 1, AutoClose => 1);
 	eval { $self->input_fh('mboxrd', $fh, $url, @args) };
-	my @err = ($@ ? $@ : ());
-	$ar->join;
-	push(@err, "\$?=$?") if $?;
-	$lei->child_error($?, "@$cmd failed: @err") if @err;
+	my $err = $@ ? ": $@" : '';
+	$lei->child_error($?, "@$cmd failed$err") if $err || $?;
 }
 
 sub oid2eml { # git->cat_async cb
diff --git a/lib/PublicInbox/LeiRemote.pm b/lib/PublicInbox/LeiRemote.pm
index 54750062..559fb8d5 100644
--- a/lib/PublicInbox/LeiRemote.pm
+++ b/lib/PublicInbox/LeiRemote.pm
@@ -12,7 +12,6 @@ use IO::Uncompress::Gunzip;
 use PublicInbox::MboxReader;
 use PublicInbox::Spawn qw(popen_rd);
 use PublicInbox::LeiCurl;
-use PublicInbox::AutoReap;
 use PublicInbox::ContentHash qw(git_sha);
 
 sub new {
@@ -22,7 +21,7 @@ sub new {
 
 sub isrch { $_[0] } # SolverGit expcets this
 
-sub _each_mboxrd_eml { # callback for MboxReader->mboxrd
+sub each_mboxrd_eml { # callback for MboxReader->mboxrd
 	my ($eml, $self) = @_;
 	my $lei = $self->{lei};
 	my $xoids = $lei->{ale}->xoids_for($eml, 1);
@@ -47,14 +46,13 @@ sub mset {
 	$uri->query_form(q => $qstr, x => 'm', r => 1); # r=1: relevance
 	my $cmd = $curl->for_uri($self->{lei}, $uri);
 	$self->{lei}->qerr("# $cmd");
-	my ($fh, $pid) = popen_rd($cmd, undef, { 2 => $lei->{2} });
-	my $ar = PublicInbox::AutoReap->new($pid);
 	$self->{smsg} = [];
-	$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
-	PublicInbox::MboxReader->mboxrd($fh, \&_each_mboxrd_eml, $self);
+	my $fh = popen_rd($cmd, undef, { 2 => $lei->{2} });
+	$fh = IO::Uncompress::Gunzip->new($fh, MultiStream=>1, AutoClose=>1);
+	eval { PublicInbox::MboxReader->mboxrd($fh, \&each_mboxrd_eml, $self) };
+	my $err = $@ ? ": $@" : '';
 	my $wait = $self->{lei}->{sto}->wq_do('done');
-	$ar->join;
-	$lei->child_error($?) if $?;
+	$lei->child_error($?, "@$cmd failed$err") if $err || $?;
 	$self; # we are the mset (and $ibx, and $self)
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index ba8ff293..b09c2462 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -346,14 +346,17 @@ print STDERR $_;
 		my $cmd = $curl->for_uri($lei, $uri);
 		$lei->qerr("# $cmd");
 		$rdr->{2} //= popen_wr(@lbf_tee) if @lbf_tee;
-		my $cfh = popen_rd($cmd, undef, $rdr);
-		my $fh = IO::Uncompress::Gunzip->new($cfh, MultiStream => 1);
-		PublicInbox::MboxReader->mboxrd($fh, \&each_remote_eml, $self,
-						$lei, $each_smsg);
+		my $fh = popen_rd($cmd, undef, $rdr);
+		$fh = IO::Uncompress::Gunzip->new($fh,
+					MultiStream => 1, AutoClose => 1);
+		eval {
+			PublicInbox::MboxReader->mboxrd($fh, \&each_remote_eml,
+						$self, $lei, $each_smsg);
+		};
+		my ($exc, $code) = ($@, $?);
 		$lei->sto_done_request if delete($self->{-sto_imported});
+		die "E: $exc" if $exc && !$code;
 		my $nr = delete $lei->{-nr_remote_eml} // 0;
-		$cfh->close;
-		my $code = $?;
 		if (!$code) { # don't update if no results, maybe MTA is down
 			$lei->{lss}->cfg_set($key, $start) if $key && $nr;
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);

^ permalink raw reply related	[relevance 55%]
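
The idea behind the patch above is that the pipe handle itself closes and
reaps the child, so `$?' is set without a separate AutoReap object.  It can
be approximated with core Perl.  This is only a sketch: it uses a plain
piped open() as a stand-in, since popen_rd and Process::IO are public-inbox
internals whose exact behavior is not reproduced here.

```perl
use strict;
use warnings;

# Stand-in for popen_rd: a core piped open, NOT the public-inbox API.
# $^X is the running perl, so the child needs no external command.
my $pid = open(my $fh, '-|', $^X, '-e', 'print "hello\n"; exit 3')
	or die "fork/exec: $!";
my $out = do { local $/; <$fh> };

# Closing a piped handle waits for the child and stores its wait status
# in $?.  close() returns false here because the child exited non-zero.
close($fh);
my $status = $?;

print "output=$out";
printf "exit code=%d\n", $status >> 8;
```

With the handle owning the child, callers only need to inspect `$?' after
the handle is closed, which is the shape the new child_error() calls take.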

* [PATCH 03/13] lei: reuse FDs atfork and close explicitly
    2023-11-09 10:09 68% ` [PATCH 02/13] lei: use cached $daemon_pid when possible Eric Wong
@ 2023-11-09 10:09 70% ` Eric Wong
  2023-11-09 10:09 70% ` [PATCH 06/13] lei ls-mail-source: gracefully handle network failures Eric Wong
  2023-11-09 10:09 55% ` [PATCH 12/13] lei: get rid of autoreap usage Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-09 10:09 UTC (permalink / raw)
  To: meta

We'll avoid having a redundant STDERR FD open in lei workers,
and some explicit close() on `lei up' sockets reduces the
likelihood of inadvertently open FDs causing processes to
linger.
---
 lib/PublicInbox/LEI.pm | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f32e5bbc..681044c8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -574,19 +574,20 @@ sub _lei_atfork_child {
 	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
 	my $cfg = $self->{cfg};
+	delete @$cfg{qw(-watches -lei_note_event)};
 	if ($persist) {
 		open $self->{3}, '<', '/';
 		fchdir($self);
 		close($_) for (grep(defined, delete @$self{qw(0 1 2 sock)}));
-		delete @$cfg{qw(-lei_store -watches -lei_note_event)};
+		delete $cfg->{-lei_store};
 	} else { # worker, Net::NNTP (Net::Cmd) uses STDERR directly
-		open STDERR, '+>&='.fileno($self->{2});
+		open STDERR, '+>&', $self->{2};
 		STDERR->autoflush(1);
+		$self->{2} = \*STDERR;
 		POSIX::setpgid(0, $$) // die "setpgid(0, $$): $!";
-		delete @$cfg{qw(-watches -lei_note_event)};
 	}
 	close($_) for (grep(defined, delete @$self{qw(old_1 au_done)}));
-	delete $self->{-socks};
+	close($_) for (@{delete($self->{-socks}) // []});
 	if (my $op_c = delete $self->{pkt_op_c}) {
 		close(delete $op_c->{sock});
 	}

^ permalink raw reply related	[relevance 70%]

* [PATCH 06/13] lei ls-mail-source: gracefully handle network failures
    2023-11-09 10:09 68% ` [PATCH 02/13] lei: use cached $daemon_pid when possible Eric Wong
  2023-11-09 10:09 70% ` [PATCH 03/13] lei: reuse FDs atfork and close explicitly Eric Wong
@ 2023-11-09 10:09 70% ` Eric Wong
  2023-11-09 10:09 55% ` [PATCH 12/13] lei: get rid of autoreap usage Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-09 10:09 UTC (permalink / raw)
  To: meta

All network connections may fail, so try to emit a helpful
error message instead of attempting to dispatch methods off
`undef'.
---
 lib/PublicInbox/LeiLsMailSource.pm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiLsMailSource.pm b/lib/PublicInbox/LeiLsMailSource.pm
index 50799270..4b427b26 100644
--- a/lib/PublicInbox/LeiLsMailSource.pm
+++ b/lib/PublicInbox/LeiLsMailSource.pm
@@ -19,7 +19,8 @@ sub input_path_url { # overrides LeiInput version
 	if ($url =~ m!\Aimaps?://!i) {
 		my $uri = PublicInbox::URIimap->new($url);
 		my $sec = $lei->{net}->can('uri_section')->($uri);
-		my $mic = $lei->{net}->mic_get($uri);
+		my $mic = $lei->{net}->mic_get($uri) or
+			return $lei->err("E: $uri");
 		my $l = $mic->folders_hash($uri->path); # server-side filter
 		@$l = map { $_->[2] } # undo Schwartzian transform below:
 			sort { $a->[0] cmp $b->[0] || $a->[1] <=> $b->[1] }
@@ -39,7 +40,8 @@ sub input_path_url { # overrides LeiInput version
 		}
 	} elsif ($url =~ m!\A(?:nntps?|s?news)://!i) {
 		my $uri = PublicInbox::URInntps->new($url);
-		my $nn = $lei->{net}->nn_get($uri);
+		my $nn = $lei->{net}->nn_get($uri) or
+			return $lei->err("E: $uri");
 		my $l = $nn->newsgroups($uri->group); # name => description
 		my $sec = $lei->{net}->can('uri_section')->($uri);
 		if ($json) {

^ permalink raw reply related	[relevance 70%]

* [PATCH 02/13] lei: use cached $daemon_pid when possible
  @ 2023-11-09 10:09 68% ` Eric Wong
  2023-11-09 10:09 70% ` [PATCH 03/13] lei: reuse FDs atfork and close explicitly Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-09 10:09 UTC (permalink / raw)
  To: meta

->lei_daemon_pid can only be called in the top-level daemon
process when $daemon_pid is valid, so avoid a getpid(2) syscall
in those cases.
---
 lib/PublicInbox/LEI.pm   | 2 +-
 lib/PublicInbox/LeiUp.pm | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2832db63..f32e5bbc 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -927,7 +927,7 @@ sub _config {
 	run_wait($cmd, \%env, \%opt) ? ($err_ok ? undef : fail($self, $?)) : 1;
 }
 
-sub lei_daemon_pid { puts shift, $$ }
+sub lei_daemon_pid { puts shift, $daemon_pid }
 
 sub lei_daemon_kill {
 	my ($self) = @_;
diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm
index cd2337b4..0faa180d 100644
--- a/lib/PublicInbox/LeiUp.pm
+++ b/lib/PublicInbox/LeiUp.pm
@@ -11,6 +11,7 @@ use PublicInbox::LeiSavedSearch; # OverIdx
 use PublicInbox::DS;
 use PublicInbox::PktOp;
 use PublicInbox::LeiFinmsg;
+use PublicInbox::LEI;
 my $REMOTE_RE = qr!\A(?:imap|http)s?://!i; # http(s) will be for JMAP
 
 sub up1 ($$) {
@@ -92,7 +93,6 @@ sub redispatch_all ($$) {
 	$op_c->{ops} = { '' => [ $lei->can('dclose'), $lei ] };
 	my @first_batch = splice(@$upq, 0, $j); # initial parallelism
 	$lei->{-upq} = $upq;
-	$lei->{daemon_pid} = $$;
 	$lei->event_step_init; # wait for client disconnects
 	for my $out (@first_batch) {
 		PublicInbox::DS::requeue(
@@ -212,8 +212,8 @@ sub event_step { # runs via PublicInbox::DS::requeue
 
 sub DESTROY {
 	my ($self) = @_;
+	return if ($PublicInbox::LEI::daemon_pid // -1) != $$;
 	my $lei = $self->{lei}; # the original, from lei_up
-	return if $lei->{daemon_pid} != $$;
 	my $sock = delete $self->{unref_on_destroy};
 	my $s = $lei->{-socks} // [];
 	@$s = grep { $_ != $sock } @$s;

^ permalink raw reply related	[relevance 68%]
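
The DESTROY guard in the LeiUp.pm hunk above compares a package-level PID
against `$$' so that copies of an object living in forked children skip the
cleanup.  Below is a self-contained sketch of that pattern.  The package
name `Guarded', the variable `$owner_pid' and the log-file cleanup are all
hypothetical stand-ins, not public-inbox code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

package Guarded {
	our $owner_pid; # stand-in for $PublicInbox::LEI::daemon_pid

	sub new { my ($class, $path) = @_; bless { path => $path }, $class }

	sub DESTROY {
		my ($self) = @_;
		# forked children inherit the object but must not clean up;
		# `// -1' also covers the case where the PID was never set
		return if ($owner_pid // -1) != $$;
		open(my $fh, '>>', $self->{path}) or die "open: $!";
		print $fh "cleanup by $$\n";
		close $fh;
	}
}

my ($tfh, $path) = tempfile();
close $tfh;
$Guarded::owner_pid = $$;
my $obj = Guarded->new($path);

my $pid = fork // die "fork: $!";
exit 0 if $pid == 0; # child: DESTROY runs at exit, the guard returns early
waitpid($pid, 0);
undef $obj; # parent: the guard passes, so the cleanup runs once

open(my $fh, '<', $path) or die "open: $!";
my @lines = <$fh>;
close $fh;
unlink $path;
print scalar(@lines), " cleanup line(s)\n";
```

Keeping the PID in one package variable, rather than in each object as
`$lei->{daemon_pid}' did before, means the guard does not depend on the
object still holding a valid reference when it is destroyed.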

* Re: lei interactive TUIs (ncurses/vim/emacs)
  2023-09-22 20:33 69% lei interactive TUIs (ncurses/vim/emacs) Eric Wong
@ 2023-11-09  4:14 71% ` Kyle Meyer
  0 siblings, 0 replies; 200+ results
From: Kyle Meyer @ 2023-11-09  4:14 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

[ I missed this (just saw it mentioned in a recent message). ]

Eric Wong writes:

> I've also noticed vim has scripting abilities (like Emacs?) and
> notmuch bundles a vim extension we can take inspiration from.
> Perhaps we could bundle vim and Emacs extensions for lei, too...
[...]
> So, any thoughts on this matter?
>
> Anybody willing to maintain an lei TUI for emacs?

As part of piem [1], I've done some work on an Emacs interface for lei,
in particular `lei q'.  It's pretty bare-bones (been meaning to get back
to it...), but it's functional for searching and basic viewing.

[1]: https://git.kyleam.com/piem/about/

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: fix SIGPIPE on large result sets to pager
@ 2023-11-07 13:01 44% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-11-07 13:01 UTC (permalink / raw)
  To: meta

When dealing with large search results, we need to deal with
EPIPE not just from the pager, but also EPIPE or ECONNRESET
between lei_xsearch and lei2mail processes.

Without this fix, lei_xsearch processes could linger and get
stuck writing to dead lei2mail processes if a user aborts the
pager early during a large result set.

To ensure lei_xsearch processes don't linger around after
lei2mail workers all die, we must close $l2m->{-wq_s2} before
spawning lei_xsearch processes, since $l2m->{-wq_s2} is only
used in lei2mail workers.

For `git cat-file' processes, we also need to trigger
PublicInbox::Git->close to handle unpredictable destructor
ordering to avoid using uninitialized IO refs.  This combines
with the `git_to_mail' change to deal with process cleanup
handling from premature shutdowns.

To test all this, a single large message is not enough: the result
set must also be large enough to saturate the lei_xsearch -> lei2mail
socket, so we rely on GIANT_INBOX_DIR once again.
---
 lib/PublicInbox/Git.pm         |  2 ++
 lib/PublicInbox/LeiOverview.pm |  3 ++-
 lib/PublicInbox/LeiToMail.pm   |  8 +++++++-
 lib/PublicInbox/LeiXSearch.pm  |  3 ++-
 t/lei-sigpipe.t                | 27 ++++++++++++++++++++-------
 5 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 11712db2..292c359a 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -276,6 +276,7 @@ sub cat_async_step ($$) {
 
 sub cat_async_wait ($) {
 	my ($self) = @_;
+	$self->close if !$self->{sock};
 	my $inflight = $self->{inflight} or return;
 	while (scalar(@$inflight)) {
 		cat_async_step($self, $inflight);
@@ -331,6 +332,7 @@ sub check_async_wait ($) {
 	my ($self) = @_;
 	return cat_async_wait($self) if $self->{-bc};
 	my $ck = $self->{ck} or return;
+	$ck->close if !$ck->{sock};
 	my $inflight = $ck->{inflight} or return;
 	check_async_step($ck, $inflight) while (scalar(@$inflight));
 }
diff --git a/lib/PublicInbox/LeiOverview.pm b/lib/PublicInbox/LeiOverview.pm
index 066c40bd..129dabf8 100644
--- a/lib/PublicInbox/LeiOverview.pm
+++ b/lib/PublicInbox/LeiOverview.pm
@@ -212,7 +212,8 @@ sub ovv_each_smsg_cb { # runs in wq worker usually
 		sub {
 			my ($smsg, $mitem, $eml) = @_;
 			$smsg->{pct} = get_pct($mitem) if $mitem;
-			$l2m->wq_io_do('write_mail', [], $smsg, $eml);
+			eval { $l2m->wq_io_do('write_mail', [], $smsg, $eml) };
+			$lei->fail($@) if $@ && !$!{ECONNRESET} && !$!{EPIPE};
 		}
 	} elsif ($self->{fmt} =~ /\A(concat)?json\z/ && $lei->{opt}->{pretty}) {
 		my $EOR = ($1//'') eq 'concat' ? "\n}" : "\n},";
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e80163e2..b73af68a 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -8,11 +8,13 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
 use PublicInbox::IO;
+use PublicInbox::Git;
 use PublicInbox::Spawn qw(spawn);
 use IO::Handle; # ->autoflush
 use Fcntl qw(SEEK_SET SEEK_END O_CREAT O_EXCL O_WRONLY);
 use PublicInbox::Syscall qw(rename_noreplace);
 use autodie qw(open seek close);
+use Carp qw(croak);
 
 my %kw2char = ( # Maildir characters
 	draft => 'D',
@@ -132,8 +134,12 @@ sub eml2mboxcl2 {
 
 sub git_to_mail { # git->cat_async callback
 	my ($bref, $oid, $type, $size, $smsg) = @_;
-	my $self = delete $smsg->{l2m} // die "BUG: no l2m";
 	$type // return; # called by PublicInbox::Git::close
+	my $self = delete $smsg->{l2m};
+	if (!defined($self)) {
+		return if $PublicInbox::Git::in_cleanup;
+		croak "BUG: no l2m (type=$type)";
+	}
 	eval {
 		if ($type eq 'missing' &&
 			  ($bref = $self->{-lms_rw}->local_blob($oid, 1))) {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 5443188d..6c8dfe10 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -20,7 +20,7 @@ use PublicInbox::LEI;
 use Fcntl qw(SEEK_SET F_SETFL O_APPEND O_RDWR);
 use PublicInbox::ContentHash qw(git_sha);
 use POSIX qw(strftime);
-use autodie qw(open read seek truncate);
+use autodie qw(close open read seek truncate);
 use PublicInbox::Syscall qw($F_SETPIPE_SZ);
 
 sub new {
@@ -543,6 +543,7 @@ sub do_query {
 		pipe($lei->{startq}, $lei->{au_done}) or die "pipe: $!";
 		fcntl($lei->{startq}, $F_SETPIPE_SZ, 4096) if $F_SETPIPE_SZ;
 		delete $l2m->{au_peers};
+		close(delete $l2m->{-wq_s2}); # share wq_s1 with lei_xsearch
 	}
 	$self->wq_workers_start('lei_xsearch', undef,
 				$lei->oldset, { lei => $lei },
diff --git a/t/lei-sigpipe.t b/t/lei-sigpipe.t
index 622598a4..1aa700e9 100644
--- a/t/lei-sigpipe.t
+++ b/t/lei-sigpipe.t
@@ -7,6 +7,19 @@ use PublicInbox::TestCommon;
 use POSIX qw(WTERMSIG WIFSIGNALED SIGPIPE);
 use PublicInbox::OnDestroy;
 use PublicInbox::Syscall qw($F_SETPIPE_SZ);
+use autodie qw(close open pipe seek sysread);
+use PublicInbox::IO qw(write_file);
+my $inboxdir = $ENV{GIANT_INBOX_DIR};
+SKIP: {
+	$inboxdir // skip 'GIANT_INBOX_DIR unset to test large results';
+	require PublicInbox::Inbox;
+	my $ibx = PublicInbox::Inbox->new({
+		name => 'unconfigured-test',
+		address => [ "test\@example.com" ],
+		inboxdir => $inboxdir,
+	});
+	$ibx->search or xbail "GIANT_INBOX_DIR=$inboxdir has no search";
+}
 
 # undo systemd (and similar) ignoring SIGPIPE, since lei expects to be run
 # from an interactive terminal:
@@ -21,30 +34,30 @@ test_lei(sub {
 	my $f = "$ENV{HOME}/big.eml";
 	my $imported;
 	for my $out ([], [qw(-f mboxcl2)], [qw(-f text)]) {
-		pipe(my ($r, $w)) or BAIL_OUT $!;
+		pipe(my $r, my $w);
 		my $size = $F_SETPIPE_SZ && fcntl($w, $F_SETPIPE_SZ, 4096) ?
 			4096 : 65536;
 		unless (-f $f) {
-			open my $fh, '>', $f or xbail "open $f: $!";
-			print $fh <<'EOM' or xbail;
+			my $fh = write_file '>', $f, <<'EOM';
 From: big@example.com
 Message-ID: <big@example.com>
 EOM
 			print $fh 'Subject:';
 			print $fh (' '.('x' x 72)."\n") x (($size / 73) + 1);
 			print $fh "\nbody\n";
-			close $fh or xbail "close: $!";
+			close $fh;
 		}
 
 		lei_ok(qw(import), $f) if $imported++ == 0;
-		open my $errfh, '+>>', "$ENV{HOME}/stderr.log" or xbail $!;
+		open my $errfh, '+>>', "$ENV{HOME}/stderr.log";
 		my $opt = { run_mode => 0, 2 => $errfh, 1 => $w };
 		my $cmd = [qw(lei q -q -t), @$out, 'z:1..'];
+		push @$cmd, '--only='.$inboxdir if defined $inboxdir;
 		my $tp = start_script($cmd, undef, $opt);
 		close $w;
 		vec(my $rvec = '', fileno($r), 1) = 1;
 		if (!select($rvec, undef, undef, 30)) {
-			seek($errfh, 0, 0) or xbail $!;
+			seek($errfh, 0, 0);
 			my $s = do { local $/; <$errfh> };
 			xbail "lei q had no output after 30s, stderr=$s";
 		}
@@ -53,7 +66,7 @@ EOM
 		$tp->join;
 		ok(WIFSIGNALED($?), "signaled @$out");
 		is(WTERMSIG($?), SIGPIPE, "got SIGPIPE @$out");
-		seek($errfh, 0, 0) or xbail $!;
+		seek($errfh, 0, 0);
 		my $s = do { local $/; <$errfh> };
 		is($s, '', "quiet after sigpipe @$out");
 	}

^ permalink raw reply related	[relevance 44%]
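
The LeiOverview.pm hunk above treats EPIPE and ECONNRESET as an expected
outcome rather than a failure.  A minimal sketch of that errno check with a
plain pipe follows.  It assumes SIGPIPE is ignored in the writing process so
that the write returns an error instead of killing it; the variable names
are illustrative only.

```perl
use strict;
use warnings;
use Errno ();

$SIG{PIPE} = 'IGNORE'; # get EPIPE from the write instead of SIGPIPE
pipe(my $rd, my $wr) or die "pipe: $!";
close $rd; # simulate the consumer (e.g. a pager) exiting early

my $result;
if (defined syswrite($wr, 'x')) {
	$result = 'written';
} elsif ($!{EPIPE} || $!{ECONNRESET}) {
	$result = 'peer gone'; # expected when the reader quits early
} else {
	die "syswrite: $!"; # anything else is a real error
}
print "$result\n";
```

Only unexpected errors are escalated, so a user quitting the pager early
does not produce noise on stderr, which is what t/lei-sigpipe.t checks with
its "quiet after sigpipe" assertion.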

* Re: lei - dfn filters for net/* catching drivers/net/*
  2023-11-02 21:27 71% ` Eric Wong
@ 2023-11-03 18:29 71%   ` David Wei
  0 siblings, 0 replies; 200+ results
From: David Wei @ 2023-11-03 18:29 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On 2023-11-02 14:27, Eric Wong wrote:
> David Wei <dw@davidwei.uk> wrote:
>> Hi,
>>
>> I have a problem with lei dfn filters. Here is my query:
>>
>> lei q -o ~/Mail/overlay -I https://lore.kernel.org/all -t '(dfn:net/* OR dfn:drivers/net/ethernet/mellanox/mlx5/* OR dfn:drivers/net/ethernet/broadcom/bnxt/*) AND tc:netdev@vger.kernel.org AND rt:2.week.ago..'
>>
>> I'm seeing patches that touch drivers/net/* whereas I only want to match
>> net/*.
>>
>> I tried changing it to dfn:^net/* and dfn:b/net/* but neither is
>> working.
> 
> Right, ^ is a regexp thing and I don't think Xapian supports anything
> like it.
> 
>> I also read the Xapian docs: https://xapian.org/docs/queryparser.html
>> but didn't see anything more than * wildcards.
>>
>> Could you please advise on how I can limit my query to only net/*?
> 
> I'm not an expert in Xapian's parser, either, but I think `AND NOT'
> is appropriate here.  So something like:
> 
> 	dfn:net/* AND NOT dfn:drivers/net/*
> 
> Would be helpful to know if it works for you.
> (having NOT only is very expensive and not allowed via the web interface,
> but combining it with a positive match should be fine)

Thank you, using AND NOT does work. However, there are many more file
paths that partially match "net/", and excluding each one by one using
AND NOT is tedious.

I found that using b:b/net/* works very well to match patch diffs in
message bodies. This achieves my intended goal of matching only ^net/*.

^ permalink raw reply	[relevance 71%]

* Re: lei - dfn filters for net/* catching drivers/net/*
  2023-11-02 21:16 71% lei - dfn filters for net/* catching drivers/net/* David Wei
@ 2023-11-02 21:27 71% ` Eric Wong
  2023-11-03 18:29 71%   ` David Wei
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-11-02 21:27 UTC (permalink / raw)
  To: David Wei; +Cc: meta

David Wei <dw@davidwei.uk> wrote:
> Hi,
> 
> I have a problem with lei dfn filters. Here is my query:
> 
> lei q -o ~/Mail/overlay -I https://lore.kernel.org/all -t '(dfn:net/* OR dfn:drivers/net/ethernet/mellanox/mlx5/* OR dfn:drivers/net/ethernet/broadcom/bnxt/*) AND tc:netdev@vger.kernel.org AND rt:2.week.ago..'
> 
> I'm seeing patches that touch drivers/net/* whereas I only want to match
> net/*.
> 
> I tried changing it to dfn:^net/* and dfn:b/net/* but neither is
> working.

Right, ^ is a regexp thing and I don't think Xapian supports anything
like it.

> I also read the Xapian docs: https://xapian.org/docs/queryparser.html
> but didn't see anything more than * wildcards.
> 
> Could you please advise on how I can limit my query to only net/*?

I'm not an expert in Xapian's parser, either, but I think `AND NOT'
is appropriate here.  So something like:

	dfn:net/* AND NOT dfn:drivers/net/*

Would be helpful to know if it works for you.
(having NOT only is very expensive and not allowed via the web interface,
but combining it with a positive match should be fine)

^ permalink raw reply	[relevance 71%]

* lei - dfn filters for net/* catching drivers/net/*
@ 2023-11-02 21:16 71% David Wei
  2023-11-02 21:27 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: David Wei @ 2023-11-02 21:16 UTC (permalink / raw)
  To: meta

Hi,

I have a problem with lei dfn filters. Here is my query:

lei q -o ~/Mail/overlay -I https://lore.kernel.org/all -t '(dfn:net/* OR dfn:drivers/net/ethernet/mellanox/mlx5/* OR dfn:drivers/net/ethernet/broadcom/bnxt/*) AND tc:netdev@vger.kernel.org AND rt:2.week.ago..'

I'm seeing patches that touch drivers/net/* whereas I only want to match
net/*.

I tried changing it to dfn:^net/* and dfn:b/net/* but neither is
working.

I also read the Xapian docs: https://xapian.org/docs/queryparser.html
but didn't see anything more than * wildcards.

Could you please advise on how I can limit my query to only net/*?

Thanks!
David

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: don't exit lei-daemon on ovv_begin failure
@ 2023-10-27  1:14 88% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-27  1:14 UTC (permalink / raw)
  To: meta

When ->ovv_begin is called in LeiXSearch->do_query in the top-level
lei-daemon process, $lei->{pkt_op_p} still exists.  We must make
sure we're exiting the correct process since lei->out can call
lei->fail and lei->fail calls lei->x_it.

As for avoiding the ->ovv_begin failures I caused in the first place,
that's for a much bigger change...
---
 lib/PublicInbox/LEI.pm | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 7bc7b2dc..e060bcbe 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -40,6 +40,7 @@ $GLP_PASS->configure(qw(gnu_getopt no_ignore_case auto_abbrev pass_through));
 our (%PATH2CFG, # persistent for socket daemon
 $MDIR2CFGPATH, # /path/to/maildir => { /path/to/config => [ ino watches ] }
 $OPT, # shared between optparse and opt_dash callback (for Getopt::Long)
+$daemon_pid
 );
 
 # TBD: this is a documentation mechanism to show a subcommand
@@ -486,7 +487,7 @@ sub x_it ($$) {
 	stop_pager($self);
 	if ($self->{pkt_op_p}) { # worker => lei-daemon
 		$self->{pkt_op_p}->pkt_do('x_it', $code);
-		exit($code >> 8);
+		exit($code >> 8) if $$ != $daemon_pid;
 	} elsif ($self->{sock}) { # lei->daemon => lei(1) client
 		send($self->{sock}, "x_it $code", 0);
 	} elsif ($quit == \&CORE::exit) { # an admin (one-shot) command
@@ -1341,8 +1342,8 @@ sub lazy_start {
 	my $pid = fork;
 	return if $pid;
 	$0 = "lei-daemon $path";
-	local %PATH2CFG;
-	local $MDIR2CFGPATH;
+	local (%PATH2CFG, $MDIR2CFGPATH);
+	local $daemon_pid = $$;
 	$listener->blocking(0);
 	my $exit_code;
 	my $pil = PublicInbox::Listener->new($listener, \&accept_dispatch);

^ permalink raw reply related	[relevance 88%]

* [PATCH 31/30] lei: simplify startq/au_done wakeup notifications
    2023-10-17 23:38 71% ` [PATCH 18/30] t/lei-up: additional diagnostics for match failures Eric Wong
  2023-10-17 23:38 50% ` [PATCH 26/30] lei: use autodie where appropriate Eric Wong
@ 2023-10-19  1:14 59% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-19  1:14 UTC (permalink / raw)
  To: meta

We only need to write one byte at MUA start instead of a byte
for every LeiXSearch worker.  Also, make sure it succeeds by
enabling autodie for syswrite.

When reading, we can rely on `:perlio' layer `read' semantics
to retry on EINTR to avoid looping and other error checking.
---
 lib/PublicInbox/LEI.pm        |  9 +++++----
 lib/PublicInbox/LeiXSearch.pm | 28 +++++++++-------------------
 2 files changed, 14 insertions(+), 23 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3ccdd4f7..56e4c001 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -9,7 +9,7 @@ package PublicInbox::LEI;
 use v5.12;
 use parent qw(PublicInbox::DS PublicInbox::LeiExternal
 	PublicInbox::LeiQuery);
-use autodie qw(bind chdir fork open socket socketpair unlink);
+use autodie qw(bind chdir fork open socket socketpair syswrite unlink);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_SEQPACKET pack_sockaddr_un);
 use Errno qw(EPIPE EAGAIN ECONNREFUSED ENOENT ECONNRESET);
@@ -1031,9 +1031,10 @@ sub start_mua {
 		$io->[0] = $self->{1} if $self->{opt}->{stdin} && -t $self->{1};
 		send_exec_cmd($self, $io, \@cmd, {});
 	}
-	if ($self->{lxs} && $self->{au_done}) { # kick wait_startq
-		syswrite($self->{au_done}, 'q' x ($self->{lxs}->{jobs} // 0));
-	}
+
+	# kick wait_startq:
+	syswrite($self->{au_done}, 'q') if $self->{lxs} && $self->{au_done};
+
 	return unless -t $self->{2}; # XXX how to determine non-TUI MUAs?
 	$self->{opt}->{quiet} = 1;
 	delete $self->{-progress};
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 25b66b3b..241b9dab 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -122,26 +122,16 @@ sub _mset_more ($$) {
 	$size >= $mo->{limit} && (($mo->{offset} += $size) < $mo->{total});
 }
 
-# $startq will EOF when do_augment is done augmenting and allow
+# $startq will see `q' in do_post_augment -> start_mua if spawning MUA.
+# Otherwise $startq will EOF when do_augment is done augmenting and allow
 # query_combined_mset and query_thread_mset to proceed.
 sub wait_startq ($) {
 	my ($lei) = @_;
-	my $startq = delete $lei->{startq} or return;
-	while (1) {
-		my $n = sysread($startq, my $do_augment_done, 1);
-		if (defined $n) {
-			return if $n == 0; # no MUA
-			if ($do_augment_done eq 'q') {
-				$lei->{opt}->{quiet} = 1;
-				delete $lei->{opt}->{verbose};
-				delete $lei->{-progress};
-			} else {
-				die "BUG: do_augment_done=`$do_augment_done'";
-			}
-			return;
-		}
-		die "wait_startq: $!" unless $!{EINTR};
-	}
+	read(delete($lei->{startq}) // return, my $buf, 1) or return; # EOF
+	die "BUG: wrote `$buf' to au_done" if $buf ne 'q';
+	$lei->{opt}->{quiet} = 1;
+	delete $lei->{opt}->{verbose};
+	delete $lei->{-progress};
 }
 
 sub mset_progress {
@@ -451,10 +441,10 @@ sub do_post_augment {
 		$lei->fail("$err");
 	}
 	if (!$err && delete $lei->{early_mua}) { # non-augment case
-		eval { $lei->start_mua };
+		eval { $lei->start_mua }; # may trigger wait_startq
 		$lei->fail($@) if $@;
 	}
-	close(delete $lei->{au_done}); # triggers wait_startq in lei_xsearch
+	close(delete $lei->{au_done}); # trigger wait_startq if start_mua didn't
 }
 
 sub incr_post_augment { # called whenever an l2m shard finishes augment

^ permalink raw reply related	[relevance 59%]
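
The wakeup scheme in the patch above comes down to one buffered read on a
pipe: either a single `q' byte arrives, or the writer closes and the reader
sees EOF.  The sketch below shows both outcomes in one process.  The sub
name `wait_start' and its return strings are invented for illustration, and
autodie is left out so that the error handling stays visible.

```perl
use strict;
use warnings;

# Returns 'kicked' if the writer sent `q', 'eof' if it only closed.
sub wait_start {
	my ($rd) = @_;
	my $n = read($rd, my $buf, 1) // die "read: $!";
	return 'eof' if $n == 0; # writer closed without writing
	die "BUG: unexpected byte: $buf" if $buf ne 'q';
	'kicked';
}

my @seen;
for my $kick (1, 0) {
	pipe(my $rd, my $wr) or die "pipe: $!";
	if ($kick) {
		syswrite($wr, 'q') // die "syswrite: $!";
	}
	close $wr; # always close, as do_post_augment does with au_done
	push @seen, wait_start($rd);
}
print join(',', @seen), "\n";
```

One byte is enough because the reader only needs to tell "kicked" apart
from "closed".  The buffered read() also removes the need for the explicit
EINTR retry loop that the old sysread() version carried.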

* [PATCH 26/30] lei: use autodie where appropriate
    2023-10-17 23:38 71% ` [PATCH 18/30] t/lei-up: additional diagnostics for match failures Eric Wong
@ 2023-10-17 23:38 50% ` Eric Wong
  2023-10-19  1:14 59% ` [PATCH 31/30] lei: simplify startq/au_done wakeup notifications Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-17 23:38 UTC (permalink / raw)
  To: meta

This makes us a bit harsher with misbehaving clients, but we
only have one client implementation at the moment.
---
 lib/PublicInbox/LEI.pm | 48 ++++++++++++++++++------------------------
 1 file changed, 20 insertions(+), 28 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1ff6d67f..3ccdd4f7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -9,6 +9,7 @@ package PublicInbox::LEI;
 use v5.12;
 use parent qw(PublicInbox::DS PublicInbox::LeiExternal
 	PublicInbox::LeiQuery);
+use autodie qw(bind chdir fork open socket socketpair unlink);
 use Getopt::Long ();
 use Socket qw(AF_UNIX SOCK_SEQPACKET pack_sockaddr_un);
 use Errno qw(EPIPE EAGAIN ECONNREFUSED ENOENT ECONNRESET);
@@ -29,7 +30,6 @@ use File::Path ();
 use File::Spec;
 use Carp ();
 use Sys::Syslog qw(openlog syslog closelog);
-use Scalar::Util qw(looks_like_number);
 our $quit = \&CORE::exit;
 our ($current_lei, $errors_log, $listener, $oldset, $dir_idle);
 my $GLP = Getopt::Long::Parser->new;
@@ -574,12 +574,12 @@ sub _lei_atfork_child {
 	# we need to explicitly close things which are on stack
 	my $cfg = $self->{cfg};
 	if ($persist) {
-		open $self->{3}, '<', '/' or die "open(/) $!";
+		open $self->{3}, '<', '/';
 		fchdir($self);
 		close($_) for (grep(defined, delete @$self{qw(0 1 2 sock)}));
 		delete @$cfg{qw(-lei_store -watches -lei_note_event)};
 	} else { # worker, Net::NNTP (Net::Cmd) uses STDERR directly
-		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
+		open STDERR, '+>&='.fileno($self->{2});
 		STDERR->autoflush(1);
 		POSIX::setpgid(0, $$) // die "setpgid(0, $$): $!";
 		delete @$cfg{qw(-watches -lei_note_event)};
@@ -813,10 +813,9 @@ sub dispatch {
 		if (my $chdir = $self->{opt}->{C}) {
 			for my $d (@$chdir) {
 				next if $d eq ''; # same as git(1)
-				chdir $d or return fail($self, "cd $d: $!");
+				chdir $d;
 			}
-			open $self->{3}, '<', '.' or
-				return fail($self, "open . $!");
+			open($self->{3}, '<', '.');
 		}
 		$cb->($self, @argv);
 	} elsif (grep(/\A-/, $cmd, @argv)) { # --help or -h only
@@ -851,7 +850,7 @@ sub _lei_cfg ($;$) {
 		}
 		my ($cfg_dir) = ($f =~ m!(.*?/)[^/]+\z!);
 		File::Path::mkpath($cfg_dir);
-		open my $fh, '>>', $f or die "open($f): $!\n";
+		open my $fh, '>>', $f;
 		@st = stat($fh) or die "fstat($f): $!\n";
 		$cur_st = pack('dd', $st[10], $st[7]);
 		qerr($self, "# $f created") if $self->{cmd} ne 'config';
@@ -1148,10 +1147,7 @@ sub accept_dispatch { # Listener {post_accept} callback
 		return send($sock, $msg, 0);
 	} else {
 		my $i = 0;
-		for my $fd (@fds) {
-			open($self->{$i++}, '+<&=', $fd) and next;
-			send($sock, "open(+<&=$fd) (FD=$i): $!", 0);
-		}
+		open($self->{$i++}, '+<&=', $_) for @fds;
 		$i == 4 or return send($sock, 'not enough FDs='.($i-1), 0)
 	}
 	# $ENV_STR = join('', map { "\0$_=$ENV{$_}" } keys %ENV);
@@ -1236,12 +1232,11 @@ sub dump_and_clear_log {
 sub cfg2lei ($) {
 	my ($cfg) = @_;
 	my $lei = bless { env => { %{$cfg->{-env}} } }, __PACKAGE__;
-	open($lei->{0}, '<&', \*STDIN) or die "dup 0: $!";
-	open($lei->{1}, '>>&', \*STDOUT) or die "dup 1: $!";
-	open($lei->{2}, '>>&', \*STDERR) or die "dup 2: $!";
-	open($lei->{3}, '<', '/') or die "open /: $!";
-	my ($x, $y);
-	socketpair($x, $y, AF_UNIX, SOCK_SEQPACKET, 0) or die "socketpair: $!";
+	open($lei->{0}, '<&', \*STDIN);
+	open($lei->{1}, '>>&', \*STDOUT);
+	open($lei->{2}, '>>&', \*STDERR);
+	open($lei->{3}, '<', '/');
+	socketpair(my $x, my $y, AF_UNIX, SOCK_SEQPACKET, 0);
 	$lei->{sock} = $x;
 	require PublicInbox::LeiSelfSocket;
 	PublicInbox::LeiSelfSocket->new($y); # adds to event loop
@@ -1317,17 +1312,15 @@ sub lazy_start {
 	my $lk = PublicInbox::Lock->new($errors_log);
 	umask(077) // die("umask(077): $!");
 	$lk->lock_acquire;
-	socket($listener, AF_UNIX, SOCK_SEQPACKET, 0) or die "socket: $!";
+	socket($listener, AF_UNIX, SOCK_SEQPACKET, 0);
 	if ($errno == ECONNREFUSED || $errno == ENOENT) {
 		return if connect($listener, $addr); # another process won
-		if ($errno == ECONNREFUSED && -S $path) {
-			unlink($path) or die "unlink($path): $!";
-		}
+		unlink($path) if $errno == ECONNREFUSED && -S $path;
 	} else {
 		$! = $errno; # allow interpolation to stringify in die
 		die "connect($path): $!";
 	}
-	bind($listener, $addr) or die "bind($path): $!";
+	bind($listener, $addr);
 	$lk->lock_release;
 	undef $lk;
 	my @st = stat($path) or die "stat($path): $!";
@@ -1340,11 +1333,11 @@ sub lazy_start {
 	require PublicInbox::Listener;
 	require PublicInbox::PktOp;
 	(-p STDOUT) or die "E: stdout must be a pipe\n";
-	open(STDIN, '+>>', $errors_log) or die "open($errors_log): $!";
+	open(STDIN, '+>>', $errors_log);
 	STDIN->autoflush(1);
 	dump_and_clear_log();
 	POSIX::setsid() > 0 or die "setsid: $!";
-	my $pid = fork // die "fork: $!";
+	my $pid = fork;
 	return if $pid;
 	$0 = "lei-daemon $path";
 	local %PATH2CFG;
@@ -1385,8 +1378,8 @@ sub lazy_start {
 	};
 	local $SIG{PIPE} = 'IGNORE';
 	local $SIG{ALRM} = 'IGNORE';
-	open STDERR, '>&STDIN' or die "redirect stderr failed: $!";
-	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
+	open STDERR, '>&STDIN';
+	open STDOUT, '>&STDIN';
 	# $daemon pipe to `lei' closed, main loop begins:
 	eval { PublicInbox::DS::event_loop($sig, $oldset) };
 	warn "event loop error: $@\n" if $@;
@@ -1424,8 +1417,7 @@ sub wq_done_wait { # awaitpid cb (via wq_eof)
 
 sub fchdir {
 	my ($lei) = @_;
-	my $dh = $lei->{3} // die 'BUG: lei->{3} (CWD) gone';
-	chdir($dh) || die "fchdir: $!";
+	chdir($lei->{3} // die 'BUG: lei->{3} (CWD) gone');
 }
 
 sub wq_eof { # EOF callback for main daemon

^ permalink raw reply related	[relevance 50%]

* [PATCH 18/30] t/lei-up: additional diagnostics for match failures
  @ 2023-10-17 23:38 71% ` Eric Wong
  2023-10-17 23:38 50% ` [PATCH 26/30] lei: use autodie where appropriate Eric Wong
  2023-10-19  1:14 59% ` [PATCH 31/30] lei: simplify startq/au_done wakeup notifications Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-17 23:38 UTC (permalink / raw)
  To: meta

I'm not sure why, but this test just failed for some odd reason
from `make check-run' on my Debian bullseye workstation.
---
 t/lei-up.t | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/lei-up.t b/t/lei-up.t
index baed6507..2d3afd82 100644
--- a/t/lei-up.t
+++ b/t/lei-up.t
@@ -18,11 +18,11 @@ test_lei(sub {
 		gunzip("$home/$x.mbox.gz" => \$uc, MultiStream => 1) or
 				xbail "gunzip $GunzipError";
 		ok(index($uc, $qp->body_raw) >= 0,
-			"original mail in $x.mbox.gz");
+			"original mail in $x.mbox.gz") or diag $uc;
 		open my $fh, '<', "$home/$x" or xbail $!;
 		$uc = do { local $/; <$fh> } // xbail $!;
 		ok(index($uc, $qp->body_raw) >= 0,
-			"original mail in uncompressed $x");
+			"original mail in uncompressed $x") or diag $uc;
 	}
 	lei_ok qw(ls-search);
 	$s = eml_load('t/utf8.eml')->as_string;

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/3] lei: stdin handling improvements
@ 2023-10-17 10:11 71% Eric Wong
  2023-10-17 10:11 51% ` [PATCH 1/3] lei: consolidate stdin slurp, fix warnings Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-10-17 10:11 UTC (permalink / raw)
  To: meta

I'm not sure about 3/3 since termios(3) stuff is mostly alien to
me.  I'm very slowly expanding my horizons...

Eric Wong (3):
  lei: consolidate stdin slurp, fix warnings
  input_pipe: improve error handling
  input_pipe: handle noncanonical TTY

 lib/PublicInbox/InputPipe.pm  | 79 ++++++++++++++++++++++++++++-------
 lib/PublicInbox/LEI.pm        | 13 ++++++
 lib/PublicInbox/LeiInspect.pm | 12 +-----
 lib/PublicInbox/LeiLcat.pm    | 13 +-----
 lib/PublicInbox/LeiQuery.pm   | 14 ++-----
 t/lei.t                       |  5 +++
 6 files changed, 88 insertions(+), 48 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 1/3] lei: consolidate stdin slurp, fix warnings
  2023-10-17 10:11 71% [PATCH 0/3] lei: stdin handling improvements Eric Wong
@ 2023-10-17 10:11 51% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-17 10:11 UTC (permalink / raw)
  To: meta

We can share more code amongst stdin slurper (not streaming)
commands.  This also fixes uninitialized variable warnings when
feeding an empty stdin to these commands.
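
The shared slurp pattern can be sketched outside of lei as below.
This is only an illustration under stated assumptions: make_slurper
is a hypothetical helper (not part of public-inbox), and it assumes
the reader invokes the callback once per chunk, passing an empty
string at EOF.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a shared slurp callback: accumulate chunks
# and run $done once an empty chunk signals EOF.
sub make_slurper {
	my ($done) = @_;
	my $buf = ''; # starts defined, so empty input never yields undef
	sub {
		my ($chunk) = @_;
		defined($chunk) or die "error reading stdin\n";
		$buf .= $chunk;
		$done->($buf) if $chunk eq '';
	};
}

my $result;
my $cb = make_slurper(sub { $result = $_[0] });
$cb->($_) for ('s:hello ', 'd:last.week..', '');
print "query: $result\n";
```

Keeping the buffer defined from the start is one way to ensure an
empty stdin hands a defined (empty) string to the final callback.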
---
 lib/PublicInbox/LEI.pm        | 13 +++++++++++++
 lib/PublicInbox/LeiInspect.pm | 12 ++----------
 lib/PublicInbox/LeiLcat.pm    | 13 ++-----------
 lib/PublicInbox/LeiQuery.pm   | 14 +++-----------
 t/lei.t                       |  5 +++++
 5 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b00be1a1..1ff6d67f 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1573,4 +1573,17 @@ sub request_umask {
 	$u eq 'u' or warn "E: recv $v has no umask";
 }
 
+sub _stdin_cb { # PublicInbox::InputPipe::consume callback for --stdin
+	my ($lei, $cb) = @_; # $_[-1] = $rbuf
+	$_[-1] // return $lei->fail("error reading stdin: $!");
+	$lei->{stdin_buf} .= $_[-1];
+	do_env($lei, $cb) if $_[-1] eq '';
+}
+
+sub slurp_stdin {
+	my ($lei, $cb) = @_;
+	require PublicInbox::InputPipe;
+	PublicInbox::InputPipe::consume($lei->{0}, \&_stdin_cb, $lei, $cb);
+}
+
 1;
diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 65c64cf2..d4ad03eb 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -253,20 +253,13 @@ sub inspect_start ($$) {
 
 sub do_inspect { # lei->do_env cb
 	my ($lei) = @_;
-	my $str = delete $lei->{istr};
+	my $str = delete $lei->{stdin_buf};
 	PublicInbox::Eml::strip_from($str);
 	my $eml = PublicInbox::Eml->new(\$str);
 	inspect_start($lei, [ 'blob:'.$lei->git_oid($eml)->hexdigest,
 			map { "mid:$_" } @{mids($eml)} ]);
 }
 
-sub ins_add { # InputPipe->consume callback
-	my ($lei) = @_; # $_[1] = $rbuf
-	$_[1] // return $lei->fail("error reading stdin: $!");
-	return $lei->{istr} .= $_[1] if $_[1] ne '';
-	$lei->do_env(\&do_inspect);
-}
-
 sub lei_inspect {
 	my ($lei, @argv) = @_;
 	$lei->{json} = ref(PublicInbox::Config::json())->new->utf8->canonical;
@@ -281,8 +274,7 @@ sub lei_inspect {
 		return $lei->fail(<<'') if @argv;
 no args allowed on command-line with --stdin
 
-		require PublicInbox::InputPipe;
-		PublicInbox::InputPipe::consume($lei->{0}, \&ins_add, $lei);
+		$lei->slurp_stdin(\&do_inspect);
 	} else {
 		inspect_start($lei, \@argv);
 	}
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 72875dc6..274a9605 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -124,18 +124,11 @@ could not extract Message-ID from $x
 
 sub do_lcat { # lei->do_env cb
 	my ($lei) = @_;
-	my @argv = split(/\s+/, $lei->{mset_opt}->{qstr});
+	my @argv = split(/\s+/, delete($lei->{stdin_buf}));
 	$lei->{mset_opt}->{qstr} = extract_all($lei, @argv) or return;
 	$lei->_start_query;
 }
 
-sub _stdin { # PublicInbox::InputPipe::consume callback for --stdin
-	my ($lei) = @_; # $_[1] = $rbuf
-	$_[1] // return $lei->fail("error reading stdin: $!");
-	return $lei->{mset_opt}->{qstr} .= $_[1] if $_[1] ne '';
-	$lei->do_env(\&do_lcat);
-}
-
 sub lei_lcat {
 	my ($lei, @argv) = @_;
 	my $lxs = $lei->lxs_prepare or return;
@@ -152,9 +145,7 @@ sub lei_lcat {
 		return $lei->fail(<<'') if @argv;
 no args allowed on command-line with --stdin
 
-		require PublicInbox::InputPipe;
-		PublicInbox::InputPipe::consume($lei->{0}, \&_stdin, $lei);
-		return;
+		return $lei->slurp_stdin(\&do_lcat);
 	}
 	$lei->{mset_opt}->{qstr} = extract_all($lei, @argv) or return;
 	$lei->_start_query;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index e2d8a096..eadf811f 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -61,19 +61,13 @@ sub _start_query { # used by "lei q" and "lei up"
 
 sub do_qry { # do_env cb
 	my ($lei) = @_;
-	$lei->{mset_opt}->{q_raw} = $lei->{mset_opt}->{qstr};
+	$lei->{mset_opt}->{q_raw} = $lei->{mset_opt}->{qstr}
+						= delete $lei->{stdin_buf};
 	$lei->{lse}->query_approxidate($lei->{lse}->git,
 					$lei->{mset_opt}->{qstr});
 	_start_query($lei);
 }
 
-sub qstr_add { # PublicInbox::InputPipe::consume callback for --stdin
-	my ($lei) = @_; # $_[1] = $rbuf
-	$_[1] // $lei->fail("error reading stdin: $!");
-	return $lei->{mset_opt}->{qstr} .= $_[1] if $_[1] ne '';
-	$lei->do_env(\&do_qry);
-}
-
 # make the URI||PublicInbox::{Inbox,ExtSearch} a config-file friendly string
 sub cfg_ext ($) {
 	my ($x) = @_;
@@ -159,9 +153,7 @@ sub lei_q {
 		return $self->fail(<<'') if @argv;
 no query allowed on command-line with --stdin
 
-		require PublicInbox::InputPipe;
-		PublicInbox::InputPipe::consume($self->{0}, \&qstr_add, $self);
-		return;
+		return $self->slurp_stdin(\&do_qry);
 	}
 	chomp(@argv) and $self->qerr("# trailing `\\n' removed");
 	$mset_opt{q_raw} = [ @argv ]; # copy
diff --git a/t/lei.t b/t/lei.t
index 3ac804a8..1dbc9d4c 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -182,6 +182,11 @@ my $test_fail = sub {
 	}
 	lei_ok('sucks', \'yes, but hopefully less every day');
 	like($lei_out, qr/loaded features/, 'loaded features shown');
+
+	lei_ok([qw(q --stdin -f text)], undef, { 0 => \'', %$lei_opt });
+	is($lei_err, '', 'no errors on empty stdin');
+	is($lei_out, '', 'no output on empty query');
+
 SKIP: {
 	skip 'no curl', 3 unless require_cmd('curl', 1);
 	lei(qw(q --only http://127.0.0.1:99999/bogus/ t:m));

^ permalink raw reply related	[relevance 51%]

* Re: [PATCH 2/3] doc: lei-q: drop stale TODO comment (fixed in 1f1b1f0e22f7)
  2023-10-16 11:33 71% ` [PATCH 2/3] doc: lei-q: drop stale TODO comment (fixed in 1f1b1f0e22f7) Štěpán Němec
@ 2023-10-16 21:17 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-16 21:17 UTC (permalink / raw)
  To: Štěpán Němec; +Cc: meta

Štěpán Němec <stepnem@smrk.net> wrote:
> I also wonder about CAVEATS from lei-overview.pod:
> 
>   IMAP and NNTP client performance is poor on high-latency connections.
>   It will hopefully be fixed in 2022.
> 
> I think this needs an update (or removal, if the issue was indeed
> fixed).

Unfortunately not :<   It's a PITA to deal with some of that stuff
and I'm still using offlineimap (as I've been for ~20 years, now?)
I'll try to take care of it in 2024 at latest...

^ permalink raw reply	[relevance 71%]

* [PATCH 2/3] doc: lei-q: drop stale TODO comment (fixed in 1f1b1f0e22f7)
  @ 2023-10-16 11:33 71% ` Štěpán Němec
  2023-10-16 21:17 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Štěpán Němec @ 2023-10-16 11:33 UTC (permalink / raw)
  To: meta

Fixes: 1f1b1f0e22f7 ("doc: lei-q: document SEARCH TERMS prefixes")
---
I also wonder about CAVEATS from lei-overview.pod:

  IMAP and NNTP client performance is poor on high-latency connections.
  It will hopefully be fixed in 2022.

I think this needs an update (or removal, if the issue was indeed
fixed).

Thanks.

 Documentation/lei-q.pod | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 5f5338490b07..4862ce78b709 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -12,9 +12,6 @@ lei q [OPTIONS] (--stdin|-)
 
 Search for messages across the lei/store and externals.
 
-=for comment
-TODO: Give common prefixes, or at least a description/reference.
-
 =head1 OPTIONS
 
 =for comment
-- 
2.42.0


^ permalink raw reply related	[relevance 71%]

* [PATCH] lei: quiet excessive write/seen messages
@ 2023-10-12  0:21 66% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-12  0:21 UTC (permalink / raw)
  To: meta

We don't want to end up dumping nr_seen/nr_write when progress
is disabled, nor do we want forked-off `lei note-event' workers
to dump them when DS->Reset is called on fork.
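
The PID guard can be illustrated with a small standalone sketch.
The Counter class and its report callback are hypothetical and exist
only for this example; the idea is that a destructor reports counters
only in the process which incremented them, so a forked child which
merely inherited the object stays quiet.

```perl
use strict;
use warnings;

package Counter; # hypothetical class, not part of public-inbox

sub new { bless { report => $_[1] }, $_[0] }

sub incr {
	my $self = shift;
	$self->{incr_pid} = $$ if @_; # remember which process counted
	while (my ($k, $n) = splice(@_, 0, 2)) { $self->{$k} += $n }
}

sub DESTROY {
	my ($self) = @_;
	# stay quiet in processes which never called incr themselves
	return unless defined($self->{incr_pid}) && $self->{incr_pid} == $$;
	for my $k (sort grep(/\Anr_/, keys %$self)) {
		$self->{report}->("$self->{$k} $k");
	}
}

package main;
my @seen;
{
	my $c = Counter->new(sub { push @seen, @_ });
	$c->incr(nr_write => 2, nr_seen => 3);
	$c->incr(nr_write => 1);
} # $c goes out of scope here and DESTROY runs
print "$_\n" for @seen;
```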
---
 lib/PublicInbox/LEI.pm        | 11 +++++++----
 lib/PublicInbox/LeiXSearch.pm |  6 +++---
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index af39f8af..b00be1a1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -631,6 +631,7 @@ sub pkt_op_pair {
 
 sub incr {
 	my $lei = shift;
+	$lei->{incr_pid} = $$ if @_;
 	while (my ($f, $n) = splice(@_, 0, 2)) { $lei->{$f} += $n }
 }
 
@@ -1399,10 +1400,12 @@ sub busy { 1 } # prevent daemon-shutdown if client is connected
 # can immediately reread it
 sub DESTROY {
 	my ($self) = @_;
-	for my $k (sort(grep(/\A-nr_/, keys %$self))) {
-		my $nr = $self->{$k};
-		substr($k, 0, length('-nr_'), '');
-		$self->child_error(0, "$nr $k messages");
+	if (defined($self->{incr_pid}) && $self->{incr_pid} == $$) {
+		for my $k (sort(grep(/\A-nr_/, keys %$self))) {
+			my $nr = $self->{$k};
+			substr($k, 0, length('-nr_'), '');
+			$self->child_error(0, "$nr $k messages");
+		}
 	}
 	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2a4af3e7..d83a403c 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -419,12 +419,12 @@ Error closing $lei->{ovv}->{dst}: \$!=$! \$?=$?
 			delete $l2m->{mbl}; # drop dotlock
 		}
 	}
+	my $nr_w = delete($lei->{-nr_write}) // 0;
+	my $nr_dup = (delete($lei->{-nr_seen}) // 0) - $nr_w;
 	if ($lei->{-progress}) {
 		my $tot = $lei->{-mset_total} // 0;
-		my $nr_w = delete($lei->{-nr_write}) // 0;
-		my $d = (delete($lei->{-nr_seen}) // 0) - $nr_w;
 		my $x = "$tot matches";
-		$x .= ", $d duplicates" if $d;
+		$x .= ", $nr_dup duplicates" if $nr_dup;
 		if ($l2m) {
 			my $m = "# $nr_w written to " .
 				"$lei->{ovv}->{dst} ($x)";

^ permalink raw reply related	[relevance 66%]

* [PATCH 8/9] lei blob: run cat_blob on lei/store for pending blobs
  2023-10-11  7:20 65% [PATCH 0/9] lei + import-related updates Eric Wong
  2023-10-11  7:20 56% ` [PATCH 1/9] lei rediff: use ProcessIO for --drq support Eric Wong
@ 2023-10-11  7:20 86% ` Eric Wong
  2023-10-11  7:20 40% ` [PATCH 9/9] lei import|tag|rm: support --commit-delay=SECONDS Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-11  7:20 UTC (permalink / raw)
  To: meta

This can probably be made asynchronous in the future via
PublicInbox::InputPipe, but it's good enough for testing.
---
 lib/PublicInbox/LeiBlob.pm  | 16 ++++++++++------
 lib/PublicInbox/LeiStore.pm |  5 +++++
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/lib/PublicInbox/LeiBlob.pm b/lib/PublicInbox/LeiBlob.pm
index 8df83b1d..d069d4a8 100644
--- a/lib/PublicInbox/LeiBlob.pm
+++ b/lib/PublicInbox/LeiBlob.pm
@@ -9,6 +9,7 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Spawn qw(run_wait popen_rd which);
 use PublicInbox::DS;
+use PublicInbox::Eml;
 
 sub get_git_dir ($$) {
 	my ($lei, $d) = @_;
@@ -121,18 +122,21 @@ sub lei_blob {
 				'cat-file', 'blob', $blob ];
 		if (defined $lei->{-attach_idx}) {
 			my $fh = popen_rd($cmd, $lei->{env}, $rdr);
-			require PublicInbox::Eml;
 			my $buf = do { local $/; <$fh> };
 			return extract_attach($lei, $blob, \$buf) if close($fh);
 		}
 		$rdr->{1} = $lei->{1};
 		my $cerr = run_wait($cmd, $lei->{env}, $rdr) or return;
 		my $lms = $lei->lms;
-		if (my $bref = $lms ? $lms->local_blob($blob, 1) : undef) {
-			defined($lei->{-attach_idx}) and
-				return extract_attach($lei, $blob, $bref);
-			return $lei->out($$bref);
-		} elsif ($opt->{mail}) {
+		my $bref = ($lms ? $lms->local_blob($blob, 1) : undef) // do {
+			my $sto = $lei->{sto} // $lei->_lei_store;
+			$sto && $sto->{-wq_s1} ? $sto->wq_do('cat_blob', $blob)
+						: undef;
+		};
+		$bref and return $lei->{-attach_idx} ?
+					extract_attach($lei, $blob, $bref) :
+					$lei->out($$bref);
+		if ($opt->{mail}) {
 			my $eh = $rdr->{2};
 			seek($eh, 0, 0);
 			return $lei->child_error($cerr, do { local $/; <$eh> });
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index e19ec88e..9c07af14 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -108,6 +108,11 @@ sub search {
 	PublicInbox::LeiSearch->new($_[0]->{priv_eidx}->{topdir});
 }
 
+sub cat_blob {
+	my ($self, $oid) = @_;
+	$self->{im} ? $self->{im}->cat_blob($oid) : undef;
+}
+
 # follows the stderr file
 sub _tail_err {
 	my ($self) = @_;

^ permalink raw reply related	[relevance 86%]

* [PATCH 9/9] lei import|tag|rm: support --commit-delay=SECONDS
  2023-10-11  7:20 65% [PATCH 0/9] lei + import-related updates Eric Wong
  2023-10-11  7:20 56% ` [PATCH 1/9] lei rediff: use ProcessIO for --drq support Eric Wong
  2023-10-11  7:20 86% ` [PATCH 8/9] lei blob: run cat_blob on lei/store for pending blobs Eric Wong
@ 2023-10-11  7:20 40% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-11  7:20 UTC (permalink / raw)
  To: meta

Delayed commits allow users to trade off immediate safety for
throughput and reduced storage wear when running multiple
discrete commands.

This feature is currently useful for providing a way to make
t/lei-store-fail.t reliable and for ensuring `lei blob' can
retrieve messages which have not yet been committed.

In the future, it'll also be useful for the FUSE layer to batch
git activity.
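
The batching effect of a delayed commit can be sketched with a toy
timer table. Everything here is hypothetical (schedule_once, run_due
and the explicit $now argument are made up for this example and are
not the PublicInbox::DS API); it only shows how arming one timer per
key lets several writes share a single commit.

```perl
use strict;
use warnings;

my %pending; # key => [ deadline, callback ]

# hypothetical helper: arm a timer only if none is pending for $key
sub schedule_once {
	my ($key, $now, $delay, $cb) = @_;
	$pending{$key} //= [ $now + $delay, $cb ];
}

# run and clear every timer whose deadline has passed
sub run_due {
	my ($now) = @_;
	for my $key (sort keys %pending) {
		my ($deadline, $cb) = @{$pending{$key}};
		next if $deadline > $now;
		delete $pending{$key};
		$cb->();
	}
}

my $commits = 0;
schedule_once('store', 0, 5, sub { $commits++ }) for 1..3; # 3 writes
run_due(4); # too early, nothing is committed
run_due(5); # a single commit covers all three writes
print "commits: $commits\n";
```

The trade-off matches the one described above: data written before
the timer fires is not yet committed, in exchange for fewer commits.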
---
 lib/PublicInbox/LEI.pm      | 23 ++++++++++++++---------
 lib/PublicInbox/LeiStore.pm |  6 ++++++
 t/lei-import.t              | 13 +++++++++++++
 t/lei-store-fail.t          | 20 +++++++++++++-------
 t/lei-tag.t                 | 15 ++++++++++++++-
 5 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index e2b3c0d9..af39f8af 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -231,13 +231,13 @@ our %CMD = ( # sorted in order of importance/use:
 'rm' => [ '--stdin|LOCATION...',
 	'remove a message from the index and prevent reindexing',
 	'stdin|', # /|\z/ must be first for lone dash
-	qw(in-format|F=s lock=s@), @net_opt, @c_opt ],
+	qw(in-format|F=s lock=s@ commit-delay=i), @net_opt, @c_opt ],
 'plonk' => [ '--threads|--from=IDENT',
 	'exclude mail matching From: or threads from non-Message-ID searches',
 	qw(stdin| threads|t from|f=s mid=s oid=s), @c_opt ],
-'tag' => [ 'KEYWORDS... LOCATION...|--stdin',
+tag => [ 'KEYWORDS... LOCATION...|--stdin',
 	'set/unset keywords and/or labels on message(s)',
-	qw(stdin| in-format|F=s input|i=s@ oid=s@ mid=s@),
+	qw(stdin| in-format|F=s input|i=s@ oid=s@ mid=s@ commit-delay=i),
 	@net_opt, @c_opt, pass_through('-kw:foo for delete') ],
 
 'purge-mailsource' => [ 'LOCATION|--all',
@@ -262,10 +262,11 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(in-format|F=s kw! offset=i recursive|r exclude=s include|I=s
 	verbose|v+ incremental!), @net_opt, # mainly for --proxy=
 	 @c_opt ],
-'import' => [ 'LOCATION...|--stdin [LABELS...]',
+import => [ 'LOCATION...|--stdin [LABELS...]',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s new-only
-	lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!),
+	lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!
+	commit-delay=i),
 	@net_opt, @c_opt ],
 'forget-mail-sync' => [ 'LOCATION...',
 	'forget sync information for a mail folder', @c_opt ],
@@ -1539,10 +1540,14 @@ sub sto_done_request {
 	my ($lei, $wq) = @_;
 	return unless $lei->{sto} && $lei->{sto}->{-wq_s1};
 	local $current_lei = $lei;
-	my $s = ($wq ? $wq->{lei_sock} : undef) // $lei->{sock};
-	my $errfh = $lei->{2} // *STDERR{GLOB};
-	my @io = $s ? ($errfh, $s) : ($errfh);
-	eval { $lei->{sto}->wq_io_do('done', \@io) };
+	if (my $n = $lei->{opt}->{'commit-delay'}) {
+		eval { $lei->{sto}->wq_do('schedule_commit', $n) };
+	} else {
+		my $s = ($wq ? $wq->{lei_sock} : undef) // $lei->{sock};
+		my $errfh = $lei->{2} // *STDERR{GLOB};
+		my @io = $s ? ($errfh, $s) : ($errfh);
+		eval { $lei->{sto}->wq_io_do('done', \@io) };
+	}
 	warn($@) if $@;
 }
 
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 9c07af14..aebb85a9 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -34,6 +34,7 @@ use Sys::Syslog qw(syslog openlog);
 use Errno qw(EEXIST ENOENT);
 use PublicInbox::Syscall qw(rename_noreplace);
 use PublicInbox::LeiStoreErr;
+use PublicInbox::DS qw(add_uniq_timer);
 
 sub new {
 	my (undef, $dir, $opt) = @_;
@@ -113,6 +114,11 @@ sub cat_blob {
 	$self->{im} ? $self->{im}->cat_blob($oid) : undef;
 }
 
+sub schedule_commit {
+	my ($self, $sec) = @_;
+	add_uniq_timer($self->{priv_eidx}->{topdir}, $sec, \&done, $self);
+}
+
 # follows the stderr file
 sub _tail_err {
 	my ($self) = @_;
diff --git a/t/lei-import.t b/t/lei-import.t
index 8b09d3aa..b2c1de9b 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -2,6 +2,7 @@
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use v5.12; use PublicInbox::TestCommon;
+use PublicInbox::DS qw(now);
 use autodie qw(open close);
 test_lei(sub {
 ok(!lei(qw(import -F bogus), 't/plack-qp.eml'), 'fails with bogus format');
@@ -141,6 +142,18 @@ $res = json_utf8->decode($lei_out);
 is_deeply($res->[0]->{kw}, [qw(answered flagged seen)], 'keyword added');
 is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'labels preserved');
 
+lei_ok qw(import --commit-delay=1 +L:bin -F eml t/data/binary.patch);
+lei_ok 'ls-label';
+unlike($lei_out, qr/\bbin\b/, 'commit-delay delays label');
+my $end = now + 10;
+my $n = 1;
+diag 'waiting for lei/store commit...';
+do {
+	tick $n;
+	$n = 0.1;
+} until (!lei('ls-label') || $lei_out =~ /\bbin\b/ || now > $end);
+like($lei_out, qr/\bbin\b/, 'commit-delay eventually commits');
+
 # see t/lei_to_mail.t for "import -F mbox*"
 });
 done_testing;
diff --git a/t/lei-store-fail.t b/t/lei-store-fail.t
index fb0f2b75..c2f03148 100644
--- a/t/lei-store-fail.t
+++ b/t/lei-store-fail.t
@@ -9,8 +9,11 @@ use Fcntl qw(SEEK_SET);
 use File::Path qw(remove_tree);
 
 my $start_home = $ENV{HOME}; # bug guard
+my $utf8_oid = '9bf1002c49eb075df47247b74d69bcd555e23422';
 test_lei(sub {
 	lei_ok qw(import -q t/plack-qp.eml); # start the store
+	ok(!lei(qw(blob --mail), $utf8_oid), 't/utf8.eml not imported, yet');
+
 	my $opt;
 	pipe($opt->{0}, my $in_w);
 	open $opt->{1}, '+>', undef;
@@ -20,27 +23,30 @@ test_lei(sub {
 	my $tp = start_script($cmd, undef, $opt);
 	close $opt->{0};
 	$in_w->autoflush(1);
-	for (1..500) { # need to fill up 64k read buffer
-		print $in_w <<EOM or xbail "print $!";
+	print $in_w <<EOM or xbail "print: $!";
 From k\@y Fri Oct  2 00:00:00 1993
 From: <k\@example.com>
 Date: Sat, 02 Oct 2010 00:00:00 +0000
 Subject: hi
-Message-ID: <$_\@t>
+Message-ID: <0\@t>
 
 will this save?
 EOM
-	}
-	tick 0.2; # XXX ugh, this is so hacky
+	# import another message w/ delay while mboxrd import is still running
+	lei_ok qw(import -q --commit-delay=300 t/utf8.eml);
+	lei_ok qw(blob --mail), $utf8_oid,
+		\'blob immediately available despite --commit-delay';
+	lei_ok qw(q m:testmessage@example.com);
+	is($lei_out, "[null]\n", 'delayed commit is unindexed');
 
-	# make sto_done_request fail:
+	# make immediate ->sto_done_request fail from mboxrd import:
 	remove_tree("$ENV{HOME}/.local/share/lei/store");
 	# subsequent lei commands are undefined behavior,
 	# but we need to make sure the current lei command fails:
 
 	close $in_w; # should trigger ->done
 	$tp->join;
-	isnt($?, 0, 'lei import error code set on failure');
+	isnt($?, 0, 'lei import -F mboxrd error code set on failure');
 	is(-s $opt->{1}, 0, 'nothing in stdout');
 	isnt(-s $opt->{2}, 0, 'stderr not empty');
 	seek($opt->{2}, 0, SEEK_SET);
diff --git a/t/lei-tag.t b/t/lei-tag.t
index cccf0af6..7278dfcd 100644
--- a/t/lei-tag.t
+++ b/t/lei-tag.t
@@ -1,9 +1,10 @@
 #!perl -w
 # Copyright (C) 2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10.1; use PublicInbox::TestCommon;
+use v5.12; use PublicInbox::TestCommon;
 require_git 2.6;
 require_mods(qw(json DBD::SQLite Xapian));
+use PublicInbox::DS qw(now);
 my ($ro_home, $cfg_path) = setup_public_inboxes;
 my $check_kw = sub {
 	my ($exp, %opt) = @_;
@@ -104,5 +105,17 @@ test_lei(sub {
 	lei_ok qw(tag +L:nope -F eml t/data/binary.patch);
 	like $lei_err, qr/\b1 unimported messages/, 'noted unimported'
 		or diag $lei_err;
+
+	lei_ok qw(tag -F eml --commit-delay=1 t/utf8.eml +L:utf8);
+	lei_ok 'ls-label';
+	unlike($lei_out, qr/\butf8\b/, 'commit-delay delays label');
+	my $end = now + 10;
+	my $n = 1;
+	diag 'waiting for lei/store commit...';
+	do {
+		tick $n;
+		$n = 0.1;
+	} until (!lei('ls-label') || $lei_out =~ /\butf8\b/ || now > $end);
+	like($lei_out, qr/\butf8\b/, 'commit-delay eventually commits');
 });
 done_testing;

^ permalink raw reply related	[relevance 40%]

* [PATCH 0/9] lei + import-related updates
@ 2023-10-11  7:20 65% Eric Wong
  2023-10-11  7:20 56% ` [PATCH 1/9] lei rediff: use ProcessIO for --drq support Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2023-10-11  7:20 UTC (permalink / raw)
  To: meta

A few more ProcessIO conversions to start with, and then
cleanups while I started working on import-related stuff.
Some of this will tie in nicely for FUSE, too...

I've realized msgtime messages were pointless anyways since
there's nothing anybody can really do about bad messages that
get through various upstream spam filters.

5/9 is a long-overdue cleanup I noticed while going
over Import.pm

9/9 ought to fix the fragile t/lei-store-fail.t test
by using new features.

Eric Wong (9):
  lei rediff: use ProcessIO for --drq support
  lei_xsearch: improve curl progress reporting
  msgtime: quiet warnings we can do nothing about
  msgtime: simplify msg_timestamp and msg_datestamp
  treewide: consolidate "From " line removal
  import: switch to Unix stream socket for fast-import
  import: cat_blob is a no-op w/o live fast-import
  lei blob: run cat_blob on lei/store for pending blobs
  lei import|tag|rm: support --commit-delay=SECONDS

 lib/PublicInbox/Eml.pm        |   6 ++
 lib/PublicInbox/IMAP.pm       |   2 +-
 lib/PublicInbox/Import.pm     | 138 ++++++++++++++++------------------
 lib/PublicInbox/LEI.pm        |  23 +++---
 lib/PublicInbox/LeiBlob.pm    |  16 ++--
 lib/PublicInbox/LeiInput.pm   |   5 +-
 lib/PublicInbox/LeiInspect.pm |   2 +-
 lib/PublicInbox/LeiRediff.pm  |  33 ++++----
 lib/PublicInbox/LeiStore.pm   |  11 +++
 lib/PublicInbox/LeiToMail.pm  |   3 +-
 lib/PublicInbox/LeiXSearch.pm |  34 +++++----
 lib/PublicInbox/Mbox.pm       |  16 ++--
 lib/PublicInbox/MboxReader.pm |   2 +-
 lib/PublicInbox/MsgTime.pm    |  49 +++++-------
 lib/PublicInbox/NNTP.pm       |   3 +-
 lib/PublicInbox/ProcessIO.pm  |  18 ++---
 lib/PublicInbox/Spawn.pm      |   1 +
 script/public-inbox-convert   |  18 ++---
 script/public-inbox-edit      |   5 +-
 script/public-inbox-learn     |   2 +-
 script/public-inbox-mda       |   4 +-
 script/public-inbox-purge     |   4 +-
 t/lei-import.t                |  13 ++++
 t/lei-store-fail.t            |  20 +++--
 t/lei-tag.t                   |  15 +++-
 25 files changed, 230 insertions(+), 213 deletions(-)


^ permalink raw reply	[relevance 65%]

* [PATCH 1/9] lei rediff: use ProcessIO for --drq support
  2023-10-11  7:20 65% [PATCH 0/9] lei + import-related updates Eric Wong
@ 2023-10-11  7:20 56% ` Eric Wong
  2023-10-11  7:20 86% ` [PATCH 8/9] lei blob: run cat_blob on lei/store for pending blobs Eric Wong
  2023-10-11  7:20 40% ` [PATCH 9/9] lei import|tag|rm: support --commit-delay=SECONDS Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-11  7:20 UTC (permalink / raw)
  To: meta

This required fixing binmode support a few commits ago, along
with properly enabling autoflush in popen_wr instead of setting
it on the wrapper ProcessIO class.
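
For readers unfamiliar with tied handles, here is a minimal sketch of
how binmode and print on a wrapper can be forwarded to the underlying
handle. The Wrap package is hypothetical and much smaller than
ProcessIO; it writes to an in-memory file rather than a pipe so the
result can be inspected.

```perl
use strict;
use warnings;

package Wrap; # hypothetical tied-handle wrapper

sub TIEHANDLE { bless { fh => $_[1] }, $_[0] }

# forward everything after $self to the real handle
sub PRINT { print { $_[0]->{fh} } @_[1..$#_] }

# forward binmode, with or without a layer argument
sub BINMODE { @_ == 1 ? binmode($_[0]->{fh}) : binmode($_[0]->{fh}, $_[1]) }

package main;
use Symbol qw(gensym);

my $out = '';
open(my $mem, '>', \$out) or die "open: $!";
my $h = gensym;
tie *$h, 'Wrap', $mem;
binmode $h, ':utf8';    # reaches Wrap::BINMODE, then the real handle
print $h "caf\x{e9}\n"; # reaches Wrap::PRINT
close $mem or die "close: $!";
printf "%d bytes written\n", length($out);
```

Since the layer is applied to the inner handle, buffered print honors
it; that is the behavior this sketch is meant to show.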
---
 lib/PublicInbox/LeiRediff.pm | 33 ++++++++++++++-------------------
 lib/PublicInbox/ProcessIO.pm | 18 +++++-------------
 lib/PublicInbox/Spawn.pm     |  1 +
 3 files changed, 20 insertions(+), 32 deletions(-)

diff --git a/lib/PublicInbox/LeiRediff.pm b/lib/PublicInbox/LeiRediff.pm
index b894342b..230f3e83 100644
--- a/lib/PublicInbox/LeiRediff.pm
+++ b/lib/PublicInbox/LeiRediff.pm
@@ -138,35 +138,30 @@ EOM
 	undef;
 }
 
-sub wait_requote { # OnDestroy callback
-	my ($lei, $pid, $old_1) = @_;
-	$lei->{1} = $old_1; # closes stdin of `perl -pe 's/^/> /'`
-	waitpid($pid, 0) == $pid or die "BUG(?) waitpid: \$!=$! \$?=$?";
-	$lei->child_error($?) if $?;
-}
+# awaitpid callback
+sub wait_requote { $_[1]->child_error($?) if $? }
 
-sub requote ($$) {
+sub requote ($$) { # '> ' prefix(es) lei->{1}
 	my ($lei, $pfx) = @_;
-	my $old_1 = $lei->{1};
-	my $opt = { 1 => $old_1, 2 => $lei->{2} };
+	my $opt = { 1 => $lei->{1}, 2 => $lei->{2} };
 	# $^X (perl) is overkill, but maybe there's a weird system w/o sed
-	my ($w, $pid) = popen_wr([$^X, '-pe', "s/^/$pfx/"], $lei->{env}, $opt);
-	$w->autoflush(1);
-	binmode $w, ':utf8'; # incompatible with ProcessIO due to syswrite
-	$lei->{1} = $w;
-	PublicInbox::OnDestroy->new(\&wait_requote, $lei, $pid, $old_1);
+	my $w = popen_wr([$^X, '-pe', "s/^/$pfx/"], $lei->{env}, $opt,
+			 \&wait_requote, $lei);
+	binmode $w, ':utf8';
+	$w;
 }
 
 sub extract_oids { # Eml each_part callback
 	my ($ary, $self) = @_;
+	my $lei = $self->{lei};
 	my ($p, undef, $idx) = @$ary;
-	$self->{lei}->out($p->header_obj->as_string, "\n");
+	$lei->out($p->header_obj->as_string, "\n");
 	my ($s, undef) = msg_part_text($p, $p->content_type || 'text/plain');
 	defined $s or return;
-	my $rq;
-	if ($self->{dqre} && $s =~ s/$self->{dqre}//g) { # '> ' prefix(es)
-		$rq = requote($self->{lei}, $1) if $self->{lei}->{opt}->{drq};
-	}
+
+	$self->{dqre} && $s =~ s/$self->{dqre}//g && $lei->{opt}->{drq} and
+		local $lei->{1} = requote($lei, $1);
+
 	my @top = split($PublicInbox::ViewDiff::EXTRACT_DIFFS, $s);
 	undef $s;
 	my $blobs = $self->{blobs}; # blobs to resolve
diff --git a/lib/PublicInbox/ProcessIO.pm b/lib/PublicInbox/ProcessIO.pm
index f120edd0..ea5d3e6c 100644
--- a/lib/PublicInbox/ProcessIO.pm
+++ b/lib/PublicInbox/ProcessIO.pm
@@ -7,6 +7,7 @@ package PublicInbox::ProcessIO;
 use v5.12;
 use PublicInbox::DS qw(awaitpid);
 use Symbol qw(gensym);
+use bytes qw(length);
 
 sub maybe_new {
 	my ($cls, $pid, $fh, @cb_arg) = @_;
@@ -31,25 +32,16 @@ sub TIEHANDLE {
 	$self;
 }
 
-# for IO::Uncompress::Gunzip
-sub BINMODE {
-	return binmode($_[0]->{fh}) if @_ == 1;
-	binmode $_[0]->{fh}, $_[1];
-}
+# for IO::Uncompress::Gunzip and PublicInbox::LeiRediff
+sub BINMODE { @_ == 1 ? binmode($_[0]->{fh}) : binmode($_[0]->{fh}, $_[1]) }
 
 sub READ { read($_[0]->{fh}, $_[1], $_[2], $_[3] || 0) }
 
 sub READLINE { readline($_[0]->{fh}) }
 
-sub WRITE {
-	use bytes qw(length);
-	syswrite($_[0]->{fh}, $_[1], $_[2] // length($_[1]), $_[3] // 0);
-}
+sub WRITE { syswrite($_[0]->{fh}, $_[1], $_[2] // length($_[1]), $_[3] // 0) }
 
-sub PRINT {
-	my $self = shift;
-	print { $self->{fh} } @_;
-}
+sub PRINT { print { $_[0]->{fh} } @_[1..$#_] }
 
 sub FILENO { fileno($_[0]->{fh}) }
 
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 265638fe..106f5e01 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -376,6 +376,7 @@ sub popen_rd {
 sub popen_wr {
 	my ($cmd, $env, $opt, @cb_arg) = @_;
 	pipe(local $opt->{0}, my $w) or die "pipe: $!\n";
+	$w->autoflush(1);
 	my $pid = spawn($cmd, $env, $opt);
 	PublicInbox::ProcessIO->maybe_new($pid, $w, @cb_arg)
 }

^ permalink raw reply related	[relevance 56%]

* [PATCHv2 3/9] lei: always use async `done' requests to store
  2023-10-07 21:24 47% ` [PATCH 3/9] lei: always use async `done' requests to store Eric Wong
  2023-10-08  1:58 71%   ` Eric Wong
  2023-10-08  5:49 66%   ` [PATCH 2.5/9] lei: fix implicit stdin support for pipes Eric Wong
@ 2023-10-08 18:54 48%   ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-08 18:54 UTC (permalink / raw)
  To: meta

It's safer against deadlocks and we still get proper error
reporting by passing stderr across in addition to the lei
socket.
---
 v2: no change to LeiRemote.pm, increase delay in lei-store-fail.t

 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        |  9 +++----
 lib/PublicInbox/LeiInput.pm   |  2 +-
 lib/PublicInbox/LeiStore.pm   | 17 ++++++------
 lib/PublicInbox/LeiXSearch.pm |  6 ++---
 t/lei-store-fail.t            | 51 +++++++++++++++++++++++++++++++++++
 6 files changed, 68 insertions(+), 18 deletions(-)
 create mode 100644 t/lei-store-fail.t

diff --git a/MANIFEST b/MANIFEST
index 4693cbe0..689c6bf6 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -514,6 +514,7 @@ t/lei-q-thread.t
 t/lei-refresh-mail-sync.t
 t/lei-reindex.t
 t/lei-sigpipe.t
+t/lei-store-fail.t
 t/lei-tag.t
 t/lei-up.t
 t/lei-watch.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1ba2c2a1..e2b3c0d9 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1537,12 +1537,11 @@ sub lms {
 
 sub sto_done_request {
 	my ($lei, $wq) = @_;
-	return unless $lei->{sto};
+	return unless $lei->{sto} && $lei->{sto}->{-wq_s1};
 	local $current_lei = $lei;
-	my $sock = $wq ? $wq->{lei_sock} : undef;
-	$sock //= $lei->{sock};
-	my @io;
-	push(@io, $sock) if $sock; # async wait iff possible
+	my $s = ($wq ? $wq->{lei_sock} : undef) // $lei->{sock};
+	my $errfh = $lei->{2} // *STDERR{GLOB};
+	my @io = $s ? ($errfh, $s) : ($errfh);
 	eval { $lei->{sto}->wq_io_do('done', \@io) };
 	warn($@) if $@;
 }
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 91383265..93f8b6b8 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -467,7 +467,7 @@ sub process_inputs {
 	}
 	# always commit first, even on error partial work is acceptable for
 	# lei <import|tag|convert>
-	my $wait = $self->{lei}->{sto}->wq_do('done') if $self->{lei}->{sto};
+	$self->{lei}->sto_done_request;
 	$self->{lei}->fail($err) if $err;
 }
 
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 0cb78f79..e19ec88e 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -582,19 +582,20 @@ sub xchg_stderr {
 }
 
 sub done {
-	my ($self, $sock_ref) = @_;
-	my $err = '';
+	my ($self) = @_;
+	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_done_request
+	my @err;
 	if (my $im = delete($self->{im})) {
 		eval { $im->done };
-		if ($@) {
-			$err .= "import done: $@\n";
-			warn $err;
-		}
+		push(@err, "E: import done: $@\n") if $@;
 	}
 	delete $self->{lms};
-	$self->{priv_eidx}->done; # V2Writable::done
+	eval { $self->{priv_eidx}->done }; # V2Writable::done
+	push(@err, "E: priv_eidx done: $@\n") if $@;
+	print { $errfh // *STDERR{GLOB} } @err;
+	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
 	xchg_stderr($self);
-	die $err if $err;
+	die @err if @err;
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1caa9d06..4077191f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -358,9 +358,7 @@ sub query_remote_mboxrd {
 		$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_remote_eml, $self,
 						$lei, $each_smsg);
-		if (delete($self->{-sto_imported})) {
-			my $wait = $self->{import_sto}->wq_do('done');
-		}
+		$lei->sto_done_request if delete($self->{-sto_imported});
 		$reap_curl->join;
 		my $nr = delete $lei->{-nr_remote_eml} // 0;
 		if ($? == 0) {
@@ -402,7 +400,7 @@ sub query_done { # EOF callback for main daemon
 	delete $lei->{lxs};
 	($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
 		warn "BUG: {sto} missing with --mail-sync";
-	$lei->sto_done_request if $lei->{sto};
+	$lei->sto_done_request;
 	if (my $v2w = delete $lei->{v2w}) {
 		my $wait = $v2w->wq_do('done'); # may die
 		$v2w->wq_close;
diff --git a/t/lei-store-fail.t b/t/lei-store-fail.t
new file mode 100644
index 00000000..fb0f2b75
--- /dev/null
+++ b/t/lei-store-fail.t
@@ -0,0 +1,51 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# ensure we detect errors in lei/store
+use v5.12;
+use PublicInbox::TestCommon;
+use autodie qw(pipe open close seek);
+use Fcntl qw(SEEK_SET);
+use File::Path qw(remove_tree);
+
+my $start_home = $ENV{HOME}; # bug guard
+test_lei(sub {
+	lei_ok qw(import -q t/plack-qp.eml); # start the store
+	my $opt;
+	pipe($opt->{0}, my $in_w);
+	open $opt->{1}, '+>', undef;
+	open $opt->{2}, '+>', undef;
+	$opt->{-CLOFORK} = [ $in_w ];
+	my $cmd = [ qw(lei import -q -F mboxrd) ];
+	my $tp = start_script($cmd, undef, $opt);
+	close $opt->{0};
+	$in_w->autoflush(1);
+	for (1..500) { # need to fill up 64k read buffer
+		print $in_w <<EOM or xbail "print $!";
+From k\@y Fri Oct  2 00:00:00 1993
+From: <k\@example.com>
+Date: Sat, 02 Oct 2010 00:00:00 +0000
+Subject: hi
+Message-ID: <$_\@t>
+
+will this save?
+EOM
+	}
+	tick 0.2; # XXX ugh, this is so hacky
+
+	# make sto_done_request fail:
+	remove_tree("$ENV{HOME}/.local/share/lei/store");
+	# subsequent lei commands are undefined behavior,
+	# but we need to make sure the current lei command fails:
+
+	close $in_w; # should trigger ->done
+	$tp->join;
+	isnt($?, 0, 'lei import error code set on failure');
+	is(-s $opt->{1}, 0, 'nothing in stdout');
+	isnt(-s $opt->{2}, 0, 'stderr not empty');
+	seek($opt->{2}, 0, SEEK_SET);
+	my @err = readline($opt->{2});
+	ok(grep(!/^#/, @err), 'noted error in stderr') or diag "@err";
+});
+
+done_testing;

^ permalink raw reply related	[relevance 48%]

* [PATCH 2.5/9] lei: fix implicit stdin support for pipes
  2023-10-07 21:24 47% ` [PATCH 3/9] lei: always use async `done' requests to store Eric Wong
  2023-10-08  1:58 71%   ` Eric Wong
@ 2023-10-08  5:49 66%   ` Eric Wong
  2023-10-08 18:54 48%   ` [PATCHv2 3/9] lei: always use async `done' requests to store Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-08  5:49 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +++ b/t/lei-store-fail.t

> +	my $cmd = [ qw(lei import -q -F mboxrd) ];
> +	my $tp = start_script($cmd, undef, $opt);

Of course, omitting `-' or `--stdin' only worked on Linux and
NetBSD, not on other BSDs.

-------8<------
Subject: [PATCH] lei: fix implicit stdin support for pipes

st_mode permission bits can't be used to determine whether a file
or pipe we have on stdin is readable.  Writable regular files
can be opened O_RDONLY, and permission bits for pipes are
inconsistent across platforms.

On FreeBSD, OpenBSD, and Dragonfly, only the S_IFIFO bit is set
in st_mode and none of the permission bits are set.  Linux and
NetBSD have both the read and write permission bits set for both
ends of the pipe, so they're just as inaccurate but allowed
the feature to work before this change.

For now, we'll just assume our users know that stdin is intended
for input and consider any pipe or regular file to be readable.

If we were to be pedantic, we'd check O_RDONLY or O_RDWR
description flags via the F_GETFL fcntl(2) op to determine if a
pipe or socket is readable.  However, I don't think it's worth
the code to do so.
---
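For reference, the pedantic check mentioned above could look roughly
like the sketch below.  It is not part of this patch; `fh_readable' is
a hypothetical helper name, and the result for each end of a pipe may
differ between platforms, so no output is claimed here.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL O_ACCMODE O_RDONLY O_RDWR);

# Hypothetical helper (not in lei): true if $fh was opened for reading,
# judged by the access mode of the open file description, not st_mode.
sub fh_readable {
	my ($fh) = @_;
	my $fl = fcntl($fh, F_GETFL, 0) // return; # undef on error
	my $acc = ($fl + 0) & O_ACCMODE; # "0 but true" numifies to 0
	($acc == O_RDONLY) || ($acc == O_RDWR);
}

pipe(my $r, my $w) or die "pipe: $!";
print 'read end: ', (fh_readable($r) ? 'readable' : 'not readable'), "\n";
print 'write end: ', (fh_readable($w) ? 'readable' : 'not readable'), "\n";
```

This trades one fcntl(2) call per invocation for not having to trust
the user, which is the cost the commit message decides against.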
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f00b2465..1ba2c2a1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -747,7 +747,7 @@ sub optparse ($$$) {
 					# w/o args means stdin
 					if ($sw eq 'stdin' && !@$argv &&
 							(-p $self->{0} ||
-							 -f _) && -r _) {
+							 -f _)) {
 						$OPT->{stdin} //= 1;
 					}
 					$ok = defined($OPT->{$sw}) and last;

^ permalink raw reply related	[relevance 66%]

* Re: [PATCH 3/9] lei: always use async `done' requests to store
  2023-10-07 21:24 47% ` [PATCH 3/9] lei: always use async `done' requests to store Eric Wong
@ 2023-10-08  1:58 71%   ` Eric Wong
  2023-10-08  5:49 66%   ` [PATCH 2.5/9] lei: fix implicit stdin support for pipes Eric Wong
  2023-10-08 18:54 48%   ` [PATCHv2 3/9] lei: always use async `done' requests to store Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-08  1:58 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> diff --git a/lib/PublicInbox/LeiRemote.pm b/lib/PublicInbox/LeiRemote.pm
> index 54750062..15013baa 100644
> --- a/lib/PublicInbox/LeiRemote.pm
> +++ b/lib/PublicInbox/LeiRemote.pm
> @@ -52,7 +52,7 @@ sub mset {
>  	$self->{smsg} = [];
>  	$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
>  	PublicInbox::MboxReader->mboxrd($fh, \&_each_mboxrd_eml, $self);
> -	my $wait = $self->{lei}->{sto}->wq_do('done');
> +	$self->{lei}->sto_done_request;

That's usually not the normal lei/store, but the {tmp_sto} one
from LeiRediff.  So it must be synchronous in that case because
we make multiple queries there.

So I'll revert that hunk.

And ugh, the new lei-store-fail.t test is so nasty; I don't
think a 100ms delay is enough for some systems...

diff --git a/t/lei-store-fail.t b/t/lei-store-fail.t
index e9ad779f..fb0f2b75 100644
--- a/t/lei-store-fail.t
+++ b/t/lei-store-fail.t
@@ -31,7 +31,7 @@ Message-ID: <$_\@t>
 will this save?
 EOM
 	}
-	tick 0.1; # XXX ugh, this is so hacky
+	tick 0.2; # XXX ugh, this is so hacky
 
 	# make sto_done_request fail:
 	remove_tree("$ENV{HOME}/.local/share/lei/store");

^ permalink raw reply related	[relevance 71%]

* [PATCH 2/9] lei: do not issue sto->done if socket is inactive
  @ 2023-10-07 21:24 71% ` Eric Wong
  2023-10-07 21:24 47% ` [PATCH 3/9] lei: always use async `done' requests to store Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-10-07 21:24 UTC (permalink / raw)
  To: meta

This fixes attempts to use an undefined value as an ARRAY reference
in PublicInbox::IPC::wq_io_do.
---
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f8bcd43d..f00b2465 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1296,7 +1296,7 @@ sub can_stay_alive { # PublicInbox::DS::post_loop_do cb
 			my $lne = delete($cfg->{-lei_note_event});
 			$lne->wq_close if $lne;
 			my $sto = delete($cfg->{-lei_store}) // next;
-			eval { $sto->wq_io_do('done') };
+			eval { $sto->wq_do('done') if $sto->{-wq_s1} };
 			warn "E: $@ (dropping store for $cfg->{-f})" if $@;
 			$sto->wq_close;
 		}

^ permalink raw reply related	[relevance 71%]

* [PATCH 3/9] lei: always use async `done' requests to store
    2023-10-07 21:24 71% ` [PATCH 2/9] lei: do not issue sto->done if socket is inactive Eric Wong
@ 2023-10-07 21:24 47% ` Eric Wong
  2023-10-08  1:58 71%   ` Eric Wong
                     ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Eric Wong @ 2023-10-07 21:24 UTC (permalink / raw)
  To: meta

It's safer against deadlocks and we still get proper error
reporting by passing stderr across in addition to the lei
socket.
---
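The error handling in the new LeiStore::done boils down to: run every
step under eval, collect each failure, write them all to the error
handle that was passed in, then die once.  A standalone sketch of that
pattern follows; `finish_all' and the step labels are hypothetical and
nothing here touches the real IPC code or the lei socket.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the `done' pattern: run every cleanup step,
# remember each failure, report them all, then fail once at the end.
sub finish_all {
	my ($errfh, @steps) = @_; # @steps: [ label, coderef ] pairs
	my @err;
	for my $step (@steps) {
		my ($label, $cb) = @$step;
		eval { $cb->() };
		push(@err, "E: $label: $@") if $@;
	}
	print { $errfh // \*STDERR } @err; # prints nothing if @err is empty
	die @err if @err;
	1;
}

my $buf = '';
open(my $errfh, '>', \$buf) or die "open: $!";
my @ran;
eval {
	finish_all($errfh,
		[ 'import done', sub { push @ran, 1; die "disk full\n" } ],
		[ 'index done', sub { push @ran, 2 } ]);
};
my $died = $@;
close $errfh or die "close: $!";
```

The point of collecting into @err instead of dying at the first
failure is that later steps still get a chance to run.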
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        |  9 +++----
 lib/PublicInbox/LeiInput.pm   |  2 +-
 lib/PublicInbox/LeiRemote.pm  |  2 +-
 lib/PublicInbox/LeiStore.pm   | 17 ++++++------
 lib/PublicInbox/LeiXSearch.pm |  6 ++---
 t/lei-store-fail.t            | 51 +++++++++++++++++++++++++++++++++++
 7 files changed, 69 insertions(+), 19 deletions(-)
 create mode 100644 t/lei-store-fail.t

diff --git a/MANIFEST b/MANIFEST
index 4693cbe0..689c6bf6 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -514,6 +514,7 @@ t/lei-q-thread.t
 t/lei-refresh-mail-sync.t
 t/lei-reindex.t
 t/lei-sigpipe.t
+t/lei-store-fail.t
 t/lei-tag.t
 t/lei-up.t
 t/lei-watch.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index f00b2465..4f840e89 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1537,12 +1537,11 @@ sub lms {
 
 sub sto_done_request {
 	my ($lei, $wq) = @_;
-	return unless $lei->{sto};
+	return unless $lei->{sto} && $lei->{sto}->{-wq_s1};
 	local $current_lei = $lei;
-	my $sock = $wq ? $wq->{lei_sock} : undef;
-	$sock //= $lei->{sock};
-	my @io;
-	push(@io, $sock) if $sock; # async wait iff possible
+	my $s = ($wq ? $wq->{lei_sock} : undef) // $lei->{sock};
+	my $errfh = $lei->{2} // *STDERR{GLOB};
+	my @io = $s ? ($errfh, $s) : ($errfh);
 	eval { $lei->{sto}->wq_io_do('done', \@io) };
 	warn($@) if $@;
 }
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 91383265..93f8b6b8 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -467,7 +467,7 @@ sub process_inputs {
 	}
 	# always commit first, even on error partial work is acceptable for
 	# lei <import|tag|convert>
-	my $wait = $self->{lei}->{sto}->wq_do('done') if $self->{lei}->{sto};
+	$self->{lei}->sto_done_request;
 	$self->{lei}->fail($err) if $err;
 }
 
diff --git a/lib/PublicInbox/LeiRemote.pm b/lib/PublicInbox/LeiRemote.pm
index 54750062..15013baa 100644
--- a/lib/PublicInbox/LeiRemote.pm
+++ b/lib/PublicInbox/LeiRemote.pm
@@ -52,7 +52,7 @@ sub mset {
 	$self->{smsg} = [];
 	$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
 	PublicInbox::MboxReader->mboxrd($fh, \&_each_mboxrd_eml, $self);
-	my $wait = $self->{lei}->{sto}->wq_do('done');
+	$self->{lei}->sto_done_request;
 	$ar->join;
 	$lei->child_error($?) if $?;
 	$self; # we are the mset (and $ibx, and $self)
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 0cb78f79..e19ec88e 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -582,19 +582,20 @@ sub xchg_stderr {
 }
 
 sub done {
-	my ($self, $sock_ref) = @_;
-	my $err = '';
+	my ($self) = @_;
+	my ($errfh, $lei_sock) = @$self{0, 1}; # via sto_done_request
+	my @err;
 	if (my $im = delete($self->{im})) {
 		eval { $im->done };
-		if ($@) {
-			$err .= "import done: $@\n";
-			warn $err;
-		}
+		push(@err, "E: import done: $@\n") if $@;
 	}
 	delete $self->{lms};
-	$self->{priv_eidx}->done; # V2Writable::done
+	eval { $self->{priv_eidx}->done }; # V2Writable::done
+	push(@err, "E: priv_eidx done: $@\n") if $@;
+	print { $errfh // *STDERR{GLOB} } @err;
+	send($lei_sock, 'child_error 256', 0) if @err && $lei_sock;
 	xchg_stderr($self);
-	die $err if $err;
+	die @err if @err;
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 1caa9d06..4077191f 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -358,9 +358,7 @@ sub query_remote_mboxrd {
 		$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
 		PublicInbox::MboxReader->mboxrd($fh, \&each_remote_eml, $self,
 						$lei, $each_smsg);
-		if (delete($self->{-sto_imported})) {
-			my $wait = $self->{import_sto}->wq_do('done');
-		}
+		$lei->sto_done_request if delete($self->{-sto_imported});
 		$reap_curl->join;
 		my $nr = delete $lei->{-nr_remote_eml} // 0;
 		if ($? == 0) {
@@ -402,7 +400,7 @@ sub query_done { # EOF callback for main daemon
 	delete $lei->{lxs};
 	($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
 		warn "BUG: {sto} missing with --mail-sync";
-	$lei->sto_done_request if $lei->{sto};
+	$lei->sto_done_request;
 	if (my $v2w = delete $lei->{v2w}) {
 		my $wait = $v2w->wq_do('done'); # may die
 		$v2w->wq_close;
diff --git a/t/lei-store-fail.t b/t/lei-store-fail.t
new file mode 100644
index 00000000..e9ad779f
--- /dev/null
+++ b/t/lei-store-fail.t
@@ -0,0 +1,51 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# ensure we detect errors in lei/store
+use v5.12;
+use PublicInbox::TestCommon;
+use autodie qw(pipe open close seek);
+use Fcntl qw(SEEK_SET);
+use File::Path qw(remove_tree);
+
+my $start_home = $ENV{HOME}; # bug guard
+test_lei(sub {
+	lei_ok qw(import -q t/plack-qp.eml); # start the store
+	my $opt;
+	pipe($opt->{0}, my $in_w);
+	open $opt->{1}, '+>', undef;
+	open $opt->{2}, '+>', undef;
+	$opt->{-CLOFORK} = [ $in_w ];
+	my $cmd = [ qw(lei import -q -F mboxrd) ];
+	my $tp = start_script($cmd, undef, $opt);
+	close $opt->{0};
+	$in_w->autoflush(1);
+	for (1..500) { # need to fill up 64k read buffer
+		print $in_w <<EOM or xbail "print $!";
+From k\@y Fri Oct  2 00:00:00 1993
+From: <k\@example.com>
+Date: Sat, 02 Oct 2010 00:00:00 +0000
+Subject: hi
+Message-ID: <$_\@t>
+
+will this save?
+EOM
+	}
+	tick 0.1; # XXX ugh, this is so hacky
+
+	# make sto_done_request fail:
+	remove_tree("$ENV{HOME}/.local/share/lei/store");
+	# subsequent lei commands are undefined behavior,
+	# but we need to make sure the current lei command fails:
+
+	close $in_w; # should trigger ->done
+	$tp->join;
+	isnt($?, 0, 'lei import error code set on failure');
+	is(-s $opt->{1}, 0, 'nothing in stdout');
+	isnt(-s $opt->{2}, 0, 'stderr not empty');
+	seek($opt->{2}, 0, SEEK_SET);
+	my @err = readline($opt->{2});
+	ok(grep(!/^#/, @err), 'noted error in stderr') or diag "@err";
+});
+
+done_testing;

^ permalink raw reply related	[relevance 47%]

* [PATCH 20/21] lei: document and local-ize $OPT hashref
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
                   ` (5 preceding siblings ...)
  2023-10-04  3:49 69% ` [PATCH 11/21] lei: keep signals blocked on daemon shutdown Eric Wong
@ 2023-10-04  3:49 67% ` Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

This variable needs to be visible to a callback running inside
Getopt::Long, but we don't need to keep it around after
LEI->optparse runs.
---
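For anybody unfamiliar with the idiom: `local' on a package variable
makes the value visible to callbacks for the duration of the call and
restores the old value on return.  A minimal sketch, assuming a
made-up `parse_args' and `on_limit' instead of the real optparse and
opt_dash:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

our $OPT; # only meaningful while parse_args() is running

sub on_limit { # callback invoked from inside Getopt::Long
	my ($name, $val) = @_;
	$OPT->{$name} = $val; # $name stringifies to the option name
}

sub parse_args { # hypothetical stand-in for LEI->optparse
	my ($argv) = @_;
	local $OPT = {}; # previous value is restored on return
	GetOptionsFromArray($argv, 'limit|n=i' => \&on_limit) or return;
	$OPT; # the hashref itself outlives the `local'
}

my $opt = parse_args([ qw(-n 5) ]);
```

After parse_args returns, $OPT is back to undef in this sketch, so
nothing stale is left around for the next caller in a long-lived
process.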
 lib/PublicInbox/LEI.pm | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index a5a6d321..5f3147bf 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -37,15 +37,16 @@ $GLP->configure(qw(gnu_getopt no_ignore_case auto_abbrev));
 my $GLP_PASS = Getopt::Long::Parser->new;
 $GLP_PASS->configure(qw(gnu_getopt no_ignore_case auto_abbrev pass_through));
 
-our %PATH2CFG; # persistent for socket daemon
-our $MDIR2CFGPATH; # /path/to/maildir => { /path/to/config => [ ino watches ] }
+our (%PATH2CFG, # persistent for socket daemon
+$MDIR2CFGPATH, # /path/to/maildir => { /path/to/config => [ ino watches ] }
+$OPT, # shared between optparse and opt_dash callback (for Getopt::Long)
+);
 
 # TBD: this is a documentation mechanism to show a subcommand
 # (may) pass options through to another command:
 sub pass_through { $GLP_PASS }
 
-my $OPT;
-sub opt_dash ($$) {
+sub opt_dash ($$) { # callback runs inside optparse
 	my ($spec, $re_str) = @_; # 'limit|n=i', '([0-9]+)'
 	my ($key) = ($spec =~ m/\A([a-z]+)/g);
 	my $cb = sub { # Getopt::Long "<>" catch-all handler
@@ -691,7 +692,7 @@ sub optparse ($$$) {
 	# allow _complete --help to complete, not show help
 	return 1 if substr($cmd, 0, 1) eq '_';
 	$self->{cmd} = $cmd;
-	$OPT = $self->{opt} //= {};
+	local $OPT = $self->{opt} //= {};
 	my $info = $CMD{$cmd} // [ '[...]' ];
 	my ($proto, undef, @spec) = @$info;
 	my $glp = ref($spec[-1]) eq ref($GLP) ? pop(@spec) : $GLP;

^ permalink raw reply related	[relevance 67%]

* [PATCH 10/21] lei: reuse PublicInbox::Config::noop
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
                   ` (3 preceding siblings ...)
  2023-10-04  3:49 43% ` [PATCH 08/21] lei: get rid of l2m_progress PktOp callback Eric Wong
@ 2023-10-04  3:49 71% ` Eric Wong
  2023-10-04  3:49 69% ` [PATCH 11/21] lei: keep signals blocked on daemon shutdown Eric Wong
  2023-10-04  3:49 67% ` [PATCH 20/21] lei: document and local-ize $OPT hashref Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

No need to define our own empty `noop' sub when PublicInbox::Config
already has one and is loaded anyway.
---
 lib/PublicInbox/LEI.pm | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index fba4edf3..c9ad46e2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1217,8 +1217,6 @@ sub event_step_init {
 	};
 }
 
-sub noop {}
-
 sub oldset { $oldset }
 
 sub dump_and_clear_log {
@@ -1364,15 +1362,9 @@ sub lazy_start {
 			$lis->close; # DS::close
 		};
 	};
-	my $sig = {
-		CHLD => \&PublicInbox::DS::enqueue_reap,
-		QUIT => $quit,
-		INT => $quit,
-		TERM => $quit,
-		HUP => \&noop,
-		USR1 => \&noop,
-		USR2 => \&noop,
-	};
+	my $sig = { CHLD => \&PublicInbox::DS::enqueue_reap };
+	$sig->{$_} = $quit for qw(QUIT INT TERM);
+	$sig->{$_} = \&PublicInbox::Config::noop for qw(HUP USR1 USR2);
 	# for EVFILT_SIGNAL and signalfd behavioral difference:
 	my @kq_ign = eval { require PublicInbox::DSKQXS } ? keys(%$sig) : ();
 

^ permalink raw reply related	[relevance 71%]

* [PATCH 11/21] lei: keep signals blocked on daemon shutdown
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
                   ` (4 preceding siblings ...)
  2023-10-04  3:49 71% ` [PATCH 10/21] lei: reuse PublicInbox::Config::noop Eric Wong
@ 2023-10-04  3:49 69% ` Eric Wong
  2023-10-04  3:49 67% ` [PATCH 20/21] lei: document and local-ize $OPT hashref Eric Wong
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

Since we completely shut down all workers before exiting,
we no longer have to care about missing SIGCHLD wakeups
during shutdown.
---
 lib/PublicInbox/LEI.pm | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index c9ad46e2..d611f5c3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1365,9 +1365,6 @@ sub lazy_start {
 	my $sig = { CHLD => \&PublicInbox::DS::enqueue_reap };
 	$sig->{$_} = $quit for qw(QUIT INT TERM);
 	$sig->{$_} = \&PublicInbox::Config::noop for qw(HUP USR1 USR2);
-	# for EVFILT_SIGNAL and signalfd behavioral difference:
-	my @kq_ign = eval { require PublicInbox::DSKQXS } ? keys(%$sig) : ();
-
 	require PublicInbox::DirIdle;
 	local $dir_idle = PublicInbox::DirIdle->new(sub {
 		# just rely on wakeup to hit post_loop_do
@@ -1390,16 +1387,6 @@ sub lazy_start {
 	# $daemon pipe to `lei' closed, main loop begins:
 	eval { PublicInbox::DS::event_loop($sig, $oldset) };
 	warn "event loop error: $@\n" if $@;
-
-	# EVFILT_SIGNAL will get a duplicate of all the signals it was sent
-	local @SIG{@kq_ign} = map 'IGNORE', @kq_ign;
-	PublicInbox::DS::sig_setmask($oldset) if @kq_ign;
-
-	# exit() may trigger waitpid via various DESTROY, ensure interruptible
-	local $SIG{TERM} = sub { exit(POSIX::SIGTERM + 128) };
-	local $SIG{INT} = sub { exit(POSIX::SIGINT + 128) };
-	local $SIG{QUIT} = sub { exit(POSIX::SIGQUIT + 128) };
-	PublicInbox::DS::sig_setmask($oldset) if !@kq_ign;
 	dump_and_clear_log();
 	exit($exit_code // 0);
 }

^ permalink raw reply related	[relevance 69%]

* [PATCH 05/21] lei: close DirIdle (inotify) early at daemon shutdown
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
  2023-10-04  3:49 68% ` [PATCH 01/21] lei: drop stores explicitly at daemon shutdown Eric Wong
@ 2023-10-04  3:49 65% ` Eric Wong
  2023-10-04  3:49 33% ` [PATCH 07/21] lei: do_env combines fchdir and local Eric Wong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

We don't want FS activity to delay lei-daemon shutdown.
---
 lib/PublicInbox/DirIdle.pm | 12 +++++++++---
 lib/PublicInbox/LEI.pm     |  5 +++++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/DirIdle.pm b/lib/PublicInbox/DirIdle.pm
index af99811c..de6f229b 100644
--- a/lib/PublicInbox/DirIdle.pm
+++ b/lib/PublicInbox/DirIdle.pm
@@ -68,10 +68,16 @@ sub rm_watches {
 	}
 }
 
+sub close {
+	my ($self) = @_;
+	delete $self->{cb};
+	$self->SUPER::close; # if using real kevent/inotify
+}
+
 sub event_step {
 	my ($self) = @_;
-	my $cb = $self->{cb};
-	local $PublicInbox::DS::in_loop = 0; # waitpid() synchronously
+	my $cb = $self->{cb} or return;
+	local $PublicInbox::DS::in_loop = 0; # waitpid() synchronously (FIXME)
 	eval {
 		my @events = $self->{inot}->read; # Linux::Inotify2->read
 		$cb->($_) for @events;
@@ -83,7 +89,7 @@ sub force_close {
 	my ($self) = @_;
 	my $inot = delete $self->{inot} // return;
 	if ($inot->can('fh')) { # Linux::Inotify2 2.3+
-		close($inot->fh) or warn "CLOSE ERROR: $!";
+		CORE::close($inot->fh) or warn "CLOSE ERROR: $!";
 	} elsif ($inot->isa('Linux::Inotify2')) {
 		require PublicInbox::LI2Wrap;
 		PublicInbox::LI2Wrap::wrapclose($inot);
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 74a7f5b9..8362800d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1285,8 +1285,11 @@ sub can_stay_alive { # PublicInbox::DS::post_loop_do cb
 	}
 	return 1 if defined($$path);
 	my $n = PublicInbox::DS::close_non_busy() or do {
+		eval 'PublicInbox::LeiNoteEvent::flush_task()';
 		# drop stores only if no clients
 		for my $cfg (values %PATH2CFG) {
+			my $lne = delete($cfg->{-lei_note_event});
+			$lne->wq_close if $lne;
 			my $sto = delete($cfg->{-lei_store}) // next;
 			eval { $sto->wq_io_do('done') };
 			warn "E: $@ (dropping store for $cfg->{-f})" if $@;
@@ -1346,6 +1349,8 @@ sub lazy_start {
 		my (undef, $eof_p) = PublicInbox::PktOp->pair;
 		sub {
 			$exit_code //= eval("POSIX::SIG$_[0] + 128") if @_;
+			$dir_idle->close if $dir_idle; # EPOLL_CTL_DEL
+			$dir_idle = undef; # let RC take care of it
 			eval 'PublicInbox::LeiNoteEvent::flush_task()';
 			my $lis = $pil or exit($exit_code // 0);
 			# closing eof_p triggers \&noop wakeup

^ permalink raw reply related	[relevance 65%]

* [PATCH 08/21] lei: get rid of l2m_progress PktOp callback
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
                   ` (2 preceding siblings ...)
  2023-10-04  3:49 33% ` [PATCH 07/21] lei: do_env combines fchdir and local Eric Wong
@ 2023-10-04  3:49 43% ` Eric Wong
  2023-10-04  3:49 71% ` [PATCH 10/21] lei: reuse PublicInbox::Config::noop Eric Wong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

We already have an ->incr callback we can enhance to support
multiple counters with a single request.  Furthermore, we can
just flatten the object graph by storing counters directly in
the $lei object itself to reduce hash lookups.
---
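To illustrate the multi-counter ->incr in isolation, here is a small
sketch using a plain hashref in place of the $lei object.  The counter
names mirror the ones in this patch; the reporting loop is only
loosely modeled on the DESTROY hunk below and prints instead of
calling child_error.

```perl
use strict;
use warnings;

# Same shape as the new `incr': consume (field, count) pairs from @_
sub incr {
	my $obj = shift;
	while (my ($f, $n) = splice(@_, 0, 2)) { $obj->{$f} += $n }
}

my $obj = {}; # stands in for $lei
incr($obj, -nr_write => 3, -nr_seen => 5); # two counters, one request
incr($obj, -nr_seen => 2);

# report every "-nr_*" counter, stripping the prefix for display
for my $k (sort(grep(/\A-nr_/, keys %$obj))) {
	my $nr = $obj->{$k};
	substr($k, 0, length('-nr_'), '');
	print "$nr $k messages\n";
}
```

One request carrying several pairs replaces a dedicated callback per
counter, which is what lets l2m_progress go away.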
 lib/PublicInbox/LEI.pm        | 13 ++++++-------
 lib/PublicInbox/LeiConvert.pm |  7 ++++---
 lib/PublicInbox/LeiTag.pm     |  6 +++---
 lib/PublicInbox/LeiToMail.pm  | 20 ++++++++++----------
 lib/PublicInbox/LeiXSearch.pm | 15 ++++-----------
 t/lei-tag.t                   |  3 +++
 6 files changed, 30 insertions(+), 34 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 3408551b..fba4edf3 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -628,8 +628,8 @@ sub pkt_op_pair {
 }
 
 sub incr {
-	my ($self, $field, $nr) = @_;
-	$self->{counters}->{$field} += $nr;
+	my $lei = shift;
+	while (my ($f, $n) = splice(@_, 0, 2)) { $lei->{$f} += $n }
 }
 
 sub pkt_ops {
@@ -1418,11 +1418,10 @@ sub busy { 1 } # prevent daemon-shutdown if client is connected
 # can immediately reread it
 sub DESTROY {
 	my ($self) = @_;
-	if (my $counters = delete $self->{counters}) {
-		for my $k (sort keys %$counters) {
-			my $nr = $counters->{$k};
-			$self->child_error(0, "$nr $k messages");
-		}
+	for my $k (sort(grep(/\A-nr_/, keys %$self))) {
+		my $nr = $self->{$k};
+		substr($k, 0, length('-nr_'), '');
+		$self->child_error(0, "$nr $k messages");
 	}
 	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 1acd4558..22aba81a 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -34,9 +34,10 @@ sub process_inputs { # via wq_do
 	$self->SUPER::process_inputs;
 	my $lei = $self->{lei};
 	delete $lei->{1};
+	my $l2m = delete $self->{l2m};
 	delete $self->{wcb}; # commit
-	my $nr_w = delete($lei->{-nr_write}) // 0;
-	my $d = (delete($lei->{-nr_seen}) // 0) - $nr_w;
+	my $nr_w = delete($l2m->{-nr_write}) // 0;
+	my $d = (delete($l2m->{-nr_seen}) // 0) - $nr_w;
 	$d = $d ? " ($d duplicates)" : '';
 	$lei->qerr("# converted $nr_w messages$d");
 }
@@ -64,7 +65,7 @@ sub ipc_atfork_child {
 	my ($self) = @_;
 	my $lei = $self->{lei};
 	$lei->_lei_atfork_child;
-	my $l2m = delete $lei->{l2m};
+	my $l2m = $lei->{l2m};
 	if (my $net = $lei->{net}) { # may prompt user once
 		$net->{mics_cached} = $net->imap_common_init($lei);
 		$net->{nn_cached} = $net->nntp_common_init($lei);
diff --git a/lib/PublicInbox/LeiTag.pm b/lib/PublicInbox/LeiTag.pm
index 76bd2d70..320b0355 100644
--- a/lib/PublicInbox/LeiTag.pm
+++ b/lib/PublicInbox/LeiTag.pm
@@ -15,7 +15,7 @@ sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 		$self->{lei}->{sto}->wq_do('update_xvmd', $xoids, $eml,
 						$self->{lei}->{vmd_mod});
 	} else {
-		++$self->{unimported};
+		++$self->{-nr_unimported};
 	}
 }
 
@@ -40,8 +40,8 @@ sub lei_tag { # the "lei tag" method
 
 sub note_unimported {
 	my ($self) = @_;
-	my $n = $self->{unimported} or return;
-	$self->{lei}->{pkt_op_p}->pkt_do('incr', 'unimported', $n);
+	my $n = $self->{-nr_unimported} or return;
+	$self->{lei}->{pkt_op_p}->pkt_do('incr', -nr_unimported => $n);
 }
 
 sub ipc_atfork_child {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index b9f28ee4..f239da82 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -195,7 +195,7 @@ sub _mbox_write_cb ($$) {
 	sub { # for git_to_mail
 		my ($buf, $smsg, $eml) = @_;
 		$eml //= PublicInbox::Eml->new($buf);
-		++$lei->{-nr_seen};
+		++$self->{-nr_seen};
 		return if $dedupe->is_dup($eml, $smsg);
 		$lse->xsmsg_vmd($smsg) if $lse;
 		$smsg->{-recent} = 1 if $set_recent;
@@ -206,7 +206,7 @@ sub _mbox_write_cb ($$) {
 			my $lk = $ovv->lock_for_scope;
 			$lei->out($$buf);
 		}
-		++$lei->{-nr_write};
+		++$self->{-nr_write};
 	}
 }
 
@@ -291,7 +291,7 @@ sub _maildir_write_cb ($$) {
 		my ($bref, $smsg, $eml) = @_;
 		$dst // return $lei->fail; # dst may be undef-ed in last run
 
-		++$lei->{-nr_seen};
+		++$self->{-nr_seen};
 		return if $dedupe && $dedupe->is_dup($eml //
 						PublicInbox::Eml->new($$bref),
 						$smsg);
@@ -299,7 +299,7 @@ sub _maildir_write_cb ($$) {
 		my $n = _buf2maildir($dst, $bref // \($eml->as_string),
 					$smsg, $dir);
 		$lms->set_src($smsg->oidbin, $out, $n) if $lms;
-		++$lei->{-nr_write};
+		++$self->{-nr_write};
 	}
 }
 
@@ -322,7 +322,7 @@ EOM
 		my ($bref, $smsg, $eml) = @_;
 		$mic // return $lei->fail; # mic may be undef-ed in last run
 
-		++$lei->{-nr_seen};
+		++$self->{-nr_seen};
 		return if $dedupe && $dedupe->is_dup($eml //
 						PublicInbox::Eml->new($$bref),
 						$smsg);
@@ -335,7 +335,7 @@ EOM
 		# imap_append returns UID if IMAP server has UIDPLUS extension
 		($lms && $uid =~ /\A[0-9]+\z/) and
 			$lms->set_src($smsg->oidbin, $$uri, $uid + 0);
-		++$lei->{-nr_write};
+		++$self->{-nr_write};
 	}
 }
 
@@ -366,10 +366,10 @@ sub _v2_write_cb ($$) {
 	sub { # for git_to_mail
 		my ($bref, $smsg, $eml) = @_;
 		$eml //= PublicInbox::Eml->new($bref);
-		++$lei->{-nr_seen};
+		++$self->{-nr_seen};
 		return if $dedupe && $dedupe->is_dup($eml, $smsg);
 		$lei->{v2w}->wq_do('add', $eml); # V2Writable->add
-		++$lei->{-nr_write};
+		++$self->{-nr_write};
 	}
 }
 
@@ -796,11 +796,11 @@ sub wq_atexit_child {
 	local $PublicInbox::DS::in_loop = 0; # waitpid synchronously
 	my $lei = $self->{lei};
 	$lei->{ale}->git->async_wait_all;
-	my ($nr_w, $nr_s) = delete(@$lei{qw(-nr_write -nr_seen)});
+	my ($nr_w, $nr_s) = delete(@$self{qw(-nr_write -nr_seen)});
 	delete $self->{wcb};
 	(($nr_w //= 0) + ($nr_s //= 0)) or return;
 	return if $lei->{early_mua} || !$lei->{-progress} || !$lei->{pkt_op_p};
-	$lei->{pkt_op_p}->pkt_do('l2m_progress', $nr_w, $nr_s);
+	$lei->{pkt_op_p}->pkt_do('incr', -nr_write => $nr_w, -nr_seen => $nr_s)
 }
 
 # runs on a 1s timer in lei-daemon
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 8f63149e..1caa9d06 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -155,12 +155,6 @@ sub mset_progress {
 	}
 }
 
-sub l2m_progress {
-	my ($lei, $nr_write, $nr_seen) = @_;
-	$lei->{-nr_write} += $nr_write;
-	$lei->{-nr_seen} += $nr_seen;
-}
-
 sub query_one_mset { # for --threads and l2m w/o sort
 	my ($self, $ibxish) = @_;
 	local $0 = "$0 query_one_mset";
@@ -354,7 +348,6 @@ sub query_remote_mboxrd {
 	$self->{import_sto} = $lei->{sto} if $lei->{opt}->{'import-remote'};
 	for my $uri (@$uris) {
 		$lei->{-current_url} = $uri->as_string;
-		$lei->{-nr_remote_eml} = 0;
 		my $start = time;
 		my ($q, $key) = fudge_qstr_time($lei, $uri, $qstr);
 		$uri->query_form(@qform, q => $q);
@@ -369,9 +362,9 @@ sub query_remote_mboxrd {
 			my $wait = $self->{import_sto}->wq_do('done');
 		}
 		$reap_curl->join;
+		my $nr = delete $lei->{-nr_remote_eml} // 0;
 		if ($? == 0) {
 			# don't update if no results, maybe MTA is down
-			my $nr = $lei->{-nr_remote_eml};
 			$lei->{lss}->cfg_set($key, $start) if $key && $nr;
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
@@ -433,8 +426,8 @@ Error closing $lei->{ovv}->{dst}: \$!=$! \$?=$?
 	}
 	if ($lei->{-progress}) {
 		my $tot = $lei->{-mset_total} // 0;
-		my $nr_w = $lei->{-nr_write} // 0;
-		my $d = ($lei->{-nr_seen} // 0) - $nr_w;
+		my $nr_w = delete($lei->{-nr_write}) // 0;
+		my $d = (delete($lei->{-nr_seen}) // 0) - $nr_w;
 		my $x = "$tot matches";
 		$x .= ", $d duplicates" if $d;
 		if ($l2m) {
@@ -532,7 +525,7 @@ sub do_query {
 		incr_post_augment => [ \&incr_post_augment, $lei ],
 		'' => [ \&query_done, $lei ],
 		mset_progress => [ \&mset_progress, $lei ],
-		l2m_progress => [ \&l2m_progress, $lei ],
+		incr => [ $lei ],
 		x_it => [ $lei ],
 		child_error => [ $lei ],
 		incr_start_query => [ \&incr_start_query, $lei, $self ],
diff --git a/t/lei-tag.t b/t/lei-tag.t
index 822677a7..cccf0af6 100644
--- a/t/lei-tag.t
+++ b/t/lei-tag.t
@@ -101,5 +101,8 @@ test_lei(sub {
 	if (0) { # TODO label+kw search w/ externals
 		lei_ok(qw(q L:qp), "mid:$mid", '--only', "$ro_home/t2");
 	}
+	lei_ok qw(tag +L:nope -F eml t/data/binary.patch);
+	like $lei_err, qr/\b1 unimported messages/, 'noted unimported'
+		or diag $lei_err;
 });
 done_testing;

^ permalink raw reply related	[relevance 43%]

* [PATCH 07/21] lei: do_env combines fchdir and local
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
  2023-10-04  3:49 68% ` [PATCH 01/21] lei: drop stores explicitly at daemon shutdown Eric Wong
  2023-10-04  3:49 65% ` [PATCH 05/21] lei: close DirIdle (inotify) early " Eric Wong
@ 2023-10-04  3:49 33% ` Eric Wong
  2023-10-04  3:49 43% ` [PATCH 08/21] lei: get rid of l2m_progress PktOp callback Eric Wong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

This will make switching $lei contexts less error-prone
and hopefully save us from some surprising bugs in the future.
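
For readers skimming the diff, the pattern can be sketched roughly
as follows.  This is only an illustration under stated assumptions,
not the LEI.pm code: the `Ctx' package, `$current' and `run_in_ctx'
are hypothetical stand-ins for PublicInbox::LEI, $current_lei and
do_env, and the fchdir step is left out entirely.

```perl
use v5.12;

package Ctx; # hypothetical stand-in for PublicInbox::LEI
our $current; # plays the role of $current_lei

sub new { my ($cls, %env) = @_; bless { env => { %env }, err => [] }, $cls }

# record the error instead of reporting it to a client
sub fail { my ($self, $msg) = @_; chomp $msg; push @{$self->{err}}, $msg; undef }

# run $cb with $current and %ENV switched to this context; `local'
# restores both when this sub returns, even if $cb dies
sub run_in_ctx {
	my ($self, $cb, @args) = @_;
	local ($current, %ENV) = ($self, %{$self->{env}});
	$cb = $self->can($cb) if !ref($cb); # $cb may be a method name
	my @ret = eval { $cb->($self, @args) };
	$self->fail($@) if $@;
	@ret;
}

package main;
my $ctx = Ctx->new(PATH => '/bin', WHO => 'client');
my ($who) = $ctx->run_in_ctx(sub { $ENV{WHO} });
$ctx->run_in_ctx(sub { die "oops\n" });
say "inside: $who, outside: ", $ENV{WHO} // '(unset)';
say "errors: @{$ctx->{err}}";
```

Keeping the `local' and the eval in one place means callbacks no
longer need to repeat them, which is what most of the hunks below
remove.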

Followup-to: 759885e60e59 (lei: ensure --stdin sets %ENV and $current_lei, 2023-09-14)
---
 lib/PublicInbox/LEI.pm        |  16 ++++--
 lib/PublicInbox/LeiAuth.pm    |   4 +-
 lib/PublicInbox/LeiConfig.pm  |  25 ++++----
 lib/PublicInbox/LeiInspect.pm |  28 ++++-----
 lib/PublicInbox/LeiLcat.pm    |  17 +++---
 lib/PublicInbox/LeiQuery.pm   |  19 +++----
 lib/PublicInbox/LeiXSearch.pm | 104 ++++++++++++++++------------------
 lib/PublicInbox/PktOp.pm      |  15 +++--
 8 files changed, 112 insertions(+), 116 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8362800d..3408551b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -479,7 +479,6 @@ sub _drop_wq {
 # pronounced "exit": x_it(1 << 8) => exit(1); x_it(13) => SIGPIPE
 sub x_it ($$) {
 	my ($self, $code) = @_;
-	local $current_lei = $self;
 	# make sure client sees stdout before exit
 	$self->{1}->autoflush(1) if $self->{1};
 	stop_pager($self);
@@ -514,7 +513,6 @@ sub qfin { # show message on finalization (LeiFinmsg)
 
 sub fail_handler ($;$$) {
 	my ($lei, $code, $io) = @_;
-	local $current_lei = $lei;
 	close($io) if $io; # needed to avoid warnings on SIGPIPE
 	_drop_wq($lei);
 	x_it($lei, $code // (1 << 8));
@@ -785,11 +783,19 @@ sub lazy_cb ($$$) { # $pfx is _complete_ or lei_
 		$pkg->can($pfx.$ucmd) : undef;
 }
 
+sub do_env {
+	my $lei = shift;
+	fchdir($lei);
+	my $cb = shift // return ($lei, %{$lei->{env}}) ;
+	local ($current_lei, %ENV) = ($lei, %{$lei->{env}});
+	$cb = $lei->can($cb) if !ref($cb); # $cb may be a scalar sub name
+	eval { $cb->($lei, @_) };
+	$lei->fail($@) if $@;
+}
+
 sub dispatch {
 	my ($self, $cmd, @argv) = @_;
-	fchdir($self);
-	local %ENV = %{$self->{env}};
-	local $current_lei = $self; # for __WARN__
+	local ($current_lei, %ENV) = do_env($self);
 	$self->{2}->autoflush(1); # keep stdout buffered until x_it|DESTROY
 	return _help($self, 'no command given') unless defined($cmd);
 	# do not support Getopt bundling for this
diff --git a/lib/PublicInbox/LeiAuth.pm b/lib/PublicInbox/LeiAuth.pm
index 9b09cecf..76a4410d 100644
--- a/lib/PublicInbox/LeiAuth.pm
+++ b/lib/PublicInbox/LeiAuth.pm
@@ -57,7 +57,7 @@ sub net_merge_all { # called in wq worker via wq_broadcast
 # called by top-level lei-daemon when first worker is done with auth
 # passes updated net auth info to current workers
 sub net_merge_continue {
-	my ($wq, $lei, $net_new) = @_;
+	my ($lei, $wq, $net_new) = @_;
 	$wq->{-net_new} = $net_new; # for "lei up"
 	$wq->wq_broadcast('PublicInbox::LeiAuth::net_merge_all', $net_new);
 	$wq->net_merge_all_done($lei); # defined per-WQ
@@ -65,7 +65,7 @@ sub net_merge_continue {
 
 sub op_merge { # prepares PktOp->pair ops
 	my ($self, $ops, $wq, $lei) = @_;
-	$ops->{net_merge_continue} = [ \&net_merge_continue, $wq, $lei ];
+	$ops->{net_merge_continue} = [ \&net_merge_continue, $lei, $wq ];
 }
 
 sub new { bless \(my $x), __PACKAGE__ }
diff --git a/lib/PublicInbox/LeiConfig.pm b/lib/PublicInbox/LeiConfig.pm
index 76fc43e7..b3495487 100644
--- a/lib/PublicInbox/LeiConfig.pm
+++ b/lib/PublicInbox/LeiConfig.pm
@@ -16,24 +16,21 @@ sub cfg_do_edit ($;$) {
 	# run in script/lei foreground
 	my ($op_c, $op_p) = PublicInbox::PktOp->pair;
 	# $op_p will EOF when $EDITOR is done
-	$op_c->{ops} = { '' => [\&cfg_edit_done, $self] };
+	$op_c->{ops} = { '' => [\&cfg_edit_done, $lei, $self] };
 	$lei->send_exec_cmd([ @$lei{qw(0 1 2)}, $op_p->{op_p} ], $cmd, $env);
 }
 
-sub cfg_edit_done { # PktOp
-	my ($self) = @_;
-	eval {
-		open my $fh, '+>', undef or die "open($!)";
-		my $cfg = do {
-			local $self->{lei}->{2} = $fh;
-			$self->{lei}->cfg_dump($self->{-f});
-		} or do {
-			seek($fh, 0, SEEK_SET);
-			return cfg_do_edit($self, do { local $/; <$fh> });
-		};
-		$self->cfg_verify($cfg) if $self->can('cfg_verify');
+sub cfg_edit_done { # PktOp lei->do_env cb
+	my ($lei, $self) = @_;
+	open my $fh, '+>', undef or die "open($!)";
+	my $cfg = do {
+		local $lei->{2} = $fh;
+		$lei->cfg_dump($self->{-f});
+	} or do {
+		seek($fh, 0, SEEK_SET);
+		return cfg_do_edit($self, do { local $/; <$fh> });
 	};
-	$self->{lei}->fail($@) if $@;
+	$self->cfg_verify($cfg) if $self->can('cfg_verify');
 }
 
 sub lei_config {
diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 0455e739..f801610f 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -251,24 +251,20 @@ sub inspect_start ($$) {
 	$self->wq_close;
 }
 
+sub do_inspect { # lei->do_env cb
+	my ($lei) = @_;
+	my $str = delete $lei->{istr};
+	$str =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
+	my $eml = PublicInbox::Eml->new(\$str);
+	inspect_start($lei, [ 'blob:'.$lei->git_oid($eml)->hexdigest,
+			map { "mid:$_" } @{mids($eml)} ]);
+}
+
 sub ins_add { # InputPipe->consume callback
 	my ($lei) = @_; # $_[1] = $rbuf
-	if (defined $_[1]) {
-		$_[1] eq '' and return eval {
-			$lei->fchdir;
-			local %ENV = %{$lei->{env}};
-			local $PublicInbox::LEI::current_lei = $lei;
-			my $str = delete $lei->{istr};
-			$str =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
-			my $eml = PublicInbox::Eml->new(\$str);
-			inspect_start($lei, [
-				'blob:'.$lei->git_oid($eml)->hexdigest,
-				map { "mid:$_" } @{mids($eml)} ]);
-		};
-		$lei->{istr} .= $_[1];
-	} else {
-		$lei->fail("error reading stdin: $!");
-	}
+	$_[1] // return $lei->fail("error reading stdin: $!");
+	return $lei->{istr} .= $_[1] if $_[1] ne '';
+	$lei->do_env(\&do_inspect);
 }
 
 sub lei_inspect {
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 7ed191c3..72875dc6 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -122,19 +122,18 @@ could not extract Message-ID from $x
 	@q ? join(' OR ', @q) : $lei->fail("no Message-ID in: @argv");
 }
 
+sub do_lcat { # lei->do_env cb
+	my ($lei) = @_;
+	my @argv = split(/\s+/, $lei->{mset_opt}->{qstr});
+	$lei->{mset_opt}->{qstr} = extract_all($lei, @argv) or return;
+	$lei->_start_query;
+}
+
 sub _stdin { # PublicInbox::InputPipe::consume callback for --stdin
 	my ($lei) = @_; # $_[1] = $rbuf
 	$_[1] // return $lei->fail("error reading stdin: $!");
 	return $lei->{mset_opt}->{qstr} .= $_[1] if $_[1] ne '';
-	eval {
-		$lei->fchdir;
-		local %ENV = %{$lei->{env}};
-		local $PublicInbox::LEI::current_lei = $lei;
-		my @argv = split(/\s+/, $lei->{mset_opt}->{qstr});
-		$lei->{mset_opt}->{qstr} = extract_all($lei, @argv) or return;
-		$lei->_start_query;
-	};
-	$lei->fail($@) if $@;
+	$lei->do_env(\&do_lcat);
 }
 
 sub lei_lcat {
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index a23354f0..e2d8a096 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -59,20 +59,19 @@ sub _start_query { # used by "lei q" and "lei up"
 	$lxs->do_query($self);
 }
 
+sub do_qry { # do_env cb
+	my ($lei) = @_;
+	$lei->{mset_opt}->{q_raw} = $lei->{mset_opt}->{qstr};
+	$lei->{lse}->query_approxidate($lei->{lse}->git,
+					$lei->{mset_opt}->{qstr});
+	_start_query($lei);
+}
+
 sub qstr_add { # PublicInbox::InputPipe::consume callback for --stdin
 	my ($lei) = @_; # $_[1] = $rbuf
 	$_[1] // $lei->fail("error reading stdin: $!");
 	return $lei->{mset_opt}->{qstr} .= $_[1] if $_[1] ne '';
-	eval {
-		$lei->fchdir;
-		local %ENV = %{$lei->{env}};
-		local $PublicInbox::LEI::current_lei = $lei;
-		$lei->{mset_opt}->{q_raw} = $lei->{mset_opt}->{qstr};
-		$lei->{lse}->query_approxidate($lei->{lse}->git,
-						$lei->{mset_opt}->{qstr});
-		_start_query($lei);
-	};
-	$lei->fail($@) if $@;
+	$lei->do_env(\&do_qry);
 }
 
 # make the URI||PublicInbox::{Inbox,ExtSearch} a config-file friendly string
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 4e0849e8..8f63149e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -405,62 +405,54 @@ sub xsearch_done_wait { # awaitpid cb
 
 sub query_done { # EOF callback for main daemon
 	my ($lei) = @_;
-	local $PublicInbox::LEI::current_lei = $lei;
-	eval {
-		my $l2m = delete $lei->{l2m};
-		delete $lei->{lxs};
-		($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
-			warn "BUG: {sto} missing with --mail-sync";
-		$lei->sto_done_request if $lei->{sto};
-		if (my $v2w = delete $lei->{v2w}) {
-			my $wait = $v2w->wq_do('done'); # may die
-			$v2w->wq_close;
-		}
-		$lei->{ovv}->ovv_end($lei);
-		if ($l2m) { # close() calls LeiToMail reap_compress
-			if (my $out = delete $lei->{old_1}) {
-				if (my $mbout = $lei->{1}) {
-					close($mbout) or die <<"";
+	my $l2m = delete $lei->{l2m};
+	delete $lei->{lxs};
+	($lei->{opt}->{'mail-sync'} && !$lei->{sto}) and
+		warn "BUG: {sto} missing with --mail-sync";
+	$lei->sto_done_request if $lei->{sto};
+	if (my $v2w = delete $lei->{v2w}) {
+		my $wait = $v2w->wq_do('done'); # may die
+		$v2w->wq_close;
+	}
+	$lei->{ovv}->ovv_end($lei);
+	if ($l2m) { # close() calls LeiToMail reap_compress
+		if (my $out = delete $lei->{old_1}) {
+			if (my $mbout = $lei->{1}) {
+				close($mbout) or die <<"";
 Error closing $lei->{ovv}->{dst}: \$!=$! \$?=$?
 
-				}
-				$lei->{1} = $out;
-			}
-			if ($l2m->lock_free) {
-				$l2m->poke_dst;
-				$lei->poke_mua;
-			} else { # mbox users
-				delete $l2m->{mbl}; # drop dotlock
 			}
+			$lei->{1} = $out;
 		}
-		if ($lei->{-progress}) {
-			my $tot = $lei->{-mset_total} // 0;
-			my $nr_w = $lei->{-nr_write} // 0;
-			my $d = ($lei->{-nr_seen} // 0) - $nr_w;
-			my $x = "$tot matches";
-			$x .= ", $d duplicates" if $d;
-			if ($l2m) {
-				my $m = "# $nr_w written to " .
-					"$lei->{ovv}->{dst} ($x)";
-				$nr_w ? $lei->qfin($m) : $lei->qerr($m);
-			} else {
-				$lei->qerr("# $x");
-			}
+		if ($l2m->lock_free) {
+			$l2m->poke_dst;
+			$lei->poke_mua;
+		} else { # mbox users
+			delete $l2m->{mbl}; # drop dotlock
 		}
-		$lei->start_mua if $l2m && !$l2m->lock_free;
-		$lei->dclose;
-	};
-	$lei->fail($@) if $@;
+	}
+	if ($lei->{-progress}) {
+		my $tot = $lei->{-mset_total} // 0;
+		my $nr_w = $lei->{-nr_write} // 0;
+		my $d = ($lei->{-nr_seen} // 0) - $nr_w;
+		my $x = "$tot matches";
+		$x .= ", $d duplicates" if $d;
+		if ($l2m) {
+			my $m = "# $nr_w written to " .
+				"$lei->{ovv}->{dst} ($x)";
+			$nr_w ? $lei->qfin($m) : $lei->qerr($m);
+		} else {
+			$lei->qerr("# $x");
+		}
+	}
+	$lei->start_mua if $l2m && !$l2m->lock_free;
+	$lei->dclose;
 }
 
 sub do_post_augment {
 	my ($lei) = @_;
-	local $PublicInbox::LEI::current_lei = $lei;
 	my $l2m = $lei->{l2m} or return; # client disconnected
-	eval {
-		$lei->fchdir;
-		$l2m->post_augment($lei);
-	};
+	eval { $l2m->post_augment($lei) };
 	my $err = $@;
 	if ($err) {
 		if (my $lxs = delete $lei->{lxs}) {
@@ -518,7 +510,7 @@ sub start_query ($$) { # always runs in main (lei-daemon) process
 }
 
 sub incr_start_query { # called whenever an l2m shard starts do_post_auth
-	my ($self, $lei) = @_;
+	my ($lei, $self) = @_;
 	my $l2m = $lei->{l2m};
 	return if ++$self->{nr_start_query} != $l2m->{-wq_nr_workers};
 	start_query($self, $lei);
@@ -534,16 +526,16 @@ sub do_query {
 	my ($self, $lei) = @_;
 	my $l2m = $lei->{l2m};
 	my $ops = {
-		'sigpipe_handler' => [ $lei ],
-		'fail_handler' => [ $lei ],
-		'do_post_augment' => [ \&do_post_augment, $lei ],
-		'incr_post_augment' => [ \&incr_post_augment, $lei ],
+		sigpipe_handler => [ $lei ],
+		fail_handler => [ $lei ],
+		do_post_augment => [ \&do_post_augment, $lei ],
+		incr_post_augment => [ \&incr_post_augment, $lei ],
 		'' => [ \&query_done, $lei ],
-		'mset_progress' => [ \&mset_progress, $lei ],
-		'l2m_progress' => [ \&l2m_progress, $lei ],
-		'x_it' => [ $lei ],
-		'child_error' => [ $lei ],
-		'incr_start_query' => [ $self, $lei ],
+		mset_progress => [ \&mset_progress, $lei ],
+		l2m_progress => [ \&l2m_progress, $lei ],
+		x_it => [ $lei ],
+		child_error => [ $lei ],
+		incr_start_query => [ \&incr_start_query, $lei, $self ],
 	};
 	$lei->{auth}->op_merge($ops, $l2m, $lei) if $l2m && $lei->{auth};
 	my $end = $lei->pkt_op_pair;
diff --git a/lib/PublicInbox/PktOp.pm b/lib/PublicInbox/PktOp.pm
index dc432307..1bcdd799 100644
--- a/lib/PublicInbox/PktOp.pm
+++ b/lib/PublicInbox/PktOp.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # op dispatch socket, reads a message, runs a sub
@@ -6,8 +6,7 @@
 # Used for lei_xsearch and maybe other things
 # "command" => [ $sub, @fixed_operands ]
 package PublicInbox::PktOp;
-use strict;
-use v5.10.1;
+use v5.12;
 use parent qw(PublicInbox::DS);
 use Errno qw(EAGAIN ECONNRESET);
 use PublicInbox::Syscall qw(EPOLLIN);
@@ -55,7 +54,15 @@ sub event_step {
 	my $op = $self->{ops}->{$cmd //= $msg};
 	if ($op) {
 		my ($obj, @args) = (@$op, @pargs);
-		blessed($obj) ? $obj->$cmd(@args) : $obj->(@args);
+		if (blessed($args[0]) && $args[0]->can('do_env')) {
+			my $lei = shift @args;
+			$lei->do_env($obj, @args);
+		} elsif (blessed($obj)) {
+			$obj->can('do_env') ? $obj->do_env($cmd, @args)
+						: $obj->$cmd(@args);
+		} else {
+			$obj->(@args);
+		}
 	} elsif ($msg ne '') {
 		die "BUG: unknown message: `$cmd'";
 	}

^ permalink raw reply related	[relevance 33%]

* [PATCH 01/21] lei: drop stores explicitly at daemon shutdown
  2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
@ 2023-10-04  3:49 68% ` Eric Wong
  2023-10-04  3:49 65% ` [PATCH 05/21] lei: close DirIdle (inotify) early " Eric Wong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

This will allow us to avoid unblocking signals during
shutdown to simplify our code.
---
 lib/PublicInbox/DS.pm  |  3 +--
 lib/PublicInbox/LEI.pm | 13 ++++++++++++-
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index 142122a8..c476311b 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -38,10 +38,9 @@ our @EXPORT_OK = qw(now msg_more awaitpid add_timer add_uniq_timer);
 
 my %Stack;
 my $nextq; # queue for next_tick
-my $AWAIT_PIDS; # pid => [ $callback, @args ]
 my $reap_armed;
 my $ToClose; # sockets to close when event loop is done
-our (
+our ($AWAIT_PIDS, # pid => [ $callback, @args ]
      %DescriptorMap,             # fd (num) -> PublicInbox::DS object
      $Poller, # global Select, Epoll, DSPoll, or DSKQXS ref
 
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 10c08b90..368eee26 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1271,6 +1271,15 @@ sub dir_idle_handler ($) { # PublicInbox::DirIdle callback
 	}
 }
 
+sub drop_all_stores () {
+	for my $cfg (values %PATH2CFG) {
+		my $sto = delete($cfg->{-lei_store}) // next;
+		eval { $sto->wq_io_do('done') };
+		warn "E: $@ (dropping store for $cfg->{-f})" if $@;
+		$sto->wq_close;
+	}
+}
+
 # lei(1) calls this when it can't connect
 sub lazy_start {
 	my ($path, $errno, $narg) = @_;
@@ -1367,7 +1376,9 @@ sub lazy_start {
 				$s->close;
 			}
 		}
-		$n; # true: continue, false: stop
+		drop_all_stores() if !$n; # drop stores only if no clients
+		# returns true: continue, false: stop
+		$n + scalar(keys(%$PublicInbox::DS::AWAIT_PIDS));
 	});
 
 	# STDIN was redirected to /dev/null above, closing STDERR and

^ permalink raw reply related	[relevance 68%]

* [PATCH 00/21] lei + IPC related stuff
@ 2023-10-04  3:49 64% Eric Wong
  2023-10-04  3:49 68% ` [PATCH 01/21] lei: drop stores explicitly at daemon shutdown Eric Wong
                   ` (6 more replies)
  0 siblings, 7 replies; 200+ results
From: Eric Wong @ 2023-10-04  3:49 UTC (permalink / raw)
  To: meta

More work coming to make internal IPC stuff simpler
and better layered for future enhancements
(FUSE, increase xap_helper usage internally, etc).

Eric Wong (21):
  lei: drop stores explicitly at daemon shutdown
  ds: hoist out close_non_busy
  ds: don't pass FD map to post_loop_do callback
  move all non-test @post_loop_do into named subs
  lei: close DirIdle (inotify) early at daemon shutdown
  input_pipe: {args} is never undefined
  lei: do_env combines fchdir and local
  lei: get rid of l2m_progress PktOp callback
  t/lei_to_mail: modernize and document test
  lei: reuse PublicInbox::Config::noop
  lei: keep signals blocked on daemon shutdown
  mbox_lock: retry on EINTR and use autodie
  lock: retry on EINTR, improve error reporting
  treewide: use PublicInbox::Lock->new
  gcf2: use PublicInbox::Lock
  spawn: use autodie and PublicInbox::Lock
  xap_helper: retry flock on EINTR
  XapHelper.pm: use EINTR-aware recv_cmd wrapper
  spawn: drop checks for directory writability
  lei: document and local-ize $OPT hashref
  searchidx: fix redundant `in' in warning message

 lib/PublicInbox/DS.pm         |  17 +++--
 lib/PublicInbox/Daemon.pm     |  47 ++++++-------
 lib/PublicInbox/DirIdle.pm    |  12 +++-
 lib/PublicInbox/Gcf2.pm       |   8 ++-
 lib/PublicInbox/IPC.pm        |   4 +-
 lib/PublicInbox/InputPipe.pm  |  11 ++-
 lib/PublicInbox/LEI.pm        | 127 ++++++++++++++++------------------
 lib/PublicInbox/LeiAuth.pm    |   4 +-
 lib/PublicInbox/LeiConfig.pm  |  25 +++----
 lib/PublicInbox/LeiConvert.pm |   7 +-
 lib/PublicInbox/LeiInspect.pm |  28 ++++----
 lib/PublicInbox/LeiLcat.pm    |  17 +++--
 lib/PublicInbox/LeiMirror.pm  |   2 +-
 lib/PublicInbox/LeiQuery.pm   |  19 +++--
 lib/PublicInbox/LeiTag.pm     |   6 +-
 lib/PublicInbox/LeiToMail.pm  |  20 +++---
 lib/PublicInbox/LeiXSearch.pm | 113 +++++++++++++-----------------
 lib/PublicInbox/Lock.pm       |  52 ++++++++------
 lib/PublicInbox/MboxLock.pm   |  49 ++++++-------
 lib/PublicInbox/PktOp.pm      |  15 ++--
 lib/PublicInbox/SearchIdx.pm  |   2 +-
 lib/PublicInbox/Spawn.pm      |  32 ++++-----
 lib/PublicInbox/TestCommon.pm |   7 +-
 lib/PublicInbox/Watch.pm      |   4 +-
 lib/PublicInbox/XapHelper.pm  |  11 ++-
 lib/PublicInbox/xap_helper.h  |   6 +-
 t/lei-tag.t                   |   3 +
 t/lei_to_mail.t               |  37 +++++-----
 t/solver_git.t                |   3 +-
 t/v2mirror.t                  |   2 +-
 30 files changed, 339 insertions(+), 351 deletions(-)

^ permalink raw reply	[relevance 64%]

* [PATCH] t/lei-q-save: quiet `no email in From: ...' warnings
@ 2023-10-03 16:18 47% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-03 16:18 UTC (permalink / raw)
  To: meta

PublicInbox::Import will warn if it can't extract a valid
address from an email.  We need to ensure our tests capture
them to $lei_err instead of spewing them to the terminal.

While we're at it, use autodie and xsys_e to simplify some error handling.
---
 t/lei-q-save.t | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/t/lei-q-save.t b/t/lei-q-save.t
index 53311696..a9f9d4d6 100644
--- a/t/lei-q-save.t
+++ b/t/lei-q-save.t
@@ -1,7 +1,8 @@
 #!perl -w
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10.1; use PublicInbox::TestCommon;
+use v5.12; use PublicInbox::TestCommon;
+use autodie qw(close open unlink);
 use PublicInbox::Smsg;
 use List::Util qw(sum);
 use File::Path qw(remove_tree);
@@ -89,7 +90,7 @@ test_lei(sub {
 	like($lei_out, qr!^\Q$home/mbcl2\E$!sm, 'complete got mbcl2 output');
 	like($lei_out, qr!^\Q$home/md\E$!sm, 'complete got maildir output');
 
-	unlink("$home/mbcl2") or xbail "unlink $!";
+	unlink("$home/mbcl2");
 	lei_ok qw(_complete lei up);
 	like($lei_out, qr!^\Q$home/mbcl2\E$!sm,
 		'mbcl2 output shown despite unlink');
@@ -97,24 +98,24 @@ test_lei(sub {
 	ok(-f "$home/mbcl2"  && -s _ == 0, 'up recreates on missing output');
 
 	# no --augment
-	open my $mb, '>', "$home/mbrd" or xbail "open $!";
+	open my $mb, '>', "$home/mbrd";
 	print $mb $pre_existing;
-	close $mb or xbail "close: $!";
+	close $mb;
 	lei_ok(qw(q -o mboxrd:mbrd m:qp@example.com -C), $home);
-	open $mb, '<', "$home/mbrd" or xbail "open $!";
+	open $mb, '<', "$home/mbrd";
 	is_deeply([grep(/pre-existing/, <$mb>)], [],
 		'pre-existing messsage gone w/o augment');
-	close $mb;
+	undef $mb;
 	lei_ok(qw(q m:import-before@example.com));
 	is(json_utf8->decode($lei_out)->[0]->{'s'},
 		'pre-existing', '--save imported before clobbering');
 
 	# --augment
-	open $mb, '>', "$home/mbrd-aug" or xbail "open $!";
+	open $mb, '>', "$home/mbrd-aug";
 	print $mb $pre_existing;
-	close $mb or xbail "close: $!";
+	close $mb;
 	lei_ok(qw(q -a -o mboxrd:mbrd-aug m:qp@example.com -C), $home);
-	open $mb, '<', "$home/mbrd-aug" or xbail "open $!";
+	open $mb, '<', "$home/mbrd-aug";
 	$mb = do { local $/; <$mb> };
 	like($mb, qr/pre-existing/, 'pre-existing message preserved w/ -a');
 	like($mb, qr/<qp\@example\.com>/, 'new result written w/ -a');
@@ -228,16 +229,14 @@ test_lei(sub {
 	my @lss = glob("$home/" .
 		'.local/share/lei/saved-searches/*/lei.saved-search');
 	my $out = xqx([qw(git config -f), $lss[0], 'lei.q.output']);
-	xsys($^X, qw(-w -i -p -e), "s/\\[/\\0/", $lss[0])
-		and xbail "-ipe $lss[0]: $?";
+	xsys_e($^X, qw(-w -i -p -e), "s/\\[/\\0/", $lss[0]);
 	lei_ok qw(ls-search);
 	like($lei_err, qr/bad config line.*?\Q$lss[0]\E/,
 		'git config parse error shown w/ lei ls-search');
 	lei_ok qw(up --all), \'up works with bad config';
 	like($lei_err, qr/bad config line.*?\Q$lss[0]\E/,
 		'git config parse error shown w/ lei up');
-	xsys($^X, qw(-w -i -p -e), "s/\\0/\\[/", $lss[0])
-		and xbail "-ipe $lss[0]: $?";
+	xsys_e($^X, qw(-w -i -p -e), "s/\\0/\\[/", $lss[0]);
 	lei_ok qw(ls-search);
 	is($lei_err, '', 'no errors w/ fixed config');
 
@@ -249,17 +248,17 @@ test_lei(sub {
 
 	my $d = "$home/d";
 	lei_ok [qw(import -q -F eml)], undef,
-		{0 => \"Subject: do not call\n\n"};
+		{%$lei_opt, 0 => \"Subject: do not call\n\n"};
 	lei_ok qw(q -o), $d, 's:do not call';
 
 	my @orig = glob("$d/*/*");
 	is(scalar(@orig), 1, 'got one message via argv');
 	lei_ok [qw(import -q -Feml)], undef,
-		{0 => \"Subject: do not ever call\n\n"};
+		{%$lei_opt, 0 => \"Subject: do not ever call\n\n"};
 	lei_ok 'up', $d;
 	is_deeply([glob("$d/*/*")], \@orig, 'nothing written');
 	lei_ok [qw(import -q -Feml)], undef,
-		{0 => \"Subject: do not call, ever\n\n"};
+		{%$lei_opt, 0 => \"Subject: do not call, ever\n\n"};
 	lei_ok 'up', $d;
 	@after = glob("$d/*/*");
 	is(scalar(@after), 2, '2 total, messages, now');
@@ -270,14 +269,15 @@ test_lei(sub {
 		'up retrieved correct message');
 
 	$d = "$home/d-stdin";
-	lei_ok [ qw(q -q -o), $d ], undef, { 0 => \'s:"do not ever call"' };
+	lei_ok [ qw(q -q -o), $d ], undef,
+		{ %$lei_opt, 0 => \'s:"do not ever call"' };
 	@orig = glob("$d/*/*");
 	is(scalar(@orig), 1, 'got one message via stdin');
 
 	lei_ok [qw(import -q -Feml)], undef,
-		{0 => \"Subject: do not fall or ever call\n\n"};
+		{%$lei_opt, 0 => \"Subject: do not fall or ever call\n\n"};
 	lei_ok [qw(import -q -Feml)], undef,
-		{0 => \"Subject: do not ever call, again\n\n"};
+		{%$lei_opt, 0 => \"Subject: do not ever call, again\n\n"};
 	lei_ok 'up', $d;
 	@new = glob("$d/new/*");
 	is(scalar(@new), 1, "new message written to `new'") or do {
@@ -292,7 +292,7 @@ test_lei(sub {
 	lei_ok(qw(q --no-external m:import-before@example.com -t -o), $d);
 	@orig = glob("$d/{new,cur}/*");
 	is(scalar(@orig), 1, 'one result so far');
-	lei_ok [ qw(import -Feml) ], undef, { 0 => \<<'EOM' };
+	lei_ok [ qw(import -Feml) ], undef, { %$lei_opt, 0 => \<<'EOM' };
 Date: Sun, 02 Oct 2023 00:00:00 +0000
 From: <x@example.com>
 In-Reply-To: <import-before@example.com>

^ permalink raw reply related	[relevance 47%]

* [PATCH 5/8] lei: workers exit after they tell lei-daemon
  @ 2023-10-03  6:43 90% ` Eric Wong
  2023-10-03  6:43 62% ` [PATCH 7/8] xt/lei-onion-convert: test TLS + SOCKS Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-10-03  6:43 UTC (permalink / raw)
  To: meta

We don't want workers continuing after their stdout has triggered
EPIPE or some other write error.

This fixes xt/lei-onion-convert.t to ensure the quit_waiter_pipe
is fully-closed at daemon teardown during tests.  Using the
`exit' perlop still ensures OnDestroy callbacks will fire.
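
A small standalone sketch of that last point, assuming nothing about
the real OnDestroy implementation: `Guard' and `child_output' are
hypothetical names used only for illustration.  It compares the
`exit' perlop with POSIX::_exit, which skips destructors.

```perl
use v5.12;
use POSIX ();

package Guard; # hypothetical, loosely modelled on an on-destroy guard
sub new { my ($cls, $cb) = @_; bless { cb => $cb }, $cls }
sub DESTROY { $_[0]->{cb}->() }

package main;

# fork a child which creates a Guard writing to a pipe, then leaves
# via $how; returns what the guard wrote and the child's exit status
sub child_output {
	my ($how) = @_;
	pipe(my $r, my $w) or die "pipe: $!";
	my $pid = fork // die "fork: $!";
	if ($pid == 0) {
		close $r;
		my $g = Guard->new(sub { syswrite($w, "destroyed\n") });
		$how->(3 << 8); # x_it-style code: exit status in high byte
	}
	close $w;
	my $out = do { local $/; <$r> } // '';
	waitpid($pid, 0);
	($out, $? >> 8);
}

my @clean = child_output(sub { exit($_[0] >> 8) });
my @abrupt = child_output(sub { POSIX::_exit($_[0] >> 8) });
say 'exit:  status=', $clean[1], ' guard ', $clean[0] ? 'ran' : 'skipped';
say '_exit: status=', $abrupt[1], ' guard ', $abrupt[0] ? 'ran' : 'skipped';
```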
---
 lib/PublicInbox/LEI.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 817772f7..10c08b90 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -485,6 +485,7 @@ sub x_it ($$) {
 	stop_pager($self);
 	if ($self->{pkt_op_p}) { # worker => lei-daemon
 		$self->{pkt_op_p}->pkt_do('x_it', $code);
+		exit($code >> 8);
 	} elsif ($self->{sock}) { # lei->daemon => lei(1) client
 		send($self->{sock}, "x_it $code", 0);
 	} elsif ($quit == \&CORE::exit) { # an admin (one-shot) command

^ permalink raw reply related	[relevance 90%]

* [PATCH 7/8] xt/lei-onion-convert: test TLS + SOCKS
    2023-10-03  6:43 90% ` [PATCH 5/8] lei: workers exit after they tell lei-daemon Eric Wong
@ 2023-10-03  6:43 62% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-10-03  6:43 UTC (permalink / raw)
  To: meta

While .onion URLs don't commonly use TLS, using Tor to access
non-.onion URLs is possible and TLS is advisable in that case.

TLS + SOCKS support is also useful for non-Tor SOCKS proxies
(e.g. "ssh -D"), but 127.0.0.1:9050 (Tor) is probably the most
standardized address.

While we're in the area: switch to v5.12, use autodie, and
ensure all necessary modules are present.
---
 xt/lei-onion-convert.t | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/xt/lei-onion-convert.t b/xt/lei-onion-convert.t
index 6dd17065..d3afbbb9 100644
--- a/xt/lei-onion-convert.t
+++ b/xt/lei-onion-convert.t
@@ -1,10 +1,12 @@
 #!perl -w
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10; use PublicInbox::TestCommon;
+use v5.12; use PublicInbox::TestCommon;
 use PublicInbox::MboxReader;
+use autodie qw(pipe close);
 my $test_tor = $ENV{TEST_TOR};
 plan skip_all => "TEST_TOR unset" unless $test_tor;
+require_mods qw(IO::Socket::Socks IO::Socket::SSL Mail::IMAPClient Net::NNTP);
 unless ($test_tor =~ m!\Asocks5h://!i) {
 	my $default = 'socks5h://127.0.0.1:9050';
 	diag "using $default (set TEST_TOR=socks5h://ADDR:PORT to override)";
@@ -19,11 +21,24 @@ my @cnv = qw(lei convert -o mboxrd:/dev/stdout);
 my @proxy_cli = ("--proxy=$test_tor");
 my $proxy_cfg = "proxy=$test_tor";
 test_lei(sub {
+	# ensure TLS + SOCKS works
+	ok !lei(qw(ls-mail-source imaps://mews.public-inbox.org/
+		-c), "imap.$proxy_cfg"),
+		'imaps fails on wrong hostname w/ Tor';
+	ok !lei(qw(ls-mail-source nntps://mews.public-inbox.org/
+		-c), "nntp.$proxy_cfg"),
+		'nntps fails on wrong hostname w/ Tor';
+
+	lei_ok qw(ls-mail-source imaps://news.public-inbox.org/
+		-c), "imap.$proxy_cfg";
+	lei_ok qw(ls-mail-source nntps://news.public-inbox.org/
+		-c), "nntp.$proxy_cfg";
+
 	my $run = {};
 	for my $args ([$nntp_url, @proxy_cli], [$imap_url, @proxy_cli],
 			[ $nntp_url, '-c', "nntp.$proxy_cfg" ],
 			[ $imap_url, '-c', "imap.$proxy_cfg" ]) {
-		pipe(my ($r, $w)) or xbail "pipe: $!";
+		pipe(my $r, my $w);
 		my $cmd = [@cnv, @$args];
 		my $td = start_script($cmd, undef, { 1 => $w, run_mode => 0 });
 		$args->[0] =~ s!\A(.+?://).*!$1...!;

^ permalink raw reply related	[relevance 62%]

* Re: [PATCH] lei: do label/keyword parsing in optparse
  2023-10-02 15:00 42% [PATCH] lei: do label/keyword parsing in optparse Eric Wong
@ 2023-10-02 20:14 62% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-02 20:14 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +++ b/t/lei-import.t
> @@ -126,6 +126,18 @@ $res = json_utf8->decode($lei_out);
>  is_deeply($res->[0]->{kw}, [qw(answered seen)], 'keyword added');
>  is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'labels preserved');
>  
> +# +kw:seen is not a location
> +ok(!lei(qw(import -F eml +kw:seen)), 'import fails w/ only kw arg');
> +like($lei_err, qr/\bLOCATION\.\.\. or --stdin must be set/, 'error message');

That's unreliable because stdin could be pointed to a
regular file or pipe while running tests, so the test shouldn't
inherit it.
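
To illustrate the idea outside of the test suite (a sketch only;
`run_with_stdin' is a hypothetical helper and not part of
PublicInbox::TestCommon), a child can be given /dev/null explicitly
instead of whatever stdin the parent happens to have:

```perl
use v5.12;

# run `$^X -e $code' with stdin replaced by $in, return its stdout
sub run_with_stdin {
	my ($in, $code) = @_;
	my $pid = open(my $out, '-|') // die "fork: $!";
	if ($pid == 0) { # child: stdout is already the pipe to the parent
		open(STDIN, '<&', $in) or die "dup: $!";
		exec($^X, '-e', $code) or die "exec: $!";
	}
	my $ret = do { local $/; <$out> };
	close $out;
	$ret;
}

open(my $null, '<', '/dev/null') or die "open /dev/null: $!";
# the child sees EOF immediately no matter how this script was started
my $n = run_with_stdin($null, 'my @l = <STDIN>; print scalar(@l)');
say "lines read by child: $n";
```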

So I think I'll squash the following in and use autodie a bit
more while I'm at it, too.
---
 t/lei-import.t | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/t/lei-import.t b/t/lei-import.t
index 30d8b531..8b09d3aa 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -1,14 +1,15 @@
 #!perl -w
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10.1; use PublicInbox::TestCommon;
+use v5.12; use PublicInbox::TestCommon;
+use autodie qw(open close);
 test_lei(sub {
 ok(!lei(qw(import -F bogus), 't/plack-qp.eml'), 'fails with bogus format');
 like($lei_err, qr/\bis `eml', not --in-format/, 'gave error message');
 
 lei_ok(qw(q s:boolean), \'search miss before import');
 unlike($lei_out, qr/boolean/i, 'no results, yet');
-open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
+open my $fh, '<', 't/data/0001.patch';
 lei_ok([qw(import -F eml -)], undef, { %$lei_opt, 0 => $fh },
 	\'import single file from stdin') or diag $lei_err;
 close $fh;
@@ -18,7 +19,7 @@ lei_ok(qw(q s:boolean -f mboxrd), \'blob accessible after import');
 	my $expect = [ eml_load('t/data/0001.patch') ];
 	require PublicInbox::MboxReader;
 	my @cmp;
-	open my $fh, '<', \$lei_out or BAIL_OUT "open :scalar: $!";
+	open my $fh, '<', \$lei_out;
 	PublicInbox::MboxReader->mboxrd($fh, sub {
 		my ($eml) = @_;
 		$eml->header_set('Status');
@@ -127,8 +128,10 @@ is_deeply($res->[0]->{kw}, [qw(answered seen)], 'keyword added');
 is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'labels preserved');
 
 # +kw:seen is not a location
-ok(!lei(qw(import -F eml +kw:seen)), 'import fails w/ only kw arg');
-like($lei_err, qr/\bLOCATION\.\.\. or --stdin must be set/, 'error message');
+open my $null, '<', '/dev/null';
+ok(!lei([qw(import -F eml +kw:seen)], undef, { %$lei_opt, 0 => $null }),
+	'import fails w/ only kw arg');
+like($lei_err, qr/\bLOCATION\.\.\. or --stdin must be set/s, 'error message');
 
 lei_ok([qw(import -F eml +kw:flagged)], # no lone dash (`-')
 	undef, { %$lei_opt, 0 => \$eml_str },

^ permalink raw reply related	[relevance 62%]

* [PATCH] lei: do label/keyword parsing in optparse
@ 2023-10-02 15:00 42% Eric Wong
  2023-10-02 20:14 62% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-10-02 15:00 UTC (permalink / raw)
  To: meta

Calling vmd_mod_extract after optparse causes the implicit
stdin-as-input functionality to fail, as the implicit stdin
requires a lack of inputs remaining in argv after option
parsing (along with a regular file or pipe as stdin).

Doing the extraction inside optparse allows commands such as
`lei import -F eml +kw:seen' to work without `--stdin', `-' or
any path names when importing a single message.  It also ensures
commands like `lei import +kw:seen' without any inputs/locations
fail reliably, as the extra +kw: arg won't be a false-positive.
---
  I noticed this while attempting to write a test for
  speeding up non-thread queries:
  https://public-inbox.org/meta/20231002145807.2665296-1-e@80x24.org/
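
  A standalone sketch of the extraction step, for readers who want to
  try it outside of lei.  The validators in %ERR and the name
  `extract_vmd_mod' are hypothetical stand-ins; only the regexp for
  the +kw:/-kw:/+L:/-L: arguments is taken from this patch:

```perl
use strict;
use warnings;

# hypothetical validators: return an error string, or undef if OK
my %ERR = (
	kw => sub {
		$_[0] =~ /\A(?:seen|answered|flagged|draft)\z/ ? undef :
			"invalid keyword: $_[0]";
	},
	L => sub {
		$_[0] =~ /\A[a-z0-9_][a-z0-9_.-]*\z/ ? undef :
			"invalid label: $_[0]";
	},
);

# Removes +kw:/-kw:/+L:/-L: args from @$argv in place.
# Returns ({ "+kw" => [...], ... }, [ error strings ])
sub extract_vmd_mod {
	my ($argv) = @_;
	my (%vmd_mod, @rest, @err);
	for my $x (@$argv) {
		if ($x =~ /\A(\+|\-)(kw|L):(.+)\z/) {
			my ($op, $pfx, $val) = ($1, $2, $3);
			if (my $err = $ERR{$pfx}->($val)) {
				push @err, $err;
			} else { # set "+kw", "+L", "-L", "-kw"
				push @{$vmd_mod{$op.$pfx}}, $val;
			}
		} else {
			push @rest, $x; # possible LOCATION
		}
	}
	@$argv = @rest;
	(\%vmd_mod, \@err);
}

my @argv = qw(+kw:seen t/plack-qp.eml +L:inbox -kw:bogus);
my ($vmd_mod, $err) = extract_vmd_mod(\@argv);
print "remaining: @argv\n";
print "$_ => @{$vmd_mod->{$_}}\n" for sort keys %$vmd_mod;
print "error: $_\n" for @$err;
```

  Once the keyword/label args are gone, an empty @argv really means
  "no LOCATION given", which is what the implicit stdin check needs.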

 lib/PublicInbox/LEI.pm         | 18 +++++++++++++-----
 lib/PublicInbox/LeiAddWatch.pm |  4 +---
 lib/PublicInbox/LeiImport.pm   |  4 +---
 lib/PublicInbox/LeiInput.pm    | 11 +++++------
 lib/PublicInbox/LeiTag.pm      |  7 ++-----
 t/lei-import.t                 | 12 ++++++++++++
 6 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 06bc7ebd..817772f7 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -234,7 +234,7 @@ our %CMD = ( # sorted in order of importance/use:
 'plonk' => [ '--threads|--from=IDENT',
 	'exclude mail matching From: or threads from non-Message-ID searches',
 	qw(stdin| threads|t from|f=s mid=s oid=s), @c_opt ],
-'tag' => [ 'KEYWORDS...',
+'tag' => [ 'KEYWORDS... LOCATION...|--stdin',
 	'set/unset keywords and/or labels on message(s)',
 	qw(stdin| in-format|F=s input|i=s@ oid=s@ mid=s@),
 	@net_opt, @c_opt, pass_through('-kw:foo for delete') ],
@@ -243,7 +243,8 @@ our %CMD = ( # sorted in order of importance/use:
 	'remove imported messages from IMAP, Maildirs, and MH',
 	qw(exact! all jobs:i indexed), @c_opt ],
 
-'add-watch' => [ 'LOCATION...', 'watch for new messages and flag changes',
+'add-watch' => [ 'LOCATION... [LABELS...]',
+	'watch for new messages and flag changes',
 	qw(poll-interval=s state=s recursive|r), @c_opt ],
 'rm-watch' => [ 'LOCATION...', 'remove specified watch(es)',
 	qw(recursive|r), @c_opt ],
@@ -260,7 +261,7 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(in-format|F=s kw! offset=i recursive|r exclude=s include|I=s
 	verbose|v+ incremental!), @net_opt, # mainly for --proxy=
 	 @c_opt ],
-'import' => [ 'LOCATION...|--stdin',
+'import' => [ 'LOCATION...|--stdin [LABELS...]',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s new-only
 	lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!),
@@ -711,6 +712,14 @@ sub optparse ($$$) {
 	# "-" aliases "stdin" or "clear"
 	$OPT->{$lone_dash} = ${$OPT->{$lone_dash}} if defined $lone_dash;
 
+	if ($proto =~ s/\s*\[?(?:KEYWORDS|LABELS)\.\.\.\]?\s*//g) {
+		require PublicInbox::LeiInput;
+		my @err = PublicInbox::LeiInput::vmd_mod_extract($self, $argv);
+		return $self->fail(join("\n", @err)) if @err;
+	} else {
+		warn "proto $proto\n" if $cmd =~ /(add-watch|tag|index)/;
+	}
+
 	my $i = 0;
 	my $POS_ARG = '[A-Z][A-Z0-9_]+';
 	my ($err, $inf);
@@ -741,8 +750,7 @@ sub optparse ($$$) {
 							 -f _) && -r _) {
 						$OPT->{stdin} //= 1;
 					}
-					$ok = defined($OPT->{$sw});
-					last if $ok;
+					$ok = defined($OPT->{$sw}) and last;
 				} elsif (defined($argv->[$i])) {
 					$ok = 1;
 					$i++;
diff --git a/lib/PublicInbox/LeiAddWatch.pm b/lib/PublicInbox/LeiAddWatch.pm
index f61e2de4..e2be5cee 100644
--- a/lib/PublicInbox/LeiAddWatch.pm
+++ b/lib/PublicInbox/LeiAddWatch.pm
@@ -15,11 +15,9 @@ sub lei_add_watch {
 	my $state = $lei->{opt}->{'state'} // 'import-rw';
 	$lei->watch_state_ok($state) or
 		return $lei->fail("invalid state: $state");
-	my $vmd_mod = $self->vmd_mod_extract(\@argv);
-	return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
 	$self->prepare_inputs($lei, \@argv) or return;
 	my @vmd;
-	while (my ($type, $vals) = each %$vmd_mod) {
+	while (my ($type, $vals) = each %{$lei->{vmd_mod}}) {
 		push @vmd, "$type:$_" for @$vals;
 	}
 	my $vmd0 = shift @vmd;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index a324a652..c2552bf0 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -71,9 +71,7 @@ sub do_import_index ($$@) {
 	my $sto = $lei->_lei_store(1);
 	$sto->write_prepare($lei);
 	$self->{-import_kw} = $lei->{opt}->{kw} // 1;
-	my $vmd_mod = $self->vmd_mod_extract(\@inputs);
-	return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
-	$self->{all_vmd} = $vmd_mod if scalar keys %$vmd_mod;
+	$self->{all_vmd} = $lei->{vmd_mod} if keys %{$lei->{vmd_mod}};
 	$lei->ale; # initialize for workers to read (before LeiPmdir->new)
 	$self->{-mail_sync} = $lei->{opt}->{'mail-sync'} // 1;
 	$self->prepare_inputs($lei, \@inputs) or return;
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 58069b0a..91383265 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -491,23 +491,22 @@ sub input_only_net_merge_all_done {
 # for update_xvmd -> update_vmd
 # returns something like { "+L" => [ @Labels ], ... }
 sub vmd_mod_extract {
-	my $argv = $_[-1];
-	my $vmd_mod = {};
-	my @new_argv;
+	my ($lei, $argv) = @_;
+	my (@new_argv, @err);
 	for my $x (@$argv) {
 		if ($x =~ /\A(\+|\-)(kw|L):(.+)\z/) {
 			my ($op, $pfx, $val) = ($1, $2, $3);
 			if (my $err = $ERR{$pfx}->($val)) {
-				push @{$vmd_mod->{err}}, $err;
+				push @err, $err;
 			} else { # set "+kw", "+L", "-L", "-kw"
-				push @{$vmd_mod->{$op.$pfx}}, $val;
+				push @{$lei->{vmd_mod}->{$op.$pfx}}, $val;
 			}
 		} else {
 			push @new_argv, $x;
 		}
 	}
 	@$argv = @new_argv;
-	$vmd_mod;
+	@err;
 }
 
 1;
diff --git a/lib/PublicInbox/LeiTag.pm b/lib/PublicInbox/LeiTag.pm
index 8ce96a10..76bd2d70 100644
--- a/lib/PublicInbox/LeiTag.pm
+++ b/lib/PublicInbox/LeiTag.pm
@@ -13,7 +13,7 @@ sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	if (my $xoids = $self->{lse}->xoids_for($eml) // # tries LeiMailSync
 			$self->{lei}->{ale}->xoids_for($eml)) {
 		$self->{lei}->{sto}->wq_do('update_xvmd', $xoids, $eml,
-						$self->{vmd_mod});
+						$self->{lei}->{vmd_mod});
 	} else {
 		++$self->{unimported};
 	}
@@ -31,11 +31,8 @@ sub lei_tag { # the "lei tag" method
 	my $sto = $lei->_lei_store(1)->write_prepare($lei);
 	my $self = bless {}, __PACKAGE__;
 	$lei->ale; # refresh and prepare
-	my $vmd_mod = $self->vmd_mod_extract(\@argv);
-	return $lei->fail(join("\n", @{$vmd_mod->{err}})) if $vmd_mod->{err};
-	$self->{vmd_mod} = $vmd_mod; # before LeiPmdir->new in prepare_inputs
 	$self->prepare_inputs($lei, \@argv) or return;
-	grep(defined, @$vmd_mod{qw(+kw +L -L -kw)}) or
+	grep(defined, @{$lei->{vmd_mod}}{qw(+kw +L -L -kw)}) or
 		return $lei->fail('no keywords or labels specified');
 	$lei->{-err_type} = 'non-fatal';
 	$lei->wq1_start($self);
diff --git a/t/lei-import.t b/t/lei-import.t
index c9e668a3..30d8b531 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -126,6 +126,18 @@ $res = json_utf8->decode($lei_out);
 is_deeply($res->[0]->{kw}, [qw(answered seen)], 'keyword added');
 is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'labels preserved');
 
+# +kw:seen is not a location
+ok(!lei(qw(import -F eml +kw:seen)), 'import fails w/ only kw arg');
+like($lei_err, qr/\bLOCATION\.\.\. or --stdin must be set/, 'error message');
+
+lei_ok([qw(import -F eml +kw:flagged)], # no lone dash (`-')
+	undef, { %$lei_opt, 0 => \$eml_str },
+	'import succeeds with implicit --stdin');
+lei_ok(qw(q m:inbox@example.com));
+$res = json_utf8->decode($lei_out);
+is_deeply($res->[0]->{kw}, [qw(answered flagged seen)], 'keyword added');
+is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'labels preserved');
+
 # see t/lei_to_mail.t for "import -F mbox*"
 });
 done_testing;

^ permalink raw reply related	[relevance 42%]

* [PATCH] lei up: faster non-thread, single-source incremental query
@ 2023-10-02 14:58 68% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-02 14:58 UTC (permalink / raw)
  To: meta

When using isearch (that is v1/v2 inbox relying on extindex
for search), there's actually no guarantee that IMAP UIDs
are in the correct order with regard to Xapian docids.

Thus we must iterate through every UID(num) to see if it's
suitable to display in a saved search.  The old grep filter
(before commit a6fe84489127) was not effective since it
didn't account for the mset->items correspondence.

Fortunately, this bug merely manifests in reduced performance
as of a6fe84489127.  Prior to that, it could cause incorrect
keywords and labels to be applied.

Unfortunately, this behavior is hard-to-test so no test case
is included.

Followup-to: a6fe84489127 (lei up: fix missing -t/--threads matches w/ saved search)
---
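  A minimal sketch of the loop shape, assuming two parallel arrays as in
  the hunk below.  `new_pairs' and the sample data are hypothetical; the
  point is that the index must advance before the `next', otherwise
  items stop lining up with their IDs:

```perl
use strict;
use warnings;

# @$ids and @$items are parallel: $items->[$i] belongs to $ids->[$i].
# IDs are not assumed to be sorted, so every one is checked.
# Returns [ id, item ] pairs for IDs greater than $min.
sub new_pairs {
	my ($ids, $items, $min) = @_;
	my $i = 0;
	my @out;
	for my $n (@$ids) {
		my $item = $items->[$i++]; # always advance, even on skip
		next if $n <= $min;
		push @out, [ $n, $item ];
	}
	\@out;
}

my $pairs = new_pairs([ 9, 3, 7, 5 ], [ qw(i9 i3 i7 i5) ], 5);
print "$_->[0] => $_->[1]\n" for @$pairs;
```
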
 lib/PublicInbox/LeiXSearch.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 5f105567..4e0849e8 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -211,9 +211,10 @@ sub query_one_mset { # for --threads and l2m w/o sort
 			}
 		} else {
 			$first_ids = $ids;
-			my @items = $mset->items;
+			my @items = $mset->items; # parallel with @$ids
 			for my $n (@$ids) {
 				my $mitem = $items[$i++];
+				next if $n <= $min;
 				my $smsg = $over->get_art($n) or next;
 				next if $smsg->{bytes} == 0;
 				mitem_kw($srch, $smsg, $mitem, $fl) if $can_kw;

^ permalink raw reply related	[relevance 68%]

* [PATCH] lei up: fix missing -t/--threads matches w/ saved search
@ 2023-10-01 22:29 50% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-01 22:29 UTC (permalink / raw)
  To: meta

We must not filter already-seen docids out of the mset itself;
the filter must only be applied to the result of over->expand_thread.
---
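  A rough in-memory sketch of the lower bound used for thread
  expansion, assuming a sorted list of article numbers in place of the
  SQLite `num > ?' query.  `next_batch' is a hypothetical helper; only
  the max(prev, min) idea comes from this patch:

```perl
use strict;
use warnings;
use List::Util qw(max);

# $nums: ascending article numbers of one thread (stand-in for SQL)
# $ctx->{min}: maxuid of the previous run, so nothing <= min is new
# $ctx->{prev}: last number returned by the previous batch
# Returns an arrayref of up to $limit numbers, or undef when done.
sub next_batch {
	my ($nums, $ctx, $limit) = @_;
	my $lower = max($ctx->{prev} // 0, $ctx->{min} // 0);
	my @xids = grep { $_ > $lower } @$nums;
	splice(@xids, $limit) if @xids > $limit; # emulate LIMIT
	return unless @xids;
	$ctx->{prev} = $xids[-1];
	\@xids;
}

my $ctx = { min => 4 };
while (my $xids = next_batch([ 1 .. 10 ], $ctx, 3)) {
	print "batch: @$xids\n";
}
```

  The seed IDs from the mset are left alone; only the expanded thread
  members are restricted to numbers above the previous maxuid.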
 lib/PublicInbox/LeiXSearch.pm | 34 +++++++++++++---------------------
 lib/PublicInbox/Over.pm       |  7 +++++--
 t/lei-q-save.t                | 19 +++++++++++++++++++
 3 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 7f4911b3..5f105567 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -176,14 +176,10 @@ sub query_one_mset { # for --threads and l2m w/o sort
 	my $threads = $lei->{opt}->{threads} // 0;
 	my $fl = $threads > 1 ? 1 : undef;
 	my $lss = $lei->{lss};
-	my $maxk = "external.$dir.maxuid";
-	my $stop_at = $lss ? $lss->{-cfg}->{$maxk} : undef;
-	if (defined $stop_at) {
-		ref($stop_at) and
-			return warn("$maxk=$stop_at has multiple values\n");
-		($stop_at =~ /[^0-9]/) and
-			return warn("$maxk=$stop_at not numeric\n");
-	}
+	my $maxk = "external.$dir.maxuid"; # max of previous, so our min
+	my $min = $lss ? ($lss->{-cfg}->{$maxk} // 0) : 0;
+	ref($min) and return warn("$maxk=$min has multiple values\n");
+	($min =~ /[^0-9]/) and return warn("$maxk=$min not numeric\n");
 	my $first_ids;
 	do {
 		$mset = eval { $srch->mset($mo->{qstr}, $mo) };
@@ -192,29 +188,26 @@ sub query_one_mset { # for --threads and l2m w/o sort
 				$mset->get_matches_estimated);
 		wait_startq($lei); # wait for keyword updates
 		my $ids = $srch->mset_to_artnums($mset, $mo);
-		@$ids = grep { $_ > $stop_at } @$ids if defined($stop_at);
 		my $i = 0;
 		if ($threads) {
 			# copy $ids if $lss since over->expand_thread
 			# shifts @{$ctx->{ids}}
 			$first_ids = [ @$ids ] if $lss;
-			my $ctx = { ids => $ids };
-			my %n2item = map { ($ids->[$i++], $_) } $mset->items;
-			while ($over->expand_thread($ctx)) {
-				for my $n (@{$ctx->{xids}}) {
+			my $ctx = { ids => $ids, min => $min };
+			my %n2item = map { $ids->[$i++] => $_ } $mset->items;
+			while ($over->expand_thread($ctx)) { # fills {xids}
+				for my $n (@{delete $ctx->{xids}}) {
 					my $smsg = $over->get_art($n) or next;
-					my $mitem = delete $n2item{$n};
+					my $mi = delete $n2item{$n};
 					next if $smsg->{bytes} == 0;
-					if ($mitem && $can_kw) {
-						mitem_kw($srch, $smsg, $mitem,
-							$fl);
-					} elsif ($mitem && $fl) {
+					if ($mi && $can_kw) {
+						mitem_kw($srch, $smsg, $mi, $fl)
+					} elsif ($mi && $fl) {
 						# call ->xsmsg_vmd, later
 						$smsg->{lei_q_tt_flagged} = 1;
 					}
-					$each_smsg->($smsg, $mitem);
+					$each_smsg->($smsg, $mi);
 				}
-				@{$ctx->{xids}} = ();
 			}
 		} else {
 			$first_ids = $ids;
@@ -230,7 +223,6 @@ sub query_one_mset { # for --threads and l2m w/o sort
 	} while (_mset_more($mset, $mo));
 	_check_mset_limit($lei, $dir, $mset);
 	if ($lss && scalar(@$first_ids)) {
-		undef $stop_at;
 		my $max = $first_ids->[0];
 		$lss->cfg_set($maxk, $max);
 		undef $lss;
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index 82034b30..e3a8adb1 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -12,6 +12,7 @@ use DBD::SQLite;
 use PublicInbox::Smsg;
 use Compress::Zlib qw(uncompress);
 use constant DEFAULT_LIMIT => 1000;
+use List::Util (); # for max
 
 sub dbh_new {
 	my ($self, $rw) = @_;
@@ -198,10 +199,12 @@ ORDER BY $sort_col DESC
 }
 
 # strict `tid' matches, only, for thread-expanded mbox.gz search results
-# and future CLI interface
+# and lei
 # returns true if we have IDs, undef if not
 sub expand_thread {
 	my ($self, $ctx) = @_;
+	# previous maxuid for LeiSavedSearch is our min:
+	my $lss_min = $ctx->{min} // 0;
 	my $dbh = dbh($self);
 	do {
 		defined(my $num = $ctx->{ids}->[0]) or return;
@@ -214,7 +217,7 @@ SELECT num FROM over WHERE tid = ? AND num > ?
 ORDER BY num ASC LIMIT 1000
 
 			my $xids = $dbh->selectcol_arrayref($sql, undef, $tid,
-							$ctx->{prev} // 0);
+				List::Util::max($ctx->{prev} // 0, $lss_min));
 			if (scalar(@$xids)) {
 				$ctx->{prev} = $xids->[-1];
 				$ctx->{xids} = $xids;
diff --git a/t/lei-q-save.t b/t/lei-q-save.t
index 1d9d5a51..53311696 100644
--- a/t/lei-q-save.t
+++ b/t/lei-q-save.t
@@ -15,6 +15,7 @@ $doc3->header_set('Date', PublicInbox::Smsg::date({ds => time - (86400 * 4)}));
 my $cat_env = { VISUAL => 'cat', EDITOR => 'cat' };
 my $pre_existing = <<'EOF';
 From x Mon Sep 17 00:00:00 2001
+From: <x@example.com>
 Message-ID: <import-before@example.com>
 Subject: pre-existing
 Date: Sat, 02 Oct 2010 00:00:00 +0000
@@ -286,5 +287,23 @@ test_lei(sub {
 	is(eml_load($new[0])->header('Subject'), 'do not ever call, again',
 		'up retrieved correct message');
 
+	# --thread expansion
+	$d = "$home/thread-expand";
+	lei_ok(qw(q --no-external m:import-before@example.com -t -o), $d);
+	@orig = glob("$d/{new,cur}/*");
+	is(scalar(@orig), 1, 'one result so far');
+	lei_ok [ qw(import -Feml) ], undef, { 0 => \<<'EOM' };
+Date: Sun, 02 Oct 2023 00:00:00 +0000
+From: <x@example.com>
+In-Reply-To: <import-before@example.com>
+Message-ID: <reply1@example.com>
+Subject: reply1
+EOM
+
+	lei_ok qw(up), $d;
+	@new = glob("$d/{new,cur}/*");
+	is(scalar(@new), 2, 'got new message');
+	is_xdeeply([grep { $_ eq $orig[0] } @new], \@orig,
+		'original message preserved on up w/ threads');
 });
 done_testing;

^ permalink raw reply related	[relevance 50%]

* [PATCH 08/13] lei mail-diff: don't remove temporary subdirectory
    2023-10-01  9:54 64% ` [PATCH 06/13] lei rediff: `git diff -O<order-file>' support Eric Wong
  2023-10-01  9:54 71% ` [PATCH 07/13] lei: correct exit signal Eric Wong
@ 2023-10-01  9:54 71% ` Eric Wong
  2023-10-01  9:54 71% ` [PATCH 12/13] lei: ->fail only allows integer exit codes Eric Wong
  2023-10-01  9:54 50% ` [PATCH 13/13] lei: deal with clients with blocked stderr Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-01  9:54 UTC (permalink / raw)
  To: meta

->{curdir} is localized inside MailDiff->dump_eml anyways, so it
was attempting to remove `undef' :x.  Since most messages don't
have too many attachments, save some opcodes on our end and just
let File::Temp::Dir->DESTROY handle all the cleanup.
---
 lib/PublicInbox/LeiMailDiff.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/PublicInbox/LeiMailDiff.pm b/lib/PublicInbox/LeiMailDiff.pm
index 5e2e4b0b..af6ecf82 100644
--- a/lib/PublicInbox/LeiMailDiff.pm
+++ b/lib/PublicInbox/LeiMailDiff.pm
@@ -7,7 +7,6 @@ package PublicInbox::LeiMailDiff;
 use v5.12;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput PublicInbox::MailDiff);
 use PublicInbox::Spawn qw(run_wait);
-use File::Path ();
 require PublicInbox::LeiRediff;
 
 sub diff_a ($$) {
@@ -21,7 +20,6 @@ sub diff_a ($$) {
 	my $rdr = { -C => "$self->{tmp}" };
 	@$rdr{1, 2} = @$lei{1, 2};
 	run_wait($cmd, $lei->{env}, $rdr) and $lei->child_error($?);
-	File::Path::remove_tree($self->{curdir});
 }
 
 sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh

^ permalink raw reply related	[relevance 71%]

* [PATCH 12/13] lei: ->fail only allows integer exit codes
                     ` (2 preceding siblings ...)
  2023-10-01  9:54 71% ` [PATCH 08/13] lei mail-diff: don't remove temporary subdirectory Eric Wong
@ 2023-10-01  9:54 71% ` Eric Wong
  2023-10-01  9:54 50% ` [PATCH 13/13] lei: deal with clients with blocked stderr Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-01  9:54 UTC (permalink / raw)
  To: meta

We can't use floating point numbers or Inf/-Inf as exit codes,
but we can allow `-1' as shorthand for 255.
---
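  A small comparison of the two checks, as a sketch.  `int_exit_code'
  is a hypothetical wrapper around the regexp from the hunk below;
  looks_like_number is the real Scalar::Util function:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# hypothetical helper mirroring the stricter check in this patch:
# true only for optionally-negative decimal integers
sub int_exit_code { ($_[0] // '') =~ /\A-?[0-9]+\z/ }

for my $v (qw(1 -1 255 1.5 1e3 Inf -Inf), 'no such file') {
	printf "%-14s looks_like_number=%d int_exit_code=%d\n", "'$v'",
		looks_like_number($v) ? 1 : 0,
		int_exit_code($v) ? 1 : 0;
}
```

  Values such as `1.5', `1e3' and `Inf' pass looks_like_number but are
  rejected by the regexp, so they stay in @msg instead of being
  consumed as an exit code.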
 lib/PublicInbox/LEI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1b14d5e1..1899bf38 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -524,7 +524,7 @@ sub sigpipe_handler { # handles SIGPIPE from @WQ_KEYS workers
 
 sub fail ($;@) {
 	my ($lei, @msg) = @_;
-	my $exit_code = looks_like_number($msg[0]) ? shift(@msg) : undef;
+	my $exit_code = ($msg[0]//'') =~ /\A-?[0-9]+\z/ ? shift(@msg) : undef;
 	local $current_lei = $lei;
 	$lei->{failed}++;
 	if (@msg) {

^ permalink raw reply related	[relevance 71%]

* [PATCH 13/13] lei: deal with clients with blocked stderr
                     ` (3 preceding siblings ...)
  2023-10-01  9:54 71% ` [PATCH 12/13] lei: ->fail only allows integer exit codes Eric Wong
@ 2023-10-01  9:54 50% ` Eric Wong
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-01  9:54 UTC (permalink / raw)
  To: meta

lei/store can get stuck if lei-daemon is blocked, and lei-daemon
can get stuck when a client's stderr is redirected to a pager
that isn't consumed.

So start relying on Time::HiRes::alarm to generate SIGALRM to
break out of the `print' perlop.  Unfortunately, this isn't easy
since Perl auto-restarts all writes, so we dup(2) the
destination FD and close the copy in the SIGALRM handler to
force `print' to return.

Most programs (MUAs, editors, etc.) aren't equipped to deal with
non-blocking STDERR, so we can't make the stderr file description
non-blocking.

Another way to solve this problem would be to have script/lei
send a non-blocking pipe to lei-daemon in the {2} slot and
make script/lei splice messages from the pipe to stderr.
Unfortunately, that requires more work and forces more
complexity into script/lei and slows down normal cases where
stderr doesn't get blocked.
---
 lib/PublicInbox/LEI.pm         |  3 ++-
 lib/PublicInbox/LeiStore.pm    |  8 ++++++--
 lib/PublicInbox/LeiStoreErr.pm | 27 +++++++++++++++++++++++++--
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1899bf38..06bc7ebd 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1287,7 +1287,7 @@ sub lazy_start {
 	undef $lk;
 	my @st = stat($path) or die "stat($path): $!";
 	my $dev_ino_expect = pack('dd', $st[0], $st[1]); # dev+ino
-	local $oldset = PublicInbox::DS::block_signals();
+	local $oldset = PublicInbox::DS::block_signals(POSIX::SIGALRM);
 	die "incompatible narg=$narg" if $narg != 5;
 	$PublicInbox::IPC::send_cmd or die <<"";
 (Socket::MsgHdr || Inline::C) missing/unconfigured (narg=$narg);
@@ -1369,6 +1369,7 @@ sub lazy_start {
 		  strftime('%Y-%m-%dT%H:%M:%SZ', gmtime(time))," $$ ", @_);
 	};
 	local $SIG{PIPE} = 'IGNORE';
+	local $SIG{ALRM} = 'IGNORE';
 	open STDERR, '>&STDIN' or die "redirect stderr failed: $!";
 	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
 	# $daemon pipe to `lei' closed, main loop begins:
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 9923ec3f..0cb78f79 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -33,6 +33,7 @@ use IO::Handle (); # ->autoflush
 use Sys::Syslog qw(syslog openlog);
 use Errno qw(EEXIST ENOENT);
 use PublicInbox::Syscall qw(rename_noreplace);
+use PublicInbox::LeiStoreErr;
 
 sub new {
 	my (undef, $dir, $opt) = @_;
@@ -112,7 +113,10 @@ sub _tail_err {
 	my ($self) = @_;
 	my $err = $self->{-tmp_err} // return;
 	$err->clearerr; # clear EOF marker
-	print { $self->{-err_wr} } readline($err);
+	my @msg = readline($err);
+	PublicInbox::LeiStoreErr::emit($self->{-err_wr}, @msg) and return;
+	# syslog is the last resort if lei-daemon broke
+	syslog('warning', '%s', $_) for @msg;
 }
 
 sub eidx_init {
@@ -627,12 +631,12 @@ sub write_prepare {
 		# Mail we import into lei are private, so headers filtered out
 		# by -mda for public mail are not appropriate
 		local @PublicInbox::MDA::BAD_HEADERS = ();
+		local $SIG{ALRM} = 'IGNORE';
 		$self->wq_workers_start("lei/store $dir", 1, $lei->oldset, {
 					lei => $lei,
 					-err_wr => $w,
 					to_close => [ $r ],
 				}, \&_sto_atexit);
-		require PublicInbox::LeiStoreErr;
 		PublicInbox::LeiStoreErr->new($r, $lei);
 	}
 	$lei->{sto} = $self;
diff --git a/lib/PublicInbox/LeiStoreErr.pm b/lib/PublicInbox/LeiStoreErr.pm
index 47fa2277..fe4af51e 100644
--- a/lib/PublicInbox/LeiStoreErr.pm
+++ b/lib/PublicInbox/LeiStoreErr.pm
@@ -9,6 +9,30 @@ use parent qw(PublicInbox::DS);
 use PublicInbox::Syscall qw(EPOLLIN);
 use Sys::Syslog qw(openlog syslog closelog);
 use IO::Handle (); # ->blocking
+use Time::HiRes ();
+use autodie qw(open);
+our $err_wr;
+
+# We don't want blocked stderr on clients to block lei/store or lei-daemon.
+# We can't make stderr non-blocking since it can break MUAs or anything
+# lei might spawn.  So we setup a timer to wake us up after a second if
+# printing to a user's stderr hasn't completed, yet.  Unfortunately,
+# EINTR alone isn't enough since Perl auto-restarts writes on signals,
+# so to interrupt writes to clients with blocked stderr, we dup the
+# error output to $err_wr ahead-of-time and close $err_wr in the
+# SIGALRM handler to ensure `print' gets aborted:
+
+sub abort_err_wr { close($err_wr) if $err_wr; undef $err_wr }
+
+sub emit ($@) {
+	my ($efh, @msg) = @_;
+	open(local $err_wr, '>&', $efh); # fdopen(dup(fileno($efh)), "w")
+	local $SIG{ALRM} = \&abort_err_wr;
+	Time::HiRes::alarm(1.0, 0.1);
+	my $ret = print $err_wr @msg;
+	Time::HiRes::alarm(0);
+	$ret;
+}
 
 sub new {
 	my ($cls, $rd, $lei) = @_;
@@ -26,8 +50,7 @@ sub event_step {
 	for my $lei (values %PublicInbox::DS::DescriptorMap) {
 		my $cb = $lei->can('store_path') // next;
 		next if $cb->($lei) ne $self->{store_path};
-		my $err = $lei->{2} // next;
-		print $err $buf and $printed = 1;
+		emit($lei->{2} // next, $buf) and $printed = 1;
 	}
 	if (!$printed) {
 		openlog('lei/store', 'pid,nowait,nofatal,ndelay', 'user');

^ permalink raw reply related	[relevance 50%]

* [PATCH 07/13] lei: correct exit signal
    2023-10-01  9:54 64% ` [PATCH 06/13] lei rediff: `git diff -O<order-file>' support Eric Wong
@ 2023-10-01  9:54 71% ` Eric Wong
  2023-10-01  9:54 71% ` [PATCH 08/13] lei mail-diff: don't remove temporary subdirectory Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-01  9:54 UTC (permalink / raw)
  To: meta

The first argument passed to Perl signal handlers is a
signal name (e.g. "TERM") and not an integer that can
be passed to the `exit' perlop. Thus we must look up the
integer value from the POSIX module.
---
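  A sketch showing both halves of the problem: what a handler receives,
  and how to turn it into a shell-style exit code.  Note this uses the
  %Config signal tables described in perlipc rather than the POSIX
  constant lookup in the hunk below, and `sig_exit_code' is a
  hypothetical helper:

```perl
use strict;
use warnings;
use Config;

# map signal names to numbers, as documented in perlipc
my %signo;
@signo{split(' ', $Config{sig_name})} = split(' ', $Config{sig_num});

# hypothetical helper: 128 + signal number, undef for unknown names
sub sig_exit_code {
	my ($name) = @_;
	defined(my $n = $signo{$name}) or return;
	128 + $n;
}

my $got;
$SIG{USR1} = sub { $got = $_[0] }; # $_[0] is a name, not a number
kill('USR1', $$) or die "kill: $!";
for (1..50) { # give the deferred handler a chance to run
	last if defined $got;
	select(undef, undef, undef, 0.02);
}
print "handler got '$got', exit code would be ",
	sig_exit_code($got), "\n";
```
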
 lib/PublicInbox/LEI.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 48c5644b..1b14d5e1 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1310,9 +1310,9 @@ sub lazy_start {
 	local $quit = do {
 		my (undef, $eof_p) = PublicInbox::PktOp->pair;
 		sub {
-			$exit_code //= shift;
+			$exit_code //= eval("POSIX::SIG$_[0] + 128") if @_;
 			eval 'PublicInbox::LeiNoteEvent::flush_task()';
-			my $lis = $pil or exit($exit_code);
+			my $lis = $pil or exit($exit_code // 0);
 			# closing eof_p triggers \&noop wakeup
 			$listener = $eof_p = $pil = $path = undef;
 			$lis->close; # DS::close

^ permalink raw reply related	[relevance 71%]

* [PATCH 06/13] lei rediff: `git diff -O<order-file>' support
  @ 2023-10-01  9:54 64% ` Eric Wong
  2023-10-01  9:54 71% ` [PATCH 07/13] lei: correct exit signal Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-10-01  9:54 UTC (permalink / raw)
  To: meta

We can't use the `-O' switch since it conflicts with
--only|-O= to specify externals.  Thus we'll introduce
a more verbose `--order-file=FILE' option when running
`git diff'.
---
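  A minimal sketch of the option mapping with Getopt::Long.  The
  `only|O=s@' spec is an assumption standing in for lei's existing -O
  switch, and `build_diff_cmd' is a hypothetical helper; only the
  `order-file=s' spec and the "-O$file" argument come from this patch:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical helper: parse lei-style switches and build the
# git-diff command line
sub build_diff_cmd {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt,
			'only|O=s@',	# assumed: -O already means externals
			'order-file=s')	# new, verbose spelling for git's -O
		or die "bad options\n";
	my @cmd = qw(git diff);
	# git itself still receives the short form
	push @cmd, "-O$opt{'order-file'}" if $opt{'order-file'};
	(\@cmd, \%opt);
}

my ($cmd, $opt) = build_diff_cmd(
	qw(--order-file=order.txt -O /path/to/external));
print "@$cmd\n";
print "externals: @{$opt->{only}}\n";
```
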
 lib/PublicInbox/LEI.pm       | 6 +++---
 lib/PublicInbox/LeiRediff.pm | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index beb0f897..48c5644b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -159,7 +159,7 @@ our @diff_opt = qw(unified|U=i output-indicator-new=s output-indicator-old=s
 	rename-empty! check ws-error-highlight=s full-index binary
 	abbrev:i break-rewrites|B:s find-renames|M:s find-copies:s
 	find-copies-harder irreversible-delete|D l=i diff-filter=s
-	S=s G=s find-object=s pickaxe-all pickaxe-regex O=s R
+	S=s G=s find-object=s pickaxe-all pickaxe-regex R
 	relative:s text|a ignore-cr-at-eol ignore-space-at-eol
 	ignore-space-change|b ignore-all-space|w ignore-blank-lines
 	inter-hunk-context=i function-context|W exit-code ext-diff
@@ -198,8 +198,8 @@ our %CMD = ( # sorted in order of importance/use:
 'rediff' => [ '--stdin|LOCATION...',
 		'regenerate a diff with different options',
 	'stdin|', # /|\z/ must be first for lone dash
-	qw(git-dir=s@ cwd! verbose|v+ color:s no-color drq:1 dequote-only:1),
-	@diff_opt, @lxs_opt, @net_opt, @c_opt ],
+	qw(git-dir=s@ cwd! verbose|v+ color:s no-color drq:1 dequote-only:1
+	order-file=s), @diff_opt, @lxs_opt, @net_opt, @c_opt ],
 
 'mail-diff' => [ '--stdin|LOCATION...', 'diff the contents of emails',
 	'stdin|', # /|\z/ must be first for lone dash
diff --git a/lib/PublicInbox/LeiRediff.pm b/lib/PublicInbox/LeiRediff.pm
index efd24d17..6cc6131b 100644
--- a/lib/PublicInbox/LeiRediff.pm
+++ b/lib/PublicInbox/LeiRediff.pm
@@ -82,6 +82,7 @@ sub _lei_diff_prepare ($$) {
 			push @$cmd, $c ? "-$c" : "--$o";
 		}
 	}
+	push(@$cmd, "-O$opt->{'order-file'}") if $opt->{'order-file'};
 }
 
 sub diff_ctxq ($$) {

^ permalink raw reply related	[relevance 64%]

* [PATCH] t/lei-convert: fix uninitialized variable w/o pigz
@ 2023-09-30 16:17 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-30 16:17 UTC (permalink / raw)
  To: meta

`backtick` captures return `undef' when a command is missing.

Fixes: 5df0446abcca (lei: don't gzip --rsyncable by default for mbox*)
---
 t/lei-convert.t | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lei-convert.t b/t/lei-convert.t
index d75110cb..84b57f81 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -132,7 +132,7 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	SKIP: {
 		my $ok;
 		for my $x (($ENV{GZIP}//''), qw(pigz gzip)) {
-			$x && `$x -h 2>&1` =~ /--rsyncable\b/s or next;
+			$x && (`$x -h 2>&1`//'') =~ /--rsyncable\b/s or next;
 			$ok = $x;
 			last;
 		}

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/2] lei: support reading inboxes & extindex w/o search
@ 2023-09-30  0:36 71% Eric Wong
  2023-09-30  0:36 42% ` [PATCH 2/2] lei convert: support reading from v1, v2, and extindex Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-09-30  0:36 UTC (permalink / raw)
  To: meta

This works on completely unindexed inboxes, even, as long as the
inbox.lock (or ssoma.lock) file exists.

Eric Wong (2):
  lei_input: always prefix `maildir:' internally
  lei convert: support reading from v1, v2, and extindex

 lib/PublicInbox/ExtSearch.pm |   6 +-
 lib/PublicInbox/LeiInput.pm  | 113 ++++++++++++++++++++++++++---------
 t/extsearch.t                |  24 ++++++++
 t/lei-convert.t              |  40 +++++++++++++
 4 files changed, 153 insertions(+), 30 deletions(-)


^ permalink raw reply	[relevance 71%]

* [PATCH 2/2] lei convert: support reading from v1, v2, and extindex
  2023-09-30  0:36 71% [PATCH 0/2] lei: support reading inboxes & extindex w/o search Eric Wong
@ 2023-09-30  0:36 42% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-30  0:36 UTC (permalink / raw)
  To: meta

We should be able to dump all public-inbox and extindex directories
to Maildir/mbox* or IMAP folders.  Even unindexed inboxes can be
dumped as long as inbox.lock (or ssoma.lock) exists.

This change likely works for `lei tag' and other lei_input-using
things, as well, but that's untested at the moment.  I mainly
want to be able to use `lei convert' to benchmark some upcoming
changes...
---
 lib/PublicInbox/ExtSearch.pm |  6 ++--
 lib/PublicInbox/LeiInput.pm  | 70 +++++++++++++++++++++++++++++++-----
 t/extsearch.t                | 24 +++++++++++++
 t/lei-convert.t              | 40 +++++++++++++++++++++
 4 files changed, 129 insertions(+), 11 deletions(-)

diff --git a/lib/PublicInbox/ExtSearch.pm b/lib/PublicInbox/ExtSearch.pm
index fa49a1d0..d43c23e6 100644
--- a/lib/PublicInbox/ExtSearch.pm
+++ b/lib/PublicInbox/ExtSearch.pm
@@ -33,9 +33,11 @@ sub misc {
 # same as per-inbox ->over, for now...
 sub over {
 	my ($self) = @_;
-	$self->{over} //= do {
+	$self->{over} // eval {
 		PublicInbox::Inbox::_cleanup_later($self);
-		PublicInbox::Over->new("$self->{xpfx}/over.sqlite3");
+		my $over = PublicInbox::Over->new("$self->{xpfx}/over.sqlite3");
+		$over->dbh; # may die
+		$self->{over} = $over;
 	};
 }
 
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index f88c5374..58069b0a 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -125,6 +125,51 @@ sub handle_http_input ($$@) {
 	$lei->child_error($?, "@$cmd failed: @err") if @err;
 }
 
+sub oid2eml { # git->cat_async cb
+	my ($bref, $oid, $type, $size, $self) = @_;
+	if ($type eq 'blob') {
+		$self->input_eml_cb(PublicInbox::Eml->new($bref));
+	} else {
+		warn "W: $oid is type=$type\n";
+	}
+}
+
+sub each_ibx_eml_unindexed {
+	my ($self, $ibx, @args) = @_;
+	$ibx->isa('PublicInbox::Inbox') or return $self->{lei}->fail(<<EOM);
+unindexed extindex $ibx->{topdir} not supported
+EOM
+	require PublicInbox::SearchIdx;
+	my $n = $ibx->max_git_epoch;
+	my @g = defined($n) ? map { $ibx->git_epoch($_) } (0..$n) : ($ibx->git);
+	my $sync = { D => {}, ibx => $ibx }; # D => {} filters out deletes
+	my ($f, $at, $ct, $oid, $cmt);
+	for my $git (grep defined, @g) {
+		my $s = PublicInbox::SearchIdx::log2stack($sync, $git, 'HEAD');
+		while (($f, $at, $ct, $oid, $cmt) = $s->pop_rec) {
+			$git->cat_async($oid, \&oid2eml, $self) if $f eq 'm';
+		}
+		$git->cleanup; # wait all
+	}
+}
+
+sub each_ibx_eml {
+	my ($self, $ibx, @args) = @_; # TODO: is @args used at all?
+	my $over = $ibx->over or return each_ibx_eml_unindexed(@_);
+	my $git = $ibx->git;
+	my $prev = 0;
+	my $smsg;
+	my $ids = $over->ids_after(\$prev);
+	while (@$ids) {
+		for (@$ids) {
+			$smsg = $over->get_art($_) // next;
+			$git->cat_async($smsg->{blob}, \&oid2eml, $self);
+		}
+		$ids = $over->ids_after(\$prev);
+	}
+	$git->cat_async_wait;
+}
+
 sub input_path_url {
 	my ($self, $input, @args) = @_;
 	my $lei = $self->{lei};
@@ -191,6 +236,12 @@ sub input_path_url {
 						$self->can('input_maildir_cb'),
 						$self, @args);
 		}
+	} elsif (-d _ && $ifmt =~ /\A(?:v1|v2)\z/) {
+		my $ibx = PublicInbox::Inbox->new({inboxdir => $input});
+		each_ibx_eml($self, $ibx, @args);
+	} elsif (-d _ && $ifmt eq 'extindex') {
+		my $esrch = PublicInbox::ExtSearch->new($input);
+		each_ibx_eml($self, $esrch, @args);
 	} elsif ($self->{missing_ok} && !-e $input) { # don't ->fail
 		if ($lei->{cmd} eq 'p2q') {
 			my $fp = [ qw(git format-patch --stdout -1), $input ];
@@ -308,9 +359,9 @@ sub prepare_inputs { # returns undef on error
 				require PublicInbox::MboxReader;
 				PublicInbox::MboxReader->reads($ifmt) or return
 					$lei->fail("$ifmt not supported");
-			} elsif (-d $input_path) {
-				$ifmt eq 'maildir' or return # TODO v1/v2/ei
-					$lei->fail("$ifmt not supported");
+			} elsif (-d $input_path) { # TODO extindex
+				$ifmt =~ /\A(?:maildir|v1|v2|extindex)\z/ or
+					return$lei->fail("$ifmt not supported");
 				$input = $input_path;
 				add_dir $lei, $istate, $ifmt, \$input;
 			} elsif ($self->{missing_ok} && !-e _) {
@@ -350,12 +401,12 @@ $input is `eml', not --in-format=$in_fmt
 				push @f, $input;
 			} elsif (-d "$input/new" && -d "$input/cur") {
 				add_dir $lei, $istate, 'maildir', \$input;
-			} elsif (-e "$input/inbox.lock") { # TODO
-				$lei->fail('v2 inputs not yet supported (TODO)');
-				#add_dir $lei, $istate, 'v2', \$input;
-			} elsif (-e "$input/ssoma.lock") { # TODO
-				$lei->fail('v1 inputs not yet supported (TODO)');
-				#add_dir $lei, $istate, 'v1', \$input;
+			} elsif (-e "$input/inbox.lock") {
+				add_dir $lei, $istate, 'v2', \$input;
+			} elsif (-e "$input/ssoma.lock") {
+				add_dir $lei, $istate, 'v1', \$input;
+			} elsif (-e "$input/ei.lock") {
+				add_dir $lei, $istate, 'extindex', \$input;
 			} elsif ($self->{missing_ok} && !-e $input) {
 				if ($lei->{cmd} eq 'p2q') {
 					# will run "git format-patch"
@@ -401,6 +452,7 @@ $input is `eml', not --in-format=$in_fmt
 			$lei->refresh_watches;
 		}
 	}
+	require PublicInbox::ExtSearch if $istate->{extindex};
 	$self->{inputs} = $inputs;
 }
 
diff --git a/t/extsearch.t b/t/extsearch.t
index 8ded3382..19eaf3b5 100644
--- a/t/extsearch.t
+++ b/t/extsearch.t
@@ -581,4 +581,28 @@ EOM
 	}
 }
 
+test_lei(sub {
+	my $d = "$home/extindex";
+	lei_ok('convert', '-o', "$home/md1", $d);
+	lei_ok('convert', '-o', "$home/md2", "extindex:$d");
+	my $dst = [];
+	my $cb = sub { push @$dst, $_[2]->as_string };
+	require PublicInbox::MdirReader;
+	PublicInbox::MdirReader->new->maildir_each_eml("$home/md1", $cb);
+	my @md1 = sort { $a cmp $b } @$dst;
+	ok(scalar(@md1), 'dumped messages to md1');
+	$dst = [];
+	PublicInbox::MdirReader->new->maildir_each_eml("$home/md2", $cb);
+	@$dst = sort { $a cmp $b } @$dst;
+	is_deeply($dst, \@md1,
+		"convert from extindex w/ or w/o `extindex' prefix");
+
+	use autodie qw(unlink);
+	my @o = glob "$home/extindex/ei*/over.sqlite*";
+	unlink(@o);
+	ok(!lei('convert', '-o', "$home/fail", "extindex:$d"));
+	like($lei_err, qr/unindexed .*?not supported/,
+		'noted unindexed extindex is unsupported');
+});
+
 done_testing;
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 115e7ed0..d75110cb 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -7,6 +7,8 @@ use PublicInbox::MdirReader;
 use PublicInbox::NetReader;
 use PublicInbox::Eml;
 use IO::Uncompress::Gunzip;
+use File::Path qw(remove_tree);
+use PublicInbox::Spawn qw(which);
 use autodie qw(open);
 require_mods(qw(lei -imapd -nntpd Mail::IMAPClient Net::NNTP));
 my ($tmpdir, $for_destroy) = tmpdir;
@@ -148,5 +150,43 @@ test_lei({ tmpdir => $tmpdir }, sub {
 		});
 		is_deeply(\@tmp, \@bar, 'read rsyncable-gzipped mboxcl2');
 	}
+	my $cp = which('cp') or xbail 'cp(1) not available (WTF?)';
+	for my $v (1, 2) {
+		my $ibx_dir = "$ro_home/t$v";
+		lei_ok qw(convert -f mboxrd), $ibx_dir,
+				\"dump v$v inbox to mboxrd";
+		my $out = $lei_out;
+		lei_ok qw(convert -f mboxrd), "v$v:$ibx_dir",
+				\"dump v$v inbox to mboxrd w/ v$v:// prefix";
+		is $out, $lei_out, "v$v:// prefix accepted";
+		open my $fh, '<', \$out;
+		my (@mb, @md, @md2);
+		PublicInbox::MboxReader->mboxrd($fh, sub {
+			$_[0]->header_set('Status');
+			push @mb, $_[0]->as_string;
+		});
+		undef $out;
+		ok(scalar(@mb), 'got messages output');
+		my $mdir = "$d/v$v-mdir";
+		lei_ok qw(convert -o), $mdir, $ibx_dir,
+			\"dump v$v inbox to Maildir";
+		PublicInbox::MdirReader->new->maildir_each_eml($mdir, sub {
+			push @md, $_[2]->as_string;
+		});
+		@md = sort { $a cmp $b } @md;
+		@mb = sort { $a cmp $b } @mb;
+		is_deeply(\@mb, \@md, 'got matching inboxes');
+		xsys_e([$cp, '-Rp', $ibx_dir, "$d/tv$v" ]);
+		remove_tree($mdir, "$d/tv$v/public-inbox",
+				glob("$d/tv$v/xap*"));
+
+		lei_ok qw(convert -o), $mdir, "$d/tv$v",
+			\"dump u indexed v$v inbox to Maildir";
+		PublicInbox::MdirReader->new->maildir_each_eml($mdir, sub {
+			push @md2, $_[2]->as_string;
+		});
+		@md2 = sort { $a cmp $b } @md2;
+		is_deeply(\@md, \@md2, 'got matching inboxes even unindexed');
+	}
 });
 done_testing;

^ permalink raw reply related	[relevance 42%]

* [PATCH 3/3] lei: don't gzip --rsyncable by default for mbox*
  @ 2023-09-27  6:02 46% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-27  6:02 UTC (permalink / raw)
  To: meta

Using and memoizing the usability of `--rsyncable' is unsafe
since pigz (or GNU gzip) can be uninstalled and leave a user
with a non-rsync-aware gzip implementation in the long-running
daemon.  So we stop passing --rsyncable by default to pigz/gzip
and no longer attempt to check for it (since it was a TOCTTOU
error, anyways).

Specifying --rsyncable explicitly didn't work, either, and
ended up passing `1' to the gzip/pigz argv :x

Finally, we now test --rsyncable on the CLI by adding support
for it in `lei convert' and testing it in t/lei-convert.t
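
Roughly, the compressor argv is now built only from what the user
asked for.  A minimal sketch, assuming a hypothetical `compress_argv'
helper (the real logic is in zsfx2cmd); the `-c' switch is the
write-to-stdout flag mentioned in the MboxReader.pm comments:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tail of zsfx2cmd: $cmd_opt lists the
# boolean switches a compressor understands, $user_opt is what was
# actually given on the lei command-line.
sub compress_argv {
	my ($exe, $cmd_opt, $user_opt) = @_;
	my @cmd = ($exe, '-c'); # -c: write to stdout
	for my $key (sort keys %$cmd_opt) {
		# never pass a switch the user did not ask for
		push @cmd, "--$key" if $user_opt->{$key};
	}
	\@cmd;
}

my $gz = { rsyncable => '' };
print "@{compress_argv('gzip', $gz, {})}\n"; # gzip -c
print "@{compress_argv('gzip', $gz, { rsyncable => 1 })}\n"; # gzip -c --rsyncable
```

Nothing is probed or memoized here, so a gzip swapped out from under
a long-running daemon cannot leave a stale answer behind.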
---
 lib/PublicInbox/LEI.pm        |  3 ++-
 lib/PublicInbox/LeiToMail.pm  |  2 +-
 lib/PublicInbox/MboxReader.pm | 36 ++++++++---------------------------
 t/lei-convert.t               | 27 ++++++++++++++++++++++++--
 4 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 2f8d7a96..beb0f897 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -275,7 +275,8 @@ our %CMD = ( # sorted in order of importance/use:
 	qw(all:s mode=s), @net_opt, @c_opt ],
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
-	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s lock=s@ kw!),
+	qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s lock=s@ kw!
+		rsyncable),
 	@net_opt, @c_opt ],
 'p2q' => [ 'LOCATION_OR_COMMIT...|--stdin',
 	"use a patch to generate a query for `lei q --stdin'",
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a2cd8650..2dddf00b 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -155,7 +155,7 @@ sub reap_compress { # awaitpid callback
 	$lei->fail($?, "@$cmd failed") if $?;
 }
 
-sub _post_augment_mbox { # open a compressor process from top-level process
+sub _post_augment_mbox { # open a compressor process from top-level lei-daemon
 	my ($self, $lei) = @_;
 	my $zsfx = $self->{zsfx} or return;
 	my $cmd = PublicInbox::MboxReader::zsfx2cmd($zsfx, undef, $lei);
diff --git a/lib/PublicInbox/MboxReader.pm b/lib/PublicInbox/MboxReader.pm
index beffabe8..e4209022 100644
--- a/lib/PublicInbox/MboxReader.pm
+++ b/lib/PublicInbox/MboxReader.pm
@@ -1,10 +1,10 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
-# reader for mbox variants we support
+# reader for mbox variants we support (and also sets up commands for writing)
 package PublicInbox::MboxReader;
 use strict;
-use v5.10.1;
+use v5.10.1; # check regexps before v5.12
 use Data::Dumper;
 $Data::Dumper::Useqq = 1; # should've been the default, for bad data
 
@@ -141,10 +141,9 @@ sub reads {
 
 # all of these support -c for stdout and -d for decompression,
 # mutt is commonly distributed with hooks for gz, bz2 and xz, at least
-# { foo => '' } means "--foo" is passed to the command-line,
-# otherwise { foo => '--bar' } passes "--bar"
+# { foo => '' } means "--foo" is passed to the command-line
 my %zsfx2cmd = (
-	gz => [ qw(GZIP pigz gzip) ],
+	gz => [ qw(GZIP pigz gzip), { rsyncable => '' } ],
 	bz2 => [ 'bzip2', {} ],
 	xz => [ 'xz', {} ],
 	# don't add new entries here unless MUA support is widely available
@@ -173,28 +172,9 @@ sub zsfx2cmd ($$$) {
 	}
 	$cmd[0] // die join(' or ', @info)." missing for .$zsfx";
 
-	# not all gzip support --rsyncable, FreeBSD gzip doesn't even exit
-	# with an error code
-	if (!$decompress && $cmd[0] =~ m!/gzip\z! && !defined($cmd_opt)) {
-		pipe(my ($r, $w)) or die "pipe: $!";
-		open my $null, '+>', '/dev/null' or die "open: $!";
-		my $rdr = { 0 => $null, 1 => $null, 2 => $w };
-		my $tst = [ $cmd[0], '--rsyncable' ];
-		my $pid = PublicInbox::Spawn::spawn($tst, undef, $rdr);
-		close $w;
-		my $err = do { local $/; <$r> };
-		waitpid($pid, 0) == $pid or die "BUG: waitpid: $!";
-		$cmd_opt = $err ? {} : { rsyncable => '' };
-		push(@$x, $cmd_opt);
-	}
-	for my $bool (keys %$cmd_opt) {
-		my $switch = $cmd_opt->{$bool} // next;
-		push @cmd, '--'.($switch || $bool);
-	}
-	for my $key (qw(rsyncable)) { # support compression level?
-		my $switch = $cmd_opt->{$key} // next;
-		my $val = $lei->{opt}->{$key} // next;
-		push @cmd, $switch, $val;
+	# only for --rsyncable.  TODO: support compression level?
+	for my $key (keys %$cmd_opt) {
+		push @cmd, '--'.$key if $lei->{opt}->{$key};
 	}
 	\@cmd;
 }
diff --git a/t/lei-convert.t b/t/lei-convert.t
index e1849ff7..115e7ed0 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -1,12 +1,13 @@
 #!perl -w
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10.1; use PublicInbox::TestCommon;
+use v5.12; use PublicInbox::TestCommon;
 use PublicInbox::MboxReader;
 use PublicInbox::MdirReader;
 use PublicInbox::NetReader;
 use PublicInbox::Eml;
 use IO::Uncompress::Gunzip;
+use autodie qw(open);
 require_mods(qw(lei -imapd -nntpd Mail::IMAPClient Net::NNTP));
 my ($tmpdir, $for_destroy) = tmpdir;
 my $sock = tcp_server;
@@ -125,5 +126,27 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	like($md[0], qr/:2,S\z/, "`seen' flag set in Maildir");
 	lei_ok(qw(convert -o mboxrd:/dev/stdout), "$d/md2");
 	like($lei_out, qr/^Status: RO/sm, "`seen' flag preserved");
+
+	SKIP: {
+		my $ok;
+		for my $x (($ENV{GZIP}//''), qw(pigz gzip)) {
+			$x && `$x -h 2>&1` =~ /--rsyncable\b/s or next;
+			$ok = $x;
+			last;
+		}
+		skip 'pigz || gzip do not support --rsyncable' if !$ok;
+		lei_ok qw(convert --rsyncable), "mboxrd:$d/qp.gz",
+			'-o', "mboxcl2:$d/qp2.gz";
+		undef $fh; # necessary to make IO::Uncompress::Gunzip happy
+		open $fh, '<', "$d/qp2.gz";
+		$fh = IO::Uncompress::Gunzip->new($fh, MultiStream => 1);
+		my @tmp;
+		PublicInbox::MboxReader->mboxcl2($fh, sub {
+			my ($eml) = @_;
+			$eml->header_set($_) for qw(Content-Length Lines);
+			push @tmp, $eml;
+		});
+		is_deeply(\@tmp, \@bar, 'read rsyncable-gzipped mboxcl2');
+	}
 });
 done_testing;

^ permalink raw reply related	[relevance 46%]

* [PATCH 3/4] lei: use scalar %SIG assignment
  @ 2023-09-24 21:08 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-24 21:08 UTC (permalink / raw)
  To: meta

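Perl v5.16.3 (and possibly some later versions) complains about
`local @SIG{TERM} = ...' (a one-element hash slice), but newer
versions (v5.32.1) are fine with it.  Assign to the scalar
`$SIG{TERM}' element instead.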
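
For illustration, a minimal sketch of the scalar form in isolation;
`handler_type_in_scope' is a made-up name and nothing here is lei
code:

```perl
use strict;
use warnings;
use POSIX ();

# Made-up helper: install a temporary TERM handler the portable way and
# report what %SIG holds while it is active.
sub handler_type_in_scope {
	# scalar element, not the one-element slice @SIG{TERM}
	local $SIG{TERM} = sub { exit(POSIX::SIGTERM() + 128) };
	ref($SIG{TERM});
}

print handler_type_in_scope(), "\n"; # CODE
```

The previous handler is restored once the enclosing scope is left,
same as with the slice form.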

Fixes: e281363ba937 ("lei: ensure we run DESTROY|END at daemon exit w/ kqueue")
---
 lib/PublicInbox/LEI.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 1ead9bf6..be77fa90 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1381,9 +1381,9 @@ sub lazy_start {
 	PublicInbox::DS::sig_setmask($oldset) if @kq_ign;
 
 	# exit() may trigger waitpid via various DESTROY, ensure interruptible
-	local @SIG{TERM} = sub { exit(POSIX::SIGTERM + 128) };
-	local @SIG{INT} = sub { exit(POSIX::SIGINT + 128) };
-	local @SIG{QUIT} = sub { exit(POSIX::SIGQUIT + 128) };
+	local $SIG{TERM} = sub { exit(POSIX::SIGTERM + 128) };
+	local $SIG{INT} = sub { exit(POSIX::SIGINT + 128) };
+	local $SIG{QUIT} = sub { exit(POSIX::SIGQUIT + 128) };
 	PublicInbox::DS::sig_setmask($oldset) if !@kq_ign;
 	dump_and_clear_log();
 	exit($exit_code // 0);

^ permalink raw reply related	[relevance 71%]

* [PATCH 5/6] lei: fix `-c NAME=VALUE' config support
  2023-09-24  5:42 69% [PATCH 0/6] lei config fixes and improvements Eric Wong
                   ` (2 preceding siblings ...)
  2023-09-24  5:42 66% ` [PATCH 4/6] lei config: send `git config' errors to pager Eric Wong
@ 2023-09-24  5:42 30% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-24  5:42 UTC (permalink / raw)
  To: meta

We can pass `-c NAME=VALUE' args directly to git-config without
needing a temporary directory or file.  Furthermore, this opens
the door to us being able to correctly handle `-c NAME=VALUE'
after `delete $lei->{cfg}' if we need to reload the config
during a command.

This tightens up error-checking for `lei config' and ensures we
can make config settings changes while using `-c NAME=VALUE'
instead of editing the temporary file.

The non-obvious part was avoiding the use of the -f/--file arg for
`git config' for read-only operations and instead relying on
`-c include.path=$ABS_PATH'.  This is done by parsing the
switches to be passed to `git config' to determine if it's a
read-only operation or not.
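
The read-only check can be restated on its own roughly like this.
It is a simplified sketch of the switch parsing in _config, with
`config_argv_writes' being a hypothetical name; it only classifies
argv and never runs git:

```perl
use strict;
use warnings;

# Simplified restatement of the switch parsing in _config:
# returns 1 if @argv looks like a write to the config file, else 0.
sub config_argv_writes {
	my (@argv) = @_;
	my ($set, $get, $nondash) = (0, 0, 0);
	for (@argv) { # order matters for git-config
		if ($nondash) {
			++$nondash; # count everything after the first NAME
		} elsif (/\A--(?:add|rename-section|remove-section|
				replace-all|unset-all|unset)\z/x) {
			++$set;
		} elsif ($_ eq '-l' || $_ eq '--list' || /\A--get/) {
			++$get;
		} elsif (/\A-/) {
			# -z and such: neither read nor write on their own
		} else {
			++$nondash;
		}
	}
	($set || ($nondash > 1 && !$get)) ? 1 : 0;
}

print config_argv_writes(qw(dash.c works)), "\n"; # 1
print config_argv_writes(qw(--get-all imap.debug)), "\n"; # 0
```

Writes keep using `-f $FILE', everything else can go through
`-c NAME=VALUE' without touching a file.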
---
 lib/PublicInbox/Config.pm    | 52 +++++++++++++++-----
 lib/PublicInbox/LEI.pm       | 95 +++++++++++++++++++-----------------
 lib/PublicInbox/LeiConfig.pm | 23 ++++++---
 lib/PublicInbox/LeiMirror.pm |  5 +-
 t/lei.t                      | 17 ++++++-
 5 files changed, 123 insertions(+), 69 deletions(-)

diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm
index f6236d84..533f4a52 100644
--- a/lib/PublicInbox/Config.pm
+++ b/lib/PublicInbox/Config.pm
@@ -22,7 +22,7 @@ sub _array ($) { ref($_[0]) eq 'ARRAY' ? $_[0] : [ $_[0] ] }
 # returns key-value pairs of config directives in a hash
 # if keys may be multi-value, the value is an array ref containing all values
 sub new {
-	my ($class, $file, $errfh) = @_;
+	my ($class, $file, $lei) = @_;
 	$file //= default_file();
 	my $self;
 	my $set_dedupe;
@@ -36,7 +36,7 @@ sub new {
 			$self = $DEDUPE->{$file} and return $self;
 			$set_dedupe = 1;
 		}
-		$self = git_config_dump($class, $file, $errfh);
+		$self = git_config_dump($class, $file, $lei);
 		$self->{'-f'} = $file;
 	}
 	# caches
@@ -174,13 +174,34 @@ sub config_fh_parse ($$$) {
 	\%rv;
 }
 
+sub tmp_cmd_opt ($$) {
+	my ($env, $opt) = @_;
+	# quiet global and system gitconfig if supported by installed git,
+	# but normally harmless if too noisy (NOGLOBAL no longer exists)
+	$env->{GIT_CONFIG_NOSYSTEM} = 1;
+	$env->{GIT_CONFIG_GLOBAL} = '/dev/null'; # git v2.32+
+	$opt->{-C} = '/'; # avoid $worktree/.git/config on MOST systems :P
+}
+
 sub git_config_dump {
-	my ($class, $file, $errfh) = @_;
-	return bless {}, $class unless -e $file;
-	my $cmd = [ qw(git config -z -l --includes), "--file=$file" ];
-	my $fh = popen_rd($cmd, undef, { 2 => $errfh // 2 });
+	my ($class, $file, $lei) = @_;
+	my @opt_c = map { ('-c', $_) } @{$lei->{opt}->{c} // []};
+	$file = undef if !-e $file;
+	# XXX should we set {-f} if !-e $file?
+	return bless {}, $class if (!@opt_c && !defined($file));
+	my %env;
+	my $opt = { 2 => $lei->{2} // 2 };
+	if (@opt_c) {
+		unshift(@opt_c, '-c', "include.path=$file") if defined($file);
+		tmp_cmd_opt(\%env, $opt);
+	}
+	my @cmd = ('git', @opt_c, qw(config -z -l --includes));
+	push(@cmd, '-f', $file) if !@opt_c && defined($file);
+	my $fh = popen_rd(\@cmd, \%env, $opt);
 	my $rv = config_fh_parse($fh, "\0", "\n");
-	close $fh or die "@$cmd failed: \$?=$?\n";
+	close $fh or die "@cmd failed: \$?=$?\n";
+	$rv->{-opt_c} = \@opt_c if @opt_c; # for ->urlmatch
+	$rv->{-f} = $file;
 	bless $rv, $class;
 }
 
@@ -544,14 +565,23 @@ sub _fill_ei ($$) {
 	$es;
 }
 
+sub config_cmd {
+	my ($self, $env, $opt) = @_;
+	my $f = $self->{-f} // default_file();
+	my @opt_c = @{$self->{-opt_c} // []};
+	my @cmd = ('git', @opt_c, 'config');
+	@opt_c ? tmp_cmd_opt($env, $opt) : push(@cmd, '-f', $f);
+	\@cmd;
+}
+
 sub urlmatch {
 	my ($self, $key, $url, $try_git) = @_;
 	state $urlmatch_broken; # requires git 1.8.5
 	return if $urlmatch_broken;
-	my $file = $self->{'-f'} // default_file();
-	my $cmd = [qw/git config -z --includes --get-urlmatch/,
-		"--file=$file", $key, $url ];
-	my $fh = popen_rd($cmd);
+	my (%env, %opt);
+	my $cmd = $self->config_cmd(\%env, \%opt);
+	push @$cmd, qw(-z --includes --get-urlmatch), $key, $url;
+	my $fh = popen_rd($cmd, \%env, \%opt);
 	local $/ = "\0";
 	my $val = <$fh>;
 	if (!close($fh)) {
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 488006e0..8b62def2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -281,7 +281,8 @@ our %CMD = ( # sorted in order of importance/use:
 	"use a patch to generate a query for `lei q --stdin'",
 	qw(stdin| in-format|F=s want|w=s@ uri debug), @net_opt, @c_opt ],
 'config' => [ '[...]', sub {
-		'git-config(1) wrapper for '._config_path($_[0]);
+		'git-config(1) wrapper for '._config_path($_[0]). "\n" .
+	'-l/--list and other common git-config uses are supported'
 	}, qw(config-file|system|global|file|f=s), # for conflict detection
 	 qw(edit|e c=s@ C=s@), pass_through('git config') ],
 'inspect' => [ 'ITEMS...|--stdin', 'inspect lei/store and/or local external',
@@ -456,6 +457,7 @@ my %OPTDESC = (
 'z|0' => 'use NUL \\0 instead of newline (CR) to delimit lines',
 
 'signal|s=s' => [ 'SIG', 'signal to send lei-daemon (default: TERM)' ],
+'edit|e	config' => 'open an editor to modify the lei config file',
 ); # %OPTDESC
 
 my %CONFIG_KEYS = (
@@ -760,36 +762,6 @@ sub optparse ($$$) {
 	$err ? fail($self, "usage: lei $cmd $proto\nE: $err") : 1;
 }
 
-sub _tmp_cfg { # for lei -c <name>=<value> ...
-	my ($self) = @_;
-	my $cfg = _lei_cfg($self, 1);
-	require File::Temp;
-	my $ft = File::Temp->new(TEMPLATE => 'lei_cfg-XXXX', TMPDIR => 1);
-	my $tmp = { '-f' => $ft->filename, -tmp => $ft };
-	$ft->autoflush(1);
-	print $ft <<EOM or return fail($self, "$tmp->{-f}: $!");
-[include]
-	path = $cfg->{-f}
-EOM
-	$tmp = $self->{cfg} = bless { %$cfg, %$tmp }, ref($cfg);
-	for (@{$self->{opt}->{c}}) {
-		/\A([^=\.]+\.[^=]+)(?:=(.*))?\z/ or return fail($self, <<EOM);
-`-c $_' is not of the form -c <name>=<value>'
-EOM
-		my ($name, $value) = ($1, $2 // 1);
-		_config($self, '--add', $name, $value) or return;
-		if (defined(my $v = $tmp->{$name})) {
-			if (ref($v) eq 'ARRAY') {
-				push @$v, $value;
-			} else {
-				$tmp->{$name} = [ $v, $value ];
-			}
-		} else {
-			$tmp->{$name} = $value;
-		}
-	}
-}
-
 sub lazy_cb ($$$) { # $pfx is _complete_ or lei_
 	my ($self, $cmd, $pfx) = @_;
 	my $ucmd = $cmd;
@@ -819,7 +791,6 @@ sub dispatch {
 	}
 	if (my $cb = lazy_cb(__PACKAGE__, $cmd, 'lei_')) {
 		optparse($self, $cmd, \@argv) or return;
-		$self->{opt}->{c} and (_tmp_cfg($self) // return);
 		if (my $chdir = $self->{opt}->{C}) {
 			for my $d (@$chdir) {
 				next if $d eq ''; # same as git(1)
@@ -844,17 +815,20 @@ sub _lei_cfg ($;$) {
 	my $f = _config_path($self);
 	my @st = stat($f);
 	my $cur_st = @st ? pack('dd', $st[10], $st[7]) : ''; # 10:ctime, 7:size
-	my ($sto, $sto_dir, $watches, $lne);
-	if (my $cfg = $PATH2CFG{$f}) { # reuse existing object in common case
-		return ($self->{cfg} = $cfg) if $cur_st eq $cfg->{-st};
+	my ($sto, $sto_dir, $watches, $lne, $cfg);
+	if ($cfg = $PATH2CFG{$f}) { # reuse existing object in common case
+		($cur_st eq $cfg->{-st} && !$self->{opt}->{c}) and
+			return ($self->{cfg} = $cfg);
+		# reuse some fields below if they match:
 		($sto, $sto_dir, $watches, $lne) =
 				@$cfg{qw(-lei_store leistore.dir -watches
 					-lei_note_event)};
 	}
 	if (!@st) {
-		unless ($creat) {
-			delete $self->{cfg};
-			return bless {}, 'PublicInbox::Config';
+		unless ($creat) { # any commands which write to cfg must creat
+			$cfg = PublicInbox::Config->git_config_dump(
+							'/dev/null', $self);
+			return ($self->{cfg} = $cfg);
 		}
 		my ($cfg_dir) = ($f =~ m!(.*?/)[^/]+\z!);
 		File::Path::mkpath($cfg_dir);
@@ -863,9 +837,8 @@ sub _lei_cfg ($;$) {
 		$cur_st = pack('dd', $st[10], $st[7]);
 		qerr($self, "# $f created") if $self->{cmd} ne 'config';
 	}
-	my $cfg = PublicInbox::Config->git_config_dump($f, $self->{2});
+	$cfg = PublicInbox::Config->git_config_dump($f, $self);
 	$cfg->{-st} = $cur_st;
-	$cfg->{'-f'} = $f;
 	if ($sto && canonpath_harder($sto_dir // store_path($self))
 			eq canonpath_harder($cfg->{'leistore.dir'} //
 						store_path($self))) {
@@ -877,7 +850,7 @@ sub _lei_cfg ($;$) {
 		# FIXME: use inotify/EVFILT_VNODE to detect unlinked configs
 		delete(@PATH2CFG{grep(!-f, keys %PATH2CFG)});
 	}
-	$self->{cfg} = $PATH2CFG{$f} = $cfg;
+	$self->{cfg} = $self->{opt}->{c} ? $cfg : ($PATH2CFG{$f} = $cfg);
 	refresh_watches($self);
 	$cfg;
 }
@@ -898,11 +871,41 @@ sub _lei_store ($;$) {
 sub _config {
 	my ($self, @argv) = @_;
 	my $err_ok = ($argv[0] // '') eq '+e' ? shift(@argv) : undef;
-	my %env = (%{$self->{env}}, GIT_CONFIG => undef);
+	my %env;
+	my %opt = map { $_ => $self->{$_} } (0..2);
 	my $cfg = _lei_cfg($self, 1);
-	my $cmd = [ qw(git config -f), $cfg->{'-f'}, @argv ];
-	my %rdr = map { $_ => $self->{$_} } (0..2);
-	waitpid(spawn($cmd, \%env, \%rdr), 0);
+	my $opt_c = delete local $cfg->{-opt_c};
+	my @file_arg;
+	if ($opt_c) {
+		my ($set, $get, $nondash);
+		for (@argv) { # order matters for git-config
+			if (!$nondash) {
+				if (/\A--(?:add|rename-section|remove-section|
+						replace-all|
+						unset-all|unset)\z/x) {
+					++$set;
+				} elsif ($_ eq '-l' || $_ eq '--list' ||
+						/\A--get/) {
+					++$get;
+				} elsif (/\A-/) { # -z and such
+				} else {
+					++$nondash;
+				}
+			} else {
+				++$nondash;
+			}
+		}
+		if ($set || ($nondash//0) > 1 && !$get) {
+			@file_arg = ('-f', $cfg->{-f});
+			$env{GIT_CONFIG} = $file_arg[1];
+		} else { # OK, we can use `-c n=v' for read-only
+			$cfg->{-opt_c} = $opt_c;
+			$env{GIT_CONFIG} = undef;
+		}
+	}
+	my $cmd = $cfg->config_cmd(\%env, \%opt);
+	push @$cmd, @file_arg, @argv;
+	waitpid(spawn($cmd, \%env, \%opt), 0);
 	$? == 0 ? 1 : ($err_ok ? undef : fail($self, $?));
 }
 
@@ -1545,7 +1548,7 @@ sub sto_done_request {
 
 sub cfg_dump ($$) {
 	my ($lei, $f) = @_;
-	my $ret = eval { PublicInbox::Config->git_config_dump($f, $lei->{2}) };
+	my $ret = eval { PublicInbox::Config->git_config_dump($f, $lei) };
 	return $ret if !$@;
 	warn($@);
 	undef;
diff --git a/lib/PublicInbox/LeiConfig.pm b/lib/PublicInbox/LeiConfig.pm
index fd4b0eca..76fc43e7 100644
--- a/lib/PublicInbox/LeiConfig.pm
+++ b/lib/PublicInbox/LeiConfig.pm
@@ -1,8 +1,7 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-package PublicInbox::LeiConfig;
-use strict;
-use v5.10.1;
+package PublicInbox::LeiConfig; # subclassed by LeiEditSearch
+use v5.12;
 use PublicInbox::PktOp;
 use Fcntl qw(SEEK_SET);
 use autodie qw(open seek);
@@ -41,10 +40,18 @@ sub lei_config {
 	my ($lei, @argv) = @_;
 	$lei->{opt}->{'config-file'} and return $lei->fail(
 		"config file switches not supported by `lei config'");
-	return $lei->_config(@argv) unless $lei->{opt}->{edit};
-	my $f = $lei->_lei_cfg(1)->{-f};
-	my $self = bless { lei => $lei, -f => $f }, __PACKAGE__;
-	cfg_do_edit($self);
+	if ($lei->{opt}->{edit}) {
+		@argv and return $lei->fail(
+'--edit must be used without other arguments');
+		$lei->{opt}->{c} and return $lei->fail(
+"`-c $lei->{opt}->{c}->[0]' not allowed with --edit");
+		my $f = $lei->_lei_cfg(1)->{-f};
+		cfg_do_edit(bless { lei => $lei, -f => $f }, __PACKAGE__);
+	} elsif (@argv) { # let git-config do error-checking
+		$lei->_config(@argv);
+	} else {
+		$lei->_help('no options given');
+	}
 }
 
 1;
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index bed034f1..fed6b668 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -193,7 +193,8 @@ sub _write_inbox_config {
 	} elsif (!$!{EEXIST}) {
 		die "open($f): $!";
 	}
-	my $cfg = PublicInbox::Config->git_config_dump($f, $self->{lei}->{2});
+	my $cfg = PublicInbox::Config->git_config_dump($f,
+						{ 2 => $self->{lei}->{2} });
 	my $ibx = $self->{ibx} = {}; # for indexing
 	for my $sec (grep(/\Apublicinbox\./, @{$cfg->{-section_order}})) {
 		for (qw(address newsgroup nntpmirror)) {
@@ -238,7 +239,7 @@ sub index_cloned_inbox {
 		}
 		# force synchronous awaitpid for v2:
 		local $PublicInbox::DS::in_loop = 0;
-		my $cfg = PublicInbox::Config->new(undef, $lei->{2});
+		my $cfg = PublicInbox::Config->new(undef, { 2 => $lei->{2} });
 		my $env = PublicInbox::Admin::index_prepare($opt, $cfg);
 		local %ENV = (%ENV, %$env) if $env;
 		PublicInbox::Admin::progress_prepare($opt, $lei->{2});
diff --git a/t/lei.t b/t/lei.t
index 1199ca75..3ac804a8 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -40,10 +40,21 @@ my $test_help = sub {
 	lei_ok(qw(config -h));
 	like($lei_out, qr! \Q$home\E/\.config/lei/config\b!,
 		'actual path shown in config -h');
+	my $exp_help = qr/\Q$lei_out\E/s;
+	ok(!lei('config'), 'config w/o args fails');
+	like($lei_err, $exp_help, 'config w/o args shows our help in stderr');
 	lei_ok(qw(config -h), { XDG_CONFIG_HOME => '/XDC' },
 		\'config with XDG_CONFIG_HOME');
 	like($lei_out, qr! /XDC/lei/config\b!, 'XDG_CONFIG_HOME in config -h');
 	is($lei_err, '', 'no errors from config -h');
+
+	lei_ok(qw(-c foo.bar config dash.c works));
+	lei_ok(qw(config dash.c));
+	is($lei_out, "works\n", 'config set w/ -c');
+
+	lei_ok(qw(-c foo.bar config --add dash.c add-works));
+	lei_ok(qw(config --get-all dash.c));
+	is($lei_out, "works\nadd-works\n", 'config --add w/ -c');
 };
 
 my $ok_err_info = sub {
@@ -101,9 +112,11 @@ my $test_config = sub {
 	is($lei_out, "tr00\n", "-c string value passed as-is");
 	lei_ok(qw(-c imap.debug=a -c imap.debug=b config --get-all imap.debug));
 	is($lei_out, "a\nb\n", '-c and --get-all work together');
-
-	lei_ok([qw(config -e)], { VISUAL => 'cat', EDITOR => 'cat' });
+	my $env = { VISUAL => 'cat', EDITOR => 'cat' };
+	lei_ok([qw(config -e)], $env);
 	is($lei_out, "[a]\n\tb = c\n", '--edit works');
+	ok(!lei([qw(-c a.b=c config -e)], $env), '-c conflicts with -e');
+	like($lei_err, qr/not allowed/, 'error message shown');
 };
 
 my $test_completion = sub {

^ permalink raw reply related	[relevance 30%]

* [PATCH 2/6] lei view_text: used tied ProcessPipe for `git config'
  2023-09-24  5:42 69% [PATCH 0/6] lei config fixes and improvements Eric Wong
  2023-09-24  5:42 56% ` [PATCH 1/6] lei: check git-config(1) failures Eric Wong
@ 2023-09-24  5:42 71% ` Eric Wong
  2023-09-24  5:42 66% ` [PATCH 4/6] lei config: send `git config' errors to pager Eric Wong
  2023-09-24  5:42 30% ` [PATCH 5/6] lei: fix `-c NAME=VALUE' config support Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-24  5:42 UTC (permalink / raw)
  To: meta

The code exists and is loaded anyways, so we might as well
save an explicit call to waitpid.  Noticed while checking
over our uses of `git config'.
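
The same idea in core Perl, as a hedged sketch: a piped open() stands
in for popen_rd here (an assumption for illustration, the two are not
the same code), and `slurp_cmd' is a hypothetical name.  close() on
such a handle waits for the child and sets $?, so no waitpid call is
needed:

```perl
use strict;
use warnings;

# Core-Perl stand-in for popen_rd + close: run @cmd, slurp its stdout,
# and let close() reap the child so $? is set without waitpid.
sub slurp_cmd {
	my (@cmd) = @_;
	open(my $fh, '-|', @cmd) or die "open(@cmd): $!";
	my $out = do { local $/; <$fh> };
	my $ok = close($fh); # false if the child exited non-zero
	($out, $ok ? 0 : $?);
}

my ($out, $status) = slurp_cmd($^X, '-e', 'print "hi\n"; exit 3');
warn "# command failed (non-fatal \$?=$status)\n" if $status;
```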
---
 lib/PublicInbox/LeiViewText.pm | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiViewText.pm b/lib/PublicInbox/LeiViewText.pm
index 53555467..70441867 100644
--- a/lib/PublicInbox/LeiViewText.pm
+++ b/lib/PublicInbox/LeiViewText.pm
@@ -72,12 +72,11 @@ sub new {
 	my $self = bless { %{$lei->{opt}}, -colored => \&uncolored }, $cls;
 	$self->{-quote_reply} = 1 if $fmt eq 'reply';
 	return $self unless $self->{color} //= -t $lei->{1};
-	my $cmd = [ qw(git config -z --includes -l) ];
-	my ($r, $pid) = popen_rd($cmd, undef, { 2 => $lei->{2} });
+	my @cmd = qw(git config -z --includes -l); # reuse normal git config
+	my $r = popen_rd(\@cmd, undef, { 2 => $lei->{2} });
 	my $cfg = PublicInbox::Config::config_fh_parse($r, "\0", "\n");
-	waitpid($pid, 0);
-	if ($?) {
-		warn "# git-config failed, no color (non-fatal)\n";
+	if (!close($r)) {
+		warn "# @cmd failed, no color (non-fatal \$?=$?)\n";
 		return $self;
 	}
 	$self->{-colored} = \&my_colored;

^ permalink raw reply related	[relevance 71%]

* [PATCH 0/6] lei config fixes and improvements
@ 2023-09-24  5:42 69% Eric Wong
  2023-09-24  5:42 56% ` [PATCH 1/6] lei: check git-config(1) failures Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2023-09-24  5:42 UTC (permalink / raw)
  To: meta

Fixing `-c NAME=VALUE' was something I noticed working on
improving *BSD support the other week.  Everything else
here was noticed while fixing -c.  And 6/6 is because I'm
annoyed at seeing test-only code in Config.pm.

Eric Wong (6):
  lei: check git-config(1) failures
  lei view_text: used tied ProcessPipe for `git config'
  config: handle key-only entries as booleans
  lei config: send `git config' errors to pager
  lei: fix `-c NAME=VALUE' config support
  config: drop scalar ref support from internal API

 lib/PublicInbox/Config.pm            |  78 ++++++++++------
 lib/PublicInbox/LEI.pm               | 102 +++++++++++----------
 lib/PublicInbox/LeiAddWatch.pm       |   7 +-
 lib/PublicInbox/LeiConfig.pm         |  35 +++++---
 lib/PublicInbox/LeiForgetExternal.pm |   3 +-
 lib/PublicInbox/LeiInit.pm           |   4 +-
 lib/PublicInbox/LeiMirror.pm         |   5 +-
 lib/PublicInbox/LeiRmWatch.pm        |   2 +-
 lib/PublicInbox/LeiViewText.pm       |   9 +-
 lib/PublicInbox/TestCommon.pm        |  20 +++--
 t/config.t                           | 128 +++++++++++++++------------
 t/config_limiter.t                   |  31 +++----
 t/inbox_idle.t                       |  15 ++--
 t/lei.t                              |  17 +++-
 t/psgi_bad_mids.t                    |  18 ++--
 t/psgi_mount.t                       |  14 ++-
 t/psgi_multipart_not.t               |  16 ++--
 t/psgi_scan_all.t                    |  18 ++--
 t/psgi_search.t                      |  12 ++-
 t/psgi_text.t                        |  21 ++---
 t/watch_filter_rubylang.t            |  30 +++----
 t/watch_imap.t                       |  20 +++--
 t/watch_maildir.t                    |  24 ++---
 t/watch_maildir_v2.t                 |  44 ++++-----
 t/watch_multiple_headers.t           |  21 +++--
 25 files changed, 381 insertions(+), 313 deletions(-)


^ permalink raw reply	[relevance 69%]

* [PATCH 4/6] lei config: send `git config' errors to pager
  2023-09-24  5:42 69% [PATCH 0/6] lei config fixes and improvements Eric Wong
  2023-09-24  5:42 56% ` [PATCH 1/6] lei: check git-config(1) failures Eric Wong
  2023-09-24  5:42 71% ` [PATCH 2/6] lei view_text: used tied ProcessPipe for `git config' Eric Wong
@ 2023-09-24  5:42 66% ` Eric Wong
  2023-09-24  5:42 30% ` [PATCH 5/6] lei: fix `-c NAME=VALUE' config support Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-24  5:42 UTC (permalink / raw)
  To: meta

Our previous use of lei->cfg_dump was wrong as the extra arg was
never supported.  Instead, we need to capture the output of
`git config' and send it to the pager if ->cfg_dump fails.  We'll
also add a note to the user to quit the pager to continue.
---
 lib/PublicInbox/LEI.pm       |  2 +-
 lib/PublicInbox/LeiConfig.pm | 12 ++++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)
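
A minimal sketch of the capture-and-replay idea, assuming a
hypothetical dump_cfg() in place of ->cfg_dump and a plain hash in
place of the lei object.  Errors go to an anonymous temporary file
which is read back only when the dump fails:

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# hypothetical stand-in for ->cfg_dump: complains on $ctx->{2}
sub dump_cfg {
	my ($ctx, $ok) = @_;
	return { ok => 1 } if $ok;
	print { $ctx->{2} } "fatal: bad config line 1\n";
	undef;
}

sub try_dump {
	my ($ctx, $ok) = @_;
	# `undef' as the filename gives an anonymous temporary file
	open(my $fh, '+>', undef) or die "open: $!";
	my $cfg = do {
		local $ctx->{2} = $fh; # restored when the block exits
		dump_cfg($ctx, $ok);
	};
	return ($cfg, '') if $cfg;
	seek($fh, 0, SEEK_SET) or die "seek: $!";
	my $err = do { local $/; <$fh> }; # slurp captured errors
	(undef, $err);
}

my $ctx = { 2 => \*STDERR };
my ($cfg, $err) = try_dump($ctx, 0);
print "captured: $err";
```

The captured text can then be handed to whatever displays errors
(a pager in the patch below) instead of scrolling past unnoticed.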

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index a6d92eec..488006e0 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1098,7 +1098,7 @@ sub pgr_err {
 	my ($self, @msg) = @_;
 	return warn(@msg) unless $self->{sock} && -t $self->{2};
 	start_pager($self, { LESS => 'RX' }); # no 'F' so we prompt
-	print { $self->{2} } @msg;
+	say { $self->{2} } @msg, '# -quit pager to continue-';
 	$self->{2}->autoflush(1);
 	stop_pager($self);
 	send($self->{sock}, 'wait', 0); # wait for user to quit pager
diff --git a/lib/PublicInbox/LeiConfig.pm b/lib/PublicInbox/LeiConfig.pm
index 23be9aaf..fd4b0eca 100644
--- a/lib/PublicInbox/LeiConfig.pm
+++ b/lib/PublicInbox/LeiConfig.pm
@@ -4,6 +4,8 @@ package PublicInbox::LeiConfig;
 use strict;
 use v5.10.1;
 use PublicInbox::PktOp;
+use Fcntl qw(SEEK_SET);
+use autodie qw(open seek);
 
 sub cfg_do_edit ($;$) {
 	my ($self, $reason) = @_;
@@ -22,8 +24,14 @@ sub cfg_do_edit ($;$) {
 sub cfg_edit_done { # PktOp
 	my ($self) = @_;
 	eval {
-		my $cfg = $self->{lei}->cfg_dump($self->{-f}, $self->{lei}->{2})
-			// return cfg_do_edit($self, "\n");
+		open my $fh, '+>', undef or die "open($!)";
+		my $cfg = do {
+			local $self->{lei}->{2} = $fh;
+			$self->{lei}->cfg_dump($self->{-f});
+		} or do {
+			seek($fh, 0, SEEK_SET);
+			return cfg_do_edit($self, do { local $/; <$fh> });
+		};
 		$self->cfg_verify($cfg) if $self->can('cfg_verify');
 	};
 	$self->{lei}->fail($@) if $@;

^ permalink raw reply related	[relevance 66%]

* [PATCH 1/6] lei: check git-config(1) failures
  2023-09-24  5:42 69% [PATCH 0/6] lei config fixes and improvements Eric Wong
@ 2023-09-24  5:42 56% ` Eric Wong
  2023-09-24  5:42 71% ` [PATCH 2/6] lei view_text: used tied ProcessPipe for `git config' Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-24  5:42 UTC (permalink / raw)
  To: meta

2020-2021 were bad times and I somehow got deluded into
believing git-config(1) would always succeed :x
---
 lib/PublicInbox/LEI.pm               | 9 ++++++---
 lib/PublicInbox/LeiAddWatch.pm       | 7 ++++---
 lib/PublicInbox/LeiForgetExternal.pm | 3 +--
 lib/PublicInbox/LeiInit.pm           | 4 ++--
 lib/PublicInbox/LeiRmWatch.pm        | 2 +-
 5 files changed, 14 insertions(+), 11 deletions(-)
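
The calling convention can be sketched outside of lei, assuming a
hypothetical run_checked() in place of ->_config and a plain array
in place of ->fail.  A child perl ($^X) stands in for git-config(1):

```perl
use strict;
use warnings;

my @failed; # stand-in for ->fail, records the wait status

# returns true on success, undef on failure;
# a leading `+e' means the caller deals with the error itself
sub run_checked {
	my (@argv) = @_;
	my $err_ok = ($argv[0] // '') eq '+e' ? shift(@argv) : undef;
	system(@argv); # sets $?
	return 1 if $? == 0;
	push @failed, $? unless $err_ok;
	undef;
}

run_checked($^X, '-e', 'exit 0') or die 'unexpected failure';
if (!run_checked('+e', $^X, '-e', 'exit 5')) {
	# $? is still available for the caller to inspect
	print 'exit status: ', $? >> 8, "\n";
}
```

Callers which cannot continue just append `or return', while
callers which want to look at $? themselves pass `+e' first.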

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 368f9357..a6d92eec 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -776,9 +776,8 @@ EOM
 		/\A([^=\.]+\.[^=]+)(?:=(.*))?\z/ or return fail($self, <<EOM);
 `-c $_' is not of the form -c <name>=<value>'
 EOM
-		my $name = $1;
-		my $value = $2 // 1;
-		_config($self, '--add', $name, $value);
+		my ($name, $value) = ($1, $2 // 1);
+		_config($self, '--add', $name, $value) or return;
 		if (defined(my $v = $tmp->{$name})) {
 			if (ref($v) eq 'ARRAY') {
 				push @$v, $value;
@@ -894,13 +893,17 @@ sub _lei_store ($;$) {
 	};
 }
 
+# returns true on success, undef on failure
+# argv[0] eq `+e' means errors do not ->fail # (like `sh +e')
 sub _config {
 	my ($self, @argv) = @_;
+	my $err_ok = ($argv[0] // '') eq '+e' ? shift(@argv) : undef;
 	my %env = (%{$self->{env}}, GIT_CONFIG => undef);
 	my $cfg = _lei_cfg($self, 1);
 	my $cmd = [ qw(git config -f), $cfg->{'-f'}, @argv ];
 	my %rdr = map { $_ => $self->{$_} } (0..2);
 	waitpid(spawn($cmd, \%env, \%rdr), 0);
+	$? == 0 ? 1 : ($err_ok ? undef : fail($self, $?));
 }
 
 sub lei_daemon_pid { puts shift, $$ }
diff --git a/lib/PublicInbox/LeiAddWatch.pm b/lib/PublicInbox/LeiAddWatch.pm
index 97e7a342..f61e2de4 100644
--- a/lib/PublicInbox/LeiAddWatch.pm
+++ b/lib/PublicInbox/LeiAddWatch.pm
@@ -26,13 +26,14 @@ sub lei_add_watch {
 	for my $w (@{$self->{inputs}}) {
 		# clobber existing, allow multiple
 		if (defined($vmd0)) {
-			$lei->_config("watch.$w.vmd", '--replace-all', $vmd0);
+			$lei->_config("watch.$w.vmd", '--replace-all', $vmd0)
+				or return;
 			for my $v (@vmd) {
-				$lei->_config("watch.$w.vmd", $v);
+				$lei->_config("watch.$w.vmd", $v) or return;
 			}
 		}
 		next if defined $cfg->{"watch.$w.state"};
-		$lei->_config("watch.$w.state", $state);
+		$lei->_config("watch.$w.state", $state) or return;
 	}
 	$lei->_lei_store(1); # create
 	$lei->lms(1)->lms_write_prepare->add_folders(@{$self->{inputs}});
diff --git a/lib/PublicInbox/LeiForgetExternal.pm b/lib/PublicInbox/LeiForgetExternal.pm
index 39bfc60b..c8d1df38 100644
--- a/lib/PublicInbox/LeiForgetExternal.pm
+++ b/lib/PublicInbox/LeiForgetExternal.pm
@@ -16,8 +16,7 @@ sub lei_forget_external {
 			next if $seen{$l}++;
 			my $key = "external.$l.boost";
 			delete($cfg->{$key});
-			$lei->_config('--unset', $key);
-			if ($? == 0) {
+			if ($lei->_config('+e', '--unset', $key)) {
 				$lei->qerr("# $l forgotten ");
 			} elsif (($? >> 8) == 5) {
 				warn("# $l not found\n");
diff --git a/lib/PublicInbox/LeiInit.pm b/lib/PublicInbox/LeiInit.pm
index 27ce8169..94897e61 100644
--- a/lib/PublicInbox/LeiInit.pm
+++ b/lib/PublicInbox/LeiInit.pm
@@ -23,7 +23,7 @@ sub lei_init {
 
 		# some folks like symlinks and bind mounts :P
 		if (@dir && "@cur[1,0]" eq "@dir[1,0]") {
-			$self->_config('leistore.dir', $dir);
+			$self->_config('leistore.dir', $dir) or return;
 			$self->_lei_store(1)->done;
 			return $self->qerr("$exists (as $cur)");
 		}
@@ -31,7 +31,7 @@ sub lei_init {
 E: leistore.dir=$cur already initialized and it is not $dir
 
 	}
-	$self->_config('leistore.dir', $dir);
+	$self->_config('leistore.dir', $dir) or return;
 	$self->_lei_store(1)->done;
 	$exists //= "# leistore.dir=$dir newly initialized";
 	$self->qerr($exists);
diff --git a/lib/PublicInbox/LeiRmWatch.pm b/lib/PublicInbox/LeiRmWatch.pm
index c0f336f0..19bee3ab 100644
--- a/lib/PublicInbox/LeiRmWatch.pm
+++ b/lib/PublicInbox/LeiRmWatch.pm
@@ -14,7 +14,7 @@ sub lei_rm_watch {
 	my $self = bless { missing_ok => 1 }, __PACKAGE__;
 	$self->prepare_inputs($lei, \@argv) or return;
 	for my $w (@{$self->{inputs}}) {
-		$lei->_config('--remove-section', "watch.$w");
+		$lei->_config('--remove-section', "watch.$w") or return;
 	}
 	delete $lei->{cfg}; # force reload
 	$lei->refresh_watches;

^ permalink raw reply related	[relevance 56%]

* [PATCH 1/4] lei blob|rediff: fix usage of lei->fail
  2023-09-22 21:13 71% [PATCH 0/4] small lei fixes Eric Wong
@ 2023-09-22 21:13 90% ` Eric Wong
  2023-09-22 21:13 66% ` [PATCH 2/4] lei: improve ->fail internal API Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-22 21:13 UTC (permalink / raw)
  To: meta

lei->fail only takes one message argument, presently;
but it's probably a good idea to change the API...
---
 lib/PublicInbox/LeiBlob.pm   | 2 +-
 lib/PublicInbox/LeiRediff.pm | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
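
To illustrate what went wrong, here is a sketch with a hypothetical
old_fail() which only mimics the old ($msg, $exit_code) argument
order; the URLs are made up:

```perl
use strict;
use warnings;

# stand-in with the old ($msg, $exit_code) argument order
sub old_fail {
	my ($msg, $exit_code) = @_;
	($msg, $exit_code // 1);
}

my @remotes = qw(https://example.com/a https://example.com/b);

# the first remote lands in the exit code slot, the rest is dropped
my ($msg, $code) = old_fail('curl needed for', @remotes);
print "msg=<$msg> code=<$code>\n";

# building one string up front keeps every remote in the message
($msg, $code) = old_fail('curl needed for '.join(', ', @remotes));
print "msg=<$msg> code=<$code>\n";
```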

diff --git a/lib/PublicInbox/LeiBlob.pm b/lib/PublicInbox/LeiBlob.pm
index 1692289c..5fc6d902 100644
--- a/lib/PublicInbox/LeiBlob.pm
+++ b/lib/PublicInbox/LeiBlob.pm
@@ -158,7 +158,7 @@ sub lei_blob {
 	if ($lxs->remotes) {
 		require PublicInbox::LeiRemote;
 		$lei->{curl} //= which('curl') or return
-			$lei->fail('curl needed for', $lxs->remotes);
+			$lei->fail('curl needed for '.join(', ',$lxs->remotes));
 		$lei->_lei_store(1)->write_prepare($lei);
 	}
 	require PublicInbox::SolverGit;
diff --git a/lib/PublicInbox/LeiRediff.pm b/lib/PublicInbox/LeiRediff.pm
index c312d90f..9cf95c08 100644
--- a/lib/PublicInbox/LeiRediff.pm
+++ b/lib/PublicInbox/LeiRediff.pm
@@ -268,7 +268,7 @@ sub lei_rediff {
 	if ($lxs->remotes) {
 		require PublicInbox::LeiRemote;
 		$lei->{curl} //= which('curl') or return
-			$lei->fail('curl needed for', $lxs->remotes);
+			$lei->fail('curl needed for '.join(', ',$lxs->remotes));
 	}
 	$lei->ale->refresh_externals($lxs, $lei);
 	my $self = bless {

^ permalink raw reply related	[relevance 90%]

* [PATCH 0/4] small lei fixes
@ 2023-09-22 21:13 71% Eric Wong
  2023-09-22 21:13 90% ` [PATCH 1/4] lei blob|rediff: fix usage of lei->fail Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Eric Wong @ 2023-09-22 21:13 UTC (permalink / raw)
  To: meta

Only noticed while working on bigger fixes...

Eric Wong (4):
  lei blob|rediff: fix usage of lei->fail
  lei: improve ->fail internal API
  lei_to_mail: drop awkward duplication of $lei object
  lei: use File::Temp for listing saved searches

 lib/PublicInbox/LEI.pm            | 19 ++++++++++++-------
 lib/PublicInbox/LeiBlob.pm        |  2 +-
 lib/PublicInbox/LeiRediff.pm      |  2 +-
 lib/PublicInbox/LeiSavedSearch.pm | 20 +++++++++-----------
 lib/PublicInbox/LeiToMail.pm      | 12 +++++-------
 5 files changed, 28 insertions(+), 27 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 3/4] lei_to_mail: drop awkward duplication of $lei object
  2023-09-22 21:13 71% [PATCH 0/4] small lei fixes Eric Wong
  2023-09-22 21:13 90% ` [PATCH 1/4] lei blob|rediff: fix usage of lei->fail Eric Wong
  2023-09-22 21:13 66% ` [PATCH 2/4] lei: improve ->fail internal API Eric Wong
@ 2023-09-22 21:13 70% ` Eric Wong
  2023-09-22 21:13 65% ` [PATCH 4/4] lei: use File::Temp for listing saved searches Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-22 21:13 UTC (permalink / raw)
  To: meta

Our awaitpid API now exists and ProcessPipe uses it, so it's
immune to cyclic references.  Thus there's no need to create
a duplicate of the lei object to prevent leaks.
---
 lib/PublicInbox/LeiToMail.pm | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 7c7967c8..4adcc33e 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -151,10 +151,9 @@ sub git_to_mail { # git->cat_async callback
 }
 
 sub reap_compress { # awaitpid callback
-	my ($pid, $lei) = @_;
-	my $cmd = delete $lei->{"pid.$pid"};
-	return if $? == 0;
-	$lei->fail($?, "@$cmd failed");
+	my ($pid, $lei, $cmd, $old_out) = @_;
+	$lei->{1} = $old_out;
+	$lei->fail($?, "@$cmd failed") if $?;
 }
 
 sub _post_augment_mbox { # open a compressor process from top-level process
@@ -165,9 +164,8 @@ sub _post_augment_mbox { # open a compressor process from top-level process
 	my $rdr = { 0 => $r, 1 => $lei->{1}, 2 => $lei->{2}, pgid => 0 };
 	my $pid = spawn($cmd, undef, $rdr);
 	my $pp = gensym;
-	my $dup = bless { "pid.$pid" => $cmd }, ref($lei);
-	$dup->{$_} = $lei->{$_} for qw(2 sock);
-	tie *$pp, 'PublicInbox::ProcessPipe', $pid, $w, \&reap_compress, $dup;
+	tie *$pp, 'PublicInbox::ProcessPipe', $pid, $w,
+			\&reap_compress, $lei, $cmd, $lei->{1};
 	$lei->{1} = $pp;
 }
 

^ permalink raw reply related	[relevance 70%]

* [PATCH 2/4] lei: improve ->fail internal API
  2023-09-22 21:13 71% [PATCH 0/4] small lei fixes Eric Wong
  2023-09-22 21:13 90% ` [PATCH 1/4] lei blob|rediff: fix usage of lei->fail Eric Wong
@ 2023-09-22 21:13 66% ` Eric Wong
  2023-09-22 21:13 70% ` [PATCH 3/4] lei_to_mail: drop awkward duplication of $lei object Eric Wong
  2023-09-22 21:13 65% ` [PATCH 4/4] lei: use File::Temp for listing saved searches Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-22 21:13 UTC (permalink / raw)
  To: meta

Allow the exit code to be the first argument instead of the last
to match our ->child_error, as well as the BSD err(3) API.
We'll also avoid shifting user-passed exit codes so $? can be
passed as-is without losing signal information.
---
 lib/PublicInbox/LEI.pm       | 19 ++++++++++++-------
 lib/PublicInbox/LeiToMail.pm |  2 +-
 2 files changed, 13 insertions(+), 8 deletions(-)
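
A sketch of the argument handling only, assuming a hypothetical
fail_args() which returns what would be reported instead of
exiting.  It shows why passing $? unshifted matters for a child
killed by a signal:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# returns (wait status to exit with, text to warn about)
sub fail_args {
	my (@msg) = @_;
	my $code = looks_like_number($msg[0]) ? shift(@msg) : undef;
	push @msg, "\n" if @msg && substr($msg[-1], -1, 1) ne "\n";
	($code // (1 << 8), join('', @msg));
}

# a child killed by SIGTERM (15) has the signal in the low 7 bits
my $status = 15;
my ($code, $text) = fail_args($status, 'xz failed');
printf "signal=%d exit=%d %s", $code & 127, $code >> 8, $text;

# shifting before the call, as old callers did, discards the signal
my $shifted = $status >> 8;
print "after >> 8: $shifted\n";

# without a leading number, the default is exit status 1
my ($dflt) = fail_args('something went wrong');
print 'default exit: ', $dflt >> 8, "\n";
```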

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index c61ce76d..368f9357 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -29,6 +29,7 @@ use File::Path ();
 use File::Spec;
 use Carp ();
 use Sys::Syslog qw(openlog syslog closelog);
+use Scalar::Util qw(looks_like_number);
 our $quit = \&CORE::exit;
 our ($current_lei, $errors_log, $listener, $oldset, $dir_idle);
 my $GLP = Getopt::Long::Parser->new;
@@ -518,13 +519,17 @@ sub sigpipe_handler { # handles SIGPIPE from @WQ_KEYS workers
 	fail_handler($_[0], 13, delete $_[0]->{1});
 }
 
-sub fail ($$;$) {
-	my ($self, $msg, $exit_code) = @_;
-	local $current_lei = $self;
-	$self->{failed}++;
-	warn(substr($msg, -1, 1) eq "\n" ? $msg : "$msg\n") if defined $msg;
-	$self->{pkt_op_p}->pkt_do('fail_handler') if $self->{pkt_op_p};
-	x_it($self, ($exit_code // 1) << 8);
+sub fail ($;@) {
+	my ($lei, @msg) = @_;
+	my $exit_code = looks_like_number($msg[0]) ? shift(@msg) : undef;
+	local $current_lei = $lei;
+	$lei->{failed}++;
+	if (@msg) {
+		push @msg, "\n" if substr($msg[-1], -1, 1) ne "\n";
+		warn @msg;
+	}
+	$lei->{pkt_op_p}->pkt_do('fail_handler') if $lei->{pkt_op_p};
+	x_it($lei, $exit_code // (1 << 8));
 	undef;
 }
 
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index e357ee00..7c7967c8 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -154,7 +154,7 @@ sub reap_compress { # awaitpid callback
 	my ($pid, $lei) = @_;
 	my $cmd = delete $lei->{"pid.$pid"};
 	return if $? == 0;
-	$lei->fail("@$cmd failed", $? >> 8);
+	$lei->fail($?, "@$cmd failed");
 }
 
 sub _post_augment_mbox { # open a compressor process from top-level process

^ permalink raw reply related	[relevance 66%]

* [PATCH 4/4] lei: use File::Temp for listing saved searches
  2023-09-22 21:13 71% [PATCH 0/4] small lei fixes Eric Wong
                   ` (2 preceding siblings ...)
  2023-09-22 21:13 70% ` [PATCH 3/4] lei_to_mail: drop awkward duplication of $lei object Eric Wong
@ 2023-09-22 21:13 65% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-22 21:13 UTC (permalink / raw)
  To: meta

I have no idea how badly my brain malfunctioned here when I
wrote the code to create a temporary file without O_EXCL :x
I'm still not sure if users have enough saved searches for
justifying a cache, here.
---
 lib/PublicInbox/LeiSavedSearch.pm | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)
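
For reference, a minimal sketch of the File::Temp object interface
relied on here; the template name and config contents are made up.
File::Temp creates the file exclusively and unlinks it when the
object goes away:

```perl
use strict;
use warnings;
use File::Temp ();

my $fh = File::Temp->new(TEMPLATE => 'example-XXXX', TMPDIR => 1);
print $fh "[include]\n\tpath = /nonexistent/example\n";
$fh->flush or die "flush: $!"; # make contents visible to readers
my $path = $fh->filename;

# anything given the path can read it while $fh is alive
open(my $rd, '<', $path) or die "open $path: $!";
my $content = do { local $/; <$rd> };
close $rd;

undef $fh; # destruction unlinks the file
print -e $path ? "still there\n" : "removed\n";
```

There is no need for an explicit unlink, and a failure between
creation and cleanup no longer leaves a stale file behind once the
object is destroyed.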

diff --git a/lib/PublicInbox/LeiSavedSearch.pm b/lib/PublicInbox/LeiSavedSearch.pm
index e5396342..2811c46d 100644
--- a/lib/PublicInbox/LeiSavedSearch.pm
+++ b/lib/PublicInbox/LeiSavedSearch.pm
@@ -3,8 +3,7 @@
 
 # pretends to be like LeiDedupe and also PublicInbox::Inbox
 package PublicInbox::LeiSavedSearch;
-use strict;
-use v5.10.1;
+use v5.12;
 use parent qw(PublicInbox::Lock);
 use PublicInbox::Git;
 use PublicInbox::OverIdx;
@@ -14,6 +13,8 @@ use PublicInbox::Spawn qw(run_die);
 use PublicInbox::ContentHash qw(git_sha);
 use PublicInbox::MID qw(mids_for_index);
 use PublicInbox::SHA qw(sha256_hex);
+use File::Temp ();
+use IO::Handle ();
 our $LOCAL_PFX = qr!\A(?:maildir|mh|mbox.+|mmdf|v2):!i; # TODO: put in LeiToMail?
 
 # move this to PublicInbox::Config if other things use it:
@@ -76,20 +77,17 @@ sub list {
 	my $lss_dir = $lei->share_path.'/saved-searches';
 	return () unless -d $lss_dir;
 	# TODO: persist the cache?  Use another format?
-	my $f = $lei->cache_dir."/saved-tmp.$$.".time.'.config';
-	open my $fh, '>', $f or die "open $f: $!";
+	my $fh = File::Temp->new(TEMPLATE => 'lss_list-XXXX', TMPDIR => 1) or
+		die "File::Temp->new: $!";
 	print $fh "[include]\n";
 	for my $p (glob("$lss_dir/*/lei.saved-search")) {
 		print $fh "\tpath = ", cquote_val($p), "\n";
 	}
-	close $fh or die "close $f: $!";
-	my $cfg = $lei->cfg_dump($f);
-	unlink($f);
+	$fh->flush or die "flush: $fh";
+	my $cfg = $lei->cfg_dump($fh->filename);
 	my $out = $cfg ? $cfg->get_all('lei.q.output') : [];
-	map {;
-		s!$LOCAL_PFX!!;
-		$_;
-	} @$out
+	s!$LOCAL_PFX!! for @$out;;
+	@$out;
 }
 
 sub translate_dedupe ($$) {

^ permalink raw reply related	[relevance 65%]

* lei interactive TUIs (ncurses/vim/emacs)
@ 2023-09-22 20:33 69% Eric Wong
  2023-11-09  4:14 71% ` Kyle Meyer
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-09-22 20:33 UTC (permalink / raw)
  To: meta

I hardly have any experience in this area; but automatic
end-to-end tests for ncurses and other TUI stuff seems
like a huge pain...

I've also noticed vim has scripting abilities (like Emacs?) and
notmuch bundles a vim extension we can take inspiration from.
Perhaps we could bundle vim and Emacs extensions for lei, too...

While I use vim[1], I've always kept my vim decoupled from Perl
(or Lua, Python, Ruby, TCL, etc) and rather tie stuff together
with pipes.  IOW, I'm happy my editor can run arbitrary shell
commands; but don't want stuff linked into my editor (since more
code is usually more fragile).

So, any thoughts on this matter?

Anybody willing to maintain an lei TUI for emacs?
I might give a vim TUI a shot...

But writing and testing a FUSE FS is much more natural...


[1] My choice to use vi/vim is merely because it's the most
    widely-installed editor on random systems I ssh into.
    It also mostly works the same after a few decades w/o
    needing UI changes; same goes for Perl.

^ permalink raw reply	[relevance 69%]

* [PATCH] t/lei-mirror: avoid make(1) jobserver warning
@ 2023-09-22 18:37 66% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-22 18:37 UTC (permalink / raw)
  To: meta

We can't control `make test' nor user-defined targets in
config.mak.  There's no need for a jobserver to run `make help',
anyways, so just let things be.

This also fixes the use of `gmake check' et al. on *BSDs where
various make flags confuse BSD make(1).

While we're at it, allow the test to run in the odd case make(1)
isn't available at all...
---
 t/lei-mirror.t | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
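
The effect of `delete local' on %ENV can be checked in isolation.
This sketch assumes nothing from the test suite; the helper name is
made up and a child perl ($^X) reports what it inherited:

```perl
use v5.12;
use warnings;

# ask a child perl whether it inherited MAKEFLAGS
sub child_sees_makeflags {
	open(my $p, '-|', $^X, '-e', 'print exists $ENV{MAKEFLAGS} ? 1 : 0')
		or die "popen: $!";
	my $v = <$p>;
	close($p) or die "child failed: \$?=$?";
	$v;
}

$ENV{MAKEFLAGS} = '-j4'; # pretend we were started by `make -j4'
my $inside;
{
	# removed for this block only, restored on scope exit
	delete local @ENV{qw(MFLAGS MAKEFLAGS MAKELEVEL)};
	$inside = child_sees_makeflags();
}
my $outside = child_sees_makeflags();
say "inside=$inside outside=$outside";
```

Variables which were never set (MFLAGS and MAKELEVEL here) stay
unset after the block.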

diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 9b5d73ec..08961491 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -1,7 +1,7 @@
 #!perl -w
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10.1; use PublicInbox::TestCommon;
+use v5.12; use PublicInbox::TestCommon;
 use PublicInbox::Inbox;
 require_mods(qw(-httpd lei DBD::SQLite));
 require_cmd('curl');
@@ -26,11 +26,14 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	is(PublicInbox::Git::try_cat("$t1/description"),
 		"mirror of $http/t1/\n", 'description set');
 	ok(-f "$t1/Makefile", 'convenience Makefile added (v1)');
-	my $make = which('make');
-	is(xsys([$make, 'help'], undef, { -C => $t1, 1 => \(my $help) }), 0,
-		'make help');
+	SKIP: {
+		my $make = require_cmd('make', 1);
+		delete local @ENV{qw(MFLAGS MAKEFLAGS MAKELEVEL)};
+		is(xsys([$make, 'help'], undef, { -C => $t1, 1 => \(my $help) }),
+			0, "$make handled Makefile without errors");
+		isnt($help, '', 'make help worked');
+	}
 	ok(-f "$t1/inbox.config.example", 'inbox.config.example downloaded');
-	isnt($help, '', 'make help worked');
 	is((stat(_))[9], $created{v1},
 		'inbox.config.example mtime is ->created_at');
 	is((stat(_))[2] & 0222, 0, 'inbox.config.example not writable');

^ permalink raw reply related	[relevance 66%]

* Re: [PATCH 03/15] t/lei-p2q: extra diagnostics
  @ 2023-09-21 10:23 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-21 10:23 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> I got one mysterious test failure here, once, and can't seem
> to reproduce it...

nearly 2 years later: I've hit it again, but once again cannot
reproduce it...

> +++ b/t/lei-p2q.t
> @@ -7,7 +7,7 @@ require_mods(qw(json DBD::SQLite Search::Xapian));
>  
>  test_lei(sub {
>  	ok(!lei(qw(p2q this-better-cause-format-patch-to-fail)),
> -		'p2q fails on bogus arg');
> +		'p2q fails on bogus arg') or diag $lei_err;
>  	like($lei_err, qr/format-patch.*failed/, 'notes format-patch failure');

$? does get printed (32768, so (>> 8) means 128), as it should be.
So I wonder if there's a place we drop the socket prematurely or
something else is amiss...

On a side note, lei's internal IPC could probably be simplified
a bit w/o sacrificing parallelism.

^ permalink raw reply	[relevance 71%]

* Re: RFC: lei searches managed by users in git
  2023-09-15 21:08 67% RFC: lei searches managed by users in git Konstantin Ryabitsev
@ 2023-09-15 22:47 55% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-15 22:47 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Hello:
> 
> I am curious what is the best approach to have a centrally managed set of lei
> searches, for example via config files tracked in git. For example, the file
> could look like this:

I don't have one nor have I thought about it until now...

I wonder if `lei edit-search --all' could be a thing...
As would adding gitconfig output to ls-search
(`lei ls-search -l -f gitconfig').

Maybe `lei edit-search --all=lei.q' could cut down
on non-interesting fields when editing all searches.

I also wouldn't be opposed to supporting non-interactive use
for automation:

	lei edit-search --set lei.q=foo OUTPUT
	lei edit-search --add lei.q=foo OUTPUT

...but maybe it's not needed (see below on `lei q'):

> mricon.toml:

Any particular reason for toml?  I don't have a toml parser
installed and I suspect many don't, either.  git-config is
widely available and installed, of course.  I've been trying
hard to avoid data format and dependency proliferation.

>     [search.torvalds]
>         # All mail sent by torvalds
>         q = 'f:torvalds@linux-foundation.org'
>     [search.floppy]
>         # Any messages talking about floppies or touching floppy code
>         q = 'dfhh:floppy_* OR dfn:drivers/block/floppy.c OR s:floppy OR ((nq:bug OR nq:regression) AND nq:floppy)'
> 
> I could then have a small wrapper maintaining saved searches and making the
> mailboxes available via special newsgroups like:
> 
>     org.kernel.lei.mricon.torvalds
>     org.kernel.lei.mricon.floppy

Sidenote: I think `query' or similar is more appropriate than
using `lei' in a public newsgroup name since it's not `local' :>

> The goal is to make it possible for maintainers to define their own set of
> saved searches and have access to them at kernel.org via imap/pop3/nntp.
> 
> It's easy to write a simple wrapper that would invoke lei-edit-search and
> replace the search string when there are updates to the config files, but I'm
> curious if you already have thoughts on how to best implement something like
> this.

Rerunning `lei q' on existing destinations is fine for updating
existing queries.  This is especially nice with `-f v2', since
dedupe is faster on v2 than any other output format.

Thus there's no need to even bother with `lei edit-search' if
you're doing this non-interactively with your own files/formats
(e.g. via web UI or something).

> My biggest concern is someone committing an invalid query and not receiving
> any more email as a result -- so having a sane way to validate the query
> before sticking it into the saved search would be handy.

Perhaps `lei q --limit=1' could be appropriate for validating queries
with the below fix.  But validating searches ahead of time is
a TOCTOU problem.  There can be added support for per-inbox prefixes
(e.g. altid with gmane: on public-inbox.org/git), and also when mixing
old/new versions via different HTTP(S) externals.

-------8<-------
Subject: [PATCH] lei q: set exit code for invalid Xapian queries

Xapian can't parse every query, so ensure we set the
exit code for the client.
---
 lib/PublicInbox/LeiXSearch.pm | 6 ++++--
 t/lei.t                       | 5 +++++
 2 files changed, 9 insertions(+), 2 deletions(-)
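
The shape of the fix, sketched with a hypothetical parse_query()
standing in for Xapian's query parser (the real parser rejects far
more than a trailing `..').  The exception is trapped and turned
into a wait-status-style code, with 22 borrowed from curl as in the
patch comment:

```perl
use strict;
use warnings;

# hypothetical parser, dies on an open-ended range
sub parse_query {
	my ($qstr) = @_;
	die "Unknown range operation\n" if $qstr =~ /\.\.\z/;
	return { qstr => $qstr };
}

# returns (parsed query, 0) or (undef, error code)
sub try_query {
	my ($qstr) = @_;
	my $q = eval { parse_query($qstr) };
	return (undef, 22 << 8) if $@;
	($q, 0);
}

for my $qstr ('s:floppy', 'from:infinity..') {
	my ($q, $err) = try_query($qstr);
	printf "%-16s => %s\n", $qstr, $q ? 'ok' : 'exit '.($err >> 8);
}
```

A wrapper validating user-supplied queries only needs to check for
a non-zero exit before touching the saved search.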

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 5965274c..7f4911b3 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -186,7 +186,8 @@ sub query_one_mset { # for --threads and l2m w/o sort
 	}
 	my $first_ids;
 	do {
-		$mset = $srch->mset($mo->{qstr}, $mo);
+		$mset = eval { $srch->mset($mo->{qstr}, $mo) };
+		return $lei->child_error(22 << 8, "E: $@") if $@; # 22 from curl
 		mset_progress($lei, $dir, $mo->{offset} + $mset->size,
 				$mset->get_matches_estimated);
 		wait_startq($lei); # wait for keyword updates
@@ -249,7 +250,8 @@ sub query_combined_mset { # non-parallel for non-"--threads" users
 	}
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei);
 	do {
-		$mset = $self->mset($mo->{qstr}, $mo);
+		$mset = eval { $self->mset($mo->{qstr}, $mo) };
+		return $lei->child_error(22 << 8, "E: $@") if $@; # 22 from curl
 		mset_progress($lei, 'xsearch', $mo->{offset} + $mset->size,
 				$mset->get_matches_estimated);
 		wait_startq($lei); # wait for keyword updates
diff --git a/t/lei.t b/t/lei.t
index d83bde69..1199ca75 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -147,6 +147,11 @@ my $test_fail = sub {
 	lei_ok('q', "foo\n");
 	like($lei_err, qr/trailing `\\n' removed/s, "noted `\\n' removal");
 
+	lei(qw(q from:infinity..));
+	is($? >> 8, 22, 'combined query fails on invalid range op');
+	lei(qw(q -t from:infinity..));
+	is($? >> 8, 22, 'single query fails on invalid range op');
+
 	for my $lk (qw(ei inbox)) {
 		my $d = "$home/newline\n$lk";
 		my $all = $lk eq 'ei' ? 'ALL' : 'all';

^ permalink raw reply related	[relevance 55%]

* RFC: lei searches managed by users in git
@ 2023-09-15 21:08 67% Konstantin Ryabitsev
  2023-09-15 22:47 55% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Konstantin Ryabitsev @ 2023-09-15 21:08 UTC (permalink / raw)
  To: meta

Hello:

I am curious what is the best approach to have a centrally managed set of lei
searches, for example via config files tracked in git. For example, the file
could look like this:

mricon.toml:

    [search.torvalds]
        # All mail sent by torvalds
        q = 'f:torvalds@linux-foundation.org'
    [search.floppy]
        # Any messages talking about floppies or touching floppy code
        q = 'dfhh:floppy_* OR dfn:drivers/block/floppy.c OR s:floppy OR ((nq:bug OR nq:regression) AND nq:floppy)'

I could then have a small wrapper maintaining saved searches and making the
mailboxes available via special newsgroups like:

    org.kernel.lei.mricon.torvalds
    org.kernel.lei.mricon.floppy

The goal is to make it possible for maintainers to define their own set of
saved searches and have access to them at kernel.org via imap/pop3/nntp.

It's easy to write a simple wrapper that would invoke lei-edit-search and
replace the search string when there are updates to the config files, but I'm
curious if you already have thoughts on how to best implement something like
this.

My biggest concern is someone committing an invalid query and not receiving
any more email as a result -- so having a sane way to validate the query
before sticking it into the saved search would be handy.

-K

^ permalink raw reply	[relevance 67%]

* [PATCH] lei: ensure we run DESTROY|END at daemon exit w/ kqueue
@ 2023-09-15 10:11 63% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-15 10:11 UTC (permalink / raw)
  To: meta

The fundamental difference which I originally missed when
implementing kqueue EVFILT_SIGNAL support is that it does not
consume signals like signalfd(2) does.  In other words, with
EVFILT_SIGNAL, it's possible for a single signal to be delivered
twice if we unblock signals upon leaving the event loop as we do
in lei.

Note: Our DS->event_loop and Sigfd APIs can/should probably be
changed to better accommodate EVFILT_SIGNAL differences from
signalfd without sacrificing usability of either.

This fixes the problem of leftover lei-ovv.dst*, lei_cfg-* and
skv.* files in $TMPDIR at the end of test suite runs on *BSD
when IO::KQueue is installed.
---
 lib/PublicInbox/LEI.pm | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
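
The localized-ignore idiom used below can be tried on its own.
This sketch assumes a POSIX system with SIGUSR1 and has nothing to
do with kqueue itself; it only shows that signals arriving while
the slice is localized are discarded and the old handlers return
afterwards:

```perl
use strict;
use warnings;

my @names = qw(USR1 USR2);
my $got = 0;
$SIG{USR1} = sub { $got++ };

{
	# ignore both signals until the end of this block
	local @SIG{@names} = map 'IGNORE', @names;
	kill('USR1', $$); # discarded
}
kill('USR1', $$); # the original handler is back and runs
print "got=$got\n";
```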

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 5fbb1211..c61ce76d 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1318,6 +1318,9 @@ sub lazy_start {
 		USR1 => \&noop,
 		USR2 => \&noop,
 	};
+	# for EVFILT_SIGNAL and signalfd behavioral difference:
+	my @kq_ign = eval { require PublicInbox::DSKQXS } ? keys(%$sig) : ();
+
 	require PublicInbox::DirIdle;
 	local $dir_idle = PublicInbox::DirIdle->new(sub {
 		# just rely on wakeup to hit post_loop_do
@@ -1356,13 +1359,22 @@ sub lazy_start {
 		$current_lei ? err($current_lei, @_) : warn(
 		  strftime('%Y-%m-%dT%H:%M:%SZ', gmtime(time))," $$ ", @_);
 	};
+	local $SIG{PIPE} = 'IGNORE';
 	open STDERR, '>&STDIN' or die "redirect stderr failed: $!";
 	open STDOUT, '>&STDIN' or die "redirect stdout failed: $!";
 	# $daemon pipe to `lei' closed, main loop begins:
 	eval { PublicInbox::DS::event_loop($sig, $oldset) };
 	warn "event loop error: $@\n" if $@;
+
+	# EVFILT_SIGNAL will get a duplicate of all the signals it was sent
+	local @SIG{@kq_ign} = map 'IGNORE', @kq_ign;
+	PublicInbox::DS::sig_setmask($oldset) if @kq_ign;
+
 	# exit() may trigger waitpid via various DESTROY, ensure interruptible
-	PublicInbox::DS::sig_setmask($oldset);
+	local @SIG{TERM} = sub { exit(POSIX::SIGTERM + 128) };
+	local @SIG{INT} = sub { exit(POSIX::SIGINT + 128) };
+	local @SIG{QUIT} = sub { exit(POSIX::SIGQUIT + 128) };
+	PublicInbox::DS::sig_setmask($oldset) if !@kq_ign;
 	dump_and_clear_log();
 	exit($exit_code // 0);
 }

^ permalink raw reply related	[relevance 63%]

* [PATCH] lei: ensure --stdin sets %ENV and $current_lei
@ 2023-09-14 23:10 64% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-14 23:10 UTC (permalink / raw)
  To: meta

--stdin usage means the current request can be delayed
indefinitely while other requests with different %ENV
come in.  So make sure our warnings and %ENV can match
non-stdin behavior.

This probably fixes segfaults during process cleanup on OpenBSD
since _lei_atfork_child uses non-localized assignment of
$current_lei.  But it could be another red herring.  Either way,
it's the right thing to do from an environment replication
perspective.
---
 lib/PublicInbox/LeiInspect.pm | 3 +++
 lib/PublicInbox/LeiLcat.pm    | 2 ++
 lib/PublicInbox/LeiQuery.pm   | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index d1dca4ef..0455e739 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -255,6 +255,9 @@ sub ins_add { # InputPipe->consume callback
 	my ($lei) = @_; # $_[1] = $rbuf
 	if (defined $_[1]) {
 		$_[1] eq '' and return eval {
+			$lei->fchdir;
+			local %ENV = %{$lei->{env}};
+			local $PublicInbox::LEI::current_lei = $lei;
 			my $str = delete $lei->{istr};
 			$str =~ s/\A[\r\n]*From [^\r\n]*\r?\n//s;
 			my $eml = PublicInbox::Eml->new(\$str);
diff --git a/lib/PublicInbox/LeiLcat.pm b/lib/PublicInbox/LeiLcat.pm
index 8d89cb73..7ed191c3 100644
--- a/lib/PublicInbox/LeiLcat.pm
+++ b/lib/PublicInbox/LeiLcat.pm
@@ -128,6 +128,8 @@ sub _stdin { # PublicInbox::InputPipe::consume callback for --stdin
 	return $lei->{mset_opt}->{qstr} .= $_[1] if $_[1] ne '';
 	eval {
 		$lei->fchdir;
+		local %ENV = %{$lei->{env}};
+		local $PublicInbox::LEI::current_lei = $lei;
 		my @argv = split(/\s+/, $lei->{mset_opt}->{qstr});
 		$lei->{mset_opt}->{qstr} = extract_all($lei, @argv) or return;
 		$lei->_start_query;
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 26cfb3fd..a23354f0 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -65,6 +65,8 @@ sub qstr_add { # PublicInbox::InputPipe::consume callback for --stdin
 	return $lei->{mset_opt}->{qstr} .= $_[1] if $_[1] ne '';
 	eval {
 		$lei->fchdir;
+		local %ENV = %{$lei->{env}};
+		local $PublicInbox::LEI::current_lei = $lei;
 		$lei->{mset_opt}->{q_raw} = $lei->{mset_opt}->{qstr};
 		$lei->{lse}->query_approxidate($lei->{lse}->git,
 						$lei->{mset_opt}->{qstr});

^ permalink raw reply related	[relevance 64%]
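
The `local %ENV = %{$lei->{env}}' lines in the patch above rely on
Perl's dynamic scoping: the client's environment is visible only
while the callback runs and the daemon's own %ENV comes back
afterwards.  A minimal sketch of that behavior outside of lei
follows.  The names run_with_client_env and LEI_SKETCH are
hypothetical and the $req hash only stands in for a lei request.

```perl
use strict;
use warnings;

# hypothetical helper: run $cb with the environment captured in $req
sub run_with_client_env {
	my ($req, $cb) = @_;
	local %ENV = %{$req->{env}}; # restored when this sub returns
	$cb->();
}

# stand-in for a request carrying its own copy of the environment
my $req = { env => { %ENV, LEI_SKETCH => 'client-a' } };
my $seen = run_with_client_env($req, sub { $ENV{LEI_SKETCH} });
```

Inside the callback $ENV{LEI_SKETCH} is 'client-a'; once
run_with_client_env returns, the key is gone from %ENV again
(assuming it was not set before the sketch ran).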

* [PATCH] t/lei-mirror: do not bail out on `make help' failure
@ 2023-09-14 12:12 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-09-14 12:12 UTC (permalink / raw)
  To: meta

I'm not sure why, but this test occasionally fails on OpenBSD
and I can't reproduce it on a repeatable basis.  In any case,
there's no reason we can't continue the rest of the test if
`make help' fails.
---
 t/lei-mirror.t | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/lei-mirror.t b/t/lei-mirror.t
index 2400578a..9b5d73ec 100644
--- a/t/lei-mirror.t
+++ b/t/lei-mirror.t
@@ -27,7 +27,8 @@ test_lei({ tmpdir => $tmpdir }, sub {
 		"mirror of $http/t1/\n", 'description set');
 	ok(-f "$t1/Makefile", 'convenience Makefile added (v1)');
 	my $make = which('make');
-	xsys_e([$make, 'help'], undef, { -C => $t1, 1 => \(my $help) });
+	is(xsys([$make, 'help'], undef, { -C => $t1, 1 => \(my $help) }), 0,
+		'make help');
 	ok(-f "$t1/inbox.config.example", 'inbox.config.example downloaded');
 	isnt($help, '', 'make help worked');
 	is((stat(_))[9], $created{v1},

^ permalink raw reply related	[relevance 71%]

* [PATCH] lei: make --dedupe=content always account for Message-IDs
@ 2023-06-15  9:50 51% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-06-15  9:50 UTC (permalink / raw)
  To: meta

The content dedupe logic was originally designed for v2 public
inboxes as a fallback for when the importer sees identical
Message-IDs.  Thus it did not account for Message-ID(s) in
the message itself.

This change doesn't affect saved searches (the default when
writing to a pathname or IMAP).  It affects --no-save, and
outputs to stdout (even if stdout is redirected to a file).

Prior to this change, lei reused the v2 logic as-is without
accounting for Message-IDs anywhere with `--dedupe=content'
(the default).  This could cause messages to be skipped when
the content matches despite Message-IDs being different.

So with this change, `lei q --dedupe=content' will hash the
Message-ID(s) in the message to ensure messages with different
Message-IDs are NOT deduplicated.

Whether this change is a bug fix or introduces a regression
is actually debatable.  In my mind, it is better to err on the
side of showing too many messages rather than too few, even if
the actual contents of the message are identical.  Making saved
searches deduplicate without accounting for Message-IDs would be
more difficult, too.
---
 lib/PublicInbox/ContentHash.pm | 15 +++++++++++----
 lib/PublicInbox/LeiDedupe.pm   |  9 +++++++--
 t/lei_dedupe.t                 |  6 +++++-
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/ContentHash.pm b/lib/PublicInbox/ContentHash.pm
index fc94257c..95ca2929 100644
--- a/lib/PublicInbox/ContentHash.pm
+++ b/lib/PublicInbox/ContentHash.pm
@@ -54,16 +54,23 @@ sub content_dig_i {
 	$dig->add($s);
 }
 
-sub content_digest ($;$) {
-	my ($eml, $dig) = @_;
+sub content_digest ($;$$) {
+	my ($eml, $dig, $hash_mids) = @_;
 	$dig //= Digest::SHA->new(256);
 
 	# References: and In-Reply-To: get used interchangeably
 	# in some "duplicates" in LKML.  We treat them the same
 	# in SearchIdx, so treat them the same for this:
 	# do NOT consider the Message-ID as part of the content_hash
-	# if we got here, we've already got Message-ID reuse
-	my %seen = map { $_ => 1 } @{mids($eml)};
+	# if we got here, we've already got Message-ID reuse for v2.
+	#
+	# However, `lei q --dedupe=content' does use $hash_mids since
+	# it doesn't have any other dedupe
+	my $mids = mids($eml);
+	if ($hash_mids) {
+		$dig->add("mid\0$_\0") for @$mids;
+	}
+	my %seen = map { $_ => 1 } @$mids;
 	for (grep { !$seen{$_}++ } @{references($eml)}) {
 		utf8::encode($_);
 		$dig->add("ref\0$_\0");
diff --git a/lib/PublicInbox/LeiDedupe.pm b/lib/PublicInbox/LeiDedupe.pm
index 86cd8490..eda54d79 100644
--- a/lib/PublicInbox/LeiDedupe.pm
+++ b/lib/PublicInbox/LeiDedupe.pm
@@ -2,7 +2,7 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 package PublicInbox::LeiDedupe;
 use v5.12;
-use PublicInbox::ContentHash qw(content_hash git_sha);
+use PublicInbox::ContentHash qw(content_hash content_digest git_sha);
 use PublicInbox::SHA qw(sha256);
 
 # n.b. mutt sets most of these headers not sure about Bytes
@@ -69,7 +69,12 @@ sub dedupe_content ($) {
 	my ($skv) = @_;
 	(sub { # may be called in a child process
 		my ($eml) = @_; # $oidhex = $_[1], ignored
-		$skv->set_maybe(content_hash($eml), '');
+
+		# we must account for Message-ID via hash_mids, since
+		# (unlike v2 dedupe) Message-ID is not accounted for elsewhere:
+		$skv->set_maybe(content_digest($eml, PublicInbox::SHA->new(256),
+				1 # hash_mids
+				)->digest, '');
 	}, sub {
 		my ($smsg) = @_;
 		$skv->set_maybe(smsg_hash($smsg), '');
diff --git a/t/lei_dedupe.t b/t/lei_dedupe.t
index e1944d02..13fc1f3b 100644
--- a/t/lei_dedupe.t
+++ b/t/lei_dedupe.t
@@ -1,5 +1,5 @@
 #!perl -w
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict;
 use v5.10.1;
@@ -10,6 +10,8 @@ use PublicInbox::Smsg;
 require_mods(qw(DBD::SQLite));
 use_ok 'PublicInbox::LeiDedupe';
 my $eml = eml_load('t/plack-qp.eml');
+my $sameish = eml_load('t/plack-qp.eml');
+$sameish->header_set('Message-ID', '<cuepee@example.com>');
 my $mid = $eml->header_raw('Message-ID');
 my $different = eml_load('t/msg_iter-order.eml');
 $different->header_set('Message-ID', $mid);
@@ -47,6 +49,8 @@ for my $strat (undef, 'content') {
 	ok(!$dd->is_dup($different), "different is_dup with $desc dedupe");
 	ok(!$dd->is_smsg_dup($smsg), "is_smsg_dup pass w/ $desc dedupe");
 	ok($dd->is_smsg_dup($smsg), "is_smsg_dup reject w/ $desc dedupe");
+	ok(!$dd->is_dup($sameish),
+		"Message-ID accounted for w/ same content otherwise");
 }
 $lei->{opt}->{dedupe} = 'bogus';
 eval { PublicInbox::LeiDedupe->new($lei) };

^ permalink raw reply related	[relevance 51%]
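
To illustrate the effect of $hash_mids in the patch above, here is
a simplified sketch using Digest::SHA directly.  It is not the real
content_digest(): sketch_digest, the { mids, body } hash layout and
the "body\0" tag are assumptions made for the example, and only the
"mid\0$_\0" framing is taken from the diff.

```perl
use strict;
use warnings;
use Digest::SHA;

# hypothetical, simplified stand-in for content_digest():
# $msg is { mids => [ ... ], body => '...' }
sub sketch_digest {
	my ($msg, $hash_mids) = @_;
	my $dig = Digest::SHA->new(256);
	if ($hash_mids) {
		# same framing as the patch: tag, NUL, value, NUL
		$dig->add("mid\0$_\0") for @{$msg->{mids}};
	}
	$dig->add("body\0$msg->{body}\0"); # tag is an assumption
	$dig->hexdigest;
}

# same content, different Message-IDs
my $x = { mids => ['x@example.com'], body => "hello\n" };
my $y = { mids => ['y@example.com'], body => "hello\n" };
```

Without $hash_mids both messages hash identically and one would be
dropped as a duplicate; with it, the digests differ and both are
kept, which is the behavior t/lei_dedupe.t checks via $sameish.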

* [PATCH] lei import: set +(L|kw) on already-imported blobs
@ 2023-06-15  8:46 60% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-06-15  8:46 UTC (permalink / raw)
  To: meta

When import hits blobs it's already seen, we'll add labels
regardless in order to match the behavior of other inexact
matches.  This is useful when importing exact copies of
messages which exist in multiple mailboxes.

I noticed this when I had a message imported from my normal IMAP
`INBOX', but also copied it to a different folder for future
reference.
---
 Documentation/RelNotes/v2.0.0.wip |  3 +++
 lib/PublicInbox/LeiStore.pm       |  8 +++++++-
 t/lei-import.t                    | 17 ++++++++++++++++-
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/Documentation/RelNotes/v2.0.0.wip b/Documentation/RelNotes/v2.0.0.wip
index cd90bdae..cccf11ae 100644
--- a/Documentation/RelNotes/v2.0.0.wip
+++ b/Documentation/RelNotes/v2.0.0.wip
@@ -60,6 +60,9 @@ lei
   * fix `lei q -tt' on locally-indexed messages (still broken for remotes:
     https://public-inbox.org/meta/20230226170931.M947721@dcvr/ )
 
+  * `lei import' now set labels+keywords consistently on all
+     already-imported messages
+
 solver (used by lei (rediff|blob), and PublicInbox::WWW)
 
   * handle copies in patches properly
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index cf5a03a0..727de066 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -387,8 +387,14 @@ sub add_eml {
 		_lms_rw($self)->set_src($smsg->oidbin, @{$vmd->{sync_info}});
 	}
 	unless ($im_mark) { # duplicate blob returns undef
-		return unless wantarray;
+		return unless wantarray || $vmd;
 		my @docids = $oidx->blob_exists($smsg->{blob});
+		if ($vmd) {
+			for my $docid (@docids) {
+				my $idx = $eidx->idx_shard($docid);
+				_add_vmd($self, $idx, $docid, $vmd);
+			}
+		}
 		return _docids_and_maybe_kw $self, \@docids;
 	}
 
diff --git a/t/lei-import.t b/t/lei-import.t
index 6e9a853c..c9e668a3 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -1,5 +1,5 @@
 #!perl -w
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 test_lei(sub {
@@ -110,6 +110,21 @@ $res = json_utf8->decode($lei_out);
 is_deeply($res->[0]->{kw}, ['seen'], 'keyword set');
 is_deeply($res->[0]->{L}, ['inbox'], 'label set');
 
+# idempotent import can add label
+lei_ok([qw(import -F eml - +L:boombox)],
+	undef, { %$lei_opt, 0 => \$eml_str });
+lei_ok(qw(q m:inbox@example.com));
+$res = json_utf8->decode($lei_out);
+is_deeply($res->[0]->{kw}, ['seen'], 'keyword remains set');
+is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'new label added');
+
+# idempotent import can add keyword
+lei_ok([qw(import -F eml - +kw:answered)],
+	undef, { %$lei_opt, 0 => \$eml_str });
+lei_ok(qw(q m:inbox@example.com));
+$res = json_utf8->decode($lei_out);
+is_deeply($res->[0]->{kw}, [qw(answered seen)], 'keyword added');
+is_deeply($res->[0]->{L}, [qw(boombox inbox)], 'labels preserved');
 
 # see t/lei_to_mail.t for "import -F mbox*"
 });

^ permalink raw reply related	[relevance 60%]

* [PATCH] doc: lei q: document v2:$INBOX_DIR output format
@ 2023-06-15  0:08 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-06-15  0:08 UTC (permalink / raw)
  To: meta

This has been supported in every lei release, actually.
---
 Documentation/lei-q.pod | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 5e9a5658..c0254ba0 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -50,6 +50,10 @@ A prefix can specify the format of the output: C<maildir>,
 C<mboxrd>, C<mboxcl2>, C<mboxcl>, C<mboxo>.  For a description of
 mail formats, see L<lei-mail-formats(5)>.
 
+C<v2:/path/to/inbox> may be used to create a new inbox of
+L<public-inbox-v2-format(5)>.  The new inbox will not be configured
+in the L<public-inbox-config(5)> file.
+
 C<maildir> is the default for an existing directory or non-existing path.
 
 Default: C<-> (stdout)

^ permalink raw reply related	[relevance 71%]

* [PATCH] t/lei.t: quiet newline warning on older Perls
@ 2023-06-08 18:26 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-06-08 18:26 UTC (permalink / raw)
  To: meta

Perl < 5.22 warned on newlines in the middle of a string instead
of just the end.  Work around it by disabling all warnings on older
Perls while running File::Path::mkpath.
---
 t/lei.t | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/lei.t b/t/lei.t
index a80143ef..5d0fa622 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -149,7 +149,10 @@ my $test_fail = sub {
 	for my $lk (qw(ei inbox)) {
 		my $d = "$home/newline\n$lk";
 		my $all = $lk eq 'ei' ? 'ALL' : 'all';
-		File::Path::mkpath("$d/$all.git/objects");
+		{ # quiet newline warning on older Perls
+			local $^W = undef if $^V lt v5.22.0;
+			File::Path::mkpath("$d/$all.git/objects");
+		}
 		open my $fh, '>', "$d/$lk.lock" or BAIL_OUT "open $d/$lk.lock";
 		for my $fl (qw(-I --only)) {
 			ok(!lei('q', $fl, $d, 'whatever'),

^ permalink raw reply related	[relevance 71%]

* [PATCH] t/lei-import-nntp: dump $lei_err on failure
@ 2023-04-29  7:18 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-04-29  7:18 UTC (permalink / raw)
  To: meta

I hit an error on the backwards range import test and can't
reproduce it; perhaps dumping $lei_err can help diagnose it
in the future.
---
 t/lei-import-nntp.t | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/lei-import-nntp.t b/t/lei-import-nntp.t
index eb1ae312..2c48d973 100644
--- a/t/lei-import-nntp.t
+++ b/t/lei-import-nntp.t
@@ -43,7 +43,8 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	lei_ok 'ls-mail-sync';
 	like($lei_out, qr!\A\Q$url\E\n\z!, 'ls-mail-sync output as-expected');
 
-	ok(!lei(qw(import), "$url/12-1"), 'backwards range rejected');
+	ok(!lei(qw(import), "$url/12-1"), 'backwards range rejected') or
+		diag $lei_err;
 
 	# new home
 	local $ENV{HOME} = "$tmpdir/h2";

^ permalink raw reply related	[relevance 71%]

* [PATCH] t/lei-refresh-mail-sync: improve test reliability
@ 2023-03-28 10:53 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-03-28 10:53 UTC (permalink / raw)
  To: meta

Lack of signalfd/EVFILT_SIGNAL means we need to kill a
process repeatedly to ensure it wakes up.
---
 t/lei-refresh-mail-sync.t | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/t/lei-refresh-mail-sync.t b/t/lei-refresh-mail-sync.t
index 0498a0c4..8ccc68c6 100644
--- a/t/lei-refresh-mail-sync.t
+++ b/t/lei-refresh-mail-sync.t
@@ -137,8 +137,12 @@ SKIP: {
 	my $ar = PublicInbox::AutoReap->new($pid);
 	ok(!(lei 'refresh-mail-sync', $url), 'URL fails on dead -imapd');
 	ok(!(lei 'refresh-mail-sync', '--all'), '--all fails on dead -imapd');
-	$ar->kill for qw(avoid sig wake miss-no signalfd or EVFILT_SIG);
-	$ar->join('TERM');
+	{
+		local $SIG{CHLD} = sub { $ar->join('TERM'); undef $ar };
+		do {
+			eval { $ar->kill and tick(0.01) }
+		} while (defined($ar));
+	}
 
 	my $cmd = $srv->{imapd}->{cmd};
 	my $s = $srv->{imapd}->{s};

^ permalink raw reply related	[relevance 71%]
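
The idea in the patch above is to keep signalling until the process
is known to be gone, since one signal may be missed without
signalfd/EVFILT_SIGNAL.  The sketch below shows the same idea with
a different mechanism: instead of a SIGCHLD handler and AutoReap,
it polls with waitpid(WNOHANG).  kill_until_reaped and the retry
limit are hypothetical names and values for illustration.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep);

# hypothetical helper: send TERM repeatedly until $pid is reaped;
# returns the number of attempts used, or 0 if we gave up
sub kill_until_reaped {
	my ($pid, $max_tries) = @_;
	for my $try (1..$max_tries) {
		kill('TERM', $pid);
		my $r = waitpid($pid, WNOHANG);
		return $try if $r == $pid; # child exited and was reaped
		sleep(0.01); # give the child time to react
	}
	0;
}

my $pid = fork();
defined($pid) or die "fork: $!";
if ($pid == 0) { # child: idle until TERM arrives
	sleep(60);
	POSIX::_exit(0);
}
my $tries = kill_until_reaped($pid, 500);
```

A non-zero $tries means the child was reaped within the retry
limit; the loop is bounded so a stuck child cannot hang the caller.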

* Re: Issues with `lei` as non-root
  2023-03-28  3:38 71%         ` Eric Wong
@ 2023-03-28  4:08 71%           ` Louis DeLosSantos
  0 siblings, 0 replies; 200+ results
From: Louis DeLosSantos @ 2023-03-28  4:08 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

> That's a lot of tail processes....
Ugh, sorry to waste your time.

All the tails were from a runaway UI program I'm working on.
Once I killed them, `lei` runs just fine as non-root.

Thanks for the free tech support, hope I didn't steal your attention
from something valuable :-D.

On Mon, Mar 27, 2023 at 11:38 PM Eric Wong <e@80x24.org> wrote:
>
> Louis DeLosSantos <louis.delos@gmail.com> wrote:
> > > Definitely not; the lei-daemon is per-user.
> >
> > Okay, maybe this is the issue to begin with? I installed lei from dnf.
> > I'm not sure what launches the daemon, is it launched on first run?
> >
> > If that is the case, it was probably launched when I resorted to `sudo
> > lei q ....` command.
> > But, if it's running as a systemd service, I could move it to a user service.
>
> You shouldn't need to manage it as a service; it's auto-started
> and killing it is harmless in most cases.  I'm considering
> having it auto-exit if it stays idle for a long time and there's
> no active inotify watches.
>
> lei-daemon doesn't start until any other lei command is invoked;
> so it shouldn't be started on installation.
>
> > # show system-wide limits
>
> > ==> /proc/sys/fs/inotify/max_user_instances <==
> > 128
>
> <snip>
>
> > tail       367093 louis    4r  a_inode               0,14         0
>
> That's a lot of tail processes....
> I wonder if they were spawned by `lei q -v' for emitting curl stderr?
> They should be auto-killed.
> (or if you have some other reason for running tail on your system).

^ permalink raw reply	[relevance 71%]

* Re: Issues with `lei` as non-root
  2023-03-28  3:05 42%       ` Louis DeLosSantos
@ 2023-03-28  3:38 71%         ` Eric Wong
  2023-03-28  4:08 71%           ` Louis DeLosSantos
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-03-28  3:38 UTC (permalink / raw)
  To: Louis DeLosSantos; +Cc: meta

Louis DeLosSantos <louis.delos@gmail.com> wrote:
> > Definitely not; the lei-daemon is per-user.
> 
> Okay, maybe this is the issue to begin with? I installed lei from dnf.
> I'm not sure what launches the daemon, is it launched on first run?
> 
> If that is the case, it was probably launched when I resorted to `sudo
> lei q ....` command.
> But, if it's running as a systemd service, I could move it to a user service.

You shouldn't need to manage it as a service; it's auto-started
and killing it is harmless in most cases.  I'm considering
having it auto-exit if it stays idle for a long time and there's
no active inotify watches.

lei-daemon doesn't start until any other lei command is invoked;
so it shouldn't be started on installation.

> # show system-wide limits

> ==> /proc/sys/fs/inotify/max_user_instances <==
> 128

<snip>

> tail       367093 louis    4r  a_inode               0,14         0

That's a lot of tail processes....
I wonder if they were spawned by `lei q -v' for emitting curl stderr?
They should be auto-killed.
(or if you have some other reason for running tail on your system).

^ permalink raw reply	[relevance 71%]

* Re: Issues with `lei` as non-root
  2023-03-28  2:52 71%     ` Eric Wong
@ 2023-03-28  3:05 42%       ` Louis DeLosSantos
  2023-03-28  3:38 71%         ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Louis DeLosSantos @ 2023-03-28  3:05 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

> Actually, only 18 (0..17).  The `mem' stuff is mmap-ed and
doesn't count against `ulimit -n` (RLIMIT_NOFILE).

Yup, you are right, did a quick "wc -l" without thinking about it.

> Definitely not; the lei-daemon is per-user.

Okay, maybe this is the issue to begin with? I installed lei from dnf.
I'm not sure what launches the daemon, is it launched on first run?

If that is the case, it was probably launched when I resorted to `sudo
lei q ....` command.
But, if it's running as a systemd service, I could move it to a user service.

# show system-wide limits
~
🖳  head /proc/sys/fs/inotify/max_*
==> /proc/sys/fs/inotify/max_queued_events <==
16384

==> /proc/sys/fs/inotify/max_user_instances <==
128

==> /proc/sys/fs/inotify/max_user_watches <==
524288

# show per-user inotify FDs (-nP speeds up lsof by avoiding lookups)
systemd      1686 louis    6r  a_inode               0,14         0
  15364 inotify
systemd      1686 louis   11r  a_inode               0,14         0
  15364 inotify
systemd      1686 louis   13r  a_inode               0,14         0
  15364 inotify
dbus-brok    1727 louis    7r  a_inode               0,14         0
  15364 inotify
swaync       1868 louis   15r  a_inode               0,14         0
  15364 inotify
dbus-brok    1898 louis    7r  a_inode               0,14         0
  15364 inotify
xdg-deskt    1914 louis   10r  a_inode               0,14         0
  15364 inotify
xdg-deskt    1938 louis   12r  a_inode               0,14         0
  15364 inotify
xdg-deskt    1968 louis   14r  a_inode               0,14         0
  15364 inotify
wireplumb    1979 louis   19r  a_inode               0,14         0
  15364 inotify
wireplumb    1979 louis   28r  a_inode               0,14         0
  15364 inotify
wireplumb    1979 louis   30r  a_inode               0,14         0
  15364 inotify
code         2362 louis   63r  a_inode               0,14         0
  15364 inotify
code         2362 louis   86r  a_inode               0,14         0
  15364 inotify
code         2362 louis  109r  a_inode               0,14         0
  15364 inotify
code         2456 louis   21r  a_inode               0,14         0
  15364 inotify
code         2493 louis   51r  a_inode               0,14         0
  15364 inotify
code         2547 louis   49r  a_inode               0,14         0
  15364 inotify
code         2547 louis   54r  a_inode               0,14         0
  15364 inotify
firefox      4168 louis   80r  a_inode               0,14         0
  15364 inotify
cgroupify    4347 louis    6r  a_inode               0,14         0
  15364 inotify
flatpak-s   53051 louis    7r  a_inode               0,14         0
  15364 inotify
obsidian    53073 louis   69r  a_inode               0,14         0
  15364 inotify
flatpak-p   53080 louis    7r  a_inode               0,14         0
  15364 inotify
obsidian    53135 louis   21r  a_inode               0,14         0
  15364 inotify
obsidian    53141 louis   56r  a_inode               0,14         0
  15364 inotify
tail       235359 louis    4r  a_inode               0,14         0
  15364 inotify
tail       251716 louis    4r  a_inode               0,14         0
  15364 inotify
tail       251821 louis    4r  a_inode               0,14         0
  15364 inotify
tail       252069 louis    4r  a_inode               0,14         0
  15364 inotify
tail       252266 louis    4r  a_inode               0,14         0
  15364 inotify
tail       252399 louis    4r  a_inode               0,14         0
  15364 inotify
tail       252434 louis    4r  a_inode               0,14         0
  15364 inotify
tail       252749 louis    4r  a_inode               0,14         0
  15364 inotify
tail       252953 louis    4r  a_inode               0,14         0
  15364 inotify
tail       253611 louis    4r  a_inode               0,14         0
  15364 inotify
tail       253772 louis    4r  a_inode               0,14         0
  15364 inotify
tail       253858 louis    4r  a_inode               0,14         0
  15364 inotify
tail       253932 louis    4r  a_inode               0,14         0
  15364 inotify
tail       254159 louis    4r  a_inode               0,14         0
  15364 inotify
tail       254323 louis    4r  a_inode               0,14         0
  15364 inotify
tail       254425 louis    4r  a_inode               0,14         0
  15364 inotify
tail       255070 louis    4r  a_inode               0,14         0
  15364 inotify
tail       255592 louis    4r  a_inode               0,14         0
  15364 inotify
tail       256816 louis    4r  a_inode               0,14         0
  15364 inotify
tail       256939 louis    4r  a_inode               0,14         0
  15364 inotify
tail       257302 louis    4r  a_inode               0,14         0
  15364 inotify
tail       257435 louis    4r  a_inode               0,14         0
  15364 inotify
tail       257746 louis    4r  a_inode               0,14         0
  15364 inotify
tail       258071 louis    4r  a_inode               0,14         0
  15364 inotify
tail       258169 louis    4r  a_inode               0,14         0
  15364 inotify
tail       258279 louis    4r  a_inode               0,14         0
  15364 inotify
tail       258413 louis    4r  a_inode               0,14         0
  15364 inotify
tail       258715 louis    4r  a_inode               0,14         0
  15364 inotify
tail       259167 louis    4r  a_inode               0,14         0
  15364 inotify
tail       259289 louis    4r  a_inode               0,14         0
  15364 inotify
tail       259484 louis    4r  a_inode               0,14         0
  15364 inotify
tail       259623 louis    4r  a_inode               0,14         0
  15364 inotify
tail       259876 louis    4r  a_inode               0,14         0
  15364 inotify
tail       260275 louis    4r  a_inode               0,14         0
  15364 inotify
tail       260412 louis    4r  a_inode               0,14         0
  15364 inotify
tail       260518 louis    4r  a_inode               0,14         0
  15364 inotify
tail       260650 louis    4r  a_inode               0,14         0
  15364 inotify
tail       260818 louis    4r  a_inode               0,14         0
  15364 inotify
tail       261665 louis    4r  a_inode               0,14         0
  15364 inotify
tail       262636 louis    4r  a_inode               0,14         0
  15364 inotify
tail       262963 louis    4r  a_inode               0,14         0
  15364 inotify
tail       263781 louis    4r  a_inode               0,14         0
  15364 inotify
tail       264183 louis    4r  a_inode               0,14         0
  15364 inotify
tail       264242 louis    4r  a_inode               0,14         0
  15364 inotify
tail       264484 louis    4r  a_inode               0,14         0
  15364 inotify
tail       264769 louis    4r  a_inode               0,14         0
  15364 inotify
tail       264970 louis    4r  a_inode               0,14         0
  15364 inotify
tail       265488 louis    4r  a_inode               0,14         0
  15364 inotify
tail       265939 louis    4r  a_inode               0,14         0
  15364 inotify
tail       266145 louis    4r  a_inode               0,14         0
  15364 inotify
tail       267183 louis    4r  a_inode               0,14         0
  15364 inotify
tail       267352 louis    4r  a_inode               0,14         0
  15364 inotify
tail       267951 louis    4r  a_inode               0,14         0
  15364 inotify
tail       268135 louis    4r  a_inode               0,14         0
  15364 inotify
tail       268320 louis    4r  a_inode               0,14         0
  15364 inotify
tail       268892 louis    4r  a_inode               0,14         0
  15364 inotify
tail       269418 louis    4r  a_inode               0,14         0
  15364 inotify
tail       269631 louis    4r  a_inode               0,14         0
  15364 inotify
tail       339541 louis    4r  a_inode               0,14         0
  15364 inotify
tail       339854 louis    4r  a_inode               0,14         0
  15364 inotify
tail       340349 louis    4r  a_inode               0,14         0
  15364 inotify
tail       340566 louis    4r  a_inode               0,14         0
  15364 inotify
tail       340712 louis    4r  a_inode               0,14         0
  15364 inotify
tail       340858 louis    4r  a_inode               0,14         0
  15364 inotify
tail       341122 louis    4r  a_inode               0,14         0
  15364 inotify
tail       341356 louis    4r  a_inode               0,14         0
  15364 inotify
tail       341499 louis    4r  a_inode               0,14         0
  15364 inotify
tail       341643 louis    4r  a_inode               0,14         0
  15364 inotify
tail       341858 louis    4r  a_inode               0,14         0
  15364 inotify
tail       341948 louis    4r  a_inode               0,14         0
  15364 inotify
tail       342271 louis    4r  a_inode               0,14         0
  15364 inotify
tail       342410 louis    4r  a_inode               0,14         0
  15364 inotify
tail       342557 louis    4r  a_inode               0,14         0
  15364 inotify
tail       342802 louis    4r  a_inode               0,14         0
  15364 inotify
tail       342944 louis    4r  a_inode               0,14         0
  15364 inotify
tail       343409 louis    4r  a_inode               0,14         0
  15364 inotify
tail       343690 louis    4r  a_inode               0,14         0
  15364 inotify
tail       343864 louis    4r  a_inode               0,14         0
  15364 inotify
tail       344049 louis    4r  a_inode               0,14         0
  15364 inotify
tail       344285 louis    4r  a_inode               0,14         0
  15364 inotify
tail       344429 louis    4r  a_inode               0,14         0
  15364 inotify
tail       345251 louis    4r  a_inode               0,14         0
  15364 inotify
tail       345541 louis    4r  a_inode               0,14         0
  15364 inotify
tail       345972 louis    4r  a_inode               0,14         0
  15364 inotify
tail       346163 louis    4r  a_inode               0,14         0
  15364 inotify
tail       346755 louis    4r  a_inode               0,14         0
  15364 inotify
tail       347063 louis    4r  a_inode               0,14         0
  15364 inotify
tail       350273 louis    4r  a_inode               0,14         0
  15364 inotify
tail       350316 louis    4r  a_inode               0,14         0
  15364 inotify
tail       350781 louis    4r  a_inode               0,14         0
  15364 inotify
tail       350942 louis    4r  a_inode               0,14         0
  15364 inotify
tail       351111 louis    4r  a_inode               0,14         0
  15364 inotify
tail       351746 louis    4r  a_inode               0,14         0
  15364 inotify
tail       353489 louis    4r  a_inode               0,14         0
  15364 inotify
tail       353801 louis    4r  a_inode               0,14         0
  15364 inotify
tail       354352 louis    4r  a_inode               0,14         0
  15364 inotify
tail       354638 louis    4r  a_inode               0,14         0
  15364 inotify
tail       354923 louis    4r  a_inode               0,14         0
  15364 inotify
tail       362029 louis    4r  a_inode               0,14         0
  15364 inotify
tail       367093 louis    4r  a_inode               0,14         0
  15364 inotify
waybar     811099 louis   37r  a_inode               0,14         0
  15364 inotify
waybar     811099 louis   38r  a_inode               0,14         0
  15364 inotify

On Mon, Mar 27, 2023 at 10:52 PM Eric Wong <e@80x24.org> wrote:
>
> Louis DeLosSantos <louis.delos@gmail.com> wrote:
>
> <snip>
>
> > Above is 54 open sockets. Which seems fine.
>
> Actually, only 18 (0..17).  The `mem' stuff is mmap-ed and
> doesn't count against `ulimit -n` (RLIMIT_NOFILE).
>
> > Should daemon be running as root, if I intend to only use lei as user?
>
> Definitely not; the lei-daemon is per-user.
>
> I also forgot, inotify has its own per-user limits; perhaps
> you're hitting those?
>
> # show system-wide limits
> $ head /proc/sys/fs/inotify/max_*
>
> # show per-user inotify FDs (-nP speeds up lsof by avoiding lookups)
> $ lsof -nP -u $USER |grep inotify

^ permalink raw reply	[relevance 42%]

* Re: Issues with `lei` as non-root
  2023-03-28  2:30 45%   ` Louis DeLosSantos
@ 2023-03-28  2:52 71%     ` Eric Wong
  2023-03-28  3:05 42%       ` Louis DeLosSantos
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-03-28  2:52 UTC (permalink / raw)
  To: Louis DeLosSantos; +Cc: meta

Louis DeLosSantos <louis.delos@gmail.com> wrote:

<snip>

> Above is 54 open sockets. Which seems fine.

Actually, only 18 (0..17).  The `mem' stuff is mmap-ed and
doesn't count against `ulimit -n` (RLIMIT_NOFILE).

> Should daemon be running as root, if I intend to only use lei as user?

Definitely not; the lei-daemon is per-user.

I also forgot, inotify has its own per-user limits; perhaps
you're hitting those?

# show system-wide limits
$ head /proc/sys/fs/inotify/max_*

# show per-user inotify FDs (-nP speeds up lsof by avoiding lookups)
$ lsof -nP -u $USER |grep inotify
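
A rough way to get the same count without lsof, assuming a Linux /proc
where inotify descriptors appear as `anon_inode:inotify' symlinks under
/proc/PID/fd.  count_inotify is a made-up helper name, not part of lei,
and it only sees processes whose fd directory is readable (normally
just your own):

```sh
# count_inotify [PROC_DIR]: count inotify descriptors held by the
# processes visible under PROC_DIR (defaults to /proc)
count_inotify() {
	proc=${1:-/proc}
	n=0
	for fd in "$proc"/[0-9]*/fd/*; do
		# unreadable fd directories leave the glob unexpanded; skip those
		[ -L "$fd" ] || continue
		case $(readlink "$fd" 2>/dev/null) in
		anon_inode:inotify) n=$((n + 1)) ;;
		esac
	done
	echo "$n"
}
```

Each descriptor counted is one inotify instance, so the number can be
compared against /proc/sys/fs/inotify/max_user_instances.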

^ permalink raw reply	[relevance 71%]

* Re: Issues with `lei` as non-root
  2023-03-28  1:32 71% ` Eric Wong
@ 2023-03-28  2:30 45%   ` Louis DeLosSantos
  2023-03-28  2:52 71%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Louis DeLosSantos @ 2023-03-28  2:30 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

> What's the output of `ulimit -n` and `lsof -p $(lei daemon-pid)`?
🖳  ulimit -n
1024

🖳  ps -ef |grep lei-daemon
root      861005       1  0 19:24 ?        00:00:00 lei-daemon /tmp/lei-0/5.seq.sock
louis    1025477 1015489  0 22:22 pts/6    00:00:00 grep --color=auto lei-daemon

~
🖳 sudo  lsof -p 861005
COMMAND      PID USER   FD      TYPE             DEVICE SIZE/OFF    NODE NAME
lei-daemo 861005 root  cwd       DIR               0,34      246 2902791 /home/louis/Mail/linux-bpf
lei-daemo 861005 root  rtd       DIR               0,34      158     256 /
lei-daemo 861005 root  txt       REG               0,34    15984  746278 /usr/bin/perl
lei-daemo 861005 root  mem       REG               0,32           746278 /usr/bin/perl (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           669987 /usr/lib64/libstdc++.so.6.0.30 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           654072 /usr/lib64/libgcc_s-12-20221121.so.1 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           707780 /usr/lib64/libxapian.so.30.12.1 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           744976 /usr/lib64/perl5/vendor_perl/auto/Data/Dumper/Dumper.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2847838 /usr/lib64/perl5/vendor_perl/auto/Search/Xapian/Xapian.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           670001 /usr/lib64/libz.so.1.2.12 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           670059 /usr/lib64/libsqlite3.so.0.8.6 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          1142922 /usr/lib/locale/locale-archive (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           746285 /usr/lib64/perl5/auto/Sys/Hostname/Hostname.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           670032 /usr/lib64/libuuid.so.1.3.0 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2846948 /usr/lib64/perl5/vendor_perl/auto/Compress/Raw/Zlib/Zlib.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2848056 /usr/lib64/perl5/vendor_perl/auto/DBD/SQLite/SQLite.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           748851 /usr/lib64/perl5/vendor_perl/auto/DBI/DBI.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745669 /usr/lib64/perl5/auto/attributes/attributes.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2847712 /usr/lib64/perl5/vendor_perl/auto/Cpanel/JSON/XS/XS.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2848586 /usr/lib64/perl5/vendor_perl/auto/Linux/Inotify2/Inotify2.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2849070 /usr/lib64/perl5/vendor_perl/auto/Socket/MsgHdr/MsgHdr.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2847193 /usr/lib64/perl5/vendor_perl/auto/Sys/Syslog/Syslog.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745481 /usr/lib64/perl5/vendor_perl/auto/MIME/Base64/Base64.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745670 /usr/lib64/perl5/auto/re/re.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745506 /usr/lib64/perl5/vendor_perl/auto/Storable/Storable.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745611 /usr/lib64/perl5/vendor_perl/auto/Encode/Encode.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2849930 /usr/lib64/perl5/vendor_perl/auto/Email/Address/XS/XS.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          1143694 /usr/lib64/libm.so.6 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          1143691 /usr/lib64/libc.so.6 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745650 /usr/lib64/libperl.so.5.36.0 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2846914 /usr/lib64/perl5/vendor_perl/auto/Digest/SHA/SHA.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745663 /usr/lib64/perl5/auto/File/Glob/Glob.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745449 /usr/lib64/perl5/auto/IO/IO.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745355 /usr/lib64/perl5/vendor_perl/auto/Socket/Socket.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745492 /usr/lib64/perl5/vendor_perl/auto/List/Util/Util.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745259 /usr/lib64/perl5/auto/POSIX/POSIX.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          1019649 /usr/lib64/libcrypt.so.2.0.0 (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745556 /usr/lib64/perl5/vendor_perl/auto/Cwd/Cwd.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          2846928 /usr/lib64/perl5/vendor_perl/auto/Time/HiRes/HiRes.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32           745427 /usr/lib64/perl5/auto/Fcntl/Fcntl.so (path dev=0,34)
lei-daemo 861005 root  mem       REG               0,32          1143688 /usr/lib64/ld-linux-x86-64.so.2 (path dev=0,34)
lei-daemo 861005 root    0u      REG               0,38        0    3448 /tmp/lei-0/errors.log
lei-daemo 861005 root    1u      REG               0,38        0    3448 /tmp/lei-0/errors.log
lei-daemo 861005 root    2u      REG               0,38        0    3448 /tmp/lei-0/errors.log
lei-daemo 861005 root    3u  a_inode               0,14        0   15364 [eventpoll:4,5,7,8,14]
lei-daemo 861005 root    4u     unix 0x0000000033b859c6      0t0 4095008 /tmp/lei-0/5.seq.sock type=SEQPACKET (LISTEN)
lei-daemo 861005 root    5u     unix 0x000000007a56c532      0t0 4085549 type=SEQPACKET (CONNECTED)
lei-daemo 861005 root    6u     unix 0x00000000c6d0e27b      0t0 4085550 type=SEQPACKET (CONNECTED)
lei-daemo 861005 root    7r  a_inode               0,14        0   15364 inotify
lei-daemo 861005 root    8u  a_inode               0,14        0   15364 [signalfd]
lei-daemo 861005 root   14r     FIFO               0,13      0t0 4098533 pipe
lei-daemo 861005 root   16u     unix 0x00000000557447d0      0t0 4098534 type=SEQPACKET (CONNECTED)
lei-daemo 861005 root   17u     unix 0x00000000f60edd15      0t0 4098535 type=SEQPACKET (CONNECTED)

Above is 54 open sockets. Which seems fine.


Should daemon be running as root, if I intend to only use lei as user?

On Mon, Mar 27, 2023 at 9:32 PM Eric Wong <e@80x24.org> wrote:
>
> Louis DeLosSantos <louis.delos@gmail.com> wrote:
> > Hello,
> >
> > I'm experimenting with `lei` as a nice search tool for `lore.kernel.org`
> >
> > Everything works fine with the caveat that it seems to break if I'm not root.
> >
> > When using `lei` as non-root we get this error:
>
> I've never used lei as root nor has any part of public-inbox
> ever been intended to run as root.
>
> > ```
> > E: Linux::Inotify2->new: Too many open files at
> > /usr/share/perl5/vendor_perl/PublicInbox/DirIdle.pm line 40.
> > connect(/run/user/1000/lei/5.seq.sock): Connection refused (after
> > attempted daemon start)
> > ```
> >
> > Any ideas why this may occur? Is `lei` designed to only be run as root
> > or is Fedora installing perl in an odd fashion which results in root
> > needing to be used?
>
> What's the output of `ulimit -n` and `lsof -p $(lei daemon-pid)`?
>
> (you may need to use `ps -ef |grep lei-daemon` to get the PID
> if lei is broken and using too many FDs, though)
>
> `ulimit -n' is the open file limit, typically 1024 or higher.
>
> `lsof -p $PID` may reveal a bug in lei which leaves too many
> files open.  lei (especially with inotify on Linux) should use
> far less than 1024.
>
> (FreeBSD may end up using far more open files, but that's a
> different story)

^ permalink raw reply	[relevance 45%]

* Re: Issues with `lei` as non-root
  2023-03-28  1:00 71% Issues with `lei` as non-root Louis DeLosSantos
@ 2023-03-28  1:32 71% ` Eric Wong
  2023-03-28  2:30 45%   ` Louis DeLosSantos
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-03-28  1:32 UTC (permalink / raw)
  To: Louis DeLosSantos; +Cc: meta

Louis DeLosSantos <louis.delos@gmail.com> wrote:
> Hello,
> 
> I'm experimenting with `lei` as a nice search tool for `lore.kernel.org`
> 
> Everything works fine with the caveat that it seems to break if I'm not root.
> 
> When using `lei` as non-root we get this error:

I've never used lei as root nor has any part of public-inbox
ever been intended to run as root.

> ```
> E: Linux::Inotify2->new: Too many open files at
> /usr/share/perl5/vendor_perl/PublicInbox/DirIdle.pm line 40.
> connect(/run/user/1000/lei/5.seq.sock): Connection refused (after
> attempted daemon start)
> ```
> 
> Any ideas why this may occur? Is `lei` designed to only be run as root
> or is Fedora installing perl in an odd fashion which results in root
> needing to be used?

What's the output of `ulimit -n` and `lsof -p $(lei daemon-pid)`?

(you may need to use `ps -ef |grep lei-daemon` to get the PID
if lei is broken and using too many FDs, though)

`ulimit -n' is the open file limit, typically 1024 or higher.

`lsof -p $PID` may reveal a bug in lei which leaves too many
files open.  lei (especially with inotify on Linux) should use
far less than 1024.

(FreeBSD may end up using far more open files, but that's a
different story)
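
For reading the lsof output, a small sketch that counts only the rows
whose FD column is numeric; rows such as cwd, rtd, txt and mem are not
descriptors and do not count against `ulimit -n`.  count_numeric_fds is
a hypothetical helper (not a lei or lsof feature), and it assumes
lsof's default column order with FD as the fourth field:

```sh
# count_numeric_fds: read `lsof -p PID` output on stdin and print how
# many rows describe real file descriptors (0u, 7r, 14r, ...)
count_numeric_fds() {
	awk '$4 ~ /^[0-9]/ { n++ } END { print n + 0 }'
}
```

Something like `lsof -nP -p "$PID" | count_numeric_fds` then gives a
number that can be compared with `ulimit -n`.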

^ permalink raw reply	[relevance 71%]

* Issues with `lei` as non-root
@ 2023-03-28  1:00 71% Louis DeLosSantos
  2023-03-28  1:32 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Louis DeLosSantos @ 2023-03-28  1:00 UTC (permalink / raw)
  To: meta

Hello,

I'm experimenting with `lei` as a nice search tool for `lore.kernel.org`

Everything works fine with the caveat that it seems to break if I'm not root.

When using `lei` as non-root we get this error:

```
E: Linux::Inotify2->new: Too many open files at
/usr/share/perl5/vendor_perl/PublicInbox/DirIdle.pm line 40.
connect(/run/user/1000/lei/5.seq.sock): Connection refused (after
attempted daemon start)
```

Any ideas why this may occur? Is `lei` designed to only be run as root
or is Fedora installing perl in an odd fashion which results in root
needing to be used?

^ permalink raw reply	[relevance 71%]

* repeat `lei import' users?
@ 2023-03-23 22:05 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-03-23 22:05 UTC (permalink / raw)
  To: meta

Just wondering if there are any lei-mail-sync-overview(7) followers
yet... It could be a bit less clunky :x

^ permalink raw reply	[relevance 71%]

* [PATCH] lei: improve bash completion involving colons
@ 2023-03-23 21:45 46% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-03-23 21:45 UTC (permalink / raw)
  To: meta

This fixes completions of labels (`+L:' for `lei import' and
`L:' for `lei q') so they can appear anywhere in the
command-line.

I mainly wanted this for `lei import $URL +L:label', but
this also fixes `lei forget-external' completions for URLs
(which involve colons).
---
 contrib/completion/lei-completion.bash | 15 ++++++++-----
 lib/PublicInbox/LeiExternal.pm         | 31 +++++++++++---------------
 lib/PublicInbox/LeiForgetExternal.pm   |  8 ++-----
 lib/PublicInbox/LeiImport.pm           | 22 +++++++++++-------
 lib/PublicInbox/LeiQuery.pm            |  2 ++
 5 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/contrib/completion/lei-completion.bash b/contrib/completion/lei-completion.bash
index 5c137e68..b86afa2c 100644
--- a/contrib/completion/lei-completion.bash
+++ b/contrib/completion/lei-completion.bash
@@ -1,16 +1,19 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # preliminary bash completion support for lei (Local Email Interface)
 # Needs a lot of work, see `lei__complete' in lib/PublicInbox::LEI.pm
 _lei() {
 	local wordlist="$(lei _complete ${COMP_WORDS[@]})"
-	case $wordlist in
-	*':'* | *'='* | '//'*) compopt -o nospace ;;
-	*) compopt +o nospace ;; # the default
-	esac
 	wordlist="${wordlist//;/\\\\;}" # escape ';' for ';UIDVALIDITY' and such
-	COMPREPLY=($(compgen -W "$wordlist" -- "${COMP_WORDS[COMP_CWORD]}"))
+
+	local word="${COMP_WORDS[COMP_CWORD]}"
+	if test "$word" = ':' && test $COMP_CWORD -ge 1
+	then
+		COMPREPLY=($(compgen -W "$wordlist" --))
+	else
+		COMPREPLY=($(compgen -W "$wordlist" -- "$word"))
+	fi
 	return 0
 }
 complete -o default -o bashdefault -F _lei lei
diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 3e2a2288..31b9bd1e 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -86,39 +86,34 @@ sub canonicalize_excludes {
 # returns an anonymous sub which returns an array of potential results
 sub complete_url_prepare {
 	my $argv = $_[-1]; # $_[0] may be $lei
-	# Workaround bash word-splitting URLs to ['https', ':', '//' ...]
-	# Maybe there's a better way to go about this in
-	# contrib/completion/lei-completion.bash
-	my $re = '';
-	my $cur = pop(@$argv) // '';
+	# Workaround bash default COMP_WORDBREAKS splitting URLs to
+	# ['https', ':', '//', ...].  COMP_WORDBREAKS is global for all
+	# completions loaded, not just ours, so we can't change it.
+	# cf. contrib/completion/lei-completion.bash
+	my ($pfx, $cur)  = ('', pop(@$argv) // '');
 	if (@$argv) {
 		my @x = @$argv;
-		if ($cur eq ':' && @x) {
+		if ($cur =~ /\A[:;=]\z/) { # COMP_WORDBREAKS + URL union
 			push @x, $cur;
 			$cur = '';
 		}
-		while (@x > 2 && $x[0] !~ /\A(?:http|nntp|imap)s?\z/i &&
-				$x[1] ne ':') {
-			shift @x;
+		while (@x && $pfx !~ m!\A(?: (?:[\+\-]?(?:L|kw):) |
+				(?:(?:imap|nntp|http)s?:) |
+				(?:--\w?\z)|(?:-\w?\z) )!x) {
+			$pfx = pop(@x).$pfx;
 		}
-		if (@x >= 2) { # qw(https : hostname : 443) or qw(http :)
-			$re = join('', @x);
-		} else { # just filter out the flags and hope for the best
-			$re = join('', grep(!/^-/, @$argv));
-		}
-		$re = quotemeta($re);
 	}
+	my $re = qr!\A\Q$pfx\E(\Q$cur\E.*)!;
 	my $match_cb = sub {
 		# the "//;" here (for AUTH=ANONYMOUS) interacts badly with
 		# bash tab completion, strip it out for now since our commands
 		# work w/o it.  Not sure if there's a better solution...
 		$_[0] =~ s!//;AUTH=ANONYMOUS\@!//!i;
-		$_[0] =~ s!;!\\;!g;
 		# only return the part specified on the CLI
 		# don't duplicate if already 100% completed
-		$_[0] =~ /\A$re(\Q$cur\E.*)/ ? ($cur eq $1 ? () : $1) : ()
+		$_[0] =~ $re ? ($cur eq $1 ? () : $1) : ()
 	};
-	wantarray ? ($re, $cur, $match_cb) : $match_cb;
+	wantarray ? ($pfx, $cur, $match_cb) : $match_cb;
 }
 
 1;
diff --git a/lib/PublicInbox/LeiForgetExternal.pm b/lib/PublicInbox/LeiForgetExternal.pm
index 07f0ac80..39bfc60b 100644
--- a/lib/PublicInbox/LeiForgetExternal.pm
+++ b/lib/PublicInbox/LeiForgetExternal.pm
@@ -32,14 +32,10 @@ sub lei_forget_external {
 sub _complete_forget_external {
 	my ($lei, @argv) = @_;
 	my $cfg = $lei->_lei_cfg or return ();
-	my ($cur, $re, $match_cb) = $lei->complete_url_prepare(\@argv);
-	# FIXME: bash completion off "http:" or "https:" when the last
-	# character is a colon doesn't work properly even if we're
-	# returning "//$HTTP_HOST/$PATH_INFO/", not sure why, could
-	# be a bash issue.
+	my ($pfx, $cur, $match_cb) = $lei->complete_url_prepare(\@argv);
 	map {
 		$match_cb->(substr($_, length('external.')));
-	} grep(/\Aexternal\.$re\Q$cur/, @{$cfg->{-section_order}});
+	} grep(/\Aexternal\.\Q$pfx$cur/, @{$cfg->{-section_order}});
 }
 
 1;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 2d91e4c4..9053048a 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -115,18 +115,24 @@ sub lei_import { # the main "lei import" method
 
 sub _complete_import {
 	my ($lei, @argv) = @_;
-	my ($re, $cur, $match_cb) = $lei->complete_url_prepare(\@argv);
-	my @k = $lei->url_folder_cache->keys($argv[-1] // undef, 1);
+	my $has_arg = @argv;
+	my ($pfx, $cur, $match_cb) = $lei->complete_url_prepare(\@argv);
+	my @try = $has_arg ? ($pfx.$cur, $argv[-1]) : ($argv[-1]);
+	push(@try, undef) if defined $try[-1];
+	my (@f, @k);
+	for (@try) {
+		@k = $lei->url_folder_cache->keys($_, 1) and last;
+	}
 	my @L = eval { $lei->_lei_store->search->all_terms('L') };
 	push(@k, map { "+L:$_" } @L);
-	my @m = map { $match_cb->($_) } @k;
-	my %f = map { $_ => 1 } (@m ? @m : @k);
 	if (my $lms = $lei->lms) {
-		@k = $lms->folders($argv[-1] // undef, 1);
-		@m = map { $match_cb->($_) } @k;
-		if (@m) { @f{@m} = @m } else { @f{@k} = @k }
+		for (@try) {
+			@f = $lms->folders($_, 1) and last;
+		}
+		push @k, @f;
 	}
-	keys %f;
+	my @m = map { $match_cb->($_) } @k;
+	@m ? @m : @k;
 }
 
 no warnings 'once';
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index 358574ea..3337e5d4 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -173,6 +173,8 @@ no query allowed on command-line with --stdin
 # shell completion helper called by lei__complete
 sub _complete_q {
 	my ($self, @argv) = @_;
+	join('', @argv) =~ /\bL:\S*\z/ and
+		return eval { $self->_lei_store->search->all_terms('L') };
 	my @cur;
 	my $cb = $self->lazy_cb(qw(forget-external _complete_));
 	while (@argv) {
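
A simplified shell rendition of the prefix-rejoining loop in
complete_url_prepare, for readers who want to see what bash's splitting
on `:' does to the words.  This is only a sketch: rejoin_prefix is a
hypothetical name, it handles just the label and URL-scheme prefixes,
and it omits the command-line switch cases of the Perl regexp.

```sh
# rejoin_prefix WORD...: glue trailing words back together, right to
# left, until the result starts with a label (L:, kw:) or URL scheme
rejoin_prefix() {
	pfx=''
	i=$#
	while [ "$i" -ge 1 ]; do
		eval "w=\${$i}" # fetch the i-th positional parameter
		pfx="$w$pfx"
		case $pfx in
		L:* | [+-]L:* | kw:* | [+-]kw:*) break ;;
		http:* | https:* | imap:* | imaps:* | nntp:* | nntps:*) break ;;
		esac
		i=$((i - 1))
	done
	printf '%s\n' "$pfx"
}
```

With the words bash hands over for `lei import +L:inb', that is
`rejoin_prefix lei import +L : inb', this prints `+L:inb'.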

^ permalink raw reply related	[relevance 46%]

* [PATCH 4/6] doc: lei import: add hints about nntp.* and imap.* config options
  @ 2023-03-09 19:28 71% ` Eric Wong
  2023-03-09 19:28 68% ` [PATCH 5/6] doc: lei config: update with --edit and --list examples Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-03-09 19:28 UTC (permalink / raw)
  To: meta

I'm setting up more imports and forgot about them :x
---
 Documentation/lei-import.pod | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 69ec6497..31d6db13 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -86,8 +86,13 @@ Default: C<auto>
 
 Use the specified proxy (e.g., C<socks5h://0:9050>).
 
+Consider L<imap.proxy> and L<nntp.proxy> which can be persistently
+configured on a per-host basis in L<lei-config(1)>.
+
 =back
 
+See L<lei-config(1)> for various C<imap.*> and C<nntp.*> options.
+
 =head1 CONTACT
 
 Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
@@ -103,4 +108,4 @@ License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
 
 =head1 SEE ALSO
 
-L<lei-index(1)>, L<lei-store-format(5)>
+L<lei-config(1)>, L<lei-index(1)>, L<lei-store-format(5)>

^ permalink raw reply related	[relevance 71%]

* [PATCH 5/6] doc: lei config: update with --edit and --list examples
    2023-03-09 19:28 71% ` [PATCH 4/6] doc: lei import: add hints about nntp.* and imap.* config options Eric Wong
@ 2023-03-09 19:28 68% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2023-03-09 19:28 UTC (permalink / raw)
  To: meta

I typically use --edit/-e to make changes and --list/-l with
git; and same with lei.
---
 Documentation/lei-config.pod | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/Documentation/lei-config.pod b/Documentation/lei-config.pod
index 663404fe..23a60c8a 100644
--- a/Documentation/lei-config.pod
+++ b/Documentation/lei-config.pod
@@ -4,7 +4,11 @@ lei-config - git-config wrapper for lei configuration file
 
 =head1 SYNOPSIS
 
-lei config [OPTIONS]
+lei config <name> [[<value>] [<value-pattern>]]
+
+lei config -l | --list
+
+lei config -e | --edit
 
 =head1 DESCRIPTION
 
@@ -97,6 +101,27 @@ C<frag>, C<func>, and C<context>.
 
 =back
 
+=head1 OPTIONS
+
+Most L<git-config(1)> command-line switches are accepted by C<lei config>
+as-is.  The most-frequently-used options are expected to be:
+
+=over 4
+
+=item -e
+
+=item --edit
+
+Opens an editor to edit the lei config file
+
+=item -l
+
+=item --list
+
+List all variables set in config file, along with their values.
+
+=back
+
 =head1 CONTACT
 
 Feedback welcome via plain-text mail to L<mailto:meta@public-inbox.org>
@@ -106,6 +131,6 @@ L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
 
 =head1 COPYRIGHT
 
-Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+Copyright all contributors L<mailto:meta@public-inbox.org>
 
 License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>

^ permalink raw reply related	[relevance 68%]

* [PATCH] doc: note "lei q -tt" is broken with HTTP(S) remotes
  2023-02-26 17:09 69%     ` Eric Wong
@ 2023-02-26 17:15 71%       ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-02-26 17:15 UTC (permalink / raw)
  To: Maxim Mikityanskiy; +Cc: meta, Kyle Meyer

Eric Wong <e@80x24.org> wrote:
> Getting -tt to work on remote inboxes will take more effort.
> I'm not sure which option is better:

I suppose documenting the current breakage first is important:

--------- 8< --------
Subject: [PATCH] doc: note "lei q -tt" is broken with HTTP(S) remotes

I'm still trying to decide how to handle HTTP(S) remotes
properly...

Link: https://public-inbox.org/meta/20230226170931.M947721@dcvr/
---
 Documentation/lei-q.pod | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index d52c5b04..5e9a5658 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -124,6 +124,9 @@ of the same thread.
 TODO: Warning: this flag may become persistent and saved in
 lei/store unless an MUA unflags it!  (Behavior undecided)
 
+Caveat: C<-tt> only works on locally-indexed messages at the
+moment, and not on remote (HTTP(S)) endpoints.
+
 =item --jobs=QUERY_WORKERS[,WRITE_WORKERS]
 =item --jobs=,WRITE_WORKERS
 

^ permalink raw reply related	[relevance 71%]

* Re: [PATCH] lei q: do not collapse threads with `-tt'
  2023-02-26 12:17 71%   ` Maxim Mikityanskiy
@ 2023-02-26 17:09 69%     ` Eric Wong
  2023-02-26 17:15 71%       ` [PATCH] doc: note "lei q -tt" is broken with HTTP(S) remotes Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-02-26 17:09 UTC (permalink / raw)
  To: Maxim Mikityanskiy; +Cc: meta, Kyle Meyer

Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> On Tue, Feb 14, 2023 at 02:42:32AM +0000, Eric Wong wrote:
> > Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> > > lei q --no-save -a -o /tmp/lei-test -I 'https://lore.kernel.org/all' \
> > >     -tt 'a:syzbot AND rt:2023-01-01..2023-01-07'

<snip>

> > Yes, now it seems it's the collapsing optimization.

<snip>

> Sorry for taking too long, I finally found a minute to test it, and
> unfortunately I didn't see a difference. I queried for:
> 
> a:syzbot AND rt:2023-02-01..2023-02-07
> 
> and I still saw a lot of threads without a single flag.
> 
> I double-checked that the patch was actually applied, killed lei-daemon,
> and removed the mailbox directory, but it didn't help.

Ah, oops.  My original fix only works for locally-cloned inboxes;
but not remote (http/https) inboxes...

I think some inconsistency on the client side is also introduced
by using -I/--include vs --only; since -I/--include will use
previously-indexed messages in ~/.local/share/lei/store

Getting -tt to work on remote inboxes will take more effort.
I'm not sure which option is better:

1) Support t=2 natively in the WWW interface.  This requires
   both the server and client to be updated.  It may require
   extra dedupe step on the server, making it more expensive.
   Thinking out loud, I think the dedupe step can be avoided
   by sorting on THREADID...

2) use t=1 in the client as-is, but index the streamed mbox
   locally, first.  This requires a temporary Xapian DB to
   ensure there's no overlap if using --only.
   This only requires a client update, but likely adds more
   complexity.  It also delays updates to the Maildir,
   meaning all messages need to be downloaded before the MUA
   sees it...

I'm leaning towards 1...

^ permalink raw reply	[relevance 69%]

* Re: [PATCH] lei q: do not collapse threads with `-tt'
  2023-02-14  2:42 66% ` [PATCH] lei q: do not collapse threads with `-tt' Eric Wong
@ 2023-02-26 12:17 71%   ` Maxim Mikityanskiy
  2023-02-26 17:09 69%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Maxim Mikityanskiy @ 2023-02-26 12:17 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta, Kyle Meyer

On Tue, Feb 14, 2023 at 02:42:32AM +0000, Eric Wong wrote:
> Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> > lei q --no-save -a -o /tmp/lei-test -I 'https://lore.kernel.org/all' \
> >     -tt 'a:syzbot AND rt:2023-01-01..2023-01-07'
> 
> At first, I thought -a (--augment) was causing it...
> 
> Sidenote: you also don't need to quote the query (I forget the exact
> rules, but I tried to keep quotes easier for phrase searches).
> 
> > It looks as if the match works correctly, but the -tt option fails to
> > mark most of the matched emails as important, except a few that actually
> > got marked (I couldn't find a pattern here). It's also not consistent,
> > for example, after I removed /tmp/lei-test and restarted the lei q
> > command, I got many more important emails, almost in each thread, but
> > there were still threads without flagged emails.
> 
> Yes, now it seems it's the collapsing optimization.
> 
> > I'm checking the flags with mutt.
> > 
> > Does anyone know what could be the reason for such behavior?
> 
> I think the following patch fixes it.

Sorry for taking too long, I finally found a minute to test it, and
unfortunately I didn't see a difference. I queried for:

a:syzbot AND rt:2023-02-01..2023-02-07

and I still saw a lot of threads without a single flag.

I double-checked that the patch was actually applied, killed lei-daemon,
and removed the mailbox directory, but it didn't help.

> (I accidentally sent you a private copy with invalid blobs since
> I had other unpublished changes)
> 
> -----8<-------
> Subject: [PATCH] lei q: do not collapse threads with `-tt'
> 
> While having Xapian collapse threads is an easy way to reduce
> the amount of deduplication work we need to do when writing
> out threads; we can't rely on it when using `lei q -tt` since
> that needs to flag all hits.
> 
> Reported-by: Maxim Mikityanskiy <maxtram95@gmail.com>
> Link: https://public-inbox.org/git/Y+pgBmj0jxR+cVkD@mail.gmail.com/
> ---
>  lib/PublicInbox/Search.pm | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
> index 2feb3e13..273cc57c 100644
> --- a/lib/PublicInbox/Search.pm
> +++ b/lib/PublicInbox/Search.pm
> @@ -460,8 +460,9 @@ sub _enquire_once { # retry_reopen callback
>  		$enquire->set_sort_by_relevance_then_value(TS, !$opts->{asc});
>  	}
>  
> -	# `mairix -t / --threads' or JMAP collapseThreads
> -	if ($opts->{threads} && has_threadid($self)) {
> +	# `lei q -t / --threads' or JMAP collapseThreads; but don't collapse
> +	# on `-tt' ({threads} > 1) which sets the Flagged|Important keyword
> +	if (($opts->{threads} // 0) == 1 && has_threadid($self)) {
>  		$enquire->set_collapse_key(THREADID);
>  	}
>  	$enquire->get_mset($opts->{offset} || 0, $opts->{limit} || 50);

^ permalink raw reply	[relevance 71%]

* Re: FUSE3 vs read-write IMAP for lei
  2022-12-09  1:41 65% ` FUSE3 vs read-write IMAP for lei Eric Wong
@ 2023-02-20 19:27 71%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-02-20 19:27 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> == FUSE3 - Maildir-oriented FS
> 
> + will require C compiler since old FUSE XS modules don't do
>   FUSE3 (for readdirplus); and going w/o readdirplus is
>   unimaginable for Maildir.
> 
> + I already have existing (unreleased) AGPL-3 work based on
>   FUSE3 + URCU + Perl5 with just-ahead-of-time (JAOT) compilation

Fwiw, I've pushed out a new "fuse3" branch to public-inbox.git:

https://80x24.org/public-inbox.git/80ce906027eeb7b4cc5cc7d3858294927951988a/s/

I think I need some C + URCU in my life to keep my brain working

> * I've seen the light w/ URCU, and can't go back to C without it :P
> 
> - likely Linux-only (not sure how good FUSE support will be if
>   depending on FUSE3 features)

Well, I got rid of the futex requirement from the original...

> - kernel caches still incur nasty memory overhead w/ Maildir
> 
> - readdir(3) userspace API still sucks

^ permalink raw reply	[relevance 71%]

* [PATCH] lei q: do not collapse threads with `-tt'
  2023-02-13 16:06 63% lei q -tt doesn't work properly? Maxim Mikityanskiy
@ 2023-02-14  2:42 66% ` Eric Wong
  2023-02-26 12:17 71%   ` Maxim Mikityanskiy
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-02-14  2:42 UTC (permalink / raw)
  To: Maxim Mikityanskiy; +Cc: meta, Kyle Meyer

Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> lei q --no-save -a -o /tmp/lei-test -I 'https://lore.kernel.org/all' \
>     -tt 'a:syzbot AND rt:2023-01-01..2023-01-07'

At first, I thought -a (--augment) was causing it...

Sidenote: you also don't need to quote the query (I forget the exact
rules, but I tried to keep quotes easier for phrase searches).

> It looks as if the match works correctly, but the -tt option fails to
> mark most of the matched emails as important, except a few that actually
> got marked (I couldn't find a pattern here). It's also not consistent,
> for example, after I removed /tmp/lei-test and restarted the lei q
> command, I got many more important emails, almost in each thread, but
> there were still threads without flagged emails.

Yes, now it seems it's the collapsing optimization.

> I'm checking the flags with mutt.
> 
> Does anyone know what could be the reason for such behavior?

I think the following patch fixes it.

(I accidentally sent you a private copy with invalid blobs since
I had other unpublished changes)

-----8<-------
Subject: [PATCH] lei q: do not collapse threads with `-tt'

While having Xapian collapse threads is an easy way to reduce
the amount of deduplication work we need to do when writing
out threads; we can't rely on it when using `lei q -tt` since
that needs to flag all hits.

Reported-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Link: https://public-inbox.org/git/Y+pgBmj0jxR+cVkD@mail.gmail.com/
---
 lib/PublicInbox/Search.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 2feb3e13..273cc57c 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -460,8 +460,9 @@ sub _enquire_once { # retry_reopen callback
 		$enquire->set_sort_by_relevance_then_value(TS, !$opts->{asc});
 	}
 
-	# `mairix -t / --threads' or JMAP collapseThreads
-	if ($opts->{threads} && has_threadid($self)) {
+	# `lei q -t / --threads' or JMAP collapseThreads; but don't collapse
+	# on `-tt' ({threads} > 1) which sets the Flagged|Important keyword
+	if (($opts->{threads} // 0) == 1 && has_threadid($self)) {
 		$enquire->set_collapse_key(THREADID);
 	}
 	$enquire->get_mset($opts->{offset} || 0, $opts->{limit} || 50);
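
To illustrate why collapsing loses direct hits, here is a toy model in
shell.  It is not Xapian's API; collapse_by_thread is a made-up name
and the input format ("THREADID MSGID" per line) is an assumption for
the example only.

```sh
# collapse_by_thread: read "THREADID MSGID" lines on stdin and keep
# only the first line seen for each THREADID
collapse_by_thread() {
	awk '!seen[$1]++'
}
```

Feeding it three hits from two threads yields two lines.  That is
enough for `-t', where the rest of each thread is fetched anyway, but
`-tt' needs every direct hit in order to flag it.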

^ permalink raw reply related	[relevance 66%]

* lei q -tt doesn't work properly?
@ 2023-02-13 16:06 63% Maxim Mikityanskiy
  2023-02-14  2:42 66% ` [PATCH] lei q: do not collapse threads with `-tt' Eric Wong
  0 siblings, 1 reply; 200+ results
From: Maxim Mikityanskiy @ 2023-02-13 16:06 UTC (permalink / raw)
  To: meta; +Cc: Eric Wong, Kyle Meyer

Hello,

I'm trying to use the -tt flag to download the whole thread, but mark
the actual matching emails as important. I'm not sure if I'm doing it
incorrectly, or maybe there is a bug in lei.

According to the man page:

--cut--

-t  Return all messages in the same thread as the actual match(es).

    Using this twice ("-tt") sets the "flagged" (AKA "important") on
    messages which were actual matches.  This is useful to distinguish
    messages which were direct hits from messages which were merely
    part of the same thread.

--cut--

I'm using this command, for example:

lei q --no-save -a -o /tmp/lei-test -I 'https://lore.kernel.org/all' \
    -tt 'a:syzbot AND rt:2023-01-01..2023-01-07'

What I expect to see is at least one flagged email in each thread
(otherwise why would this thread be downloaded). Instead, most
of the emails are not flagged; that is, whole threads don't have any
flagged email (although they clearly have emails from syzbot, which
caused the match). Occasionally, some emails are flagged, for example,
these two:

https://lore.kernel.org/all/87wn621hmp.fsf@toke.dk/
https://lore.kernel.org/all/20230103081308.942805751@linuxfoundation.org/

It looks as if the match works correctly, but the -tt option fails to
mark most of the matched emails as important, except a few that actually
got marked (I couldn't find a pattern here). It's also not consistent,
for example, after I removed /tmp/lei-test and restarted the lei q
command, I got many more important emails, almost in each thread, but
there were still threads without flagged emails.

I'm checking the flags with mutt.

Does anyone know what could be the reason for such behavior?

Thanks,
Max

^ permalink raw reply	[relevance 63%]

* [PATCH] t/lei-refresh-mail-sync: avoid kill+sleep loop
@ 2023-02-12  3:12 61% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-02-12  3:12 UTC (permalink / raw)
  To: meta

While we can't waitpid() on a daemonized process, we can abuse the
lack of FD_CLOEXEC to detect a process death.  This saves
roughly 400ms for this slow test.
---
 lib/PublicInbox/TestCommon.pm |  3 +++
 t/lei-refresh-mail-sync.t     | 20 ++++++++++++--------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index 1fe7931e..8a34e45a 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -480,6 +480,9 @@ sub start_script {
 	my $pid = fork // die "fork: $!\n";
 	if ($pid == 0) {
 		eval { PublicInbox::DS->Reset };
+		for (@{delete($opt->{-CLOFORK}) // []}) {
+			close($_) or die "close $!";
+		}
 		# pretend to be systemd (cf. sd_listen_fds(3))
 		# 3 == SD_LISTEN_FDS_START
 		my $fd;
diff --git a/t/lei-refresh-mail-sync.t b/t/lei-refresh-mail-sync.t
index ea83a513..0498a0c4 100644
--- a/t/lei-refresh-mail-sync.t
+++ b/t/lei-refresh-mail-sync.t
@@ -5,17 +5,20 @@ use strict; use v5.10.1; use PublicInbox::TestCommon;
 require_mods(qw(lei));
 use File::Path qw(remove_tree);
 require Socket;
+use Fcntl qw(F_SETFD);
+
+pipe(my ($stop_r, $stop_w)) or xbail "pipe: $!";
+fcntl($stop_w, F_SETFD, 0) or xbail "F_SETFD: $!";
 
 my $stop_daemon = sub { # needed since we don't have inotify
+	close $stop_w or xbail "close \$stop_w: $!";
 	lei_ok qw(daemon-pid);
 	chomp(my $pid = $lei_out);
 	$pid > 0 or xbail "bad pid: $pid";
 	kill('TERM', $pid) or xbail "kill: $!";
-	for (0..10) {
-		tick;
-		kill(0, $pid) or last;
-	}
-	kill(0, $pid) and xbail "daemon still running (PID:$pid)";
+	is(sysread($stop_r, my $buf, 1), 0, 'daemon stop pipe read EOF');
+	pipe($stop_r, $stop_w) or xbail "pipe: $!";
+	fcntl($stop_w, F_SETFD, 0) or xbail "F_SETFD: $!";
 };
 
 test_lei({ daemon_only => 1 }, sub {
@@ -88,7 +91,8 @@ SKIP: {
 		$sock_cls //= ref($s);
 		my $cmd = [ "-$x", '-W0', "--stdout=$home/$x.out",
 			"--stderr=$home/$x.err" ];
-		my $td = start_script($cmd, $env, { 3 => $s }) or xbail("-$x");
+		my $opt = { 3 => $s, -CLOFORK => [ $stop_w ] };
+		my $td = start_script($cmd, $env, $opt) or xbail("-$x");
 		my $addr = tcp_host_port($s);
 		$srv->{$x} = { addr => $addr, td => $td, cmd => $cmd, s => $s };
 	}
@@ -139,8 +143,8 @@ SKIP: {
 	my $cmd = $srv->{imapd}->{cmd};
 	my $s = $srv->{imapd}->{s};
 	$s->blocking(0);
-	$srv->{imapd}->{td} = start_script($cmd, $env, { 3 => $s }) or
-		xbail "@$cmd";
+	my $opt = { 3 => $s, -CLOFORK => [ $stop_w ] };
+	$srv->{imapd}->{td} = start_script($cmd, $env, $opt) or xbail "@$cmd";
 	lei_ok 'refresh-mail-sync', '--all';
 	lei_ok 'inspect', "blob:$oid";
 	is($lei_out, $before, 'no changes when server was down');
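
The pipe trick used by this patch can be shown outside the test suite.
This is only a sketch under stated assumptions: wait_for_daemon_exit()
is a hypothetical name, and the "daemon" here is a plain double-forked
child rather than lei-daemon.  The idea is that a process we cannot
waitpid() on still holds the write end of a pipe, so the read end only
sees EOF once that process is gone.

```perl
use strict;
use warnings;
use Fcntl qw(F_SETFD);
use POSIX ();

# Hypothetical helper: run $daemon_body in a double-forked ("daemonized")
# process and block until it exits.  Returns the sysread result, which is
# 0 (EOF) once every holder of the write end has exited.
sub wait_for_daemon_exit {
	my ($daemon_body) = @_;
	pipe(my ($r, $w)) or die "pipe: $!";
	# Perl sets close-on-exec on new FDs above $^F (normally 2);
	# clear it so the write end would also survive an exec()
	fcntl($w, F_SETFD, 0) or die "F_SETFD: $!";
	my $pid = fork // die "fork: $!";
	if ($pid == 0) {
		close $r;
		my $gpid = fork // POSIX::_exit(1);
		POSIX::_exit(0) if $gpid; # intermediate process exits
		$daemon_body->(); # grandchild still holds $w
		POSIX::_exit(0);
	}
	waitpid($pid, 0); # reaps only the intermediate process
	close $w or die "close: $!"; # parent must drop its own copy
	my $n = sysread($r, my $buf, 1);
	defined($n) or die "sysread: $!";
	$n;
}

{
	my $n = wait_for_daemon_exit(sub { select(undef, undef, undef, 0.1) });
	print "read returned $n\n";
}
```

Closing the parent's copy of the write end before reading matters:
otherwise the parent itself keeps the pipe open and the read would never
return EOF.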

^ permalink raw reply related	[relevance 61%]

* [PATCH] lei: drop -watches and -lei_note_event from workers
@ 2023-01-31  0:05 70% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-01-31  0:05 UTC (permalink / raw)
  To: meta

I noticed these while tracking down circular refs for commit
7b654d175cf2e31b (ipc: drop awaitpid_init to avoid circular refs, 2023-01-30).
While they're not the cause of circular refs, they're still
a waste of memory in worker processes.
---
 lib/PublicInbox/LEI.pm | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ffd50db5..d05b20de 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -561,17 +561,17 @@ sub note_sigpipe { # triggers sigpipe_handler
 sub _lei_atfork_child {
 	my ($self, $persist) = @_;
 	# we need to explicitly close things which are on stack
+	my $cfg = $self->{cfg};
 	if ($persist) {
 		open $self->{3}, '<', '/' or die "open(/) $!";
 		fchdir($self);
 		close($_) for (grep(defined, delete @$self{qw(0 1 2 sock)}));
-		if (my $cfg = $self->{cfg}) {
-			delete @$cfg{qw(-lei_store -watches -lei_note_event)};
-		}
+		delete @$cfg{qw(-lei_store -watches -lei_note_event)};
 	} else { # worker, Net::NNTP (Net::Cmd) uses STDERR directly
 		open STDERR, '+>&='.fileno($self->{2}) or warn "open $!";
 		STDERR->autoflush(1);
 		POSIX::setpgid(0, $$) // die "setpgid(0, $$): $!";
+		delete @$cfg{qw(-watches -lei_note_event)};
 	}
 	close($_) for (grep(defined, delete @$self{qw(old_1 au_done)}));
 	delete $self->{-socks};
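
The `delete @$cfg{...}' lines above use a hash slice to drop several
keys in one statement.  A minimal sketch with a made-up $cfg follows;
the values are placeholders and only the key names come from the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $self->{cfg}; values are placeholders
my $cfg = {
	'-lei_store' => 'store handle',
	'-watches' => 'watch handle',
	'-lei_note_event' => 'event handle',
	'leistore.dir' => '/tmp/example',
};

# delete on a hash slice removes every listed key and returns the
# removed values (undef for keys which were absent)
my @dropped = delete @$cfg{qw(-watches -lei_note_event)};

print 'remaining: ', join(',', sort keys %$cfg), "\n";
print 'dropped: ', scalar(@dropped), " values\n";
```

Once the last reference to a dropped value goes away, Perl can free it,
which is the memory saving the commit message refers to.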

^ permalink raw reply related	[relevance 70%]

* [PATCH 0/2] fix xt/lei-auth-fail.t
@ 2023-01-29 22:58 71% Eric Wong
  2023-01-29 22:58 69% ` [PATCH 2/2] xt/lei-auth-fail: use valid label name Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2023-01-29 22:58 UTC (permalink / raw)
  To: meta

I need a failing mock IMAP server for this, or to remember xt/ exists :<

Eric Wong (2):
  lei_input: give a hint for upper-case in labels
  xt/lei-auth-fail: use valid label name

 lib/PublicInbox/LeiInput.pm | 2 ++
 xt/lei-auth-fail.t          | 7 ++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 2/2] xt/lei-auth-fail: use valid label name
  2023-01-29 22:58 71% [PATCH 0/2] fix xt/lei-auth-fail.t Eric Wong
@ 2023-01-29 22:58 69% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-01-29 22:58 UTC (permalink / raw)
  To: meta

Uppercase characters aren't allowed for labels due to Xapian
boolean limitations, so we need to use lowercase labels.

Fixes: 27015c3365fd0690 (lei_input: disallow uppercase characters for labels, 2021-10-31)
---
 xt/lei-auth-fail.t | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xt/lei-auth-fail.t b/xt/lei-auth-fail.t
index 06cb8533..1ccc2ab2 100644
--- a/xt/lei-auth-fail.t
+++ b/xt/lei-auth-fail.t
@@ -1,7 +1,8 @@
 #!perl -w
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-use strict; use v5.10.1; use PublicInbox::TestCommon;
+use v5.12;
+use PublicInbox::TestCommon;
 require_mods(qw(Mail::IMAPClient lei));
 
 # TODO: mock IMAP server which fails at authentication so we don't
@@ -13,7 +14,7 @@ test_lei(sub {
 	for my $pfx ([qw(q z:0.. --only), "$ro_home/t1", '-o'],
 			[qw(convert -o mboxrd:/dev/stdout)],
 			[qw(convert t/utf8.eml -o), $imap_fail],
-			['import'], [qw(tag +L:INBOX)]) {
+			['import'], [qw(tag +L:inbox)]) {
 		ok(!lei(@$pfx, $imap_fail), "IMAP auth failure on @$pfx");
 		like($lei_err, qr!\bE:.*?imaps?://.*?!sm, 'error shown');
 		unlike($lei_err, qr!Hunter2!s, 'password not shown');
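
For illustration, a lowercase-only label rule could be checked as below.
This is a hypothetical sketch and not the actual PublicInbox::LeiInput
code: check_label() and its message text are made up, and it only covers
the uppercase restriction described in the commit message.

```perl
use strict;
use warnings;

# Hypothetical check: returns (1, undef) for an acceptable label, or
# (0, $hint) with a lowercase suggestion when uppercase ASCII is present.
sub check_label {
	my ($label) = @_;
	return (1, undef) if $label !~ /[A-Z]/;
	(0, sprintf('label %s has uppercase characters, try %s',
			$label, lc($label)));
}

for my $l (qw(inbox INBOX)) {
	my ($ok, $hint) = check_label($l);
	print $ok ? "$l: ok\n" : "$l: $hint\n";
}
```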

^ permalink raw reply related	[relevance 69%]

* [PATCH 2/2] content_digest_dbg: convert to arrayref and limit to lei
  @ 2023-01-29 10:30 63% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-01-29 10:30 UTC (permalink / raw)
  To: meta

Since it's an extremely small class and not subclassed or
anything, we'll make it even smaller as an arrayref.

We also don't load this for PublicInbox::WWW or anything that
runs in public-facing daemons.
---
 lib/PublicInbox/ContentDigestDbg.pm | 10 ++++++----
 lib/PublicInbox/MailDiff.pm         |  5 +----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/ContentDigestDbg.pm b/lib/PublicInbox/ContentDigestDbg.pm
index 899afbbe..5de0ee8a 100644
--- a/lib/PublicInbox/ContentDigestDbg.pm
+++ b/lib/PublicInbox/ContentDigestDbg.pm
@@ -1,17 +1,19 @@
 # Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+# only loaded in lei
 package PublicInbox::ContentDigestDbg; # cf. PublicInbox::ContentDigest
 use v5.12;
 use Data::Dumper;
 use PublicInbox::SHA;
+$Data::Dumper::Useqq = $Data::Dumper::Terse = 1;
 
-sub new { bless { dig => PublicInbox::SHA->new(256), fh => $_[1] }, __PACKAGE__ }
+sub new { bless [ PublicInbox::SHA->new(256), $_[1] ], __PACKAGE__ }
 
 sub add {
-	$_[0]->{dig}->add($_[1]);
-	print { $_[0]->{fh} } Dumper([split(/^/sm, $_[1])]) or die "print $!";
+	$_[0]->[0]->add($_[1]);
+	print { $_[0]->[1] } Dumper([split(/^/sm, $_[1])]) or die "print $!";
 }
 
-sub hexdigest { $_[0]->{dig}->hexdigest; }
+sub hexdigest { $_[0]->[0]->hexdigest }
 
 1;
diff --git a/lib/PublicInbox/MailDiff.pm b/lib/PublicInbox/MailDiff.pm
index 0ed06f9a..a0ecef9f 100644
--- a/lib/PublicInbox/MailDiff.pm
+++ b/lib/PublicInbox/MailDiff.pm
@@ -4,8 +4,6 @@ package PublicInbox::MailDiff;
 use v5.12;
 use File::Temp 0.19 (); # 0.19 for ->newdir
 use PublicInbox::ContentHash qw(content_digest);
-use PublicInbox::ContentDigestDbg;
-use Data::Dumper ();
 use PublicInbox::MsgIter qw(msg_part_text);
 use PublicInbox::ViewDiff qw(flush_diff);
 use PublicInbox::GitAsyncCat;
@@ -34,12 +32,11 @@ sub dump_eml ($$$) {
 	$eml->each_part(\&write_part, $self);
 
 	return if $self->{ctx}; # don't need content_digest noise in WWW UI
+	require PublicInbox::ContentDigestDbg;
 
 	# XXX is this even useful?  perhaps hide it behind a CLI switch
 	open my $fh, '>', "$dir/content_digest" or die "open: $!";
 	my $dig = PublicInbox::ContentDigestDbg->new($fh);
-	local $Data::Dumper::Useqq = 1;
-	local $Data::Dumper::Terse = 1;
 	content_digest($eml, $dig);
 	print $fh "\n", $dig->hexdigest, "\n" or die "print $!";
 	close $fh or die "close: $!";
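
The hashref-to-arrayref conversion above can be reproduced with a
stand-alone class.  This is a sketch under assumptions: TinyDigestDbg is
a made-up name, it uses the core Digest::SHA module instead of
PublicInbox::SHA, and it echoes the raw chunk instead of running it
through Data::Dumper.

```perl
package TinyDigestDbg; # hypothetical, modeled on ContentDigestDbg
use v5.12;
use Digest::SHA;
use constant { DIG => 0, FH => 1 }; # named slots instead of hash keys

sub new { bless [ Digest::SHA->new(256), $_[1] ], $_[0] }

sub add {
	$_[0]->[DIG]->add($_[1]);
	print { $_[0]->[FH] } $_[1] or die "print $!";
}

sub hexdigest { $_[0]->[DIG]->hexdigest }

package main;

{
	my $dbg = TinyDigestDbg->new(\*STDOUT);
	$dbg->add("hello\n"); # echoes the chunk while hashing it
	say $dbg->hexdigest;
}
```

An array-based object avoids hash overhead and key hashing, at the cost
of readability; the constants above are one way to keep slot meaning
visible, whereas the patch uses bare 0 and 1.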

^ permalink raw reply related	[relevance 63%]

* [PATCH 11/12] ipc+lei: switch to awaitpid
  @ 2023-01-17  7:19 39% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-01-17  7:19 UTC (permalink / raw)
  To: meta

This avoids awkwardly stuffing an arrayref into callbacks
which expect multiple arguments.  IPC->awaitpid_init now
allows pre-registering callbacks before spawning workers.
---
 lib/PublicInbox/IPC.pm        | 30 ++++++++++++++----------------
 lib/PublicInbox/LEI.pm        |  8 +++-----
 lib/PublicInbox/LeiConvert.pm |  2 +-
 lib/PublicInbox/LeiInput.pm   |  2 +-
 lib/PublicInbox/LeiMirror.pm  |  7 +++----
 lib/PublicInbox/LeiStore.pm   |  7 +++----
 lib/PublicInbox/LeiToMail.pm  |  7 +++----
 lib/PublicInbox/LeiUp.pm      |  5 ++---
 lib/PublicInbox/LeiXSearch.pm |  9 ++++-----
 script/public-inbox-clone     |  2 +-
 10 files changed, 35 insertions(+), 44 deletions(-)

diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm
index 34e40118..edc5ba64 100644
--- a/lib/PublicInbox/IPC.pm
+++ b/lib/PublicInbox/IPC.pm
@@ -12,7 +12,7 @@ use strict;
 use v5.10.1;
 use parent qw(Exporter);
 use Carp qw(croak);
-use PublicInbox::DS qw(dwaitpid);
+use PublicInbox::DS qw(awaitpid);
 use PublicInbox::Spawn;
 use PublicInbox::OnDestroy;
 use PublicInbox::WQWorker;
@@ -133,26 +133,26 @@ sub ipc_worker_spawn {
 	$self->{-ipc_req} = $w_req;
 	$self->{-ipc_res} = $r_res;
 	$self->{-ipc_ppid} = $$;
+	awaitpid($pid, \&ipc_worker_reap, $self);
 	$self->{-ipc_pid} = $pid;
 }
 
-sub ipc_worker_reap { # dwaitpid callback
-	my ($args, $pid) = @_;
-	my ($self, @uargs) = @$args;
+sub ipc_worker_reap { # awaitpid callback
+	my ($pid, $self) = @_;
 	delete $self->{-wq_workers}->{$pid};
-	return $self->{-reap_do}->($args, $pid) if $self->{-reap_do};
+	if (my $cb_args = $self->{-reap_do}) {
+		return $cb_args->[0]->($pid, $self, @$cb_args[1..$#$cb_args]);
+	}
 	return if !$?;
 	my $s = $? & 127;
 	# TERM(15) is our default exit signal, PIPE(13) is likely w/ pager
 	warn "$self->{-wq_ident} PID:$pid died \$?=$?\n" if $s != 15 && $s != 13
 }
 
-sub wq_wait_async {
-	my ($self, $cb, @uargs) = @_;
-	local $PublicInbox::DS::in_loop = 1;
-	$self->{-reap_do} = $cb;
-	my @pids = keys %{$self->{-wq_workers}};
-	dwaitpid($_, \&ipc_worker_reap, [ $self, @uargs ]) for @pids;
+# register wait workers
+sub awaitpid_init {
+	my ($self, @cb_args) = @_;
+	$self->{-reap_do} = \@cb_args;
 }
 
 # for base class, override in sub classes
@@ -178,9 +178,7 @@ sub ipc_worker_stop {
 	}
 	die 'no PID with IPC pipes' unless $pid;
 	$w_req = $r_res = undef;
-
-	return if $$ != $ppid;
-	dwaitpid($pid, \&ipc_worker_reap, [$self]);
+	awaitpid($pid) if $$ == $ppid; # for non-event loop
 }
 
 # use this if we have multiple readers reading curl or "pigz -dc"
@@ -397,6 +395,7 @@ sub _wq_worker_start ($$$$) {
 		undef $end; # trigger exit
 	} else {
 		$self->{-wq_workers}->{$pid} = $bcast1;
+		awaitpid($pid, \&ipc_worker_reap, $self);
 	}
 }
 
@@ -428,8 +427,7 @@ sub wq_close {
 	}
 	delete @$self{qw(-wq_s1 -wq_s2)} or return;
 	return if $self->{-reap_do};
-	my @pids = keys %{$self->{-wq_workers}};
-	dwaitpid($_, \&ipc_worker_reap, [ $self ]) for @pids;
+	awaitpid($_) for keys %{$self->{-wq_workers}};
 }
 
 sub wq_kill {
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index b78d70de..6ad42111 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -18,7 +18,6 @@ use IO::Handle ();
 use Fcntl qw(SEEK_SET);
 use PublicInbox::Config;
 use PublicInbox::Syscall qw(EPOLLIN);
-use PublicInbox::DS qw(dwaitpid);
 use PublicInbox::Spawn qw(spawn popen_rd);
 use PublicInbox::Lock;
 use PublicInbox::Eml;
@@ -644,12 +643,12 @@ sub workers_start {
 	my $end = $lei->pkt_op_pair;
 	my $ident = $wq->{-wq_ident} // "lei-$lei->{cmd} worker";
 	$flds->{lei} = $lei;
+	$wq->awaitpid_init($wq->can('_wq_done_wait') // \&wq_done_wait, $lei);
 	$wq->wq_workers_start($ident, $jobs, $lei->oldset, $flds);
 	delete $lei->{pkt_op_p};
 	my $op_c = delete $lei->{pkt_op_c};
 	@$end = ();
 	$lei->event_step_init;
-	$wq->wq_wait_async($wq->can('_wq_done_wait') // \&wq_done_wait, $lei);
 	($op_c, $ops);
 }
 
@@ -1391,9 +1390,8 @@ sub DESTROY {
 	# preserve $? for ->fail or ->x_it code
 }
 
-sub wq_done_wait { # dwaitpid callback
-	my ($arg, $pid) = @_;
-	my ($wq, $lei) = @$arg;
+sub wq_done_wait { # awaitpid cb (via wq_eof / IPC->awaitpid_init)
+	my ($pid, $wq, $lei) = @_;
 	local $current_lei = $lei;
 	my $err_type = $lei->{-err_type};
 	$? and $lei->child_error($?,
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 59af40de..1acd4558 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -30,7 +30,7 @@ sub input_maildir_cb {
 
 sub process_inputs { # via wq_do
 	my ($self) = @_;
-	local $PublicInbox::DS::in_loop = 0; # force synchronous dwaitpid
+	local $PublicInbox::DS::in_loop = 0; # force synchronous awaitpid
 	$self->SUPER::process_inputs;
 	my $lei = $self->{lei};
 	delete $lei->{1};
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index a1dcc907..c258f824 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -177,7 +177,7 @@ sub input_path_url {
 			$mbl->{fh} =
 			     PublicInbox::MboxReader::zsfxcat($in, $zsfx, $lei);
 		}
-		local $PublicInbox::DS::in_loop = 0 if $zsfx; # dwaitpid
+		local $PublicInbox::DS::in_loop = 0 if $zsfx; # awaitpid
 		$self->input_fh($ifmt, $mbl->{fh}, $input, @args);
 	} elsif (-d _ && (-d "$input/cur" || -d "$input/new")) {
 		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 87abf88c..abf66315 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -31,9 +31,8 @@ sub keep_going ($) {
 		$_[0]->{lei}->{opt}->{'keep-going'});
 }
 
-sub _wq_done_wait { # dwaitpid callback (via wq_eof)
-	my ($arg, $pid) = @_;
-	my ($mrr, $lei) = @$arg;
+sub _wq_done_wait { # awaitpid cb (via wq_eof / IPC->awaitpid_init)
+	my ($pid, $mrr, $lei) = @_;
 	if ($?) {
 		$lei->child_error($?);
 	} elsif (!$lei->{child_error}) {
@@ -236,7 +235,7 @@ sub index_cloned_inbox {
 			my ($k) = ($sw =~ /\A([\w-]+)/);
 			$opt->{$k} = $lei->{opt}->{$k};
 		}
-		# force synchronous dwaitpid for v2:
+		# force synchronous awaitpid for v2:
 		local $PublicInbox::DS::in_loop = 0;
 		my $cfg = PublicInbox::Config->new(undef, $lei->{2});
 		my $env = PublicInbox::Admin::index_prepare($opt, $cfg);
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 57f0e013..0ecf1388 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -604,9 +604,8 @@ sub recv_and_run {
 	$self->SUPER::recv_and_run(@args);
 }
 
-sub _sto_atexit { # dwaitpid callback
-	my ($args, $pid) = @_;
-	my $self = $args->[0];
+sub _sto_atexit { # awaitpid cb (via awaitpid_init)
+	my ($pid, $sto) = @_;
 	warn "lei/store PID:$pid died \$?=$?\n" if $?;
 }
 
@@ -621,12 +620,12 @@ sub write_prepare {
 		# Mail we import into lei are private, so headers filtered out
 		# by -mda for public mail are not appropriate
 		local @PublicInbox::MDA::BAD_HEADERS = ();
+		$self->awaitpid_init(\&_sto_atexit); # outlives $lei
 		$self->wq_workers_start("lei/store $dir", 1, $lei->oldset, {
 					lei => $lei,
 					-err_wr => $w,
 					to_close => [ $r ],
 				});
-		$self->wq_wait_async(\&_sto_atexit); # outlives $lei
 		require PublicInbox::LeiStoreErr;
 		PublicInbox::LeiStoreErr->new($r, $lei);
 	}
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 1528165a..6a4554e7 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -652,9 +652,8 @@ sub _do_augment_mbox {
 	$dedupe->pause_dedupe if $dedupe;
 }
 
-sub v2w_done_wait { # dwaitpid callback
-	my ($arg, $pid) = @_;
-	my ($v2w, $lei) = @$arg;
+sub v2w_done_wait { # awaitpid cb (via awaitpid_init)
+	my ($pid, $v2w, $lei) = @_;
 	$lei->child_error($?, "error for $v2w->{ibx}->{inboxdir}") if $?;
 }
 
@@ -680,8 +679,8 @@ sub _pre_augment_v2 {
 	PublicInbox::InboxWritable->new($ibx, @creat);
 	$ibx->init_inbox if @creat;
 	my $v2w = $ibx->importer;
+	$v2w->awaitpid_init(\&v2w_done_wait, $lei);
 	$v2w->wq_workers_start("lei/v2w $dir", 1, $lei->oldset, {lei => $lei});
-	$v2w->wq_wait_async(\&v2w_done_wait, $lei);
 	$lei->{v2w} = $v2w;
 	return if !$lei->{opt}->{shared};
 	my $d = "$lei->{ale}->{git}->{git_dir}/objects";
diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm
index 49917339..3e92242e 100644
--- a/lib/PublicInbox/LeiUp.pm
+++ b/lib/PublicInbox/LeiUp.pm
@@ -165,9 +165,8 @@ sub _complete_up { # lei__complete hook
 	map { $match_cb->($_) } PublicInbox::LeiSavedSearch::list($lei);
 }
 
-sub _wq_done_wait { # dwaitpid callback
-	my ($arg, $pid) = @_;
-	my ($wq, $lei) = @$arg;
+sub _wq_done_wait { # awaitpid cb (via awaitpid_init)
+	my ($pid, $wq, $lei) = @_;
 	$lei->child_error($?, 'auth failure') if $?
 }
 
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 730df1f7..f9aa870e 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -400,9 +400,8 @@ sub query_remote_mboxrd {
 
 sub git { $_[0]->{git} // die 'BUG: git uninitialized' }
 
-sub xsearch_done_wait { # dwaitpid callback
-	my ($arg, $pid) = @_;
-	my ($wq, $lei) = @$arg;
+sub xsearch_done_wait { # awaitpid cb (via awaitpid_init)
+	my ($pid, $wq, $lei) = @_;
 	return if !$?;
 	my $s = $? & 127;
 	return $lei->child_error($?) if $s == 13 || $s == 15;
@@ -573,16 +572,16 @@ sub do_query {
 			fcntl($b_r, $F_SETPIPE_SZ, 4096) if $F_SETPIPE_SZ;
 			$l2m->{au_peers} = [ $a_r, $a_w, $b_r, $b_w ];
 		}
+		$l2m->awaitpid_init(\&xsearch_done_wait, $lei);
 		$l2m->wq_workers_start('lei2mail', undef,
 					$lei->oldset, { lei => $lei });
-		$l2m->wq_wait_async(\&xsearch_done_wait, $lei);
 		pipe($lei->{startq}, $lei->{au_done}) or die "pipe: $!";
 		fcntl($lei->{startq}, $F_SETPIPE_SZ, 4096) if $F_SETPIPE_SZ;
 		delete $l2m->{au_peers};
 	}
+	$self->awaitpid_init(\&xsearch_done_wait, $lei);
 	$self->wq_workers_start('lei_xsearch', undef,
 				$lei->oldset, { lei => $lei });
-	$self->wq_wait_async(\&xsearch_done_wait, $lei);
 	my $op_c = delete $lei->{pkt_op_c};
 	delete $lei->{pkt_op_p};
 	@$end = ();
diff --git a/script/public-inbox-clone b/script/public-inbox-clone
index e93ac37b..598979bc 100755
--- a/script/public-inbox-clone
+++ b/script/public-inbox-clone
@@ -62,5 +62,5 @@ my $mrr = bless {
 
 $? = 0;
 $mrr->do_mirror;
-$mrr->can('_wq_done_wait')->([$mrr, $lei], $$);
+$mrr->can('_wq_done_wait')->($$, $mrr, $lei);
 exit(($lei->{child_error} // 0) >> 8);
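
The calling convention this series moves to, callback($pid, @args)
instead of callback([$self, @uargs], $pid), can be sketched without
PublicInbox::DS.  register_wait() and reap_all() below are hypothetical
names for a blocking toy version; the real awaitpid integrates with the
event loop, which this sketch does not attempt.

```perl
use strict;
use warnings;
use POSIX ();

my %cb_for; # pid => [ $cb, @args ]

# Hypothetical registration: remember a callback and its extra arguments
sub register_wait {
	my ($pid, $cb, @args) = @_;
	$cb_for{$pid} = [ $cb, @args ];
}

# Blocking reaper: waits for registered children and invokes
# $cb->($pid, @args); $? is still set to the child's wait status
sub reap_all {
	while (%cb_for) {
		my $pid = waitpid(-1, 0);
		last if $pid <= 0;
		my $cb_args = delete $cb_for{$pid} or next;
		my ($cb, @args) = @$cb_args;
		$cb->($pid, @args);
	}
}

{
	my $child = fork // die "fork: $!";
	POSIX::_exit(7) if $child == 0;
	register_wait($child, sub {
		my ($pid, $name) = @_;
		printf "%s exited with status %d\n", $name, $? >> 8;
	}, 'worker');
	reap_all();
}
```

Passing the PID first and the user arguments as a flat list means
callbacks no longer need to unpack an arrayref, which is the
awkwardness the commit message mentions.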

^ permalink raw reply related	[relevance 39%]

* [1/2 PATCH] hoist MailDiff and ContentDigestDbg out of lei
  @ 2023-01-11 11:00 42% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2023-01-11 11:00 UTC (permalink / raw)
  To: meta

These will be reused in the web UI, too.
---
 <20230111105539.302803-1-e@80x24.org> was actually [2/2] of
 this series.  My mind drifted and I thought it was just one
 patch :x

 MANIFEST                            |  3 ++
 lib/PublicInbox/ContentDigestDbg.pm | 17 +++++++
 lib/PublicInbox/LeiMailDiff.pm      | 71 +++--------------------------
 lib/PublicInbox/MailDiff.pm         | 50 ++++++++++++++++++++
 t/lei-mail-diff.t                   | 14 ++++++
 5 files changed, 91 insertions(+), 64 deletions(-)
 create mode 100644 lib/PublicInbox/ContentDigestDbg.pm
 create mode 100644 lib/PublicInbox/MailDiff.pm
 create mode 100644 t/lei-mail-diff.t

diff --git a/MANIFEST b/MANIFEST
index 565317ce..3626e4d2 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -163,6 +163,7 @@ lib/PublicInbox/CmdIPC4.pm
 lib/PublicInbox/CompressNoop.pm
 lib/PublicInbox/Config.pm
 lib/PublicInbox/ConfigIter.pm
+lib/PublicInbox/ContentDigestDbg.pm
 lib/PublicInbox/ContentHash.pm
 lib/PublicInbox/DS.pm
 lib/PublicInbox/DSKQXS.pm
@@ -280,6 +281,7 @@ lib/PublicInbox/Lock.pm
 lib/PublicInbox/MDA.pm
 lib/PublicInbox/MID.pm
 lib/PublicInbox/MIME.pm
+lib/PublicInbox/MailDiff.pm
 lib/PublicInbox/ManifestJsGz.pm
 lib/PublicInbox/Mbox.pm
 lib/PublicInbox/MboxGz.pm
@@ -478,6 +480,7 @@ t/lei-import.t
 t/lei-index.t
 t/lei-inspect.t
 t/lei-lcat.t
+t/lei-mail-diff.t
 t/lei-mirror.psgi
 t/lei-mirror.t
 t/lei-p2q.t
diff --git a/lib/PublicInbox/ContentDigestDbg.pm b/lib/PublicInbox/ContentDigestDbg.pm
new file mode 100644
index 00000000..425e8589
--- /dev/null
+++ b/lib/PublicInbox/ContentDigestDbg.pm
@@ -0,0 +1,17 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+package PublicInbox::ContentDigestDbg; # cf. PublicInbox::ContentDigest
+use v5.12;
+use Data::Dumper;
+use Digest::SHA;
+
+sub new { bless { dig => Digest::SHA->new(256), fh => $_[1] }, __PACKAGE__ }
+
+sub add {
+	$_[0]->{dig}->add($_[1]);
+	print { $_[0]->{fh} } Dumper([split(/^/sm, $_[1])]) or die "print $!";
+}
+
+sub hexdigest { $_[0]->{dig}->hexdigest; }
+
+1;
diff --git a/lib/PublicInbox/LeiMailDiff.pm b/lib/PublicInbox/LeiMailDiff.pm
index 2b4cfd9e..c813144f 100644
--- a/lib/PublicInbox/LeiMailDiff.pm
+++ b/lib/PublicInbox/LeiMailDiff.pm
@@ -4,59 +4,16 @@
 # The "lei mail-diff" sub-command, diffs input contents against
 # the first message of input
 package PublicInbox::LeiMailDiff;
-use strict;
-use v5.10.1;
-use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use File::Temp 0.19 (); # 0.19 for ->newdir
+use v5.12;
+use parent qw(PublicInbox::IPC PublicInbox::LeiInput PublicInbox::MailDiff);
 use PublicInbox::Spawn qw(spawn which);
-use PublicInbox::MsgIter qw(msg_part_text);
-use File::Path qw(remove_tree);
-use PublicInbox::ContentHash qw(content_digest);
+use File::Path ();
 require PublicInbox::LeiRediff;
-use Data::Dumper ();
-
-sub write_part { # Eml->each_part callback
-	my ($ary, $self) = @_;
-	my ($part, $depth, $idx) = @$ary;
-	if ($idx ne '1' || $self->{lei}->{opt}->{'raw-header'}) {
-		open my $fh, '>', "$self->{curdir}/$idx.hdr" or die "open: $!";
-		print $fh ${$part->{hdr}} or die "print $!";
-		close $fh or die "close $!";
-	}
-	my $ct = $part->content_type || 'text/plain';
-	my ($s, $err) = msg_part_text($part, $ct);
-	my $sfx = defined($s) ? 'txt' : 'bin';
-	open my $fh, '>', "$self->{curdir}/$idx.$sfx" or die "open: $!";
-	print $fh ($s // $part->body) or die "print $!";
-	close $fh or die "close $!";
-}
-
-sub dump_eml ($$$) {
-	my ($self, $dir, $eml) = @_;
-	local $self->{curdir} = $dir;
-	mkdir $dir or die "mkdir($dir): $!";
-	$eml->each_part(\&write_part, $self);
-
-	open my $fh, '>', "$dir/content_digest" or die "open: $!";
-	my $dig = PublicInbox::ContentDigestDbg->new($fh);
-	local $Data::Dumper::Useqq = 1;
-	local $Data::Dumper::Terse = 1;
-	content_digest($eml, $dig);
-	print $fh "\n", $dig->hexdigest, "\n" or die "print $!";
-	close $fh or die "close: $!";
-}
-
-sub prep_a ($$) {
-	my ($self, $eml) = @_;
-	$self->{tmp} = File::Temp->newdir('lei-mail-diff-XXXX', TMPDIR => 1);
-	dump_eml($self, "$self->{tmp}/a", $eml);
-}
 
 sub diff_a ($$) {
 	my ($self, $eml) = @_;
-	++$self->{nr};
-	my $dir = "$self->{tmp}/N$self->{nr}";
-	dump_eml($self, $dir, $eml);
+	my $dir = "$self->{tmp}/N".(++$self->{nr});
+	$self->dump_eml($dir, $eml);
 	my $cmd = [ qw(git diff --no-index) ];
 	my $lei = $self->{lei};
 	PublicInbox::LeiRediff::_lei_diff_prepare($lei, $cmd);
@@ -71,7 +28,7 @@ sub diff_a ($$) {
 
 sub input_eml_cb { # used by PublicInbox::LeiInput::input_fh
 	my ($self, $eml) = @_;
-	$self->{tmp} ? diff_a($self, $eml) : prep_a($self, $eml);
+	$self->{tmp} ? diff_a($self, $eml) : $self->prep_a($eml);
 }
 
 sub lei_mail_diff {
@@ -82,24 +39,10 @@ sub lei_mail_diff {
 	$lei->{opt}->{color} //= $isatty;
 	$lei->start_pager if $isatty;
 	$lei->{-err_type} = 'non-fatal';
+	$self->{-raw_hdr} = $lei->{opt}->{'raw-header'};
 	$lei->wq1_start($self);
 }
 
 no warnings 'once';
 *net_merge_all_done = \&PublicInbox::LeiInput::input_only_net_merge_all_done;
-
-package PublicInbox::ContentDigestDbg; # cf. PublicInbox::ContentDigest
-use strict;
-use v5.10.1;
-use Data::Dumper;
-
-sub new { bless { dig => Digest::SHA->new(256), fh => $_[1] }, __PACKAGE__ }
-
-sub add {
-	$_[0]->{dig}->add($_[1]);
-	print { $_[0]->{fh} } Dumper([split(/^/sm, $_[1])]) or die "print $!";
-}
-
-sub hexdigest { $_[0]->{dig}->hexdigest; }
-
 1;
diff --git a/lib/PublicInbox/MailDiff.pm b/lib/PublicInbox/MailDiff.pm
new file mode 100644
index 00000000..06eb3a0d
--- /dev/null
+++ b/lib/PublicInbox/MailDiff.pm
@@ -0,0 +1,50 @@
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+package PublicInbox::MailDiff;
+use v5.12;
+use File::Temp 0.19 (); # 0.19 for ->newdir
+use PublicInbox::ContentHash qw(content_digest);
+use PublicInbox::ContentDigestDbg;
+use Data::Dumper ();
+use PublicInbox::MsgIter qw(msg_part_text);
+
+sub write_part { # Eml->each_part callback
+	my ($ary, $self) = @_;
+	my ($part, $depth, $idx) = @$ary;
+	if ($idx ne '1' || $self->{-raw_hdr}) {
+		open my $fh, '>', "$self->{curdir}/$idx.hdr" or die "open: $!";
+		print $fh ${$part->{hdr}} or die "print $!";
+		close $fh or die "close $!";
+	}
+	my $ct = $part->content_type || 'text/plain';
+	my ($s, $err) = msg_part_text($part, $ct);
+	my $sfx = defined($s) ? 'txt' : 'bin';
+	open my $fh, '>', "$self->{curdir}/$idx.$sfx" or die "open: $!";
+	print $fh ($s // $part->body) or die "print $!";
+	close $fh or die "close $!";
+}
+
+# public
+sub dump_eml ($$$) {
+	my ($self, $dir, $eml) = @_;
+	local $self->{curdir} = $dir;
+	mkdir $dir or die "mkdir($dir): $!";
+	$eml->each_part(\&write_part, $self);
+
+	open my $fh, '>', "$dir/content_digest" or die "open: $!";
+	my $dig = PublicInbox::ContentDigestDbg->new($fh);
+	local $Data::Dumper::Useqq = 1;
+	local $Data::Dumper::Terse = 1;
+	content_digest($eml, $dig);
+	print $fh "\n", $dig->hexdigest, "\n" or die "print $!";
+	close $fh or die "close: $!";
+}
+
+# public
+sub prep_a ($$) {
+	my ($self, $eml) = @_;
+	$self->{tmp} = File::Temp->newdir('mail-diff-XXXX', TMPDIR => 1);
+	dump_eml($self, "$self->{tmp}/a", $eml);
+}
+
+1;
diff --git a/t/lei-mail-diff.t b/t/lei-mail-diff.t
new file mode 100644
index 00000000..9398596a
--- /dev/null
+++ b/t/lei-mail-diff.t
@@ -0,0 +1,14 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use v5.12; use PublicInbox::TestCommon;
+
+test_lei(sub {
+	ok(!lei('mail-diff', 't/data/0001.patch', 't/data/binary.patch'),
+		'different messages are different');
+	like($lei_out, qr/^\+/m, 'diff shown');
+	lei_ok('mail-diff', 't/data/0001.patch', 't/data/0001.patch');
+	is($lei_out, '', 'no output if identical');
+});
+
+done_testing;

^ permalink raw reply related	[relevance 42%]

* FUSE3 vs read-write IMAP for lei
  @ 2022-12-09  1:41 65% ` Eric Wong
  2023-02-20 19:27 71%   ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-12-09  1:41 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> I don't think lei+FUSE will be as portable or useful as a
> local IMAP server (and maybe JMAP, eventually); but r/w IMAP
> support would be nice..

One thing about lei which really bothers me is that it needs to
write out mail already in local git repos to the FS to be read
by regular MUAs.

This is wasteful, of course; so there's 2 ways I can imagine
exposing mail to ordinary MUAs without excessive disk
traffic/wear:


== read-write IMAP for lei

* connect via localhost (127.0.0.1, [::1]) (don't think local sockets
  are supported by any MUAs)

* May be extended to JMAP; but it's hard to be motivated on JMAP
  since my favorite MUA doesn't do JMAP, yet.

+ portable (can be done in pure Perl + DBD::SQLite + Xapian)

+ we already have a read-only IMAP server which can be extended
  for read/write

- still needs login w/ username+password due to multi-user systems

- may still be susceptible to abuse from multi-user systems

- IMAP gets pretty complex, and MUAs sometimes don't do it well



== FUSE3 - Maildir-oriented FS


+ will require C compiler since old FUSE XS modules don't do
  FUSE3 (for readdirplus); and going w/o readdirplus is
  unimaginable for Maildir.

+ I already have existing (unreleased) AGPL-3 work based on
  FUSE3 + URCU + Perl5 with just-ahead-of-time (JAOT) compilation

* I've seen the light w/ URCU, and can't go back to C without it :P

- likely Linux-only (not sure how good FUSE support will be if
  depending on FUSE3 features)

- kernel caches still incur nasty memory overhead w/ Maildir

- readdir(3) userspace API still sucks


Of course, doing both is an option, too, given enough time...

^ permalink raw reply	[relevance 65%]

* [PATCH 0/2] lei - expanding relative paths for `lei up'
@ 2022-12-01 11:21 90% Eric Wong
  2022-12-01 11:21 67% ` [PATCH 1/2] lei: stricter external checks for valid $GIT_DIR/objects Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-12-01 11:21 UTC (permalink / raw)
  To: meta

I ran `lei q --only ./ -o $MAILDIR $QUERY' at some point from
inside an inboxdir.

Then I got confused when `lei up --all' running from $HOME
started complaining about $HOME/objects being missing from ALE.
It turned out $HOME/public-inbox (one of my worktrees) was
causing $HOME to false-positive as a v1 public-inbox for lei :x.

So this is a two-pronged fix to prevent some weird stuff from
happening.
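
A rough Python rendition of the stricter detection, for illustration
only.  The name `classify_external` and its return values are
hypothetical; the real checks are Perl in `prepare_external` (patch
1/2), and only the path tests from that diff are mirrored here.

```python
import os


def classify_external(loc):
    """Guess what kind of local external `loc` is, or None if it is not one."""

    def isdir(*parts):
        return os.path.isdir(os.path.join(loc, *parts))

    def isfile(*parts):
        return os.path.isfile(os.path.join(loc, *parts))

    # extindex: the lock file *and* the ALL.git object store must exist
    if isfile("ei.lock") and isdir("ALL.git", "objects"):
        return "extindex"
    # v2 inbox: inbox.lock plus all.git/objects
    if isfile("inbox.lock") and isdir("all.git", "objects"):
        return "inbox"
    # v1 inbox: a public-inbox/ directory alone is not enough (it may be
    # an unrelated worktree, as with $HOME/public-inbox above), so
    # objects/ must sit alongside it
    if isdir("public-inbox") and isdir("objects"):
        return "inbox"
    return None
```

With this shape, a $HOME that merely contains a public-inbox/ worktree
is rejected because $HOME/objects does not exist.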

Eric Wong (2):
  lei: stricter external checks for valid $GIT_DIR/objects
  lei_saved_search: expand only/include/exclude to absolute paths

 lib/PublicInbox/LeiQuery.pm   | 23 ++++++++++++++++++++---
 lib/PublicInbox/LeiXSearch.pm | 14 ++++++++++----
 t/lei-q-save.t                | 13 +++++++++----
 t/lei.t                       |  3 ++-
 4 files changed, 41 insertions(+), 12 deletions(-)

^ permalink raw reply	[relevance 90%]

* [PATCH 1/2] lei: stricter external checks for valid $GIT_DIR/objects
  2022-12-01 11:21 90% [PATCH 0/2] lei - expanding relative paths for `lei up' Eric Wong
@ 2022-12-01 11:21 67% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-12-01 11:21 UTC (permalink / raw)
  To: meta

I ended up with my $HOME in
~/.cache/lei/all_locals_ever.git/objects/info/alternates
and am trying to avoid that in the future.
---
 lib/PublicInbox/LeiXSearch.pm | 5 +++--
 t/lei.t                       | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 90cb83b9..8e195c4c 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -617,11 +617,12 @@ sub prepare_external {
 	} elsif ($loc =~ m!\Ahttps?://!) {
 		require URI;
 		return add_uri($self, URI->new($loc));
-	} elsif (-f "$loc/ei.lock") {
+	} elsif (-f "$loc/ei.lock" && -d "$loc/ALL.git/objects") {
 		require PublicInbox::ExtSearch;
 		die "`\\n' not allowed in `$loc'\n" if index($loc, "\n") >= 0;
 		$loc = PublicInbox::ExtSearch->new($loc);
-	} elsif (-f "$loc/inbox.lock" || -d "$loc/public-inbox") {
+	} elsif ((-f "$loc/inbox.lock" && -d "$loc/all.git/objects") ||
+			(-d "$loc/public-inbox" && -d "$loc/objects")) {
 		die "`\\n' not allowed in `$loc'\n" if index($loc, "\n") >= 0;
 		require PublicInbox::Inbox; # v2, v1
 		$loc = bless { inboxdir => $loc }, 'PublicInbox::Inbox';
diff --git a/t/lei.t b/t/lei.t
index b10c9b59..a80143ef 100644
--- a/t/lei.t
+++ b/t/lei.t
@@ -148,7 +148,8 @@ my $test_fail = sub {
 
 	for my $lk (qw(ei inbox)) {
 		my $d = "$home/newline\n$lk";
-		mkdir $d;
+		my $all = $lk eq 'ei' ? 'ALL' : 'all';
+		File::Path::mkpath("$d/$all.git/objects");
 		open my $fh, '>', "$d/$lk.lock" or BAIL_OUT "open $d/$lk.lock";
 		for my $fl (qw(-I --only)) {
 			ok(!lei('q', $fl, $d, 'whatever'),

^ permalink raw reply related	[relevance 67%]

* [PATCH] lei q|up: limit default write --jobs for IMAP(S)
  2022-09-10 20:19 71%             ` Eric Wong
@ 2022-11-14  8:07 64%               ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-11-14  8:07 UTC (permalink / raw)
  To: meta; +Cc: Ricardo Ribalda

Eric Wong <e@80x24.org> wrote:
> Thanks for confirming things work as intended.  I think the
> default should be clamped, though... 15 seems a bit high for
> smaller IMAP servers *shrug*

--------8<-------
Subject: [PATCH] lei q|up: limit default write --jobs for IMAP(S)

IMAP(S) servers often limit per-user connections, so avoid
bumping into limits to improve the out-of-the-box experience.
4 seems like a conservative default, since we already chose
that number for remote HTTP(S) endpoints.
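
The resulting default can be sketched in Python as follows.  This is
an illustration of the arithmetic in the diff below, not the actual
code; `default_write_workers` and its parameter names are made up
here, with `net` standing in for "the destination is IMAP(S)".

```python
def default_write_workers(nproc, jobs=None, net=False):
    """Pick the number of write workers for `lei q' / `lei up'."""
    # an explicit --jobs value always wins
    if jobs is not None:
        return jobs
    # keep some CPU for git: 75% of the CPUs, rounded to nearest
    n = int(nproc * 0.75 + 0.5)
    # don't overload IMAP destinations: clamp to 4 workers
    return 4 if net and n > 4 else n
```

On an 8-CPU machine this gives 6 workers for Maildir/mbox output and
4 for IMAP(S); smaller machines are unaffected by the clamp.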

Link: https://public-inbox.org/meta/20220910201958.GA12212@dcvr/
---
  /me having git-repack OOM due to excessive default pack.threads
  reminded me of this issue :x

 Documentation/lei-q.pod     | 4 ++--
 lib/PublicInbox/LeiQuery.pm | 7 +++++--
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 8134223e..d52c5b04 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -135,8 +135,8 @@ Set the number of query and write worker processes for parallelism.
 C<QUERY_WORKERS> defaults to the number of CPUs available, but 4 per
 remote (HTTP/HTTPS) host.
 
-C<WRITE_WORKERS> defaults to the number of CPUs available for Maildir,
-IMAP/IMAPS, and mbox* destinations.
+C<WRITE_WORKERS> defaults to 75% of the number of CPUs available for
+Maildir and mbox* destinations, but 4 per IMAP/IMAPS host.
 
 Omitting C<QUERY_WORKERS> but leaving the comma (C<,>) allows
 one to only set C<WRITE_WORKERS>
diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index df9c32b3..0f839236 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -39,8 +39,11 @@ sub _start_query { # used by "lei q" and "lei up"
 			$lms->lms_write_prepare->lms_pause; # just create
 		}
 	}
-	$l2m and $l2m->{-wq_nr_workers} //= $mj //
-		int($nproc * 0.75 + 0.5); # keep some CPU for git
+	$l2m and $l2m->{-wq_nr_workers} //= $mj // do {
+		# keep some CPU for git, and don't overload IMAP destinations
+		my $n = int($nproc * 0.75 + 0.5);
+		$self->{net} && $n > 4 ? 4 : $n;
+	};
 
 	# descending docid order is cheapest, MUA controls sorting order
 	$self->{mset_opt}->{relevance} //= -2 if $l2m || $opt->{threads};

^ permalink raw reply related	[relevance 64%]

* Re: [PATCH 5/6] doc: lei-import: link to lei-store-format(5)
  2022-11-03  0:48 90% ` [PATCH 5/6] doc: lei-import: link to lei-store-format(5) Eric Wong
@ 2022-11-03  2:03 90%   ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-11-03  2:03 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +(aka L<lei/store|lei-store-format(5)>).  C<LOCATION> is a

Can't have unescaped `/' like that, will squash this in:

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 25ef75c3..69ec6497 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -11,7 +11,7 @@ lei import [OPTIONS] (--stdin|-)
 =head1 DESCRIPTION
 
 Import messages into the local storage of L<lei(1)>
-(aka L<lei/store|lei-store-format(5)>).  C<LOCATION> is a
+(aka L<leiE<sol>store|lei-store-format(5)>).  C<LOCATION> is a
 source of messages: a directory (Maildir), a file, or a URL
 (C<imap://>, C<imaps://>, C<nntp://>, or C<nntps://>).  URLs requiring
 authentication use L<git-credential(1)> to

^ permalink raw reply related	[relevance 90%]

* [PATCH 5/6] doc: lei-import: link to lei-store-format(5)
    2022-11-03  0:48 62% ` [PATCH 2/6] doc: lei: improve description of *-search commands Eric Wong
  2022-11-03  0:48 71% ` [PATCH 3/6] doc: txt2pre: linkify "lei COMMAND" form Eric Wong
@ 2022-11-03  0:48 90% ` Eric Wong
  2022-11-03  2:03 90%   ` Eric Wong
  2022-11-03  0:48 90% ` [PATCH 6/6] txt2pre: linkify lei/store => lei-store-format.html Eric Wong
  3 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-11-03  0:48 UTC (permalink / raw)
  To: meta

Users should know where `lei import' writes to.
---
 Documentation/lei-import.pod | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 4ac7dccd..25ef75c3 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -10,7 +10,8 @@ lei import [OPTIONS] (--stdin|-)
 
 =head1 DESCRIPTION
 
-Import messages into the local storage of L<lei(1)>.  C<LOCATION> is a
+Import messages into the local storage of L<lei(1)>
+(aka L<lei/store|lei-store-format(5)>).  C<LOCATION> is a
 source of messages: a directory (Maildir), a file, or a URL
 (C<imap://>, C<imaps://>, C<nntp://>, or C<nntps://>).  URLs requiring
 authentication use L<git-credential(1)> to
@@ -102,4 +103,4 @@ License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
 
 =head1 SEE ALSO
 
-L<lei-index(1)>
+L<lei-index(1)>, L<lei-store-format(5)>

^ permalink raw reply related	[relevance 90%]

* [PATCH 6/6] txt2pre: linkify lei/store => lei-store-format.html
                     ` (2 preceding siblings ...)
  2022-11-03  0:48 90% ` [PATCH 5/6] doc: lei-import: link to lei-store-format(5) Eric Wong
@ 2022-11-03  0:48 90% ` Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-11-03  0:48 UTC (permalink / raw)
  To: meta

Linking to the manpage probably helps clarify what `lei/store'
refers to without too much clutter in the raw POD source.
---
 Documentation/txt2pre | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index b9d74fb7..62175f34 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -80,6 +80,8 @@ for (qw[lei(1)
 	/\Alei-(.+?)\(1\)\z/ and $xurls{"lei $1"} = "$n.html";
 }
 
+$xurls{'lei/store'} = 'lei-store-format.html';
+
 for (qw[make(1) flock(2) setrlimit(2) vfork(2) tmpfs(5) inotify(7) unix(7)
 		syslog(3)]) {
 	my ($n, $s) = (/([\w\-]+)\((\d)\)/);

^ permalink raw reply related	[relevance 90%]

* [PATCH 3/6] doc: txt2pre: linkify "lei COMMAND" form
    2022-11-03  0:48 62% ` [PATCH 2/6] doc: lei: improve description of *-search commands Eric Wong
@ 2022-11-03  0:48 71% ` Eric Wong
  2022-11-03  0:48 90% ` [PATCH 5/6] doc: lei-import: link to lei-store-format(5) Eric Wong
  2022-11-03  0:48 90% ` [PATCH 6/6] txt2pre: linkify lei/store => lei-store-format.html Eric Wong
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-11-03  0:48 UTC (permalink / raw)
  To: meta

While manpages are named `L<lei-COMMAND(1)>', `lei COMMAND'
can be worth linkifying for ease-of-navigation, too.
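
For readers who don't speak Perl, the mapping built by the diff below
looks roughly like this in Python.  `build_xurls` is a hypothetical
name and the sketch only covers the table construction, not the
linkification itself.

```python
import re


def build_xurls(manpages):
    """Map manpage references (and `lei COMMAND' forms) to HTML files."""
    xurls = {}
    for page in manpages:
        m = re.search(r"[\w\-.]+", page)
        if not m:
            continue
        n = m.group(0)  # e.g. "lei-add-external"
        xurls[page] = f"{n}.html"  # "lei-add-external(1)"
        xurls[n] = f"{n}.html"  # "lei-add-external"
        # new: also accept the "lei COMMAND" spelling
        cmd = re.fullmatch(r"lei-(.+?)\(1\)", page)
        if cmd:
            xurls[f"lei {cmd.group(1)}"] = f"{n}.html"
    return xurls
```

Note that plain `lei(1)` gets no extra entry since it has no COMMAND
part.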
---
 Documentation/txt2pre | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index c8dbd2ba..82573a30 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -9,7 +9,7 @@ use strict;
 use warnings;
 use PublicInbox::Linkify;
 use PublicInbox::Hval qw(ascii_html);
-my %xurls;
+my (%xurls, %lei);
 for (qw[lei(1)
 	lei-add-external(1)
 	lei-add-watch(1)
@@ -77,6 +77,7 @@ for (qw[lei(1)
 	my ($n) = (/([\w\-\.]+)/);
 	$xurls{$_} = "$n.html";
 	$xurls{$n} = "$n.html";
+	/\Alei-(.+?)\(1\)\z/ and $xurls{"lei $1"} = "$n.html";
 }
 
 for (qw[make(1) flock(2) setrlimit(2) vfork(2) tmpfs(5) inotify(7) unix(7)
@@ -161,6 +162,9 @@ if ($str =~ /^NAME\n\s+([^\n]+)/sm) {
 	if ($title =~ /([\w\.\-]+)/) {
 		delete $xurls{$1};
 	}
+	if ($title =~ /\blei-([\w\-]+)\b/) {
+		delete $xurls{"lei $1"};
+	}
 }
 $title = ascii_html($title);
 my $l = PublicInbox::Linkify->new;

^ permalink raw reply related	[relevance 71%]

* [PATCH 2/6] doc: lei: improve description of *-search commands
  @ 2022-11-03  0:48 62% ` Eric Wong
  2022-11-03  0:48 71% ` [PATCH 3/6] doc: txt2pre: linkify "lei COMMAND" form Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-11-03  0:48 UTC (permalink / raw)
  To: meta

The `OUTPUT' use may not be immediately apparent; clarify
that it's from `lei q'.
---
 Documentation/lei-edit-search.pod   | 6 ++++--
 Documentation/lei-forget-search.pod | 4 +++-
 Documentation/lei-ls-search.pod     | 5 +++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/Documentation/lei-edit-search.pod b/Documentation/lei-edit-search.pod
index 21cb11aa..7f447ca2 100644
--- a/Documentation/lei-edit-search.pod
+++ b/Documentation/lei-edit-search.pod
@@ -8,7 +8,9 @@ lei edit-search [OPTIONS] OUTPUT
 
 =head1 DESCRIPTION
 
-Invoke C<git config --edit> to edit the saved search at C<OUTPUT>.
+Invoke C<git config --edit> to edit the saved search at C<OUTPUT>,
+where C<OUTPUT> was supplied for argument of C<lei q -o OUTPUT ...>
+A listing of outputs is available via C<lei ls-search>.
 
 =head1 CONTACT
 
@@ -19,7 +21,7 @@ and L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta
 
 =head1 COPYRIGHT
 
-Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+Copyright all contributors L<mailto:meta@public-inbox.org>
 
 License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
 
diff --git a/Documentation/lei-forget-search.pod b/Documentation/lei-forget-search.pod
index adbe7638..5ff526f1 100644
--- a/Documentation/lei-forget-search.pod
+++ b/Documentation/lei-forget-search.pod
@@ -8,7 +8,9 @@ lei forget-search [OPTIONS] OUTPUT
 
 =head1 DESCRIPTION
 
-Forget a saved search at C<OUTPUT>.
+Forget a saved search at C<OUTPUT>,
+where C<OUTPUT> was supplied for argument of C<lei q -o OUTPUT ...>
+A listing of outputs is available via C<lei ls-search>.
 
 =head1 OPTIONS
 
diff --git a/Documentation/lei-ls-search.pod b/Documentation/lei-ls-search.pod
index a56611bf..0fe4b759 100644
--- a/Documentation/lei-ls-search.pod
+++ b/Documentation/lei-ls-search.pod
@@ -8,7 +8,8 @@ lei ls-search [OPTIONS] [PREFIX]
 
 =head1 DESCRIPTION
 
-List saved search queries.  If C<PREFIX> is given, restrict the output
+List saved search queries (generated from C<lei q -o OUTPUT>).
+If C<PREFIX> is given, restrict the output
 to entries that start with the specified value.
 
 =head1 OPTIONS
@@ -55,7 +56,7 @@ and L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta
 
 =head1 COPYRIGHT
 
-Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+Copyright all contributors L<mailto:meta@public-inbox.org>
 
 License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>
 

^ permalink raw reply related	[relevance 62%]

* [PATCH] lei: fix globbing semantics to match end-of-filename
@ 2022-11-01  9:36 60% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-11-01  9:36 UTC (permalink / raw)
  To: meta

Globs such as `*/foo' should not match `*/foobar'.  I noticed
this while adding glob support to public-inbox-clone.

This may subtly break some existing cases, but there aren't many
lei users yet, and globbing semantics should match what most
other glob-using programs do...

We'll also make `lei ls-mail-sync' behave more consistently with
`lei ls-external', as far as the basename matching fallback
goes.
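
To illustrate the difference, here is a small Python sketch.  It is
not lei's glob2re: `simple_glob2re` is a deliberately simplified
stand-in (only `*' and `?' are handled), and `glob_matches` is a
made-up helper showing the old (unanchored) and new (end-anchored)
matching.

```python
import re


def simple_glob2re(glob):
    """Translate a glob to a regex fragment (simplified: only * and ?)."""
    out = []
    for ch in glob:
        if ch == "*":
            out.append("[^/]*")
        elif ch == "?":
            out.append("[^/]")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def glob_matches(glob, path, anchored=True):
    """Return True if `glob' matches `path'."""
    pat = simple_glob2re(glob)
    if anchored:
        # new behavior: must match up to the end of the name,
        # tolerating one trailing slash
        pat += r"/?\Z"
    return re.search(pat, path) is not None
```

With `anchored=False` (the old behavior), `*/foo' also matches
`/srv/foobar'; with the end anchor it only matches `/srv/foo' and
`/srv/foo/'.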
---
 lib/PublicInbox/LeiExternal.pm   | 4 ++--
 lib/PublicInbox/LeiLsExternal.pm | 1 +
 lib/PublicInbox/LeiLsMailSync.pm | 7 +++++--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiExternal.pm b/lib/PublicInbox/LeiExternal.pm
index 30bb1a45..a6562e7f 100644
--- a/lib/PublicInbox/LeiExternal.pm
+++ b/lib/PublicInbox/LeiExternal.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # *-external commands of lei
@@ -88,7 +88,7 @@ sub get_externals {
 	my @cur = externals_each($self);
 	my $do_glob = !$self->{opt}->{globoff}; # glob by default
 	if ($do_glob && (my $re = glob2re($loc))) {
-		@m = grep(m!$re!, @cur);
+		@m = grep(m!$re/?\z!, @cur);
 		return @m if scalar(@m);
 	} elsif (index($loc, '/') < 0) { # exact basename match:
 		@m = grep(m!/\Q$loc\E/?\z!, @cur);
diff --git a/lib/PublicInbox/LeiLsExternal.pm b/lib/PublicInbox/LeiLsExternal.pm
index dd2eb2e7..e624cbd4 100644
--- a/lib/PublicInbox/LeiLsExternal.pm
+++ b/lib/PublicInbox/LeiLsExternal.pm
@@ -13,6 +13,7 @@ sub lei_ls_external {
 	my ($OFS, $ORS) = $lei->{opt}->{z} ? ("\0", "\0\0") : (" ", "\n");
 	$filter //= '*';
 	my $re = $do_glob ? $lei->glob2re($filter) : undef;
+	$re .= '/?\\z' if defined $re;
 	$re //= index($filter, '/') < 0 ?
 			qr!/\Q$filter\E/?\z! : # exact basename match
 			qr/\Q$filter\E/; # grep -F semantics
diff --git a/lib/PublicInbox/LeiLsMailSync.pm b/lib/PublicInbox/LeiLsMailSync.pm
index 2b167b1d..8da0c284 100644
--- a/lib/PublicInbox/LeiLsMailSync.pm
+++ b/lib/PublicInbox/LeiLsMailSync.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # front-end for the "lei ls-mail-sync" sub-command
@@ -12,7 +12,10 @@ sub lei_ls_mail_sync {
 	my $lms = $lei->lms or return;
 	my $opt = $lei->{opt};
 	my $re = $opt->{globoff} ? undef : $lei->glob2re($filter // '*');
-	$re //= qr/\Q$filter\E/;
+	$re .= '/?\\z' if defined $re;
+	$re //= index($filter, '/') < 0 ?
+			qr!/\Q$filter\E/?\z! : # exact basename match
+			qr/\Q$filter\E/; # grep -F semantics
 	my @f = $lms->folders;
 	@f = $opt->{'invert-match'} ? grep(!/$re/, @f) : grep(/$re/, @f);
 	if ($opt->{'local'} && !$opt->{remote}) {

^ permalink raw reply related	[relevance 60%]

* [PATCH] lei up: improve error for multiple lei.q values
@ 2022-10-31 21:52 90% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-10-31 21:52 UTC (permalink / raw)
  To: meta

Point users towards the lei.internal.rawstr variable which
may be tripping up handling of lei.q after `lei edit-search'.
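
The decision being made can be sketched in Python like so.  This only
mirrors the logic visible in the diff context below; `pick_query`,
its parameters and its return values are hypothetical, and the real
code goes on to do date parsing which is omitted here.

```python
def pick_query(q, rawstr_cfg=None, source="lei.saved-search"):
    """Decide whether lei.q is one raw string or a list of argv words."""
    rawstr = rawstr_cfg
    if rawstr is None:
        # fallback heuristic: a lone value ending in a newline
        rawstr = len(q) == 1 and q[0].endswith("\n")
    if rawstr:
        if len(q) > 1:
            raise ValueError(
                f"{source}: lei.q has multiple values ({' '.join(q)})\n"
                f"{source}: while lei.internal.rawstr is set"
            )
        return ("raw", q[0])
    return ("argv", list(q))
```

So splitting a query across several `q =' lines while
lei.internal.rawstr is still set is what triggers the error.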
---
 lib/PublicInbox/LeiUp.pm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm
index 5ad21451..49917339 100644
--- a/lib/PublicInbox/LeiUp.pm
+++ b/lib/PublicInbox/LeiUp.pm
@@ -32,8 +32,10 @@ sub up1 ($$) {
 	my $rawstr = $lss->{-cfg}->{'lei.internal.rawstr'} //
 		(scalar(@$q) == 1 && substr($q->[0], -1) eq "\n");
 	if ($rawstr) {
-		scalar(@$q) > 1 and
-			die "$f: lei.q has multiple values (@$q) (out=$out)\n";
+		die <<EOM if scalar(@$q) > 1;
+$f: lei.q has multiple values (@$q) (out=$out)
+$f: while lei.internal.rawstr is set
+EOM
 		$lse->query_approxidate($lse->git, $mset_opt->{qstr} = $q->[0]);
 	} else {
 		$mset_opt->{qstr} = $lse->query_argv_to_string($lse->git, $q);

^ permalink raw reply related	[relevance 90%]

* Re: [Need Help] lei add quotes at the search
  2022-10-30 23:06 63%     ` Eric Wong
  2022-10-31  7:36 71%       ` Hangbin Liu
@ 2022-10-31  7:47 71%       ` Hangbin Liu
  1 sibling, 0 replies; 200+ results
From: Hangbin Liu @ 2022-10-31  7:47 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sun, Oct 30, 2022 at 11:06:31PM +0000, Eric Wong wrote:
> > If I add "\" on each "(", this will break to a very long config search.
> > I tried to adjust it to
> 
> I think that can work if lei.internal.rawstr is set in the
> config to indicate stdin was used (It's auto-set by --stdin).

Oh, BTW, I get the error
fatal: bad config line 4 in file [..snip..]/lei.saved-search
if I add '\' in the config file.

> I guess it also works if it's the only lei.q config entry
> and the lei.q entry contains "\n"

But with "\n" in the config file, my previous config works fine.
> 
> cf. https://public-inbox.org/meta/20211110102837.41721-1-e@80x24.org/

Thanks
Hangbin

^ permalink raw reply	[relevance 71%]

* Re: [Need Help] lei add quotes at the search
  2022-10-30 23:06 63%     ` Eric Wong
@ 2022-10-31  7:36 71%       ` Hangbin Liu
  2022-10-31  7:47 71%       ` Hangbin Liu
  1 sibling, 0 replies; 200+ results
From: Hangbin Liu @ 2022-10-31  7:36 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sun, Oct 30, 2022 at 11:06:31PM +0000, Eric Wong wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
> > Sorry, I don't have the fc35 environment now.
> 
> No worries, I dont think fc35 is really a culprit.  Were you
> running a pre-release version of public-inbox or lei before?

Sorry, I forgot. Maybe I installed via `dnf copr enable icon/b4`
because I started using lei after reading the blog post
https://people.kernel.org/monsieuricon/lore-lei-part-1-getting-started

> > I'm curious about why the quote(%22) is added after "tc", not after "("
> 
> It's because Xapian can only handle a phrase after the `tc:' prefix.
> thus:	tc:"foo bar"	actually parses `tc:' as a prefix for To/Cc;
> while:	"tc:foo bar"	looks for the phrase "tc:foo bar" anywhere
> in the message, and won't limit to To/Cc headers.
> 
> This happens in the query_argv_to_string sub:
> 
> https://public-inbox.org/meta/2feb3e13b49d222bc7bd28430a9cf159692a933f/s/?b=lib/PublicInbox/Search.pm#n358
> 
> From the CLI:	lei q "tc:foo bar"	is indistinguishable
> from	lei q tc:"foo bar"	, so it gets treated as the latter.

Thanks for the explanation.

> > But if I have a long search line. This will breaks too much and hard to edit.
> > e.g. My real previous search is like
> > 
> > [lei]
> >         q = (tc:liuhangbin OR \
> >              (dfn:drivers/net/wireguard/ AND rt:6.month.ago..) OR \
> >              (dfn:tools/testing/selftests/net/ AND rt:1.month.ago..) OR \
> >              (dfn:drivers/net/team/ AND rt:6.month.ago..) OR \
> >              (dfn:net/ipv4/igmp.c AND rt:6.month.ago..) OR \
> >              (dfn:net/ipv6/mcast.c AND rt:6.month.ago..)) \
> > 	     NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)
> > 
> > If I add "\" on each "(", this will break to a very long config search.
> > I tried to adjust it to
> 
> I think that can work if lei.internal.rawstr is set in the
> config to indicate stdin was used (It's auto-set by --stdin).
> I guess it also works if it's the only lei.q config entry
> and the lei.q entry contains "\n"
> 
> cf. https://public-inbox.org/meta/20211110102837.41721-1-e@80x24.org/
> 
> > [lei]
> >         q = (tc:liuhangbin OR \
> >              (dfn:drivers/net/wireguard/ AND rt:6.month.ago..) OR \
> >              (dfn:tools/testing/selftests/net/ AND rt:1.month.ago..) OR \
> >              (dfn:drivers/net/team/ AND rt:6.month.ago..) OR \
> >              (dfn:net/ipv4/igmp.c AND rt:6.month.ago..) OR \
> >              (dfn:net/ipv6/mcast.c AND rt:6.month.ago..))
> >         q = NOT
> >         q = (tc:stable@vger.kernel.org
> >         q = OR
> >         q = f:sfr@canb.auug.org.au)
> > 
> > And now it works...
> 
> Sorta... at least for remotes it does:
> 
> > $ lei up /home/Liu/Mail/gmail/Linux_Kernel
> > # https://lore.kernel.org/all/ limiting to 2022-09-30 17:00 +0800 and newer
> > 60927 lei_xsearch 0 wq_worker: query_one_mset: Exception: Unknown range operation at /usr/share/perl5/vendor_perl/PublicInbox/IPC.pm line 254.
> 
> Note that Exception means it's not handling the first part of
> the query when hitting the local Xapian DB.  It's not doing the
> approxidate ($X.month.ago) substitution for the local Xapian DB,
> thus you got the "Unknown range operation".
> 
> > # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3Aliuhangbin+OR+(dfn%3Adrivers%2Fnet%2Fwireguard%2F+AND+rt%3A6.month.ago..)+OR+(dfn%3Atools%2Ftesting%2Fselftests%2Fnet%2F+AND+rt%3A1.month.ago..)+OR+(dfn%3Adrivers%2Fnet%2Fteam%2F+AND+rt%3A6.month.ago..)+OR+(dfn%3Anet%2Fipv4%2Figmp.c+AND+rt%3A6.month.ago..)+OR+(dfn%3Anet%2Fipv6%2Fmcast.c+AND+rt%3A1651301673..))+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au))+AND+dt%3A20220930090001..
> > # https://lore.kernel.org/all/ 43/?
> 
> Of course, the lack of approxidate parsing there inside lei is
> fine, since the lore.kernel.org instance will do it remotely...
> 
> > So I want to know when/why *lei* add the quotes.
> 
> lei adds quotes since it can't distinguish if the shell user
> used single or double quotes.  Xapian uses double quotes for
> phrase search, and I wanted:	lei q "this is a phrase"
> to work naturally, which means:	lei q 'this is a phrase'
> (with single quotes) works the same way as with double quotes
> because the difference is handled by the shell and lei never
> sees it.

Thanks for the help.

Hangbin

^ permalink raw reply	[relevance 71%]

* Re: [Need Help] lei add quotes at the search
  2022-10-30  7:08 59%   ` Hangbin Liu
@ 2022-10-30 23:06 63%     ` Eric Wong
  2022-10-31  7:36 71%       ` Hangbin Liu
  2022-10-31  7:47 71%       ` Hangbin Liu
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2022-10-30 23:06 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: meta

Hangbin Liu <liuhangbin@gmail.com> wrote:
> Hi Eric,
> 
> Thanks for the help.
> 
> On Sun, Oct 30, 2022 at 05:13:33AM +0000, Eric Wong wrote:
> > Hangbin Liu <liuhangbin@gmail.com> wrote:
> > > Hi,
> > > 
> > > I used to use a search like
> > > 
> > > lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
> > > 
> > > It works on fc35. But after I update to fc36 with lei-1.9.0-1.fc36. It start to
> > > add quotes in the search link and make the search never works. e.g.
> > 
> > Are you able to show the curl CLI from fc35?
> > Which public-inbox/lei version was it?
> > 
> > I'm actually curious fc35 worked at all, since the quoting would've
> > been broken, I think...
> 
> Sorry, I don't have the fc35 environment now.

No worries, I don't think fc35 is really a culprit.  Were you
running a pre-release version of public-inbox or lei before?

> > > $ lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
> > > # /home/Liu/.local/share/lei/store 0/0
> > > # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3A%22liuhangbin+AND+rt%3A6.month.ago..)+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au)%22
> > > # 0 written to /home/Liu/Mail/liuhangbin/ (0 matches)
> > > 
> > > Do you think if this is a bug, or I should update my search.
> > 
> > The %22 in fc36 is because your entire query is treated as one
> > element in argv and matches expected behavior.
> > 
> > Since '(' and ')' in the shell CLI is special, I suggest either:
> 
> I'm curious about why the quote(%22) is added after "tc", not after "("

It's because Xapian can only handle a phrase after the `tc:' prefix.
thus:	tc:"foo bar"	actually parses `tc:' as a prefix for To/Cc;
while:	"tc:foo bar"	looks for the phrase "tc:foo bar" anywhere
in the message, and won't limit to To/Cc headers.

This happens in the query_argv_to_string sub:

https://public-inbox.org/meta/2feb3e13b49d222bc7bd28430a9cf159692a933f/s/?b=lib/PublicInbox/Search.pm#n358

From the CLI:	lei q "tc:foo bar"	is indistinguishable
from	lei q tc:"foo bar"	, so it gets treated as the latter.
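
A simplified Python sketch of that quoting rule, for illustration.
It is not the actual query_argv_to_string (which also prepares
approxidate substitution, ignored here); `argv_to_query` is a made-up
name and the regex is my approximation of the behavior described
above.

```python
import re
import shlex


def argv_to_query(argv):
    """Join argv into one query string, quoting words that contain spaces."""
    parts = []
    for arg in argv:
        if re.search(r"\s", arg):
            # put the opening quote after the first `prefix:' if there
            # is one, since a phrase must follow the prefix
            m = re.match(r"(.*?)\b(\w+:)(.*)", arg, re.S)
            if m:
                arg = f'{m.group(1)}{m.group(2)}"{m.group(3)}"'
            else:
                arg = f'"{arg}"'
        parts.append(arg)
    return " ".join(parts)


# the shell hands lei the same argv for both spellings:
same = shlex.split('lei q "tc:foo bar"') == shlex.split('lei q tc:"foo bar"')
```

Feeding it the whole parenthesized query as one argv element puts the
opening quote right after the first `tc:' and the closing quote at
the very end, which is the %22 placement seen in the curl URL.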

> > a) using --stdin to enter queries containing '(' and ')'
> > 
> > b) quoting (or escaping) only the '(' and ')':
> > 
> >     '('tc:liuhangbin AND rt:6.month.ago..')' NOT ...
> > 
> >                 or
> > 
> >     \(tc:liuhangbin AND rt:6.month.ago..\) NOT ...
> 
> with this way, the cmd line works. And in config file, it would looks like
> 
> [lei]
>         q = ((tc:liuhangbin
>         q = AND
>         q = rt:6.month.ago..)
>         q = NOT
>         q = (tc:stable@vger.kernel.org
>         q = OR
>         q = f:sfr@canb.auug.org.au)
> 
> But if I have a long search line. This will breaks too much and hard to edit.
> e.g. My real previous search is like
> 
> [lei]
>         q = (tc:liuhangbin OR \
>              (dfn:drivers/net/wireguard/ AND rt:6.month.ago..) OR \
>              (dfn:tools/testing/selftests/net/ AND rt:1.month.ago..) OR \
>              (dfn:drivers/net/team/ AND rt:6.month.ago..) OR \
>              (dfn:net/ipv4/igmp.c AND rt:6.month.ago..) OR \
>              (dfn:net/ipv6/mcast.c AND rt:6.month.ago..)) \
> 	     NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)
> 
> If I add "\" on each "(", this will break to a very long config search.
> I tried to adjust it to

I think that can work if lei.internal.rawstr is set in the
config to indicate stdin was used (It's auto-set by --stdin).
I guess it also works if it's the only lei.q config entry
and the lei.q entry contains "\n"

cf. https://public-inbox.org/meta/20211110102837.41721-1-e@80x24.org/

> [lei]
>         q = (tc:liuhangbin OR \
>              (dfn:drivers/net/wireguard/ AND rt:6.month.ago..) OR \
>              (dfn:tools/testing/selftests/net/ AND rt:1.month.ago..) OR \
>              (dfn:drivers/net/team/ AND rt:6.month.ago..) OR \
>              (dfn:net/ipv4/igmp.c AND rt:6.month.ago..) OR \
>              (dfn:net/ipv6/mcast.c AND rt:6.month.ago..))
>         q = NOT
>         q = (tc:stable@vger.kernel.org
>         q = OR
>         q = f:sfr@canb.auug.org.au)
> 
> And now it works...

Sorta... at least for remotes it does:

> $ lei up /home/Liu/Mail/gmail/Linux_Kernel
> # https://lore.kernel.org/all/ limiting to 2022-09-30 17:00 +0800 and newer
> 60927 lei_xsearch 0 wq_worker: query_one_mset: Exception: Unknown range operation at /usr/share/perl5/vendor_perl/PublicInbox/IPC.pm line 254.

Note that Exception means it's not handling the first part of
the query when hitting the local Xapian DB.  It's not doing the
approxidate ($X.month.ago) substitution for the local Xapian DB,
thus you got the "Unknown range operation".

> # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3Aliuhangbin+OR+(dfn%3Adrivers%2Fnet%2Fwireguard%2F+AND+rt%3A6.month.ago..)+OR+(dfn%3Atools%2Ftesting%2Fselftests%2Fnet%2F+AND+rt%3A1.month.ago..)+OR+(dfn%3Adrivers%2Fnet%2Fteam%2F+AND+rt%3A6.month.ago..)+OR+(dfn%3Anet%2Fipv4%2Figmp.c+AND+rt%3A6.month.ago..)+OR+(dfn%3Anet%2Fipv6%2Fmcast.c+AND+rt%3A1651301673..))+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au))+AND+dt%3A20220930090001..
> # https://lore.kernel.org/all/ 43/?

Of course, the lack of approxidate parsing there inside lei is
fine, since the lore.kernel.org instance will do it remotely...

> So I want to know when/why *lei* add the quotes.

lei adds quotes since it can't distinguish if the shell user
used single or double quotes.  Xapian uses double quotes for
phrase search, and I wanted:	lei q "this is a phrase"
to work naturally, which means:	lei q 'this is a phrase'
(with single quotes) works the same way as with double quotes
because the difference is handled by the shell and lei never
sees it.

^ permalink raw reply	[relevance 63%]

* Re: [Need Help] lei add quotes at the search
  2022-10-30  5:13 71% ` Eric Wong
@ 2022-10-30  7:08 59%   ` Hangbin Liu
  2022-10-30 23:06 63%     ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Hangbin Liu @ 2022-10-30  7:08 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Hi Eric,

Thanks for the help.

On Sun, Oct 30, 2022 at 05:13:33AM +0000, Eric Wong wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
> > Hi,
> > 
> > I used to use a search like
> > 
> > lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
> > 
> > It works on fc35. But after I update to fc36 with lei-1.9.0-1.fc36. It start to
> > add quotes in the search link and make the search never works. e.g.
> 
> Are you able to show the curl CLI from fc35?
> Which public-inbox/lei version was it?
> 
> I'm actually curious fc35 worked at all, since the quoting would've
> been broken, I think...

Sorry, I don't have the fc35 environment now.

> 
> > $ lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
> > # /home/Liu/.local/share/lei/store 0/0
> > # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3A%22liuhangbin+AND+rt%3A6.month.ago..)+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au)%22
> > # 0 written to /home/Liu/Mail/liuhangbin/ (0 matches)
> > 
> > Do you think if this is a bug, or I should update my search.
> 
> The %22 in fc36 is because your entire query is treated as one
> element in argv and matches expected behavior.
> 
> Since '(' and ')' in the shell CLI is special, I suggest either:

I'm curious about why the quote(%22) is added after "tc", not after "("

> 
> a) using --stdin to enter queries containing '(' and ')'
> 
> b) quoting (or escaping) only the '(' and ')':
> 
>     '('tc:liuhangbin AND rt:6.month.ago..')' NOT ...
> 
>                 or
> 
>     \(tc:liuhangbin AND rt:6.month.ago..\) NOT ...

This way, the command line works. And in the config file, it would look like

[lei]
        q = ((tc:liuhangbin
        q = AND
        q = rt:6.month.ago..)
        q = NOT
        q = (tc:stable@vger.kernel.org
        q = OR
        q = f:sfr@canb.auug.org.au)

But with a long search line, this breaks it up too much and is hard to edit.
e.g. my real previous search looks like

[lei]
        q = (tc:liuhangbin OR \
             (dfn:drivers/net/wireguard/ AND rt:6.month.ago..) OR \
             (dfn:tools/testing/selftests/net/ AND rt:1.month.ago..) OR \
             (dfn:drivers/net/team/ AND rt:6.month.ago..) OR \
             (dfn:net/ipv4/igmp.c AND rt:6.month.ago..) OR \
             (dfn:net/ipv6/mcast.c AND rt:6.month.ago..)) \
	     NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)

If I escape every "(" with "\", the search splits into a very long list of
config lines. I tried adjusting it to:

[lei]
        q = (tc:liuhangbin OR \
             (dfn:drivers/net/wireguard/ AND rt:6.month.ago..) OR \
             (dfn:tools/testing/selftests/net/ AND rt:1.month.ago..) OR \
             (dfn:drivers/net/team/ AND rt:6.month.ago..) OR \
             (dfn:net/ipv4/igmp.c AND rt:6.month.ago..) OR \
             (dfn:net/ipv6/mcast.c AND rt:6.month.ago..))
        q = NOT
        q = (tc:stable@vger.kernel.org
        q = OR
        q = f:sfr@canb.auug.org.au)

And now it works...

$ lei up /home/Liu/Mail/gmail/Linux_Kernel
# https://lore.kernel.org/all/ limiting to 2022-09-30 17:00 +0800 and newer
60927 lei_xsearch 0 wq_worker: query_one_mset: Exception: Unknown range operation at /usr/share/perl5/vendor_perl/PublicInbox/IPC.pm line 254.
# /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3Aliuhangbin+OR+(dfn%3Adrivers%2Fnet%2Fwireguard%2F+AND+rt%3A6.month.ago..)+OR+(dfn%3Atools%2Ftesting%2Fselftests%2Fnet%2F+AND+rt%3A1.month.ago..)+OR+(dfn%3Adrivers%2Fnet%2Fteam%2F+AND+rt%3A6.month.ago..)+OR+(dfn%3Anet%2Fipv4%2Figmp.c+AND+rt%3A6.month.ago..)+OR+(dfn%3Anet%2Fipv6%2Fmcast.c+AND+rt%3A1651301673..))+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au))+AND+dt%3A20220930090001..
# https://lore.kernel.org/all/ 43/?

So I want to know when/why *lei* adds the quotes.

Thanks
Hangbin

^ permalink raw reply	[relevance 59%]

* Re: [Need Help] lei add quotes at the search
  2022-10-30  4:03 70% [Need Help] lei add quotes at the search Hangbin Liu
@ 2022-10-30  5:13 71% ` Eric Wong
  2022-10-30  7:08 59%   ` Hangbin Liu
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-10-30  5:13 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: meta

Hangbin Liu <liuhangbin@gmail.com> wrote:
> Hi,
> 
> I used to use a search like
> 
> lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
> 
> It worked on fc35. But after I updated to fc36 with lei-1.9.0-1.fc36, it
> started adding quotes to the search link, so the search never works. e.g.

Are you able to show the curl CLI from fc35?
Which public-inbox/lei version was it?

I'm actually curious fc35 worked at all, since the quoting would've
been broken, I think...

> $ lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
> # /home/Liu/.local/share/lei/store 0/0
> # /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3A%22liuhangbin+AND+rt%3A6.month.ago..)+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au)%22
> # 0 written to /home/Liu/Mail/liuhangbin/ (0 matches)
> 
> Do you think this is a bug, or should I update my search?

The %22 in fc36 is because your entire query is treated as one
element in argv and matches expected behavior.

Since '(' and ')' are special in the shell CLI, I suggest either:

a) using --stdin to enter queries containing '(' and ')'

b) quoting (or escaping) only the '(' and ')':

    '('tc:liuhangbin AND rt:6.month.ago..')' NOT ...

                or

    \(tc:liuhangbin AND rt:6.month.ago..\) NOT ...

  Which makes your argv something like:

    [ "(tc:liuhangbin", "AND", "rt:6.month.ago..)", "NOT", ... ]
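The argv difference can be seen without lei at all.  A minimal sketch,
assuming a hypothetical `show_argv' helper (not part of lei) that prints
the argument count and then each argument in brackets; the query is
shortened for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of lei): print how many arguments the
# shell delivered, then each argument on its own line in brackets.
show_argv() {
	printf 'argc=%d\n' "$#"
	for arg in "$@"; do
		printf '[%s]\n' "$arg"
	done
}

# One quoted string: the command receives a single argv element.
show_argv '(tc:liuhangbin AND rt:6.month.ago..) NOT f:sfr@canb.auug.org.au'

# Only the parentheses escaped: the shell still splits on spaces,
# so the command receives five separate elements.
show_argv \(tc:liuhangbin AND rt:6.month.ago..\) NOT f:sfr@canb.auug.org.au
```

The first call reports argc=1 and the second argc=5, which is the
distinction the suggestions above rely on.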

^ permalink raw reply	[relevance 71%]

* [Need Help] lei add quotes at the search
@ 2022-10-30  4:03 70% Hangbin Liu
  2022-10-30  5:13 71% ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Hangbin Liu @ 2022-10-30  4:03 UTC (permalink / raw)
  To: meta

Hi,

I used to use a search like

lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'

It worked on fc35. But after I updated to fc36 with lei-1.9.0-1.fc36, it
started adding quotes to the search link, so the search never works. e.g.

$ lei q -I https://lore.kernel.org/all/ -o ~/Mail/liuhangbin --threads --dedupe=mid '((tc:liuhangbin AND rt:6.month.ago..) NOT (tc:stable@vger.kernel.org OR f:sfr@canb.auug.org.au)'
# /home/Liu/.local/share/lei/store 0/0
# /usr/bin/curl -Sf -s -d '' https://lore.kernel.org/all/?x=m&t=1&q=((tc%3A%22liuhangbin+AND+rt%3A6.month.ago..)+NOT+(tc%3Astable%40vger.kernel.org+OR+f%3Asfr%40canb.auug.org.au)%22
# 0 written to /home/Liu/Mail/liuhangbin/ (0 matches)

Do you think this is a bug, or should I update my search?

Thanks
Hangbin

^ permalink raw reply	[relevance 70%]

* [PATCH] lei: force --jobs=1,1 for SQLite < 3.8.3
@ 2022-10-01  0:33 60% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-10-01  0:33 UTC (permalink / raw)
  To: meta

SQLite prior to 3.8.3 did not reset its PRNG for generating
unique temporary file names, so it would barf on t/lei-up.t
occasionally due to O_EXCL -> EEXIST conflicts.

This fixes occasional test failures under CentOS 7.x which ships
SQLite 3.7.17.
---
 lib/PublicInbox/LeiQuery.pm |  4 +++-
 lib/PublicInbox/LeiUp.pm    |  5 +++--
 lib/PublicInbox/OverIdx.pm  | 12 ++++++++++++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiQuery.pm b/lib/PublicInbox/LeiQuery.pm
index c998e5c0..df9c32b3 100644
--- a/lib/PublicInbox/LeiQuery.pm
+++ b/lib/PublicInbox/LeiQuery.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # handles "lei q" command and provides internals for
@@ -6,6 +6,7 @@
 package PublicInbox::LeiQuery;
 use strict;
 use v5.10.1;
+use PublicInbox::OverIdx;
 
 sub prep_ext { # externals_each callback
 	my ($lxs, $exclude, $loc) = @_;
@@ -17,6 +18,7 @@ sub _start_query { # used by "lei q" and "lei up"
 	require PublicInbox::LeiOverview;
 	PublicInbox::LeiOverview->new($self) or return;
 	my $opt = $self->{opt};
+	PublicInbox::OverIdx::fork_ok($opt);
 	my ($xj, $mj) = split(/,/, $opt->{jobs} // '');
 	(defined($xj) && $xj ne '' && $xj !~ /\A[1-9][0-9]*\z/) and
 		die "`$xj' search jobs must be >= 1\n";
diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm
index b8a98360..5ad21451 100644
--- a/lib/PublicInbox/LeiUp.pm
+++ b/lib/PublicInbox/LeiUp.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # "lei up" - updates the result of "lei q --save"
@@ -7,7 +7,7 @@ use strict;
 use v5.10.1;
 # n.b. we use LeiInput to setup IMAP auth
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
-use PublicInbox::LeiSavedSearch;
+use PublicInbox::LeiSavedSearch; # OverIdx
 use PublicInbox::DS;
 use PublicInbox::PktOp;
 use PublicInbox::LeiFinmsg;
@@ -75,6 +75,7 @@ sub redispatch_all ($$) {
 	my $upq = [ (@{$self->{o_local} // []}, @{$self->{o_remote} // []}) ];
 	return up1($lei, $upq->[0]) if @$upq == 1; # just one, may start MUA
 
+	PublicInbox::OverIdx::fork_ok($lei->{opt});
 	# FIXME: this is also used per-query, see lei->_start_query
 	my $j = $lei->{opt}->{jobs} || do {
 		my $n = $self->detect_nproc // 1;
diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index e7c96e14..a49ca6db 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -670,4 +670,16 @@ sub vivify_xvmd {
 	$smsg->{-vivify_xvmd} = \@vivify_xvmd;
 }
 
+sub fork_ok {
+	return 1 if $DBD::SQLite::sqlite_version >= 3008003;
+	my ($opt) = @_;
+	my @j = split(/,/, $opt->{jobs} // '');
+	state $warned;
+	grep { $_ > 1 } @j and $warned //= warn('DBD::SQLite version is ',
+		 $DBD::SQLite::sqlite_version,
+		", need >= 3008003 (3.8.3) for --jobs > 1\n");
+	$opt->{jobs} = '1,1';
+	undef;
+}
+
 1;
-- 
2.33.0
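
For readers wondering about the 3008003 constant: SQLite encodes a
release X.Y.Z as the integer X*1000000 + Y*1000 + Z.  A rough shell
sketch of the same threshold check, assuming hypothetical helpers
`vernum' and `needs_single_job' (neither exists in public-inbox):

```shell
#!/bin/sh
# Hypothetical helper: turn "X.Y.Z" into X*1000000 + Y*1000 + Z,
# the encoding behind the 3008003 constant in the patch.
vernum() {
	old_ifs=$IFS
	IFS=.
	set -- $1          # split the dotted version on "."
	IFS=$old_ifs
	echo $(( $1 * 1000000 + ${2:-0} * 1000 + ${3:-0} ))
}

# Succeeds when the given version is older than 3.8.3.
needs_single_job() {
	[ "$(vernum "$1")" -lt 3008003 ]
}

for v in 3.7.17 3.8.3 3.34.1; do
	if needs_single_job "$v"; then
		echo "$v: force --jobs=1,1"
	else
		echo "$v: parallel jobs OK"
	fi
done
```

Only 3.7.17 (the CentOS 7.x version mentioned above) falls below the
threshold in this sketch.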


^ permalink raw reply related	[relevance 60%]

* SQLite <3.8.3 was broken on fork (was: fixes noticed while diagnosing t/lei-up.t)
  2022-09-30  9:21 71% [PATCH 0/4] fixes noticed while diagnosing t/lei-up.t Eric Wong
  2022-09-30  9:21 56% ` [PATCH 2/4] t/lei-up: improve diagnostics for this test Eric Wong
  2022-09-30  9:21 61% ` [PATCH 3/4] lei_to_mail: propagate errors to script/lei Eric Wong
@ 2022-09-30 17:20 65% ` Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-09-30 17:20 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> I'm still trying to figure out why OverIdx->adj_counter (via
> next_tid) in the LeiSavedSearch dedupe check occasionally fails
> under CentOS 7.x (but not other systems).

CentOS 7.x ships SQLite 3.7.17, and SQLite prior to 3.8.3 (2014-02-03)
didn't reset its PRNG for generating temporary filenames upon fork.

I can't find a way to reset that PRNG via DBD::SQLite, either...

One possible nasty thing I could do is reset TMPDIR upon fork,
but that may leave bugs lingering, too :<

Here's the debugging patch I used to strace what was going on:
(I may just make lei_ok bail out on error...)

diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index e7c96e14..e1af910c 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -51,10 +51,22 @@ SELECT val FROM counter WHERE key = ? LIMIT 1
 sub adj_counter ($$$) {
 	my ($self, $key, $op) = @_;
 	my $dbh = $self->{dbh};
+use PublicInbox::Spawn qw(spawn);
+my $trace = "/tmp/$$.strace";
+open my $err, '>>', $trace;
+my $pid = spawn([qw(strace -f -s4096 -v -p), $$], undef, { 2 => $err });
+select undef, undef, undef, 0.1; 
+
 	my $sth = $dbh->prepare_cached(<<"");
 UPDATE counter SET val = val $op 1 WHERE key = ?
 
-	$sth->execute($key);
+	eval { $sth->execute($key) };
+my $err = $@;
+syswrite(STDERR, "$$ $err\n") if $err;
+kill('TERM', $pid);
+waitpid($pid, 0);
+die $err if $err;
+#unlink($trace);
 
 	get_counter($dbh, $key);
 }
diff --git a/t/lei-up.t b/t/lei-up.t
index baed6507..9c65a243 100644
--- a/t/lei-up.t
+++ b/t/lei-up.t
@@ -27,7 +27,7 @@ test_lei(sub {
 	lei_ok qw(ls-search);
 	$s = eml_load('t/utf8.eml')->as_string;
 	lei_ok [qw(import -q -F eml -)], undef, { 0 => \$s, %$lei_opt };
-	lei_ok qw(up --all=local);
+	lei_ok qw(up --all=local) or xbail "lei up --all=local failed $?";
 
 	gunzip("$home/a.mbox.gz" => \$uc, MultiStream => 1) or
 		 xbail "gunzip $GunzipError";
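
The EEXIST collision described above can be imitated in plain shell.
This is only a sketch of the failure mode, not of SQLite itself: the
file name is a made-up stand-in for a name two forked children would
both pick from an identical PRNG state, and `set -C' (noclobber) plays
the role of an exclusive create.

```shell
#!/bin/sh
# Sketch: two "processes" that derive the same temporary name collide
# when the file must be created exclusively.  `set -C' (noclobber)
# makes `>' refuse to overwrite an existing regular file.
tmpdir=$(mktemp -d) || exit 1
name="$tmpdir/SAMENAME"   # hypothetical name both sides picked

# Try to create "$1" exclusively; fails if it already exists.
create_excl() {
	( set -C; : > "$1" ) 2>/dev/null
}

if create_excl "$name"; then
	echo "first: created"
fi
if create_excl "$name"; then
	echo "second: created"
else
	echo "second: name already taken"
fi
rm -rf "$tmpdir"
```

The second attempt fails because the name is already taken, which is
the kind of conflict the children hit when their PRNG state is shared.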

^ permalink raw reply related	[relevance 65%]

* [PATCH 3/4] lei_to_mail: propagate errors to script/lei
  2022-09-30  9:21 71% [PATCH 0/4] fixes noticed while diagnosing t/lei-up.t Eric Wong
  2022-09-30  9:21 56% ` [PATCH 2/4] t/lei-up: improve diagnostics for this test Eric Wong
@ 2022-09-30  9:21 61% ` Eric Wong
  2022-09-30 17:20 65% ` SQLite <3.8.3 was broken on fork (was: fixes noticed while diagnosing t/lei-up.t) Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-09-30  9:21 UTC (permalink / raw)
  To: meta

We need to rely on lei->fail to propagate errors in lei workers
to the script/lei client, otherwise tests and other scripts can
stumble forward with incomplete/incorrect/broken outputs.

This helps me focus on occasional t/lei-up.t failures I see on
CentOS 7.x where OverIdx->adj_counter fails on "lei up --all"...
---
 lib/PublicInbox/LeiToMail.pm | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 03cbde3b..b58e2652 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -132,19 +132,22 @@ sub eml2mboxcl2 {
 }
 
 sub git_to_mail { # git->cat_async callback
-	my ($bref, $oid, $type, $size, $arg) = @_;
+	my ($bref, $oid, $type, $size, $smsg) = @_;
+	my $self = delete $smsg->{l2m} // die "BUG: no l2m";
 	$type // return; # called by git->async_abort
-	my ($write_cb, $smsg) = @$arg;
-	if ($type eq 'missing' && $smsg->{-lms_rw}) {
-		if ($bref = $smsg->{-lms_rw}->local_blob($oid, 1)) {
+	eval {
+		if ($type eq 'missing' &&
+			  ($bref = $self->{-lms_rw}->local_blob($oid, 1))) {
 			$type = 'blob';
 			$size = length($$bref);
 		}
-	}
-	return warn("W: $oid is $type (!= blob)\n") if $type ne 'blob';
-	return warn("E: $oid is empty\n") unless $size;
-	die "BUG: expected=$smsg->{blob} got=$oid" if $smsg->{blob} ne $oid;
-	$write_cb->($bref, $smsg);
+		$type eq 'blob' or return $self->{lei}->child_error(1,
+						"W: $oid is $type (!= blob)");
+		$size or return $self->{lei}->child_error(1,"E: $oid is empty");
+		$smsg->{blob} eq $oid or die "BUG: expected=$smsg->{blob}";
+		$self->{wcb}->($bref, $smsg);
+	};
+	$self->{lei}->fail("$@ (oid=$oid)") if $@;
 }
 
 sub reap_compress { # dwaitpid callback
@@ -790,19 +793,22 @@ sub poke_dst {
 
 sub write_mail { # via ->wq_io_do
 	my ($self, $smsg, $eml) = @_;
-	return $self->{wcb}->(undef, $smsg, $eml) if $eml;
-	$smsg->{-lms_rw} = $self->{-lms_rw};
-	$self->{git}->cat_async($smsg->{blob}, \&git_to_mail,
-				[$self->{wcb}, $smsg]);
+	if ($eml) {
+		eval { $self->{wcb}->(undef, $smsg, $eml) };
+		$self->{lei}->fail("blob=$smsg->{blob} $@") if $@;
+	} else {
+		$smsg->{l2m} = $self;
+		$self->{git}->cat_async($smsg->{blob}, \&git_to_mail, $smsg);
+	}
 }
 
 sub wq_atexit_child {
 	my ($self) = @_;
 	local $PublicInbox::DS::in_loop = 0; # waitpid synchronously
 	my $lei = $self->{lei};
-	delete $self->{wcb};
 	$lei->{ale}->git->async_wait_all;
 	my ($nr_w, $nr_s) = delete(@$lei{qw(-nr_write -nr_seen)});
+	delete $self->{wcb};
 	$nr_s or return;
 	return if $lei->{early_mua} || !$lei->{-progress} || !$lei->{pkt_op_p};
 	$lei->{pkt_op_p}->pkt_do('l2m_progress', $nr_w, $nr_s);

^ permalink raw reply related	[relevance 61%]

* [PATCH 2/4] t/lei-up: improve diagnostics for this test
  2022-09-30  9:21 71% [PATCH 0/4] fixes noticed while diagnosing t/lei-up.t Eric Wong
@ 2022-09-30  9:21 56% ` Eric Wong
  2022-09-30  9:21 61% ` [PATCH 3/4] lei_to_mail: propagate errors to script/lei Eric Wong
  2022-09-30 17:20 65% ` SQLite <3.8.3 was broken on fork (was: fixes noticed while diagnosing t/lei-up.t) Eric Wong
  2 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-09-30  9:21 UTC (permalink / raw)
  To: meta

I'm getting occasional failures for this test on CentOS 7.x (but
not on FreeBSD nor Debian 10/11).  I'm not sure why, yet, so just
improve diagnostics for now.
---
 t/lei-up.t | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/t/lei-up.t b/t/lei-up.t
index 022ebc05..baed6507 100644
--- a/t/lei-up.t
+++ b/t/lei-up.t
@@ -5,39 +5,50 @@ use strict; use v5.10.1; use PublicInbox::TestCommon;
 use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
 test_lei(sub {
 	my ($ro_home, $cfg_path) = setup_public_inboxes;
-	my $s = eml_load('t/plack-qp.eml')->as_string;
+	my $home = $ENV{HOME};
+	my $qp = eml_load('t/plack-qp.eml');
+	my $s = $qp->as_string;
 	lei_ok [qw(import -q -F eml -)], undef, { 0 => \$s, %$lei_opt };
-	lei_ok qw(q z:0.. -f mboxcl2 -o), "$ENV{HOME}/a.mbox.gz";
-	lei_ok qw(q z:0.. -f mboxcl2 -o), "$ENV{HOME}/b.mbox.gz";
-	lei_ok qw(q z:0.. -f mboxcl2 -o), "$ENV{HOME}/a";
-	lei_ok qw(q z:0.. -f mboxcl2 -o), "$ENV{HOME}/b";
+	lei_ok qw(q z:0.. -f mboxcl2 -o), "$home/a.mbox.gz";
+	lei_ok qw(q z:0.. -f mboxcl2 -o), "$home/b.mbox.gz";
+	lei_ok qw(q z:0.. -f mboxcl2 -o), "$home/a";
+	lei_ok qw(q z:0.. -f mboxcl2 -o), "$home/b";
+	my $uc;
+	for my $x (qw(a b)) {
+		gunzip("$home/$x.mbox.gz" => \$uc, MultiStream => 1) or
+				xbail "gunzip $GunzipError";
+		ok(index($uc, $qp->body_raw) >= 0,
+			"original mail in $x.mbox.gz");
+		open my $fh, '<', "$home/$x" or xbail $!;
+		$uc = do { local $/; <$fh> } // xbail $!;
+		ok(index($uc, $qp->body_raw) >= 0,
+			"original mail in uncompressed $x");
+	}
 	lei_ok qw(ls-search);
 	$s = eml_load('t/utf8.eml')->as_string;
 	lei_ok [qw(import -q -F eml -)], undef, { 0 => \$s, %$lei_opt };
 	lei_ok qw(up --all=local);
-	open my $fh, '<', "$ENV{HOME}/a.mbox.gz" or xbail "open: $!";
-	my $gz = do { local $/; <$fh> };
-	my $uc;
-	gunzip(\$gz => \$uc, MultiStream => 1) or xbail "gunzip $GunzipError";
-	open $fh, '<', "$ENV{HOME}/a" or xbail "open: $!";
 
+	gunzip("$home/a.mbox.gz" => \$uc, MultiStream => 1) or
+		 xbail "gunzip $GunzipError";
+
+	open my $fh, '<', "$home/a" or xbail "open: $!";
 	my $exp = do { local $/; <$fh> };
 	is($uc, $exp, 'compressed and uncompressed match (a.gz)');
 	like($exp, qr/testmessage\@example.com/, '2nd message added');
-	open $fh, '<', "$ENV{HOME}/b.mbox.gz" or xbail "open: $!";
 
-	$gz = do { local $/; <$fh> };
 	undef $uc;
-	gunzip(\$gz => \$uc, MultiStream => 1) or xbail "gunzip $GunzipError";
+	gunzip("$home/b.mbox.gz" => \$uc, MultiStream => 1) or
+		 xbail "gunzip $GunzipError";
 	is($uc, $exp, 'compressed and uncompressed match (b.gz)');
 
-	open $fh, '<', "$ENV{HOME}/b" or xbail "open: $!";
+	open $fh, '<', "$home/b" or xbail "open: $!";
 	$uc = do { local $/; <$fh> };
 	is($uc, $exp, 'uncompressed both match');
 
-	lei_ok [ qw(up -q), "$ENV{HOME}/b", "--mua=touch $ENV{HOME}/c" ],
+	lei_ok [ qw(up -q), "$home/b", "--mua=touch $home/c" ],
 		undef, { run_mode => 0 };
-	ok(-f "$ENV{HOME}/c", '--mua works with single output');
+	ok(-f "$home/c", '--mua works with single output');
 });
 
 done_testing;

^ permalink raw reply related	[relevance 56%]

* [PATCH 0/4] fixes noticed while diagnosing t/lei-up.t
@ 2022-09-30  9:21 71% Eric Wong
  2022-09-30  9:21 56% ` [PATCH 2/4] t/lei-up: improve diagnostics for this test Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Eric Wong @ 2022-09-30  9:21 UTC (permalink / raw)
  To: meta

I'm still trying to figure out why OverIdx->adj_counter (via
next_tid) in the LeiSavedSearch dedupe check occasionally fails
under CentOS 7.x (but not other systems).

Meanwhile, some improvements noticed along the way.
The underlying problem remains...

Disabling WAL didn't help t/lei-up.t on CentOS 7.x, so the
obvious newish feature we use is unlikely the culprit...

Eric Wong (4):
  tests: favor 3 argument `open' with interpolation
  t/lei-up: improve diagnostics for this test
  lei_to_mail: propagate errors to script/lei
  t/altid_v2: improve test style

 lib/PublicInbox/LeiToMail.pm | 34 ++++++++++++++++------------
 t/altid_v2.t                 | 10 ++++-----
 t/hl_mod.t                   |  4 ++--
 t/lei-up.t                   | 43 ++++++++++++++++++++++--------------
 t/lei_to_mail.t              | 10 ++++-----
 5 files changed, 59 insertions(+), 42 deletions(-)

^ permalink raw reply	[relevance 71%]

* Re: [PATCH v2] lei: bail out earlier on IMAP writer failures
  2022-09-10 19:53 70%           ` Ricardo Ribalda
@ 2022-09-10 20:19 71%             ` Eric Wong
  2022-11-14  8:07 64%               ` [PATCH] lei q|up: limit default write --jobs for IMAP(S) Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-09-10 20:19 UTC (permalink / raw)
  To: Ricardo Ribalda; +Cc: meta

Ricardo Ribalda <ribalda@chromium.org> wrote:
> ribalda@denia:/tmp/public-inbox$ lei up imaps://imap.gmail.com/lei/me
> # https://lore.kernel.org/all/ limiting to 2022-09-08 19:52 +0000 and newer
> # /usr/local/google/home/ribalda/.local/share/lei/store 12/12
> # /usr/bin/curl -Sf -s -d ''
> https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660161176..)+AND+dt%3A20220908195223..
> E: imaps://imap.gmail.com/lei/me connection failed.
> E: Consider using `--jobs ,1' to limit IMAP connections

Thanks for confirming things work as intended.  I think the
default should be clamped, though... 15 seems a bit high for
smaller IMAP servers *shrug*

> ribalda@denia:/tmp/public-inbox$ lei up imaps://imap.gmail.com/lei/me --jobs ,15
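
For anyone reading along, `--jobs ,1' leaves the search job count at
its default and sets only the writer count.  A small sketch of that
"SEARCH,WRITE" shape, assuming a hypothetical `parse_jobs' function
(lei does this parsing in Perl; this is not its code):

```shell
#!/bin/sh
# Hypothetical parser mirroring the "SEARCH,WRITE" shape of --jobs;
# an empty or missing field means "use the default".
parse_jobs() {
	spec=$1
	case $spec in
	*,*) search=${spec%%,*}; write=${spec#*,} ;;
	*)   search=$spec; write= ;;
	esac
	echo "search=${search:-default} write=${write:-default}"
}

parse_jobs ,1     # only limit the writers (e.g. IMAP connections)
parse_jobs 4,2    # both fields given
parse_jobs 8      # only the search side given
```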

^ permalink raw reply	[relevance 71%]

* Re: [PATCH v2] lei: bail out earlier on IMAP writer failures
  2022-09-10 19:50 71%         ` Eric Wong
@ 2022-09-10 19:53 70%           ` Ricardo Ribalda
  2022-09-10 20:19 71%             ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Ricardo Ribalda @ 2022-09-10 19:53 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Sat, 10 Sept 2022 at 21:51, Eric Wong <e@80x24.org> wrote:
>
> Ricardo Ribalda <ribalda@chromium.org> wrote:
> > Similar output:
>
> Oh, wait, did you run `lei daemon-kill' after applying the
> patch?  That's needed to reload the code, I'll try to make it
> either auto-reload or warn users in the future.
>
> Thanks.

Oops :S

ribalda@denia:/tmp/public-inbox$ lei up imaps://imap.gmail.com/lei/me
# https://lore.kernel.org/all/ limiting to 2022-09-08 19:52 +0000 and newer
# /usr/local/google/home/ribalda/.local/share/lei/store 12/12
# /usr/bin/curl -Sf -s -d ''
https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660161176..)+AND+dt%3A20220908195223..
E: imaps://imap.gmail.com/lei/me connection failed.
E: Consider using `--jobs ,1' to limit IMAP connections
ribalda@denia:/tmp/public-inbox$ lei up imaps://imap.gmail.com/lei/me --jobs ,15
# https://lore.kernel.org/all/ limiting to 2022-09-08 19:52 +0000 and newer
# /usr/local/google/home/ribalda/.local/share/lei/store 12/12
# /usr/bin/curl -Sf -s -d ''
https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660161182..)+AND+dt%3A20220908195223..
# https://lore.kernel.org/all/ 20/20
# 0 written to imaps://imap.gmail.com/lei/me (32 matches, 20 duplicates)




-- 
Ricardo Ribalda

^ permalink raw reply	[relevance 70%]

* Re: [PATCH v2] lei: bail out earlier on IMAP writer failures
  2022-09-10 19:34 37%       ` Ricardo Ribalda
@ 2022-09-10 19:50 71%         ` Eric Wong
  2022-09-10 19:53 70%           ` Ricardo Ribalda
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-09-10 19:50 UTC (permalink / raw)
  To: Ricardo Ribalda; +Cc: meta

Ricardo Ribalda <ribalda@chromium.org> wrote:
> Similar output:

Oh, wait, did you run `lei daemon-kill' after applying the
patch?  That's needed to reload the code, I'll try to make it
either auto-reload or warn users in the future.

Thanks.

^ permalink raw reply	[relevance 71%]

* Re: [PATCH v2] lei: bail out earlier on IMAP writer failures
  2022-09-10  1:18 59%     ` [PATCH v2] lei: bail out earlier on " Eric Wong
@ 2022-09-10 19:34 37%       ` Ricardo Ribalda
  2022-09-10 19:50 71%         ` Eric Wong
  0 siblings, 1 reply; 200+ results
From: Ricardo Ribalda @ 2022-09-10 19:34 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Hi Eric

On Sat, 10 Sept 2022 at 03:19, Eric Wong <e@80x24.org> wrote:
>
> Ricardo Ribalda <ribalda@chromium.org> wrote:
> > The patch did not seem to have any effect :(, I never get an "IMAP
> > LastError: " message
>
> Yeah, I guess IMAP servers will just shut down the socket w/o
> saying anything. At least I didn't get anything from dovecot...
>
> The below patch is a refinement of what I posted originally
> and should stop the process instead of attempting to continue
> and spew.
>
> > On the other hand, the -j worked! I can go up to -j ,15 without any error.
>
> Good to know.
>
> I wonder if making the default `-j ,4' for IMAP is reasonable if
> unspecified.  That's the default limit for HTTP(S) hosts, and I
> seem to recall 4 being a reasonable limit for browsers.
>
> Thanks for the report and followup!

Similar output:


ribalda@denia:/tmp/public-inbox$ lei up imaps://imap.gmail.com/lei/me
# https://lore.kernel.org/all/ limiting to 2022-09-08 19:33 +0000 and newer
# /usr/local/google/home/ribalda/.local/share/lei/store 12/12
# /usr/bin/curl -Sf -s -d ''
https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660159987..)+AND+dt%3A20220908193300..
1486767 lei2mail 17 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486762 lei2mail 12 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486782 lei2mail 32 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486771 lei2mail 21 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486766 lei2mail 16 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486788 lei2mail 38 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486761 lei2mail 11 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486791 lei2mail 41 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486775 lei2mail 25 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486765 lei2mail 15 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486755 lei2mail 5 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486793 lei2mail 43 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486795 lei2mail 45 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486763 lei2mail 13 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486779 lei2mail 29 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486773 lei2mail 23 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486751 lei2mail 1 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486787 lei2mail 37 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486796 lei2mail 46 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486776 lei2mail 26 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486752 lei2mail 2 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486750 lei2mail 0 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486786 lei2mail 36 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486777 lei2mail 27 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486764 lei2mail 14 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486794 lei2mail 44 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486756 lei2mail 6 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486790 lei2mail 40 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486784 lei2mail 34 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486792 lei2mail 42 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486768 lei2mail 18 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486774 lei2mail 24 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486757 lei2mail 7 wq_worker: do_post_auth: Can't call method
"uidvalidity" on an undefined value at
/usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
1486750 lei2mail 0 wq_worker: write_mail: Can't use an undefined value
as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
line 783.
# https://lore.kernel.org/all/ 20/20
# 0 written to imaps://imap.gmail.com/lei/me (32 matches)
ribalda@denia:/tmp/public-inbox$ lei up imaps://imap.gmail.com/lei/me -j ,15
# https://lore.kernel.org/all/ limiting to 2022-09-08 19:33 +0000 and newer
# /usr/local/google/home/ribalda/.local/share/lei/store 12/12
# /usr/bin/curl -Sf -s -d ''
https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660160023..)+AND+dt%3A20220908193307..
# https://lore.kernel.org/all/ 20/20
# 0 written to imaps://imap.gmail.com/lei/me (32 matches)


Thanks!


>
> ------8<------
> From: Eric Wong <e@80x24.org>
> Subject: [PATCH] lei: bail out earlier on IMAP writer failures
>
> Excessive IMAP connections can overload IMAP servers and cause
> clients to be disconnected without diagnostic messages.
> Use $lei->fail on these exceptions to propagate errors to the
> CLI ASAP to avoid further errors down the line.
>
> This ought to make problems more apparent for users using IMAP
> destinations.
>
> Reported-by: Ricardo Ribalda <ribalda@chromium.org>
> Link: https://public-inbox.org/meta/CANiDSCsDfutAUMBLPZbxdyka+_jnhv+4YNYdL9QPRoC=wNUGCQ@mail.gmail.com/
> ---
>  lib/PublicInbox/LeiToMail.pm | 10 +++++++---
>  lib/PublicInbox/NetReader.pm |  8 +++++++-
>  2 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
> index 2aa3977e..03cbde3b 100644
> --- a/lib/PublicInbox/LeiToMail.pm
> +++ b/lib/PublicInbox/LeiToMail.pm
> @@ -310,8 +310,11 @@ sub _imap_write_cb ($$) {
>         my $dedupe = $lei->{dedupe};
>         $dedupe->prepare_dedupe if $dedupe;
>         my $append = $lei->{net}->can('imap_append');
> -       my $uri = $self->{uri};
> -       my $mic = $lei->{net}->mic_get($uri);
> +       my $uri = $self->{uri} // die 'BUG: no {uri}';
> +       my $mic = $lei->{net}->mic_get($uri) // die <<EOM;
> +E: $uri connection failed.
> +E: Consider using `--jobs ,1' to limit IMAP connections
> +EOM
>         my $folder = $uri->mailbox;
>         $uri->uidvalidity($mic->uidvalidity($folder));
>         my $lse = $lei->{lse}; # may be undef
> @@ -749,7 +752,8 @@ sub do_post_auth {
>                 $au_peers->[1] = undef;
>                 sysread($au_peers->[0], my $barrier1, 1);
>         }
> -       $self->{wcb} = $self->write_cb($lei);
> +       eval { $self->{wcb} = $self->write_cb($lei) };
> +       $lei->fail($@) if $@;
>         if ($au_peers) { # wait for peer l2m to set write_cb
>                 $au_peers->[3] = undef;
>                 sysread($au_peers->[2], my $barrier2, 1);
> diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
> index c1af03a3..4de2583e 100644
> --- a/lib/PublicInbox/NetReader.pm
> +++ b/lib/PublicInbox/NetReader.pm
> @@ -685,7 +685,13 @@ sub mic_get {
>         }
>         my $mic = mic_new($self, $mic_arg, $sec, $uri);
>         $cached //= {}; # invalid placeholder if no cache enabled
> -       $mic && $mic->IsConnected ? ($cached->{$sec} = $mic) : undef;
> +       if ($mic && $mic->IsConnected) {
> +               $cached->{$sec} = $mic;
> +       } else {
> +               warn 'IMAP LastError: ',$mic->LastError, "\n" if $mic;
> +               warn "IMAP errno: $!\n" if $!;
> +               undef;
> +       }
>  }
>
>  sub imap_each {



-- 
Ricardo Ribalda

^ permalink raw reply	[relevance 37%]

* [PATCH] lei: fix --help for --jobs with `up' and `q'
@ 2022-09-10  1:35 71% Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-09-10  1:35 UTC (permalink / raw)
  To: meta

The help needs to match on the short option, too, and that
`lei q' option is (like most options) shared with `lei up'.
---
 lib/PublicInbox/LEI.pm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8a3a3ab6..f3e80113 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -399,8 +399,10 @@ my %OPTDESC = (
 		'include specified external(s) in search' ],
 'only|O=s@	q' => [ 'LOCATION',
 		'only use specified external(s) for search' ],
-'jobs=s	q' => [ '[SEARCH_JOBS][,WRITER_JOBS]',
-		'control number of search and writer jobs' ],
+'jobs|j=s' => [ 'JOBSPEC',
+		'control number of query and writer jobs; ' .
+		"integers delimited by `,', either of which may be omitted"
+		],
 'jobs|j=i	add-external' => 'set parallelism when indexing after --mirror',
 
 'in-format|F=s' => $stdin_formats,

^ permalink raw reply related	[relevance 71%]

* [PATCH v2] lei: bail out earlier on IMAP writer failures
  2022-09-09 20:35 71%   ` [PATCH] lei: add diagnostics for IMAP writer failures Ricardo Ribalda
@ 2022-09-10  1:18 59%     ` Eric Wong
  2022-09-10 19:34 37%       ` Ricardo Ribalda
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-09-10  1:18 UTC (permalink / raw)
  To: Ricardo Ribalda; +Cc: meta

Ricardo Ribalda <ribalda@chromium.org> wrote:
> The patch did not seem to have any effect :( I never get an "IMAP
> LastError: " message.

Yeah, I guess IMAP servers will just shut down the socket w/o
saying anything. At least I didn't get anything from Dovecot...

The below patch is a refinement of what I posted originally
and should stop the process instead of attempting to continue
and spew.

> On the other hand, the -j worked! I can go up to -j ,15 without any error.

Good to know.

I wonder if making the default `-j ,4' for IMAP is reasonable if
unspecified.  That's the default limit for HTTP(S) hosts, and I
seem to recall 4 being a reasonable limit for browsers.

Thanks for the report and followup!
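
For anyone reading along, the shape of the fix in the patch below can
be sketched in isolation.  This is only an illustration under stated
assumptions: connect_imap, make_writer and setup_writer are made-up
names, and the hash standing in for a connection is not the real
client object.  The idea is to die with a hint when the connection
comes back undef, and to trap that exception once at setup time
rather than letting every later write hit an undefined callback.

```perl
use strict;
use warnings;

# Stand-in for a connection attempt (hypothetical); returns undef on
# failure, the way the patched mic_get does.
sub connect_imap {
	my ($uri, $ok) = @_;
	$ok ? { uri => $uri } : undef;
}

# Build the write callback, or die with a user-facing hint.
sub make_writer {
	my ($uri, $ok) = @_;
	# `//' only fires on undef, so the die is skipped on success
	my $conn = connect_imap($uri, $ok) // die <<EOM;
E: $uri connection failed.
E: Consider using `--jobs ,1' to limit IMAP connections
EOM
	sub { "append to $conn->{uri}" };
}

# Trap the exception once and hand the message back to the caller,
# which can then report it and stop instead of continuing.
sub setup_writer {
	my ($uri, $ok) = @_;
	my $wcb = eval { make_writer($uri, $ok) };
	return (undef, $@) if !$wcb;
	($wcb, undef);
}

my ($wcb, $err) = setup_writer('imaps://example.com/INBOX', 0);
print STDERR $err if $err;
```

In the real code the error goes through $lei->fail rather than being
returned, so treat the return convention here as a simplification.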

------8<------
From: Eric Wong <e@80x24.org>
Subject: [PATCH] lei: bail out earlier on IMAP writer failures

Excessive IMAP connections can overload IMAP servers and cause
clients to be disconnected without diagnostic messages.
Use $lei->fail on these exceptions to propagate errors to the
CLI ASAP to avoid further errors down the line.

This ought to make problems more apparent for users using IMAP
destinations.

Reported-by: Ricardo Ribalda <ribalda@chromium.org>
Link: https://public-inbox.org/meta/CANiDSCsDfutAUMBLPZbxdyka+_jnhv+4YNYdL9QPRoC=wNUGCQ@mail.gmail.com/
---
 lib/PublicInbox/LeiToMail.pm | 10 +++++++---
 lib/PublicInbox/NetReader.pm |  8 +++++++-
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 2aa3977e..03cbde3b 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -310,8 +310,11 @@ sub _imap_write_cb ($$) {
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe if $dedupe;
 	my $append = $lei->{net}->can('imap_append');
-	my $uri = $self->{uri};
-	my $mic = $lei->{net}->mic_get($uri);
+	my $uri = $self->{uri} // die 'BUG: no {uri}';
+	my $mic = $lei->{net}->mic_get($uri) // die <<EOM;
+E: $uri connection failed.
+E: Consider using `--jobs ,1' to limit IMAP connections
+EOM
 	my $folder = $uri->mailbox;
 	$uri->uidvalidity($mic->uidvalidity($folder));
 	my $lse = $lei->{lse}; # may be undef
@@ -749,7 +752,8 @@ sub do_post_auth {
 		$au_peers->[1] = undef;
 		sysread($au_peers->[0], my $barrier1, 1);
 	}
-	$self->{wcb} = $self->write_cb($lei);
+	eval { $self->{wcb} = $self->write_cb($lei) };
+	$lei->fail($@) if $@;
 	if ($au_peers) { # wait for peer l2m to set write_cb
 		$au_peers->[3] = undef;
 		sysread($au_peers->[2], my $barrier2, 1);
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index c1af03a3..4de2583e 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -685,7 +685,13 @@ sub mic_get {
 	}
 	my $mic = mic_new($self, $mic_arg, $sec, $uri);
 	$cached //= {}; # invalid placeholder if no cache enabled
-	$mic && $mic->IsConnected ? ($cached->{$sec} = $mic) : undef;
+	if ($mic && $mic->IsConnected) {
+		$cached->{$sec} = $mic;
+	} else {
+		warn 'IMAP LastError: ',$mic->LastError, "\n" if $mic;
+		warn "IMAP errno: $!\n" if $!;
+		undef;
+	}
 }
 
 sub imap_each {

^ permalink raw reply related	[relevance 59%]

* Re: [PATCH] lei: add diagnostics for IMAP writer failures
  2022-09-09 17:44 64% ` [PATCH] lei: add diagnostics for IMAP writer failures Eric Wong
  2022-09-09 18:00 86%   ` [PATCH] doc: document --jobs for `lei q' and `lei up' Eric Wong
@ 2022-09-09 20:35 71%   ` Ricardo Ribalda
  2022-09-10  1:18 59%     ` [PATCH v2] lei: bail out earlier on " Eric Wong
  1 sibling, 1 reply; 200+ results
From: Ricardo Ribalda @ 2022-09-09 20:35 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Hi Eric

On Fri, 9 Sept 2022 at 19:45, Eric Wong <e@80x24.org> wrote:
>
> Ricardo Ribalda <ribalda@chromium.org> wrote:
> > Hi
> >
> > I am getting a lot of these messages when using lei in IMAP mode.
> >
> > Nonetheless, the mail seems to arrive fine at its destination (on the
> > first run, some of the mail was lost, but for small batches of mail
> > it seems to work fine).
> >
> > Is it a valid message or just a red herring?
> >
> > I am using Debian testing.
> >
> > Regards!
> >
> > # https://lore.kernel.org/all/ limiting to 2022-09-07 10:03 +0000 and newer
> > # /usr/local/google/home/ribalda/.local/share/lei/store 13/13
> > # /usr/bin/curl -Sf -s -d ''
> > https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660039651..)+AND+dt%3A20220907100321..
> > 1285740 lei2mail 6 wq_worker: do_post_auth: Can't call method
> > "uidvalidity" on an undefined value at
> > /usr/share/perl5/PublicInbox/LeiToMail.pm line 313.
>
> <snip>
>
> > 1285735 lei2mail 1 wq_worker: write_mail: Can't use an undefined value
> > as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
> > line 783.
> > # https://lore.kernel.org/all/ 18/18
> > # 14 written to imaps://imap.gmail.com/lei/me (31 matches)
>
> I wonder if it's excessive parallelism for Gmail's IMAP.
> I haven't tested IMAP destinations much...
>
> Can you try the patch at the bottom?
>
> There's also another patch coming to document the `--jobs|-j' CLI
> switch for `lei up' and `lei q', but trying `-j ,1' may help you
> if it's parallelism.  Note the comma before `1': it accepts
> `-j $Q,$W', where $Q is the number of query processes and $W is
> the number of LeiToMail writers.

The patch did not seem to have any effect :( I never get an "IMAP
LastError: " message.

On the other hand, the -j worked! I can go up to -j ,15 without any error.

>
> -------8<-------
> From: Eric Wong <e@80x24.org>
> Subject: [PATCH] lei: add diagnostics for IMAP writer failures
>
> This may help diagnose the problem with IMAP destinations
> encountered at:
> https://public-inbox.org/meta/CANiDSCsDfutAUMBLPZbxdyka+_jnhv+4YNYdL9QPRoC=wNUGCQ@mail.gmail.com/
> ---
>  lib/PublicInbox/LeiToMail.pm | 4 ++--
>  lib/PublicInbox/NetReader.pm | 8 +++++++-
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
> index 2aa3977e..bc00b96a 100644
> --- a/lib/PublicInbox/LeiToMail.pm
> +++ b/lib/PublicInbox/LeiToMail.pm
> @@ -310,8 +310,8 @@ sub _imap_write_cb ($$) {
>         my $dedupe = $lei->{dedupe};
>         $dedupe->prepare_dedupe if $dedupe;
>         my $append = $lei->{net}->can('imap_append');
> -       my $uri = $self->{uri};
> -       my $mic = $lei->{net}->mic_get($uri);
> +       my $uri = $self->{uri} // die 'BUG: no {uri}';
> +       my $mic = $lei->{net}->mic_get($uri) // die 'BUG: no $mic';
>         my $folder = $uri->mailbox;
>         $uri->uidvalidity($mic->uidvalidity($folder));
>         my $lse = $lei->{lse}; # may be undef
> diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
> index c1af03a3..4de2583e 100644
> --- a/lib/PublicInbox/NetReader.pm
> +++ b/lib/PublicInbox/NetReader.pm
> @@ -685,7 +685,13 @@ sub mic_get {
>         }
>         my $mic = mic_new($self, $mic_arg, $sec, $uri);
>         $cached //= {}; # invalid placeholder if no cache enabled
> -       $mic && $mic->IsConnected ? ($cached->{$sec} = $mic) : undef;
> +       if ($mic && $mic->IsConnected) {
> +               $cached->{$sec} = $mic;
> +       } else {
> +               warn 'IMAP LastError: ',$mic->LastError, "\n" if $mic;
> +               warn "IMAP errno: $!\n" if $!;
> +               undef;
> +       }
>  }
>
>  sub imap_each {



-- 
Ricardo Ribalda

^ permalink raw reply	[relevance 71%]

* [PATCH] doc: document --jobs for `lei q' and `lei up'
  2022-09-09 17:44 64% ` [PATCH] lei: add diagnostics for IMAP writer failures Eric Wong
@ 2022-09-09 18:00 86%   ` Eric Wong
  2022-09-09 20:35 71%   ` [PATCH] lei: add diagnostics for IMAP writer failures Ricardo Ribalda
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2022-09-09 18:00 UTC (permalink / raw)
  To: meta; +Cc: Ricardo Ribalda

Eric Wong <e@80x24.org> wrote:
> There's also another patch coming to document the `--jobs|-j' CLI
> switch for `lei up' and `lei q', but trying `-j ,1' may help you
> if it's parallelism.  Note the comma before `1': it accepts
> `-j $Q,$W', where $Q is the number of query processes and $W is
> the number of LeiToMail writers.

-------8<-----
From: Eric Wong <e@80x24.org>
Subject: [PATCH] doc: document --jobs for `lei q' and `lei up'

These may be helpful for users on slow disks or limited IMAP
connections.
---
 Documentation/lei-q.pod  | 17 +++++++++++++++++
 Documentation/lei-up.pod |  4 +++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 2f0c3bc6..8134223e 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -124,6 +124,23 @@ of the same thread.
 TODO: Warning: this flag may become persistent and saved in
 lei/store unless an MUA unflags it!  (Behavior undecided)
 
+=item --jobs=QUERY_WORKERS[,WRITE_WORKERS]
+=item --jobs=,WRITE_WORKERS
+
+=item -j QUERY_WORKERS[,WRITE_WORKERS]
+=item -j ,WRITE_WORKERS
+
+Set the number of query and write worker processes for parallelism.
+
+C<QUERY_WORKERS> defaults to the number of CPUs available, but 4 per
+remote (HTTP/HTTPS) host.
+
+C<WRITE_WORKERS> defaults to the number of CPUs available for Maildir,
+IMAP/IMAPS, and mbox* destinations.
+
+Omitting C<QUERY_WORKERS> but leaving the comma (C<,>) allows
+one to only set C<WRITE_WORKERS>.
+
 =item --dedupe=STRATEGY
 
 =item -d STRATEGY
diff --git a/Documentation/lei-up.pod b/Documentation/lei-up.pod
index ac644a96..3b7c6f46 100644
--- a/Documentation/lei-up.pod
+++ b/Documentation/lei-up.pod
@@ -64,7 +64,9 @@ specified via C<lei q --only>.
 
 =item --mua=CMD
 
-C<--lock>, C<--alert>, and C<--mua> are all supported and
+=item --jobs QUERY_WORKERS[,WRITE_WORKERS]
+
+C<--lock>, C<--alert>, C<--mua>, and C<--jobs> are all supported and
 documented in L<lei-q(1)>.
 
 C<--mua> is incompatible with C<--all>.

^ permalink raw reply related	[relevance 86%]

* [PATCH] lei: add diagnostics for IMAP writer failures
  @ 2022-09-09 17:44 64% ` Eric Wong
  2022-09-09 18:00 86%   ` [PATCH] doc: document --jobs for `lei q' and `lei up' Eric Wong
  2022-09-09 20:35 71%   ` [PATCH] lei: add diagnostics for IMAP writer failures Ricardo Ribalda
  0 siblings, 2 replies; 200+ results
From: Eric Wong @ 2022-09-09 17:44 UTC (permalink / raw)
  To: Ricardo Ribalda; +Cc: meta

Ricardo Ribalda <ribalda@chromium.org> wrote:
> Hi
> 
> I am getting a lot of these messages when using lei in IMAP mode.
> 
> Nonetheless, the mail seems to arrive fine at its destination (on the
> first run, some of the mail was lost, but for small batches of mail
> it seems to work fine).
> 
> Is it a valid message or just a red herring?
> 
> I am using Debian testing.
> 
> Regards!
> 
> # https://lore.kernel.org/all/ limiting to 2022-09-07 10:03 +0000 and newer
> # /usr/local/google/home/ribalda/.local/share/lei/store 13/13
> # /usr/bin/curl -Sf -s -d ''
> https://lore.kernel.org/all/?x=m&t=1&q=((ribalda)+AND+rt%3A1660039651..)+AND+dt%3A20220907100321..
> 1285740 lei2mail 6 wq_worker: do_post_auth: Can't call method
> "uidvalidity" on an undefined value at
> /usr/share/perl5/PublicInbox/LeiToMail.pm line 313.

<snip>

> 1285735 lei2mail 1 wq_worker: write_mail: Can't use an undefined value
> as a subroutine reference at /usr/share/perl5/PublicInbox/LeiToMail.pm
> line 783.
> # https://lore.kernel.org/all/ 18/18
> # 14 written to imaps://imap.gmail.com/lei/me (31 matches)

I wonder if it's excessive parallelism for Gmail's IMAP.
I haven't tested IMAP destinations much...

Can you try the patch at the bottom?

There's also another patch coming to document the `--jobs|-j' CLI
switch for `lei up' and `lei q', but trying `-j ,1' may help you
if it's parallelism.  Note the comma before `1': it accepts
`-j $Q,$W', where $Q is the number of query processes and $W is
the number of LeiToMail writers.
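
To make the `-j $Q,$W' form concrete, here is a minimal sketch of how
such a spec could be split.  parse_jobspec is a hypothetical helper
written for this note, not the parser lei actually uses, and the
positive-integer check is an assumption about what counts as valid.
An omitted side comes back as undef so the caller can fall back to
its own default.

```perl
use strict;
use warnings;

# Hypothetical helper: split "4,2", "4" or ",1" into
# (query_jobs, writer_jobs); an omitted side is returned as undef.
sub parse_jobspec {
	my ($spec) = @_;
	my ($q, $w) = split(/,/, $spec // '', 2);
	for ($q, $w) {
		# treat an empty field (as in ",1") the same as a missing one
		$_ = undef if defined($_) && $_ eq '';
		next unless defined;
		/\A[1-9][0-9]*\z/ or die "invalid job count: `$_'\n";
	}
	($q, $w);
}

for my $spec ('4,2', '4', ',1') {
	my ($q, $w) = parse_jobspec($spec);
	printf("%-4s => query=%s writer=%s\n", $spec,
		$q // 'default', $w // 'default');
}
```

With this shape, `-j ,1' leaves the query side untouched and only
limits the writers, which is the case that matters for IMAP
destinations.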

-------8<-------
From: Eric Wong <e@80x24.org>
Subject: [PATCH] lei: add diagnostics for IMAP writer failures

This may help diagnose the problem with IMAP destinations
encountered at:
https://public-inbox.org/meta/CANiDSCsDfutAUMBLPZbxdyka+_jnhv+4YNYdL9QPRoC=wNUGCQ@mail.gmail.com/
---
 lib/PublicInbox/LeiToMail.pm | 4 ++--
 lib/PublicInbox/NetReader.pm | 8 +++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 2aa3977e..bc00b96a 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -310,8 +310,8 @@ sub _imap_write_cb ($$) {
 	my $dedupe = $lei->{dedupe};
 	$dedupe->prepare_dedupe if $dedupe;
 	my $append = $lei->{net}->can('imap_append');
-	my $uri = $self->{uri};
-	my $mic = $lei->{net}->mic_get($uri);
+	my $uri = $self->{uri} // die 'BUG: no {uri}';
+	my $mic = $lei->{net}->mic_get($uri) // die 'BUG: no $mic';
 	my $folder = $uri->mailbox;
 	$uri->uidvalidity($mic->uidvalidity($folder));
 	my $lse = $lei->{lse}; # may be undef
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index c1af03a3..4de2583e 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -685,7 +685,13 @@ sub mic_get {
 	}
 	my $mic = mic_new($self, $mic_arg, $sec, $uri);
 	$cached //= {}; # invalid placeholder if no cache enabled
-	$mic && $mic->IsConnected ? ($cached->{$sec} = $mic) : undef;
+	if ($mic && $mic->IsConnected) {
+		$cached->{$sec} = $mic;
+	} else {
+		warn 'IMAP LastError: ',$mic->LastError, "\n" if $mic;
+		warn "IMAP errno: $!\n" if $!;
+		undef;
+	}
 }
 
 sub imap_each {

^ permalink raw reply related	[relevance 64%]

* [PATCH 0/2] lei-related import tweaks
@ 2022-09-02 18:26 71% Eric Wong
  2022-09-02 18:26 69% ` [PATCH 1/2] lei/store: do not write info/refs file Eric Wong
  0 siblings, 1 reply; 200+ results
From: Eric Wong @ 2022-09-02 18:26 UTC (permalink / raw)
  To: meta

Just some minor nits I noticed in lei-land...

Eric Wong (2):
  lei/store: do not write info/refs file
  import: pass --quiet to `git gc' if STDERR isn't a tty

 lib/PublicInbox/Import.pm | 7 +++++--
 t/lei_store.t             | 5 ++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

^ permalink raw reply	[relevance 71%]

* [PATCH 1/2] lei/store: do not write info/refs file
  2022-09-02 18:26 71% [PATCH 0/2] lei-related import tweaks Eric Wong
@ 2022-09-02 18:26 69% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-09-02 18:26 UTC (permalink / raw)
  To: meta

That file is meant for dumb HTTP servers, so avoid wasting two
inodes on something that should never be served for private
email.
---
 lib/PublicInbox/Import.pm | 2 +-
 t/lei_store.t             | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index aef49033..2c8f310a 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -182,8 +182,8 @@ sub _update_git_info ($$) {
 		my $env = { GIT_INDEX_FILE => $index };
 		run_die([@cmd, qw(read-tree -m -v -i), $self->{ref}], $env);
 	}
-	eval { run_die([@cmd, 'update-server-info']) };
 	my $ibx = $self->{ibx};
+	eval { run_die([@cmd, 'update-server-info']) } if $ibx;
 	if ($ibx && $ibx->version == 1 && -d "$ibx->{inboxdir}/public-inbox" &&
 				eval { require PublicInbox::SearchIdx }) {
 		eval {
diff --git a/t/lei_store.t b/t/lei_store.t
index 40ad7800..5a5e5de0 100644
--- a/t/lei_store.t
+++ b/t/lei_store.t
@@ -1,5 +1,5 @@
 #!perl -w
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict;
 use v5.10.1;
@@ -149,4 +149,7 @@ EOM
 	is($mset->size, 1, 'rt:1.hour.ago.. works w/ local time');
 }
 
+is_deeply([glob("$store_dir/local/*.git/info/refs")], [],
+	'no info/refs in private lei/store');
+
 done_testing;

^ permalink raw reply related	[relevance 69%]

* [PATCH 1/3] Makefile.PL: add lei-reindex manpage
  @ 2022-08-30  9:10 71% ` Eric Wong
  0 siblings, 0 replies; 200+ results
From: Eric Wong @ 2022-08-30  9:10 UTC (permalink / raw)
  To: meta

I forgot to add this when I added the new command :x
---
 Makefile.PL | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile.PL b/Makefile.PL
index 67012d3e..ff03b615 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -53,7 +53,8 @@ $v->{-m1} = [ map {
 	lei-import lei-index lei-init lei-inspect lei-lcat
 	lei-ls-external lei-ls-label lei-ls-mail-source lei-ls-mail-sync
 	lei-ls-search lei-ls-watch lei-mail-diff lei-p2q lei-q
-	lei-rediff lei-refresh-mail-sync lei-rm lei-rm-watch lei-tag
+	lei-rediff lei-refresh-mail-sync lei-reindex
+	lei-rm lei-rm-watch lei-tag
 	lei-up)];
 $v->{-m5} = [ qw(public-inbox-config public-inbox-v1-format
 		public-inbox-v2-format public-inbox-extindex-format

^ permalink raw reply related	[relevance 71%]

* [PATCH 4/4] lei/store: reindex culls over-indexed messages
    2022-08-19  9:07 90% ` [PATCH 1/4] lei reindex: account for parallel lei/store users Eric Wong
@ 2022-08-19  9:07 71% ` Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2022-08-19  9:07 UTC (permalink / raw)
  To: meta

I may be the only lei user who has redundantly-indexed messages
needing this, though...
---
 lib/PublicInbox/LeiStore.pm | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 8e710540..57f0e013 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -344,6 +344,15 @@ sub _reindex_1 { # git->cat_async callback
 		my $eml = PublicInbox::Eml->new($bref);
 		$smsg->{-merge_vmd} = 1; # preserve existing keywords
 		$eidx->idx_shard($smsg->{num})->index_eml($eml, $smsg);
+	} elsif ($type eq 'missing') {
+		# pre-release/buggy lei may've indexed external-only msgs,
+		# try to correct that, here
+		warn("E: missing $hex, culling (ancient lei artifact?)\n");
+		$smsg->{to} = $smsg->{cc} = $smsg->{from} = '';
+		$smsg->{bytes} = 0;
+		$eidx->{oidx}->update_blob($smsg, '');
+		my $eml = PublicInbox::Eml->new("\r\n\r\n");
+		$eidx->idx_shard($smsg->{num})->index_eml($eml, $smsg);
 	} else {
 		warn("E: $type $hex\n");
 	}

^ permalink raw reply related	[relevance 71%]

* [PATCH 1/4] lei reindex: account for parallel lei/store users
  @ 2022-08-19  9:07 90% ` Eric Wong
  2022-08-19  9:07 71% ` [PATCH 4/4] lei/store: reindex culls over-indexed messages Eric Wong
  1 sibling, 0 replies; 200+ results
From: Eric Wong @ 2022-08-19  9:07 UTC (permalink / raw)
  To: meta

We need to call eidx_init in each git->cat_async callback
since another requestor may've stopped the shard processes.
---
 lib/PublicInbox/LeiStore.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 277ed6bd..8e710540 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -337,7 +337,8 @@ sub _docids_and_maybe_kw ($$) {
 
 sub _reindex_1 { # git->cat_async callback
 	my ($bref, $hex, $type, $size, $smsg) = @_;
-	my ($self, $eidx, $tl) = delete @$smsg{qw(-self -eidx -tl)};
+	my $self = delete $smsg->{-sto};
+	my ($eidx, $tl) = eidx_init($self);
 	$bref //= _lms_rw($self)->local_blob($hex, 1);
 	if ($bref) {
 		my $eml = PublicInbox::Eml->new($bref);
@@ -353,7 +354,7 @@ sub reindex_art {
 	my ($eidx, $tl) = eidx_init($self);
 	my $smsg = $eidx->{oidx}->get_art($art) // return;
 	return if $smsg->{bytes} == 0; # external-only message
-	@$smsg{qw(-self -eidx -tl)} = ($self, $eidx, $tl);
+	$smsg->{-sto} = $self;
 	$eidx->git->cat_async($smsg->{blob} // die("no blob (#$art)"),
 				\&_reindex_1, $smsg);
 }

^ permalink raw reply related	[relevance 90%]

2021-10-21 21:10     [PATCH 00/15] use RENAME_NOREPLACE on Linux 3.15+ Eric Wong
2021-10-21 21:10     ` [PATCH 03/15] t/lei-p2q: extra diagnostics Eric Wong
2023-09-21 10:23 71%   ` Eric Wong
2022-08-04  7:23     [PATCH] TODO: remove done items, adjust/add/abandon some Eric Wong
2022-12-09  1:41 65% ` FUSE3 vs read-write IMAP for lei Eric Wong
2023-02-20 19:27 71%   ` Eric Wong
2022-08-19  9:07     [PATCH 0/4] lei reindex-related stuff Eric Wong
2022-08-19  9:07 90% ` [PATCH 1/4] lei reindex: account for parallel lei/store users Eric Wong
2022-08-19  9:07 71% ` [PATCH 4/4] lei/store: reindex culls over-indexed messages Eric Wong
2022-08-30  9:10     [PATCH 0/3] misc doc updates I missed for 1.9 :x Eric Wong
2022-08-30  9:10 71% ` [PATCH 1/3] Makefile.PL: add lei-reindex manpage Eric Wong
2022-09-02 18:26 71% [PATCH 0/2] lei-related import tweaks Eric Wong
2022-09-02 18:26 69% ` [PATCH 1/2] lei/store: do not write info/refs file Eric Wong
2022-09-09 10:09     imap: "Can't use an undefined value as a subroutine reference" Ricardo Ribalda
2022-09-09 17:44 64% ` [PATCH] lei: add diagnostics for IMAP writer failures Eric Wong
2022-09-09 18:00 86%   ` [PATCH] doc: document --jobs for `lei q' and `lei up' Eric Wong
2022-09-09 20:35 71%   ` [PATCH] lei: add diagnostics for IMAP writer failures Ricardo Ribalda
2022-09-10  1:18 59%     ` [PATCH v2] lei: bail out earlier on " Eric Wong
2022-09-10 19:34 37%       ` Ricardo Ribalda
2022-09-10 19:50 71%         ` Eric Wong
2022-09-10 19:53 70%           ` Ricardo Ribalda
2022-09-10 20:19 71%             ` Eric Wong
2022-11-14  8:07 64%               ` [PATCH] lei q|up: limit default write --jobs for IMAP(S) Eric Wong
2022-09-10  1:35 71% [PATCH] lei: fix --help for --jobs with `up' and `q' Eric Wong
2022-09-30  9:21 71% [PATCH 0/4] fixes noticed while diagnosing t/lei-up.t Eric Wong
2022-09-30  9:21 56% ` [PATCH 2/4] t/lei-up: improve diagnostics for this test Eric Wong
2022-09-30  9:21 61% ` [PATCH 3/4] lei_to_mail: propagate errors to script/lei Eric Wong
2022-09-30 17:20 65% ` SQLite <3.8.3 was broken on fork (was: fixes noticed while diagnosing t/lei-up.t) Eric Wong
2022-10-01  0:33 60% [PATCH] lei: force --jobs=1,1 for SQLite < 3.8.3 Eric Wong
2022-10-30  4:03 70% [Need Help] lei add quotes at the search Hangbin Liu
2022-10-30  5:13 71% ` Eric Wong
2022-10-30  7:08 59%   ` Hangbin Liu
2022-10-30 23:06 63%     ` Eric Wong
2022-10-31  7:36 71%       ` Hangbin Liu
2022-10-31  7:47 71%       ` Hangbin Liu
2022-10-31 21:52 90% [PATCH] lei up: improve error for multiple lei.q values Eric Wong
2022-11-01  9:36 60% [PATCH] lei: fix globbing semantics to match end-of-filename Eric Wong
2022-11-03  0:48     [PATCH 0/6] doc: linkify HTML harder Eric Wong
2022-11-03  0:48 62% ` [PATCH 2/6] doc: lei: improve description of *-search commands Eric Wong
2022-11-03  0:48 71% ` [PATCH 3/6] doc: txt2pre: linkify "lei COMMAND" form Eric Wong
2022-11-03  0:48 90% ` [PATCH 5/6] doc: lei-import: link to lei-store-format(5) Eric Wong
2022-11-03  2:03 90%   ` Eric Wong
2022-11-03  0:48 90% ` [PATCH 6/6] txt2pre: linkify lei/store => lei-store-format.html Eric Wong
2022-12-01 11:21 90% [PATCH 0/2] lei - expanding relative paths for `lei up' Eric Wong
2022-12-01 11:21 67% ` [PATCH 1/2] lei: stricter external checks for valid $GIT_DIR/objects Eric Wong
2023-01-11 10:55     [PATCH] www: /$INBOX/$MSGID/d/ to diff reused Message-IDs Eric Wong
2023-01-11 11:00 42% ` [1/2 PATCH] hoist MailDiff and ContentDigestDbg out of lei Eric Wong
2023-01-17  7:18     [PATCH 00/12] improve process reaping Eric Wong
2023-01-17  7:19 39% ` [PATCH 11/12] ipc+lei: switch to awaitpid Eric Wong
2023-01-29 10:30     [PATCH 0/2] allow OpenSSL SHA-(1|256) use if installed Eric Wong
2023-01-29 10:30 63% ` [PATCH 2/2] content_digest_dbg: convert to arrayref and limit to lei Eric Wong
2023-01-29 22:58 71% [PATCH 0/2] fix xt/lei-auth-fail.t Eric Wong
2023-01-29 22:58 69% ` [PATCH 2/2] xt/lei-auth-fail: use valid label name Eric Wong
2023-01-31  0:05 70% [PATCH] lei: drop -watches and -lei_note_event from workers Eric Wong
2023-02-12  3:12 61% [PATCH] t/lei-refresh-mail-sync: avoid kill+sleep loop Eric Wong
2023-02-13 16:06 63% lei q -tt doesn't work properly? Maxim Mikityanskiy
2023-02-14  2:42 66% ` [PATCH] lei q: do not collapse threads with `-tt' Eric Wong
2023-02-26 12:17 71%   ` Maxim Mikityanskiy
2023-02-26 17:09 69%     ` Eric Wong
2023-02-26 17:15 71%       ` [PATCH] doc: note "lei q -tt" is broken with HTTP(S) remotes Eric Wong
2023-03-09 19:28     [PATCH 0/6] various doc updates Eric Wong
2023-03-09 19:28 71% ` [PATCH 4/6] doc: lei import: add hints about nntp.* and imap.* config options Eric Wong
2023-03-09 19:28 68% ` [PATCH 5/6] doc: lei config: update with --edit and --list examples Eric Wong
2023-03-23 21:45 46% [PATCH] lei: improve bash completion involving colons Eric Wong
2023-03-23 22:05 71% repeat `lei import' users? Eric Wong
2023-03-28  1:00 71% Issues with `lei` as non-root Louis DeLosSantos
2023-03-28  1:32 71% ` Eric Wong
2023-03-28  2:30 45%   ` Louis DeLosSantos
2023-03-28  2:52 71%     ` Eric Wong
2023-03-28  3:05 42%       ` Louis DeLosSantos
2023-03-28  3:38 71%         ` Eric Wong
2023-03-28  4:08 71%           ` Louis DeLosSantos
2023-03-28 10:53 71% [PATCH] t/lei-refresh-mail-sync: improve test reliability Eric Wong
2023-04-29  7:18 71% [PATCH] t/lei-import-nntp: dump $lei_err on failure Eric Wong
2023-06-08 18:26 71% [PATCH] t/lei.t: quiet newline warning on older Perls Eric Wong
2023-06-15  0:08 71% [PATCH] doc: lei q: document v2:$INBOX_DIR output format Eric Wong
2023-06-15  8:46 60% [PATCH] lei import: set +(L|kw) on already-imported blobs Eric Wong
2023-06-15  9:50 51% [PATCH] lei: make --dedupe=content always account for Message-IDs Eric Wong
2023-09-14 12:12 71% [PATCH] t/lei-mirror: do not bail out on `make help' failure Eric Wong
2023-09-14 23:10 64% [PATCH] lei: ensure --stdin sets %ENV and $current_lei Eric Wong
2023-09-15 10:11 63% [PATCH] lei: ensure we run DESTROY|END at daemon exit w/ kqueue Eric Wong
2023-09-15 21:08 67% RFC: lei searches managed by users in git Konstantin Ryabitsev
2023-09-15 22:47 55% ` Eric Wong
2023-09-22 18:37 66% [PATCH] t/lei-mirror: avoid make(1) jobserver warning Eric Wong
2023-09-22 20:33 69% lei interactive TUIs (ncurses/vim/emacs) Eric Wong
2023-11-09  4:14 71% ` Kyle Meyer
2023-09-22 21:13 71% [PATCH 0/4] small lei fixes Eric Wong
2023-09-22 21:13 90% ` [PATCH 1/4] lei blob|rediff: fix usage of lei->fail Eric Wong
2023-09-22 21:13 66% ` [PATCH 2/4] lei: improve ->fail internal API Eric Wong
2023-09-22 21:13 70% ` [PATCH 3/4] lei_to_mail: drop awkward duplication of $lei object Eric Wong
2023-09-22 21:13 65% ` [PATCH 4/4] lei: use File::Temp for listing saved searches Eric Wong
2023-09-24  5:42 69% [PATCH 0/6] lei config fixes and improvements Eric Wong
2023-09-24  5:42 56% ` [PATCH 1/6] lei: check git-config(1) failures Eric Wong
2023-09-24  5:42 71% ` [PATCH 2/6] lei view_text: used tied ProcessPipe for `git config' Eric Wong
2023-09-24  5:42 66% ` [PATCH 4/6] lei config: send `git config' errors to pager Eric Wong
2023-09-24  5:42 30% ` [PATCH 5/6] lei: fix `-c NAME=VALUE' config support Eric Wong
2023-09-24 21:08     [PATCH 0/4] various CentOS 7.x related fixes Eric Wong
2023-09-24 21:08 71% ` [PATCH 3/4] lei: use scalar %SIG assignment Eric Wong
2023-09-27  6:02     [PATCH 0/3] more process management cleanups + bugfix Eric Wong
2023-09-27  6:02 46% ` [PATCH 3/3] lei: don't gzip --rsyncable by default for mbox* Eric Wong
2023-09-30  0:36 71% [PATCH 0/2] lei: support reading inboxes & extindex w/o search Eric Wong
2023-09-30  0:36 42% ` [PATCH 2/2] lei convert: support reading from v1, v2, and extindex Eric Wong
2023-09-30 16:17 71% [PATCH] t/lei-convert: fix uninitialized variable w/o pigz Eric Wong
2023-10-01  9:54     [PATCH 00/13] various warning/diagnostic fixes Eric Wong
2023-10-01  9:54 64% ` [PATCH 06/13] lei rediff: `git diff -O<order-file>' support Eric Wong
2023-10-01  9:54 71% ` [PATCH 07/13] lei: correct exit signal Eric Wong
2023-10-01  9:54 71% ` [PATCH 08/13] lei mail-diff: don't remove temporary subdirectory Eric Wong
2023-10-01  9:54 71% ` [PATCH 12/13] lei: ->fail only allows integer exit codes Eric Wong
2023-10-01  9:54 50% ` [PATCH 13/13] lei: deal with clients with blocked stderr Eric Wong
2023-10-01 22:29 50% [PATCH] lei up: fix missing -t/--threads matches w/ saved search Eric Wong
2023-10-02 14:58 68% [PATCH] lei up: faster non-thread, single-source incremental query Eric Wong
2023-10-02 15:00 42% [PATCH] lei: do label/keyword parsing in optparse Eric Wong
2023-10-02 20:14 62% ` Eric Wong
2023-10-03  6:43     [PATCH 0/8] IMAP/NNTP client improvements Eric Wong
2023-10-03  6:43 90% ` [PATCH 5/8] lei: workers exit after they tell lei-daemon Eric Wong
2023-10-03  6:43 62% ` [PATCH 7/8] xt/lei-onion-convert: test TLS + SOCKS Eric Wong
2023-10-03 16:18 47% [PATCH] t/lei-q-save: quiet `no email in From: ...' warnings Eric Wong
2023-10-04  3:49 64% [PATCH 00/21] lei + IPC related stuff Eric Wong
2023-10-04  3:49 68% ` [PATCH 01/21] lei: drop stores explicitly at daemon shutdown Eric Wong
2023-10-04  3:49 65% ` [PATCH 05/21] lei: close DirIdle (inotify) early " Eric Wong
2023-10-04  3:49 33% ` [PATCH 07/21] lei: do_env combines fchdir and local Eric Wong
2023-10-04  3:49 43% ` [PATCH 08/21] lei: get rid of l2m_progress PktOp callback Eric Wong
2023-10-04  3:49 71% ` [PATCH 10/21] lei: reuse PublicInbox::Config::noop Eric Wong
2023-10-04  3:49 69% ` [PATCH 11/21] lei: keep signals blocked on daemon shutdown Eric Wong
2023-10-04  3:49 67% ` [PATCH 20/21] lei: document and local-ize $OPT hashref Eric Wong
2023-10-07 21:24     [PATCH 0/9] more process-related cleanups Eric Wong
2023-10-07 21:24 71% ` [PATCH 2/9] lei: do not issue sto->done if socket is inactive Eric Wong
2023-10-07 21:24 47% ` [PATCH 3/9] lei: always use async `done' requests to store Eric Wong
2023-10-08  1:58 71%   ` Eric Wong
2023-10-08  5:49 66%   ` [PATCH 2.5/9] lei: fix implicit stdin support for pipes Eric Wong
2023-10-08 18:54 48%   ` [PATCHv2 3/9] lei: always use async `done' requests to store Eric Wong
2023-10-11  7:20 65% [PATCH 0/9] lei + import-related updates Eric Wong
2023-10-11  7:20 56% ` [PATCH 1/9] lei rediff: use ProcessIO for --drq support Eric Wong
2023-10-11  7:20 86% ` [PATCH 8/9] lei blob: run cat_blob on lei/store for pending blobs Eric Wong
2023-10-11  7:20 40% ` [PATCH 9/9] lei import|tag|rm: support --commit-delay=SECONDS Eric Wong
2023-10-12  0:21 66% [PATCH] lei: quiet excessive write/seen messages Eric Wong
2023-10-16 11:33     [PATCH 1/3] doc: fix some typos and grammar Štěpán Němec
2023-10-16 11:33 71% ` [PATCH 2/3] doc: lei-q: drop stale TODO comment (fixed in 1f1b1f0e22f7) Štěpán Němec
2023-10-16 21:17 71%   ` Eric Wong
2023-10-17 10:11 71% [PATCH 0/3] lei: stdin handling improvements Eric Wong
2023-10-17 10:11 51% ` [PATCH 1/3] lei: consolidate stdin slurp, fix warnings Eric Wong
2023-10-17 23:37     [PATCH 00/30] autodie-ification and code simplifications Eric Wong
2023-10-17 23:38 71% ` [PATCH 18/30] t/lei-up: additional diagnostics for match failures Eric Wong
2023-10-17 23:38 50% ` [PATCH 26/30] lei: use autodie where appropriate Eric Wong
2023-10-19  1:14 59% ` [PATCH 31/30] lei: simplify startq/au_done wakeup notifications Eric Wong
2023-10-27  1:14 88% [PATCH] lei: don't exit lei-daemon on ovv_begin failure Eric Wong
2023-11-02 21:16 71% lei - dfn filters for net/* catching drivers/net/* David Wei
2023-11-02 21:27 71% ` Eric Wong
2023-11-03 18:29 71%   ` David Wei
2023-11-07 13:01 44% [PATCH] lei: fix SIGPIPE on large result sets to pager Eric Wong
2023-11-09 10:09     [PATCH 00/13] misc error handling stuff and simplifications Eric Wong
2023-11-09 10:09 68% ` [PATCH 02/13] lei: use cached $daemon_pid when possible Eric Wong
2023-11-09 10:09 70% ` [PATCH 03/13] lei: reuse FDs atfork and close explicitly Eric Wong
2023-11-09 10:09 70% ` [PATCH 06/13] lei ls-mail-source: gracefully handle network failures Eric Wong
2023-11-09 10:09 55% ` [PATCH 12/13] lei: get rid of autoreap usage Eric Wong
2023-11-10 22:26 62% [PATCH] t/lei-import: skip strace for restricted systems Eric Wong
2023-11-11 22:44 70% [Bug] lei: extra quotes inserted into query with AND/OR Henrik Grimler
2023-11-12  0:10 71% ` Eric Wong
2023-11-12  8:23 71%   ` Henrik Grimler
2023-11-12  9:02 69%     ` Eric Wong
2023-11-12 11:59 71%       ` Henrik Grimler
2023-11-12 13:24 71%         ` Eric Wong
2023-11-12 13:12 58% [PATCH] lei: don't read --stdin terminals from daemon Eric Wong
2023-11-15  1:04     [PATCH 0/2] some CentOS fixes Eric Wong
2023-11-15  1:04 70% ` [PATCH 1/2] lei: use -signal numbers for old Perl Eric Wong
2023-11-15  1:04 71% ` [PATCH 2/2] t/lei-import: account for more verbose error Eric Wong
2023-11-15  9:21 71% [PATCH 0/4] lei convert: support idempotent v2 outputs Eric Wong
2023-11-15  9:21 71% ` [PATCH 1/4] lei: fix idempotent STDERR redirect in workers Eric Wong
2023-11-15  9:21 47% ` [PATCH 2/4] lei convert: fix repeat and idempotent v2 output Eric Wong
2023-11-15  9:21 54% ` [PATCH 3/4] lei: avoid extra fork for v2 outputs Eric Wong
2023-11-15  9:21 64% ` [PATCH 4/4] lei q|up|convert: common finish_output to detect errors Eric Wong
2023-11-28 17:36     [PATCH 0/4] non-cindex-related stuff Eric Wong
2023-11-28 17:36 59% ` [PATCH 1/4] lei q: fix --no-import-before completion + docs Eric Wong
2023-12-12 11:17 71% lei up without creating xapian database Aneesh Kumar K.V (IBM)
2023-12-12 11:41 71% ` Eric Wong
2023-12-13  0:50     [PATCH 00/14] Alpine Linux support Eric Wong
2023-12-13  0:50 71% ` [PATCH 05/14] lei inspect: drop unneeded strftime import Eric Wong
2023-12-13  0:50 70% ` [PATCH 14/14] t/lei-import: relax EIO regexp Eric Wong
2023-12-16 11:13 71% [PATCH 0/2] lei bugfixes Eric Wong
2023-12-16 11:13 68% ` [PATCH 1/2] lei index: support +L: labels Eric Wong
2023-12-16 11:13 64% ` [PATCH 2/2] lei: use ->child_error API properly Eric Wong
2023-12-16 13:09 20% [PATCH] lei: support reading MH for convert+import+index Eric Wong
2023-12-16 16:15 71% ` Konstantin Ryabitsev
2023-12-16 18:17 71%   ` Eric Wong
2023-12-17  7:59 69%     ` Eric Wong
2023-12-29 18:05 19% ` [PATCH v2] " Eric Wong
2024-01-03 10:23 32% [PATCH] lei: MH: support inotify to detect updates Eric Wong
2024-01-10 11:18 71% [PATCH 0/3] lei NNTP + error handling fixes Eric Wong
2024-01-10 11:18 60% ` [PATCH 2/3] lei+net_reader: show NNTP message in more failures Eric Wong
2024-01-30  6:31 71% [PATCH 0/2] watch: add MH support + lei doc Eric Wong
2024-01-30  6:31 69% ` [PATCH 2/2] doc/lei-mail-formats: update MH read-only status Eric Wong
2024-01-31 10:20     [PATCH 0/5] more MH-related updates Eric Wong
2024-01-31 10:20 65% ` [PATCH 1/5] lei convert: explicitly allow --sort for inputs Eric Wong
2024-01-31 10:20 55% ` [PATCH 4/5] scripts/import_*: update usage to include lei tips Eric Wong
2024-01-31 10:20 59% ` [PATCH 5/5] lei: sort MH inputs sequentially by default Eric Wong
2024-02-09 16:06 70% lei up can't fetch new thread messages when searching by mid Pratyush Yadav
2024-02-09 17:35 71% ` Eric Wong
2024-02-12 11:11 71%   ` Pratyush Yadav
2024-03-08 21:05     [PATCH 0/2] fixes noticed while tracking down fast-import failures Eric Wong
2024-03-08 21:05 65% ` [PATCH 1/2] lei: prevent empty {bytes} field in saved search Eric Wong
2024-03-13 15:35 71% Lei exception Gonsolo
2024-03-13 19:20 71% ` Eric Wong
     [not found]       ` <CANL0fFSMQ1YL1a8PEpU39pYQ7d6vmmndughvJVue=SWNYNdqGQ@mail.gmail.com>
2024-03-14 14:46 71%     ` Eric Wong
2024-03-15 17:36 71%       ` Gonsolo
2024-04-04 19:04 71% v1.9.0 : `ls-search' is not an lei command Josh Steadmon
2024-04-04 19:55 71% ` Eric Wong
2024-04-09 19:59 71%   ` Josh Steadmon
2024-04-11 18:58 71% [PATCH] lei blob: fix attachment extraction for unimported||inflight Eric Wong
2024-04-11 22:46 65% lei-up doesn't output replies to matching thread Josh Steadmon
2024-04-12  2:07 71% ` Eric Wong
2024-04-12  2:01 51% [PATCH] lei q: support --thread-id=$MSGID || -T $MSGID Eric Wong
2024-04-12  8:03 71% ` Štěpán Němec
2024-04-12  9:43 71%   ` Eric Wong
2024-04-12 18:04 71% [PATCH 0/3] some lei fixes Eric Wong
2024-04-12 18:04 71% ` [PATCH 3/3] lei: remove leftover debugging message Eric Wong
2024-04-16 20:56 71% [PATCH 0/4] lei parallelism fixes Eric Wong
2024-04-16 20:56 71% ` [PATCH 1/4] v2 + lei/store: always wait for fast-import checkpoint Eric Wong
2024-04-16 20:56 65% ` [PATCH 2/4] lei: use ->barrier to commit to lei/store Eric Wong
2024-04-16 20:56 63% ` [PATCH 3/4] lei/store: stop shard workers + cat-file on idle Eric Wong
2024-04-17  9:34 60%   ` [PATCH v2 " Eric Wong
2024-04-16 20:56 53% ` [PATCH 4/4] lei: use async barrier for --import-before Eric Wong
2024-04-18  6:24 71% Sharing lei searches Gonsolo
2024-04-18 10:26 71% ` Eric Wong
2024-04-18 15:17 71%   ` Gonsolo
