unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
* [PATCH 0/9] lei: a bunch of random stuff
@ 2021-09-18  9:33 Eric Wong
  2021-09-18  9:33 ` [PATCH 1/9] lei: lock worker counts Eric Wong
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

The unique timers stuff will be used for "lei up" polling,
as will 9/9 to improve "lei up" usability.

The net_reader changes were noticed while getting imaps://
to work with socks5h:// (not just imap://).

There's still a lot of mail_sync stuff going on, but it's
getting closer...

Eric Wong (9):
  lei: lock worker counts
  lei_mail_sync: rely on flock(2), avoid IPC
  lei_mail_sync: set nodatacow on btrfs
  ds: support add unique timers
  net_reader: tie SocksDebug to {imap,nntp}.Debug
  net_reader: detect IMAP failures earlier
  net_reader: support imaps:// w/ socks5h:// proxy
  net_reader: set SO_KEEPALIVE on all Net::NNTP sockets
  lei up: automatically use dt: for remote externals

 Documentation/lei-up.pod              |  15 ++++
 lib/PublicInbox/DS.pm                 | 100 +++++++++++++-------------
 lib/PublicInbox/LEI.pm                |  40 +++++------
 lib/PublicInbox/LeiExportKw.pm        |  32 ++++-----
 lib/PublicInbox/LeiForgetMailSync.pm  |   6 +-
 lib/PublicInbox/LeiImport.pm          |   8 +--
 lib/PublicInbox/LeiInput.pm           |   2 +-
 lib/PublicInbox/LeiInspect.pm         |   5 +-
 lib/PublicInbox/LeiLsMailSource.pm    |   3 +-
 lib/PublicInbox/LeiLsMailSync.pm      |   3 +-
 lib/PublicInbox/LeiLsSearch.pm        |   2 +-
 lib/PublicInbox/LeiMailSync.pm        |  51 ++++++++++---
 lib/PublicInbox/LeiNoteEvent.pm       |  31 ++++----
 lib/PublicInbox/LeiRefreshMailSync.pm |  35 ++++-----
 lib/PublicInbox/LeiRm.pm              |   2 +-
 lib/PublicInbox/LeiSavedSearch.pm     |   1 +
 lib/PublicInbox/LeiStore.pm           |  39 +---------
 lib/PublicInbox/LeiTag.pm             |   3 +-
 lib/PublicInbox/LeiToMail.pm          |  10 ++-
 lib/PublicInbox/LeiUp.pm              |   2 +-
 lib/PublicInbox/LeiXSearch.pm         |  50 ++++++++++---
 lib/PublicInbox/NetReader.pm          |  26 ++++---
 t/lei-q-remote-import.t               |   4 ++
 23 files changed, 259 insertions(+), 211 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/9] lei: lock worker counts
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 2/9] lei_mail_sync: rely on flock(2), avoid IPC Eric Wong
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

It doesn't seem worthwhile to change worker counts dynamically
on a per-command-basis with lei, and I don't know how such an
interface would even work...
---
 lib/PublicInbox/LEI.pm                | 3 ++-
 lib/PublicInbox/LeiExportKw.pm        | 1 -
 lib/PublicInbox/LeiImport.pm          | 1 -
 lib/PublicInbox/LeiLsMailSource.pm    | 3 +--
 lib/PublicInbox/LeiLsSearch.pm        | 2 +-
 lib/PublicInbox/LeiRefreshMailSync.pm | 4 +---
 lib/PublicInbox/LeiRm.pm              | 2 +-
 lib/PublicInbox/LeiTag.pm             | 3 +--
 8 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 9794497b..41e761f8 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -258,7 +258,7 @@ our %CMD = ( # sorted in order of importance/use:
 	 @c_opt ],
 'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
-	qw(stdin| offset=i recursive|r exclude=s include|I=s jobs=s new-only
+	qw(stdin| offset=i recursive|r exclude=s include|I=s new-only
 	lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!),
 	@net_opt, @c_opt ],
 'forget-mail-sync' => [ 'LOCATION...',
@@ -627,6 +627,7 @@ sub workers_start {
 	my $end = $lei->pkt_op_pair;
 	my $ident = $wq->{-wq_ident} // "lei-$lei->{cmd} worker";
 	$flds->{lei} = $lei;
+	$wq->{-wq_nr_workers} //= $jobs; # lock, no incrementing
 	$wq->wq_workers_start($ident, $jobs, $lei->oldset, $flds);
 	delete $lei->{pkt_op_p};
 	my $op_c = delete $lei->{pkt_op_c};
diff --git a/lib/PublicInbox/LeiExportKw.pm b/lib/PublicInbox/LeiExportKw.pm
index d37f3768..8b8aa373 100644
--- a/lib/PublicInbox/LeiExportKw.pm
+++ b/lib/PublicInbox/LeiExportKw.pm
@@ -124,7 +124,6 @@ EOM
 	my $ops = {};
 	$sto->write_prepare($lei);
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
-	$self->{-wq_nr_workers} = $j // 1; # locked
 	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
 	$lei->{wq1} = $self;
 	$lei->{-err_type} = 'non-fatal';
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 7c563bd8..b1cb3940 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -101,7 +101,6 @@ sub do_import_index ($$@) {
 	}
 	my $ops = {};
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
-	$self->{-wq_nr_workers} = $j // 1; # locked
 	$lei->{-eml_noisy} = 1;
 	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
 	$lei->{wq1} = $self;
diff --git a/lib/PublicInbox/LeiLsMailSource.pm b/lib/PublicInbox/LeiLsMailSource.pm
index f012e10e..bcb1838e 100644
--- a/lib/PublicInbox/LeiLsMailSource.pm
+++ b/lib/PublicInbox/LeiLsMailSource.pm
@@ -96,8 +96,7 @@ sub lei_ls_mail_source {
 	$lei->start_pager if -t $lei->{1};
 	my $ops = {};
 	$lei->{auth}->op_merge($ops, $self);
-	my $j = $self->{-wq_nr_workers} = 1; # locked
-	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
+	(my $op_c, $ops) = $lei->workers_start($self, 1, $ops);
 	$lei->{wq1} = $self;
 	$lei->{-err_type} = 'non-fatal';
 	net_merge_all_done($self) unless $lei->{auth};
diff --git a/lib/PublicInbox/LeiLsSearch.pm b/lib/PublicInbox/LeiLsSearch.pm
index 70136135..aebf0184 100644
--- a/lib/PublicInbox/LeiLsSearch.pm
+++ b/lib/PublicInbox/LeiLsSearch.pm
@@ -71,7 +71,7 @@ sub do_ls_search_long {
 
 sub bg_worker ($$$) {
 	my ($lei, $pfx, $json) = @_;
-	my $self = bless { -wq_nr_workers => 1, json => $json }, __PACKAGE__;
+	my $self = bless { json => $json }, __PACKAGE__;
 	my ($op_c, $ops) = $lei->workers_start($self, 1);
 	$lei->{wq1} = $self;
 	$self->wq_io_do('do_ls_search_long', [], $pfx);
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 09a7ead0..cdd99725 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -84,11 +84,9 @@ EOM
 	my $self = bless { missing_ok => 1 }, __PACKAGE__;
 	$lei->{opt}->{'mail-sync'} = 1; # for prepare_inputs
 	$self->prepare_inputs($lei, \@folders) or return;
-	my $j = $lei->{opt}->{jobs} || scalar(@{$self->{inputs}}) || 1;
 	my $ops = {};
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
-	$self->{-wq_nr_workers} = $j // 1; # locked
-	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
+	(my $op_c, $ops) = $lei->workers_start($self, 1, $ops);
 	$lei->{wq1} = $self;
 	$lei->{-err_type} = 'non-fatal';
 	net_merge_all_done($self) unless $lei->{auth};
diff --git a/lib/PublicInbox/LeiRm.pm b/lib/PublicInbox/LeiRm.pm
index 778fa1de..3371f3ed 100644
--- a/lib/PublicInbox/LeiRm.pm
+++ b/lib/PublicInbox/LeiRm.pm
@@ -32,7 +32,7 @@ sub lei_rm {
 	my ($lei, @inputs) = @_;
 	$lei->_lei_store(1)->write_prepare($lei);
 	$lei->{opt}->{'in-format'} //= 'eml';
-	my $self = bless { -wq_nr_workers => 1 }, __PACKAGE__;
+	my $self = bless {}, __PACKAGE__;
 	$self->prepare_inputs($lei, \@inputs) or return;
 	my ($op_c, $ops) = $lei->workers_start($self, 1);
 	$lei->{wq1} = $self;
diff --git a/lib/PublicInbox/LeiTag.pm b/lib/PublicInbox/LeiTag.pm
index 44d77b88..c4f5ecff 100644
--- a/lib/PublicInbox/LeiTag.pm
+++ b/lib/PublicInbox/LeiTag.pm
@@ -50,8 +50,7 @@ sub lei_tag { # the "lei tag" method
 		return $lei->fail('no keywords or labels specified');
 	my $ops = {};
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
-	my $j = $self->{-wq_nr_workers} = 1; # locked for now
-	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
+	(my $op_c, $ops) = $lei->workers_start($self, 1, $ops);
 	$lei->{wq1} = $self;
 	$lei->{-err_type} = 'non-fatal';
 	net_merge_all_done($self) unless $lei->{auth};

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/9] lei_mail_sync: rely on flock(2), avoid IPC
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
  2021-09-18  9:33 ` [PATCH 1/9] lei: lock worker counts Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 3/9] lei_mail_sync: set nodatacow on btrfs Eric Wong
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

Since 44917fdd24a8bec1 ("lei_mail_sync: do not use transactions"),
relying on lei/store to serialize access was a pointless endeavor.

Rely on flock(2) to serialize multiple writers since (in my
experience) it's the easiest way to deal with parallel writers
when using SQLite.  This allows us to simplify existing callers
while speeding up 'lei refresh-mail-sync --all=local' by 5% or
so.
---
 lib/PublicInbox/LEI.pm                | 30 +++++++----------
 lib/PublicInbox/LeiExportKw.pm        | 31 ++++++++----------
 lib/PublicInbox/LeiForgetMailSync.pm  |  6 ++--
 lib/PublicInbox/LeiImport.pm          |  7 ++--
 lib/PublicInbox/LeiInput.pm           |  2 +-
 lib/PublicInbox/LeiInspect.pm         |  5 ++-
 lib/PublicInbox/LeiLsMailSync.pm      |  3 +-
 lib/PublicInbox/LeiMailSync.pm        | 46 +++++++++++++++++++++------
 lib/PublicInbox/LeiNoteEvent.pm       | 26 +++++++--------
 lib/PublicInbox/LeiRefreshMailSync.pm | 31 ++++++++++--------
 lib/PublicInbox/LeiStore.pm           | 39 ++---------------------
 lib/PublicInbox/LeiToMail.pm          |  6 ++--
 12 files changed, 107 insertions(+), 125 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 41e761f8..053b6174 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -1442,20 +1442,16 @@ sub refresh_watches {
 	}
 
 	# add all known Maildir folders as implicit watches
-	my $sto = $lei->_lei_store;
-	my $renames = 0;
-	if (my $lms = $sto ? $sto->search->lms : undef) {
+	my $lms = $lei->lms;
+	if ($lms) {
+		$lms->lms_write_prepare;
 		for my $d ($lms->folders('maildir:')) {
 			substr($d, 0, length('maildir:')) = '';
-			my $cd = canonpath_harder($d);
-			my $f = "maildir:$cd";
 
 			# fixup old bugs while we're iterating:
-			if ($d ne $cd) {
-				$sto->ipc_do('lms_rename_folder',
-						"maildir:$d", $f);
-				++$renames;
-			}
+			my $cd = canonpath_harder($d);
+			my $f = "maildir:$cd";
+			$lms->rename_folder("maildir:$d", $f) if $d ne $cd;
 			next if $watches->{$f}; # may be set to pause
 			require PublicInbox::LeiWatch;
 			$watches->{$f} = PublicInbox::LeiWatch->new($f);
@@ -1463,7 +1459,6 @@ sub refresh_watches {
 			add_maildir_watch($cd, $cfg_f);
 		}
 	}
-	$lei->sto_done_request if $renames;
 	if ($old) { # cull old non-existent entries
 		for my $url (keys %$old) {
 			next if exists $seen{$url};
@@ -1490,13 +1485,12 @@ sub git_oid {
 	git_sha(1, $eml);
 }
 
-sub lms { # read-only LeiMailSync
-	my ($lei) = @_;
-	my $lse = $lei->{lse} // do {
-		my $sto = $lei->{sto} // _lei_store($lei);
-		$sto ? $sto->search : undef
-	};
-	$lse ? $lse->lms : undef;
+sub lms {
+	my ($lei, $rw) = @_;
+	my $sto = $lei->{sto} // _lei_store($lei) // return;
+	require PublicInbox::LeiMailSync;
+	my $f = "$sto->{priv_eidx}->{topdir}/mail_sync.sqlite3";
+	(-f $f || $rw) ? PublicInbox::LeiMailSync->new($f) : undef;
 }
 
 sub sto_done_request { # only call this from lei-daemon process (not workers)
diff --git a/lib/PublicInbox/LeiExportKw.pm b/lib/PublicInbox/LeiExportKw.pm
index 8b8aa373..8c5fbc13 100644
--- a/lib/PublicInbox/LeiExportKw.pm
+++ b/lib/PublicInbox/LeiExportKw.pm
@@ -40,7 +40,7 @@ sub export_kw_md { # LeiMailSync->each_src callback
 			if (!unlink($src) and $! != ENOENT) {
 				$lei->child_error(1, "E: unlink($src): $!");
 			}
-			$lei->{sto}->ipc_do('lms_mv_src', "maildir:$mdir",
+			$self->{lms}->mv_src("maildir:$mdir",
 						$oidbin, $id, $bn);
 			return; # success anyways if link(2) worked
 		} elsif ($! == EEXIST) { # lost race with lei/store?
@@ -55,7 +55,7 @@ sub export_kw_md { # LeiMailSync->each_src callback
 	my $src = "$mdir/{".join(',', @try)."}/$$id";
 	$lei->child_error(1, "link($src -> $dst) ($oidhex): $e");
 	for (@try) { return if -e "$mdir/$_/$$id" }
-	$lei->{sto}->ipc_do('lms_clear_src', "maildir:$mdir", $id);
+	$self->{lms}->clear_src("maildir:$mdir", $id);
 }
 
 sub export_kw_imap { # LeiMailSync->each_src callback
@@ -67,18 +67,17 @@ sub export_kw_imap { # LeiMailSync->each_src callback
 # overrides PublicInbox::LeiInput::input_path_url
 sub input_path_url {
 	my ($self, $input, @args) = @_;
-	my $lms = $self->{-lms_ro} //= $self->{lse}->lms;
+	$self->{lms}->lms_write_prepare;
 	if ($input =~ /\Amaildir:(.+)/i) {
 		my $mdir = $1;
 		require PublicInbox::LeiToMail; # kw2suffix
-		$lms->each_src($input, \&export_kw_md, $self, $mdir);
+		$self->{lms}->each_src($input, \&export_kw_md, $self, $mdir);
 	} elsif ($input =~ m!\Aimaps?://!i) {
 		my $uri = PublicInbox::URIimap->new($input);
 		my $mic = $self->{nwr}->mic_for_folder($uri);
-		$lms->each_src($$uri, \&export_kw_imap, $self, $mic);
+		$self->{lms}->each_src($$uri, \&export_kw_imap, $self, $mic);
 		$mic->expunge;
 	} else { die "BUG: $input not supported" }
-	my $wait = $self->{lei}->{sto}->ipc_do('done');
 }
 
 sub lei_export_kw {
@@ -86,26 +85,25 @@ sub lei_export_kw {
 	my $sto = $lei->_lei_store or return $lei->fail(<<EOM);
 lei/store uninitialized, see lei-import(1)
 EOM
-	my $lse = $sto->search;
-	my $lms = $lse->lms or return $lei->fail(<<EOM);
+	my $lms = $lei->lms or return $lei->fail(<<EOM);
 lei mail_sync uninitialized, see lei-import(1)
 EOM
-	my $opt = $lei->{opt};
-	if (defined(my $all = $opt->{all})) { # --all=<local|remote>
+	if (defined(my $all = $lei->{opt}->{all})) { # --all=<local|remote>
 		$lms->group2folders($lei, $all, \@folders) or return;
+		@folders = grep(/\A(?:maildir|imaps?):/i, @folders);
 	} else {
 		my $err = $lms->arg2folder($lei, \@folders);
 		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
 		return $lei->fail($err->{fail}) if $err->{fail};
 	}
-	my $self = bless { lse => $lse }, __PACKAGE__;
+	$lms->lms_pause;
+	my $self = bless { lse => $sto->search, lms => $lms }, __PACKAGE__;
 	$lei->{opt}->{'mail-sync'} = 1; # for prepare_inputs
 	$self->prepare_inputs($lei, \@folders) or return;
-	my $j = $opt->{jobs} // scalar(@{$self->{inputs}}) || 1;
 	if (my @ro = grep(!/\A(?:maildir|imaps?):/i, @folders)) {
 		return $lei->fail("cannot export to read-only folders: @ro");
 	}
-	my $m = $opt->{mode} // 'merge';
+	my $m = $lei->{opt}->{mode} // 'merge';
 	if ($m eq 'merge') { # default
 		$self->{-merge_kw} = 1;
 	} elsif ($m eq 'set') {
@@ -120,11 +118,9 @@ EOM
 		$self->{imap_mod_kw} = $net->can($self->{-merge_kw} ?
 					'imap_add_kw' : 'imap_set_kw');
 	}
-	undef $lms; # for fork
 	my $ops = {};
-	$sto->write_prepare($lei);
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
-	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
+	(my $op_c, $ops) = $lei->workers_start($self, 1, $ops);
 	$lei->{wq1} = $self;
 	$lei->{-err_type} = 'non-fatal';
 	net_merge_all_done($self) unless $lei->{auth};
@@ -133,8 +129,7 @@ EOM
 
 sub _complete_export_kw {
 	my ($lei, @argv) = @_;
-	my $sto = $lei->_lei_store or return;
-	my $lms = $sto->search->lms or return;
+	my $lms = $lei->lms or return;
 	my $match_cb = $lei->complete_url_prepare(\@argv);
 	map { $match_cb->($_) } $lms->folders;
 }
diff --git a/lib/PublicInbox/LeiForgetMailSync.pm b/lib/PublicInbox/LeiForgetMailSync.pm
index 2b4e58a9..701f48d2 100644
--- a/lib/PublicInbox/LeiForgetMailSync.pm
+++ b/lib/PublicInbox/LeiForgetMailSync.pm
@@ -15,13 +15,11 @@ use PublicInbox::LeiExportKw;
 sub lei_forget_mail_sync {
 	my ($lei, @folders) = @_;
 	my $lms = $lei->lms or return;
-	my $sto = $lei->_lei_store or return; # may disappear due to race
-	$sto->write_prepare($lei);
+	$lms->lms_write_prepare;
 	my $err = $lms->arg2folder($lei, \@folders);
 	$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
 	return $lei->fail($err->{fail}) if $err->{fail};
-	$sto->ipc_do('lms_forget_folders', @folders);
-	$lei->sto_done_request;
+	$lms->forget_folders(@folders);
 }
 
 *_complete_forget_mail_sync = \&PublicInbox::LeiExportKw::_complete_export_kw;
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index b1cb3940..9084d771 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -36,7 +36,7 @@ sub pmdir_cb { # called via wq_io_do from LeiPmdir->each_mdir_fn
 	my $kw = PublicInbox::MdirReader::flags2kw($fl);
 	substr($folder, 0, 0) = 'maildir:'; # add prefix
 	my $lse = $self->{lse} //= $self->{lei}->{sto}->search;
-	my $lms = $self->{-lms_ro} //= $lse->lms; # may be 0 or undef
+	my $lms = $self->{-lms_ro} //= $self->{lei}->lms; # may be 0 or undef
 	my $oidbin = $lms ? $lms->name_oidbin($folder, $bn) : undef;
 	my @docids = defined($oidbin) ? $lse->over->oidbin_exists($oidbin) : ();
 	my $vmd = $self->{-import_kw} ? { kw => $kw } : undef;
@@ -83,7 +83,7 @@ sub do_import_index ($$@) {
 		# $j = $net->net_concurrency($j); TODO
 		if ($lei->{opt}->{incremental} // 1) {
 			$net->{incremental} = 1;
-			$net->{-lms_ro} = $sto->search->lms // 0;
+			$net->{-lms_ro} = $lei->lms // 0;
 			if ($self->{-import_kw} && $net->{-lms_ro} &&
 					!$lei->{opt}->{'new-only'} &&
 					$net->{imap_order}) {
@@ -120,8 +120,7 @@ sub _complete_import {
 	my $match_cb = $lei->complete_url_prepare(\@argv);
 	my @m = map { $match_cb->($_) } $lei->url_folder_cache->keys;
 	my %f = map { $_ => 1 } @m;
-	my $sto = $lei->_lei_store;
-	if (my $lms = $sto ? $sto->search->lms : undef) {
+	if (my $lms = $lei->lms) {
 		@m = map { $match_cb->($_) } $lms->folders;
 		@f{@m} = @m;
 	}
diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 372e0fe1..fe736981 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -360,7 +360,7 @@ $input is `eml', not --in-format=$in_fmt
 		# start watching Maildirs ASAP
 		if ($may_sync && $lei->{sto}) {
 			grep(!m!\Amaildir:/!i, @md) and die "BUG: @md (no pfx)";
-			my $wait = $lei->{sto}->ipc_do('add_sync_folders', @md);
+			$lei->lms(1)->lms_write_prepare->add_folders(@md);
 			$lei->refresh_watches;
 		}
 	}
diff --git a/lib/PublicInbox/LeiInspect.pm b/lib/PublicInbox/LeiInspect.pm
index 2385f7f8..f06cea61 100644
--- a/lib/PublicInbox/LeiInspect.pm
+++ b/lib/PublicInbox/LeiInspect.pm
@@ -18,7 +18,7 @@ sub inspect_blob ($$) {
 		my $oidbin = pack('H*', $oidhex);
 		my @docids = $lse ? $lse->over->oidbin_exists($oidbin) : ();
 		$ent->{'lei/store'} = \@docids if @docids;
-		my $lms = $lse->lms;
+		my $lms = $lei->lms;
 		if (my $loc = $lms ? $lms->locations_for($oidbin) : undef) {
 			$ent->{'mail-sync'} = $loc;
 		}
@@ -29,8 +29,7 @@ sub inspect_blob ($$) {
 sub inspect_imap_uid ($$) {
 	my ($lei, $uid_uri) = @_;
 	my $ent = {};
-	my $lse = $lei->{lse} or return $ent;
-	my $lms = $lse->lms or return $ent;
+	my $lms = $lei->lms or return $ent;
 	my $oidhex = $lms->imap_oid($lei, $uid_uri);
 	if (ref(my $err = $oidhex)) { # art2folder error
 		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
diff --git a/lib/PublicInbox/LeiLsMailSync.pm b/lib/PublicInbox/LeiLsMailSync.pm
index 505c0b3f..2b167b1d 100644
--- a/lib/PublicInbox/LeiLsMailSync.pm
+++ b/lib/PublicInbox/LeiLsMailSync.pm
@@ -9,8 +9,7 @@ use PublicInbox::LeiMailSync;
 
 sub lei_ls_mail_sync {
 	my ($lei, $filter) = @_;
-	my $sto = $lei->_lei_store or return;
-	my $lms = $sto->search->lms or return;
+	my $lms = $lei->lms or return;
 	my $opt = $lei->{opt};
 	my $re = $opt->{globoff} ? undef : $lei->glob2re($filter // '*');
 	$re //= qr/\Q$filter\E/;
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 8f584ccb..690c6477 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -5,6 +5,7 @@
 package PublicInbox::LeiMailSync;
 use strict;
 use v5.10.1;
+use parent qw(PublicInbox::Lock);
 use DBI;
 use PublicInbox::ContentHash qw(git_sha);
 use Carp ();
@@ -21,7 +22,7 @@ sub dbh_new {
 		sqlite_use_immediate_transaction => 1,
 	});
 	# no sqlite_unicode, here, all strings are binary
-	create_tables($dbh) if $rw;
+	create_tables($self, $dbh) if $rw;
 	$dbh->do('PRAGMA journal_mode = WAL') if $creat;
 	$dbh->do('PRAGMA case_sensitive_like = ON');
 	$dbh;
@@ -29,13 +30,24 @@ sub dbh_new {
 
 sub new {
 	my ($cls, $f) = @_;
-	bless { filename => $f, fmap => {} }, $cls;
+	bless {
+		filename => $f,
+		fmap => {},
+		lock_path => "$f.flock",
+	}, $cls;
 }
 
-sub lms_write_prepare { ($_[0]->{dbh} //= dbh_new($_[0], 1)) };
+sub lms_write_prepare { ($_[0]->{dbh} //= dbh_new($_[0], 1)); $_[0] }
+
+sub lms_pause {
+	my ($self) = @_;
+	$self->{fmap} = {};
+	delete $self->{dbh};
+}
 
 sub create_tables {
-	my ($dbh) = @_;
+	my ($self, $dbh) = @_;
+	my $lk = $self->lock_for_scope;
 
 	$dbh->do(<<'');
 CREATE TABLE IF NOT EXISTS folders (
@@ -115,8 +127,15 @@ EOM
 	$fid;
 }
 
+sub add_folders {
+	my ($self, @folders) = @_;
+	my $lk = $self->lock_for_scope;
+	for my $f (@folders) { $self->{fmap}->{$f} //= fid_for($self, $f, 1) }
+}
+
 sub set_src {
 	my ($self, $oidbin, $folder, $id) = @_;
+	my $lk = $self->lock_for_scope;
 	my $fid = $self->{fmap}->{$folder} //= fid_for($self, $folder, 1);
 	my $sth;
 	if (ref($id)) { # scalar name
@@ -134,6 +153,7 @@ INSERT OR IGNORE INTO blob2num (oidbin, fid, uid) VALUES (?, ?, ?)
 
 sub clear_src {
 	my ($self, $folder, $id) = @_;
+	my $lk = $self->lock_for_scope;
 	my $fid = $self->{fmap}->{$folder} //= fid_for($self, $folder, 1);
 	my $sth;
 	if (ref($id)) { # scalar name
@@ -152,6 +172,7 @@ DELETE FROM blob2num WHERE fid = ? AND uid = ?
 # Maildir-only
 sub mv_src {
 	my ($self, $folder, $oidbin, $id, $newbn) = @_;
+	my $lk = $self->lock_for_scope;
 	my $fid = $self->{fmap}->{$folder} //= fid_for($self, $folder, 1);
 	my $sth = $self->{dbh}->prepare_cached(<<'');
 UPDATE blob2name SET name = ? WHERE fid = ? AND oidbin = ? AND name = ?
@@ -421,18 +442,23 @@ EOF
 	$err;
 }
 
-sub forget_folder {
-	my ($self, $folder) = @_;
-	my $fid = delete($self->{fmap}->{$folder}) //
-		fid_for($self, $folder) // return;
-	for my $t (qw(blob2name blob2num folders)) {
-		$self->{dbh}->do("DELETE FROM $t WHERE fid = ?", undef, $fid);
+sub forget_folders {
+	my ($self, @folders) = @_;
+	my $lk = $self->lock_for_scope;
+	for my $folder (@folders) {
+		my $fid = delete($self->{fmap}->{$folder}) //
+			fid_for($self, $folder) // next;
+		for my $t (qw(blob2name blob2num folders)) {
+			$self->{dbh}->do("DELETE FROM $t WHERE fid = ?",
+					undef, $fid);
+		}
 	}
 }
 
 # only used for changing canonicalization errors
 sub rename_folder {
 	my ($self, $old, $new) = @_;
+	my $lk = $self->lock_for_scope;
 	my $ofid = delete($self->{fmap}->{$old}) //
 		fid_for($self, $old) // return;
 	eval {
diff --git a/lib/PublicInbox/LeiNoteEvent.pm b/lib/PublicInbox/LeiNoteEvent.pm
index 41415346..c03c5319 100644
--- a/lib/PublicInbox/LeiNoteEvent.pm
+++ b/lib/PublicInbox/LeiNoteEvent.pm
@@ -2,6 +2,7 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # internal command for dealing with inotify, kqueue vnodes, etc
+# it is a semi-persistent worker
 package PublicInbox::LeiNoteEvent;
 use strict;
 use v5.10.1;
@@ -12,11 +13,8 @@ our $to_flush; # { cfgpath => $lei }
 
 sub flush_lei ($) {
 	my ($lei) = @_;
-	if (my $lne = delete $lei->{cfg}->{-lei_note_event}) {
-		$lne->wq_close(1, undef, $lei); # runs _lei_wq_eof;
-	} elsif ($lei->{sto}) { # lms_clear_src calls only:
-		$lei->sto_done_request;
-	}
+	my $lne = delete $lei->{cfg}->{-lei_note_event};
+	$lne->wq_close(1, undef, $lei) if $lne; # runs _lei_wq_eof;
 }
 
 # we batch up writes and flush every 5s (matching Linux default
@@ -38,14 +36,14 @@ sub note_event_arm_done ($) {
 sub eml_event ($$$$) {
 	my ($self, $eml, $vmd, $state) = @_;
 	my $sto = $self->{lei}->{sto};
-	my $lse = $self->{lse} //= $sto->search;
 	if ($state =~ /\Aimport-(?:rw|ro)\z/) {
 		$sto->ipc_do('set_eml', $eml, $vmd);
 	} elsif ($state =~ /\Aindex-(?:rw|ro)\z/) {
 		my $xoids = $self->{lei}->ale->xoids_for($eml);
 		$sto->ipc_do('index_eml_only', $eml, $vmd, $xoids);
 	} elsif ($state =~ /\Atag-(?:rw|ro)\z/) {
-		my $c = $lse->kw_changed($eml, $vmd->{kw}, my $docids = []);
+		my $docids = [];
+		my $c = $self->{lse}->kw_changed($eml, $vmd->{kw}, $docids);
 		if (scalar @$docids) { # already in lei/store
 			$sto->ipc_do('set_eml_vmd', undef, $vmd, $docids) if $c;
 		} elsif (my $xoids = $self->{lei}->ale->xoids_for($eml)) {
@@ -69,21 +67,19 @@ sub lei_note_event {
 	my $cfg = $lei->_lei_cfg or return; # gone (race)
 	my $sto = $lei->_lei_store or return; # gone
 	return flush_lei($lei) if $folder eq 'done'; # special case
-	my $lms = $sto->search->lms or return;
+	my $lms = $lei->lms or return;
+	$lms->lms_write_prepare if $new_cur eq ''; # for ->clear_src below
 	my $err = $lms->arg2folder($lei, [ $folder ]);
 	return if $err->{fail};
-	undef $lms;
 	my $state = $cfg->get_1("watch.$folder", 'state') // 'tag-rw';
 	return if $state eq 'pause';
+	return $lms->clear_src($folder, \$bn) if $new_cur eq '';
+	$lms->lms_pause;
 	$lei->ale; # prepare
 	$sto->write_prepare($lei);
-	if ($new_cur eq '') {
-		$sto->ipc_do('lms_clear_src', $folder, \$bn);
-		return note_event_arm_done($lei);
-	}
 	require PublicInbox::MdirReader;
 	my $self = $cfg->{-lei_note_event} //= do {
-		my $wq = bless {}, __PACKAGE__;
+		my $wq = bless { lms => $lms }, __PACKAGE__;
 		# MUAs such as mutt can trigger massive rename() storms so
 		# use all CPU power available:
 		my $jobs = $wq->detect_nproc // 1;
@@ -105,6 +101,8 @@ sub lei_note_event {
 sub ipc_atfork_child {
 	my ($self) = @_;
 	$self->{lei}->_lei_atfork_child(1); # persistent, for a while
+	$self->{lms}->lms_write_prepare;
+	$self->{lse} = $self->{lei}->{sto}->search;
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index cdd99725..72b8fe63 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -11,9 +11,9 @@ use PublicInbox::LeiExportKw;
 use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::Import;
 
-sub folder_missing {
+sub folder_missing { # may be called by LeiInput
 	my ($self, $folder) = @_;
-	$self->{lei}->{sto}->ipc_do('lms_forget_folders', $folder);
+	$self->{lms}->forget_folders($folder);
 }
 
 sub prune_mdir { # lms->each_src callback
@@ -21,13 +21,13 @@ sub prune_mdir { # lms->each_src callback
 	my @try = $$id =~ /:2,[a-zA-Z]*\z/ ? qw(cur new) : qw(new cur);
 	for (@try) { return if -f "$mdir/$_/$$id" }
 	# both tries failed
-	$self->{lei}->{sto}->ipc_do('lms_clear_src', "maildir:$mdir", $id);
+	$self->{lms}->clear_src("maildir:$mdir", $id);
 }
 
 sub prune_imap { # lms->each_src callback
 	my ($oidbin, $uid, $self, $uids, $url) = @_;
 	return if exists $uids->{$uid};
-	$self->{lei}->{sto}->ipc_do('lms_clear_src', $url, $uid);
+	$self->{lms}->clear_src($url, $uid);
 }
 
 # detects missed file moves
@@ -36,18 +36,16 @@ sub pmdir_cb { # called via LeiPmdir->each_mdir_fn
 	my ($folder, $bn) = ($f =~ m!\A(.+?)/(?:new|cur)/([^/]+)\z!) or
 		die "BUG: $f was not from a Maildir?";
 	substr($folder, 0, 0) = 'maildir:'; # add prefix
-	my $lms = $self->{-lms_ro} //= $self->{lei}->lms;
-	return if defined($lms->name_oidbin($folder, $bn));
+	return if defined($self->{lms}->name_oidbin($folder, $bn));
 	my $eml = eml_from_path($f) // return;
 	my $oidbin = $self->{lei}->git_oid($eml)->digest;
-	$self->{lei}->{sto}->ipc_do('lms_set_src', $oidbin, $folder, \$bn);
+	$self->{lms}->set_src($oidbin, $folder, \$bn);
 }
 
 sub input_path_url { # overrides PublicInbox::LeiInput::input_path_url
 	my ($self, $input, @args) = @_;
-	my $lms = $self->{-lms_ro} //= $self->{lei}->lms;
 	if ($input =~ /\Amaildir:(.+)/i) {
-		$lms->each_src($input, \&prune_mdir, $self, my $mdir = $1);
+		$self->{lms}->each_src($input, \&prune_mdir, $self, $1);
 		$self->{lse} //= $self->{lei}->{sto}->search;
 		# call pmdir_cb (via maildir_each_file -> each_mdir_fn)
 		PublicInbox::LeiInput::input_path_url($self, $input);
@@ -56,7 +54,8 @@ sub input_path_url { # overrides PublicInbox::LeiInput::input_path_url
 		if (my $mic = $self->{lei}->{net}->mic_for_folder($uri)) {
 			my $uids = $mic->search('UID 1:*');
 			$uids = +{ map { $_ => undef } @$uids };
-			$lms->each_src($$uri, \&prune_imap, $self, $uids, $$uri)
+			$self->{lms}->each_src($$uri, \&prune_imap, $self,
+						$uids, $$uri)
 		} else {
 			$self->folder_missing($$uri);
 		}
@@ -79,9 +78,9 @@ EOM
 		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
 		return $lei->fail($err->{fail}) if $err->{fail};
 	}
-	undef $lms; # must be done before fork
+	$lms->lms_pause; # must be done before fork
 	$sto->write_prepare($lei);
-	my $self = bless { missing_ok => 1 }, __PACKAGE__;
+	my $self = bless { missing_ok => 1, lms => $lms }, __PACKAGE__;
 	$lei->{opt}->{'mail-sync'} = 1; # for prepare_inputs
 	$self->prepare_inputs($lei, \@folders) or return;
 	my $ops = {};
@@ -93,9 +92,15 @@ EOM
 	$lei->wait_wq_events($op_c, $ops); # net_merge_all_done if !{auth}
 }
 
+sub ipc_atfork_child { # needed for PublicInbox::LeiPmdir
+	my ($self) = @_;
+	PublicInbox::LeiInput::input_only_atfork_child($self);
+	$self->{lms}->lms_write_prepare;
+	undef;
+}
+
 no warnings 'once';
 *_complete_refresh_mail_sync = \&PublicInbox::LeiExportKw::_complete_export_kw;
-*ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
 *net_merge_all_done = \&PublicInbox::LeiInput::input_only_net_merge_all_done;
 
 1;
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 32f55abd..08add8f5 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -190,7 +190,7 @@ sub export1_kw_md ($$$$$) {
 				syslog('warning', "unlink($src): $!");
 			}
 			# TODO: verify oidbin?
-			lms_mv_src($self, "maildir:$mdir",
+			$self->{lms}->mv_src("maildir:$mdir",
 					$oidbin, \$orig, $bn);
 			return;
 		} elsif ($! == EEXIST) { # lost race with "lei export-kw"?
@@ -200,7 +200,7 @@ sub export1_kw_md ($$$$$) {
 		}
 	}
 	for (@try) { return if -e "$mdir/$_/$orig" };
-	lms_clear_src($self, "maildir:$mdir", \$orig);
+	$self->{lms}->clear_src("maildir:$mdir", \$orig);
 }
 
 sub sto_export_kw ($$$) {
@@ -255,7 +255,7 @@ sub remove_eml_vmd { # remove just the VMD
 	\@docids;
 }
 
-sub _lms_rw ($) {
+sub _lms_rw ($) { # it is important to have eidx processes open before lms
 	my ($self) = @_;
 	my ($eidx, $tl) = eidx_init($self);
 	$self->{lms} //= do {
@@ -267,37 +267,11 @@ sub _lms_rw ($) {
 	};
 }
 
-sub lms_clear_src {
-	my ($self, $folder, $id) = @_;
-	_lms_rw($self)->clear_src($folder, $id);
-}
-
-sub lms_mv_src {
-	my ($self, $folder, $oidbin, $id, $newbn) = @_;
-	_lms_rw($self)->mv_src($folder, $oidbin, $id, $newbn);
-}
-
-sub lms_forget_folders {
-	my ($self, @folders) = @_;
-	my $lms = _lms_rw($self);
-	for my $f (@folders) { $lms->forget_folder($f) }
-}
-
-sub lms_rename_folder {
-	my ($self, $old, $new) = @_;
-	_lms_rw($self)->rename_folder($old, $new);
-}
-
 sub set_sync_info {
 	my ($self, $oidhex, $folder, $id) = @_;
 	_lms_rw($self)->set_src(pack('H*', $oidhex), $folder, $id);
 }
 
-sub lms_set_src {
-	my ($self, $oidbin, $folder, $id) = @_;
-	_lms_rw($self)->set_src($oidbin, $folder, $id);
-}
-
 sub _remove_if_local { # git->cat_async arg
 	my ($bref, $oidhex, $type, $size, $self) = @_;
 	$self->{im}->remove($bref) if $bref;
@@ -608,11 +582,4 @@ sub write_prepare {
 	$lei->{sto} = $self;
 }
 
-# called by lei-daemon before lei->refresh_watches
-sub add_sync_folders {
-	my ($self, @folders) = @_;
-	my $lms = _lms_rw($self);
-	for my $f (@folders) { $lms->fid_for($f, 1) }
-}
-
 1;
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index 15729bda..d3253d9b 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -458,8 +458,10 @@ sub _pre_augment_maildir {
 
 sub clobber_dst_prepare ($;$) {
 	my ($lei, $f) = @_;
-	my $wait = (defined($f) && $lei->{sto}) ?
-			$lei->{sto}->ipc_do('lms_forget_folders', $f) : undef;
+	if (my $lms = defined($f) ? $lei->lms : undef) {
+		$lms->lms_write_prepare;
+		$lms->forget_folders($f);
+	}
 	my $dedupe = $lei->{dedupe} or return;
 	$dedupe->reset_dedupe if $dedupe->can('reset_dedupe');
 }

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/9] lei_mail_sync: set nodatacow on btrfs
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
  2021-09-18  9:33 ` [PATCH 1/9] lei: lock worker counts Eric Wong
  2021-09-18  9:33 ` [PATCH 2/9] lei_mail_sync: rely on flock(2), avoid IPC Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 4/9] ds: support add unique timers Eric Wong
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

As with other SQLite3 databases, copy-on-write with
files experiencing random writes leads to write amplification
and low performance.
---
 lib/PublicInbox/LeiMailSync.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 690c6477..f185b585 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -14,6 +14,11 @@ sub dbh_new {
 	my ($self, $rw) = @_;
 	my $f = $self->{filename};
 	my $creat = $rw && !-s $f;
+	if ($creat) {
+		require PublicInbox::Spawn;
+		open my $fh, '+>>', $f or Carp::croak "open($f): $!";
+		PublicInbox::Spawn::nodatacow_fd(fileno($fh));
+	}
 	my $dbh = DBI->connect("dbi:SQLite:dbname=$f",'','', {
 		AutoCommit => 1,
 		RaiseError => 1,

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/9] ds: support add unique timers
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
                   ` (2 preceding siblings ...)
  2021-09-18  9:33 ` [PATCH 3/9] lei_mail_sync: set nodatacow on btrfs Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 5/9] net_reader: tie SocksDebug to {imap,nntp}.Debug Eric Wong
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

A common pattern we use is to arm a timer once and prevent
it from being armed until it fires.  We'll be using it more
to do polling for saved searches and imports.
---
 lib/PublicInbox/DS.pm           | 100 ++++++++++++++++----------------
 lib/PublicInbox/LeiNoteEvent.pm |   5 +-
 2 files changed, 52 insertions(+), 53 deletions(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index d804792b..c89c7b8b 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -32,7 +32,7 @@ use PublicInbox::Syscall qw(:epoll);
 use PublicInbox::Tmpfile;
 use Errno qw(EAGAIN EINVAL);
 use Carp qw(carp croak);
-our @EXPORT_OK = qw(now msg_more dwaitpid add_timer);
+our @EXPORT_OK = qw(now msg_more dwaitpid add_timer add_uniq_timer);
 
 my %Stack;
 my $nextq; # queue for next_tick
@@ -40,7 +40,7 @@ my $wait_pids; # list of [ pid, callback, callback_arg ]
 my $later_q; # list of callbacks to run at some later interval
 my $EXPMAP; # fd -> idle_time
 our $EXPTIME = 180; # 3 minutes
-my ($later_timer, $reap_armed, $exp_timer);
+my ($reap_armed, $exp_timer);
 my $ToClose; # sockets to close when event loop is done
 our (
      %DescriptorMap,             # fd (num) -> PublicInbox::DS object
@@ -51,6 +51,7 @@ our (
 
      $LoopTimeout,               # timeout of event loop in milliseconds
      @Timers,                    # timers
+     %UniqTimer,
      $in_loop,
      );
 
@@ -70,6 +71,7 @@ sub Reset {
 		$in_loop = undef; # first in case DESTROY callbacks use this
 		%DescriptorMap = ();
 		@Timers = ();
+		%UniqTimer = ();
 		$PostLoopCallback = undef;
 
 		# we may be iterating inside one of these on our stack
@@ -81,9 +83,9 @@ sub Reset {
 		$Epoll = undef; # may call DSKQXS::DESTROY
 	} while (@Timers || keys(%Stack) || $nextq || $wait_pids ||
 		$later_q || $ToClose || keys(%DescriptorMap) ||
-		$PostLoopCallback);
+		$PostLoopCallback || keys(%UniqTimer));
 
-	$reap_armed = $later_timer = $exp_timer = undef;
+	$reap_armed = $exp_timer = undef;
 	$LoopTimeout = -1;  # no timeout by default
 }
 
@@ -97,36 +99,33 @@ immediately.
 =cut
 sub SetLoopTimeout { $LoopTimeout = $_[1] + 0 }
 
-=head2 C<< PublicInbox::DS::add_timer( $seconds, $coderef, $arg) >>
+sub _add_named_timer {
+	my ($name, $secs, $coderef, @args) = @_;
+	my $fire_time = now() + $secs;
+	my $timer = [$fire_time, $name, $coderef, @args];
 
-Add a timer to occur $seconds from now. $seconds may be fractional, but timers
-are not guaranteed to fire at the exact time you ask for.
-
-=cut
-sub add_timer ($$;@) {
-    my ($secs, $coderef, @args) = @_;
-
-    my $fire_time = now() + $secs;
-
-    my $timer = [$fire_time, $coderef, @args];
+	if (!@Timers || $fire_time >= $Timers[-1][0]) {
+		push @Timers, $timer;
+		return $timer;
+	}
 
-    if (!@Timers || $fire_time >= $Timers[-1][0]) {
-        push @Timers, $timer;
-        return $timer;
-    }
+	# Now, where do we insert?  (NOTE: this appears slow, algorithm-wise,
+	# but it was compared against calendar queues, heaps, naive push/sort,
+	# and a bunch of other versions, and found to be fastest with a large
+	# variety of datasets.)
+	for (my $i = 0; $i < @Timers; $i++) {
+		if ($Timers[$i][0] > $fire_time) {
+			splice(@Timers, $i, 0, $timer);
+			return $timer;
+		}
+	}
+	die "Shouldn't get here.";
+}
 
-    # Now, where do we insert?  (NOTE: this appears slow, algorithm-wise,
-    # but it was compared against calendar queues, heaps, naive push/sort,
-    # and a bunch of other versions, and found to be fastest with a large
-    # variety of datasets.)
-    for (my $i = 0; $i < @Timers; $i++) {
-        if ($Timers[$i][0] > $fire_time) {
-            splice(@Timers, $i, 0, $timer);
-            return $timer;
-        }
-    }
+sub add_timer { _add_named_timer(undef, @_) }
 
-    die "Shouldn't get here.";
+sub add_uniq_timer { # ($name, $secs, $coderef, @args) = @_;
+	$UniqTimer{$_[0]} //= _add_named_timer(@_);
 }
 
 # keeping this around in case we support other FD types for now,
@@ -184,31 +183,32 @@ sub next_tick () {
 
 # runs timers and returns milliseconds for next one, or next event loop
 sub RunTimers {
-    next_tick();
+	next_tick();
 
-    return (($nextq || $ToClose) ? 0 : $LoopTimeout) unless @Timers;
+	return (($nextq || $ToClose) ? 0 : $LoopTimeout) unless @Timers;
 
-    my $now = now();
+	my $now = now();
 
-    # Run expired timers
-    while (@Timers && $Timers[0][0] <= $now) {
-        my $to_run = shift(@Timers);
-        $to_run->[1]->(@$to_run[2..$#$to_run]);
-    }
+	# Run expired timers
+	while (@Timers && $Timers[0][0] <= $now) {
+		my $to_run = shift(@Timers);
+		delete $UniqTimer{$to_run->[1] // ''};
+		$to_run->[2]->(@$to_run[3..$#$to_run]);
+	}
 
-    # timers may enqueue into nextq:
-    return 0 if ($nextq || $ToClose);
+	# timers may enqueue into nextq:
+	return 0 if ($nextq || $ToClose);
 
-    return $LoopTimeout unless @Timers;
+	return $LoopTimeout unless @Timers;
 
-    # convert time to an even number of milliseconds, adding 1
-    # extra, otherwise floating point fun can occur and we'll
-    # call RunTimers like 20-30 times, each returning a timeout
-    # of 0.0000212 seconds
-    my $timeout = int(($Timers[0][0] - $now) * 1000) + 1;
+	# convert time to an even number of milliseconds, adding 1
+	# extra, otherwise floating point fun can occur and we'll
+	# call RunTimers like 20-30 times, each returning a timeout
+	# of 0.0000212 seconds
+	my $timeout = int(($Timers[0][0] - $now) * 1000) + 1;
 
-    # -1 is an infinite timeout, so prefer a real timeout
-    ($LoopTimeout < 0 || $LoopTimeout >= $timeout) ? $timeout : $LoopTimeout;
+	# -1 is an infinite timeout, so prefer a real timeout
+	($LoopTimeout < 0 || $LoopTimeout >= $timeout) ? $timeout : $LoopTimeout
 }
 
 sub sig_setmask { sigprocmask(SIG_SETMASK, @_) or die "sigprocmask: $!" }
@@ -660,7 +660,7 @@ sub dwaitpid ($;$$) {
 
 sub _run_later () {
 	my $q = $later_q or return;
-	$later_timer = $later_q = undef;
+	$later_q = undef;
 	$Stack{later_q} = $q;
 	$_->() for @$q;
 	delete $Stack{later_q};
@@ -668,7 +668,7 @@ sub _run_later () {
 
 sub later ($) {
 	push @$later_q, $_[0]; # autovivifies @$later_q
-	$later_timer //= add_timer(60, \&_run_later);
+	add_uniq_timer('later', 60, \&_run_later);
 }
 
 sub expire_old () {
diff --git a/lib/PublicInbox/LeiNoteEvent.pm b/lib/PublicInbox/LeiNoteEvent.pm
index c03c5319..18313359 100644
--- a/lib/PublicInbox/LeiNoteEvent.pm
+++ b/lib/PublicInbox/LeiNoteEvent.pm
@@ -7,8 +7,8 @@ package PublicInbox::LeiNoteEvent;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC);
+use PublicInbox::DS;
 
-my $flush_timer;
 our $to_flush; # { cfgpath => $lei }
 
 sub flush_lei ($) {
@@ -20,7 +20,6 @@ sub flush_lei ($) {
 # we batch up writes and flush every 5s (matching Linux default
 # writeback behavior) since MUAs can trigger a storm of inotify events
 sub flush_task { # PublicInbox::DS timer callback
-	undef $flush_timer;
 	my $todo = $to_flush // return;
 	$to_flush = undef;
 	for my $lei (values %$todo) { flush_lei($lei) }
@@ -29,7 +28,7 @@ sub flush_task { # PublicInbox::DS timer callback
 # sets a timer to flush
 sub note_event_arm_done ($) {
 	my ($lei) = @_;
-	$flush_timer //= PublicInbox::DS::add_timer(5, \&flush_task);
+	PublicInbox::DS::add_uniq_timer('flush_timer', 5, \&flush_task);
 	$to_flush->{$lei->{cfg}->{'-f'}} //= $lei;
 }
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/9] net_reader: tie SocksDebug to {imap,nntp}.Debug
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
                   ` (3 preceding siblings ...)
  2021-09-18  9:33 ` [PATCH 4/9] ds: support add unique timers Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 6/9] net_reader: detect IMAP failures earlier Eric Wong
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

I think tying IO::Socket::Socks debugging to existing debug
switches is enough, and there's no need to introduce a separate
socks.Debug parameter.
---
 lib/PublicInbox/NetReader.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 5725a155..e703cddb 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -35,7 +35,7 @@ sub socks_args ($) {
 		eval { require IO::Socket::Socks } or die <<EOM;
 IO::Socket::Socks missing for socks5h://$h:$p
 EOM
-		# for Mail::IMAPClient
+		# for IO::Socket::Socks
 		return { ProxyAddr => $h, ProxyPort => $p };
 	}
 	die "$val not understood (only socks5h:// is supported)\n";
@@ -51,6 +51,7 @@ sub mic_new ($$$$) {
 		require IO::Socket::Socks;
 
 		my %opt = %$sa;
+		$opt{SocksDebug} = 1 if $mic_arg{Debug};
 		$opt{ConnectAddr} = delete $mic_arg{Server};
 		$opt{ConnectPort} = delete $mic_arg{Port};
 		$mic_arg{Socket} = IO::Socket::Socks->new(%opt) or die
@@ -170,6 +171,7 @@ sub nn_new ($$$) {
 	my $nn;
 	if (defined $nn_arg->{ProxyAddr}) {
 		require PublicInbox::NetNNTPSocks;
+		$nn_arg->{SocksDebug} = 1 if $nn_arg->{Debug};
 		eval { $nn = PublicInbox::NetNNTPSocks->new_socks(%$nn_arg) };
 		die "E: <$uri> $@\n" if $@;
 	} else {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/9] net_reader: detect IMAP failures earlier
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
                   ` (4 preceding siblings ...)
  2021-09-18  9:33 ` [PATCH 5/9] net_reader: tie SocksDebug to {imap,nntp}.Debug Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 7/9] net_reader: support imaps:// w/ socks5h:// proxy Eric Wong
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

An Mail::IMAPClient object may be returned even on connection
failure, so use IsConnected to check for it.  This ensures
git-credential will no longer prompt for passwords when there's
no connection.
---
 lib/PublicInbox/NetReader.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index e703cddb..8eff847e 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -108,7 +108,8 @@ sub mic_for ($$$$) { # mic = Mail::IMAPClient
 	};
 	$mic_arg->{Ssl} = 1 if $uri->scheme eq 'imaps';
 	require PublicInbox::IMAPClient;
-	my $mic = mic_new($self, $mic_arg, $sec, $uri) or
+	my $mic = mic_new($self, $mic_arg, $sec, $uri);
+	($mic && $mic->IsConnected) or
 		die "E: <$uri> new: $@".onion_hint($lei, $uri);
 
 	# default to using STARTTLS if it's available, but allow

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/9] net_reader: support imaps:// w/ socks5h:// proxy
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
                   ` (5 preceding siblings ...)
  2021-09-18  9:33 ` [PATCH 6/9] net_reader: detect IMAP failures earlier Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18 10:41   ` Eric Wong
  2021-09-18  9:33 ` [PATCH 8/9] net_reader: set SO_KEEPALIVE on all Net::NNTP sockets Eric Wong
  2021-09-18  9:33 ` [PATCH 9/9] lei up: automatically use dt: for remote externals Eric Wong
  8 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

While Non-TLS IMAP worked perfectly with IO::Socket::Socks
and Mail::IMAPClient; we need to wrap the IO::Socket::Socks
object with IO::Socket::SSL before handing it to
Mail::IMAPClient.
---
 lib/PublicInbox/NetReader.pm | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 8eff847e..403df952 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -43,21 +43,26 @@ EOM
 
 sub mic_new ($$$$) {
 	my ($self, $mic_arg, $sec, $uri) = @_;
-	my %mic_arg = %$mic_arg;
+	my %mic_arg = (%$mic_arg, Keepalive => 1);
 	my $sa = $self->{cfg_opt}->{$sec}->{-proxy_cfg} || $self->{-proxy_cli};
 	if ($sa) {
 		# this `require' needed for worker[1..Inf], since socks_args
 		# only got called in worker[0]
 		require IO::Socket::Socks;
-
-		my %opt = %$sa;
+		my %opt = (%$sa, Keepalive => 1);
 		$opt{SocksDebug} = 1 if $mic_arg{Debug};
 		$opt{ConnectAddr} = delete $mic_arg{Server};
 		$opt{ConnectPort} = delete $mic_arg{Port};
-		$mic_arg{Socket} = IO::Socket::Socks->new(%opt) or die
-			"E: <$$uri> ".eval('$IO::Socket::Socks::SOCKS_ERROR');
+		my $s = IO::Socket::Socks->new(%opt) or die
+			"E: <$uri> ".eval('$IO::Socket::Socks::SOCKS_ERROR');
+		if ($mic_arg->{Ssl}) { # for imaps://
+			require IO::Socket::SSL;
+			$s = IO::Socket::SSL->start_SSL($s) or
+				"E: <$uri> ".(IO::Socket::SSL->errstr // '');
+		}
+		$mic_arg{Socket} = $s;
 	}
-	PublicInbox::IMAPClient->new(%mic_arg, Keepalive => 1);
+	PublicInbox::IMAPClient->new(%mic_arg);
 }
 
 sub auth_anon_cb { '' }; # for Mail::IMAPClient::Authcallback
@@ -103,7 +108,6 @@ sub mic_for ($$$$) { # mic = Mail::IMAPClient
 	my $mic_arg = {
 		Port => $uri->port,
 		Server => $host,
-		Ssl => $uri->scheme eq 'imaps',
 		%$common, # may set Starttls, Compress, Debug ....
 	};
 	$mic_arg->{Ssl} = 1 if $uri->scheme eq 'imaps';

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 8/9] net_reader: set SO_KEEPALIVE on all Net::NNTP sockets
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
                   ` (6 preceding siblings ...)
  2021-09-18  9:33 ` [PATCH 7/9] net_reader: support imaps:// w/ socks5h:// proxy Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18  9:33 ` [PATCH 9/9] lei up: automatically use dt: for remote externals Eric Wong
  8 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

SO_KEEPALIVE can prevent stuck processes and is safe to enable
unconditionally on all TCP sockets (like git, and the rest of
public-inbox does).  Verified via strace on both NNTP and NNTPS
with and without nntp.proxy=socks5h://...
---
 lib/PublicInbox/NetReader.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 403df952..28e20d38 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -182,6 +182,7 @@ sub nn_new ($$$) {
 	} else {
 		$nn = Net::NNTP->new(%$nn_arg) or return;
 	}
+	setsockopt($nn, Socket::SOL_SOCKET(), Socket::SO_KEEPALIVE(), 1);
 
 	# default to using STARTTLS if it's available, but allow
 	# it to be disabled for localhost/VPN users

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 9/9] lei up: automatically use dt: for remote externals
  2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
                   ` (7 preceding siblings ...)
  2021-09-18  9:33 ` [PATCH 8/9] net_reader: set SO_KEEPALIVE on all Net::NNTP sockets Eric Wong
@ 2021-09-18  9:33 ` Eric Wong
  2021-09-18 14:31   ` Kyle Meyer
  8 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2021-09-18  9:33 UTC (permalink / raw)
  To: meta

Since we can't use maxuid for remote externals, automatically
maintaining the last time we got results and appending a dt:
range to the query will prevent HTTP(S) responses from getting
too big.

We could be using "rt:", but no stable release of public-inbox
supports it, yet, so we'll use dt:, instead.

By default, there's a two day fudge factor to account for MTA
downtime and delays; which is hopefully enough.  The fudge
factor may be changed per-invocation with the
--remote-fudge-factor=INTERVAL option

Since different externals can have different message transport
routes, "lastresult" entries are stored on a per-external basis.
---
 Documentation/lei-up.pod          | 15 ++++++++++
 lib/PublicInbox/LEI.pm            |  7 +++--
 lib/PublicInbox/LeiSavedSearch.pm |  1 +
 lib/PublicInbox/LeiToMail.pm      |  4 ++-
 lib/PublicInbox/LeiUp.pm          |  2 +-
 lib/PublicInbox/LeiXSearch.pm     | 50 ++++++++++++++++++++++++++-----
 t/lei-q-remote-import.t           |  4 +++
 7 files changed, 71 insertions(+), 12 deletions(-)

diff --git a/Documentation/lei-up.pod b/Documentation/lei-up.pod
index e5d97f43..4dc6d3b8 100644
--- a/Documentation/lei-up.pod
+++ b/Documentation/lei-up.pod
@@ -22,6 +22,21 @@ C<--all> updates all saved searches (listed in L<lei-ls-search(1)>).
 C<--all=local> only updates local mailboxes, C<--all=remote> only
 updates remote mailboxes (currently C<imap://> and C<imaps://>).
 
+=item --remote-fudge-time=INTERVAL
+
+Look for mail older than the time of the last successful query.
+Using a small interval will reduce bandwidth use.  A larger
+interval reduces the likelyhood missing a result due to MTA
+delays or downtime.
+
+The time(s) of the last successful queries are the C<lastresult>
+values visible from L<lei-edit-search(1)>.
+
+Date formats understood by L<git-rev-parse(1)> may be used.
+e.g C<1.hour> or C<3.days>
+
+Default: 2.days
+
 =back
 
 The following options, described in L<lei-q(1)>, are supported.
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 053b6174..8b0614f2 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -183,7 +183,8 @@ our %CMD = ( # sorted in order of importance/use:
 	shared color! mail-sync!), @c_opt, opt_dash('limit|n=i', '[0-9]+') ],
 
 'up' => [ 'OUTPUT...|--all', 'update saved search',
-	qw(jobs|j=s lock=s@ alert=s@ mua=s verbose|v+ all:s), @c_opt ],
+	qw(jobs|j=s lock=s@ alert=s@ mua=s verbose|v+
+	remote-fudge-time=s all:s), @c_opt ],
 
 'lcat' => [ '--stdin|MSGID_OR_URL...', 'display local copy of message(s)',
 	'stdin|', # /|\z/ must be first for lone dash
@@ -420,7 +421,9 @@ my %OPTDESC = (
 'remote' => 'limit operations to those requiring network access',
 'remote!' => 'prevent operations requiring network access',
 
-'all:s	up' => ['local', 'update all (local) saved searches' ],
+'all:s	up' => ['local|remote', 'update all remote or local saved searches' ],
+'remote-fudge-time=s' => [ 'INTERVAL',
+	'look for mail INTERVAL older than the last successful query' ],
 
 'mid=s' => 'specify the Message-ID of a message',
 'oid=s' => 'specify the git object ID of a message',
diff --git a/lib/PublicInbox/LeiSavedSearch.pm b/lib/PublicInbox/LeiSavedSearch.pm
index 96ff816e..754f8294 100644
--- a/lib/PublicInbox/LeiSavedSearch.pm
+++ b/lib/PublicInbox/LeiSavedSearch.pm
@@ -149,6 +149,7 @@ sub new { # new saved search "lei q --save"
 	$dst = "$lei->{ovv}->{fmt}:$dst" if $dst !~ m!\Aimaps?://!i;
 	print $fh <<EOM;
 ; to refresh with new results, run: lei up $sq_dst
+; `maxuid' and `lastresult' lines are maintained by "lei up" for optimization
 [lei]
 $q
 [lei "q"]
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index d3253d9b..9f7171fb 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -437,7 +437,9 @@ sub new {
 			($lei->{opt}->{save} ? 'LeiSavedSearch' : 'LeiDedupe');
 		eval "require $dd_cls";
 		die "$dd_cls: $@" if $@;
-		$dd_cls->new($lei);
+		my $dd = $dd_cls->new($lei);
+		$lei->{lss} //= $dd if $dd && $dd->can('cfg_set');
+		$dd;
 	};
 	$self;
 }
diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm
index 4637cb46..abb05d46 100644
--- a/lib/PublicInbox/LeiUp.pm
+++ b/lib/PublicInbox/LeiUp.pm
@@ -45,7 +45,7 @@ sub up1 ($$) {
 		ref($v) and return $lei->fail("multiple values of $c in $f");
 		$lei->{opt}->{$k} = $v;
 	}
-	$lei->{lss} = $lss; # for LeiOverview->new
+	$lei->{lss} = $lss; # for LeiOverview->new and query_remote_mboxrd
 	my $lxs = $lei->lxs_prepare or return;
 	$lei->ale->refresh_externals($lxs, $lei);
 	$lei->_start_query;
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 50cadb5e..1d49da3d 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -18,6 +18,7 @@ use PublicInbox::Smsg;
 use PublicInbox::Eml;
 use Fcntl qw(SEEK_SET F_SETFL O_APPEND O_RDWR);
 use PublicInbox::ContentHash qw(git_sha);
+use POSIX qw(strftime);
 
 sub new {
 	my ($class) = @_;
@@ -168,8 +169,7 @@ sub query_one_mset { # for --threads and l2m w/o sort
 	my $can_kw = !!$ibxish->can('msg_keywords');
 	my $threads = $lei->{opt}->{threads} // 0;
 	my $fl = $threads > 1 ? 1 : undef;
-	my $lss = $lei->{dedupe};
-	$lss = undef unless $lss && $lss->can('cfg_set'); # saved search
+	my $lss = $lei->{lss};
 	my $maxk = "external.$dir.maxuid";
 	my $stop_at = $lss ? $lss->{-cfg}->{$maxk} : undef;
 	if (defined $stop_at) {
@@ -292,6 +292,37 @@ sub each_remote_eml { # callback for MboxReader->mboxrd
 	$each_smsg->($smsg, undef, $eml);
 }
 
+sub fudge_qstr_time ($$$) {
+	my ($lei, $uri, $qstr) = @_;
+	return ($qstr, undef) unless $lei->{lss};
+	my $cfg = $lei->{lss}->{-cfg} // die 'BUG: no lss->{-cfg}';
+	my $cfg_key = "external.$uri.lastresult";
+	my $lr = $cfg->{$cfg_key} or return ($qstr, $cfg_key);
+	if ($lr !~ /\A\-?[0-9]+\z/) {
+		$lei->child_error(0,
+			"$cfg->{-f}: $cfg_key=$lr not an integer, ignoring");
+		return ($qstr, $cfg_key);
+	}
+	my $rft = $lei->{opt}->{'remote-fudge-time'};
+	if ($rft && $rft !~ /\A-?[0-9]+\z/) {
+		my @t = $lei->{lss}->git->date_parse($rft);
+		my $diff = time - $t[0];
+		$lei->qerr("# $rft => $diff seconds");
+		$rft = $diff;
+	}
+	$lr -= ($rft || (48 * 60 * 60));
+	$lei->qerr("# $uri limiting to ".
+		strftime('%Y-%m-%d %k:%M', gmtime($lr)). ' and newer');
+	# this should really be rt: (received-time), but no stable
+	# public-inbox releases support it, yet.
+	my $dt = 'dt:'.strftime('%Y%m%d%H%M%S', gmtime($lr)).'..';
+	if ($qstr =~ /\S/) {
+		substr($qstr, 0, 0, '(');
+		$qstr .= ') AND ';
+	}
+	($qstr .= $dt, $cfg_key);
+}
+
 sub query_remote_mboxrd {
 	my ($self, $uris) = @_;
 	local $0 = "$0 query_remote_mboxrd";
@@ -300,7 +331,7 @@ sub query_remote_mboxrd {
 	my $opt = $lei->{opt};
 	my $qstr = $lei->{mset_opt}->{qstr};
 	$qstr =~ s/[ \n\t]+/ /sg; # make URLs less ugly
-	my @qform = (q => $qstr, x => 'm');
+	my @qform = (x => 'm');
 	push(@qform, t => 1) if $opt->{threads};
 	my $verbose = $opt->{verbose};
 	my ($reap_tail, $reap_curl);
@@ -323,7 +354,9 @@ sub query_remote_mboxrd {
 	for my $uri (@$uris) {
 		$lei->{-current_url} = $uri->as_string;
 		$lei->{-nr_remote_eml} = 0;
-		$uri->query_form(@qform);
+		my $start = time;
+		my ($q, $key) = fudge_qstr_time($lei, $uri, $qstr);
+		$uri->query_form(@qform, q => $q);
 		my $cmd = $curl->for_uri($lei, $uri);
 		$lei->qerr("# $cmd");
 		my ($fh, $pid) = popen_rd($cmd, undef, $rdr);
@@ -336,10 +369,11 @@ sub query_remote_mboxrd {
 		@$reap_curl = (); # cancel OnDestroy
 		die $err if $err;
 		my $nr = $lei->{-nr_remote_eml};
-		if ($nr && $lei->{sto}) {
-			my $wait = $lei->{sto}->ipc_do('done');
-		}
+		my $wait = $lei->{sto}->ipc_do('done') if $nr && $lei->{sto};
 		if ($? == 0) {
+			# don't update if no results, maybe MTA is down
+			$key && $nr and
+				$lei->{lss}->cfg_set($key, $start);
 			mset_progress($lei, $lei->{-current_url}, $nr, $nr);
 			next;
 		}
@@ -353,7 +387,7 @@ sub query_remote_mboxrd {
 					$lei->err("truncate($cmd stderr): $!");
 		}
 		next if (($? >> 8) == 22 && $err =~ /\b404\b/);
-		$uri->query_form(q => $lei->{mset_opt}->{qstr});
+		$uri->query_form(q => $qstr);
 		$lei->child_error($?, "E: <$uri> $err");
 	}
 	undef $each_smsg;
diff --git a/t/lei-q-remote-import.t b/t/lei-q-remote-import.t
index 9131c01b..fdf6a11e 100644
--- a/t/lei-q-remote-import.t
+++ b/t/lei-q-remote-import.t
@@ -99,5 +99,9 @@ EOF
 	lei_ok('up', "$ENV{HOME}/md");
 	is_deeply(\@f, [ glob("$ENV{HOME}/md/*/*") ],
 		'lei up remote dedupe works on maildir');
+	my $edit_env = { VISUAL => 'cat' };
+	lei_ok([qw(edit-search), "$ENV{HOME}/md"], $edit_env);
+	like($lei_out, qr/^\Q[external "$url"]\E\n\s*lastresult = \d+/sm,
+		'lastresult set');
 });
 done_testing;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 7/9] net_reader: support imaps:// w/ socks5h:// proxy
  2021-09-18  9:33 ` [PATCH 7/9] net_reader: support imaps:// w/ socks5h:// proxy Eric Wong
@ 2021-09-18 10:41   ` Eric Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18 10:41 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> +			$s = IO::Socket::SSL->start_SSL($s) or
> +				"E: <$uri> ".(IO::Socket::SSL->errstr // '');

Oops, I missed a die() and missed the warning when running tests :x

-----8<----
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index 28e20d38..ccfdd261 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -57,7 +57,7 @@ sub mic_new ($$$$) {
 			"E: <$uri> ".eval('$IO::Socket::Socks::SOCKS_ERROR');
 		if ($mic_arg->{Ssl}) { # for imaps://
 			require IO::Socket::SSL;
-			$s = IO::Socket::SSL->start_SSL($s) or
+			$s = IO::Socket::SSL->start_SSL($s) or die
 				"E: <$uri> ".(IO::Socket::SSL->errstr // '');
 		}
 		$mic_arg{Socket} = $s;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 9/9] lei up: automatically use dt: for remote externals
  2021-09-18  9:33 ` [PATCH 9/9] lei up: automatically use dt: for remote externals Eric Wong
@ 2021-09-18 14:31   ` Kyle Meyer
  2021-09-18 20:30     ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Kyle Meyer @ 2021-09-18 14:31 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> diff --git a/Documentation/lei-up.pod b/Documentation/lei-up.pod
> index e5d97f43..4dc6d3b8 100644
> --- a/Documentation/lei-up.pod
> +++ b/Documentation/lei-up.pod
> @@ -22,6 +22,21 @@ C<--all> updates all saved searches (listed in L<lei-ls-search(1)>).
>  C<--all=local> only updates local mailboxes, C<--all=remote> only
>  updates remote mailboxes (currently C<imap://> and C<imaps://>).
>  
> +=item --remote-fudge-time=INTERVAL
> +
> +Look for mail older than the time of the last successful query.
> +Using a small interval will reduce bandwidth use.  A larger
> +interval reduces the likelyhood missing a result due to MTA
> +delays or downtime.

s/likelyhood/likelihood of/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 9/9] lei up: automatically use dt: for remote externals
  2021-09-18 14:31   ` Kyle Meyer
@ 2021-09-18 20:30     ` Eric Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2021-09-18 20:30 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> > +interval reduces the likelyhood missing a result due to MTA
> > +delays or downtime.
> 
> s/likelyhood/likelihood of/

Thanks, pushed as commit f3d0e746c6a35c8600b91af99958a52cbc114a4b

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-09-18 20:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-18  9:33 [PATCH 0/9] lei: a bunch of random stuff Eric Wong
2021-09-18  9:33 ` [PATCH 1/9] lei: lock worker counts Eric Wong
2021-09-18  9:33 ` [PATCH 2/9] lei_mail_sync: rely on flock(2), avoid IPC Eric Wong
2021-09-18  9:33 ` [PATCH 3/9] lei_mail_sync: set nodatacow on btrfs Eric Wong
2021-09-18  9:33 ` [PATCH 4/9] ds: support add unique timers Eric Wong
2021-09-18  9:33 ` [PATCH 5/9] net_reader: tie SocksDebug to {imap,nntp}.Debug Eric Wong
2021-09-18  9:33 ` [PATCH 6/9] net_reader: detect IMAP failures earlier Eric Wong
2021-09-18  9:33 ` [PATCH 7/9] net_reader: support imaps:// w/ socks5h:// proxy Eric Wong
2021-09-18 10:41   ` Eric Wong
2021-09-18  9:33 ` [PATCH 8/9] net_reader: set SO_KEEPALIVE on all Net::NNTP sockets Eric Wong
2021-09-18  9:33 ` [PATCH 9/9] lei up: automatically use dt: for remote externals Eric Wong
2021-09-18 14:31   ` Kyle Meyer
2021-09-18 20:30     ` Eric Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).