unofficial mirror of meta@public-inbox.org
* [PATCH v2 0/6] lei refresh-mail-sync: another try...
@ 2021-09-17  1:56 Eric Wong
  2021-09-17  1:56 ` [PATCH v2 1/6] lei refresh-mail-sync: replace prune-mail-sync Eric Wong
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

OK, so a bunch of things were broken/annoying in my initial
patch @ <20210916094116.11457-3-e@80x24.org>

The bug fixed by 2/6 was only exposed in real-world usage with
giant folders, and a testcase is still pending for that...

Eric Wong (6):
  lei refresh-mail-sync: replace prune-mail-sync
  lei_mail_sync: don't hold statement handle into callback
  lei refresh-mail-sync: remove "gone" notices
  lei refresh-mail-sync: drop unused {verify} code path
  lei refresh-mail-sync: implicitly remove missing folders
  lei refresh-mail-sync: drop old IMAP folder info

 MANIFEST                                      |   3 +-
 lib/PublicInbox/LEI.pm                        |   3 +-
 lib/PublicInbox/LeiInput.pm                   |  11 +-
 lib/PublicInbox/LeiMailSync.pm                |  41 ++++--
 ...PruneMailSync.pm => LeiRefreshMailSync.pm} |  79 +++++-----
 lib/PublicInbox/LeiStore.pm                   |   5 +
 t/lei-export-kw.t                             |   1 -
 t/lei-refresh-mail-sync.t                     | 137 ++++++++++++++++++
 8 files changed, 231 insertions(+), 49 deletions(-)
 rename lib/PublicInbox/{LeiPruneMailSync.pm => LeiRefreshMailSync.pm} (52%)
 create mode 100644 t/lei-refresh-mail-sync.t


* [PATCH v2 1/6] lei refresh-mail-sync: replace prune-mail-sync
  2021-09-17  1:56 [PATCH v2 0/6] lei refresh-mail-sync: another try Eric Wong
@ 2021-09-17  1:56 ` Eric Wong
  2021-09-17  1:56 ` [PATCH v2 2/6] lei_mail_sync: don't hold statement handle into callback Eric Wong
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

Merely pruning mail synchronization information was
insufficient for Maildir: renames are common in Maildir
and we need to detect them after-the-fact when lei-daemon
isn't running.

Running this command could make "lei index" far more
useful...

v2: close R/O mail_sync.sqlite3 dbh before fork
  Keeping the DB file handle open across fork can cause bad things
  to happen even if the child never uses it: sqlite3 itself still
  knows about the handle, but has no way to know that the Perl code
  does not.
---
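A minimal sketch of the missed-rename detection described above, for
illustration only.  It assumes a plain hash in place of
mail_sync.sqlite3 and a SHA-1 of the raw bytes in place of lei's git
blob OID; split_maildir_path and note_unknown are hypothetical names,
and only the path regex is taken from pmdir_cb in this patch.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1);

# split "/path/to/folder/{new,cur}/basename" into (folder, basename),
# using the same pattern as pmdir_cb
sub split_maildir_path {
	my ($f) = @_;
	my ($folder, $bn) = ($f =~ m!\A(.+?)/(?:new|cur)/([^/]+)\z!) or
		return;
	("maildir:$folder", $bn);
}

# %$known maps "folder\0basename" => digest (stand-in for the sync DB)
# returns the digest if a new location was recorded, undef otherwise
sub note_unknown {
	my ($known, $f, $raw) = @_;
	my ($folder, $bn) = split_maildir_path($f) or return;
	my $key = "$folder\0$bn";
	return if exists $known->{$key}; # already tracked, skip hashing
	$known->{$key} = sha1($raw);
}
```

After a rename from cur/123:2,S to new/123, a scan records the new
name under the same digest; the stale cur/ entry is left for the
prune pass (prune_mdir) to clear.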
 MANIFEST                                      |  3 +-
 lib/PublicInbox/LEI.pm                        |  3 +-
 ...PruneMailSync.pm => LeiRefreshMailSync.pm} | 54 +++++++++------
 lib/PublicInbox/LeiStore.pm                   |  5 ++
 t/lei-export-kw.t                             |  1 -
 t/lei-refresh-mail-sync.t                     | 67 +++++++++++++++++++
 6 files changed, 111 insertions(+), 22 deletions(-)
 rename lib/PublicInbox/{LeiPruneMailSync.pm => LeiRefreshMailSync.pm} (62%)
 create mode 100644 t/lei-refresh-mail-sync.t

diff --git a/MANIFEST b/MANIFEST
index 640eabd1..9f11f2f9 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -235,9 +235,9 @@ lib/PublicInbox/LeiNoteEvent.pm
 lib/PublicInbox/LeiOverview.pm
 lib/PublicInbox/LeiP2q.pm
 lib/PublicInbox/LeiPmdir.pm
-lib/PublicInbox/LeiPruneMailSync.pm
 lib/PublicInbox/LeiQuery.pm
 lib/PublicInbox/LeiRediff.pm
+lib/PublicInbox/LeiRefreshMailSync.pm
 lib/PublicInbox/LeiRemote.pm
 lib/PublicInbox/LeiRm.pm
 lib/PublicInbox/LeiRmWatch.pm
@@ -450,6 +450,7 @@ t/lei-q-kw.t
 t/lei-q-remote-import.t
 t/lei-q-save.t
 t/lei-q-thread.t
+t/lei-refresh-mail-sync.t
 t/lei-sigpipe.t
 t/lei-tag.t
 t/lei-up.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index ec103231..9794497b 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -263,7 +263,7 @@ our %CMD = ( # sorted in order of importance/use:
 	@net_opt, @c_opt ],
 'forget-mail-sync' => [ 'LOCATION...',
 	'forget sync information for a mail folder', @c_opt ],
-'prune-mail-sync' => [ 'LOCATION...|--all',
+'refresh-mail-sync' => [ 'LOCATION...|--all',
 	'prune dangling sync data for a mail folder', 'all:s', @c_opt ],
 'export-kw' => [ 'LOCATION...|--all',
 	'one-time export of keywords of sync sources',
@@ -616,6 +616,7 @@ sub pkt_ops {
 	$ops->{x_it} = [ \&x_it, $lei ];
 	$ops->{child_error} = [ \&child_error, $lei ];
 	$ops->{incr} = [ \&incr, $lei ];
+	$ops->{sto_done_request} = [ \&sto_done_request, $lei, $lei->{sock} ];
 	$ops;
 }
 
diff --git a/lib/PublicInbox/LeiPruneMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
similarity index 62%
rename from lib/PublicInbox/LeiPruneMailSync.pm
rename to lib/PublicInbox/LeiRefreshMailSync.pm
index 3678bd04..3c083965 100644
--- a/lib/PublicInbox/LeiPruneMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -1,16 +1,20 @@
 # Copyright (C) 2021 all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
-# "lei prune-mail-sync" drops dangling sync information
-package PublicInbox::LeiPruneMailSync;
+# "lei refresh-mail-sync" drops dangling sync information
+# and attempts to detect moved files
+package PublicInbox::LeiRefreshMailSync;
 use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
 use PublicInbox::LeiExportKw;
 use PublicInbox::InboxWritable qw(eml_from_path);
+use PublicInbox::ContentHash qw(git_sha);
+use PublicInbox::Import;
 
 sub eml_match ($$) {
 	my ($eml, $oidbin) = @_;
+	$eml->header_set($_) for @PublicInbox::Import::UNWANTED_HEADERS;
 	$oidbin eq git_sha(length($oidbin) == 20 ? 1 : 256, $eml)->digest;
 }
 
@@ -20,7 +24,7 @@ sub prune_mdir { # lms->each_src callback
 	for my $d (@try) {
 		my $src = "$mdir/$d/$$id";
 		if ($self->{verify}) {
-			my $eml = eml_from_path($src) or next;
+			my $eml = eml_from_path($src) // next;
 			return if eml_match($eml, $oidbin);
 		} elsif (-f $src) {
 			return;
@@ -38,12 +42,27 @@ sub prune_imap { # lms->each_src callback
 	$self->{lei}->{sto}->ipc_do('lms_clear_src', $url, $uid);
 }
 
+# detects missed file moves
+sub pmdir_cb { # called via LeiPmdir->each_mdir_fn
+	my ($self, $f, $fl) = @_;
+	my ($folder, $bn) = ($f =~ m!\A(.+?)/(?:new|cur)/([^/]+)\z!) or
+		die "BUG: $f was not from a Maildir?";
+	substr($folder, 0, 0) = 'maildir:'; # add prefix
+	my $lms = $self->{-lms_ro} //= $self->{lei}->lms;
+	return if defined($lms->name_oidbin($folder, $bn));
+	my $eml = eml_from_path($f) // return;
+	my $oidbin = $self->{lei}->git_oid($eml)->digest;
+	$self->{lei}->{sto}->ipc_do('lms_set_src', $oidbin, $folder, \$bn);
+}
+
 sub input_path_url { # overrides PublicInbox::LeiInput::input_path_url
 	my ($self, $input, @args) = @_;
 	my $lms = $self->{-lms_ro} //= $self->{lei}->lms;
 	if ($input =~ /\Amaildir:(.+)/i) {
-		my $mdir = $1;
-		$lms->each_src($input, \&prune_mdir, $self, $mdir);
+		$lms->each_src($input, \&prune_mdir, $self, my $mdir = $1);
+		$self->{lse} //= $self->{lei}->{sto}->search;
+		# call pmdir_cb (via maildir_each_file -> each_mdir_fn)
+		PublicInbox::LeiInput::input_path_url($self, $input);
 	} elsif ($input =~ m!\Aimaps?://!i) {
 		my $uri = PublicInbox::URIimap->new($input);
 		my $mic = $self->{lei}->{net}->mic_for_folder($uri);
@@ -51,34 +70,31 @@ sub input_path_url { # overrides PublicInbox::LeiInput::input_path_url
 		$uids = +{ map { $_ => undef } @$uids };
 		$lms->each_src($$uri, \&prune_imap, $self, $uids, $$uri);
 	} else { die "BUG: $input not supported" }
-	my $wait = $self->{lei}->{sto}->ipc_do('done');
+	$self->{lei}->{pkt_op_p}->pkt_do('sto_done_request');
 }
 
-sub lei_prune_mail_sync {
+sub lei_refresh_mail_sync {
 	my ($lei, @folders) = @_;
 	my $sto = $lei->_lei_store or return $lei->fail(<<EOM);
 lei/store uninitialized, see lei-import(1)
 EOM
-	if (my $lms = $lei->lms) {
-		if (defined(my $all = $lei->{opt}->{all})) {
-			$lms->group2folders($lei, $all, \@folders) or return;
-		} else {
-			my $err = $lms->arg2folder($lei, \@folders);
-			$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
-			return $lei->fail($err->{fail}) if $err->{fail};
-		}
-	} else {
-		return $lei->fail(<<EOM);
+	my $lms = $lei->lms or return $lei->fail(<<EOM);
 lei mail_sync.sqlite3 uninitialized, see lei-import(1)
 EOM
+	if (defined(my $all = $lei->{opt}->{all})) {
+		$lms->group2folders($lei, $all, \@folders) or return;
+	} else {
+		my $err = $lms->arg2folder($lei, \@folders);
+		$lei->qerr(@{$err->{qerr}}) if $err->{qerr};
+		return $lei->fail($err->{fail}) if $err->{fail};
 	}
+	undef $lms; # must be done before fork
 	$sto->write_prepare($lei);
 	my $self = bless { missing_ok => 1 }, __PACKAGE__;
 	$lei->{opt}->{'mail-sync'} = 1; # for prepare_inputs
 	$self->prepare_inputs($lei, \@folders) or return;
 	my $j = $lei->{opt}->{jobs} || scalar(@{$self->{inputs}}) || 1;
 	my $ops = {};
-	$sto->write_prepare($lei);
 	$lei->{auth}->op_merge($ops, $self) if $lei->{auth};
 	$self->{-wq_nr_workers} = $j // 1; # locked
 	(my $op_c, $ops) = $lei->workers_start($self, $j, $ops);
@@ -89,7 +105,7 @@ EOM
 }
 
 no warnings 'once';
-*_complete_prune_mail_sync = \&PublicInbox::LeiExportKw::_complete_export_kw;
+*_complete_refresh_mail_sync = \&PublicInbox::LeiExportKw::_complete_export_kw;
 *ipc_atfork_child = \&PublicInbox::LeiInput::input_only_atfork_child;
 *net_merge_all_done = \&PublicInbox::LeiInput::input_only_net_merge_all_done;
 
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index e8bcb04e..32f55abd 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -293,6 +293,11 @@ sub set_sync_info {
 	_lms_rw($self)->set_src(pack('H*', $oidhex), $folder, $id);
 }
 
+sub lms_set_src {
+	my ($self, $oidbin, $folder, $id) = @_;
+	_lms_rw($self)->set_src($oidbin, $folder, $id);
+}
+
 sub _remove_if_local { # git->cat_async arg
 	my ($bref, $oidhex, $type, $size, $self) = @_;
 	$self->{im}->remove($bref) if $bref;
diff --git a/t/lei-export-kw.t b/t/lei-export-kw.t
index 9531949a..1fe940bb 100644
--- a/t/lei-export-kw.t
+++ b/t/lei-export-kw.t
@@ -6,7 +6,6 @@ use File::Copy qw(cp);
 use File::Path qw(make_path);
 require_mods(qw(lei -imapd Mail::IMAPClient));
 my ($tmpdir, $for_destroy) = tmpdir;
-my ($ro_home, $cfg_path) = setup_public_inboxes;
 my $expect = eml_load('t/data/0001.patch');
 test_lei({ tmpdir => $tmpdir }, sub {
 	my $home = $ENV{HOME};
diff --git a/t/lei-refresh-mail-sync.t b/t/lei-refresh-mail-sync.t
new file mode 100644
index 00000000..ff558277
--- /dev/null
+++ b/t/lei-refresh-mail-sync.t
@@ -0,0 +1,67 @@
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_mods(qw(lei));
+
+my $stop_daemon = sub { # needed since we don't have inotify
+	lei_ok qw(daemon-pid);
+	chomp(my $pid = $lei_out);
+	$pid > 0 or xbail "bad pid: $pid";
+	kill('TERM', $pid) or xbail "kill: $!";
+	for (0..10) {
+		tick;
+		kill(0, $pid) or last;
+	}
+	kill(0, $pid) and xbail "daemon still running (PID:$pid)";
+};
+
+test_lei({ daemon_only => 1 }, sub {
+	my $d = "$ENV{HOME}/d";
+	my ($ro_home, $cfg_path) = setup_public_inboxes;
+	lei_ok qw(daemon-pid);
+	lei_ok qw(add-external), "$ro_home/t2";
+	lei_ok qw(q mid:testmessage@example.com -o), "Maildir:$d";
+	my (@o) = glob("$d/*/*");
+	scalar(@o) == 1 or xbail('multiple results', \@o);
+	my ($bn0) = ($o[0] =~ m!/([^/]+)\z!);
+
+	my $oid = '9bf1002c49eb075df47247b74d69bcd555e23422';
+	lei_ok 'inspect', "blob:$oid";
+	my $before = json_utf8->decode($lei_out);
+	my $exp0 = { 'mail-sync' => { "maildir:$d" => [ $bn0 ] } };
+	is_deeply($before, $exp0, 'inspect shows expected');
+
+	$stop_daemon->();
+	my $dst = $o[0];
+	$dst =~ s/:2,.*\z// and $dst =~ s!/cur/!/new/! and
+		rename($o[0], $dst) or xbail "rename($o[0] => $dst): $!";
+
+	lei_ok 'inspect', "blob:$oid";
+	is_deeply(json_utf8->decode($lei_out),
+		$before, 'inspect unchanged immediately after restart');
+	lei_ok 'refresh-mail-sync', '--all';
+	lei_ok 'inspect', "blob:$oid";
+	my ($bn1) = ($dst =~ m!/([^/]+)\z!);
+	my $exp1 = { 'mail-sync' => { "maildir:$d" => [ $bn1 ] } };
+	is_deeply(json_utf8->decode($lei_out), $exp1,
+		'refresh-mail-sync updated location');
+
+	$stop_daemon->();
+	rename($dst, "$d/unwatched") or xbail "rename $dst out-of-the-way $!";
+
+	lei_ok 'refresh-mail-sync', $d;
+	lei_ok 'inspect', "blob:$oid";
+	is($lei_out, '{}', 'no known locations after "removal"');
+	lei_ok 'refresh-mail-sync', "Maildir:$d";
+
+	$stop_daemon->();
+	rename("$d/unwatched", $dst) or xbail "rename $dst back";
+
+	lei_ok 'refresh-mail-sync', "Maildir:$d";
+	lei_ok 'inspect', "blob:$oid";
+	is_deeply(json_utf8->decode($lei_out), $exp1,
+		'replaced file noted again');
+});
+
+done_testing;


* [PATCH v2 2/6] lei_mail_sync: don't hold statement handle into callback
  2021-09-17  1:56 [PATCH v2 0/6] lei refresh-mail-sync: another try Eric Wong
  2021-09-17  1:56 ` [PATCH v2 1/6] lei refresh-mail-sync: replace prune-mail-sync Eric Wong
@ 2021-09-17  1:56 ` Eric Wong
  2021-09-17  1:56 ` [PATCH v2 3/6] lei refresh-mail-sync: remove "gone" notices Eric Wong
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

This can cause readers and writers to conflict since the
implicit transaction from SELECT in a LeiRefreshMailSync
worker would block the LeiStore process.
---
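A rough sketch of the batching idea, for illustration only.  To stay
self-contained it simulates the table with an in-memory hash instead
of going through DBI; select_batch and each_row_batched are
hypothetical names, and the SQL in the comment mirrors the query in
this patch.

```perl
use strict;
use warnings;

# stand-in for: SELECT _rowid_,oidbin,uid FROM blob2num
#   WHERE fid = ? AND _rowid_ > ? ORDER BY _rowid_ ASC LIMIT ?
# %$table maps rowid => [ oidbin, uid ]
sub select_batch {
	my ($table, $min, $limit) = @_;
	my @ids = sort { $a <=> $b } grep { $_ > $min } keys %$table;
	splice(@ids, $limit) if @ids > $limit;
	[ map { [ $_, @{$table->{$_}} ] } @ids ];
}

# each batch is fully fetched before any callback runs, so no
# cursor (and no read transaction) stays open while $cb works
sub each_row_batched {
	my ($table, $limit, $cb, @args) = @_;
	my $min = 0; # assumes auto-assigned (positive) rowids
	while (1) {
		my $ary = select_batch($table, $min, $limit);
		@$ary or last;
		$cb->($_->[1], $_->[2], @args) for @$ary;
		$min = $ary->[-1]->[0]; # resume after the last rowid seen
	}
}
```

Resuming from the last rowid seen (rather than using OFFSET) keeps the
iteration stable even if the callback, or another process, deletes
rows that were already visited.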
 lib/PublicInbox/LeiMailSync.pm | 41 ++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 5a10c127..8f584ccb 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -169,22 +169,45 @@ INSERT OR IGNORE INTO blob2name (oidbin, fid, name) VALUES (?, ?, ?)
 sub each_src {
 	my ($self, $folder, $cb, @args) = @_;
 	my $dbh = $self->{dbh} //= dbh_new($self);
-	my ($fid, $sth);
+	my $fid;
 	if (ref($folder) eq 'HASH') {
 		$fid = $folder->{fid} // die "BUG: no `fid'";
 	} else {
 		$fid = $self->{fmap}->{$folder} //=
 			fid_for($self, $folder) // return;
 	}
-	$sth = $dbh->prepare('SELECT oidbin,uid FROM blob2num WHERE fid = ?');
-	$sth->execute($fid);
-	while (my ($oidbin, $id) = $sth->fetchrow_array) {
-		$cb->($oidbin, $id, @args);
+
+	# minimize implicit txn time to avoid blocking writers by
+	# batching SELECTs.  This looks wonky but is necessary since
+	# $cb-> may access the DB on its own.
+	my $ary = $dbh->selectall_arrayref(<<'', undef, $fid);
+SELECT _rowid_,oidbin,uid FROM blob2num WHERE fid = ?
+ORDER BY _rowid_ ASC LIMIT 1000
+
+	my $min = @$ary ? $ary->[-1]->[0] : undef;
+	while (defined $min) {
+		for my $row (@$ary) { $cb->($row->[1], $row->[2], @args) }
+
+		$ary = $dbh->selectall_arrayref(<<'', undef, $fid, $min);
+SELECT _rowid_,oidbin,uid FROM blob2num WHERE fid = ? AND _rowid_ > ?
+ORDER BY _rowid_ ASC LIMIT 1000
+
+		$min = @$ary ? $ary->[-1]->[0] : undef;
 	}
-	$sth = $dbh->prepare('SELECT oidbin,name FROM blob2name WHERE fid = ?');
-	$sth->execute($fid);
-	while (my ($oidbin, $id) = $sth->fetchrow_array) {
-		$cb->($oidbin, \$id, @args);
+
+	$ary = $dbh->selectall_arrayref(<<'', undef, $fid);
+SELECT _rowid_,oidbin,name FROM blob2name WHERE fid = ?
+ORDER BY _rowid_ ASC LIMIT 1000
+
+	$min = @$ary ? $ary->[-1]->[0] : undef;
+	while (defined $min) {
+		for my $row (@$ary) { $cb->($row->[1], \($row->[2]), @args) }
+
+		$ary = $dbh->selectall_arrayref(<<'', undef, $fid, $min);
+SELECT _rowid_,oidbin,name FROM blob2name WHERE fid = ? AND _rowid_ > ?
+ORDER BY _rowid_ ASC LIMIT 1000
+
+		$min = @$ary ? $ary->[-1]->[0] : undef;
 	}
 }
 


* [PATCH v2 3/6] lei refresh-mail-sync: remove "gone" notices
  2021-09-17  1:56 [PATCH v2 0/6] lei refresh-mail-sync: another try Eric Wong
  2021-09-17  1:56 ` [PATCH v2 1/6] lei refresh-mail-sync: replace prune-mail-sync Eric Wong
  2021-09-17  1:56 ` [PATCH v2 2/6] lei_mail_sync: don't hold statement handle into callback Eric Wong
@ 2021-09-17  1:56 ` Eric Wong
  2021-09-17  1:56 ` [PATCH v2 4/6] lei refresh-mail-sync: drop unused {verify} code path Eric Wong
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

Those stderr messages are not useful at all, and the noise they
cause is harmful.
---
 lib/PublicInbox/LeiRefreshMailSync.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 3c083965..71fc348c 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -31,14 +31,12 @@ sub prune_mdir { # lms->each_src callback
 		}
 	}
 	# both tries failed
-	$self->{lei}->qerr("# maildir:$mdir $$id gone");
 	$self->{lei}->{sto}->ipc_do('lms_clear_src', "maildir:$mdir", $id);
 }
 
 sub prune_imap { # lms->each_src callback
 	my ($oidbin, $uid, $self, $uids, $url) = @_;
 	return if exists $uids->{$uid};
-	$self->{lei}->qerr("# $url $uid gone");
 	$self->{lei}->{sto}->ipc_do('lms_clear_src', $url, $uid);
 }
 


* [PATCH v2 4/6] lei refresh-mail-sync: drop unused {verify} code path
  2021-09-17  1:56 [PATCH v2 0/6] lei refresh-mail-sync: another try Eric Wong
                   ` (2 preceding siblings ...)
  2021-09-17  1:56 ` [PATCH v2 3/6] lei refresh-mail-sync: remove "gone" notices Eric Wong
@ 2021-09-17  1:56 ` Eric Wong
  2021-09-17  1:56 ` [PATCH v2 5/6] lei refresh-mail-sync: implicitly remove missing folders Eric Wong
  2021-09-17  1:56 ` [PATCH v2 6/6] lei refresh-mail-sync: drop old IMAP folder info Eric Wong
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

That option was never wired up, and probably not needed...
---
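A standalone sketch of the existence check that remains once {verify}
is gone, for illustration only.  mdir_has is a hypothetical name; the
lookup order follows prune_mdir in this patch, and the temporary
Maildir exists only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# returns true if $bn still exists under $mdir; names carrying an
# info suffix (":2,FLAGS") are looked up in cur/ first, others in new/
sub mdir_has {
	my ($mdir, $bn) = @_;
	my @try = $bn =~ /:2,[a-zA-Z]*\z/ ? qw(cur new) : qw(new cur);
	for (@try) { return 1 if -f "$mdir/$_/$bn" }
	0;
}

my $mdir = tempdir(CLEANUP => 1);
mkdir "$mdir/$_" or die "mkdir: $!" for qw(new cur);
open(my $fh, '>', "$mdir/new/123") or die "open: $!";
close $fh or die "close: $!";
print mdir_has($mdir, '123') ? "kept\n" : "pruned\n";     # kept
print mdir_has($mdir, '123:2,S') ? "kept\n" : "pruned\n"; # pruned
```

Only the name is checked, not the content, so a different message
stored under a previously known name would not be noticed.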
 lib/PublicInbox/LeiRefreshMailSync.pm | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 71fc348c..4cae1536 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -9,27 +9,12 @@ use v5.10.1;
 use parent qw(PublicInbox::IPC PublicInbox::LeiInput);
 use PublicInbox::LeiExportKw;
 use PublicInbox::InboxWritable qw(eml_from_path);
-use PublicInbox::ContentHash qw(git_sha);
 use PublicInbox::Import;
 
-sub eml_match ($$) {
-	my ($eml, $oidbin) = @_;
-	$eml->header_set($_) for @PublicInbox::Import::UNWANTED_HEADERS;
-	$oidbin eq git_sha(length($oidbin) == 20 ? 1 : 256, $eml)->digest;
-}
-
 sub prune_mdir { # lms->each_src callback
 	my ($oidbin, $id, $self, $mdir) = @_;
 	my @try = $$id =~ /:2,[a-zA-Z]*\z/ ? qw(cur new) : qw(new cur);
-	for my $d (@try) {
-		my $src = "$mdir/$d/$$id";
-		if ($self->{verify}) {
-			my $eml = eml_from_path($src) // next;
-			return if eml_match($eml, $oidbin);
-		} elsif (-f $src) {
-			return;
-		}
-	}
+	for (@try) { return if -f "$mdir/$_/$$id" }
 	# both tries failed
 	$self->{lei}->{sto}->ipc_do('lms_clear_src', "maildir:$mdir", $id);
 }


* [PATCH v2 5/6] lei refresh-mail-sync: implicitly remove missing folders
  2021-09-17  1:56 [PATCH v2 0/6] lei refresh-mail-sync: another try Eric Wong
                   ` (3 preceding siblings ...)
  2021-09-17  1:56 ` [PATCH v2 4/6] lei refresh-mail-sync: drop unused {verify} code path Eric Wong
@ 2021-09-17  1:56 ` Eric Wong
  2021-09-17  1:56 ` [PATCH v2 6/6] lei refresh-mail-sync: drop old IMAP folder info Eric Wong
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

There's no point in keeping mail_sync.sqlite3 entries around
if the folder is gone.  We do keep saved-search configs around,
however, since somebody may decide to blow away a search and
start over.
---
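A simplified sketch of the missing_ok / folder_missing hook, for
illustration only.  My::Input and My::Refresh are hypothetical
stand-ins for LeiInput and LeiRefreshMailSync, and the {forgotten}
array replaces the real IPC request to lei/store.

```perl
use strict;
use warnings;

package My::Input; # hypothetical stand-in for PublicInbox::LeiInput
sub new { my ($class, %opt) = @_; bless { %opt }, $class }

sub input_path {
	my ($self, $path) = @_;
	if (-e $path) {
		return "read $path";
	} elsif ($self->{missing_ok}) { # don't fail, let the subclass decide
		return $self->folder_missing($path);
	}
	die "$path does not exist\n";
}

# subclasses which set missing_ok are expected to override this
sub folder_missing { die "BUG: ->folder_missing undefined for $_[0]" }

package My::Refresh; # hypothetical stand-in for LeiRefreshMailSync
our @ISA = qw(My::Input);

sub folder_missing {
	my ($self, $folder) = @_;
	push @{$self->{forgotten}}, $folder; # real code asks lei/store
	"forgot $folder";
}

package main;
my $r = My::Refresh->new(missing_ok => 1);
print $r->input_path('/nonexistent/maildir'), "\n";
```

Keeping the default folder_missing fatal means a caller which sets
missing_ok without providing an override fails loudly instead of
silently ignoring a missing folder.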
 lib/PublicInbox/LeiInput.pm           | 11 ++++++++++-
 lib/PublicInbox/LeiRefreshMailSync.pm |  5 +++++
 t/lei-refresh-mail-sync.t             | 10 ++++++++++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/LeiInput.pm b/lib/PublicInbox/LeiInput.pm
index 8ce445c8..372e0fe1 100644
--- a/lib/PublicInbox/LeiInput.pm
+++ b/lib/PublicInbox/LeiInput.pm
@@ -124,7 +124,11 @@ sub input_path_url {
 		handle_http_input($self, $input, @args);
 		return;
 	}
+
+	# local-only below
+	my $ifmt_pfx = '';
 	if ($input =~ s!\A([a-z0-9]+):!!i) {
+		$ifmt_pfx = "$1:";
 		$ifmt = lc($1);
 	} elsif ($input =~ /\.(?:patch|eml)\z/i) {
 		$ifmt = 'eml';
@@ -172,11 +176,16 @@ EOM
 						$self->can('input_maildir_cb'),
 						$self, @args);
 		}
+	} elsif ($self->{missing_ok} && !-e $input) { # don't ->fail
+		$self->folder_missing("$ifmt:$input");
 	} else {
-		$lei->fail("$input unsupported (TODO)");
+		$lei->fail("$ifmt_pfx$input unsupported (TODO)");
 	}
 }
 
+# subclasses should override this (see LeiRefreshMailSync)
+sub folder_missing { die "BUG: ->folder_missing undefined for $_[0]" }
+
 sub bad_http ($$;$) {
 	my ($lei, $url, $alt) = @_;
 	my $x = $alt ? "did you mean <$alt>?" : 'download and import manually';
diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 4cae1536..19f64b58 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -11,6 +11,11 @@ use PublicInbox::LeiExportKw;
 use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::Import;
 
+sub folder_missing {
+	my ($self, $folder) = @_;
+	$self->{lei}->{sto}->ipc_do('lms_forget_folders', $folder);
+}
+
 sub prune_mdir { # lms->each_src callback
 	my ($oidbin, $id, $self, $mdir) = @_;
 	my @try = $$id =~ /:2,[a-zA-Z]*\z/ ? qw(cur new) : qw(new cur);
diff --git a/t/lei-refresh-mail-sync.t b/t/lei-refresh-mail-sync.t
index ff558277..d3438011 100644
--- a/t/lei-refresh-mail-sync.t
+++ b/t/lei-refresh-mail-sync.t
@@ -3,6 +3,7 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 require_mods(qw(lei));
+use File::Path qw(remove_tree);
 
 my $stop_daemon = sub { # needed since we don't have inotify
 	lei_ok qw(daemon-pid);
@@ -62,6 +63,15 @@ test_lei({ daemon_only => 1 }, sub {
 	lei_ok 'inspect', "blob:$oid";
 	is_deeply(json_utf8->decode($lei_out), $exp1,
 		'replaced file noted again');
+
+	$stop_daemon->();
+
+	remove_tree($d);
+	lei_ok 'refresh-mail-sync', '--all';
+	lei_ok 'inspect', "blob:$oid";
+	is($lei_out, '{}', 'no known locations after "removal"');
+	lei_ok 'ls-mail-sync';
+	is($lei_out, '', 'no sync left when folder is gone');
 });
 
 done_testing;


* [PATCH v2 6/6] lei refresh-mail-sync: drop old IMAP folder info
  2021-09-17  1:56 [PATCH v2 0/6] lei refresh-mail-sync: another try Eric Wong
                   ` (4 preceding siblings ...)
  2021-09-17  1:56 ` [PATCH v2 5/6] lei refresh-mail-sync: implicitly remove missing folders Eric Wong
@ 2021-09-17  1:56 ` Eric Wong
  5 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2021-09-17  1:56 UTC (permalink / raw)
  To: meta

Like with Maildir, IMAP folders can be deleted entirely.
Ensure they can be eliminated, but don't be fooled into
removing them if they're temporarily unreachable.
---
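A sketch of the "deleted" vs "temporarily unreachable" distinction,
for illustration only.  The $open_folder callback and its three
outcomes are assumptions made for this example and do not describe
how mic_for_folder reports errors; the %$sync hash stands in for
mail_sync.sqlite3.

```perl
use strict;
use warnings;
use v5.10.1;

# $open_folder is a hypothetical callback: it dies if the server cannot
# be reached, returns undef if the server answered but the folder is
# gone, and returns an array ref of UIDs otherwise
# %$sync maps folder URL => { uid => oid }
sub refresh_imap {
	my ($sync, $url, $open_folder) = @_;
	my $uids = eval { $open_folder->($url) };
	return 0 if $@; # unreachable: keep everything, report failure
	if (!defined $uids) { # folder deleted on a live server
		delete $sync->{$url};
		return 1;
	}
	my %live = map { $_ => undef } @$uids;
	my $known = $sync->{$url} // {};
	# drop only the UIDs the server no longer lists
	delete @$known{ grep { !exists $live{$_} } keys %$known };
	1;
}
```

The point is that a connection failure must never be treated as proof
that the folder is gone, which matches the "no changes when server
was down" test in this patch.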
 lib/PublicInbox/LeiRefreshMailSync.pm | 11 +++--
 t/lei-refresh-mail-sync.t             | 60 +++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/LeiRefreshMailSync.pm b/lib/PublicInbox/LeiRefreshMailSync.pm
index 19f64b58..09a7ead0 100644
--- a/lib/PublicInbox/LeiRefreshMailSync.pm
+++ b/lib/PublicInbox/LeiRefreshMailSync.pm
@@ -53,10 +53,13 @@ sub input_path_url { # overrides PublicInbox::LeiInput::input_path_url
 		PublicInbox::LeiInput::input_path_url($self, $input);
 	} elsif ($input =~ m!\Aimaps?://!i) {
 		my $uri = PublicInbox::URIimap->new($input);
-		my $mic = $self->{lei}->{net}->mic_for_folder($uri);
-		my $uids = $mic->search('UID 1:*');
-		$uids = +{ map { $_ => undef } @$uids };
-		$lms->each_src($$uri, \&prune_imap, $self, $uids, $$uri);
+		if (my $mic = $self->{lei}->{net}->mic_for_folder($uri)) {
+			my $uids = $mic->search('UID 1:*');
+			$uids = +{ map { $_ => undef } @$uids };
+			$lms->each_src($$uri, \&prune_imap, $self, $uids, $$uri)
+		} else {
+			$self->folder_missing($$uri);
+		}
 	} else { die "BUG: $input not supported" }
 	$self->{lei}->{pkt_op_p}->pkt_do('sto_done_request');
 }
diff --git a/t/lei-refresh-mail-sync.t b/t/lei-refresh-mail-sync.t
index d3438011..90356b57 100644
--- a/t/lei-refresh-mail-sync.t
+++ b/t/lei-refresh-mail-sync.t
@@ -72,6 +72,66 @@ test_lei({ daemon_only => 1 }, sub {
 	is($lei_out, '{}', 'no known locations after "removal"');
 	lei_ok 'ls-mail-sync';
 	is($lei_out, '', 'no sync left when folder is gone');
+
+SKIP: {
+	require_mods(qw(-imapd -nntpd Mail::IMAPClient Net::NNTP), 1);
+	require File::Copy; # stdlib
+	my $home = $ENV{HOME};
+	my $srv;
+	my $cfg_path2 = "$home/cfg2";
+	File::Copy::cp($cfg_path, $cfg_path2);
+	my $env = { PI_CONFIG => $cfg_path2 };
+	for my $x (qw(imapd)) {
+		my $s = tcp_server;
+		my $cmd = [ "-$x", '-W0', "--stdout=$home/$x.out",
+			"--stderr=$home/$x.err" ];
+		my $td = start_script($cmd, $env, { 3 => $s}) or xbail("-$x");
+		$srv->{$x} = {
+			addr => (my $scalar = tcp_host_port($s)),
+			td => $td,
+			cmd => $cmd,
+		};
+	}
+	my $url = "imap://$srv->{imapd}->{addr}/t.v1.0";
+	lei_ok 'import', $url, '+L:v1';
+	lei_ok 'inspect', "blob:$oid";
+	$before = json_utf8->decode($lei_out);
+	my @f = grep(m!\Aimap://;AUTH=ANONYMOUS\@\Q$srv->{imapd}->{addr}\E!,
+		keys %{$before->{'mail-sync'}});
+	is(scalar(@f), 1, 'got IMAP folder') or xbail(\@f);
+	xsys([qw(git config), '-f', $cfg_path2,
+		qw(--unset publicinbox.t1.newsgroup)]) and
+		xbail "git config $?";
+	$stop_daemon->(); # drop IMAP IDLE
+	$srv->{imapd}->{td}->kill('HUP');
+	tick; # wait for HUP
+	lei_ok 'refresh-mail-sync', $url;
+	lei_ok 'inspect', "blob:$oid";
+	my $after = json_utf8->decode($lei_out);
+	ok(!$after->{'mail-sync'}, 'no sync info for non-existent mailbox');
+	lei_ok 'ls-mail-sync';
+	unlike $lei_out, qr!^\Q$f[0]\E!, 'IMAP folder gone from mail_sync';
+
+	# simulate server downtime
+	$url = "imap://$srv->{imapd}->{addr}/t.v2.0";
+	lei_ok 'import', $url, '+L:v2';
+
+	lei_ok 'inspect', "blob:$oid";
+	$before = $lei_out;
+	delete $srv->{imapd}->{td}; # kill + join daemon
+
+	ok(!(lei 'refresh-mail-sync', $url), 'URL fails on dead -imapd');
+	ok(!(lei 'refresh-mail-sync', '--all'), '--all fails on dead -imapd');
+
+	# restart server (somewhat dangerous since we released the socket)
+	my $cmd = $srv->{imapd}->{cmd};
+	push @$cmd, '-l', $srv->{imapd}->{addr};
+	$srv->{imapd}->{td} = start_script($cmd, $env) or xbail "@$cmd";
+
+	lei_ok 'refresh-mail-sync', '--all';
+	lei_ok 'inspect', "blob:$oid";
+	is($lei_out, $before, 'no changes when server was down');
+}; # imapd+nntpd stuff
 });
 
 done_testing;

