unofficial mirror of meta@public-inbox.org
* [PATCH 0/5] no trash, glossary doc
@ 2021-03-10 13:23 Eric Wong
  2021-03-10 13:23 ` [PATCH 1/5] doc: technical/data_structures: update for EOFpipe Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

-watch on IMAP now matches Maildir behavior by skipping
trashed (deleted) and draft messages.

"lei import" now also ignores trashed (deleted) messages,
but still imports drafts.

The glossary is intended to reduce confusion as more
concepts overlap under different terminology.

Eric Wong (5):
  doc: technical/data_structures: update for EOFpipe
  watch: IMAP: ignore \Deleted and \Draft messages
  lei import: simplify Maildir handling
  lei import: skip trashed Maildir messages
  doc: start glossary for overlapping concepts

 Documentation/public-inbox-glossary.pod     | 95 +++++++++++++++++++++
 Documentation/technical/data_structures.txt | 10 +--
 Documentation/txt2pre                       |  1 +
 MANIFEST                                    |  1 +
 Makefile.PL                                 |  3 +-
 lib/PublicInbox/LeiImport.pm                |  8 +-
 lib/PublicInbox/LeiStore.pm                 |  6 --
 lib/PublicInbox/MdirReader.pm               |  1 +
 lib/PublicInbox/NetReader.pm                |  2 +
 lib/PublicInbox/Watch.pm                    | 26 +-----
 t/lei-import-maildir.t                      |  7 ++
 xt/net_writer-imap.t                        | 44 ++++++++++
 12 files changed, 165 insertions(+), 39 deletions(-)
 create mode 100644 Documentation/public-inbox-glossary.pod

* [PATCH 1/5] doc: technical/data_structures: update for EOFpipe
  2021-03-10 13:23 [PATCH 0/5] no trash, glossary doc Eric Wong
@ 2021-03-10 13:23 ` Eric Wong
  2021-03-10 13:23 ` [PATCH 2/5] watch: IMAP: ignore \Deleted and \Draft messages Eric Wong
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

ParentPipe no longer exists and was replaced by the more
flexible EOFpipe.
---
 Documentation/technical/data_structures.txt | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
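Note for readers: the EOF-on-pipe technique described above can be
sketched roughly as follows.  This is a minimal illustration using only
core Perl, not the PublicInbox::EOFpipe API; on_pipe_eof and
run_child_and_wait are hypothetical names, and the blocking select loop
stands in for the real event loop.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper (not the EOFpipe API): block until the read end
# of a pipe reports EOF, then run $cb.
sub on_pipe_eof {
	my ($rd, $cb) = @_;
	my $sel = IO::Select->new($rd);
	while (1) {
		$sel->can_read or next;
		my $n = sysread($rd, my $buf, 4096);
		die "sysread: $!" unless defined $n;
		return $cb->() if $n == 0; # EOF: every writer closed its end
	}
}

sub run_child_and_wait {
	pipe(my $rd, my $wr) or die "pipe: $!";
	my $pid = fork;
	defined($pid) or die "fork: $!";
	if ($pid == 0) { # child: keeps $wr open until it exits
		close $rd;
		exit 0;
	}
	close $wr; # parent must drop its copy or EOF never arrives
	my $seen_eof = 0;
	on_pipe_eof($rd, sub { $seen_eof = 1 });
	waitpid($pid, 0); # reap this specific child, not waitpid(-1, ...)
	($seen_eof, $? >> 8);
}
```

The parent learns of the exit from the pipe alone, so it only needs to
reap the one PID it already knows about.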

diff --git a/Documentation/technical/data_structures.txt b/Documentation/technical/data_structures.txt
index 8776a67b..4dcf9ce6 100644
--- a/Documentation/technical/data_structures.txt
+++ b/Documentation/technical/data_structures.txt
@@ -222,10 +222,8 @@ daemon classes
   given PublicInbox::Config which may be instantiated more than
   once in the future.
 
-* PublicInbox::ParentPipe
+* PublicInbox::EOFpipe
 
-  Per-worker process class to detect shutdown of master process.
-  This is not used if using -W0 to disable worker processes
-  in public-inbox-httpd or public-inbox-nntpd.
-
-  This is a per-worker singleton.
+  Used throughout to trigger a callback when a pipe(7) is closed.
+  This is frequently used to portably detect process exit without
+  relying on a catch-all waitpid(-1, ...) call.

* [PATCH 2/5] watch: IMAP: ignore \Deleted and \Draft messages
  2021-03-10 13:23 [PATCH 0/5] no trash, glossary doc Eric Wong
  2021-03-10 13:23 ` [PATCH 1/5] doc: technical/data_structures: update for EOFpipe Eric Wong
@ 2021-03-10 13:23 ` Eric Wong
  2021-03-10 13:23 ` [PATCH 3/5] lei import: simplify Maildir handling Eric Wong
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

This matches existing Maildir behavior, as trash and draft
messages have little reason to be exposed publicly.
---
 lib/PublicInbox/NetReader.pm |  2 ++
 lib/PublicInbox/Watch.pm     | 26 ++++-----------------
 xt/net_writer-imap.t         | 44 ++++++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 22 deletions(-)
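Note for readers: the filtering rules in this patch can be summarized
with the standalone sketch below.  It is an approximation under stated
assumptions: %flag2kw is an illustrative subset (the real %IMAPflags2kw
may differ), and imap_flags_to_kw, watch_wants and watchspam_wants are
hypothetical names that do not exist in public-inbox.

```perl
use strict;
use warnings;

# Illustrative subset of an IMAP system flag => keyword table.
my %flag2kw = (
	'\\Seen' => 'seen',
	'\\Answered' => 'answered',
	'\\Flagged' => 'flagged',
	'\\Draft' => 'draft',
);

# Returns undef when the message is \Deleted, else sorted keywords.
sub imap_flags_to_kw {
	my (@flags) = @_;
	my @kw;
	for my $f (@flags) {
		return if $f eq '\\Deleted'; # trashed: skip the message
		my $k = $flag2kw{$f} // next; # \Recent, unknown: no keyword
		push @kw, $k;
	}
	[ sort @kw ];
}

# -watch never exposes drafts in a public inbox.
sub watch_wants {
	my ($kw) = @_;
	!grep(/\Adraft\z/, @$kw);
}

# watchspam over IMAP only acts on messages the user has seen;
# other schemes (e.g. NNTP) carry no such flag.
sub watchspam_wants {
	my ($scheme, $kw) = @_;
	return 1 if $scheme !~ /\Aimaps?\z/;
	scalar grep(/\Aseen\z/, @$kw);
}
```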

diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index f5f71005..d3094fc7 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -349,6 +349,8 @@ sub _imap_do_msg ($$$$$) {
 		if (my $k = $IMAPflags2kw{$f}) {
 			push @$kw, $k;
 		} elsif ($f eq "\\Recent") { # not in JMAP
+		} elsif ($f eq "\\Deleted") { # not in JMAP
+			return;
 		} elsif ($self->{verbose}) {
 			warn "# unknown IMAP flag $f <$uri;uid=$uid>\n";
 		}
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index dd245935..4fbc9640 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -287,30 +287,9 @@ sub watch_fs_init ($) {
 	PublicInbox::DirIdle->new([keys %{$self->{mdmap}}], $cb);
 }
 
-sub imap_import_msg ($$$$$) {
-	my ($self, $uri, $uid, $raw, $flags) = @_;
-	# our target audience expects LF-only, save storage
-	$$raw =~ s/\r\n/\n/sg;
-
-	my $inboxes = $self->{imap}->{$$uri};
-	if (ref($inboxes)) {
-		for my $ibx (@$inboxes) {
-			my $eml = PublicInbox::Eml->new($$raw);
-			import_eml($self, $ibx, $eml);
-		}
-	} elsif ($inboxes eq 'watchspam') {
-		return if $flags !~ /\\Seen\b/; # don't remove unseen messages
-		local $SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
-		my $eml = PublicInbox::Eml->new($raw);
-		$self->{pi_cfg}->each_inbox(\&remove_eml_i,
-						$self, $eml, "$uri UID:$uid");
-	} else {
-		die "BUG: destination unknown $inboxes";
-	}
-}
-
 sub net_cb { # NetReader::(nntp|imap)_each callback
 	my ($uri, $art, $kw, $eml, $self, $inboxes) = @_;
+	return if grep(/\Adraft\z/, @$kw);
 	local $self->{cur_uid} = $art; # IMAP UID or NNTP article
 	if (ref($inboxes)) {
 		my @ibx = @$inboxes;
@@ -321,6 +300,9 @@ sub net_cb { # NetReader::(nntp|imap)_each callback
 		}
 		import_eml($self, $last, $eml);
 	} elsif ($inboxes eq 'watchspam') {
+		if ($uri->scheme =~ /\Aimaps?\z/ && !grep(/\Aseen\z/, @$kw)) {
+			return;
+		}
 		$self->{pi_cfg}->each_inbox(\&remove_eml_i,
 				$self, $eml, "$uri #$art");
 	} else {
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index 3631d932..11a10e74 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -7,6 +7,8 @@ use POSIX qw(strftime);
 use PublicInbox::OnDestroy;
 use PublicInbox::URIimap;
 use PublicInbox::Config;
+use PublicInbox::DS;
+use PublicInbox::InboxIdle;
 use Fcntl qw(O_EXCL O_WRONLY O_CREAT);
 my $imap_url = $ENV{TEST_IMAP_WRITE_URL} or
 	plan skip_all => 'TEST_IMAP_WRITE_URL unset';
@@ -170,6 +172,48 @@ test_lei(sub {
 	$res = json_utf8->decode($lei_out)->[0];
 	is_deeply([@$res{qw(m kw)}], ['testmessage@example.com', ['seen']],
 		'kw set');
+
+	$mic = $nwr->mic_for_folder($folder_uri);
+	for my $kw (qw(Deleted Seen Answered Draft)) {
+		my $buf = <<EOM;
+From: x\@example.com
+Message-ID: <$kw\@test.example.com>
+
+EOM
+		$mic->append_string($folder_uri->mailbox, $buf, "\\$kw")
+			or BAIL_OUT "append $kw $@";
+	}
+	# $mic->expunge or BAIL_OUT "expunge: $@";
+	$mic->disconnect;
+
+	my $inboxdir = "$ENV{HOME}/wtest";
+	my @cmd = (qw(-init -Lbasic wtest), $inboxdir,
+			qw(https://example.com/wtest wtest@example.com));
+	run_script(\@cmd) or BAIL_OUT "init wtest";
+	xsys(qw(git config), "--file=$ENV{HOME}/.public-inbox/config",
+			'publicinbox.wtest.watch',
+			$$folder_uri) == 0 or BAIL_OUT "git config $?";
+	my $watcherr = "$ENV{HOME}/watch.err";
+	open my $err_wr, '>>', $watcherr or BAIL_OUT $!;
+	my $pub_cfg = PublicInbox::Config->new;
+	PublicInbox::DS->Reset;
+	my $ii = PublicInbox::InboxIdle->new($pub_cfg);
+	my $cb = sub { PublicInbox::DS->SetPostLoopCallback(sub {}) };
+	my $obj = bless \$cb, 'PublicInbox::TestCommon::InboxWakeup';
+	$pub_cfg->each_inbox(sub { $_[0]->subscribe_unlock('ident', $obj) });
+	my $w = start_script(['-watch'], undef, { 2 => $err_wr });
+	diag 'waiting for initial fetch...';
+	PublicInbox::DS->EventLoop;
+	my $ibx = $pub_cfg->lookup_name('wtest');
+	my $mm = $ibx->mm;
+	ok(defined($mm->num_for('Seen@test.example.com')),
+		'-watch takes seen message');
+	ok(defined($mm->num_for('Answered@test.example.com')),
+		'-watch takes answered message');
+	ok(!defined($mm->num_for('Deleted@test.example.com')),
+		'-watch ignored \\Deleted');
+	ok(!defined($mm->num_for('Draft@test.example.com')),
+		'-watch ignored \\Draft');
 });
 
 undef $cleanup; # remove temporary folder

* [PATCH 3/5] lei import: simplify Maildir handling
  2021-03-10 13:23 [PATCH 0/5] no trash, glossary doc Eric Wong
  2021-03-10 13:23 ` [PATCH 1/5] doc: technical/data_structures: update for EOFpipe Eric Wong
  2021-03-10 13:23 ` [PATCH 2/5] watch: IMAP: ignore \Deleted and \Draft messages Eric Wong
@ 2021-03-10 13:23 ` Eric Wong
  2021-03-10 13:23 ` [PATCH 4/5] lei import: skip trashed Maildir messages Eric Wong
  2021-03-10 13:23 ` [PATCH 5/5] doc: start glossary for overlapping concepts Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

Having one-off Maildir functionality in LeiStore doesn't seem
worth the maintenance burden, especially given an upcoming
change to skip trashed messages.

I expect this will hurt performance slightly with extra IPC
overhead for the socket copy, but "lei import" may eventually
become rare or at least not hit messages redundantly.
---
 lib/PublicInbox/LeiImport.pm | 8 ++++----
 lib/PublicInbox/LeiStore.pm  | 6 ------
 2 files changed, 4 insertions(+), 10 deletions(-)
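Note for readers: the callback shape after this change (path,
keywords, parsed message, then caller-supplied arguments) can be
sketched as below.  This is only an illustration: each_eml and
import_cb are hypothetical, the "reader" walks an in-memory list
instead of a Maildir, and the store is a plain array rather than an
IPC endpoint.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Maildir reader: $entries holds
# [ $path, \@keywords, $message ] tuples instead of real files.
sub each_eml {
	my ($entries, $cb, @args) = @_;
	$cb->(@$_, @args) for @$entries;
}

# Callback using the new argument order; keywords are only forwarded
# when the caller asked for them via $set_kw.
sub import_cb {
	my ($f, $kw, $eml, $sto, $set_kw) = @_;
	push @$sto, [ $eml, $set_kw ? @$kw : () ];
}
```

The reader owns parsing and keyword extraction, so the store no longer
needs a Maildir-specific entry point.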

diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 23cecd53..815788b3 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -147,9 +147,9 @@ error reading $input: $!
 	$lei->child_error(1 << 8, "$input: $@") if $@;
 }
 
-sub _import_maildir { # maildir_each_file cb
-	my ($f, $sto, $set_kw) = @_;
-	$sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
+sub _import_maildir { # maildir_each_eml cb
+	my ($f, $kw, $eml, $sto, $set_kw) = @_;
+	$sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
 }
 
 sub _import_net { # imap_each, nntp_each cb
@@ -181,7 +181,7 @@ sub import_path_url {
 		return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
 $input appears to a be a maildir, not $ifmt
 EOM
-		PublicInbox::MdirReader::maildir_each_file($input,
+		PublicInbox::MdirReader::maildir_each_eml($input,
 					\&_import_maildir,
 					$lei->{sto}, $lei->{opt}->{kw});
 	} else {
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 92c29100..6ace2ad1 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -213,12 +213,6 @@ sub set_eml {
 	add_eml($self, $eml, @kw) // set_eml_keywords($self, $eml, @kw);
 }
 
-sub set_eml_from_maildir {
-	my ($self, $f, $set_kw) = @_;
-	my $eml = eml_from_path($f) or return;
-	set_eml($self, $eml, $set_kw ? maildir_keywords($f) : ());
-}
-
 sub checkpoint {
 	my ($self, $wait) = @_;
 	if (my $im = $self->{im}) {

* [PATCH 4/5] lei import: skip trashed Maildir messages
  2021-03-10 13:23 [PATCH 0/5] no trash, glossary doc Eric Wong
                   ` (2 preceding siblings ...)
  2021-03-10 13:23 ` [PATCH 3/5] lei import: simplify Maildir handling Eric Wong
@ 2021-03-10 13:23 ` Eric Wong
  2021-03-10 13:23 ` [PATCH 5/5] doc: start glossary for overlapping concepts Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

This matches IMAP behavior in NetReader in skipping \Deleted
messages.  Since lei may be used for personal, non-public mail,
Draft messages are NOT skipped by "lei import".
---
 lib/PublicInbox/MdirReader.pm | 1 +
 t/lei-import-maildir.t        | 7 +++++++
 2 files changed, 8 insertions(+)
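Note for readers: the Maildir flag check can be sketched in isolation
as below.  It is a simplified approximation, not the MdirReader code:
basename_flags and basename_keywords are hypothetical names, the %c2kw
table here is an illustrative subset, and names without a ":2,"
suffix (as found in new/) are treated as having no flags.

```perl
use strict;
use warnings;

# Illustrative Maildir info letter => keyword table; 'T' (trashed)
# is deliberately absent and handled as a skip below.
my %c2kw = (D => 'draft', F => 'flagged', R => 'answered', S => 'seen');

# Returns the flag letters, '' when there is no info suffix, or undef
# when the suffix is not the standard ":2," form.
sub basename_flags {
	my ($bn) = @_;
	return '' if index($bn, ':') < 0;
	$bn =~ /:2,([A-Za-z]*)\z/ ? $1 : undef;
}

# Returns undef for trashed or unparseable names, else sorted keywords.
sub basename_keywords {
	my ($bn) = @_;
	my $fl = basename_flags($bn) // return;
	return if index($fl, 'T') >= 0; # trashed: never imported
	[ sort(map { $c2kw{$_} // () } split(//, $fl)) ];
}
```

A draft ("D") still yields keywords here, matching the decision that
"lei import" keeps drafts.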

diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index 44724af1..06806e80 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -57,6 +57,7 @@ sub maildir_each_eml ($$;@) {
 	opendir my $dh, $pfx or return;
 	while (defined(my $bn = readdir($dh))) {
 		my $fl = maildir_basename_flags($bn) // next;
+		next if index($fl, 'T') >= 0;
 		my $f = $pfx.$bn;
 		my $eml = eml_from_path($f) or next;
 		my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index a3796491..bd89677a 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -29,5 +29,12 @@ test_lei(sub {
 	like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
 	is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
 	is($res->[1], undef, 'only got one result');
+
+	symlink(abs_path('t/utf8.eml'), "$md/cur/u:2,ST") or
+		BAIL_OUT "symlink $md $!";
+	lei_ok('import', "maildir:$md", \'import Maildir w/ trashed message');
+	lei_ok(qw(q -d none m:testmessage@example.com));
+	$res = json_utf8->decode($lei_out);
+	is_deeply($res, [ undef ], 'trashed message not imported');
 });
 done_testing;

* [PATCH 5/5] doc: start glossary for overlapping concepts
  2021-03-10 13:23 [PATCH 0/5] no trash, glossary doc Eric Wong
                   ` (3 preceding siblings ...)
  2021-03-10 13:23 ` [PATCH 4/5] lei import: skip trashed Maildir messages Eric Wong
@ 2021-03-10 13:23 ` Eric Wong
  4 siblings, 0 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC (permalink / raw)
  To: meta

This is intended to keep track of concepts with different terms
between NNTP, IMAP, config file, lei storage, and upcoming
JMAP support.
---
 Documentation/public-inbox-glossary.pod | 95 +++++++++++++++++++++++++
 Documentation/txt2pre                   |  1 +
 MANIFEST                                |  1 +
 Makefile.PL                             |  3 +-
 4 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/public-inbox-glossary.pod
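Note for readers: the IMAP slice naming described in the glossary
below can be sketched as follows.  Only the "newsgroup.N" naming comes
from the glossary text; reading "50K" as exactly 50000 and placing
article numbers 1..50000 in slice 0 are assumptions, and slice_name
and slice_for_num are hypothetical helpers.

```perl
use strict;
use warnings;

use constant SLICE_SIZE => 50_000; # assumption: "50K" means 50000

# "inbox.test" with index 0 becomes "inbox.test.0", and so forth.
sub slice_name {
	my ($newsgroup, $idx) = @_;
	"$newsgroup.$idx";
}

# Assumption: article numbers start at 1 and slice 0 holds 1..50000.
sub slice_for_num {
	my ($num) = @_;
	die "article number must be positive\n" if $num < 1;
	int(($num - 1) / SLICE_SIZE);
}
```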

diff --git a/Documentation/public-inbox-glossary.pod b/Documentation/public-inbox-glossary.pod
new file mode 100644
index 00000000..e188e563
--- /dev/null
+++ b/Documentation/public-inbox-glossary.pod
@@ -0,0 +1,95 @@
+=head1 NAME
+
+public-inbox-glossary - glossary for public-inbox
+
+=head1 DESCRIPTION
+
+public-inbox combines several independently-developed protocols
+and data formats with overlapping concepts.  This document is
+intended as a guide to identify and clarify overlapping concepts
+with different names.
+
+This is mainly intended for hackers of public-inbox, but may be useful
+for administrators of public-facing services and/or users building
+tools.
+
+=head1 TERMS
+
+=item IMAP UID, NNTP article number, on-disk Xapian docid
+
+A sequentially-assigned positive integer.  These integers are per-inbox,
+or per-extindex.  This is the C<num> column of the C<over> table in
+C<over.sqlite3>
+
+=item tid, THREADID
+
+A sequentially-assigned positive integer.  These integers are
+per-inbox or per-extindex.  In the future, this may be prefixed
+with C<T> for JMAP (RFC 8621) and RFC 8474.  This may not be
+strictly compliant with RFC 8621 since inboxes and extindices
+are considered independent entities from each other.
+
+This is the C<tid> column of the C<over> table in C<over.sqlite3>
+
+=item blob
+
+For email, this is the git blob object ID (SHA-(1|256)) of an
+RFC-(822|2822|5322) email message.
+
+=item IMAP EMAILID, JMAP Email Id
+
+To-be-decided.  This will likely be the git blob ID prefixed with C<g>
+rather than the numeric UID to accommodate the same blob showing
+up in both an extindex and inbox (or multiple extindices).
+
+=item newsgroup
+
+The name of the NNTP newsgroup, see L<public-inbox-config(5)>.
+
+=item IMAP (folder|mailbox) slice
+
+A 50K slice of a newsgroup to accommodate the limitations of IMAP
+clients with L<public-inbox-imapd(1)>.  This is the C<newsgroup>
+name with a C<.$INTEGER_SUFFIX>, e.g. a newsgroup named C<inbox.test>
+would have its first slice named C<inbox.test.0>, and second slice
+named C<inbox.test.1> and so forth.
+
+If implemented, the RFC 8474 MAILBOXID of an IMAP slice will NOT have
+the same Mailbox Id as the public-facing full JMAP mailbox.
+
+=item inbox name, public JMAP mailbox name
+
+The HTTP(S) name of the public-inbox
+(C<publicinbox.E<lt>nameE<gt>.*>).  JMAP will use this name
+rather than the newsgroup name since public-facing JMAP will be
+part of the PSGI code and not need a separate daemon like
+L<public-inbox-nntpd(1)> or L<public-inbox-imapd(1)>
+
+=item keywords, (IMAP|Maildir) flags, mbox Status + X-Status
+
+Private, per-message keywords or flags as described in RFC 8621
+section 10.4.  These are conveyed in the C<Status:> and
+C<X-Status:> headers for L<mbox(5)>, as IMAP FLAGS (RFC 3501 section 2.3.2),
+or Maildir info flags.
+
+L<public-inbox-watch(1)> ignores drafts and trashed (deleted)
+messages.  L<lei-import(1)> ignores trashed (deleted) messages,
+but it imports drafts.
+
+=item labels, private JMAP mailboxes
+
+For L<lei(1)> users only.  This will allow lei users to place
+the same email into one or more virtual folders for
+ease-of-filtering.  This is NOT tied to public-inbox names, as
+messages stored by lei may not be public.
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<http://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<public-inbox-v2-format(5)>, L<public-inbox-v1-format(5)>,
+L<public-inbox-extindex-format(5)>, L<gitglossary(7)>
diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index 3277531f..244dc50c 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -27,6 +27,7 @@ for (qw[lei(1)
 	public-inbox-convert(1)
 	public-inbox-daemon(8)
 	public-inbox-edit(1)
+	public-inbox-glossary(7)
 	public-inbox-httpd(1)
 	public-inbox-imapd(1)
 	public-inbox-index(1)
diff --git a/MANIFEST b/MANIFEST
index 8c9c86a0..8662d2c0 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -41,6 +41,7 @@ Documentation/public-inbox-daemon.pod
 Documentation/public-inbox-edit.pod
 Documentation/public-inbox-extindex-format.pod
 Documentation/public-inbox-extindex.pod
+Documentation/public-inbox-glossary.pod
 Documentation/public-inbox-httpd.pod
 Documentation/public-inbox-imapd.pod
 Documentation/public-inbox-index.pod
diff --git a/Makefile.PL b/Makefile.PL
index 6da2ed70..21d3d6ea 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -48,7 +48,8 @@ $v->{-m1} = [ map {
 	lei-forget-external lei-import lei-init lei-ls-external lei-q)];
 $v->{-m5} = [ qw(public-inbox-config public-inbox-v1-format
 		public-inbox-v2-format public-inbox-extindex-format) ];
-$v->{-m7} = [ qw(lei-overview public-inbox-overview public-inbox-tuning) ];
+$v->{-m7} = [ qw(lei-overview public-inbox-overview public-inbox-tuning
+		public-inbox-glossary) ];
 $v->{-m8} = [ qw(public-inbox-daemon) ];
 my @sections = (1, 5, 7, 8);
 $v->{check_80} = [];
