* [PATCH 0/5] no trash, glossary doc
@ 2021-03-10 13:23 Eric Wong
2021-03-10 13:23 ` [PATCH 1/5] doc: technical/data_structures: update for EOFpipe Eric Wong
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Eric Wong @ 2021-03-10 13:23 UTC
To: meta
-watch on IMAP now matches Maildir behavior in skipping
trashed (deleted) and draft messages.
"lei import" now ignores trashed (deleted) messages as well,
but it still imports drafts.
The glossary is intended to reduce confusion as more
concepts overlap under different terminology.
Eric Wong (5):
doc: technical/data_structures: update for EOFpipe
watch: IMAP: ignore \Deleted and \Draft messages
lei import: simplify Maildir handling
lei import: skip trashed Maildir messages
doc: start glossary for overlapping concepts
Documentation/public-inbox-glossary.pod | 95 +++++++++++++++++++++
Documentation/technical/data_structures.txt | 10 +--
Documentation/txt2pre | 1 +
MANIFEST | 1 +
Makefile.PL | 3 +-
lib/PublicInbox/LeiImport.pm | 8 +-
lib/PublicInbox/LeiStore.pm | 6 --
lib/PublicInbox/MdirReader.pm | 1 +
lib/PublicInbox/NetReader.pm | 2 +
lib/PublicInbox/Watch.pm | 26 +-----
t/lei-import-maildir.t | 7 ++
xt/net_writer-imap.t | 44 ++++++++++
12 files changed, 165 insertions(+), 39 deletions(-)
create mode 100644 Documentation/public-inbox-glossary.pod
* [PATCH 1/5] doc: technical/data_structures: update for EOFpipe
From: Eric Wong @ 2021-03-10 13:23 UTC
To: meta
ParentPipe no longer exists and was replaced by the more
flexible EOFpipe.
---
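For readers unfamiliar with the idiom described in the updated text,
here is a minimal sketch of detecting process exit via pipe EOF. It
is not the PublicInbox::EOFpipe implementation; watch_child_exit is
a hypothetical helper using only core Perl, and it blocks in
IO::Select where the real code would presumably use the event loop.

```perl
use strict;
use warnings;
use IO::Select;
use POSIX ();

# Hypothetical helper (not part of public-inbox): run $child_cb in a
# forked child, then call $on_exit once the pipe reaches EOF, which
# happens when the child (the only remaining writer) exits.
sub watch_child_exit {
	my ($child_cb, $on_exit) = @_;
	pipe(my $r, my $w) or die "pipe: $!";
	my $pid = fork();
	die "fork: $!" unless defined $pid;
	if ($pid == 0) { # child keeps only the write end open
		close $r;
		$child_cb->();
		POSIX::_exit(0); # exiting closes $w, the last writer
	}
	close $w; # parent must drop its copy or EOF never arrives
	my $sel = IO::Select->new($r);
	$sel->can_read; # blocks until the pipe is readable (here: EOF)
	my $n = sysread($r, my $buf, 1);
	die "sysread: $!" unless defined $n;
	$on_exit->($pid) if $n == 0; # zero bytes read means EOF
	waitpid($pid, 0); # reap this one child, not waitpid(-1, ...)
	return $? >> 8;
}
```

The parent only ever waits on a PID it already knows has exited, so
it never reaps children owned by other parts of the program.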
Documentation/technical/data_structures.txt | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/Documentation/technical/data_structures.txt b/Documentation/technical/data_structures.txt
index 8776a67b..4dcf9ce6 100644
--- a/Documentation/technical/data_structures.txt
+++ b/Documentation/technical/data_structures.txt
@@ -222,10 +222,8 @@ daemon classes
given PublicInbox::Config which may be instantiated more than
once in the future.
-* PublicInbox::ParentPipe
+* PublicInbox::EOFpipe
- Per-worker process class to detect shutdown of master process.
- This is not used if using -W0 to disable worker processes
- in public-inbox-httpd or public-inbox-nntpd.
-
- This is a per-worker singleton.
+ Used throughout to trigger a callback when a pipe(7) is closed.
+ This is frequently used to portably detect process exit without
+ relying on a catch-all waitpid(-1, ...) call.
* [PATCH 2/5] watch: IMAP: ignore \Deleted and \Draft messages
From: Eric Wong @ 2021-03-10 13:23 UTC
To: meta
This matches existing Maildir behavior, as trash and draft
messages have little reason to be exposed publicly.
---
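The combined effect of the two hunks below can be sketched as a
single filter. This is only an illustration: in the patch the
\Deleted check lives in NetReader and the draft check in Watch.pm,
while watch_keywords and the %flag2kw table here are hypothetical
stand-ins (the real mapping is %IMAPflags2kw and may differ).

```perl
use strict;
use warnings;

# Assumed flag-to-keyword table, for illustration only.
my %flag2kw = (
	'\\Seen' => 'seen',
	'\\Answered' => 'answered',
	'\\Flagged' => 'flagged',
	'\\Draft' => 'draft',
);

# Hypothetical helper: returns an arrayref of sorted keywords, or
# nothing (undef in scalar context) if -watch should skip the message.
sub watch_keywords {
	my (@flags) = @_;
	my @kw;
	for my $f (@flags) {
		return if $f eq '\\Deleted'; # trashed: never import
		my $k = $flag2kw{$f} // next; # ignore \Recent and unknowns
		push @kw, $k;
	}
	return if grep { $_ eq 'draft' } @kw; # -watch skips drafts
	[ sort @kw ];
}
```

A caller would treat an undefined result as "do not import", e.g.
my $kw = watch_keywords('\\Seen', '\\Draft') or next;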
lib/PublicInbox/NetReader.pm | 2 ++
lib/PublicInbox/Watch.pm | 26 ++++-----------------
xt/net_writer-imap.t | 44 ++++++++++++++++++++++++++++++++++++
3 files changed, 50 insertions(+), 22 deletions(-)
diff --git a/lib/PublicInbox/NetReader.pm b/lib/PublicInbox/NetReader.pm
index f5f71005..d3094fc7 100644
--- a/lib/PublicInbox/NetReader.pm
+++ b/lib/PublicInbox/NetReader.pm
@@ -349,6 +349,8 @@ sub _imap_do_msg ($$$$$) {
if (my $k = $IMAPflags2kw{$f}) {
push @$kw, $k;
} elsif ($f eq "\\Recent") { # not in JMAP
+ } elsif ($f eq "\\Deleted") { # not in JMAP
+ return;
} elsif ($self->{verbose}) {
warn "# unknown IMAP flag $f <$uri;uid=$uid>\n";
}
diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm
index dd245935..4fbc9640 100644
--- a/lib/PublicInbox/Watch.pm
+++ b/lib/PublicInbox/Watch.pm
@@ -287,30 +287,9 @@ sub watch_fs_init ($) {
PublicInbox::DirIdle->new([keys %{$self->{mdmap}}], $cb);
}
-sub imap_import_msg ($$$$$) {
- my ($self, $uri, $uid, $raw, $flags) = @_;
- # our target audience expects LF-only, save storage
- $$raw =~ s/\r\n/\n/sg;
-
- my $inboxes = $self->{imap}->{$$uri};
- if (ref($inboxes)) {
- for my $ibx (@$inboxes) {
- my $eml = PublicInbox::Eml->new($$raw);
- import_eml($self, $ibx, $eml);
- }
- } elsif ($inboxes eq 'watchspam') {
- return if $flags !~ /\\Seen\b/; # don't remove unseen messages
- local $SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
- my $eml = PublicInbox::Eml->new($raw);
- $self->{pi_cfg}->each_inbox(\&remove_eml_i,
- $self, $eml, "$uri UID:$uid");
- } else {
- die "BUG: destination unknown $inboxes";
- }
-}
-
sub net_cb { # NetReader::(nntp|imap)_each callback
my ($uri, $art, $kw, $eml, $self, $inboxes) = @_;
+ return if grep(/\Adraft\z/, @$kw);
local $self->{cur_uid} = $art; # IMAP UID or NNTP article
if (ref($inboxes)) {
my @ibx = @$inboxes;
@@ -321,6 +300,9 @@ sub net_cb { # NetReader::(nntp|imap)_each callback
}
import_eml($self, $last, $eml);
} elsif ($inboxes eq 'watchspam') {
+ if ($uri->scheme =~ /\Aimaps?\z/ && !grep(/\Aseen\z/, @$kw)) {
+ return;
+ }
$self->{pi_cfg}->each_inbox(\&remove_eml_i,
$self, $eml, "$uri #$art");
} else {
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index 3631d932..11a10e74 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -7,6 +7,8 @@ use POSIX qw(strftime);
use PublicInbox::OnDestroy;
use PublicInbox::URIimap;
use PublicInbox::Config;
+use PublicInbox::DS;
+use PublicInbox::InboxIdle;
use Fcntl qw(O_EXCL O_WRONLY O_CREAT);
my $imap_url = $ENV{TEST_IMAP_WRITE_URL} or
plan skip_all => 'TEST_IMAP_WRITE_URL unset';
@@ -170,6 +172,48 @@ test_lei(sub {
$res = json_utf8->decode($lei_out)->[0];
is_deeply([@$res{qw(m kw)}], ['testmessage@example.com', ['seen']],
'kw set');
+
+ $mic = $nwr->mic_for_folder($folder_uri);
+ for my $kw (qw(Deleted Seen Answered Draft)) {
+ my $buf = <<EOM;
+From: x\@example.com
+Message-ID: <$kw\@test.example.com>
+
+EOM
+ $mic->append_string($folder_uri->mailbox, $buf, "\\$kw")
+ or BAIL_OUT "append $kw $@";
+ }
+ # $mic->expunge or BAIL_OUT "expunge: $@";
+ $mic->disconnect;
+
+ my $inboxdir = "$ENV{HOME}/wtest";
+ my @cmd = (qw(-init -Lbasic wtest), $inboxdir,
+ qw(https://example.com/wtest wtest@example.com));
+ run_script(\@cmd) or BAIL_OUT "init wtest";
+ xsys(qw(git config), "--file=$ENV{HOME}/.public-inbox/config",
+ 'publicinbox.wtest.watch',
+ $$folder_uri) == 0 or BAIL_OUT "git config $?";
+ my $watcherr = "$ENV{HOME}/watch.err";
+ open my $err_wr, '>>', $watcherr or BAIL_OUT $!;
+ my $pub_cfg = PublicInbox::Config->new;
+ PublicInbox::DS->Reset;
+ my $ii = PublicInbox::InboxIdle->new($pub_cfg);
+ my $cb = sub { PublicInbox::DS->SetPostLoopCallback(sub {}) };
+ my $obj = bless \$cb, 'PublicInbox::TestCommon::InboxWakeup';
+ $pub_cfg->each_inbox(sub { $_[0]->subscribe_unlock('ident', $obj) });
+ my $w = start_script(['-watch'], undef, { 2 => $err_wr });
+ diag 'waiting for initial fetch...';
+ PublicInbox::DS->EventLoop;
+ my $ibx = $pub_cfg->lookup_name('wtest');
+ my $mm = $ibx->mm;
+ ok(defined($mm->num_for('Seen@test.example.com')),
+ '-watch takes seen message');
+ ok(defined($mm->num_for('Answered@test.example.com')),
+ '-watch takes answered message');
+ ok(!defined($mm->num_for('Deleted@test.example.com')),
+ '-watch ignored \\Deleted');
+ ok(!defined($mm->num_for('Draft@test.example.com')),
+ '-watch ignored \\Draft');
});
undef $cleanup; # remove temporary folder
* [PATCH 3/5] lei import: simplify Maildir handling
From: Eric Wong @ 2021-03-10 13:23 UTC
To: meta
Having one-off Maildir functionality in LeiStore doesn't seem
worth the maintenance burden, especially given an upcoming
change to skip trashed messages.
I expect this will hurt performance slightly with extra IPC
overhead for the socket copy, but "lei import" may eventually
become rare or at least not hit messages redundantly.
---
lib/PublicInbox/LeiImport.pm | 8 ++++----
lib/PublicInbox/LeiStore.pm | 6 ------
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 23cecd53..815788b3 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -147,9 +147,9 @@ error reading $input: $!
$lei->child_error(1 << 8, "$input: $@") if $@;
}
-sub _import_maildir { # maildir_each_file cb
- my ($f, $sto, $set_kw) = @_;
- $sto->ipc_do('set_eml_from_maildir', $f, $set_kw);
+sub _import_maildir { # maildir_each_eml cb
+ my ($f, $kw, $eml, $sto, $set_kw) = @_;
+ $sto->ipc_do('set_eml', $eml, $set_kw ? @$kw : ());
}
sub _import_net { # imap_each, nntp_each cb
@@ -181,7 +181,7 @@ sub import_path_url {
return $lei->fail(<<EOM) if $ifmt && $ifmt ne 'maildir';
$input appears to a be a maildir, not $ifmt
EOM
- PublicInbox::MdirReader::maildir_each_file($input,
+ PublicInbox::MdirReader::maildir_each_eml($input,
\&_import_maildir,
$lei->{sto}, $lei->{opt}->{kw});
} else {
diff --git a/lib/PublicInbox/LeiStore.pm b/lib/PublicInbox/LeiStore.pm
index 92c29100..6ace2ad1 100644
--- a/lib/PublicInbox/LeiStore.pm
+++ b/lib/PublicInbox/LeiStore.pm
@@ -213,12 +213,6 @@ sub set_eml {
add_eml($self, $eml, @kw) // set_eml_keywords($self, $eml, @kw);
}
-sub set_eml_from_maildir {
- my ($self, $f, $set_kw) = @_;
- my $eml = eml_from_path($f) or return;
- set_eml($self, $eml, $set_kw ? maildir_keywords($f) : ());
-}
-
sub checkpoint {
my ($self, $wait) = @_;
if (my $im = $self->{im}) {
* [PATCH 4/5] lei import: skip trashed Maildir messages
From: Eric Wong @ 2021-03-10 13:23 UTC
To: meta
This matches IMAP behavior in NetReader in skipping \Deleted
messages. Since lei may be used for personal, non-public mail,
Draft messages are NOT skipped by "lei import".
---
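A rough standalone sketch of the rule this patch adds, for anyone
not familiar with Maildir "info" flags. maildir_import_keywords is
a hypothetical helper, the %c2kw table is an assumed subset of the
real one, and the filename parsing is simplified compared to
maildir_basename_flags.

```perl
use strict;
use warnings;

# Maildir flag letters to keywords; an assumed subset for illustration.
my %c2kw = (D => 'draft', F => 'flagged', R => 'answered', S => 'seen');

# Hypothetical helper: given a Maildir basename, return an arrayref of
# sorted keywords, or nothing when the file should be skipped.
sub maildir_import_keywords {
	my ($bn) = @_;
	# the "info" suffix after ":2," holds single-letter flags;
	# a name without that suffix is treated here as having no flags
	my $fl = $bn =~ /:2,([A-Za-z]*)\z/ ? $1 : '';
	return if index($fl, 'T') >= 0; # trashed: skip
	[ sort(map { $c2kw{$_} // () } split(//, $fl)) ];
}
```

With this rule, "u:2,ST" is skipped because of the T flag, while a
draft such as "d:2,D" is still imported with the "draft" keyword.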
lib/PublicInbox/MdirReader.pm | 1 +
t/lei-import-maildir.t | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/lib/PublicInbox/MdirReader.pm b/lib/PublicInbox/MdirReader.pm
index 44724af1..06806e80 100644
--- a/lib/PublicInbox/MdirReader.pm
+++ b/lib/PublicInbox/MdirReader.pm
@@ -57,6 +57,7 @@ sub maildir_each_eml ($$;@) {
opendir my $dh, $pfx or return;
while (defined(my $bn = readdir($dh))) {
my $fl = maildir_basename_flags($bn) // next;
+ next if index($fl, 'T') >= 0;
my $f = $pfx.$bn;
my $eml = eml_from_path($f) or next;
my @kw = sort(map { $c2kw{$_} // () } split(//, $fl));
diff --git a/t/lei-import-maildir.t b/t/lei-import-maildir.t
index a3796491..bd89677a 100644
--- a/t/lei-import-maildir.t
+++ b/t/lei-import-maildir.t
@@ -29,5 +29,12 @@ test_lei(sub {
like($res->[0]->{'s'}, qr/use boolean/, 'got expected result');
is_deeply($res->[0]->{kw}, ['answered', 'seen'], 'keywords set');
is($res->[1], undef, 'only got one result');
+
+ symlink(abs_path('t/utf8.eml'), "$md/cur/u:2,ST") or
+ BAIL_OUT "symlink $md $!";
+ lei_ok('import', "maildir:$md", \'import Maildir w/ trashed message');
+ lei_ok(qw(q -d none m:testmessage@example.com));
+ $res = json_utf8->decode($lei_out);
+ is_deeply($res, [ undef ], 'trashed message not imported');
});
done_testing;
* [PATCH 5/5] doc: start glossary for overlapping concepts
From: Eric Wong @ 2021-03-10 13:23 UTC
To: meta
This is intended to keep track of concepts with different terms
between NNTP, IMAP, config file, lei storage, and upcoming
JMAP support.
---
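To make the "IMAP (folder|mailbox) slice" entry concrete, the naming
scheme might be sketched as follows. slice_name and SLICE_SIZE are
hypothetical names; the sketch assumes "50K" means 50,000 messages
per slice and that article numbers 1 through 50,000 fall in slice 0,
neither of which is spelled out in the glossary text.

```perl
use strict;
use warnings;

# Assumption: "50K" means 50,000 messages per slice.
use constant SLICE_SIZE => 50_000;

# Hypothetical helper: name of the IMAP slice which would hold
# article number $num (a positive integer) of $newsgroup.
sub slice_name {
	my ($newsgroup, $num) = @_;
	die "article number must be positive\n" if $num < 1;
	# assumed boundaries: 1..50_000 => slice 0, 50_001.. => slice 1
	my $slice = int(($num - 1) / SLICE_SIZE);
	"$newsgroup.$slice";
}
```

Under those assumptions, slice_name('inbox.test', 50_001) gives
"inbox.test.1", matching the second-slice name in the entry.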
Documentation/public-inbox-glossary.pod | 95 +++++++++++++++++++++++++
Documentation/txt2pre | 1 +
MANIFEST | 1 +
Makefile.PL | 3 +-
4 files changed, 99 insertions(+), 1 deletion(-)
create mode 100644 Documentation/public-inbox-glossary.pod
diff --git a/Documentation/public-inbox-glossary.pod b/Documentation/public-inbox-glossary.pod
new file mode 100644
index 00000000..e188e563
--- /dev/null
+++ b/Documentation/public-inbox-glossary.pod
@@ -0,0 +1,95 @@
+=head1 NAME
+
+public-inbox-glossary - glossary for public-inbox
+
+=head1 DESCRIPTION
+
+public-inbox combines several independently-developed protocols
+and data formats with overlapping concepts. This document is
+intended as a guide to identify and clarify overlapping concepts
+with different names.
+
+This is mainly intended for hackers of public-inbox, but may be useful
+for administrators of public-facing services and/or users building
+tools.
+
+=head1 TERMS
+
+=item IMAP UID, NNTP article number, on-disk Xapian docid
+
+A sequentially-assigned positive integer. These integers are per-inbox,
+or per-extindex. This is the C<num> column of the C<over> table in
+C<over.sqlite3>
+
+=item tid, THREADID
+
+A sequentially-assigned positive integer. These integers are
+per-inbox or per-extindex. In the future, this may be prefixed
+with C<T> for JMAP (RFC 8621) and RFC 8474. This may not be
+strictly compliant with RFC 8621 since inboxes and extindices
+are considered independent entities from each other.
+
+This is the C<tid> column of the C<over> table in C<over.sqlite3>
+
+=item blob
+
+For email, this is the git blob object ID (SHA-(1|256)) of an
+RFC-(822|2822|5322) email message.
+
+=item IMAP EMAILID, JMAP Email Id
+
+To-be-decided. This will likely be the git blob ID prefixed with C<g>
+rather than the numeric UID to accommodate the same blob showing
+up in both an extindex and inbox (or multiple extindices).
+
+=item newsgroup
+
+The name of the NNTP newsgroup, see L<public-inbox-config(5)>.
+
+=item IMAP (folder|mailbox) slice
+
+A 50K slice of a newsgroup to accommodate the limitations of IMAP
+clients with L<public-inbox-imapd(1)>. This is the C<newsgroup>
+name with a C<.$INTEGER_SUFFIX>, e.g. a newsgroup named C<inbox.test>
+would have its first slice named C<inbox.test.0>, and second slice
+named C<inbox.test.1> and so forth.
+
+If implemented, the RFC 8474 MAILBOXID of an IMAP slice will NOT have
+the same Mailbox Id as the public-facing full JMAP mailbox.
+
+=item inbox name, public JMAP mailbox name
+
+The HTTP(S) name of the public-inbox
+(C<publicinbox.E<lt>nameE<gt>.*>). JMAP will use this name
+rather than the newsgroup name since public-facing JMAP will be
+part of the PSGI code and not need a separate daemon like
+L<public-inbox-nntpd(1)> or L<public-inbox-imapd(1)>
+
+=item keywords, (IMAP|Maildir) flags, mbox Status + X-Status
+
+Private, per-message keywords or flags as described in RFC 8621
+section 10.4. These are conveyed in the C<Status:> and
+C<X-Status:> headers for L<mbox(5)>, as IMAP FLAGS (RFC 3501 section 2.3.2),
+or Maildir info flags.
+
+L<public-inbox-watch(1)> ignores drafts and trashed (deleted)
+messages. L<lei-import(1)> ignores trashed (deleted) messages,
+but it imports drafts.
+
+=item labels, private JMAP mailboxes
+
+For L<lei(1)> users only. This will allow lei users to place
+the same email into one or more virtual folders for
+ease-of-filtering. This is NOT tied to public-inbox names, as
+messages stored by lei may not be public.
+
+=head1 COPYRIGHT
+
+Copyright 2021 all contributors L<mailto:meta@public-inbox.org>
+
+License: AGPL-3.0+ L<http://www.gnu.org/licenses/agpl-3.0.txt>
+
+=head1 SEE ALSO
+
+L<public-inbox-v2-format(5)>, L<public-inbox-v1-format(5)>,
+L<public-inbox-extindex-format(5)>, L<gitglossary(7)>
diff --git a/Documentation/txt2pre b/Documentation/txt2pre
index 3277531f..244dc50c 100755
--- a/Documentation/txt2pre
+++ b/Documentation/txt2pre
@@ -27,6 +27,7 @@ for (qw[lei(1)
public-inbox-convert(1)
public-inbox-daemon(8)
public-inbox-edit(1)
+ public-inbox-glossary(7)
public-inbox-httpd(1)
public-inbox-imapd(1)
public-inbox-index(1)
diff --git a/MANIFEST b/MANIFEST
index 8c9c86a0..8662d2c0 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -41,6 +41,7 @@ Documentation/public-inbox-daemon.pod
Documentation/public-inbox-edit.pod
Documentation/public-inbox-extindex-format.pod
Documentation/public-inbox-extindex.pod
+Documentation/public-inbox-glossary.pod
Documentation/public-inbox-httpd.pod
Documentation/public-inbox-imapd.pod
Documentation/public-inbox-index.pod
diff --git a/Makefile.PL b/Makefile.PL
index 6da2ed70..21d3d6ea 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -48,7 +48,8 @@ $v->{-m1} = [ map {
lei-forget-external lei-import lei-init lei-ls-external lei-q)];
$v->{-m5} = [ qw(public-inbox-config public-inbox-v1-format
public-inbox-v2-format public-inbox-extindex-format) ];
-$v->{-m7} = [ qw(lei-overview public-inbox-overview public-inbox-tuning) ];
+$v->{-m7} = [ qw(lei-overview public-inbox-overview public-inbox-tuning
+ public-inbox-glossary) ];
$v->{-m8} = [ qw(public-inbox-daemon) ];
my @sections = (1, 5, 7, 8);
$v->{check_80} = [];