unofficial mirror of meta@public-inbox.org
* [PATCH 0/4] lei: fleshing out some existing features
@ 2021-02-25 10:11 Eric Wong
  2021-02-25 10:11 ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

Managed to get more stuff done while still pondering keyword
storage with read-only externals (*).

1/4 fleshes out convert, which should be feature-complete as far
as currently supported inputs and outputs go (no MH, JMAP, POP3,
or MMDF yet).

2/4 is a major incompatibility: it replaces --format/-f with
--in-format/-F in "lei import" for consistency with "lei convert".
This is pre-release software and I discouraged "-f" anyway,
so hopefully nobody's scripts are broken :x

4/4 is another one of the things I've found myself wanting
for a while (it wasn't in mairix).

(*) https://public-inbox.org/meta/20210224204950.GA2076@dcvr/

Eric Wong (4):
  lei convert: support IMAP output and "-F eml" inputs
  lei import: use --in-format/-F for consistency
  test_common: io_modes: always support read/write
  lei q: -tt marks direct hits as "flagged"

 Documentation/lei-import.pod  |  2 +-
 Documentation/lei-q.pod       |  8 ++++++
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        | 12 ++++-----
 lib/PublicInbox/LeiConvert.pm | 51 ++++++++++++++++++++++-------------
 lib/PublicInbox/LeiImport.pm  |  8 +++---
 lib/PublicInbox/LeiXSearch.pm | 21 ++++++++++++---
 lib/PublicInbox/NetWriter.pm  |  3 ++-
 lib/PublicInbox/TestCommon.pm |  4 +--
 t/lei-convert.t               | 15 +++++++++++
 t/lei-import.t                | 12 ++++-----
 t/lei-q-thread.t              | 47 ++++++++++++++++++++++++++++++++
 t/lei_to_mail.t               |  2 +-
 xt/net_writer-imap.t          |  4 +++
 14 files changed, 146 insertions(+), 44 deletions(-)
 create mode 100644 t/lei-q-thread.t



* [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs
  2021-02-25 10:11 [PATCH 0/4] lei: fleshing out some existing features Eric Wong
@ 2021-02-25 10:11 ` Eric Wong
  2021-02-25 10:11 ` [PATCH 2/4] lei import: use --in-format/-F for consistency Eric Wong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

eml ("message/rfc822" MIME type) is supported by "lei import",
so it probably makes sense to support it via convert, at least
for tests.  IMAP output is already supported in "lei q -o $MFOLDER",
so this only required renaming {nrd} => {net} and initializing
outputs before augment preparation (creating the IMAP folder).
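
For readers skimming the diff, the eml-vs-mbox split in convert_fh can
be sketched roughly as below.  This is only an illustration, not the
code in the patch: read_input and the "para" reader are hypothetical
names, and the toy reader is not a real mbox parser.  The point is
that "eml" input is slurped whole (by localizing $/), while any other
format is dispatched to a per-format reader.

```perl
use strict;
use warnings;

# Hypothetical stand-in for convert_fh: slurp "eml" input whole,
# otherwise hand the file handle to a per-format reader.
sub read_input {
	my ($ifmt, $fh, $readers) = @_;
	if ($ifmt eq 'eml') {
		# an undefined $/ makes <$fh> return the rest of the input at once
		my $buf = do { local $/; <$fh> };
		return [ $buf ];
	}
	my $reader = $readers->{$ifmt} or die "unsupported format: $ifmt\n";
	$reader->($fh);
}

# toy reader (assumption, NOT an mbox parser): one record per paragraph
my %readers = (
	para => sub {
		my ($fh) = @_;
		local $/ = ''; # paragraph mode
		my @msgs = <$fh>;
		\@msgs;
	},
);

my $raw = "Subject: a\n\nbody a\n";
open my $fh, '<', \$raw or die "open: $!";
my $eml = read_input('eml', $fh, \%readers);
close $fh;
open $fh, '<', \$raw or die "open: $!";
my $paras = read_input('para', $fh, \%readers);
close $fh;
```

In the patch itself, the non-eml branch calls the PublicInbox::MboxReader
method named by the input format instead of a lookup table.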
---
 lib/PublicInbox/LeiConvert.pm | 47 +++++++++++++++++++++++------------
 lib/PublicInbox/LeiImport.pm  |  1 -
 lib/PublicInbox/NetWriter.pm  |  3 ++-
 t/lei-convert.t               | 15 +++++++++++
 xt/net_writer-imap.t          |  4 +++
 5 files changed, 52 insertions(+), 18 deletions(-)

diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index a7e47871..32aa2edb 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -28,25 +28,35 @@ sub mdir_cb {
 	$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
+sub convert_fh ($$$$) {
+	my ($self, $ifmt, $fh, $name) = @_;
+	if ($ifmt eq 'eml') {
+		my $buf = do { local $/; <$fh> } //
+			return $self->{lei}->child_error(1 << 8, <<"");
+error reading $name: $!
+
+		my $eml = PublicInbox::Eml->new(\$buf);
+		$self->{wcb}->(undef, { kw => [] }, $eml);
+	} else {
+		PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+	}
+}
+
 sub do_convert { # via wq_do
 	my ($self) = @_;
 	my $lei = $self->{lei};
 	my $in_fmt = $lei->{opt}->{'in-format'};
 	my $mics;
-	if (my $nrd = $lei->{nrd}) { # may prompt user once
-		$nrd->{mics_cached} = $nrd->imap_common_init($lei);
-		$nrd->{nn_cached} = $nrd->nntp_common_init($lei);
-	}
 	if (my $stdin = delete $self->{0}) {
-		PublicInbox::MboxReader->$in_fmt($stdin, \&mbox_cb, $self);
+		convert_fh($self, $in_fmt, $stdin, '<stdin>');
 	}
 	for my $input (@{$self->{inputs}}) {
 		my $ifmt = lc($in_fmt // '');
 		if ($input =~ m!\Aimaps?://!) {
-			$lei->{nrd}->imap_each($input, \&net_cb, $self);
+			$lei->{net}->imap_each($input, \&net_cb, $self);
 			next;
 		} elsif ($input =~ m!\A(?:nntps?|s?news)://!) {
-			$lei->{nrd}->nntp_each($input, \&net_cb, $self);
+			$lei->{net}->nntp_each($input, \&net_cb, $self);
 			next;
 		} elsif ($input =~ s!\A([a-z0-9]+):!!i) {
 			$ifmt = lc $1;
@@ -54,7 +64,7 @@ sub do_convert { # via wq_do
 		if (-f $input) {
 			open my $fh, '<', $input or
 					return $lei->fail("open $input: $!");
-			PublicInbox::MboxReader->$ifmt($fh, \&mbox_cb, $self);
+			convert_fh($self, $ifmt, $fh, $input);
 		} elsif (-d _) {
 			PublicInbox::MdirReader::maildir_each_eml($input,
 							\&mdir_cb, $self);
@@ -72,11 +82,12 @@ sub call { # the main "lei convert" method
 	$opt->{kw} //= 1;
 	my $self = $lei->{cnv} = bless {}, $cls;
 	my $in_fmt = $opt->{'in-format'};
-	my ($nrd, @f, @d);
+	my (@f, @d);
 	$opt->{dedupe} //= 'none';
 	my $ovv = PublicInbox::LeiOverview->new($lei, 'out-format');
 	$lei->{l2m} or return
 		$lei->fail("output not specified or is not a mail destination");
+	my $net = $lei->{net}; # NetWriter may be created by l2m
 	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	if ($opt->{stdin}) {
 		@inputs and return $lei->fail("--stdin and @inputs do not mix");
@@ -88,8 +99,8 @@ sub call { # the main "lei convert" method
 		my $input_path = $input;
 		if ($input =~ m!\A(?:imaps?|nntps?|s?news)://!i) {
 			require PublicInbox::NetReader;
-			$nrd //= PublicInbox::NetReader->new;
-			$nrd->add_url($input);
+			$net //= PublicInbox::NetReader->new;
+			$net->add_url($input);
 		} elsif ($input_path =~ s/\A([a-z0-9]+)://is) {
 			my $ifmt = lc $1;
 			if (($in_fmt // $ifmt) ne $ifmt) {
@@ -117,12 +128,12 @@ sub call { # the main "lei convert" method
 		require PublicInbox::MdirReader;
 	}
 	$self->{inputs} = \@inputs;
-	if ($nrd) {
-		if (my $err = $nrd->errors) {
+	if ($net) {
+		if (my $err = $net->errors) {
 			return $lei->fail($err);
 		}
-		$nrd->{quiet} = $opt->{quiet};
-		$lei->{nrd} = $nrd;
+		$net->{quiet} = $opt->{quiet};
+		$lei->{net} //= $net;
 	}
 	my $op = $lei->workers_start($self, 'lei_convert', 1, {
 		'' => [ $lei->can('dclose'), $lei ]
@@ -137,11 +148,15 @@ sub ipc_atfork_child {
 	my $lei = $self->{lei};
 	$lei->lei_atfork_child;
 	my $l2m = delete $lei->{l2m};
+	if (my $net = $lei->{net}) { # may prompt user once
+		$net->{mics_cached} = $net->imap_common_init($lei);
+		$net->{nn_cached} = $net->nntp_common_init($lei);
+	}
+	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$l2m->pre_augment($lei);
 	$l2m->do_augment($lei);
 	$l2m->post_augment($lei);
 	$self->{wcb} = $l2m->write_cb($lei);
-	$SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
 	$self->SUPER::ipc_atfork_child;
 }
 
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index cbfb3127..13e817d0 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -7,7 +7,6 @@ use strict;
 use v5.10.1;
 use parent qw(PublicInbox::IPC);
 use PublicInbox::Eml;
-use PublicInbox::InboxWritable qw(eml_from_path);
 use PublicInbox::PktOp qw(pkt_do);
 
 sub _import_eml { # MboxReader callback
diff --git a/lib/PublicInbox/NetWriter.pm b/lib/PublicInbox/NetWriter.pm
index c68b0669..e26e9815 100644
--- a/lib/PublicInbox/NetWriter.pm
+++ b/lib/PublicInbox/NetWriter.pm
@@ -16,7 +16,8 @@ my %IMAPkw2flags;
 sub imap_append {
 	my ($mic, $folder, $bref, $smsg, $eml) = @_;
 	$bref //= \($eml->as_string);
-	$smsg //= bless { }, 'PublicInbox::Smsg';
+	$smsg //= bless {}, 'PublicInbox::Smsg';
+	bless($smsg, 'PublicInbox::Smsg') if ref($smsg) eq 'HASH';
 	$smsg->{ts} //= msg_timestamp($eml // PublicInbox::Eml->new($$bref));
 	my @f = map { $IMAPkw2flags{$_} } @{$smsg->{kw}};
 	$mic->append_string($folder, $$bref, "@f", $smsg->internaldate) or
diff --git a/t/lei-convert.t b/t/lei-convert.t
index 2ba62db3..20099f65 100644
--- a/t/lei-convert.t
+++ b/t/lei-convert.t
@@ -5,6 +5,7 @@ use strict; use v5.10.1; use PublicInbox::TestCommon;
 use PublicInbox::MboxReader;
 use PublicInbox::MdirReader;
 use PublicInbox::NetReader;
+use PublicInbox::Eml;
 require_git 2.6;
 require_mods(qw(DBD::SQLite Search::Xapian Mail::IMAPClient Net::NNTP));
 my ($tmpdir, $for_destroy) = tmpdir;
@@ -84,5 +85,19 @@ test_lei({ tmpdir => $tmpdir }, sub {
 	open $fh, '<', "$d/foo.mboxrd" or BAIL_OUT;
 	my $exp = do { local $/; <$fh> };
 	is($out, $exp, 'stdin => stdout');
+
+	lei_ok qw(convert -F eml -o mboxcl2:/dev/stdout t/plack-qp.eml);
+	open $fh, '<', \$lei_out or BAIL_OUT;
+	@bar = ();
+	PublicInbox::MboxReader->mboxcl2($fh, sub {
+		my $eml = shift;
+		for my $h (qw(Status Content-Length Lines)) {
+			ok(defined($eml->header_raw($h)),
+				"$h defined for mboxcl2");
+			$eml->header_set($h);
+		}
+		push @bar, $eml;
+	});
+	is_deeply(\@bar, [ eml_load('t/plack-qp.eml') ], 'eml => mboxcl2');
 });
 done_testing;
diff --git a/xt/net_writer-imap.t b/xt/net_writer-imap.t
index 64f822cf..da435926 100644
--- a/xt/net_writer-imap.t
+++ b/xt/net_writer-imap.t
@@ -138,6 +138,10 @@ test_lei(sub {
 	$nwr->imap_each($folder_uri, $imap_slurp_all, my $empty = []);
 	is(scalar(@$empty), 0, 'no results w/o augment');
 
+	lei_ok qw(convert -F eml t/msg_iter-order.eml -o), $$folder_uri;
+	$nwr->imap_each($folder_uri, $imap_slurp_all, $empty = []);
+	is_deeply($empty, [ [ [], eml_load('t/msg_iter-order.eml') ] ],
+		'converted to IMAP destination');
 });
 
 undef $cleanup; # remove temporary folder


* [PATCH 2/4] lei import: use --in-format/-F for consistency
  2021-02-25 10:11 [PATCH 0/4] lei: fleshing out some existing features Eric Wong
  2021-02-25 10:11 ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
@ 2021-02-25 10:11 ` Eric Wong
  2021-02-25 10:11 ` [PATCH 3/4] test_common: io_modes: always support read/write Eric Wong
  2021-02-25 10:11 ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
  3 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

Since we recommend $IN_FORMAT:$LOCATION, this is hopefully not
intrusive (not that this is released software, yet).  This is
to be consistent with "lei convert" usage.

We'll keep "-f" only for output formats, since "lei q" and
"lei convert" already use it for outputs.
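
A minimal sketch of how the two switches can coexist, assuming
Getopt::Long is configured with no_ignore_case so that "-F" and "-f"
stay distinct.  The option specs 'in-format|F=s' and 'format|f=s' come
from the %CMD tables; parse_args, the shortened 'output|o=s' spec and
the sample file name are hypothetical and only serve the example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# no_ignore_case keeps -f and -F distinct; bundling allows clustered
# single-letter switches
Getopt::Long::Configure(qw(bundling no_ignore_case));

# hypothetical helper: parse a convert-like argument list
sub parse_args {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt,
		'in-format|F=s', 'format|f=s', 'output|o=s')
		or die "bad options\n";
	(\%opt, \@argv); # parsed options and leftover positional args
}

my ($opt, $rest) = parse_args(qw(-F eml -f mboxrd -o /dev/stdout msg.eml));
```

Here the input format ends up under the 'in-format' key and the output
format under 'format', so neither switch can shadow the other.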
---
 Documentation/lei-import.pod  |  2 +-
 lib/PublicInbox/LEI.pm        |  8 ++++----
 lib/PublicInbox/LeiConvert.pm |  4 ++--
 lib/PublicInbox/LeiImport.pm  |  7 +++----
 t/lei-import.t                | 12 ++++++------
 t/lei_to_mail.t               |  2 +-
 6 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/Documentation/lei-import.pod b/Documentation/lei-import.pod
index 2051e6bc..ef20e2f6 100644
--- a/Documentation/lei-import.pod
+++ b/Documentation/lei-import.pod
@@ -22,7 +22,7 @@ TODO: Update when URL support is added.
 
 =over
 
-=item -f MAIL_FORMAT, --format=MAIL_FORMAT
+=item -F MAIL_FORMAT, --in-format=MAIL_FORMAT
 
 Message input format.  Unless messages are given on C<stdin>, using a
 format prefix with C<LOCATION> is preferred.
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 50665b3e..8eb96e78 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -172,7 +172,7 @@ our %CMD = ( # sorted in order of importance/use:
 'import' => [ 'LOCATION...|--stdin',
 	'one-time import/update from URL or filesystem',
 	qw(stdin| offset=i recursive|r exclude=s include|I=s
-	format|f=s kw|keywords|flags! C=s@),
+	in-format|F=s kw|keywords|flags! C=s@),
 	],
 'convert' => [ 'LOCATION...|--stdin',
 	'one-time conversion from URL or filesystem to another format',
@@ -399,9 +399,9 @@ sub fail ($$;$) {
 	undef;
 }
 
-sub check_input_format ($;$$) {
-	my ($self, $files, $opt_key) = @_;
-	$opt_key //= 'format';
+sub check_input_format ($;$) {
+	my ($self, $files) = @_;
+	my $opt_key = 'in-format';
 	my $fmt = $self->{opt}->{$opt_key};
 	if (!$fmt) {
 		my $err = $files ? "regular file(s):\n@$files" : '--stdin';
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 32aa2edb..45d42c9c 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -91,7 +91,7 @@ sub call { # the main "lei convert" method
 	$opt->{augment} = 1 unless $ovv->{dst} eq '/dev/stdout';
 	if ($opt->{stdin}) {
 		@inputs and return $lei->fail("--stdin and @inputs do not mix");
-		$lei->check_input_format(undef, 'in-format') or return;
+		$lei->check_input_format(undef) or return;
 		$self->{0} = $lei->{0};
 	}
 	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
@@ -123,7 +123,7 @@ sub call { # the main "lei convert" method
 		elsif (-d _) { push @d, $input }
 		else { return $lei->fail("Unable to handle $input") }
 	}
-	if (@f) { $lei->check_input_format(\@f, 'in-format') or return }
+	if (@f) { $lei->check_input_format(\@f) or return }
 	if (@d) { # TODO: check for MH vs Maildir, here
 		require PublicInbox::MdirReader;
 	}
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index 13e817d0..7f247b64 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -68,8 +68,7 @@ sub call { # the main "lei import" method
 		$self->{0} = $lei->{0};
 	}
 
-	# TODO: do we need --format for non-stdin?
-	my $fmt = $lei->{opt}->{'format'};
+	my $fmt = $lei->{opt}->{'in-format'};
 	# e.g. Maildir:/home/user/Mail/ or imaps://example.com/INBOX
 	for my $input (@inputs) {
 		my $input_path = $input;
@@ -159,7 +158,7 @@ sub _import_net { # imap_each, nntp_each cb
 sub import_path_url {
 	my ($self, $input) = @_;
 	my $lei = $self->{lei};
-	my $ifmt = lc($lei->{opt}->{'format'} // '');
+	my $ifmt = lc($lei->{opt}->{'in-format'} // '');
 	# TODO auto-detect?
 	if ($input =~ m!\Aimaps?://!i) {
 		$lei->{net}->imap_each($input, \&_import_net, $lei->{sto},
@@ -191,7 +190,7 @@ EOM
 sub import_stdin {
 	my ($self) = @_;
 	my $lei = $self->{lei};
-	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'format'});
+	_import_fh($lei, delete $self->{0}, '<stdin>', $lei->{opt}->{'in-format'});
 }
 
 no warnings 'once'; # the following works even when LeiAuth is lazy-loaded
diff --git a/t/lei-import.t b/t/lei-import.t
index fa4fc504..edb0cd20 100644
--- a/t/lei-import.t
+++ b/t/lei-import.t
@@ -3,13 +3,13 @@
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 test_lei(sub {
-ok(!lei(qw(import -f bogus), 't/plack-qp.eml'), 'fails with bogus format');
+ok(!lei(qw(import -F bogus), 't/plack-qp.eml'), 'fails with bogus format');
 like($lei_err, qr/\bbogus unrecognized/, 'gave error message');
 
 lei_ok(qw(q s:boolean), \'search miss before import');
 unlike($lei_out, qr/boolean/i, 'no results, yet');
 open my $fh, '<', 't/data/0001.patch' or BAIL_OUT $!;
-lei_ok([qw(import -f eml -)], undef, { %$lei_opt, 0 => $fh },
+lei_ok([qw(import -F eml -)], undef, { %$lei_opt, 0 => $fh },
 	\'import single file from stdin') or diag $lei_err;
 close $fh;
 lei_ok(qw(q s:boolean), \'search hit after import');
@@ -26,7 +26,7 @@ lei_ok(qw(q s:boolean -f mboxrd), \'blob accessible after import');
 	});
 	is_deeply(\@cmp, $expect, 'got expected message in mboxrd');
 }
-lei_ok(qw(import -f eml), 't/data/message_embed.eml',
+lei_ok(qw(import -F eml), 't/data/message_embed.eml',
 	\'import single file by path');
 
 my $str = <<'';
@@ -35,7 +35,7 @@ Message-ID: <x@y>
 Status: RO
 
 my $opt = { %$lei_opt, 0 => \$str };
-lei_ok([qw(import -f eml -)], undef, $opt,
+lei_ok([qw(import -F eml -)], undef, $opt,
 	\'import single file with keywords from stdin');
 lei_ok(qw(q m:x@y));
 my $res = json_utf8->decode($lei_out);
@@ -43,13 +43,13 @@ is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, ['seen'], "message `seen' keyword set");
 
 $str =~ tr/x/v/; # v@y
-lei_ok([qw(import --no-kw -f eml -)], undef, $opt,
+lei_ok([qw(import --no-kw -F eml -)], undef, $opt,
 	\'import single file with --no-kw from stdin');
 lei(qw(q m:v@y));
 $res = json_utf8->decode($lei_out);
 is($res->[1], undef, 'only one result');
 is_deeply($res->[0]->{kw}, [], 'no keywords set');
 
-# see t/lei_to_mail.t for "import -f mbox*"
+# see t/lei_to_mail.t for "import -F mbox*"
 });
 done_testing;
diff --git a/t/lei_to_mail.t b/t/lei_to_mail.t
index 72b90700..7898cc48 100644
--- a/t/lei_to_mail.t
+++ b/t/lei_to_mail.t
@@ -130,7 +130,7 @@ my $orig = do {
 };
 
 test_lei(sub {
-	ok(lei(qw(import -f), $mbox, $fn), 'imported mbox');
+	ok(lei(qw(import -F), $mbox, $fn), 'imported mbox');
 	ok(lei(qw(q s:x)), 'lei q works') or diag $lei_err;
 	my $res = json_utf8->decode($lei_out);
 	my $x = $res->[0];


* [PATCH 3/4] test_common: io_modes: always support read/write
  2021-02-25 10:11 [PATCH 0/4] lei: fleshing out some existing features Eric Wong
  2021-02-25 10:11 ` [PATCH 1/4] lei convert: support IMAP output and "-F eml" inputs Eric Wong
  2021-02-25 10:11 ` [PATCH 2/4] lei import: use --in-format/-F for consistency Eric Wong
@ 2021-02-25 10:11 ` Eric Wong
  2021-02-25 10:11 ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
  3 siblings, 0 replies; 8+ messages in thread
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

This avoids warnings when redirecting STDIN to a scalarref
via run_script().
---
 lib/PublicInbox/TestCommon.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index fc32b57f..af1b2e4f 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -155,8 +155,8 @@ sub key2script ($) {
 	'blib/script/'.$key;
 }
 
-my @io_mode = ([ *STDIN{IO}, '<&' ], [ *STDOUT{IO}, '>&' ],
-		[ *STDERR{IO}, '>&' ]);
+my @io_mode = ([ *STDIN{IO}, '+<&' ], [ *STDOUT{IO}, '+>&' ],
+		[ *STDERR{IO}, '+>&' ]);
 
 sub _prepare_redirects ($) {
 	my ($fhref) = @_;


* [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-25 10:11 [PATCH 0/4] lei: fleshing out some existing features Eric Wong
                   ` (2 preceding siblings ...)
  2021-02-25 10:11 ` [PATCH 3/4] test_common: io_modes: always support read/write Eric Wong
@ 2021-02-25 10:11 ` Eric Wong
  2021-02-26  3:38   ` Kyle Meyer
  3 siblings, 1 reply; 8+ messages in thread
From: Eric Wong @ 2021-02-25 10:11 UTC (permalink / raw)
  To: meta

This can be used to quickly distinguish messages which were
direct hits when doing thread expansion vs messages that
were merely part of the same thread.

This is NOT mairix-derived behavior, but I occasionally found
it useful when looking at results in an MUA to know whether
a message was a direct hit or not.

This makes "-t" consistent with non-"-t" cases as far as keyword
reading goes.
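
As a rough illustration of the idea (not the code in the diff): the
'threads|t+' spec makes Getopt::Long count repeated "-t" switches, and
a direct hit gains the "flagged" keyword once that count exceeds 1.
thread_level and kw_for are hypothetical helpers written for this
sketch; kw_for only loosely mirrors what mitem_kw does with the
keywords read from Xapian.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# bundling is needed so that "-tt" is read as "-t -t"
Getopt::Long::Configure(qw(bundling no_ignore_case));

# "threads|t+" is the spec from the patch: each -t increments the value
sub thread_level {
	my (@argv) = @_;
	my %opt;
	GetOptionsFromArray(\@argv, \%opt, 'threads|t+')
		or die "bad options\n";
	$opt{threads} // 0;
}

# hypothetical helper: merge stored keywords with "flagged" for
# direct hits when -t was given at least twice
sub kw_for {
	my ($stored_kw, $direct_hit, $level) = @_;
	my %kw = map { $_ => 1 } @$stored_kw;
	$kw{flagged} = 1 if $direct_hit && $level > 1;
	[ sort keys %kw ];
}

my $level = thread_level('-tt');
my $hit = kw_for(['seen'], 1, $level);  # direct hit
my $ctx = kw_for(['draft'], 0, $level); # message pulled in via its thread
```

With a single "-t", kw_for returns the stored keywords unchanged for
both kinds of message.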
---
 Documentation/lei-q.pod       |  8 ++++++
 MANIFEST                      |  1 +
 lib/PublicInbox/LEI.pm        |  4 +--
 lib/PublicInbox/LeiXSearch.pm | 21 +++++++++++++---
 t/lei-q-thread.t              | 47 +++++++++++++++++++++++++++++++++++
 5 files changed, 75 insertions(+), 6 deletions(-)
 create mode 100644 t/lei-q-thread.t

diff --git a/Documentation/lei-q.pod b/Documentation/lei-q.pod
index 75fdc613..0959beac 100644
--- a/Documentation/lei-q.pod
+++ b/Documentation/lei-q.pod
@@ -79,6 +79,14 @@ Augment output destination instead of clobbering it.
 
 Return all messages in the same thread as the actual match(es).
 
+Using this twice (C<-tt>) sets the C<flagged> (AKA "important")
+on messages which were actual messages.  This is useful to distinguish
+messages which were direct hits from messages which were merely part
+of the same thread.
+
+TODO: Warning: this flag may become persistent and saved in
+lei/store unless an MUA unflags it!  (Behavior undecided)
+
 =item -d STRATEGY, --dedupe=STRATEGY
 
 Strategy for deduplicating messages: C<content>, C<oid>, C<mid>, or
diff --git a/MANIFEST b/MANIFEST
index adbd108f..9cf33d48 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -373,6 +373,7 @@ t/lei-import-nntp.t
 t/lei-import.t
 t/lei-mirror.t
 t/lei-q-remote-import.t
+t/lei-q-thread.t
 t/lei.t
 t/lei_dedupe.t
 t/lei_external.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 8eb96e78..8825fa43 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -109,7 +109,7 @@ sub index_opt {
 # command => [ positional_args, 1-line description, Getopt::Long option spec ]
 our %CMD = ( # sorted in order of importance/use:
 'q' => [ '--stdin|SEARCH_TERMS...', 'search for messages matching terms', qw(
-	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t augment|a
+	save-as=s output|mfolder|o=s format|f=s dedupe|d=s threads|t+ augment|a
 	sort|s=s reverse|r offset=i remote! local! external! pretty
 	include|I=s@ exclude=s@ only=s@ jobs|j=s globoff|g stdin|
 	import-remote!
@@ -233,7 +233,7 @@ my %OPTDESC = (
 'dedupe|d=s' => ['STRATEGY|content|oid|mid|none',
 		'deduplication strategy'],
 'show	threads|t' => 'display entire thread a message belongs to',
-'q	threads|t' =>
+'q	threads|t+' =>
 	'return all messages in the same threads as the actual match(es)',
 'alert=s@' => ['CMD,:WINCH,:bell,<any command>',
 	'run command(s) or perform ops when done writing to output ' .
diff --git a/lib/PublicInbox/LeiXSearch.pm b/lib/PublicInbox/LeiXSearch.pm
index 2d399653..eb015978 100644
--- a/lib/PublicInbox/LeiXSearch.pm
+++ b/lib/PublicInbox/LeiXSearch.pm
@@ -66,6 +66,13 @@ sub remotes { @{$_[0]->{remotes} // []} }
 # called by PublicInbox::Search::xdb
 sub xdb_shards_flat { @{$_[0]->{shards_flat} // []} }
 
+sub mitem_kw ($$;$) {
+	my ($smsg, $mitem, $flagged) = @_;
+	my $kw = xap_terms('K', $mitem->get_document);
+	$kw->{flagged} = 1 if $flagged;
+	$smsg->{kw} = [ sort keys %$kw ];
+}
+
 # like over->get_art
 sub smsg_for {
 	my ($self, $mitem) = @_;
@@ -76,10 +83,7 @@ sub smsg_for {
 	my $num = int(($docid - 1) / $nshard) + 1;
 	my $ibx = $self->{shard2ibx}->[$shard];
 	my $smsg = $ibx->over->get_art($num);
-	if (ref($ibx->can('msg_keywords'))) {
-		my $kw = xap_terms('K', $mitem->get_document);
-		$smsg->{kw} = [ sort keys %$kw ];
-	}
+	mitem_kw($smsg, $mitem) if $ibx->can('msg_keywords');
 	$smsg->{docid} = $docid;
 	$smsg;
 }
@@ -143,6 +147,8 @@ sub query_thread_mset { # for --threads
 	my $mo = { %{$lei->{mset_opt}} };
 	my $mset;
 	my $each_smsg = $lei->{ovv}->ovv_each_smsg_cb($lei, $ibxish);
+	my $can_kw = !!$ibxish->can('msg_keywords');
+	my $fl = $lei->{opt}->{threads} > 1;
 	do {
 		$mset = $srch->mset($mo->{qstr}, $mo);
 		mset_progress($lei, $desc, $mset->size,
@@ -156,6 +162,13 @@ sub query_thread_mset { # for --threads
 				my $smsg = $over->get_art($n) or next;
 				wait_startq($lei);
 				my $mitem = delete $n2item{$smsg->{num}};
+				if ($mitem) {
+					if ($can_kw) {
+						mitem_kw($smsg, $mitem, $fl);
+					} else {
+						$smsg->{kw} = [ 'flagged' ];
+					}
+				}
 				$each_smsg->($smsg, $mitem);
 			}
 			@{$ctx->{xids}} = ();
diff --git a/t/lei-q-thread.t b/t/lei-q-thread.t
new file mode 100644
index 00000000..66db28a9
--- /dev/null
+++ b/t/lei-q-thread.t
@@ -0,0 +1,47 @@
+#!perl -w
+# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+require_git 2.6;
+require_mods(qw(json DBD::SQLite Search::Xapian));
+use PublicInbox::LeiToMail;
+my ($ro_home, $cfg_path) = setup_public_inboxes;
+test_lei(sub {
+	my $eml = eml_load('t/utf8.eml');
+	my $buf = PublicInbox::LeiToMail::eml2mboxrd($eml, { kw => ['seen'] });
+	lei_ok([qw(import -F mboxrd -)], undef, { 0 => $buf, %$lei_opt });
+
+	lei_ok qw(q -t m:testmessage@example.com);
+	my $res = json_utf8->decode($lei_out);
+	is_deeply($res->[0]->{kw}, [ 'seen' ], 'q -t sets keywords');
+
+	$eml = eml_load('t/utf8.eml');
+	$eml->header_set('References', $eml->header('Message-ID'));
+	$eml->header_set('Message-ID', '<a-reply@miss>');
+	$buf = PublicInbox::LeiToMail::eml2mboxrd($eml, { kw => ['draft'] });
+	lei_ok([qw(import -F mboxrd -)], undef, { 0 => $buf, %$lei_opt });
+
+	lei_ok qw(q -t m:testmessage@example.com);
+	$res = json_utf8->decode($lei_out);
+	is(scalar(@$res), 3, 'got 2 results');
+	pop @$res;
+	my %m = map { $_->{'m'} => $_ } @$res;
+	is_deeply($m{'<testmessage@example.com>'}->{kw}, ['seen'],
+		'flag set in direct hit');
+	'TODO' or is_deeply($m{'<a-reply@miss>'}->{kw}, ['draft'],
+		'flag set in thread hit');
+
+	lei_ok qw(q -t -t m:testmessage@example.com);
+	$res = json_utf8->decode($lei_out);
+	is(scalar(@$res), 3, 'got 2 results with -t -t');
+	pop @$res;
+	%m = map { $_->{'m'} => $_ } @$res;
+	is_deeply($m{'<testmessage@example.com>'}->{kw}, ['flagged', 'seen'],
+		'flagged set in direct hit');
+	'TODO' or is_deeply($m{'<testmessage@example.com>'}->{kw}, ['draft'],
+		'flagged set in direct hit');
+	lei_ok qw(q -t -t m:testmessage@example.com --only), "$ro_home/t2";
+	$res = json_utf8->decode($lei_out);
+	is_deeply($res->[0]->{kw}, [ 'flagged' ], 'flagged set on external');
+});
+done_testing;


* Re: [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-25 10:11 ` [PATCH 4/4] lei q: -tt marks direct hits as "flagged" Eric Wong
@ 2021-02-26  3:38   ` Kyle Meyer
  2021-02-26  4:13     ` Eric Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Kyle Meyer @ 2021-02-26  3:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> This can be used to quickly distinguish messages which were
> direct hits when doing thread expansion vs messages that
> were merely part of the same thread.

Ah, that's very useful.

> +Using this twice (C<-tt>) sets the C<flagged> (AKA "important")
> +on messages which were actual messages.  This is useful to distinguish
> +messages which were direct hits from messages which were merely part
> +of the same thread.
> +
> +TODO: Warning: this flag may become persistent and saved in
> +lei/store unless an MUA unflags it!  (Behavior undecided)

Oy, I understand even less than I thought I did.  How does the
information about what the MUA unflags get back into the store?  Is
there an implicit additional step (`lei import ...')?


* Re: [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-26  3:38   ` Kyle Meyer
@ 2021-02-26  4:13     ` Eric Wong
  2021-02-26  4:38       ` Kyle Meyer
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Wong @ 2021-02-26  4:13 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: meta

Kyle Meyer <kyle@kyleam.com> wrote:
> Eric Wong writes:
> > +TODO: Warning: this flag may become persistent and saved in
> > +lei/store unless an MUA unflags it!  (Behavior undecided)
> 
> Oy, I understand even less than I thought I did.  How does the
> information about what the MUA unflags get back into the store?  Is
> there an implicit additional step (`lei import ...')?

lei will watch (via inotify/EVFILT_VNODE) mail stores it knows
about for flag updates.  At least that's the plan...

Also, when overwriting an existing output, I think it would be
wise to do an implicit import of any messages that aren't
already in lei/store or an external.  That would save users
from accidentally trashing their data.


* Re: [PATCH 4/4] lei q: -tt marks direct hits as "flagged"
  2021-02-26  4:13     ` Eric Wong
@ 2021-02-26  4:38       ` Kyle Meyer
  0 siblings, 0 replies; 8+ messages in thread
From: Kyle Meyer @ 2021-02-26  4:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

Eric Wong writes:

> Kyle Meyer <kyle@kyleam.com> wrote:

>> Oy, I understand even less than I thought I did.  How does the
>> information about what the MUA unflags get back into the store?  Is
>> there an implicit additional step (`lei import ...')?
>
> lei will watch (via inotify/EVFILT_VNODE) mail stores it knows
> about for flag updates.  At least that's the plan...
>
> Also, when overwriting an existing output, I think it would be
> wise to do an implicit import of any messages that aren't
> already in lei/store or an external.  That would save users
> from accidentally trashing their data.

Makes sense.  Thanks for the details (especially if you're repeating
yourself).

