unofficial mirror of meta@public-inbox.org
* [PATCH 0/9] minor tweaks and fixes
@ 2018-03-30  1:20 Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 1/9] search: warn on reopens and die on total failure Eric Wong (Contractor, The Linux Foundation)
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

Xapian seems poorly optimized for the
"gimme-the-most-recent-N-messages" request which hits the
per-inbox homepage (/$INBOX/), so we make a minor tweak
in hopes of making it less painful.

This series also includes a few other fixes and improved
instructions for creating a v2 mirror.

Eric Wong (Contractor, The Linux Foundation) (9):
  search: warn on reopens and die on total failure
  v2writable: allow gaps in git partitions
  v2writable: convert some fatal reindex errors to warnings
  wwwstream: flesh out clone instructions for v2
  v2writable: go backwards through alternate Message-IDs
  view: speed up homepage loading time with date clamp
  view: drop load_results
  msgtime: parse 3-digit years properly
  feed: optimize query for feeds, too

 MANIFEST                      |  1 +
 lib/PublicInbox/Feed.pm       |  2 +-
 lib/PublicInbox/Inbox.pm      | 20 ++++++++++++++++++++
 lib/PublicInbox/MsgTime.pm    |  3 +++
 lib/PublicInbox/Search.pm     |  4 +++-
 lib/PublicInbox/V2Writable.pm | 27 ++++++++++++++++++++++-----
 lib/PublicInbox/View.pm       | 19 ++++++-------------
 lib/PublicInbox/WwwStream.pm  | 25 ++++++++++++++++++++-----
 t/time.t                      | 28 ++++++++++++++++++++++++++++
 9 files changed, 104 insertions(+), 25 deletions(-)
 create mode 100644 t/time.t

-- 
EW



* [PATCH 1/9] search: warn on reopens and die on total failure
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 2/9] v2writable: allow gaps in git partitions Eric Wong (Contractor, The Linux Foundation)
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

Running -watch on a busy/giant Maildir caused too many Xapian
errors while attempting to browse.  Warn on each reopen attempt
and die once all ten attempts have failed.
---
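A minimal standalone sketch of the bounded-retry pattern, for
illustration only.  My::ModifiedError and retry() are hypothetical
stand-ins for Search::Xapian::DatabaseModifiedError and retry_reopen;
no Xapian calls are made here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Search::Xapian::DatabaseModifiedError
package My::ModifiedError { sub new { bless {}, shift } }

package main;

# Call $cb up to $max times; call $reopen after each retryable failure.
sub retry {
	my ($cb, $reopen, $max) = @_;
	for my $i (1..$max) {
		my $ret = eval { $cb->() };
		return $ret unless $@;
		if (ref($@) eq 'My::ModifiedError') {
			warn "reopen try #$i\n";
			$reopen->();
		} else {
			die $@; # unrelated errors propagate immediately
		}
	}
	# every attempt failed: report it instead of returning silently
	die "Too many database modifications in progress\n";
}

my $failures = 2; # the callback fails twice, then succeeds
my $reopens = 0;
my $res = retry(sub {
	die My::ModifiedError->new if $failures-- > 0;
	'ok';
}, sub { $reopens++ }, 10);
print "$res after $reopens reopens\n";
```

The final die is the point of the change: without it, a caller cannot
tell exhausted retries apart from a callback which returned nothing.
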
 lib/PublicInbox/Search.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index 5fc7682..de296e1 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -215,18 +215,20 @@ sub get_thread {
 sub retry_reopen {
 	my ($self, $cb) = @_;
 	my $ret;
-	for (1..10) {
+	for my $i (1..10) {
 		eval { $ret = $cb->() };
 		return $ret unless $@;
 		# Exception: The revision being read has been discarded -
 		# you should call Xapian::Database::reopen()
 		if (ref($@) eq 'Search::Xapian::DatabaseModifiedError') {
+			warn "reopen try #$i on $@\n";
 			reopen($self);
 		} else {
 			warn "ref: ", ref($@), "\n";
 			die;
 		}
 	}
+	die "Too many Xapian database modifications in progress\n";
 }
 
 sub _do_enquire {
-- 
EW



* [PATCH 2/9] v2writable: allow gaps in git partitions
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 1/9] search: warn on reopens and die on total failure Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 3/9] v2writable: convert some fatal reindex errors to warnings Eric Wong (Contractor, The Linux Foundation)
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

Somebody may only care about the most recent history,
so allow -init and -index to operate quietly on missing
partitions.
---
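A small sketch of the "skip missing partitions" loop, assuming a
made-up layout of N.git directories under a temporary prefix.  The
directory names and $max_git value are illustrative, not taken from a
real inbox.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical layout: $pfx/0.git, 2.git and 3.git exist; 1.git is gone
my $pfx = tempdir(CLEANUP => 1);
mkdir "$pfx/$_.git" or die "mkdir: $!" for (0, 2, 3);
my $max_git = 3;

my @visited;
# newest partition first, as reindex does
for (my $cur = $max_git; $cur >= 0; $cur--) {
	my $git_dir = "$pfx/$cur.git";
	-d $git_dir or next; # missing parts are fine
	push @visited, $cur;
}
print "visited partitions: @visited\n";
```
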
 lib/PublicInbox/V2Writable.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 4e7d6de..6394d30 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -67,7 +67,9 @@ sub init_inbox {
 	my ($self, $parallel) = @_;
 	$self->{parallel} = $parallel;
 	$self->idx_init;
-	$self->git_init(0);
+	my $max_git = -1;
+	git_dir_latest($self, \$max_git);
+	$self->git_init($max_git >= 0 ? $max_git : 0);
 	$self->done;
 }
 
@@ -621,6 +623,7 @@ sub reindex {
 		for (my $cur = $max_git; $cur >= 0; $cur--) {
 			die "already reindexing!\n" if $self->{reindex_pipe};
 			my $git = PublicInbox::Git->new("$pfx/$cur.git");
+			-d $git->{git_dir} or next; # missing parts are fine
 			chomp($tip = $git->qx('rev-parse', $head)) unless $tip;
 			my $h = $cur == $max_git ? $tip : $head;
 			my @count = ('rev-list', '--count', $h, '--', 'm');
@@ -642,6 +645,7 @@ sub reindex {
 		die "already reindexing!\n" if delete $self->{reindex_pipe};
 		my $cmt;
 		my $git_dir = "$pfx/$cur.git";
+		-d $git_dir or next; # missing parts are fine
 		my $git = PublicInbox::Git->new($git_dir);
 		my $h = $cur == $max_git ? $tip : $head;
 		my $fh = $self->{reindex_pipe} = $git->popen(@cmd, $h);
-- 
EW



* [PATCH 3/9] v2writable: convert some fatal reindex errors to warnings
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 1/9] search: warn on reopens and die on total failure Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 2/9] v2writable: allow gaps in git partitions Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 4/9] wwwstream: flesh out clone instructions for v2 Eric Wong (Contractor, The Linux Foundation)
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

By supporting purge and allowing users to delete git partitions,
we open ourselves up to gaps and un-reindexable data.  Let
that be.
---
 lib/PublicInbox/V2Writable.pm | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 6394d30..269b028 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -552,7 +552,7 @@ sub reindex_oid {
 		$num = $$regen--;
 		die "BUG: ran out of article numbers\n" if $num <= 0;
 		my $mm = $self->{skel}->{mm};
-		foreach my $mid (@$mids) {
+		foreach my $mid (reverse @$mids) {
 			if ($mm->mid_set($num, $mid) == 1) {
 				$mid0 = $mid;
 				last;
@@ -560,7 +560,11 @@ sub reindex_oid {
 		}
 		if (!defined($mid0)) {
 			my $id = '<' . join('> <', @$mids) . '>';
-			warn "Message-Id $id unusable for $num\n";
+			warn "Message-ID $id unusable for $num\n";
+			foreach my $mid (@$mids) {
+				defined(my $n = $mm->num_for($mid)) or next;
+				warn "#$n previously mapped for <$mid>\n";
+			}
 		}
 	}
 
@@ -661,8 +665,17 @@ sub reindex {
 		}
 		delete $self->{reindex_pipe};
 	}
+	my $gaps;
+	if ($regen && $$regen != 0) {
+		warn "W: leftover article number ($$regen)\n";
+		$gaps = 1;
+	}
 	my ($min, $max) = $mm_tmp->minmax;
-	defined $max and die "leftover article numbers at $min..$max\n";
+	if (defined $max) {
+		warn "W: leftover article numbers at $min..$max\n";
+		$gaps = 1;
+	}
+	warn "W: were old git partitions deleted?\n" if $gaps;
 	my @d = sort keys %$D;
 	if (@d) {
 		warn "BUG: ", scalar(@d)," unseen deleted messages marked\n";
-- 
EW



* [PATCH 4/9] wwwstream: flesh out clone instructions for v2
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
                   ` (2 preceding siblings ...)
  2018-03-30  1:20 ` [PATCH 3/9] v2writable: convert some fatal reindex errors to warnings Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 5/9] v2writable: go backwards through alternate Message-IDs Eric Wong (Contractor, The Linux Foundation)
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

Relying solely on git for v2 repos is probably not
very useful, so add pointers to the public-inbox-init and
public-inbox-index commands.
---
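A rough sketch of how the per-partition clone lines are put together.
The base URL and the list of partitions present on disk are
hypothetical inputs; in the patch they come from the inbox object and
a -d check on each partition.

```perl
use strict;
use warnings;

# Hypothetical inputs for illustration
my $http = 'https://example.com/inbox'; # base URL, no trailing slash
my @present = (0, 2); # partition numbers which still exist

my $dir = (split(m!/!, $http))[-1]; # last path component: "inbox"
# each entry is "SOURCE_URL DESTINATION" for git clone --mirror
my @urls = map { "$http/$_ $dir/git/$_.git" } @present;
print join("\n", map { "git clone --mirror $_" } @urls), "\n";
```

Giving each clone an explicit destination under $dir/git/ is what lets
a later public-inbox-index run find the partitions in one place.
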
 lib/PublicInbox/WwwStream.pm | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index 7631754..ec75f16 100644
--- a/lib/PublicInbox/WwwStream.pm
+++ b/lib/PublicInbox/WwwStream.pm
@@ -74,14 +74,17 @@ sub _html_end {
 
 	my (%seen, @urls);
 	my $http = $obj->base_url($ctx->{env});
-	chop $http; # no trailing slash
+	chop $http; # no trailing slash for clone
 	my $part = $obj->max_git_part;
+	my $dir = (split(m!/!, $http))[-1];
 	if (defined($part)) { # v2
-		# most recent partition first:
-		for (; $part >= 0; $part--) {
-			my $url = "$http/$part";
+		$seen{$http} = 1;
+		for my $i (0..$part) {
+			# old parts may be deleted:
+			-d "$obj->{mainrepo}/git/$i.git" or next;
+			my $url = "$http/$i";
 			$seen{$url} = 1;
-			push @urls, $url;
+			push @urls, "$url $dir/git/$i.git";
 		}
 	} else { # v1
 		$seen{$http} = 1;
@@ -102,7 +105,19 @@ sub _html_end {
 		$urls .= "\n" .
 			join("\n", map { "\tgit clone --mirror $_" } @urls);
 	}
+	if (defined $part) {
+		my $addrs = $obj->{address};
+		$addrs = join(' ', @$addrs) if ref($addrs) eq 'ARRAY';
+		$urls .=  <<EOF
+
 
+	# If you have public-inbox 1.1+ installed, you may
+	# initialize and index your mirror using the following commands:
+	public-inbox-init -V2 $obj->{name} $dir/ $http \\
+		$addrs
+	public-inbox-index $dir
+EOF
+	}
 	my @nntp = map { qq(<a\nhref="$_">$_</a>) } @{$obj->nntp_url};
 	if (@nntp) {
 		$urls .= "\n\n";
-- 
EW



* [PATCH 5/9] v2writable: go backwards through alternate Message-IDs
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
                   ` (3 preceding siblings ...)
  2018-03-30  1:20 ` [PATCH 4/9] wwwstream: flesh out clone instructions for v2 Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 6/9] view: speed up homepage loading time with date clamp Eric Wong (Contractor, The Linux Foundation)
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

This is consistent with how we internally generate new
Message-IDs to break conflicts and allows ->reindex to
succeed while walking backwards through history.
---
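A toy illustration of trying alternate Message-IDs from last to first.
The mid_insert() below is a hypothetical hash-backed stand-in for the
Msgmap method of the same name, and the Message-IDs are made up.

```perl
use strict;
use warnings;

# Message-IDs which are already mapped to article numbers
my %taken = ('primary@example' => 1, 'alt-b@example' => 2);
my $next_num = 3;

# Hypothetical stand-in: returns a new article number, or undef if
# the Message-ID is already mapped.
sub mid_insert {
	my ($mid) = @_;
	return undef if exists $taken{$mid};
	$taken{$mid} = $next_num++;
}

my $mids = [ 'primary@example', 'alt-a@example', 'alt-b@example' ];
my ($num, $used);
# index 0 already conflicted; try the alternates from last to first
for (my $i = $#$mids; $i >= 1; $i--) {
	$num = mid_insert($mids->[$i]);
	if (defined $num) {
		$used = $mids->[$i];
		last;
	}
}
print "article $num uses <$used>\n";
```
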
 lib/PublicInbox/V2Writable.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 269b028..34f13e2 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -137,7 +137,7 @@ sub num_for {
 		warn "<$mid> reused for mismatched content\n";
 
 		# try the rest of the mids
-		foreach my $i (1..$#$mids) {
+		for(my $i = $#$mids; $i >= 1; $i--) {
 			my $m = $mids->[$i];
 			$num = $self->{skel}->{mm}->mid_insert($m);
 			if (defined $num) {
-- 
EW



* [PATCH 6/9] view: speed up homepage loading time with date clamp
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
                   ` (4 preceding siblings ...)
  2018-03-30  1:20 ` [PATCH 5/9] v2writable: go backwards through alternate Message-IDs Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 7/9] view: drop load_results Eric Wong (Contractor, The Linux Foundation)
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

This saves over 400ms on my system when serving the full LKML
archive, which has over 2.8 million messages.
---
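A self-contained sketch of how the date clamp is derived.  The %ts
hash is a hypothetical stand-in for $srch->lookup_article($n)->ts, and
the limit is shrunk from 200 to 2 to keep the example small.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical article number => Unix timestamp map; article 3 is
# missing (e.g. purged).  1514764800 is 2018-01-01 00:00:00 UTC.
my %ts = (
	1 => 1514764800,
	2 => 1514764800 + 86400,
	4 => 1514764800 + 3 * 86400,
	5 => 1514764800 + 4 * 86400,
);
my ($min, $max) = (1, 5);
my $lim = 2; # the patch uses 200

my $qs = '';
my $n = $max - $lim;
$n = $min if $n < $min;
# walk down from $n until an existing article supplies a date
for (; $qs eq '' && $n >= $min; --$n) {
	my $t = $ts{$n} or next;
	$qs = strftime('d:%Y%m%d..', gmtime($t));
}
print "$qs\n";
```

The resulting string is an open-ended date range, so the search only
has to consider messages dated on or after that day rather than the
whole index.
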
 lib/PublicInbox/Inbox.pm |  1 +
 lib/PublicInbox/View.pm  | 20 +++++++++++++++++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 265360d..90ac9eb 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -132,6 +132,7 @@ sub max_git_part {
 sub mm {
 	my ($self) = @_;
 	$self->{mm} ||= eval {
+		require PublicInbox::Msgmap;
 		_cleanup_later($self);
 		my $dir = $self->{mainrepo};
 		if (($self->{version} || 1) >= 2) {
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index ec04343..60fc1df 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -1069,17 +1069,31 @@ sub index_nav { # callback for WwwStream
 sub index_topics {
 	my ($ctx) = @_;
 	my ($off) = (($ctx->{qp}->{o} || '0') =~ /(\d+)/);
-	my $opts = { offset => $off, limit => 200 };
+	my $lim = 200;
+	my $opts = { offset => $off, limit => $lim };
 
 	$ctx->{order} = [];
 	my $srch = $ctx->{srch};
-	my $sres = $srch->query('', $opts);
+
+	my $qs = '';
+	# this complicated bit cuts loading time by over 400ms on my system:
+	if ($off == 0) {
+		my ($min, $max) = $ctx->{-inbox}->mm->minmax;
+		my $n = $max - $lim;
+		$n = $min if $n < $min;
+		for (; $qs eq '' && $n >= $min; --$n) {
+			my $smsg = $srch->lookup_article($n) or next;
+			$qs = POSIX::strftime('d:%Y%m%d..', gmtime($smsg->ts));
+		}
+	}
+
+	my $sres = $srch->query($qs, $opts);
 	my $nr = scalar @{$sres->{msgs}};
 	if ($nr) {
 		$sres = load_results($srch, $sres);
 		walk_thread(thread_results($ctx, $sres), $ctx, *acc_topic);
 	}
-	$ctx->{-next_o} = $off+ $nr;
+	$ctx->{-next_o} = $off + $nr;
 	$ctx->{-cur_o} = $off;
 	PublicInbox::WwwStream->response($ctx, dump_topics($ctx), *index_nav);
 }
-- 
EW



* [PATCH 7/9] view: drop load_results
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
                   ` (5 preceding siblings ...)
  2018-03-30  1:20 ` [PATCH 6/9] view: speed up homepage loading time with date clamp Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 8/9] msgtime: parse 3-digit years properly Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 9/9] feed: optimize query for feeds, too Eric Wong (Contractor, The Linux Foundation)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

It's no longer necessary to have this since load_expand
now populates $smsg->mid with the "preferred" Message-ID.
This saves around 10ms on the homepage for me.
---
 lib/PublicInbox/View.pm | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 60fc1df..c151f22 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -409,7 +409,7 @@ sub thread_html {
 	my $mid = $ctx->{mid};
 	my $srch = $ctx->{srch};
 	my $sres = $srch->get_thread($mid);
-	my $msgs = load_results($srch, $sres);
+	my $msgs = $sres->{msgs};
 	my $nr = $sres->{total};
 	return missing_thread($ctx) if $nr == 0;
 	my $skel = '<hr><pre>';
@@ -680,7 +680,7 @@ sub thread_skel {
 	$ctx->{prev_attr} = '';
 	$ctx->{prev_level} = 0;
 	$ctx->{dst} = $dst;
-	$sres = load_results($srch, $sres);
+	$sres = $sres->{msgs};
 
 	# reduce hash lookups in skel_dump
 	my $ibx = $ctx->{-inbox};
@@ -801,12 +801,6 @@ sub indent_for {
 	$level ? INDENT x ($level - 1) : '';
 }
 
-sub load_results {
-	my ($srch, $sres) = @_;
-	my $msgs = delete $sres->{msgs};
-	$srch->retry_reopen(sub { [ map { $_->mid; $_ } @$msgs ] });
-}
-
 sub thread_results {
 	my ($ctx, $msgs) = @_;
 	require PublicInbox::SearchThread;
@@ -1088,9 +1082,9 @@ sub index_topics {
 	}
 
 	my $sres = $srch->query($qs, $opts);
-	my $nr = scalar @{$sres->{msgs}};
+	$sres = $sres->{msgs};
+	my $nr = scalar @$sres;
 	if ($nr) {
-		$sres = load_results($srch, $sres);
 		walk_thread(thread_results($ctx, $sres), $ctx, *acc_topic);
 	}
 	$ctx->{-next_o} = $off + $nr;
-- 
EW



* [PATCH 8/9] msgtime: parse 3-digit years properly
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
                   ` (6 preceding siblings ...)
  2018-03-30  1:20 ` [PATCH 7/9] view: drop load_results Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  2018-03-30  1:20 ` [PATCH 9/9] feed: optimize query for feeds, too Eric Wong (Contractor, The Linux Foundation)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

Some folks had bad mail clients which generated 3-digit years
around Y2K...
---
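The substitution can be tried in isolation with the sketch below.
fix_3digit_year() is a hypothetical wrapper name; the regexp itself is
the one used in this patch.

```perl
use strict;
use warnings;

# Rewrite "Mon YYY HH:MM:SS" (3-digit year) as a 4-digit year by
# adding 1900; anything else is returned unchanged.
sub fix_3digit_year {
	my ($d) = @_;
	$d =~ s!([A-Za-z]{3}) (\d{3}) (\d\d:\d\d:\d\d)!
		my $yyyy = $2 + 1900; "$1 $yyyy $3"!e;
	$d;
}

print fix_3digit_year('Fri, 02 Oct 101 01:02:03 +0000'), "\n";
print fix_3digit_year('Fri, 02 Oct 2001 01:02:03 +0000'), "\n";
print fix_3digit_year('Fri, 02 Oct 93 00:00:00 +0000'), "\n";
```

Only the first date is rewritten (101 becomes 2001); 4-digit and
2-digit years do not match \d{3} followed by a space and are left for
the date parser to handle.
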
 MANIFEST                   |  1 +
 lib/PublicInbox/MsgTime.pm |  3 +++
 t/time.t                   | 28 ++++++++++++++++++++++++++++
 3 files changed, 32 insertions(+)
 create mode 100644 t/time.t

diff --git a/MANIFEST b/MANIFEST
index ad145f7..4a1096d 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -185,6 +185,7 @@ t/spamcheck_spamc.t
 t/spawn.t
 t/thread-all.t
 t/thread-cycle.t
+t/time.t
 t/utf8.mbox
 t/v2mda.t
 t/v2reindex.t
diff --git a/lib/PublicInbox/MsgTime.pm b/lib/PublicInbox/MsgTime.pm
index 4295e87..c67a41f 100644
--- a/lib/PublicInbox/MsgTime.pm
+++ b/lib/PublicInbox/MsgTime.pm
@@ -47,6 +47,9 @@ sub msg_date_only ($) {
 	my ($ts, $zone);
 	foreach my $d (@date) {
 		$zone = undef;
+		# Y2K problems: 3-digit years
+		$d =~ s!([A-Za-z]{3}) (\d{3}) (\d\d:\d\d:\d\d)!
+			my $yyyy = $2 + 1900; "$1 $yyyy $3"!e;
 		$ts = eval { str2time($d) };
 		if ($@) {
 			my $mid = $hdr->header_raw('Message-ID');
diff --git a/t/time.t b/t/time.t
new file mode 100644
index 0000000..370a0bd
--- /dev/null
+++ b/t/time.t
@@ -0,0 +1,28 @@
+# Copyright (C) 2018 all contributors <meta@public-inbox.org>
+# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
+use strict;
+use warnings;
+use Test::More;
+use_ok 'PublicInbox::MIME';
+use PublicInbox::MsgTime qw(msg_datestamp);
+my $mime = PublicInbox::MIME->create(
+	header => [
+		From => 'a@example.com',
+		To => 'test@example.com',
+		Subject => 'this is a subject',
+		'Message-ID' => '<a-mid@b>',
+		Date => 'Fri, 02 Oct 93 00:00:00 +0000',
+	],
+	body => "hello world\n",
+);
+
+my $ts = msg_datestamp($mime->header_obj);
+use POSIX qw(strftime);
+is(strftime('%Y-%m-%d %H:%M:%S', gmtime($ts)), '1993-10-02 00:00:00',
+	'got expected date with 2 digit year');
+$mime->header_set(Date => 'Fri, 02 Oct 101 01:02:03 +0000');
+$ts = msg_datestamp($mime->header_obj);
+is(strftime('%Y-%m-%d %H:%M:%S', gmtime($ts)), '2001-10-02 01:02:03',
+	'got expected date with 3 digit year');
+
+done_testing();
-- 
EW



* [PATCH 9/9] feed: optimize query for feeds, too
  2018-03-30  1:20 [PATCH 0/9] minor tweaks and fixes Eric Wong (Contractor, The Linux Foundation)
                   ` (7 preceding siblings ...)
  2018-03-30  1:20 ` [PATCH 8/9] msgtime: parse 3-digit years properly Eric Wong (Contractor, The Linux Foundation)
@ 2018-03-30  1:20 ` Eric Wong (Contractor, The Linux Foundation)
  8 siblings, 0 replies; 10+ messages in thread
From: Eric Wong (Contractor, The Linux Foundation) @ 2018-03-30  1:20 UTC (permalink / raw)
  To: meta

This is a smaller improvement than for the landing /$INBOX/ page
because full message bodies are shown, but it still saves around
100ms on my system with LKML.
---
 lib/PublicInbox/Feed.pm  |  2 +-
 lib/PublicInbox/Inbox.pm | 19 +++++++++++++++++++
 lib/PublicInbox/View.pm  | 17 +----------------
 3 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/lib/PublicInbox/Feed.pm b/lib/PublicInbox/Feed.pm
index f2285a6..2f59f8c 100644
--- a/lib/PublicInbox/Feed.pm
+++ b/lib/PublicInbox/Feed.pm
@@ -114,7 +114,7 @@ sub recent_msgs {
 		my $o = $qp ? $qp->{o} : 0;
 		$o += 0;
 		$o = 0 if $o < 0;
-		my $res = $srch->query('', { limit => $max, offset => $o });
+		my $res = $ibx->recent({ limit => $max, offset => $o });
 		my $next = $o + $max;
 		$ctx->{next_page} = "o=$next" if $res->{total} >= $next;
 		return $res->{msgs};
diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index 90ac9eb..43cf15b 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -9,6 +9,7 @@ use PublicInbox::Git;
 use PublicInbox::MID qw(mid2path);
 use Devel::Peek qw(SvREFCNT);
 use PublicInbox::MIME;
+use POSIX qw(strftime);
 
 my $cleanup_timer;
 eval {
@@ -316,4 +317,22 @@ sub msg_by_mid ($$;$) {
 	$smsg ? msg_by_smsg($self, $smsg, $ref) : undef;
 }
 
+sub recent {
+	my ($self, $opts) = @_;
+	my $qs = '';
+	my $srch = search($self);
+	if (!$opts->{offset}) {
+		# this complicated bit cuts /$INBOX/ loading time by
+		# over 400ms on my system:
+		my ($min, $max) = mm($self)->minmax;
+		my $n = $max - $opts->{limit};
+		$n = $min if $n < $min;
+		for (; $qs eq '' && $n >= $min; --$n) {
+			my $smsg = $srch->lookup_article($n) or next;
+			$qs = strftime('d:%Y%m%d..', gmtime($smsg->ts));
+		}
+	}
+	$srch->query($qs, $opts);
+}
+
 1;
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index c151f22..8ac405f 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -1063,25 +1063,10 @@ sub index_nav { # callback for WwwStream
 sub index_topics {
 	my ($ctx) = @_;
 	my ($off) = (($ctx->{qp}->{o} || '0') =~ /(\d+)/);
-	my $lim = 200;
-	my $opts = { offset => $off, limit => $lim };
 
 	$ctx->{order} = [];
 	my $srch = $ctx->{srch};
-
-	my $qs = '';
-	# this complicated bit cuts loading time by over 400ms on my system:
-	if ($off == 0) {
-		my ($min, $max) = $ctx->{-inbox}->mm->minmax;
-		my $n = $max - $lim;
-		$n = $min if $n < $min;
-		for (; $qs eq '' && $n >= $min; --$n) {
-			my $smsg = $srch->lookup_article($n) or next;
-			$qs = POSIX::strftime('d:%Y%m%d..', gmtime($smsg->ts));
-		}
-	}
-
-	my $sres = $srch->query($qs, $opts);
+	my $sres = $ctx->{-inbox}->recent({offset => $off, limit => 200 });
 	$sres = $sres->{msgs};
 	my $nr = scalar @$sres;
 	if ($nr) {
-- 
EW



