unofficial mirror of meta@public-inbox.org
* [PATCH 0/2] robustness improvements
@ 2016-06-19 10:20 Eric Wong
  2016-06-19 10:20 ` [PATCH 1/2] search: reopen and retry on updated databases Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Eric Wong @ 2016-06-19 10:20 UTC (permalink / raw)
  To: meta

Because I care about users downloading over 300 MB from
the all.mbox.gz endpoint over Tor or cloning nearly
800 MB over git.

It would be nice to be able to resume when a disconnect
does happen, however...

Eric Wong (2):
      search: reopen and retry on updated databases
      examples/*@.service: wait one day for graceful shutdown

 examples/public-inbox-httpd@.service |  2 +-
 examples/public-inbox-nntpd@.service |  2 +-
 lib/PublicInbox/Search.pm            | 35 ++++++++++++++++++++++-------------
 3 files changed, 24 insertions(+), 15 deletions(-)



* [PATCH 1/2] search: reopen and retry on updated databases
  2016-06-19 10:20 [PATCH 0/2] robustness improvements Eric Wong
@ 2016-06-19 10:20 ` Eric Wong
  2016-06-19 10:20 ` [PATCH 2/2] examples/*@.service: wait one day for graceful shutdown Eric Wong
  2016-06-19 11:55 ` [PATCH 0/2] robustness improvements Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2016-06-19 10:20 UTC (permalink / raw)
  To: meta

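When the index is updated while a reader is mid-query, Xapian
throws an exception asking us to call Xapian::Database::reopen().
This seems like a nasty thing which breaks downloads of large
mailboxes, so reopen the database and retry the query (up to
10 times) instead of failing outright.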
---
 lib/PublicInbox/Search.pm | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index d9fbc36..856c8c1 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -97,7 +97,7 @@ sub query {
 		$opts->{relevance} = 1 unless exists $opts->{relevance};
 	}
 
-	$self->do_enquire($query, $opts);
+	_do_enquire($self, $query, $opts);
 }
 
 sub get_thread {
@@ -111,10 +111,26 @@ sub get_thread {
 	my $query = Search::Xapian::Query->new(OP_OR, $qtid, $qsub);
 	$opts ||= {};
 	$opts->{limit} ||= 1000;
-	$self->do_enquire($query, $opts);
+	_do_enquire($self, $query, $opts);
 }
 
-sub do_enquire {
+sub _do_enquire {
+	my ($self, $query, $opts) = @_;
+	my $ret;
+	for (1..10) {
+		eval { $ret = _enquire_once($self, $query, $opts) };
+		return $ret unless $@;
+		# Exception: The revision being read has been discarded -
+		# you should call Xapian::Database::reopen()
+		if (index($@, 'Xapian::Database::reopen') >= 0) {
+			reopen($self);
+		} else {
+			die $@;
+		}
+	}
+}
+
+sub _enquire_once {
 	my ($self, $query, $opts) = @_;
 	my $enquire = $self->enquire;
 	if (defined $query) {
@@ -127,6 +143,8 @@ sub do_enquire {
         my $desc = !$opts->{asc};
 	if ($opts->{relevance}) {
 		$enquire->set_sort_by_relevance_then_value(TS, $desc);
+	} elsif ($opts->{num}) {
+		$enquire->set_sort_by_value(NUM, 0);
 	} else {
 		$enquire->set_sort_by_value_then_relevance(TS, $desc);
 	}
@@ -186,21 +204,12 @@ sub num_range_processor {
 # only used for NNTP server
 sub query_xover {
 	my ($self, $beg, $end, $offset) = @_;
-	my $enquire = $self->enquire;
 	my $qp = Search::Xapian::QueryParser->new;
 	$qp->set_database($self->{xdb});
 	$qp->add_valuerangeprocessor($self->num_range_processor);
 	my $query = $qp->parse_query("$beg..$end", QP_FLAGS);
-	$query = Search::Xapian::Query->new(OP_AND, $mail_query, $query);
-	$enquire->set_query($query);
-	$enquire->set_sort_by_value(NUM, 0);
-	my $limit = 200;
-	my $mset = $enquire->get_mset($offset, $limit);
-	my @msgs = map {
-		PublicInbox::SearchMsg->load_doc($_->get_document);
-	} $mset->items;
 
-	{ total => $mset->get_matches_estimated, msgs => \@msgs }
+	_do_enquire($self, $query, {num => 1, limit => 200, offset => $offset});
 }
 
 sub lookup_message {

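The reopen-and-retry idea in the patch above can be sketched in
isolation as follows.  This is only an illustration under stated
assumptions: FlakySearch, enquire_once and retry_reopen are
hypothetical names standing in for a Xapian-backed searcher, not
the PublicInbox::Search API.  One deliberate difference from the
diff: the sketch dies once the attempts are exhausted, whereas
the patched loop, as far as the diff shows, falls off the end
without reporting the failure.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Xapian-backed searcher (assumption,
# not the real PublicInbox::Search class).
package FlakySearch;

sub new {
	my ($class, $failures) = @_;
	bless { failures => $failures, reopened => 0 }, $class;
}

# Dies with a Xapian-style message for the first $failures calls.
sub enquire_once {
	my ($self, $query) = @_;
	if ($self->{failures} > 0) {
		$self->{failures}--;
		die "Exception: The revision being read has been discarded - ",
		    "you should call Xapian::Database::reopen()\n";
	}
	return "results for $query";
}

# A real implementation would reopen the database handle here.
sub reopen { $_[0]->{reopened}++ }

package main;

# Run the query, reopening and retrying when the revision went stale.
sub retry_reopen {
	my ($search, $query, $max) = @_;
	my $err;
	for my $try (1..$max) {
		my $ret;
		my $ok = eval { $ret = $search->enquire_once($query); 1 };
		return $ret if $ok;
		$err = $@;
		# anything other than a stale revision is not ours to handle
		die $err if index($err, 'Xapian::Database::reopen') < 0;
		$search->reopen;
	}
	die "giving up after $max reopen attempts: $err";
}

my $s = FlakySearch->new(2);
print retry_reopen($s, 'foo', 10), "\n";	# prints "results for foo"
```

Matching on the exception text mirrors what the patch does; a
check on the exception class would be less brittle if the
binding in use exposes one.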

* [PATCH 2/2] examples/*@.service: wait one day for graceful shutdown
  2016-06-19 10:20 [PATCH 0/2] robustness improvements Eric Wong
  2016-06-19 10:20 ` [PATCH 1/2] search: reopen and retry on updated databases Eric Wong
@ 2016-06-19 10:20 ` Eric Wong
  2016-06-19 11:55 ` [PATCH 0/2] robustness improvements Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2016-06-19 10:20 UTC (permalink / raw)
  To: meta

Raise TimeoutStopSec from one hour (3600s) to one day (86400s),
because sometimes folks will want to download gigantic mboxes
or make large clones over Tor, and neither is resume-friendly.

Note: the timeout logic in nntpd is somewhat over-aggressive
and can break some large slrnpull runs.  This ought to be easily
recoverable on the client side, though, since it's based on
per-message fetches.
---
 examples/public-inbox-httpd@.service | 2 +-
 examples/public-inbox-nntpd@.service | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/public-inbox-httpd@.service b/examples/public-inbox-httpd@.service
index 6222de5..4efea2a 100644
--- a/examples/public-inbox-httpd@.service
+++ b/examples/public-inbox-httpd@.service
@@ -23,7 +23,7 @@ KillSignal = SIGQUIT
 User = nobody
 Group = nogroup
 ExecReload = /bin/kill -HUP $MAINPID
-TimeoutStopSec = 3600
+TimeoutStopSec = 86400
 KillMode = process
 
 [Install]
diff --git a/examples/public-inbox-nntpd@.service b/examples/public-inbox-nntpd@.service
index 3e203e0..bdd9734 100644
--- a/examples/public-inbox-nntpd@.service
+++ b/examples/public-inbox-nntpd@.service
@@ -25,7 +25,7 @@ KillSignal = SIGQUIT
 User = nobody
 Group = nogroup
 ExecReload = /bin/kill -HUP $MAINPID
-TimeoutStopSec = 3600
+TimeoutStopSec = 86400
 KillMode = process
 
 [Install]


* Re: [PATCH 0/2] robustness improvements
  2016-06-19 10:20 [PATCH 0/2] robustness improvements Eric Wong
  2016-06-19 10:20 ` [PATCH 1/2] search: reopen and retry on updated databases Eric Wong
  2016-06-19 10:20 ` [PATCH 2/2] examples/*@.service: wait one day for graceful shutdown Eric Wong
@ 2016-06-19 11:55 ` Eric Wong
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2016-06-19 11:55 UTC (permalink / raw)
  To: meta

Eric Wong <e@80x24.org> wrote:
> Because I care about users downloading over 300 MB from
> the all.mbox.gz endpoint over Tor or cloning nearly
> 800 MB over git.

Poo.  Seeing tor and varnishd processes wake up constantly
when idle is disappointing, though :<

Well, I suppose it's better than wasting CPU power for
blockchain mining...
