From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 3/7] extsearchidx: close DB handles after use if FD constrained
Date: Fri, 25 Dec 2020 10:21:11 +0000
Message-ID: <20201225102115.6745-4-e@80x24.org>
In-Reply-To: <20201225102115.6745-1-e@80x24.org>

Most distros ship with low RLIMIT_NOFILE limits and surprises
may lurk for admins who configure many inboxes.  Keep FD usage
under control to avoid EMFILE errors at inopportune times during
reindex.

From what I can tell, this is the only place where extindex can
have unpredictable FD growth when there are thousands of inboxes,
and it's in an extremely rare code path.
---
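The check described above can be sketched standalone as follows.
This is only an illustration, not the code in this patch:
soft_nofile_limit() and fd_constrained() are hypothetical helper
names, only the `sh -c 'ulimit -n'` fallback is shown (no
BSD::Resource), and treating "unlimited" as unconstrained is an
assumption that the patch itself does not make.  The "+ 64"
headroom is the same estimate the patch uses.

```perl
use strict;
use warnings;

# Hypothetical helper: read the soft RLIMIT_NOFILE via the shell's
# ulimit builtin; returns '' if the shell could not be run.
sub soft_nofile_limit {
	my $soft = `sh -c 'ulimit -n'` // '';
	chomp($soft);
	return $soft;
}

# Hypothetical helper: true if $nr_inboxes open DB handles (plus
# headroom) would not fit under the soft limit $soft.
sub fd_constrained {
	my ($nr_inboxes, $soft) = @_;
	# unknown limit: be conservative and close handles early
	return 1 unless defined($soft) && length($soft);
	# assumption: an unlimited soft limit never constrains us
	return 0 if $soft eq 'unlimited';
	# anything else non-numeric is treated like an unknown limit
	return 1 unless $soft =~ /\A[0-9]+\z/;
	my $want = $nr_inboxes + 64; # same rough estimate as the patch
	return $want > $soft ? 1 : 0;
}

my $soft = soft_nofile_limit();
printf "soft=%s constrained(1000 inboxes)=%d\n",
	$soft, fd_constrained(1000, $soft);
```

With a typical soft limit of 1024, 1000 inboxes would want 1064
descriptors and count as constrained, while 960 inboxes would
want exactly 1024 and would not.
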
 lib/PublicInbox/ExtSearchIdx.pm | 37 ++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 386e1cee..3f197973 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -393,6 +393,32 @@ sub _ibx_for ($$$) {
 	$self->{ibx_list}->[$pos] // die "BUG: ibx for $smsg->{blob} not mapped"
 }
 
+sub _fd_constrained ($) {
+	my ($self) = @_;
+	$self->{-fd_constrained} //= do {
+		my $soft;
+		if (eval { require BSD::Resource; 1 }) {
+			my $NOFILE = BSD::Resource::RLIMIT_NOFILE();
+			($soft, undef) = BSD::Resource::getrlimit($NOFILE);
+		} else {
+			chomp($soft = `sh -c 'ulimit -n'`);
+		}
+		if (defined($soft)) {
+			my $want = scalar(@{$self->{ibx_list}}) + 64; # estimate
+			my $ret = $want > $soft;
+			if ($ret) {
+				warn <<EOF;
+RLIMIT_NOFILE=$soft insufficient (want: $want), will close DB handles early
+EOF
+			}
+			$ret;
+		} else {
+			warn "Unable to determine RLIMIT_NOFILE: $@\n";
+			1;
+		}
+	};
+}
+
 sub _reindex_finalize ($$$) {
 	my ($req, $smsg, $eml) = @_;
 	my $sync = $req->{sync};
@@ -429,11 +455,16 @@ sub _reindex_finalize ($$$) {
 		my $x = pop(@$ary) // die "BUG: #$docid {by_chash} empty";
 		$x->{num} = delete($x->{xnum}) // die '{xnum} unset';
 		$ibx = _ibx_for($self, $sync, $x);
-		my $e = $ibx->over->get_art($x->{num});
-		$e->{blob} eq $x->{blob} or die <<EOF;
+		if (my $over = $ibx->over) {
+			my $e = $over->get_art($x->{num});
+			$e->{blob} eq $x->{blob} or die <<EOF;
 $x->{blob} != $e->{blob} (${\$ibx->eidx_key}:$e->{num});
 EOF
-		push @todo, $ibx, $e;
+			push @todo, $ibx, $e;
+			$over->dbh_close if _fd_constrained($self);
+		} else {
+			die "$ibx->{inboxdir}: over.sqlite3 unusable: $!\n";
+		}
 	}
 	undef $by_chash;
 	while (my ($ibx, $e) = splice(@todo, 0, 2)) {
