From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 5/5] daemon: share and allow configuring Xapian helpers
Date: Thu, 25 Apr 2024 21:31:46 +0000 [thread overview]
Message-ID: <20240425213146.1166555-6-e@80x24.org> (raw)
In-Reply-To: <20240425213146.1166555-1-e@80x24.org>
Xapian helper processes are disabled by default once again.
However, they can be enabled via the new `-X INTEGER' parameter.
One big positive is the Xapian helpers being spawned by the
top-level daemon means they can be shared freely across all
workers for improved load balancing and memory reduction.
---
Documentation/public-inbox-daemon.pod | 38 +++++++++++++++++++++++++--
Makefile.PL | 6 +++++
lib/PublicInbox/Daemon.pm | 24 +++++++++++++++--
lib/PublicInbox/Search.pm | 8 +++---
lib/PublicInbox/TestCommon.pm | 9 ++++++-
lib/PublicInbox/XapClient.pm | 7 ++---
6 files changed, 80 insertions(+), 12 deletions(-)
diff --git a/Documentation/public-inbox-daemon.pod b/Documentation/public-inbox-daemon.pod
index 6f1e3b53..092be667 100644
--- a/Documentation/public-inbox-daemon.pod
+++ b/Documentation/public-inbox-daemon.pod
@@ -79,9 +79,9 @@ C<err=> may also be specified on a per-listener basis.
Default: /dev/null with C<--daemonize>, inherited otherwise
-=item -W
+=item -W INTEGER
-=item --worker-processes
+=item --worker-processes INTEGER
Set the number of worker processes.
@@ -96,6 +96,40 @@ the master on crashes.
Default: 1
+=item -X INTEGER
+
+=item --xapian-helpers INTEGER
+
+Enables the use of Xapian helper processes to handle expensive,
+non-deterministic Xapian search queries asynchronously without
+blocking simple requests.
+
+With positive values, there is an additional manager process
+that can be signaled to control the number of Xapian helper workers.
+
+* C<-X0> one worker, no manager process
+* C<-X1> one worker, one manager process
+...
+* C<-X8> eight workers, one manager process
+
+As with the public-facing public-inbox-* daemons, sending C<SIGTTIN>
+or C<SIGTTOU> to the Xapian helper manager process will increment or
+decrement the number of workers.
+
+Both Xapian helper workers and managers automatically respawn if they
+crash or are explicitly killed, even with C<-X0>.
+
+A C++ compiler, L<pkg-config(1)>, and Xapian development files (e.g.
+C<libxapian-dev> or C<xapian*-core-dev*>) are required to gain access to
+some expensive queries and significant memory savings.
+
+Xapian helper workers are shared by all C<--worker-processes> of the
+Perl daemon for additional memory savings.
+
+New in public-inbox 2.0.0.
+
+Default: undefined, search queries are handled synchronously
+
=item --cert /path/to/cert
The default TLS certificate for HTTPS, IMAPS, NNTPS, POP3S and/or STARTTLS
diff --git a/Makefile.PL b/Makefile.PL
index 2b2e6b18..27fe02ff 100644
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -255,6 +255,12 @@ check-run : check-man
# GNU and *BSD both allow it.
check-run_T_ARGS = -j\$(N)
+check-xh0 :
+ \$(MAKE) check-run TEST_DAEMON_XH='-X0'
+
+check-xh1 :
+ \$(MAKE) check-run TEST_DAEMON_XH='-X1'
+
check-debris check-run : pure_all
\$(EATMYDATA) \$(PROVE) -bvw xt/\$@.t :: \$(\$\@_T_ARGS)
-@\$(check_manifest)
diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm
index ec76d6b8..e08102e9 100644
--- a/lib/PublicInbox/Daemon.pm
+++ b/lib/PublicInbox/Daemon.pm
@@ -22,9 +22,11 @@ use PublicInbox::GitAsyncCat;
use PublicInbox::Eml;
use PublicInbox::Config;
use PublicInbox::OnDestroy;
+use PublicInbox::Search;
+use PublicInbox::XapClient;
our $SO_ACCEPTFILTER = 0x1000;
my @CMD;
-my ($set_user, $oldset);
+my ($set_user, $oldset, $xh_workers);
my (@cfg_listen, $stdout, $stderr, $group, $user, $pid_file, $daemonize);
my ($nworker, @listeners, %WORKERS, %logs);
my %tls_opt; # scheme://sockname => args for IO::Socket::SSL::SSL_Context->new
@@ -170,6 +172,7 @@ options:
--cert=FILE default SSL/TLS certificate
--key=FILE default SSL/TLS certificate key
-W WORKERS number of worker processes to spawn (default: 1)
+ -X XWORKERS number of Xapian helper processes (default: undefined)
See public-inbox-daemon(8) and $prog(1) man pages for more.
EOF
@@ -185,6 +188,7 @@ EOF
'multi-accept=i' => \$PublicInbox::Listener::MULTI_ACCEPT,
'cert=s' => \$default_cert,
'key=s' => \$default_key,
+ 'X|xapian-helpers=i' => \$xh_workers,
'help|h' => \(my $show_help),
);
GetOptions(%opt) or die $help;
@@ -687,6 +691,14 @@ sub worker_loop {
PublicInbox::DS::event_loop(\%WORKER_SIG, $oldset);
}
+sub respawn_xh { # awaitpid cb
+ my ($pid) = @_;
+ return unless @listeners;
+ warn "W: xap_helper PID:$pid died: \$?=$?, respawning...\n";
+ $PublicInbox::Search::XHC =
+ PublicInbox::XapClient::start_helper('-j', $xh_workers);
+}
+
sub run {
my ($default_listen) = @_;
$nworker = 1;
@@ -699,7 +711,15 @@ sub run {
local $PublicInbox::Git::async_warn = 1;
local $SIG{__WARN__} = PublicInbox::Eml::warn_ignore_cb();
local %WORKER_SIG = %WORKER_SIG;
- local %POST_ACCEPT;
+ local $PublicInbox::XapClient::tries = 0;
+
+ local $PublicInbox::Search::XHC = PublicInbox::XapClient::start_helper(
+ '-j', $xh_workers) if defined($xh_workers);
+ if ($PublicInbox::Search::XHC) {
+ require PublicInbox::XhcMset;
+ awaitpid($PublicInbox::Search::XHC->{io}->attached_pid,
+ \&respawn_xh);
+ }
daemon_loop();
# $unlink_on_leave runs
diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm
index b7732ae5..4adef366 100644
--- a/lib/PublicInbox/Search.pm
+++ b/lib/PublicInbox/Search.pm
@@ -11,7 +11,7 @@ our @EXPORT_OK = qw(retry_reopen int_val get_pct xap_terms);
use List::Util qw(max);
use POSIX qw(strftime);
use Carp ();
-our $XHC;
+our $XHC = 0; # defined but false
# values for searching, changing the numeric value breaks
# compatibility with old indices (so don't change them it)
@@ -57,7 +57,7 @@ use constant {
};
use PublicInbox::Smsg;
-use PublicInbox::Over;
+eval { require PublicInbox::Over };
our $QP_FLAGS;
our %X = map { $_ => 0 } qw(BoolWeight Database Enquire QueryParser Stem Query);
our $Xap; # 'Xapian' or 'Search::Xapian'
@@ -428,9 +428,9 @@ sub mset {
do_enquire($self, $qry, $opt, TS);
}
-sub xhc_start_maybe () {
+sub xhc_start_maybe (@) {
require PublicInbox::XapClient;
- my $xhc = PublicInbox::XapClient::start_helper();
+ my $xhc = PublicInbox::XapClient::start_helper(@_);
require PublicInbox::XhcMset if $xhc;
$xhc;
}
diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm
index a7ec9b5b..b8b7b827 100644
--- a/lib/PublicInbox/TestCommon.pm
+++ b/lib/PublicInbox/TestCommon.pm
@@ -17,6 +17,7 @@ my $lei_loud = $ENV{TEST_LEI_ERR_LOUD};
our $tail_cmd = $ENV{TAIL};
our ($lei_opt, $lei_out, $lei_err);
use autodie qw(chdir close fcntl mkdir open opendir seek unlink);
+$ENV{XDG_CACHE_HOME} //= "$ENV{HOME}/.cache"; # reuse C++ xap_helper builds
$_ = File::Spec->rel2abs($_) for (grep(!m!^/!, @INC));
@@ -565,6 +566,9 @@ sub start_script {
my $run_mode = $ENV{TEST_RUN_MODE} // $opt->{run_mode} // 2;
my $sub = $run_mode == 0 ? undef : key2sub($key);
my $tail;
+ my $xh = $ENV{TEST_DAEMON_XH};
+ $xh && $key =~ /-(?:imapd|netd|httpd|pop3d|nntpd)\z/ and
+ push @argv, split(/\s+/, $xh);
if ($tail_cmd) {
my @paths;
for (@argv) {
@@ -720,7 +724,10 @@ SKIP: {
require PublicInbox::Spawn;
require PublicInbox::Config;
require File::Path;
-
+ eval { # use XDG_CACHE_HOME, first:
+ require PublicInbox::XapHelperCxx;
+ PublicInbox::XapHelperCxx::build();
+ };
local %ENV = %ENV;
delete $ENV{XDG_DATA_HOME};
delete $ENV{XDG_CONFIG_HOME};
diff --git a/lib/PublicInbox/XapClient.pm b/lib/PublicInbox/XapClient.pm
index f0270091..24b3f45e 100644
--- a/lib/PublicInbox/XapClient.pm
+++ b/lib/PublicInbox/XapClient.pm
@@ -12,6 +12,7 @@ use PublicInbox::Spawn qw(spawn);
use Socket qw(AF_UNIX SOCK_SEQPACKET);
use PublicInbox::IPC;
use autodie qw(pipe socketpair);
+our $tries = 50;
sub mkreq {
my ($self, $ios, @arg) = @_;
@@ -19,13 +20,13 @@ sub mkreq {
pipe($r, $ios->[0]) if !defined($ios->[0]);
my @fds = map fileno($_), @$ios;
my $buf = join("\0", @arg, '');
- $n = $PublicInbox::IPC::send_cmd->($self->{io}, \@fds, $buf, 0) //
- die "send_cmd: $!";
+ $n = $PublicInbox::IPC::send_cmd->($self->{io}, \@fds, $buf, 0, $tries)
+ // die "send_cmd: $!";
$n == length($buf) or die "send_cmd: $n != ".length($buf);
$r;
}
-sub start_helper {
+sub start_helper (@) {
$PublicInbox::IPC::send_cmd or return; # can't work w/o SCM_RIGHTS
my @argv = @_;
socketpair(my $sock, my $in, AF_UNIX, SOCK_SEQPACKET, 0);
prev parent reply other threads:[~2024-04-25 21:31 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-25 21:31 [PATCH 0/5] xap_helper stuff for public daemons Eric Wong
2024-04-25 21:31 ` [PATCH 1/5] t/cindex: require DBD::SQLite for now Eric Wong
2024-04-25 21:31 ` [PATCH 2/5] www: mbox*: use Perl 5.12 Eric Wong
2024-04-25 21:31 ` [PATCH 3/5] send_cmd4: make `tries' a per-call parameter Eric Wong
2024-04-25 21:31 ` [PATCH 4/5] search: async_mset: pass resource errors to callback Eric Wong
2024-04-25 21:31 ` Eric Wong [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://public-inbox.org/README
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240425213146.1166555-6-e@80x24.org \
--to=e@80x24.org \
--cc=meta@public-inbox.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).