* Holding on to deleted packfiles @ 2021-09-21 14:47 Konstantin Ryabitsev 2021-09-21 19:06 ` Eric Wong 0 siblings, 1 reply; 7+ messages in thread From: Konstantin Ryabitsev @ 2021-09-21 14:47 UTC (permalink / raw) To: meta Hello: A large git repack job that ran over the weekend revealed a minor problem -- public-inbox daemon processes will hold on to deleted pack files until they are restarted. Is there any way to gracefully recognize and handle this condition? It's not quite benign, as this ended up keeping 40GB+ worth of inodes from being released. -K ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Holding on to deleted packfiles 2021-09-21 14:47 Holding on to deleted packfiles Konstantin Ryabitsev @ 2021-09-21 19:06 ` Eric Wong 2021-09-21 20:57 ` Konstantin Ryabitsev 0 siblings, 1 reply; 7+ messages in thread From: Eric Wong @ 2021-09-21 19:06 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: meta Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote: > Hello: > > A large git repack job that ran over the weekend revealed a minor problem -- > public-inbox daemon processes will hold on to deleted pack files until they > are restarted. Is there any way to gracefully recognize and handle this > condition? It's not quite benign, as this ended up keeping 40GB+ worth of > inodes from being released. Was this from /all/ (ALL.git using batch-file) or Gcf2? The old stuff has timers to do periodic cleanup, but the new stuff is trickier as the cost of a restart is higher... It should be alright to wire up the old timers to ALL.git with (hundreds) of inboxes lore currently has. git 2.33+ should be better when we get into the thousands; but it's still not great. Gcf2/libgit2 startup time is slow compared to git 2.33+, but I can write a dumb fstat loop to periodically look for st_nlink==0... At some point, I'd like to fix git.git to release packs (and add alternates dynamically) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Holding on to deleted packfiles 2021-09-21 19:06 ` Eric Wong @ 2021-09-21 20:57 ` Konstantin Ryabitsev 2021-09-21 21:37 ` Eric Wong 0 siblings, 1 reply; 7+ messages in thread From: Konstantin Ryabitsev @ 2021-09-21 20:57 UTC (permalink / raw) To: Eric Wong; +Cc: meta On Tue, Sep 21, 2021 at 07:06:53PM +0000, Eric Wong wrote: > Was this from /all/ (ALL.git using batch-file) or Gcf2? I believe this was from Gcf2, though I can't go back and check, unfortunately. > The old stuff has timers to do periodic cleanup, but the new > stuff is trickier as the cost of a restart is higher... > > It should be alright to wire up the old timers to ALL.git with > (hundreds) of inboxes lore currently has. git 2.33+ should be > better when we get into the thousands; but it's still not > great. Well, it may also not be something that's the responsibility of public-inbox either, e.g. other long-running daemons don't perform such checks. We can just issue a reload after we've done repacking. I was just wondering if perhaps you already did something that would recognize that old pack files have gone away. -K ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Holding on to deleted packfiles 2021-09-21 20:57 ` Konstantin Ryabitsev @ 2021-09-21 21:37 ` Eric Wong 2021-09-22 9:45 ` [PATCH] gcf2 + extsearch: check for unlinked files on Linux Eric Wong 0 siblings, 1 reply; 7+ messages in thread From: Eric Wong @ 2021-09-21 21:37 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: meta Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote: > Well, it may also not be something that's the responsibility of public-inbox > either, e.g. other long-running daemons don't perform such checks. We can just > issue a reload after we've done repacking. The less operational knowledge required, the better; especially since I'd like to see more self-hosted instances pop up. All the dependencies we use become our responsibility, too. Given the lack of response to the libgit2 bug I've found[1], I'm far more inclined to improve git itself. [1] https://bugs.debian.org/975607 https://github.com/libgit2/libgit2/issues/5711 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] gcf2 + extsearch: check for unlinked files on Linux 2021-09-21 21:37 ` Eric Wong @ 2021-09-22 9:45 ` Eric Wong 2021-09-22 19:51 ` Eric Wong 2021-09-23 0:46 ` [PATCH 2/1] daemons: revamp periodic cleanup task Eric Wong 0 siblings, 2 replies; 7+ messages in thread From: Eric Wong @ 2021-09-22 9:45 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: meta Eric Wong <e@80x24.org> wrote: > Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote: > > Well, it may also not be something that's the responsibility of public-inbox > > either, e.g. other long-running daemons don't perform such checks. We can just > > issue a reload after we've done repacking. > > The less operational knowledge required, the better; especially > since I'd like to see more self-hosted instances pop up. I've lightly tested the patch below; but I'm questioning the need for Gcf2 if /all/ is configured (more on that later, need nap) -----------8<--------- Subject: [PATCH] gcf2 + extsearch: check for unlinked files on Linux Check for unlinked mmap-ed files via /proc/$PID/maps every 60s or so. ExtSearch (extindex) is compatible-enough with Inbox objects to be wired into the old per-inbox code, but the startup cost is projected to be much higher down the line when there's >30K inboxes, so we scan /proc/$PID/maps for deleted files before unlinking. With old Inbox objects, it was (and is) simpler to just kill processes w/o checking due to the low startup cost (and non-portability of checking). Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://public-inbox.org/meta/20210921144754.gulkneuulzo27qbw@meerkat.local/ --- lib/PublicInbox/ExtSearch.pm | 10 ++++++++-- lib/PublicInbox/Gcf2.pm | 26 +++++++++++++++++++++++--- lib/PublicInbox/Git.pm | 16 +++++++++++++++- lib/PublicInbox/Inbox.pm | 11 +++++++++-- 4 files changed, 55 insertions(+), 8 deletions(-) diff --git a/lib/PublicInbox/ExtSearch.pm b/lib/PublicInbox/ExtSearch.pm index bd301158..cd7cba2e 100644 --- a/lib/PublicInbox/ExtSearch.pm +++ b/lib/PublicInbox/ExtSearch.pm @@ -33,12 +33,18 @@ sub misc { # same as per-inbox ->over, for now... sub over { my ($self) = @_; - $self->{over} //= PublicInbox::Over->new("$self->{xpfx}/over.sqlite3"); + $self->{over} //= do { + PublicInbox::Inbox::_cleanup_later($self); + PublicInbox::Over->new("$self->{xpfx}/over.sqlite3"); + }; } sub git { my ($self) = @_; - $self->{git} //= PublicInbox::Git->new("$self->{topdir}/ALL.git"); + $self->{git} //= do { + PublicInbox::Inbox::_cleanup_later($self); + PublicInbox::Git->new("$self->{topdir}/ALL.git"); + }; } # returns a hashref of { $NEWSGROUP_NAME => $ART_NO } using the `xref3' table diff --git a/lib/PublicInbox/Gcf2.pm b/lib/PublicInbox/Gcf2.pm index 64945ca6..f546208f 100644 --- a/lib/PublicInbox/Gcf2.pm +++ b/lib/PublicInbox/Gcf2.pm @@ -8,6 +8,7 @@ use strict; use v5.10.1; use PublicInbox::Spawn qw(which popen_rd); # may set PERL_INLINE_DIRECTORY use Fcntl qw(LOCK_EX SEEK_SET); +use Time::HiRes qw(clock_gettime CLOCK_MONOTONIC); use IO::Handle; # autoflush BEGIN { my (%CFG, $c_src); @@ -96,11 +97,21 @@ sub add_alt ($$) { 1; } -# Usage: $^X -MPublicInbox::Gcf2 -e PublicInbox::Gcf2::loop +sub have_unlinked_files () { + # FIXME: port gcf2-like over to git.git so we won't need to + # deal with libgit2 + return 1 if $^O ne 'linux'; + open my $fh, '<', "/proc/$$/maps" or return; + while (<$fh>) { return 1 if /\.(?:idx|pack) \(deleted\)$/ } + undef; +} + +# Usage: $^X -MPublicInbox::Gcf2 -e PublicInbox::Gcf2::loop [EXPIRE-TIMEOUT] # (see lib/PublicInbox/Gcf2Client.pm) -sub loop () { +sub loop (;$) { + my $exp = $_[0] || $ARGV[0] || 60; # seconds my $gcf2 = new(); - my %seen; + my (%seen, $check_at); STDERR->autoflush(1); STDOUT->autoflush(1); @@ -116,6 +127,7 @@ sub loop () { $gcf2 = new(); %seen = ($git_dir => add_alt($gcf2,"$git_dir/objects")); + $check_at = clock_gettime(CLOCK_MONOTONIC) + $exp; if ($gcf2->cat_oid(1, $oid)) { warn "I: $$ $oid found after retry\n"; @@ -123,6 +135,14 @@ sub loop () { warn "W: $$ $oid missing after retry\n"; print "$oid missing\n"; # mimic git-cat-file } + } else { # check expiry to deal with deleted pack files + my $now = clock_gettime(CLOCK_MONOTONIC); + $check_at //= $now + $exp; + if ($now > $check_at && have_unlinked_files()) { + undef $check_at; + $gcf2 = new(); + %seen = (); + } } } } diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index 5ef1db2f..cf51239f 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -12,7 +12,7 @@ use v5.10.1; use parent qw(Exporter); use POSIX (); use IO::Handle; # ->autoflush -use Errno qw(EINTR EAGAIN); +use Errno qw(EINTR EAGAIN ENOENT); use File::Glob qw(bsd_glob GLOB_NOSORT); use File::Spec (); use Time::HiRes qw(stat); @@ -523,6 +523,20 @@ sub manifest_entry { $ent; } +sub cleanup_if_unlinked { + my ($self) = @_; + return cleanup($self) if $^O ne 'linux'; + # Linux-specific /proc/$PID/maps access + # TODO: support this inside git.git + for my $fld (qw(pid pid_c)) { + my $pid = $self->{$fld} // next; + open my $fh, '<', "/proc/$pid/maps" or next; + while (<$fh>) { + return cleanup($self) if /\.(?:idx|pack) \(deleted\)$/; + } + } +} + 1; __END__ =pod diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm index 20f8c884..02ffc7be 100644 --- a/lib/PublicInbox/Inbox.pm +++ b/lib/PublicInbox/Inbox.pm @@ -22,8 +22,15 @@ my $CLEANUP = {}; # string(inbox) -> inbox sub git_cleanup ($) { my ($self) = @_; - my $git = $self->{git} or return; - $git->cleanup; + if ($self->isa(__PACKAGE__)) { + # normal Inbox; low startup cost, and likely to have many + # many so this keeps process/pipe counts in check + $self->{git}->cleanup if $self->{git}; + } else { + # ExtSearch, high startup cost if ->ALL, and probably + # only one per-daemon, so teardown only if required: + $self->git->cleanup_if_unlinked; + } } sub cleanup_task () { ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] gcf2 + extsearch: check for unlinked files on Linux 2021-09-22 9:45 ` [PATCH] gcf2 + extsearch: check for unlinked files on Linux Eric Wong @ 2021-09-22 19:51 ` Eric Wong 2021-09-23 0:46 ` [PATCH 2/1] daemons: revamp periodic cleanup task Eric Wong 1 sibling, 0 replies; 7+ messages in thread From: Eric Wong @ 2021-09-22 19:51 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: meta Eric Wong <e@80x24.org> wrote: > I've lightly tested the patch below; Erm, seems broken for extsearch ALL.git, but fine for gcf2 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/1] daemons: revamp periodic cleanup task 2021-09-22 9:45 ` [PATCH] gcf2 + extsearch: check for unlinked files on Linux Eric Wong 2021-09-22 19:51 ` Eric Wong @ 2021-09-23 0:46 ` Eric Wong 1 sibling, 0 replies; 7+ messages in thread From: Eric Wong @ 2021-09-23 0:46 UTC (permalink / raw) To: Konstantin Ryabitsev; +Cc: meta Neither Inboxes nor ExtSearch objects were retrying correctly when there are live git processes, but the inboxes were getting rescanned for search or other reasons. Ensure the scan retries eventually if there's live processes. We also need to update the cleanup task to detect Xapian shard count changes, since Xapian ->reopen is enough to detect any other Xapian changes. Otherwise, we just issue an inexpensive ->reopen call and let Xapian check whether there's anything worth reopening. This also lets us eliminate the Devel::Peek dependency. --- ci/deps.perl | 5 --- lib/PublicInbox/Admin.pm | 3 +- lib/PublicInbox/ExtSearch.pm | 7 +++- lib/PublicInbox/Git.pm | 11 ++++-- lib/PublicInbox/Inbox.pm | 67 ++++++++++++------------------------ lib/PublicInbox/Search.pm | 16 ++++++++- t/extsearch.t | 15 ++++++++ 7 files changed, 68 insertions(+), 56 deletions(-) diff --git a/ci/deps.perl b/ci/deps.perl index a797911a..ae85986d 100755 --- a/ci/deps.perl +++ b/ci/deps.perl @@ -17,7 +17,6 @@ my $profiles = { essential => [ qw( git perl - Devel::Peek Digest::SHA Encode ExtUtils::MakeMaker @@ -79,10 +78,6 @@ my $non_auto = { pkg => 'p5-TimeDate', rpm => 'perl-TimeDate', }, - 'Devel::Peek' => { - deb => 'perl', # libperl5.XX, but the XX varies - pkg => 'perl5', - }, 'Digest::SHA' => { deb => 'perl', # libperl5.XX, but the XX varies pkg => 'perl5', diff --git a/lib/PublicInbox/Admin.pm b/lib/PublicInbox/Admin.pm index 20964f9c..dcf17cf5 100644 --- a/lib/PublicInbox/Admin.pm +++ b/lib/PublicInbox/Admin.pm @@ -198,8 +198,7 @@ sub resolve_inboxes ($;$$) { $opt->{-eidx_ok} ? (\@ibxs, \@eidx) : @ibxs; } -# TODO: make Devel::Peek optional, only used for daemon -my @base_mod = qw(Devel::Peek); +my @base_mod = (); my @over_mod = qw(DBD::SQLite DBI); my %mod_groups = ( -index => [ @base_mod, @over_mod ], diff --git a/lib/PublicInbox/ExtSearch.pm b/lib/PublicInbox/ExtSearch.pm index cd7cba2e..7520e71c 100644 --- a/lib/PublicInbox/ExtSearch.pm +++ b/lib/PublicInbox/ExtSearch.pm @@ -112,6 +112,11 @@ sub description { '$EXTINDEX_DIR/description missing'; } +sub search { + PublicInbox::Inbox::_cleanup_later($_[0]); + $_[0]; +} + no warnings 'once'; *base_url = \&PublicInbox::Inbox::base_url; *smsg_eml = \&PublicInbox::Inbox::smsg_eml; @@ -121,6 +126,6 @@ no warnings 'once'; *recent = \&PublicInbox::Inbox::recent; *max_git_epoch = *nntp_usable = *msg_by_path = \&mm; # undef -*isrch = *search = \&PublicInbox::Search::reopen; +*isrch = \&search; 1; diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index cf51239f..3c577ab3 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/Git.pm @@ -400,7 +400,7 @@ sub cleanup { delete $self->{inflight_c}; _destroy($self, qw(cat_rbuf in out pid)); _destroy($self, qw(chk_rbuf in_c out_c pid_c err_c)); - !!($self->{pid} || $self->{pid_c}); + defined($self->{pid}) || defined($self->{pid_c}); } @@ -523,18 +523,25 @@ sub manifest_entry { $ent; } +# returns true if there are pending cat-file processes sub cleanup_if_unlinked { my ($self) = @_; return cleanup($self) if $^O ne 'linux'; # Linux-specific /proc/$PID/maps access # TODO: support this inside git.git + my $ret = 0; for my $fld (qw(pid pid_c)) { my $pid = $self->{$fld} // next; - open my $fh, '<', "/proc/$pid/maps" or next; + open my $fh, '<', "/proc/$pid/maps" or return cleanup($self); while (<$fh>) { + # n.b. we do not restart for unlinked multi-pack-index + # since it's not too huge, and the startup cost may + # be higher. return cleanup($self) if /\.(?:idx|pack) \(deleted\)$/; } + ++$ret; } + $ret; } 1; diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm index 02ffc7be..c0962af9 100644 --- a/lib/PublicInbox/Inbox.pm +++ b/lib/PublicInbox/Inbox.pm @@ -16,70 +16,47 @@ use Carp qw(croak); # admin will attempt to replace them atomically after compact/vacuum # and we need to be prepared for that. my $cleanup_timer; -my $cleanup_avail = -1; # 0, or 1 -my $have_devel_peek; my $CLEANUP = {}; # string(inbox) -> inbox sub git_cleanup ($) { my ($self) = @_; - if ($self->isa(__PACKAGE__)) { - # normal Inbox; low startup cost, and likely to have many - # many so this keeps process/pipe counts in check - $self->{git}->cleanup if $self->{git}; - } else { - # ExtSearch, high startup cost if ->ALL, and probably - # only one per-daemon, so teardown only if required: - $self->git->cleanup_if_unlinked; - } + my $git = $self->{git} // return undef; + # normal inboxes have low startup cost and there may be many, so + # keep process+pipe counts in check. ExtSearch may have high startup + # cost (e.g. ->ALL) and but likely one per-daemon, so cleanup only + # if there's unlinked files + my $live = $self->isa(__PACKAGE__) ? $git->cleanup + : $git->cleanup_if_unlinked; + delete($self->{git}) unless $live; + $live; } +# returns true if further checking is required +sub cleanup_shards { $_[0]->{search} ? $_[0]->{search}->cleanup_shards : undef } + sub cleanup_task () { $cleanup_timer = undef; my $next = {}; for my $ibx (values %$CLEANUP) { - my $again; - if ($have_devel_peek) { - foreach my $f (qw(search)) { - # we bump refcnt by assigning tmp, here: - my $tmp = $ibx->{$f} or next; - next if Devel::Peek::SvREFCNT($tmp) > 2; - delete $ibx->{$f}; - # refcnt is zero when tmp is out-of-scope - } - } - git_cleanup($ibx); - if (my $gits = $ibx->{-repo_objs}) { - foreach my $git (@$gits) { - $again = 1 if $git->cleanup; - } + my $again = git_cleanup($ibx); + $ibx->cleanup_shards and $again = 1; + for my $git (@{$ibx->{-repo_objs}}) { + $again = 1 if $git->cleanup; } check_inodes($ibx); - if ($have_devel_peek) { - $again ||= !!$ibx->{search}; - } $next->{"$ibx"} = $ibx if $again; } $CLEANUP = $next; + $cleanup_timer //= PublicInbox::DS::later(\&cleanup_task); } -sub cleanup_possible () { +sub _cleanup_later ($) { # no need to require DS, here, if it were enabled another # module would've require'd it, already - eval { PublicInbox::DS::in_loop() } or return 0; - - eval { - require Devel::Peek; # needs separate package in Fedora - $have_devel_peek = 1; - }; - 1; -} - -sub _cleanup_later ($) { - my ($self) = @_; - $cleanup_avail = cleanup_possible() if $cleanup_avail < 0; - return if $cleanup_avail != 1; - $cleanup_timer //= PublicInbox::DS::later(\&cleanup_task); - $CLEANUP->{"$self"} = $self; + if (eval { PublicInbox::DS::in_loop() }) { + $cleanup_timer //= PublicInbox::DS::later(\&cleanup_task); + $CLEANUP->{"$_[0]"} = $_[0]; # $self + } } sub _set_limiter ($$$) { diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm index 1d1cd2f5..d285c11c 100644 --- a/lib/PublicInbox/Search.pm +++ b/lib/PublicInbox/Search.pm @@ -199,7 +199,7 @@ sub xdb_shards_flat ($) { my (@xdb, $slow_phrase); load_xapian(); $self->{qp_flags} //= $QP_FLAGS; - if ($xpfx =~ m/xapian${\SCHEMA_VERSION}\z/) { + if ($xpfx =~ m!/xapian[0-9]+\z!) { @xdb = ($X{Database}->new($xpfx)); $self->{qp_flags} |= FLAG_PHRASE() if !-f "$xpfx/iamchert"; } else { @@ -243,6 +243,20 @@ sub xdb ($) { }; } +# returns true if a future rescan is desired +sub cleanup_shards { + my ($self) = @_; + return unless exists($self->{xdb}); + my $xpfx = $self->{xpfx}; + return reopen($self) if $xpfx =~ m!/xapian[0-9]+\z!; # true + opendir(my $dh, $xpfx) or return warn("$xpfx gone: $!\n"); # true + my $nr = grep(/\A[0-9]+\z/, readdir($dh)) or + return warn("$xpfx has no shards\n"); # true + return reopen($self) if $nr == ($self->{nshard} // -1); + delete($self->{xdb}); + undef; +} + sub new { my ($class, $ibx) = @_; ref $ibx or die "BUG: expected PublicInbox::Inbox object: $ibx"; diff --git a/t/extsearch.t b/t/extsearch.t index ad4f2c6d..6c074022 100644 --- a/t/extsearch.t +++ b/t/extsearch.t @@ -436,12 +436,27 @@ for my $j (1, 3, 6) { SKIP: { my $d = "$home/extindex-j1"; + my $es = PublicInbox::ExtSearch->new($d); + ok(my $nresult0 = $es->mset('z:0..')->size, 'got results'); + ok(ref($es->{xdb}), '{xdb} created'); + my $nshards1 = $es->{nshard}; + is($nshards1, 1, 'correct shard count'); + my $xdb_str = "$es->{xdb}"; + ok($es->cleanup_shards, 'cleanup_shards noop'); + is("$es->{xdb}", $xdb_str, '{xdb} unchanged'); + my $o = { 2 => \(my $err = '') }; ok(run_script([qw(-xcpdb -R4), $d]), 'xcpdb R4'); my @dirs = glob("$d/ei*/?"); for my $i (0..3) { is(grep(m!/ei[0-9]+/$i\z!, @dirs), 1, "shard [$i] created"); } + is($es->cleanup_shards, undef, 'cleanup_shards cleaned'); + ok(!defined($es->{xdb}), 'old {xdb} gone'); + is($es->cleanup_shards, undef, 'cleanup_shards clean idempotent'); + is($es->mset('z:0..')->size, $nresult0, 'new shards, same results'); + ok($es->cleanup_shards, 'cleanup_shards true after open'); + for my $i (4..5) { is(grep(m!/ei[0-9]+/$i\z!, @dirs), 0, "no shard [$i]"); } ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-09-23 0:46 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-09-21 14:47 Holding on to deleted packfiles Konstantin Ryabitsev 2021-09-21 19:06 ` Eric Wong 2021-09-21 20:57 ` Konstantin Ryabitsev 2021-09-21 21:37 ` Eric Wong 2021-09-22 9:45 ` [PATCH] gcf2 + extsearch: check for unlinked files on Linux Eric Wong 2021-09-22 19:51 ` Eric Wong 2021-09-23 0:46 ` [PATCH 2/1] daemons: revamp periodic cleanup task Eric Wong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).