* [PATCH 0/2] www: minor mem reduction in message threading
@ 2024-06-10 11:34 Eric Wong
  2024-06-10 11:34 ` [PATCH 1/2] xt/perf-threading: modernize + remove Xapian dependency Eric Wong
  2024-06-10 11:34 ` [PATCH 2/2] www: deduplicate Message-ID in threading + skeleton Eric Wong
  0 siblings, 2 replies; 3+ messages in thread
From: Eric Wong @ 2024-06-10 11:34 UTC
  To: meta

For gigantic 700+ message threads, this ends up being several
megabytes of memory freed early and reusable for other requests.

Eric Wong (2):
  xt/perf-threading: modernize + remove Xapian dependency
  www: deduplicate Message-ID in threading + skeleton

 lib/PublicInbox/SearchThread.pm |  9 +++++++--
 lib/PublicInbox/View.pm         |  1 +
 xt/perf-threading.t             | 10 ++++------
 3 files changed, 12 insertions(+), 8 deletions(-)

* [PATCH 1/2] xt/perf-threading: modernize + remove Xapian dependency
From: Eric Wong @ 2024-06-10 11:34 UTC
  To: meta

Threading hasn't required Xapian (only SQLite) for a while
now; but I'm revisiting this test for another minor
optimization.
---
 xt/perf-threading.t | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/xt/perf-threading.t b/xt/perf-threading.t
index 57e9db9b..24c1a873 100644
--- a/xt/perf-threading.t
+++ b/xt/perf-threading.t
@@ -1,18 +1,16 @@
-# Copyright (C) 2016-2021 all contributors <meta@public-inbox.org>
+#!perl -w
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 #
 # real-world testing of search threading
-use strict;
-use warnings;
+use v5.12;
 use Test::More;
 use Benchmark qw(:all);
 use PublicInbox::Inbox;
 my $inboxdir = $ENV{GIANT_INBOX_DIR} // $ENV{GIANT_PI_DIR};
 plan skip_all => "GIANT_INBOX_DIR not defined for $0" unless $inboxdir;
 my $ibx = PublicInbox::Inbox->new({ inboxdir => $inboxdir });
-eval { require PublicInbox::Search };
-my $srch = $ibx->search;
-plan skip_all => "$inboxdir not configured for search $0 $@" unless $srch;
+$ibx->over or plan skip_all => "$inboxdir not indexed for $0 $@";
 
 require PublicInbox::View;
 

* [PATCH 2/2] www: deduplicate Message-ID in threading + skeleton
From: Eric Wong @ 2024-06-10 11:34 UTC
  To: meta

xt/perf-threading.t reports a small 0.5-1.0% memory reduction
in non-ancient Perls with CoW strings for threading alone
(w/o rendering the View.pm stuff).

On informal tests using -httpd and giant Linux stable patch set
threads (700+ messages), this ends up being roughly 5MB saved in
/T/ rendering since we use the {mid} field again in the
$ctx->{mapping} table.

This becomes even more beneficial if handling parallel HTTP
requests for messages in the same message thread, even across
different endpoints.
---
 lib/PublicInbox/SearchThread.pm | 9 +++++++--
 lib/PublicInbox/View.pm         | 1 +
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/SearchThread.pm b/lib/PublicInbox/SearchThread.pm
index 00ae9fac..672c53ad 100644
--- a/lib/PublicInbox/SearchThread.pm
+++ b/lib/PublicInbox/SearchThread.pm
@@ -33,19 +33,24 @@ sub thread {
 	# can be shakier if somebody used In-Reply-To with multiple, disparate
 	# messages. So, take the client Date: into account since we can't
 	# always determine ordering when somebody uses multiple In-Reply-To.
+	my (%dedupe, $mid);
 	my @kids = sort { $a->{ds} <=> $b->{ds} } grep {
 		# this delete saves around 4K across 1K messages
 		# TODO: move this to a more appropriate place, breaks tests
 		# if we do it during psgi_cull
 		delete $_->{num};
 		bless $_, 'PublicInbox::SearchThread::Msg';
-		if (exists $id_table{$_->{mid}}) {
+		$mid = $_->{mid};
+		if (exists $id_table{$mid}) {
 			$_->{children} = [];
 			push @imposters, $_; # we'll deal with them later
 			undef;
 		} else {
 			$_->{children} = {}; # will become arrayref later
-			$id_table{$_->{mid}} = $_;
+			%dedupe = ($mid => undef);
+			($mid) = keys %dedupe;
+			$_->{mid} = $mid;
+			$id_table{$mid} = $_;
 			defined($_->{references});
 		}
 	} @$msgs;
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 958efa41..dcceb311 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -432,6 +432,7 @@ sub walk_thread ($$$) {
 
 sub pre_thread { # walk_thread callback
 	my ($ctx, $level, $node, $idx) = @_;
+	# node->{mid} is deduplicated in PublicInbox::SearchThread::thread
 	$ctx->{mapping}->{$node->{mid}} = [ '', $node, $idx, $level ];
 	skel_dump($ctx, $level, $node);
 }

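
For readers unfamiliar with the idiom in PATCH 2/2, here is a minimal
standalone sketch of the hash-key round-trip, not the actual
public-inbox code. dedupe_mid(), @msgs and the example.com Message-IDs
are hypothetical names made up for illustration, and %dedupe is
declared per call here instead of once outside the grep block as the
patch does. My understanding is that Perl keeps hash keys in a shared
string table, so the scalar handed back by `keys' can share storage
with the %id_table key instead of holding a private copy; treat that
as an assumption about Perl internals, since this sketch only checks
that the values survive the round-trip and does not measure memory.

```perl
#!perl -w
# Sketch only: dedupe_mid() and @msgs are hypothetical, not part of
# PublicInbox::SearchThread.
use v5.12;

my %id_table; # Message-ID => message hashref, as in thread()

sub dedupe_mid {
	my ($msg) = @_;
	my $mid = $msg->{mid};
	return if exists $id_table{$mid}; # imposter: Message-ID already seen

	# Round-trip the string through a one-key hash.  The scalar
	# returned by keys() is assumed (not verified here) to be
	# backed by Perl's shared hash key storage.
	my %dedupe = ($mid => undef);
	($mid) = keys %dedupe;

	$msg->{mid} = $mid;     # field value now comes from the key
	$id_table{$mid} = $msg; # same string used as a hash key again
	1;
}

my @msgs;
push @msgs, { mid => "$_\@example.com" } for qw(a b b c);

# keep only the first message seen for each Message-ID
my @kept = grep { dedupe_mid($_) } @msgs;
say scalar(@kept), ' kept of ', scalar(@msgs), ' messages';
```

The string contents never change, so callers that read {mid} later
(for example when filling a mapping table keyed by Message-ID) see the
same value as before; any saving would come purely from how Perl
stores the string.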