unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
From: Eric Wong <e@80x24.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: meta@public-inbox.org
Subject: Re: how's memory usage on public-inbox-httpd?
Date: Sun, 9 Jun 2019 08:39:18 +0000	[thread overview]
Message-ID: <20190609083918.gfr2kurah7f2hysx@dcvr> (raw)
In-Reply-To: <20190606221009.y4fe2e2rervvq3z4@dcvr>

Eric Wong <e@80x24.org> wrote:
> Without concurrent connections; I can't see that happening
> unless there's a single message which is gigabytes in size.  I'm
> already irked that Email::MIME requires slurping entire emails
> into memory; but it should not be using more than one
> Email::MIME object in memory-at-a-time for a single client.

Giant multipart messages do a lot of damage.  Maybe concurrent
clients hitting the same endpoints will do more damage.

Largest I see in LKML is 7204747 bytes (which is frightening).
That bloats to 21626795 bytes when parsed by Email::MIME.

I thought it was bad enough that all Perl mail modules seem to
require slurping 7M into memory...

-------8<--------
use strict; use warnings;
require Email::MIME;
use bytes ();
use Devel::Size qw(total_size);
my $in = do { local $/; <STDIN> };
print 'string: ', total_size($in), ' actual: ', bytes::length($in), "\n";
print 'MIME: ', total_size(Email::MIME->new(\$in)), "\n";
-------8<--------

That shows (on amd64):

	string: 7204819 actual: 7204747
	MIME: 21626795

Maybe you have bigger messages outside of LKML.
This prints all objects >1MB in a git dir:

	git cat-file --buffer --batch-check --batch-all-objects \
		--unordered | awk '$3 > 1048576 { print }'

And I also remember you're supporting non-vger lists where HTML
mail is allowed, so that can't be good for memory use at all :<

Streaming MIME handling has been on the TODO for a while,
at least...

|* streaming Email::MIME replacement: currently we generate many
|  allocations/strings for headers we never look at and slurp
|  entire message bodies into memory.
|  (this is pie-in-the-sky territory...)


  parent reply	other threads:[~2019-06-09  8:39 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-01 19:44 how's memory usage on public-inbox-httpd? Eric Wong
2019-06-06 19:04 ` Konstantin Ryabitsev
2019-06-06 20:37   ` Eric Wong
2019-06-06 21:45     ` Konstantin Ryabitsev
2019-06-06 22:10       ` Eric Wong
2019-06-06 22:19         ` Konstantin Ryabitsev
2019-06-06 22:29           ` Eric Wong
2019-06-10 10:09             ` [RFC] optionally support glibc malloc_info via SIGCONT Eric Wong
2019-06-09  8:39         ` Eric Wong [this message]
2019-06-12 17:08           ` how's memory usage on public-inbox-httpd? Eric Wong
2019-06-06 20:54   ` Eric Wong
2019-10-16 22:10   ` Eric Wong
2019-10-18 19:23     ` Konstantin Ryabitsev
2019-10-19  0:11       ` Eric Wong
2019-10-22 17:28         ` Konstantin Ryabitsev
2019-10-22 19:11           ` Eric Wong
2019-10-28 23:24         ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190609083918.gfr2kurah7f2hysx@dcvr \
    --to=e@80x24.org \
    --cc=konstantin@linuxfoundation.org \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).