unofficial mirror of meta@public-inbox.org
From: "Robin H. Johnson" <robbat2@gentoo.org>
To: Eric Wong <e@80x24.org>
Cc: "Robin H. Johnson" <robbat2@gentoo.org>,
	meta@public-inbox.org, infra@gentoo.org
Subject: Re: public-inbox skipping new inboxes or many mails
Date: Thu, 18 Jul 2024 00:02:56 +0000
Message-ID: <robbat2-20240717T235444-452142610Z@orbis-terrarum.net>
In-Reply-To: <20240717232532.M125694@dcvr>

On Wed, Jul 17, 2024 at 11:25:32PM +0000, Eric Wong wrote:
> > Can I easily dump out every message-id at least? I can compare that
> > against the files, other than the old messages with no message-ids.
> 
> $ sqlite3 /path/to/msgmap.sqlite3 'SELECT mid FROM msgmap'
> 
> For v2, old messages without Message-IDs or recycled+conflicting
> Message-IDs will have Message-IDs synthesized
> (<YYYYmmddHHMMSS.$base64_digest@z>) as allowed by RFC 3977.
Thanks.
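
A minimal sketch of the comparison described above, in Python. It runs the same `SELECT mid FROM msgmap` query through the standard `sqlite3` module and checks each mail file's Message-ID against the result. The function names are hypothetical, and it assumes that msgmap stores Message-IDs without the surrounding angle brackets and that each file holds exactly one message; files with no Message-ID header are skipped, since v2 synthesizes an ID for those.

```python
import email
import sqlite3


def indexed_mids(msgmap_path):
    """Return every Message-ID recorded in a msgmap.sqlite3 file as a set."""
    conn = sqlite3.connect(msgmap_path)
    try:
        return {row[0] for row in conn.execute("SELECT mid FROM msgmap")}
    finally:
        conn.close()


def mid_of_file(path):
    """Return the Message-ID of one mail file, or None if it has none.

    Assumption: the index stores IDs without the enclosing <>, so they
    are stripped here before comparing.
    """
    with open(path, "rb") as fh:
        msg = email.message_from_binary_file(fh)
    raw = msg.get("Message-ID")
    if raw is None:
        return None
    return str(raw).strip().lstrip("<").rstrip(">")


def missing_from_index(paths, msgmap_path):
    """List (path, mid) pairs whose Message-ID is absent from msgmap."""
    known = indexed_mids(msgmap_path)
    missing = []
    for path in paths:
        mid = mid_of_file(path)
        if mid is not None and mid not in known:
            missing.append((path, mid))
    return missing
```

Calling `missing_from_index(list_of_mail_files, "/path/to/msgmap.sqlite3")` would then give the files that were apparently never indexed.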

> > I hacked in stderr: but bad luck, it doesn't dump anything useful before
> > it seems to vanish. Nothing in dmesg either, so a mundane crash.
> Not having anything in stderr on errors is really bad :x
> 
> Any fast_import_crash_* files in the [0-9]+\.git dirs?
No crash files either.

> -watch really shouldn't just vanish...  I'm not familiar with
> OpenRC, does/can it wait on processes so it can report exit codes?
Not by default.

> OK.  The kernel shouldn't be a problem for inotify, just the
> older XS versions lacked some things and the pure Perl version
> reduces mmap||vm.max_map_count pressure.  But I also noticed a
> bug where we were favoring the XS :x.
> 
> Fwiw, I've actually struggled a lot with HDDs w/ Xapian||SQLite
> but glad it's working out for you.  I'm mainly working ~15 year
> old systems with SSDs that replaced dead HDDs.  Still have
> numerous performance and memory optimizations planned :>
I came up with a good hack for now:
I split the config file by list, and I'm running 116 instances of
public-inbox-watch, each with its own config file (httpd uses the
giant config file). Taking a listname as an arg would have been
cleaner, but this is working for now.
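
The split itself can be sketched roughly as below. This is an illustration in Python, not the script actually used here: `split_config` is a hypothetical helper, and it assumes the config is plain git-config syntax in which each list has its own `[publicinbox "name"]` section. Every other section (and anything before the first section header) is treated as shared and copied into each per-list output.

```python
import re

# Matches a git-config section header such as [publicinbox "listname"]
# or a plain [section]; group 2 is the optional quoted subsection.
SECTION_RE = re.compile(
    r'^\s*\[\s*([A-Za-z0-9.-]+)(?:\s+"((?:[^"\\]|\\.)*)")?\s*\]'
)


def split_config(text):
    """Map each list name to a standalone config holding the shared
    sections followed by that list's own [publicinbox "name"] section."""
    shared = []      # lines belonging to every non-list section
    per_list = {}    # list name -> lines of its publicinbox section
    current = shared
    for line in text.splitlines(keepends=True):
        if not line.endswith("\n"):
            line += "\n"  # keep concatenation safe on the final line
        m = SECTION_RE.match(line)
        if m:
            # Section names are case-insensitive in git-config.
            if m.group(1).lower() == "publicinbox" and m.group(2) is not None:
                current = per_list.setdefault(m.group(2), [])
            else:
                current = shared
        current.append(line)
    return {
        name: "".join(shared) + "".join(body)
        for name, body in per_list.items()
    }
```

Each value can be written to its own file and one public-inbox-watch started per file; pointing each instance at its file through the `PI_CONFIG` environment variable is my assumption of how that would be wired up, so check public-inbox-config(5) before relying on it.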

Doing this also finally let it hit the IO limits of the HDDs, so
there is clearly a lot of low-hanging optimization fruit.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136