unofficial mirror of meta@public-inbox.org
* public-inbox skipping new inboxes or many mails
@ 2024-07-15  6:15 Robin H. Johnson
  2024-07-15 21:03 ` Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Robin H. Johnson @ 2024-07-15  6:15 UTC (permalink / raw)
  To: meta; +Cc: infra

Hi,

After some long delays, we're trying to roll out public-inbox for
Gentoo's mailing lists.

This is the latest HEAD at 18f41f5af397f903898154591de2cd1cd514c920 2024/07/07,
plus the AltID patch you sent before.

It has mostly been smooth so far, but we have run into some weirdness:
for many inboxes it seems not to read any files, and for other inboxes
it has recent mail but refuses to reindex any older mail.

Even many -vvvv options give no clue why it seems to skip entire folders.

Here's one of the lists where it previously indexed exactly one file - a
very recent one - and ignored everything else. When working on a
reproduction case for you, it went down to not indexing ANY files.

The strace is really interesting in that it seems to not even open or stat
anything in the /var/archives path.

The most frustrating variant of the output is this:
$ public-inbox-index -vvvvv --reindex \
  /var/public-inbox/eudev.lists.gentoo.org.git
# indexing /var/public-inbox/eudev.lists.gentoo.org.git ...

(Nothing about why it seemed to not scan the maildirs at all).

For gentoo-releng-autobuilds.lists.gentoo.org.git, it indexed a single file and no more.

Deleting & recreating
/var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git made it
go down from 1 file to not indexing any files.

$ export PI_CONFIG=/etc/public-inbox/config

$ public-inbox-init --indexlevel full \
  --version 2 --jobs 2 \
  gentoo-releng-autobuilds \
  /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git \
  https://public-inbox.gentoo.org/gentoo-releng-autobuilds \
  gentoo-releng-autobuilds@lists.gentoo.org

$ grep gentoo-releng-autobuilds /etc/public-inbox/config
[publicinbox "gentoo-releng-autobuilds"]
address = gentoo-releng-autobuilds@lists.gentoo.org
url = https://public-inbox.gentoo.org/gentoo-releng-autobuilds
inboxdir = /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git
altid = indexfilter:xarchiveshash:package=XArchivesHash
watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds
watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds/.201101
watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds/.201102
...
watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds/.202406
watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds/.202407

$ public-inbox-index -vvvvv  --reindex /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git 
# indexing /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git ...
# 0.git indexing all of b0ecbb6f63ab5505707fbba7079980c9f7fc6e51
# gentoo-releng-autobuilds.lists.gentoo.org.git 0.git counting b0ecbb6f63ab5505707fbba7079980c9f7fc6e51 ... # 1
# all.git  1/1

$ find /var/archives/.maildir/.gentoo-releng-autobuilds/ -type f -printf '%h\n' |sort | uniq -c 
 14 /var/archives/.maildir/.gentoo-releng-autobuilds/.201101/cur
 34 /var/archives/.maildir/.gentoo-releng-autobuilds/.201102/cur
...
113 /var/archives/.maildir/.gentoo-releng-autobuilds/.202406/new
 48 /var/archives/.maildir/.gentoo-releng-autobuilds/.202407/new
 39 /var/archives/.maildir/.gentoo-releng-autobuilds/new

$ find /var/archives/.maildir/.gentoo-releng-autobuilds/ -type f |wc -l
14146

$ sqlite3 /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git/msgmap.sqlite3
SQLite version 3.45.3 2024-04-15 13:34:05
Enter ".help" for usage hints.
sqlite> .tables
meta    msgmap
sqlite> select * from meta;
created_at|1721012200
num_highwater|1
last_xap15-0|b0ecbb6f63ab5505707fbba7079980c9f7fc6e51
sqlite> select * from msgmap;
1|20240715052316.61817748FCA@milou.amd64.dev.gentoo.org

$ strace -s 65535 -ff \
  public-inbox-index -vvvvv --reindex /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git \
  2>&1 |grep -e /var/archives -e /etc/public-inbox \
 |grep -v -e ' read(' -e ' write(' -e 'read resumed' 

newfstatat(AT_FDCWD, "/var/archives/.cache/public-inbox/inline-c", 0x7fdb0bae2840, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/etc/public-inbox/config", {st_mode=S_IFREG|0644, st_size=464073, ...}, 0) = 0
newfstatat(AT_FDCWD, "/etc/public-inbox/config", {st_mode=S_IFREG|0644, st_size=464073, ...}, 0) = 0
[pid 226525] execve("/usr/bin/git", ["/usr/bin/git", "config", "-z", "-l", "--includes", "-f", "/etc/public-inbox/config"], 0x562cd071a960 /* 32 vars */ <unfinished ...>
[pid 226525] access("/var/archives/.config/git/config", R_OK) = -1 ENOENT (No such file or directory)
[pid 226525] access("/var/archives/.gitconfig", R_OK) = -1 ENOENT (No such file or directory)
[pid 226525] access("/var/archives/.config/git/config", R_OK) = -1 ENOENT (No such file or directory)
[pid 226525] access("/var/archives/.gitconfig", R_OK) = -1 ENOENT (No such file or directory)
[pid 226525] openat(AT_FDCWD, "/etc/public-inbox/config", O_RDONLY) = 3

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: public-inbox skipping new inboxes or many mails
  2024-07-15  6:15 public-inbox skipping new inboxes or many mails Robin H. Johnson
@ 2024-07-15 21:03 ` Eric Wong
  2024-07-15 21:45   ` Robin H. Johnson
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wong @ 2024-07-15 21:03 UTC (permalink / raw)
  To: meta, infra

"Robin H. Johnson" <robbat2@gentoo.org> wrote:
> Hi,
> 
> After some long delays, we're trying to roll out public-inbox for
> Gentoo's mailing lists.
> 
> This is the latest HEAD at 18f41f5af397f903898154591de2cd1cd514c920 2024/07/07,
> plus the AltID patch you sent before.
> 
> It has mostly been smooth so far, but we have run into some weirdness:
> for many inboxes it seems not to read any files, and for other inboxes
> it has recent mail but refuses to reindex any older mail.
> 
> Even many -vvvv options give no clue why it seems to skip entire folders.

/me notes `-v' isn't an option for public-inbox-watch...

> Here's one of the lists where it previously indexed exactly one file - a
> very recent one - and ignored everything else. When working on a
> reproduction case for you, it went down to not indexing ANY files.
> 
> The strace is really interesting in that it seems to not even open or stat
> anything in the /var/archives path.

Yeah, I've mainly used strace or similar tools for diagnostics
to avoid having to maintain code for tracing.

> The most frustrating variant of the output is this:
> $ public-inbox-index -vvvvv --reindex \
>   /var/public-inbox/eudev.lists.gentoo.org.git
> # indexing /var/public-inbox/eudev.lists.gentoo.org.git ...
> 
> (Nothing about why it seemed to not scan the maildirs at all).

public-inbox-index doesn't touch Maildirs (or mbox, MH, etc) at all.
-index only exists to handle mail already in git repos; that is,
-index is intended for freshly cloned inboxes, adding search to
old v1 inboxes, and/or changing indexlevel after init.

Currently, public-inbox-watch is the only public-inbox-* tool which
works directly with Maildirs.

> For gentoo-releng-autobuilds.lists.gentoo.org.git, it indexed a single file and no more.

Hmm... (more below)

> Deleting & recreating
> /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git made it
> go down from 1 file to not indexing any files.
> 
> $ export PI_CONFIG=/etc/public-inbox/config
> 
> $ public-inbox-init --indexlevel full \
>   --version 2 --jobs 2 \
>   gentoo-releng-autobuilds \
>   /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git \
>   https://public-inbox.gentoo.org/gentoo-releng-autobuilds \
>   gentoo-releng-autobuilds@lists.gentoo.org

sidenote: `.git' suffix is a bit confusing for v2 inboxes;
only v1 used a single bare git repo

> $ grep gentoo-releng-autobuilds /etc/public-inbox/config
> [publicinbox "gentoo-releng-autobuilds"]
> address = gentoo-releng-autobuilds@lists.gentoo.org
> url = https://public-inbox.gentoo.org/gentoo-releng-autobuilds
> inboxdir = /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git
> altid = indexfilter:xarchiveshash:package=XArchivesHash
> watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds
> watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds/.201101
> watch = maildir:/var/archives/.maildir/.gentoo-releng-autobuilds/.201102
> ...

Those watch= directives are intended for public-inbox-watch.

I'm curious how you got a single message indexed, however...
is that from public-inbox-mda?

Fwiw, I started working on a public-inbox-(import/ctl) tool to
quickly import a bunch of messages a while back but got
sidetracked.  Been busy dealing with personal problems much of
this year :<

But public-inbox-watch works reasonably well for large imports
even if the git history ordering gets a bit wonky from readdir.
SIGHUP/SIGUSR1 + strace are useful for reloading and tracing
configuration problems with the -watch daemon.
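
A minimal sketch of that workflow, for readers following along. It
assumes the signal meanings from the public-inbox-watch manpage as I
understand them (SIGHUP reloads the config, SIGUSR1 rescans the watched
Maildirs); `signal_watch' is a hypothetical helper name, not part of
public-inbox.

```shell
# Send a reload or rescan signal to a running -watch daemon.
# $1 = signal name (HUP or USR1), $2 = PID of public-inbox-watch
signal_watch() {
    case "$1" in
        HUP|USR1) kill -s "$1" "$2" ;;
        *) echo "unsupported signal: $1" >&2; return 2 ;;
    esac
}

# Example session (not run here); the pidof/strace forms are the ones
# used elsewhere in this thread:
#   pid=$(pidof /usr/bin/public-inbox-watch)
#   strace -p "$pid" -ff &        # terminal 1: watch for file activity
#   signal_watch HUP "$pid"       # reload config; "# reloaded" on stderr
#   signal_watch USR1 "$pid"      # ask for a rescan of watched Maildirs
```

Restricting the helper to HUP and USR1 is only a guard against typing
a signal that would terminate the daemon.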

* Re: public-inbox skipping new inboxes or many mails
  2024-07-15 21:03 ` Eric Wong
@ 2024-07-15 21:45   ` Robin H. Johnson
  2024-07-15 23:58     ` Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Robin H. Johnson @ 2024-07-15 21:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta, infra

TL;DR: kill -USR1 seems to have triggered the import now, whereas even a
restart didn't before.

On Mon, Jul 15, 2024 at 09:03:40PM +0000, Eric Wong wrote:
> > (Nothing about why it seemed to not scan the maildirs at all).
> public-inbox-index doesn't touch Maildirs (or mbox, MH, etc) at all.
> -index only exists to handle mail already in git repos; that is,
> -index is intended for freshly cloned inboxes, adding search to
> old v1 inboxes, and/or changing indexlevel after init.
> 
> Currently, public-inbox-watch is the only public-inbox-* tool which
> works directly with Maildirs.
Hmm, I had excluded public-inbox-watch initially because it didn't seem
to be doing anything after the very long startup. I'm thinking that the
inotify is not working as expected, maybe relating to the huge number of
folders we watch.

Terminal 1:
# strace -p $(pidof /usr/bin/public-inbox-watch)  -ff 
strace: Process 93260 attached
pselect6(8, [3 4], NULL, NULL, NULL, NULL
(nothing more)

Terminal 2:
$ find /var/archives/.maildir/.gentoo* -maxdepth 2 -path '/var/archives/.maildir/.gentoo-*' -path '*202407/new' -mtime -1 |sed 's,/new,,g' >/tmp/list
...
/var/archives/.maildir/.gentoo-binhost-autobuilds/.202407/new
/var/archives/.maildir/.gentoo-dev/.202407/new
/var/archives/.maildir/.gentoo-dev-announce/.202407/new
/var/archives/.maildir/.gentoo-infrastructure/.202407/new
/var/archives/.maildir/.gentoo-kernel/.202407/new
...

$ fgrep -f /tmp/list /etc/public-inbox/config
...
watch = maildir:/var/archives/.maildir/.gentoo-binhost-autobuilds/.202407
watch = maildir:/var/archives/.maildir/.gentoo-dev/.202407
watch = maildir:/var/archives/.maildir/.gentoo-dev-announce/.202407
...

# Touch the mail so it SHOULD trigger inotify
$ cat /tmp/list |xargs -I^ find ^ -type f -mtime -1 |grep -v -e gentoo-commits |xargs touch



> > $ public-inbox-init --indexlevel full \
> >   --version 2 --jobs 2 \
> >   gentoo-releng-autobuilds \
> >   /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git \
> >   https://public-inbox.gentoo.org/gentoo-releng-autobuilds \
> >   gentoo-releng-autobuilds@lists.gentoo.org
> sidenote: `.git' suffix is a bit confusing for v2 inboxes;
> only v1 used a single bare git repo
I'll update our internal docs & tooling to drop it - it was a carryover.

...
> I'm curious how you got a single message indexed, however...
> is that from public-inbox-mda?
I think that message arrived and triggered public-inbox-watch but others
didn't.

> Fwiw, I started working on a public-inbox-(import/ctl) tool to
> quickly import a bunch of messages a while back but got
> sidetracked.  Been busy dealing with personal problems much of
> this year :<
> 
> But public-inbox-watch works reasonably well for large imports
> even if the git history ordering gets a bit wonky from readdir.
> SIGHUP/SIGUSR1 + strace are useful for reloading and tracing
> configuration problems with the -watch daemon.
kill -USR1 seems to have tricked it into adding files now...

But why didn't it add files any other way? Weird.

Anyway, that public-inbox-(import/ctl) sounds like it might be better
for other folders, where we don't expect new mail to be added outside of
the archival cases previously mentioned.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

* Re: public-inbox skipping new inboxes or many mails
  2024-07-15 21:45   ` Robin H. Johnson
@ 2024-07-15 23:58     ` Eric Wong
  2024-07-16  5:45       ` Robin H. Johnson
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wong @ 2024-07-15 23:58 UTC (permalink / raw)
  To: Robin H. Johnson; +Cc: meta, infra

"Robin H. Johnson" <robbat2@gentoo.org> wrote:
> TL;DR: kill -USR1 seems to have triggered the import now, whereas even a
> restart didn't before.

OK, good to know.

> On Mon, Jul 15, 2024 at 09:03:40PM +0000, Eric Wong wrote:
> > Currently, public-inbox-watch is the only public-inbox-* tool which
> > works directly with Maildirs.
> Hmm, I had excluded public-inbox-watch initially because it didn't seem
> to be doing anything after the very long startup. I'm thinking that the
> inotify is not working as expected, maybe relating to the huge number of
> folders we watch.

-watch is (or should be) doing a full scan every startup, but it
switches between inboxes every few messages and tries to
prioritize new messages from inotify.   Curious to see the
strace immediately after startup to see if it's indeed doing the
full scan.  I should probably add a stderr diagnostic for full
scan completion...

How many Maildirs are you watching?  I wonder if it's hitting
RLIMIT_NOFILE... (errors should be logged to stderr).

In retrospect, it'd probably be better to go one Maildir
at a time to reduce open FDs...  I'll work on that.

> Terminal 1:
> # strace -p $(pidof /usr/bin/public-inbox-watch)  -ff 
> strace: Process 93260 attached
> pselect6(8, [3 4], NULL, NULL, NULL, NULL
> (nothing more)

Curious, which architecture is that and is it using
Linux::Inotify2 or inotify via the `syscall' perlop?
(I expect 3 is the inotify FD).

You may also be running into fs.inotify.max_user_watches or
max_queued_events sysctl limits; each Maildir costs 2 inotify
watches; and it looks like I forgot to handle IN_Q_OVERFLOW
properly :x

* Re: public-inbox skipping new inboxes or many mails
  2024-07-15 23:58     ` Eric Wong
@ 2024-07-16  5:45       ` Robin H. Johnson
  2024-07-16 19:05         ` Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Robin H. Johnson @ 2024-07-16  5:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: Robin H. Johnson, meta, infra

On Mon, Jul 15, 2024 at 11:58:08PM +0000, Eric Wong wrote:
> > On Mon, Jul 15, 2024 at 09:03:40PM +0000, Eric Wong wrote:
> > > Currently, public-inbox-watch is the only public-inbox-* tool which
> > > works directly with Maildirs.
> > Hmm, I had excluded public-inbox-watch initially because it didn't seem
> > to be doing anything after the very long startup. I'm thinking that the
> > inotify is not working as expected, maybe relating to the huge number of
> > folders we watch.
> 
> -watch is (or should be) doing a full scan every startup, but it
> switches between inboxes every few messages and tries to
> prioritize new messages from inotify.   Curious to see the
> strace immediately after startup to see if it's indeed doing the
> full scan.  I should probably add a stderr diagnostic for full
> scan completion...
It's definitely very busy after scan, but I can't tell if it's the full
set.

At an admin level, is there a way to dump out all of the paths it has
indexed, to compare against the paths on disk?

> How many Maildirs are you watching?  I wonder if it's hitting
> RLIMIT_NOFILE... (errors should be logged to stderr).
6774 Maildirs right now.
I should probably improve the OpenRC script for it, I think we're
throwing away stderr right now for -watch.

RLIMIT_NOFILE 1M
sysctl fs.file-max 16M

> > Terminal 1:
> > # strace -p $(pidof /usr/bin/public-inbox-watch)  -ff 
> > strace: Process 93260 attached
> > pselect6(8, [3 4], NULL, NULL, NULL, NULL
> > (nothing more)
> Curious, which architecture is that and is it using
> Linux::Inotify2 or inotify via the `syscall' perlop?
> (I expect 3 is the inotify FD).
x86-64, dev-perl/Linux-Inotify2 is installed on the host, but I can't
tell at a glance if -watch used perlop or package.

fd 3 is inotify & fd 4 is signalfd.

> 
> You may also be running into fs.inotify.max_user_watches or
> max_queued_events sysctl limits; each Maildir costs 2 inotify
> watches; and it looks like I forgot to handle IN_Q_OVERFLOW
> properly :x
Already set quite high:
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 1024
fs.inotify.max_user_watches = 65536
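
As a quick sanity check of those numbers, here is the arithmetic as a
small sketch. It assumes only what was said in this thread: 2 inotify
watches per Maildir, 6774 Maildirs, and max_user_watches = 65536. The
function names are made up for illustration.

```shell
# Estimate the inotify watches -watch needs: 2 per Maildir.
# $1 = number of watched Maildirs
watches_needed() {
    echo $(( $1 * 2 ))
}

# Succeed if the estimate fits under the given limit.
# $1 = number of Maildirs, $2 = fs.inotify.max_user_watches
watches_fit() {
    [ "$(watches_needed "$1")" -le "$2" ]
}

watches_needed 6774          # prints 13548
watches_fit 6774 65536 && echo "within max_user_watches"
```

So the watch count alone (13548 of 65536) should not be the limiting
factor here; max_queued_events overflow is a separate question that
this does not cover.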

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

* Re: public-inbox skipping new inboxes or many mails
  2024-07-16  5:45       ` Robin H. Johnson
@ 2024-07-16 19:05         ` Eric Wong
  2024-07-17  3:04           ` Robin H. Johnson
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Wong @ 2024-07-16 19:05 UTC (permalink / raw)
  To: Robin H. Johnson; +Cc: meta, infra

"Robin H. Johnson" <robbat2@gentoo.org> wrote:
> On Mon, Jul 15, 2024 at 11:58:08PM +0000, Eric Wong wrote:
> > > On Mon, Jul 15, 2024 at 09:03:40PM +0000, Eric Wong wrote:
> > > > Currently, public-inbox-watch is the only public-inbox-* tool which
> > > > works directly with Maildirs.
> > > Hmm, I had excluded public-inbox-watch initially because it didn't seem
> > > to be doing anything after the very long startup. I'm thinking that the
> > > inotify is not working as expected, maybe relating to the huge number of
> > > folders we watch.
> > 
> > -watch is (or should be) doing a full scan every startup, but it
> > switches between inboxes every few messages and tries to
> > prioritize new messages from inotify.   Curious to see the
> > strace immediately after startup to see if it's indeed doing the
> > full scan.  I should probably add a stderr diagnostic for full
> > scan completion...
> It's definitely very busy after scan, but I can't tell if it's the full
> set.

OK, I think adding stderr diagnostic messages for full scans
shouldn't be too noisy.

> At an admin level, is there a way to dump out all of the paths it has
> indexed, to compare against the paths on disk?

No, path information isn't stored for public-facing inboxes
since it's too unstable.  It should be possible to reverse map
things at real-time and add better diagnostic tools, but the
philosophy has always been to store||index as little as possible
and be able to infer/regenerate needed data on-the-fly to avoid
data consistency problems.

(lei stores path info, but it's been a problematic
implementation, too :<)

> > How many Maildirs are you watching?  I wonder if it's hitting
> > RLIMIT_NOFILE... (errors should be logged to stderr).
> 6774 Maildirs right now.
> I should probably improve the OpenRC script for it, I think we're
> throwing away stderr right now for -watch.

Yeah, watch stderr is important for diagnosing problems.

Fwiw, I run it inside a screen(1) session on one system,
and rely on systemd to redirect stderr to syslog on another

<snip>
OK, various limits seem fine.

> > Curious, which architecture is that and is it using
> > Linux::Inotify2 or inotify via the `syscall' perlop?
> > (I expect 3 is the inotify FD).
> x86-64, dev-perl/Linux-Inotify2 is installed on the host, but I can't
> tell at a glance if -watch used perlop or package.

Probably, yes; but it can/should favor the pure Perl version
soon.  Since it's Gentoo I trust it's up-to-date with broadcast
and overflow support?

* Re: public-inbox skipping new inboxes or many mails
  2024-07-16 19:05         ` Eric Wong
@ 2024-07-17  3:04           ` Robin H. Johnson
  2024-07-17 23:25             ` Eric Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Robin H. Johnson @ 2024-07-17  3:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: Robin H. Johnson, meta, infra

On Tue, Jul 16, 2024 at 07:05:50PM +0000, Eric Wong wrote:
> > It's definitely very busy after scan, but I can't tell if it's the full
> > set.
> OK, I think adding stderr diagnostic messages for full scans
> shouldn't be too noisy.
Thanks. I think it will be needed...

> > At an admin level, is there a way to dump out all of the paths it has
> > indexed, to compare against the paths on disk?
> No, path information isn't stored for public-facing inboxes
> since it's too unstable.  It should be possible to reverse map
> things at real-time and add better diagnostic tools, but the
> philosophy has always been to store||index as little as possible
> and be able to infer/regenerate needed data on-the-fly to avoid
> data consistency problems.
Can I easily dump out every message-id at least? I can compare that
against the files, other than the old messages with no message-ids.

> > > How many Maildirs are you watching?  I wonder if it's hitting
> > > RLIMIT_NOFILE... (errors should be logged to stderr).
> > 6774 Maildirs right now.
> > I should probably improve the OpenRC script for it, I think we're
> > throwing away stderr right now for -watch.
> 
> Yeah, watch stderr is important for diagnosing problems.
> 
> Fwiw, I run it inside a screen(1) session on one system,
> and rely on systemd to redirect stderr to syslog on another
I hacked in stderr: but bad luck, it doesn't dump anything useful before
it seems to vanish. Nothing in dmesg either, so a mundane crash.

> > > Curious, which architecture is that and is it using
> > > Linux::Inotify2 or inotify via the `syscall' perlop?
> > > (I expect 3 is the inotify FD).
> > x86-64, dev-perl/Linux-Inotify2 is installed on the host, but I can't
> > tell at a glance if -watch used perlop or package.
> 
> Probably, yes; but it can/should favor the pure Perl version
> soon.  Since it's Gentoo I trust it's up-to-date with broadcast
> and overflow support?
dev-perl/Linux-Inotify2 will be up to date. No guarantees that the
kernel is up to date - some legacy boxes at sponsors are pretty crufty
and have been unsafe to reboot to new kernels when we lack any OOB
management access: to that end, public-inbox's responsiveness is amazing
even running on a 10+ year old RAID1 HDD spinner setup.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

* Re: public-inbox skipping new inboxes or many mails
  2024-07-17  3:04           ` Robin H. Johnson
@ 2024-07-17 23:25             ` Eric Wong
  2024-07-17 23:50               ` Eric Wong
  2024-07-18  0:02               ` Robin H. Johnson
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Wong @ 2024-07-17 23:25 UTC (permalink / raw)
  To: Robin H. Johnson; +Cc: meta, infra

"Robin H. Johnson" <robbat2@gentoo.org> wrote:
> On Tue, Jul 16, 2024 at 07:05:50PM +0000, Eric Wong wrote:
> > > It's definitely very busy after scan, but I can't tell if it's the full
> > > set.
> > OK, I think adding stderr diagnostic messages for full scans
> > shouldn't be too noisy.
> Thanks. I think it will be needed...

OK, will add...

> > > At an admin level, is there a way to dump out all of the paths it has
> > > indexed, to compare against the paths on disk?
> > No, path information isn't stored for public-facing inboxes
> > since it's too unstable.  It should be possible to reverse map
> > things at real-time and add better diagnostic tools, but the
> > philosophy has always been to store||index as little as possible
> > and be able to infer/regenerate needed data on-the-fly to avoid
> > data consistency problems.
> Can I easily dump out every message-id at least? I can compare that
> against the files, other than the old messages with no message-ids.

$ sqlite3 /path/to/msgmap.sqlite3 'SELECT mid FROM msgmap'

For v2, old messages without Message-IDs or recycled+conflicting
Message-IDs will have Message-IDs synthesized
(<YYYYmmddHHMMSS.$base64_digest@z>) as allowed by RFC 3977.
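
To turn that dump into a comparison against the Maildirs, something
like the following sketch might do. The helper names are hypothetical,
it does not handle folded Message-ID headers, and it relies on msgmap
storing Message-IDs without angle brackets (as in the msgmap output
earlier in this thread). Synthesized Message-IDs will only ever appear
on the index side.

```shell
# Print the Message-ID of every mail under a Maildir tree, one per line,
# with the surrounding angle brackets stripped. Only headers are read.
maildir_mids() {
    find "$1" -type f \( -path '*/new/*' -o -path '*/cur/*' \) -exec awk '
        FNR == 1 { done = 0 }          # new file: start scanning headers
        done { next }
        /^$/ { done = 1; next }        # blank line ends the header block
        tolower($0) ~ /^message-id:/ {
            mid = $0
            sub(/^[^:]*:[ \t]*</, "", mid)
            sub(/>.*$/, "", mid)
            print mid
            done = 1
        }
    ' {} +
}

# Print Message-IDs that are on disk ($2) but not in the index ($1).
# Both arguments are files with one Message-ID per line.
missing_mids() {
    tmp=$(mktemp -d) || return 1
    sort -u "$1" > "$tmp/indexed"
    sort -u "$2" > "$tmp/ondisk"
    comm -13 "$tmp/indexed" "$tmp/ondisk"
    rm -rf "$tmp"
}

# Example (not run here):
#   sqlite3 /path/to/msgmap.sqlite3 'SELECT mid FROM msgmap' > indexed.txt
#   maildir_mids /var/archives/.maildir/.gentoo-releng-autobuilds > ondisk.txt
#   missing_mids indexed.txt ondisk.txt
```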

> > > > How many Maildirs are you watching?  I wonder if it's hitting
> > > > RLIMIT_NOFILE... (errors should be logged to stderr).
> > > 6774 Maildirs right now.
> > > I should probably improve the OpenRC script for it, I think we're
> > > throwing away stderr right now for -watch.
> > 
> > Yeah, watch stderr is important for diagnosing problems.
> > 
> > Fwiw, I run it inside a screen(1) session on one system,
> > and rely on systemd to redirect stderr to syslog on another
> I hacked in stderr: but bad luck, it doesn't dump anything useful before
> it seems to vanish. Nothing in dmesg either, so a mundane crash.

Not having anything in stderr on errors is really bad :x

Any fast_import_crash_* files in the [0-9]+\.git dirs?

-watch really shouldn't just vanish...  I'm not familiar with
OpenRC, does/can it wait on processes so it can report exit codes?

> > > > Curious, which architecture is that and is it using
> > > > Linux::Inotify2 or inotify via the `syscall' perlop?
> > > > (I expect 3 is the inotify FD).
> > > x86-64, dev-perl/Linux-Inotify2 is installed on the host, but I can't
> > > tell at a glance if -watch used perlop or package.
> > 
> > Probably, yes; but it can/should favor the pure Perl version
> > soon.  Since it's Gentoo I trust it's up-to-date with broadcast
> > and overflow support?
> dev-perl/Linux-Inotify2 will be up to date. No guarantees that the
> kernel is up to date - some legacy boxes at sponsors are pretty crufty
> and have been unsafe to reboot to new kernels when we lack any OOB
> management access: to that end, public-inbox's responsiveness is amazing
> even running on a 10+ year old RAID1 HDD spinner setup.

OK.  The kernel shouldn't be a problem for inotify, just the
older XS versions lacked some things and the pure Perl version
reduces mmap||vm.max_map_count pressure.  But I also noticed a
bug where we were favoring the XS :x.

Fwiw, I've actually struggled a lot with HDDs w/ Xapian||SQLite
but glad it's working out for you.  I'm mainly working on ~15 year
old systems with SSDs that replaced dead HDDs.  Still have
numerous performance and memory optimizations planned :>

* Re: public-inbox skipping new inboxes or many mails
  2024-07-17 23:25             ` Eric Wong
@ 2024-07-17 23:50               ` Eric Wong
  2024-07-18  0:02               ` Robin H. Johnson
  1 sibling, 0 replies; 10+ messages in thread
From: Eric Wong @ 2024-07-17 23:50 UTC (permalink / raw)
  To: Robin H. Johnson; +Cc: meta, infra

> "Robin H. Johnson" <robbat2@gentoo.org> wrote:
> > I hacked in stderr: but bad luck, it doesn't dump anything useful before
> > it seems to vanish. Nothing in dmesg either, so a mundane crash.

Actually, did it say "# scanning" anywhere at startup?
There's already a diagnostic message that's been there a while
(but not scan completion)

Similarly, SIGHUP should already emit "# reloaded" to stderr.

* Re: public-inbox skipping new inboxes or many mails
  2024-07-17 23:25             ` Eric Wong
  2024-07-17 23:50               ` Eric Wong
@ 2024-07-18  0:02               ` Robin H. Johnson
  1 sibling, 0 replies; 10+ messages in thread
From: Robin H. Johnson @ 2024-07-18  0:02 UTC (permalink / raw)
  To: Eric Wong; +Cc: Robin H. Johnson, meta, infra

On Wed, Jul 17, 2024 at 11:25:32PM +0000, Eric Wong wrote:
> > Can I easily dump out every message-id at least? I can compare that
> > against the files, other than the old messages with no message-ids.
> 
> $ sqlite3 /path/to/msgmap.sqlite3 'SELECT mid FROM msgmap'
> 
> For v2, old messages without Message-IDs or recycled+conflicting
> Message-IDs will have Message-IDs synthesized
> (<YYYYmmddHHMMSS.$base64_digest@z>) as allowed by RFC 3977.
Thanks.

> > I hacked in stderr: but bad luck, it doesn't dump anything useful before
> > it seems to vanish. Nothing in dmesg either, so a mundane crash.
> Not having anything in stderr on errors is really bad :x
> 
> Any fast_import_crash_* files in the [0-9]+\.git dirs?
No crash files either.

> -watch really shouldn't just vanish...  I'm not familiar with
> OpenRC, does/can it wait on processes so it can report exit codes?
Not by default.

> OK.  The kernel shouldn't be a problem for inotify, just the
> older XS versions lacked some things and the pure Perl version
> reduces mmap||vm.max_map_count pressure.  But I also noticed a
> bug where we were favoring the XS :x.
> 
> Fwiw, I've actually struggled a lot with HDDs w/ Xapian||SQLite
> but glad it's working out for you.  I'm mainly working on ~15 year
> old systems with SSDs that replaced dead HDDs.  Still have
> numerous performance and memory optimizations planned :>
I came up with a good hack for now:
I split the config file by list, and I'm running 116 instances of
public-inbox-watch, with different config files (and httpd has the giant
config file). Taking a listname as an arg would have been cleaner, but
this is working for now.

It was also finally able to hit the IO limits of the HDDs by doing this,
so there's a lot of low-hanging optimization fruit clearly.
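
For anyone wanting to try the same workaround, a rough sketch of one
way to do the split follows. This is not the actual tooling used here;
`split_config' and the output layout are hypothetical. It assumes each
inbox is a contiguous [publicinbox "NAME"] section, that NAME contains
no slashes, and that the output directory starts out empty. Sections
other than [publicinbox "..."] are dropped, so any global settings
-watch needs would have to be added back to each file by hand.

```shell
# Split a combined config into one file per [publicinbox "NAME"] section.
# $1 = combined config file, $2 = existing, empty output directory
split_config() {
    awk -v out="$2" '
        /^\[/ {
            # Any section header ends the current per-inbox file.
            if (file != "") close(file)
            file = ""
            if ($0 ~ /^\[publicinbox "/) {
                name = $0
                sub(/^\[publicinbox "/, "", name)
                sub(/"\].*/, "", name)
                file = out "/" name ".config"
            }
        }
        file != "" { print >> file }
    ' "$1"
}

# Example (not run here): one -watch instance per generated file,
# selecting the config through PI_CONFIG as earlier in the thread.
#   split_config /etc/public-inbox/config /etc/public-inbox/watch.d
#   for f in /etc/public-inbox/watch.d/*.config; do
#       PI_CONFIG="$f" public-inbox-watch &
#   done
```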

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
