From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN: AS3701 140.211.0.0/16
X-Spam-Status: No, score=-3.5 required=3.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,
    RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_PASS shortcircuit=no
    autolearn=ham autolearn_force=no version=3.4.6
Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
    (No client certificate requested)
    by dcvr.yhbt.net (Postfix) with ESMTPS id 2F28F1F572
    for ; Mon, 15 Jul 2024 21:45:45 +0000 (UTC)
Received: from grubbs.orbis-terrarum.net (localhost [127.0.0.1])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
    (No client certificate requested)
    by smtp.gentoo.org (Postfix) with ESMTPS id EA057335D0F
    for ; Mon, 15 Jul 2024 21:45:42 +0000 (UTC)
Received: from grubbs.orbis-terrarum.net (localhost [127.0.0.1])
    by grubbs.orbis-terrarum.net (Postfix) with ESMTP id 10E98260182
    for ; Mon, 15 Jul 2024 21:45:41 +0000 (UTC)
Received: (qmail 524669 invoked by uid 10000); 15 Jul 2024 21:45:41 -0000
Date: Mon, 15 Jul 2024 21:45:41 +0000
From: "Robin H. Johnson"
To: Eric Wong
Cc: meta@public-inbox.org, infra@gentoo.org
Subject: Re: public-inbox skipping new inboxes or many mails
Message-ID:
References: <20240715210340.M929931@dcvr>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
    protocol="application/pgp-signature"; boundary="z5jC7MHn9RD7Vvtt"
Content-Disposition: inline
In-Reply-To: <20240715210340.M929931@dcvr>
List-Id:

--z5jC7MHn9RD7Vvtt
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

TL;DR: kill -USR1 seems to have triggered the import now, whereas even a
restart didn't before.

On Mon, Jul 15, 2024 at 09:03:40PM +0000, Eric Wong wrote:
> > (Nothing about why it seemed to not scan the maildirs at all).
> public-inbox-index doesn't touch Maildirs (or mbox, MH, etc) at all.
> -index only exists to handle mail already in git repos; that is
> -index is intended for freshly cloned inboxes, adding search to
> old v1 inboxes, and/or changing indexlevel after init.
>
> Currently, public-inbox-watch is the only public-inbox-* tool which
> works directly with Maildirs.
Hmm, I had excluded public-inbox-watch initially because it didn't seem
to be doing anything after the very long startup.

I'm thinking that inotify is not working as expected, maybe relating
to the huge number of folders we watch.

Terminal 1:
# strace -p $(pidof /usr/bin/public-inbox-watch) -ff
strace: Process 93260 attached
pselect6(8, [3 4], NULL, NULL, NULL, NULL
(nothing more)

Terminal 2:
$ find /var/archives/.maildir/.gentoo* -maxdepth 2 -path '/var/archives/.maildir/.gentoo-*' -path '*202407/new' -mtime -1 |sed 's,/new,,g' >/tmp/list
...
/var/archives/.maildir/.gentoo-binhost-autobuilds/.202407/new
/var/archives/.maildir/.gentoo-dev/.202407/new
/var/archives/.maildir/.gentoo-dev-announce/.202407/new
/var/archives/.maildir/.gentoo-infrastructure/.202407/new
/var/archives/.maildir/.gentoo-kernel/.202407/new
...
$ fgrep -f /tmp/list /etc/public-inbox/config
...
watch = maildir:/var/archives/.maildir/.gentoo-binhost-autobuilds/.202407
watch = maildir:/var/archives/.maildir/.gentoo-dev/.202407
watch = maildir:/var/archives/.maildir/.gentoo-dev-announce/.202407
...

# Touch the mail so it SHOULD trigger inotify
$ cat /tmp/list |xargs -I^ find ^ -type f -mtime -1 |grep -v -e gentoo-commits |xargs touch

> > $ public-inbox-init --indexlevel full \
> >     --version 2 --jobs 2 \
> >     gentoo-releng-autobuilds \
> >     /var/public-inbox/gentoo-releng-autobuilds.lists.gentoo.org.git \
> >     https://public-inbox.gentoo.org/gentoo-releng-autobuilds \
> >     gentoo-releng-autobuilds@lists.gentoo.org
> sidenote: `.git' suffix is a bit confusing for v2 inboxes;
> only v1 used a single bare git repo
I'll update our internal docs & tooling to drop it - it was a carryover.

...
> I'm curious how you got a single message indexed, however...
> is that from public-inbox-mda?
I think that message arrived and triggered public-inbox-watch but others
didn't.

> Fwiw, I started working on a public-inbox-(import/ctl) tool to
> quickly import a bunch of messages a while back but got
> sidetracked. Been busy dealing with personal problems much of
> this year :<
>
> But public-inbox-watch works reasonably well for large imports
> even if the git history ordering gets a bit wonky from readdir.
> SIGHUP/SIGUSR1 + strace are useful for reloading and tracing
> configuration problems with the -watch daemon.
kill -USR1 seems to have tricked it into adding files now...
But why didn't it add files any other way? Weird.

Anyway, that public-inbox-(import/ctl) sounds like it might be better
for other folders, where we don't expect new mail to be added outside of
the archival cases previously mentioned.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

--z5jC7MHn9RD7Vvtt
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: Robbat2 @ Orbis-Terrarum Networks - The text below is a digital signature.
If it doesn't make any sense to you, ignore it.

iQKTBAABCgB9FiEEveu2pS8Vb98xaNkRGTlfI8WIJsQFAmaVmIRfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldEJE
RUJCNkE1MkYxNTZGREYzMTY4RDkxMTE5Mzk1RjIzQzU4ODI2QzQACgkQGTlfI8WI
JsR0XQ//dHpL+HPjKsAMhFBPFQ1CBGo4bU9hEde8zkRzcCGp5zBnM/zIAEsClUao
Mtl3+NMRhWANSzobfGVuMZUSb2rPu4TvmiOnERx/eJOTrA3g3tqgS1+USqJFKwGt
f88DpTUvmcWJ4HFmzFzMHwKb7ecMFTcqnGEviIwz3ULchEPsOqLnoDvAHmR/sLDj
iT1KKyd5BassuMjxCEZZhwb0qPESIdFy+LyUKr2NOZ9XAcLZLtZTsHJgUZnzyQKq
xdlZj4ggMEGEQ8zicpL6498gUFXhIvuAa/tJu16WcUk+AVA8nsXebXNxFtaVfnnZ
QRQtn0LLIzlv3uZHdP6MPUvVaCC0PrytMv5dDeRo471bn5/wnAT3iLkmHikslOHi
vCXMKhO0YFrkwLyX0zCyjtDm9/u+inxVsLA7RnUw4LlnDkk8kGWVVeJxpKjShrJz
NXxNbZIaBV+LGVRrj5AdNP8GYagIcxyl4Ca1bcwttN/2mG2J6JAh7+bUOSwRv/QN
UsXyYqlXy5vaE6KwZyO9/mad4CC4KSKFtaljWavJ50CnGekcFo6ZBtXl0NiGlaIK
jTKrjplzWZRumoiVqKTsGBRI4XEM0EOv2DXdE5lKFBpVcTEp90PAWFtNwrQZp4Le
X0GYkdFhI8/n2ze53lvrN8FAq+Kov3LCkAfWDuPzv/GCgrrD1FU=
=PoCg
-----END PGP SIGNATURE-----

--z5jC7MHn9RD7Vvtt--
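
The message suspects that inotify misbehaves because of the huge number of watched folders. One way to test that guess is to compare the number of `watch = maildir:` entries in the config against the kernel's per-user inotify watch limit. The sketch below is only an illustration, not part of public-inbox: the function names are made up, and the figure of two inotify watches per Maildir (new/ and cur/) is an assumption about how the -watch daemon registers directories. A limit that is too low would be one possible explanation, not a confirmed diagnosis of the behaviour described above.

```shell
#!/bin/sh
# Hypothetical helpers; not part of public-inbox.

# Print the number of "watch = maildir:..." entries in a config file.
count_maildir_watches() {
    grep -c -E '^[[:space:]]*watch[[:space:]]*=[[:space:]]*maildir:' "$1"
}

# Compare the estimated number of inotify watches with the kernel limit.
# $1: config file
# $2: file holding the limit (defaults to the Linux procfs entry)
# Returns 0 if the estimate fits within the limit, non-zero otherwise.
check_inotify_budget() {
    config=$1
    limit_file=${2:-/proc/sys/fs/inotify/max_user_watches}
    dirs=$(count_maildir_watches "$config")
    # ASSUMPTION: two watches per Maildir (new/ and cur/).
    needed=$((dirs * 2))
    limit=$(cat "$limit_file")
    echo "maildirs=$dirs needed=$needed limit=$limit"
    [ "$needed" -le "$limit" ]
}
```

After sourcing these functions, something like `check_inotify_budget /etc/public-inbox/config || echo "estimate exceeds max_user_watches"` gives a quick yes/no. The limit is shared by every process of the same user, so a passing check does not rule out exhaustion by other programs.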