From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.2 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF shortcircuit=no
	autolearn=ham autolearn_force=no version=3.4.6
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 783F61F572;
	Mon, 22 Jul 2024 19:41:01 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=80x24.org;
	s=selector1; t=1721677261;
	bh=MnrsxTQCW18TghvEzpuiXL3cqT9vKo0xjSMw83EmnME=;
	h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
	b=5SzyiXBZ6MWTwQ+AYlH3leFYNwE1LFY7102ylWQyJG0lgptOFJhz5vgS9IPUxhn0I
	 kf7Cr8jGtZezJQcuV0MK22bsvhd+y+NzAQEppqbP9VtME/sL/z5Vj6M+qf4dLc6TPi
	 ZB5JpHm05G3QVeHnQBH9Nf4aW3C05VYn63DCBKNA=
Date: Mon, 22 Jul 2024 19:40:55 +0000
From: Eric Wong
To: "Robin H. Johnson"
Cc: meta@public-inbox.org
Subject: Re: more debugging for gentoo usage & supporting feature requests
Message-ID: <20240722194055.M603030@dcvr>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
List-Id:

"Robin H. Johnson" wrote:
> Hi,
>
> I moved the Gentoo instance to a much beefier machine & newer kernel,
> ingest is a lot faster; but there's still some hiccups.
>
> 1. Request for more debugging details about mails: Seems that many of
> our oldest mails don't get ingested - and there's no output about why.
> I don't know if -watch actually scanned that folder or not.

I started working on it, but got sidetracked with some bugs in
the FakeInotify implementation on low-time-resolution FS :x

> 1.1. Possibly related:
> Intended config is that the mail should be ingested regardless of the
> email address on the headers. Way back in time, the Gentoo lists were
> renamed a few times, and the files are sorted into the correct folders.
> I think this impacted any attempted ingest via -mda because there's no
> other way to override what list a given mail on stdin should be
> associated with.
>
> The headers may be inconsistent, changed style, name, or even be absent
> in a few cases.

Fwiw, there can be multiple publicinbox.*.address directives for
a given inbox.  You can also use publicinbox.*.watchheader to
match arbitrary headers (e.g. List-Id, X-BeenThere, etc...)

I think "public-inbox-ctl import" will be needed to handle odd
messages without any matching headers.

> 2.
> What's the intended way for public-inbox-mda to function with no
> SpamAssassin installed at all? "spamcheck = " doesn't seem to do it.

spamcheck=none

You can also use --no-precheck to disable some builtin rules.

> 3.
> As a formal feature request:
> Change the arguments of: public-inbox-watch
> - Add --all to mean all lists in the config
> - no arguments => implicit --all
> - $LISTNAME/$INBOXPATH => one *OR* more inboxes manually specified.
>
> I did a hacky split of the configuration for Gentoo, and things are a
> LOT more stable with 120 instances; but it's a little wasteful: I'd like
> to give the high-traffic lists their own instance, and group the
> low-traffic instances together.

Fwiw, the IMAP code for watch is already 1:1 process:IMAP-mailbox
because of the Mail::IMAPClient API.  How about making that an
option for Maildirs, too, and at least get some benefit from
copy-on-write memory savings...

> 4.
> Make public-inbox-init NOT attempt to write to any configuration files.
>
> Trying to implement segregation of roles:
> - config files owned by root only; readable by public-inbox users.

OK, it'd probably have to write a $INBOX_DIR/config.snippet.sample
file with comments, then..

> - source maildirs read-only to user running public-inbox-watch
> - public-inbox dirs writable to user running public-inbox-watch
> - public-inbox dirs readable to user running public-inbox-httpd

The last 3 have been what I've been doing since the beginning.
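
To make the config side of points 1.1 and 2 concrete, here is a rough
sketch, not a tested setup.  The inbox name, addresses and paths are
placeholders, and putting spamcheck under [publicinboxmda] is an
assumption about which section is meant above:

```
; hypothetical inbox; name, addresses and paths are placeholders
[publicinbox "gentoo-dev"]
	inboxdir = /path/to/inboxes/gentoo-dev
	; one address line per historical list name
	address = gentoo-dev@lists.example.org
	address = gentoo-dev@example.org
	watch = maildir:/path/to/maildirs/gentoo-dev
	; optional: restrict -watch to mail carrying this header
	watchheader = List-Id:<gentoo-dev.example.org>
[publicinboxmda]
	; assumed section for the -mda setting; disables the spamc check
	spamcheck = none
```

If everything in an already-sorted Maildir should be imported no matter
what the headers say, leaving watchheader unset is probably the simpler
choice, since it only narrows what -watch accepts.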