Date: Wed, 14 Jun 2023 09:44:56 +0000
From: Eric Wong
To: Rebecca Cran
Cc: meta@public-inbox.org
Subject: Re: Help setting up a public-inbox instance: importing from maildir archive isn't working
Message-ID: <20230614094456.M853858@dcvr>

Rebecca Cran wrote:
> I'm trying to set up a public-inbox instance to mirror the edk2-devel
> mailing list. I'm using version 1.9.0.
>
> I'm having problems importing a mirror of 105,000 messages from 2016 onward
> - when I configured it to pull them from an IMAP server it appeared to go
> through all of them, but then the web interface only showed the last 10 or
> so, with no more pages and the edk2-devel.git directory is only a few MB.
>
> When imap didn't work, I tried downloading them into a maildir and tried
> importing them via that instead, but that isn't working either.

OK, so I suppose it's a matching problem because the matching logic
is shared between IMAP and Maildir.

> [publicinbox "edk2-devel"]
>     address = devel@edk2.groups.io
>     inboxdir = /home/public-inbox/edk2-devel.git
>     watchheader = List-Id:

Any chance that List-Id doesn't match the older messages?

You are allowed multiple `watchheader' directives for an inbox
to account for address/name changes and such (and older headers
such as `X-BeenThere').

I haven't tried it, but this should work as long as you want
every message in a watched Maildir (or IMAP folder):

	watchheader = From:.

...which matches all messages with a literal `.' in the From:
header; so practically every valid message. Likewise for
Received:, Date: or any practically-always-present
header:value-substring combo.

I think everything else in your configs+commands looked fine;
but I'm still struggling with lack-of-sleep and could've missed
things :<

I designed the `watchheader' directives to handle multiple lists
funneled into one Maildir; but I suppose it's less intuitive for
users with a 1:1 list => Maildir mapping :x
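
For illustration, the multiple-directive setup described above might look
roughly like the sketch below. The header values are hypothetical
placeholders (the real List-Id and X-BeenThere values for edk2-devel are
not shown in this thread), so substitute whatever the archived messages
actually carry. The catch-all line is the same untested suggestion as
above and is left commented out.

```ini
[publicinbox "edk2-devel"]
	address = devel@edk2.groups.io
	inboxdir = /home/public-inbox/edk2-devel.git
	; hypothetical values, one directive per header:value-substring
	watchheader = List-Id:<current-list.example.org>
	watchheader = X-BeenThere:old-list@example.org
	; untested catch-all for a Maildir that holds only this list:
	; watchheader = From:.
```

Checking the headers of one of the oldest messages in the Maildir should
show which header:value pairs the older mail actually carries.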