From: Kyle Meyer
To: Carl Worth
CC: Tobias Waldekranz, notmuch@notmuchmail.org
Subject: Re: Thanks for notmuch-lore
Date: Tue, 22 Mar 2022 01:15:35 -0400
Message-ID: <87pmme36vc.fsf@kyleam.com>
In-Reply-To: <87mthiua9n.fsf@wondoo.home.cworth.org>

Carl Worth writes:

> On Tue, Feb 01 2022, Tobias Waldekranz wrote:
>> I actually gave up on getting my mailinglists from my email provider,
>> now I just download it directly from lore. I hacked together a script
>> that will scrape a public-inbox repo and convert it to a Maildir:
>>
>> https://github.com/wkz/notmuch-lore
>
> Thanks for sharing this, Tobias. I needed exactly this today, and was
> happy to have found this.
>
> It looks like you've coded something to efficiently do the work that's
> needed periodically, (fetch new emails from the public-inbox git
> repository, convert them to maildir files, and prune away git state
> other than a pointer to what's been converted already).
>
> What I'm missing is the piece to convert over the entire archive from
> the past.

I may be missing something (I didn't know about notmuch-lore before
seeing it mentioned here), but it looks like the initialization step of
notmuch-lore's pre-new handles that already.
You just need to set `since` far enough back:

--8<---------------cut here---------------start------------->8---
tmphome=$(mktemp -d "${TMPDIR:-/tmp}"/nm-lore-XXXXXXX)
cd "$tmphome"
HOME="$tmphome"
export HOME

mkdir mail
notmuch setup
notmuch new

mkdir -p mail/.notmuch/.lore mail/.notmuch/hooks
cat >mail/.notmuch/.lore/sources <<'EOF'
[gwl]
url=https://yhetil.org/gwl/git
since=50 years ago
EOF

curl -fSsL \
     https://raw.githubusercontent.com/wkz/notmuch-lore/3e2a13b32b178a4d3296cee6f69ee3491eebdb9f/pre-new \
     >mail/.notmuch/hooks/pre-new
chmod +x mail/.notmuch/hooks/pre-new
./mail/.notmuch/hooks/pre-new
--8<---------------cut here---------------end--------------->8---

That returns the number of messages I expect for that (small) archive:

  $ find mail/gwl -type f | wc -l
  288

Also, just to list some other options in this space, l2md and impibe are
mentioned at as tools for converting public-inbox archives into maildir
format.  (I haven't used either myself.)

Tobias, just a note of something I saw when looking over the script:

    $git rev-list $3 | while read sha; do
        $git show $sha:m >$db/$1/new/$sha
    done

This would error if it encounters a deleted message in the archive
because then the commit will have a "d" in the working tree instead of
an "m".  See .
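
One way to guard against that case, as a rough sketch only: check that
the commit actually has an "m" blob before calling `git show` on it.
The function name `export_messages` and its three arguments are made up
for illustration and are not part of notmuch-lore; the only git calls
used are `rev-list`, `cat-file -e`, and `show`.

```shell
# Hypothetical helper (not from notmuch-lore): write every message found
# in a revision range of a public-inbox git repository into MAILDIR/new.
# Usage: export_messages GIT_DIR REVISION_RANGE MAILDIR
export_messages() {
    git_dir=$1
    range=$2
    maildir=$3

    mkdir -p "$maildir/new" "$maildir/cur" "$maildir/tmp"

    git --git-dir="$git_dir" rev-list "$range" |
    while read -r sha; do
        # A commit that deletes a message has "d" instead of "m" in its
        # tree, so "$sha:m" does not resolve.  "cat-file -e" exits
        # non-zero in that case and the commit is skipped.
        if git --git-dir="$git_dir" cat-file -e "$sha:m" 2>/dev/null; then
            git --git-dir="$git_dir" show "$sha:m" >"$maildir/new/$sha"
        fi
    done
}
```

With something like this, a call such as `export_messages path/to/repo.git HEAD mail/gwl`
(paths are placeholders) would skip deletion commits instead of
failing on them.  It does not try to remove the copy of a message that
was exported from the earlier commit that added it.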