From: Kyle Meyer <kyle@kyleam.com>
To: Carl Worth <cworth@cworth.org>
Cc: Tobias Waldekranz <tobias@waldekranz.com>, notmuch@notmuchmail.org
Subject: Re: Thanks for notmuch-lore
Date: Tue, 22 Mar 2022 01:15:35 -0400
Message-ID: <87pmme36vc.fsf@kyleam.com>
In-Reply-To: <87mthiua9n.fsf@wondoo.home.cworth.org>

Carl Worth writes:

> On Tue, Feb 01 2022, Tobias Waldekranz wrote:
>> I actually gave up on getting my mailinglists from my email provider,
>> now I just download it directly from lore. I hacked together a script
>> that will scrape a public-inbox repo and convert it to a Maildir:
>>
>> https://github.com/wkz/notmuch-lore
>
> Thanks for sharing this, Tobias. I needed exactly this today, and was
> happy to have found this.
>
> It looks like you've coded something to efficiently do the work that's
> needed periodically, (fetch new emails from the public-inbox git
> repository, convert them to maildir files, and prune away git state
> other than a pointer to what's been converted already).
>
> What I'm missing is the piece to convert over the entire archive from
> the past.

I may be missing something (I didn't know about notmuch-lore before
seeing it mentioned here), but it looks like the initialization step of
notmuch-lore's pre-new hook handles that already. You just need to set
`since` far enough back:

--8<---------------cut here---------------start------------->8---
tmphome=$(mktemp -d "${TMPDIR:-/tmp}"/nm-lore-XXXXXXX)
cd "$tmphome"
HOME="$tmphome"
export HOME
mkdir mail
notmuch setup
notmuch new
mkdir -p mail/.notmuch/.lore mail/.notmuch/hooks
cat >mail/.notmuch/.lore/sources <<'EOF'
[gwl]
url=https://yhetil.org/gwl/git
since=50 years ago
EOF
curl -fSsL \
     https://raw.githubusercontent.com/wkz/notmuch-lore/3e2a13b32b178a4d3296cee6f69ee3491eebdb9f/pre-new \
     >mail/.notmuch/hooks/pre-new
chmod +x mail/.notmuch/hooks/pre-new
./mail/.notmuch/hooks/pre-new
--8<---------------cut here---------------end--------------->8---

That returns the number of messages I expect for that (small) archive:

  $ find mail/gwl -type f | wc -l
  288

Also, just to list some other options in this space, l2md and impibe are
mentioned at <https://public-inbox.org/clients.html> as tools for
converting public-inbox archives into maildir format. (I haven't used
either myself.)

Tobias, just a note of something I saw when looking over the script:

    $git rev-list $3 | while read sha; do
        $git show $sha:m >$db/$1/new/$sha
    done

This would error if it encounters a deleted message in the archive,
because the tree of that commit will contain a "d" file instead of an
"m" file, so `$sha:m` won't resolve. See
<https://public-inbox.org/public-inbox-v2-format.html>.
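
As a rough sketch of one way to guard against that (this is not
notmuch-lore's actual code; the export_messages name and its three
arguments are made up for illustration), the loop could check that "m"
exists in each commit before extracting it:

```shell
#!/bin/sh
# Hypothetical helper, not part of notmuch-lore:
#   export_messages REPO OUTDIR RANGE
# Writes the "m" blob of each commit in RANGE to OUTDIR/<sha>,
# skipping commits whose tree has no "m" (e.g. deletion commits,
# which carry "d" instead).
export_messages() {
    repo=$1
    outdir=$2
    range=$3

    git -C "$repo" rev-list "$range" | while read -r sha; do
        # cat-file -e exits non-zero if the object does not exist.
        if git -C "$repo" cat-file -e "$sha:m" 2>/dev/null; then
            git -C "$repo" show "$sha:m" >"$outdir/$sha"
        fi
    done
}
```

Note that this only avoids the error. A message that was added and
later deleted would still be exported from its original "m" commit, so
honoring deletions would need extra handling.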