unofficial mirror of notmuch@notmuchmail.org
From: Kyle Meyer <kyle@kyleam.com>
To: Carl Worth <cworth@cworth.org>
Cc: Tobias Waldekranz <tobias@waldekranz.com>, notmuch@notmuchmail.org
Subject: Re: Thanks for notmuch-lore
Date: Tue, 22 Mar 2022 01:15:35 -0400
Message-ID: <87pmme36vc.fsf@kyleam.com>
In-Reply-To: <87mthiua9n.fsf@wondoo.home.cworth.org>

Carl Worth writes:

> On Tue, Feb 01 2022, Tobias Waldekranz wrote:
>> I actually gave up on getting my mailinglists from my email provider,
>> now I just download it directly from lore. I hacked together a script
>> that will scrape a public-inbox repo and convert it to a Maildir:
>>
>> https://github.com/wkz/notmuch-lore
>
> Thanks for sharing this, Tobias. I needed exactly this today, and was
> happy to have found this.
>
> It looks like you've coded something to efficiently do the work that's
> needed periodically, (fetch new emails from the public-inbox git
> repository, convert them to maildir files, and prune away git state
> other than a pointer to what's been converted already).
>
> What I'm missing is the piece to convert over the entire archive from
> the past.

I may be missing something (I didn't know about notmuch-lore before
seeing it mentioned here), but it looks like the initialization step of
notmuch-lore's pre-new handles that already.  You just need to set
`since` far enough back:

--8<---------------cut here---------------start------------->8---
tmphome=$(mktemp -d "${TMPDIR:-/tmp}"/nm-lore-XXXXXXX)
cd "$tmphome"

HOME="$tmphome"
export HOME

mkdir mail
notmuch setup
notmuch new

mkdir -p mail/.notmuch/.lore  mail/.notmuch/hooks

cat >mail/.notmuch/.lore/sources <<'EOF'
[gwl]
url=https://yhetil.org/gwl/git
since=50 years ago
EOF

curl -fSsL \
     https://raw.githubusercontent.com/wkz/notmuch-lore/3e2a13b32b178a4d3296cee6f69ee3491eebdb9f/pre-new \
     >mail/.notmuch/hooks/pre-new
chmod +x mail/.notmuch/hooks/pre-new
./mail/.notmuch/hooks/pre-new
--8<---------------cut here---------------end--------------->8---

That returns the number of messages I expect for that (small) archive:

  $ find mail/gwl -type f | wc -l
  288

Also, just to list some other options in this space, l2md and impibe are
mentioned at <https://public-inbox.org/clients.html> as tools for
converting public-inbox archives into maildir format.  (I haven't used
either myself.)

Tobias, just a note of something I saw when looking over the script:

    $git rev-list $3 | while read sha; do
      $git show $sha:m >$db/$1/new/$sha
    done

This would error if it encounters a deleted message in the archive,
because the commit that records a deletion has a "d" in its tree instead
of an "m".  See <https://public-inbox.org/public-inbox-v2-format.html>.
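
One way to guard against that, as a minimal sketch: check whether the
commit has an "m" blob before extracting it.  `export_messages` and its
arguments are hypothetical names for illustration, not part of
notmuch-lore, and this assumes that simply skipping deletion commits is
acceptable (it does not remove a copy of the message that was exported
earlier).

```shell
# Hypothetical helper, not taken from notmuch-lore.
# Usage: export_messages GIT_DIR REV_RANGE OUT_DIR
export_messages() {
    git_dir=$1
    range=$2
    out=$3

    mkdir -p "$out"
    git --git-dir="$git_dir" rev-list "$range" |
    while read -r sha; do
        # "git cat-file -e" exits non-zero when the object is missing,
        # which is the case for commits that carry "d" instead of "m".
        if git --git-dir="$git_dir" cat-file -e "$sha:m" 2>/dev/null; then
            git --git-dir="$git_dir" show "$sha:m" >"$out/$sha"
        fi
    done
}
```

For example, `export_messages /path/to/repo.git HEAD /tmp/out` would
write one file per non-deleted message and silently skip the rest.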
