* Thanks for notmuch-lore
From: Carl Worth @ 2022-03-22  0:00 UTC
To: Tobias Waldekranz; +Cc: notmuch

On Tue, Feb 01 2022, Tobias Waldekranz wrote:
> I actually gave up on getting my mailinglists from my email provider,
> now I just download it directly from lore. I hacked together a script
> that will scrape a public-inbox repo and convert it to a Maildir:
>
> https://github.com/wkz/notmuch-lore

Thanks for sharing this, Tobias. I needed exactly this today, and was
happy to have found this.

It looks like you've coded something to efficiently do the work that's
needed periodically, (fetch new emails from the public-inbox git
repository, convert them to maildir files, and prune away git state
other than a pointer to what's been converted already).

What I'm missing is the piece to convert over the entire archive from
the past. I can fetch it all easily enough with public-inbox-clone.
Maybe what I want could be captured in a tool named something like:

	public-inbox-export --output=maildir

After which I'd be all bootstrapped and ready to use your notmuch-lore
pre-new hook.

> As you can tell from the name, it is tailored for plugging into notmuch,
> but the guts are pretty generic.

Indeed. And it looks like all the code I would need for the export I
described above is right there in your script. It's as simple as:

	git rev-list | while read sha; do
		$git show $sha:m > $maildir/new/$sha
	done

So, next I should go put together a patch against public-inbox to add
that.

Thanks again,

-Carl

PS. I debated whether to CC lkml where the original message I was
replying to was from originally. I decided against it and almost just
emailed Tobias alone, but I really do want discussion like this to be
archived in public. So I CCed the notmuch mailing list at least.
* Re: Thanks for notmuch-lore
From: Kyle Meyer @ 2022-03-22  5:15 UTC
To: Carl Worth; +Cc: Tobias Waldekranz, notmuch

Carl Worth writes:

> On Tue, Feb 01 2022, Tobias Waldekranz wrote:
>> I actually gave up on getting my mailinglists from my email provider,
>> now I just download it directly from lore. I hacked together a script
>> that will scrape a public-inbox repo and convert it to a Maildir:
>>
>> https://github.com/wkz/notmuch-lore
>
> Thanks for sharing this, Tobias. I needed exactly this today, and was
> happy to have found this.
>
> It looks like you've coded something to efficiently do the work that's
> needed periodically, (fetch new emails from the public-inbox git
> repository, convert them to maildir files, and prune away git state
> other than a pointer to what's been converted already).
>
> What I'm missing is the piece to convert over the entire archive from
> the past.

I may be missing something (I didn't know about notmuch-lore before
seeing it mentioned here), but it looks like the initialization step of
notmuch-lore's pre-new handles that already. You just need to set
`since` far enough back:

--8<---------------cut here---------------start------------->8---
tmphome=$(mktemp -d "${TMPDIR:-/tmp}"/nm-lore-XXXXXXX)
cd "$tmphome"
HOME="$tmphome"
export HOME

mkdir mail
notmuch setup
notmuch new

mkdir -p mail/.notmuch/.lore mail/.notmuch/hooks
cat >mail/.notmuch/.lore/sources <<'EOF'
[gwl]
url=https://yhetil.org/gwl/git
since=50 years ago
EOF

curl -fSsL \
     https://raw.githubusercontent.com/wkz/notmuch-lore/3e2a13b32b178a4d3296cee6f69ee3491eebdb9f/pre-new \
     >mail/.notmuch/hooks/pre-new
chmod +x mail/.notmuch/hooks/pre-new
./mail/.notmuch/hooks/pre-new
--8<---------------cut here---------------end--------------->8---

That returns the number of messages I expect for that (small) archive:

    $ find mail/gwl -type f | wc -l
    288

Also, just to list some other options in this space, l2md and impibe are
mentioned at <https://public-inbox.org/clients.html> as tools for
converting public-inbox archives into maildir format. (I haven't used
either myself.)

Tobias, just a note of something I saw when looking over the script:

    $git rev-list $3 | while read sha; do
        $git show $sha:m >$db/$1/new/$sha
    done

This would error if it encounters a deleted message in the archive
because then the commit will have a "d" in the working tree instead of
an "m". See <https://public-inbox.org/public-inbox-v2-format.html>.
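One way to guard the export loop against deletion commits is to check that a commit actually contains "m" before extracting it. The following is only a sketch, not code from notmuch-lore or public-inbox: the function name `export_epoch` and its two arguments are hypothetical, and it assumes a single epoch repository in the v2 layout where each commit's tree holds either "m" or "d".

```shell
#!/bin/sh
# Hypothetical helper (not part of notmuch-lore or public-inbox):
# export every message of one public-inbox v2 epoch into a Maildir,
# skipping deletion commits.
export_epoch() {
    git_dir=$1   # assumed: path to the epoch's git directory
    maildir=$2   # assumed: destination Maildir root

    mkdir -p "$maildir/new" "$maildir/cur" "$maildir/tmp"

    git --git-dir="$git_dir" rev-list HEAD | while read -r sha; do
        # A deletion commit carries "d" instead of "m". Test for "m"
        # first so no empty file is left behind for such commits.
        if git --git-dir="$git_dir" cat-file -e "$sha:m" 2>/dev/null; then
            git --git-dir="$git_dir" show "$sha:m" >"$maildir/new/$sha"
        fi
    done
}
```

It could be called as `export_epoch /path/to/epoch.git "$HOME/mail/list"`, with both paths being placeholders. Testing with `git cat-file -e` before redirecting matters because the shell creates the output file before `git show` runs, so a failing `git show` would otherwise leave a zero-length file in the Maildir.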
* Re: Thanks for notmuch-lore
From: Carl Worth @ 2022-03-22 17:00 UTC
To: Kyle Meyer; +Cc: Tobias Waldekranz, notmuch

On Tue, Mar 22 2022, Kyle Meyer wrote:
> I may be missing something (I didn't know about notmuch-lore before
> seeing it mentioned here), but it looks like the initialization step of
> notmuch-lore's pre-new handles that already. You just need to set
> `since` far enough back:

Hmm... I did see the "since" parameter and cranked it back. It didn't
seem to do what I wanted, but it's possible the bug is only with
multi-epoch archives, (I was trying to bring in LKML).

From poking at it, it looked like it did perform a "deepening" operation
using the "since" parameter after the initial clone, but then didn't use
anything older than the most-recent upstream commit for the range of
commits from which to get messages out.

But my examination of the code and behavior was very cursory, I admit.

> Also, just to list some other options in this space, l2md and impibe are
> mentioned at <https://public-inbox.org/clients.html> as tools for
> converting public-inbox archives into maildir format. (I haven't used
> either myself.)

Thanks! I clearly didn't look quite hard enough. I appreciate the
pointers.

-Carl
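On the multi-epoch point: in the public-inbox v2 layout each epoch is a separate git repository under `git/` (`git/0.git`, `git/1.git`, and so on), so a full export of a large archive has to visit every epoch rather than only the newest. The sketch below only enumerates epochs in numeric order and says nothing about what notmuch-lore actually does. `list_epochs` is a hypothetical helper, and it assumes the inbox has already been cloned into that layout.

```shell
#!/bin/sh
# Hypothetical helper: print the epoch repositories of a cloned
# public-inbox v2 inbox, oldest epoch first.
list_epochs() {
    inbox_dir=$1   # assumed: directory containing git/0.git, git/1.git, ...

    for d in "$inbox_dir"/git/*.git; do
        [ -d "$d" ] || continue      # glob matched nothing, or not a directory
        n=${d##*/}                   # strip leading path: "10.git"
        printf '%s\n' "${n%.git}"    # strip suffix: "10"
    done | sort -n | while read -r n; do
        # Numeric sort keeps epoch 10 after epoch 2.
        printf '%s/git/%s.git\n' "$inbox_dir" "$n"
    done
}
```

Each printed path could then be fed to a per-epoch export loop, one epoch at a time.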
end of thread, other threads:[~2022-03-22 17:01 UTC | newest]

Thread overview: 3+ messages
     [not found] <20220131154655.1614770-1-tobias@waldekranz.com>
     [not found] ` <20220131154655.1614770-2-tobias@waldekranz.com>
     [not found]   ` <20220201170634.wnxy3s7f6jnmt737@skbuf>
     [not found]     ` <87a6fabbtb.fsf@waldekranz.com>
     [not found]       ` <20220201201141.u3qhhq75bo3xmpiq@skbuf>
     [not found]         ` <8735l2b7ui.fsf@waldekranz.com>
2022-03-22  0:00           ` Thanks for notmuch-lore Carl Worth
2022-03-22  5:15             ` Kyle Meyer
2022-03-22 17:00               ` Carl Worth
Code repositories for project(s) associated with this public inbox https://yhetil.org/notmuch.git/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).