2022-12-05 20:03 kyle@kyleam.com:
> On 12/05/22 09:45:41 +0100, zimoun wrote:
> > Personally, I use “git clone” from a public-inbox instance [1].
> >
> >     git clone --mirror https://yhetil.org/guix-patches/1 \
> >         guix-patches/git/1.git
> >
> > where ’1’ can also be replaced by ’0’ for the very old ones.
>
> In the case of guix-patches, there's not a 0.git.  I started it at 1.git
> to leave open the possibility of adding 0.git with the messages I was
> missing from the beginning of the list's history (from Feb 12, 2017 to
> March 27, 2017).  I'm not sure I'm ever going to do so at this point
> and, even if I did, reserving 0.git doesn't have much advantage over
> just adding the old messages on top of the existing epoch, so I probably
> should have just started it at 0.git.
>
> > Then the conversion from Git commit to maildir is done by a small script
> > [2], where all the job reads:
> >
> > --8<---------------cut here---------------start------------->8---
> >     # Extract the message from each commit in the range and store it
> >     # in the Maildir for notmuch to consume.
> >     $git rev-list $range | while read sha; do
> >         # XXXX: fatal: path 'm' does not exist in
> >         # and it can also raise issues with notmuch, as:
> >         # Note: Ignoring non-mail file: $maildir/new/$sha
>
> A tree can either have m or d ("deleted" messages):
>
>   https://public-inbox.org/public-inbox-v2-format.html
>
> So you should be able to avoid this error by skipping d's.
>
> >         $git show $sha:m > $maildir/new/$sha
> >     done
> > --8<---------------cut here---------------end--------------->8---
> >
> > (Maybe better could be done and more robust are around.)
>
> No need to change what works, of course, but
> https://public-inbox.org/clients.html mentions l2md and impibe as tools
> for converting public-inbox archives to Maildir.
>
>   * https://git.kernel.org/pub/scm/linux/kernel/git/dborkman/l2md.git
>   * https://leahneukirchen.org/dotfiles/bin/impibe
>
> In terms of cloning archives, plain cloning and fetching with Git is
> fine, but, if you have public-inbox locally, you can instead use
> public-inbox-clone and public-inbox-fetch, which will handle some
> details for you (e.g., cloning underlying epochs and recognizing that new
> epochs have been added):
>
>   $ public-inbox-clone https://yhetil.org/guix-patches
>
> Another option for fetching that's nice if you're mirroring multiple repos
> is grokmirror:
>
>   * https://git.kernel.org/pub/scm/utils/grokmirror/grokmirror.git/about/
>   * example setup for guix: https://yhetil.org/guix-patches/878scww903.fsf@kyleam.com/
>
> Both grokmirror and public-inbox-clone/fetch make use of the manifests
> that are published for public-inbox archives:
>
>   $ curl -fSsL https://yhetil.org/manifest.js.gz | gzip -d | \
>       jq -r 'keys | .[] | select(contains("guix"))'
>   /guix-bugs/git/0.git
>   /guix-devel/git/0.git
>   /guix-patches/git/1.git
>   /guix-science/git/0.git
>   /guix-user/git/0.git
>
> Then there's of course also public-inbox's lei (local email interface).
> I won't get into that, but, for anyone interested, here are messages
> where I've given some examples:
>
>   * https://yhetil.org/emacs-devel/87wnh22w7o.fsf@kyleam.com
>   * https://yhetil.org/guix-devel/87y1zcljq3.fsf@kyleam.com

Thank you both for sharing your approaches.

Since I already use isync / mbsync to fetch the mail of my email
accounts via IMAP into maildirs (which I then index with mu and read
with mu4e), I prefer to use the same tech stack to fetch the
mailing-list archives.  I have now accomplished this with the attached
isync configuration.

Note that, as Kyle pointed out, I needed to use the "0" path in some
cases and the "1" path in others.  You might also want to set
MaxMessages and adapt it to your needs.
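
For anyone who stays with the Git-based loop quoted above: Kyle's
suggestion of skipping d's could look roughly like the sketch below.
It is only a minimal sketch.  The function name export_epoch, its
three arguments and the variable names are made up for illustration,
and it assumes every commit in the range is a public-inbox v2 commit
whose tree holds either m or d.

```shell
#!/bin/sh
# Sketch: copy the messages of a public-inbox v2 epoch into a Maildir,
# skipping commits that record a deletion (tree has "d" instead of "m").
# export_epoch and its argument names are hypothetical.

export_epoch() {
    git_dir=$1   # epoch clone, e.g. guix-patches/git/1.git
    range=$2     # anything git rev-list accepts, e.g. HEAD
    maildir=$3   # Maildir root; messages land in $maildir/new

    mkdir -p "$maildir/cur" "$maildir/new" "$maildir/tmp"

    git --git-dir="$git_dir" rev-list "$range" | while read -r sha; do
        # "git cat-file -e" exits non-zero when the object cannot be
        # resolved, which is what happens for $sha:m on a "d" commit.
        if git --git-dir="$git_dir" cat-file -e "$sha:m" 2>/dev/null; then
            git --git-dir="$git_dir" show "$sha:m" > "$maildir/new/$sha"
        fi
    done
}
```

It would be called along the lines of

    export_epoch guix-patches/git/1.git HEAD ~/Maildir/guix-patches

where the Maildir location is just an example.  As in the original
script, each message is stored under its commit hash, and deletion
commits no longer produce the "path 'm' does not exist" error or an
empty non-mail file.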