* inbox-update: new competition of notmuch-lore

From: Felipe Contreras @ 2023-04-17 1:26 UTC
To: notmuch; +Cc: Tobias Waldekranz

Hi,

I'm moving from mbsync to public-inbox and I find there aren't many tools to
make it work with notmuch.

I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
issues.

So I wrote my own script to convert public-inbox mailing lists to Maildir
format: notmuch-tools/inbox-update [2].

It's much faster at the initial clone, it deals with deleted mails, and YAML is
a much better configuration format.

Also, you can configure which epochs you want to fetch (notmuch-lore fetches
all of them).

One thing it doesn't yet do is trim the repository once the mails have been
converted, but that's probably easy to add later on.

You can check the GitHub page for more information [2].

Cheers.

[1] https://github.com/wkz/notmuch-lore
[2] https://github.com/felipec/notmuch-tools

--
Felipe Contreras

* Re: inbox-update: new competition of notmuch-lore

From: Michael J Gruber @ 2023-04-17 10:23 UTC
To: Felipe Contreras; +Cc: notmuch, Tobias Waldekranz

Hi Felipe

> I'm moving from mbsync to public-inbox and I find there aren't many tools to
> make it work with notmuch.

Looking at that, too.

> I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
> issues.
>
> So I wrote my own script to convert public-inbox mailing lists to Maildir
> format: notmuch-tools/inbox-update [2].
>
> It's much faster at the initial clone, it deals with deleted mails, and YAML is
> a much better configuration format.

Looking at both scripts: Is the speed-up mainly due to `git cat-file`
vs. `git show`?

> Also, you can configure which epochs you want to fetch (notmuch-lore fetches
> all of them).
>
> One thing it doesn't yet do is trim the repository once the mails have been
> converted, but that's probably easy to add later on.

What kind of trimming are you thinking about here? Partial history?

I guess this shows that public-inbox's repo format is simply not the
best choice for the purpose of mail readers. It is optimised for other
uses, and I always wondered why they use a non-bare repo at all. That
single file path m at the root creates absolutely meaningless diffs.
And the commit message doubles the info which is present in the blob.
notes-ref could have served better for inspiration of public-inbox.
(Barking up the wrong tree, I know.)

There are even tools in the public-inbox eco system which feed that
info into a xapian db, though not notmuch-like, as if notmuch hadn't
existed already.

What I'm dreaming of is a notmuch "storage backend" which is git
object db based rather than maildir based, and compatible with
public-inbox (at least with the use case, i.e. v3 or v4...). I mean -
why do we need a checkout of basically immutable files which are
stored in blobs already, just so that notmuch can index them?

We need them for the MUAs, I know, and we would need a solution for
them, too. Or simply a tree in public-inbox which allows clients to
use a mere checkout ...

Cheers,
Michael

* Re: inbox-update: new competition of notmuch-lore

From: Felipe Contreras @ 2023-04-17 12:30 UTC
To: Michael J Gruber, Felipe Contreras; +Cc: notmuch, Tobias Waldekranz

Michael J Gruber wrote:
> > I'm moving from mbsync to public-inbox and I find there aren't many tools to
> > make it work with notmuch.
>
> Looking at that, too.
>
> > I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
> > issues.
> >
> > So I wrote my own script to convert public-inbox mailing lists to Maildir
> > format: notmuch-tools/inbox-update [2].
> >
> > It's much faster at the initial clone, it deals with deleted mails, and YAML is
> > a much better configuration format.
>
> Looking at both scripts: Is the speed-up mainly due to `git cat-file`
> vs. `git show`?

My guess is that it's due to using `git cat-file` in batch mode, so it's called
only once, instead of thousands of times.

Presumably this can be done in notmuch-lore as well, with something like:

  git rev-list HEAD | sed -e 's/$/:m/' | git cat-file --batch

But this still has the issue that some commits remove mail rather than add it,
so not every commit has an `m` file to read.

> > Also, you can configure which epochs you want to fetch (notmuch-lore fetches
> > all of them).
> >
> > One thing it doesn't yet do is trim the repository once the mails have been
> > converted, but that's probably easy to add later on.
>
> What kind of trimming are you thinking about here? Partial history?

Same as notmuch-lore does: just the last commit. Once the mails have been
extracted there's no need for those commits.

> I guess this shows that public-inbox's repo format is simply not the
> best choice for the purpose of mail readers. It is optimised for other
> uses, and I always wondered why they use a non-bare repo at all. That
> single file path m at the root creates absolutely meaningless diffs.
> And the commit message doubles the info which is present in the blob.
> notes-ref could have served better for inspiration of public-inbox.
> (Barking up the wrong tree, I know.)

I don't know if there's a better format; git stores snapshots anyway, so as
long as the information is retrievable in some way, I think that's fine.

And I clone the public-inbox repositories as bare (mirror, actually); that's
something for the client to decide.

> There are even tools in the public-inbox eco system which feed that
> info into a xapian db, though not notmuch-like, as if notmuch hadn't
> existed already.
>
> What I'm dreaming of is a notmuch "storage backend" which is git
> object db based rather than maildir based, and compatible with
> public-inbox (at least with the use case, i.e. v3 or v4...). I mean -
> why do we need a checkout of basically immutable files which are
> stored in blobs already, just so that notmuch can index them?

Yep, that's exactly what I want as well. It should not be that difficult to
decouple notmuch from physical files and feed it some virtual content.

> We need them for the MUAs, I know, and we would need a solution for
> them, too. Or simply a tree in public-inbox which allows clients to
> use a mere checkout ...

99% of the time the content is not needed for the MUAs. So perhaps there could
be a way to request the body of the message through libnotmuch, and some
provider of virtual messages retrieves it on demand.

Maildir seems like a cumbersome intermediary to me, at the moment.

--
Felipe Contreras

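Editor's sketch of the batch-mode idea from the message above. It is not taken from inbox-update or notmuch-lore. It only shows how the output stream of a single `git cat-file --batch` process could be split into individual mails, and how commits without an `m` file show up. The function name `read_batch` and the sample data are hypothetical. The sketch assumes the documented output layout (`<oid> <type> <size>`, then the contents, then a newline, or `<name> missing` for names that do not resolve) and assumes a commit that removes a mail has no `m` entry.

```python
import io
from typing import BinaryIO, Iterator, Optional, Tuple


def read_batch(stream: BinaryIO) -> Iterator[Tuple[str, Optional[bytes]]]:
    """Yield (name, content) pairs from `git cat-file --batch` output.

    For objects that exist, name is the object id printed by git.
    For names git could not resolve, name is the requested name
    (for example "<commit>:m") and content is None.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        fields = header.rstrip(b"\n").split(b" ")
        if fields[-1] == b"missing":
            # Assumed case: the commit removed a mail, so it has no `m` file.
            yield b" ".join(fields[:-1]).decode(), None
            continue
        oid, _kind, size = fields
        content = stream.read(int(size))
        stream.read(1)  # consume the newline git writes after each object
        yield oid.decode(), content


# Synthetic stream standing in for real git output (hypothetical ids).
mail = b"From: a@example.org\nSubject: hi\n\nbody\n"
fake = (
    b"1111111111111111111111111111111111111111 blob %d\n" % len(mail)
    + mail + b"\n"
    + b"2222222222222222222222222222222222222222:m missing\n"
)

results = list(read_batch(io.BytesIO(fake)))
mails = [content for _, content in results if content is not None]
skipped = [name for name, content in results if content is None]
print(len(mails), "mail(s) extracted,", len(skipped), "name(s) skipped")
```

In real use the stream would be the standard output of one long-running `git cat-file --batch` process fed with `<commit>:m` names, which is where the saving over one `git show` per mail would come from. Names reported as missing would still need separate handling to apply the deletion.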