From: Felipe Contreras <felipe.contreras@gmail.com>
To: Michael J Gruber <michaeljgruber+grubix+git@gmail.com>,
Felipe Contreras <felipe.contreras@gmail.com>
Cc: notmuch@notmuchmail.org, Tobias Waldekranz <tobias@waldekranz.com>
Subject: Re: inbox-update: new competition of notmuch-lore
Date: Mon, 17 Apr 2023 06:30:18 -0600
Message-ID: <643d3bdae7ae6_751a29486@chronos.notmuch>
In-Reply-To: <CAA19uiTA-mQ1nTPs14-Vhy0Q6Y6DTbHKVGtpNdrCMpCX3ueV=A@mail.gmail.com>
Michael J Gruber wrote:
> > I'm moving from mbsync to public-inbox and I find there aren't many tools to
> > make it work with notmuch.
>
> Looking at that, too.
>
> > I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
> > issues.
> >
> > So I wrote my own script to convert public-inbox mailing lists to Maildir
> > format: notmuch-tools/inbox-update [2].
> >
> > It's much faster at the initial clone, it deals with deleted mails, and YAML is
> > a much better configuration format.
>
> Looking at both scripts: Is the speed-up mainly due to `git cat-file`
> vs. `git show`?
My guess is that it's due to using `git cat-file` in batch mode, so it's called
only once, instead of thousands of times.
Presumably this can be done in notmuch-lore as well, with something like:
  git rev-list HEAD | sed -e 's/$/:m/' | git cat-file --batch
But this still has the issue that some commits remove mail rather than add it.
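To illustrate the batch approach, here is a minimal sketch of a reader for the output of `git cat-file --batch`. It relies on the documented output format: a `<oid> <type> <size>` header line, the content, then a newline, or `<name> missing` when the object cannot be found. The function name `read_batch` is hypothetical, and the sketch assumes that a commit which removes a mail has no `m` file in its tree, so that it shows up as "missing" and can be skipped.

```python
from typing import BinaryIO, Iterator, Optional, Tuple


def read_batch(stream: BinaryIO) -> Iterator[Tuple[str, Optional[bytes]]]:
    """Yield (name, content) pairs from `git cat-file --batch` output.

    content is None when git reports the requested object as missing.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # end of the stream
        fields = header.rstrip(b"\n").decode().split(" ")
        if fields[-1] == "missing":
            # "<name> missing": assumed here to mean a commit without `m`,
            # i.e. one that removed a mail instead of adding one.
            yield " ".join(fields[:-1]), None
            continue
        oid, _type, size = fields
        content = stream.read(int(size))
        stream.read(1)  # consume the newline that follows the content
        yield oid, content
```

In practice `stream` would be the stdout pipe of a single `git cat-file --batch` process fed with one `<commit>:m` name per line, which is what keeps git from being spawned once per mail.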
> > Also, you can configure which epochs you want to fetch (notmuch-lore fetches
> > all of them).
> >
> > One thing it doesn't yet do is trim the repository once the mails have been
> > converted, but that's probably easy to add later on.
>
> What kind of trimming are you thinking about here? Partial history?
Same as notmuch-lore does: just the last commit.
Once the mails have been extracted there's no need for those commits.
> I guess this shows that public-inbox's repo format is simply not the
> best choice for the purpose of mail readers. It is optimised for other
> uses, and I always wondered why they use a non-bare repo at all. That
> single file path m at the root creates absolutely meaningless diffs.
> And the commit message doubles the info which is present in the blob.
> notes-ref could have served better for inspiration of public-inbox.
> (Barking up the wrong tree, I know.)
I don't know if there's a better format. Git stores snapshots anyway, so as
long as the information is retrievable in some way, I think that's fine.
And I clone the public-inbox repositories as bare (mirror, actually); that's
something for the client to decide.
> There are even tools in the public-inbox eco system which feed that
> info into a xapian db, though not notmuch-like, as if notmuch hadn't
> existed already.
>
> What I'm dreaming of is a notmuch "storage backend" which is git
> object db based rather than maildir based, and compatible with
> public-inbox (at least with the use case, i.e. v3 or v4...). I mean -
> why do we need a checkout of basically immutable files which are
> stored in blobs already, just so that notmuch can index them?
Yeap, that's exactly what I want as well.
It should not be that difficult to decouple notmuch from physical files and
feed some virtual content.
> We need them for the MUAs, I know, and we would need a solution for
> them, too. Or simply a tree in public-inbox which allows clients to
> use a mere checkout ...
99% of the time the content is not needed for the MUAs. So perhaps there could
be a way to request the body of the message through libnotmuch, and some
provider of virtual messages retrieves it on demand.
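The on-demand idea could look roughly like the sketch below. Everything in it is hypothetical: `LazyMessageStore`, `register`, and `body` are illustrative names, not part of libnotmuch, and `fetch_blob` stands in for whatever reads a blob from the git object database (for example a wrapper around `git cat-file blob <oid>`).

```python
from typing import Callable, Dict


class LazyMessageStore:
    """Hypothetical provider of virtual messages; not a libnotmuch API."""

    def __init__(self, fetch_blob: Callable[[str], bytes]) -> None:
        self._fetch_blob = fetch_blob        # reads one blob by object id
        self._blob_of: Dict[str, str] = {}   # Message-ID -> blob id
        self._cache: Dict[str, bytes] = {}   # Message-ID -> fetched content

    def register(self, message_id: str, blob_id: str) -> None:
        """Record where a message lives without reading its content."""
        self._blob_of[message_id] = blob_id

    def body(self, message_id: str) -> bytes:
        """Fetch the message content the first time it is requested."""
        if message_id not in self._cache:
            blob_id = self._blob_of[message_id]
            self._cache[message_id] = self._fetch_blob(blob_id)
        return self._cache[message_id]
```

Indexing would only call `register`, so nothing is read from git until an MUA actually asks for a body.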
Maildir seems like a cumbersome intermediary to me, at the moment.
--
Felipe Contreras