unofficial mirror of notmuch@notmuchmail.org
* inbox-update: new competition of notmuch-lore
@ 2023-04-17  1:26 Felipe Contreras
  2023-04-17 10:23 ` Michael J Gruber
  0 siblings, 1 reply; 3+ messages in thread
From: Felipe Contreras @ 2023-04-17  1:26 UTC
  To: notmuch; +Cc: Tobias Waldekranz

Hi,

I'm moving from mbsync to public-inbox and I find there aren't many tools to
make it work with notmuch.

I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
issues.

So I wrote my own script to convert public-inbox mailing lists to Maildir
format: notmuch-tools/inbox-update [2].

It's much faster at the initial clone, it deals with deleted mails, and YAML is
a much better configuration format.

Also, you can configure which epochs you want to fetch (notmuch-lore fetches
all of them).

One thing it doesn't yet do is trim the repository once the mails have been
converted, but that's probably easy to add later on.

You can check the GitHub page for more information [2].
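
For readers unfamiliar with the Maildir side of such a conversion, here is a
rough sketch of storing a single message. It is not taken from inbox-update:
the function name `deliver` and the choice of the git blob id as the file name
are assumptions made purely for illustration.

```sh
#!/bin/sh
# deliver MAILDIR NAME: read one message on stdin and store it in MAILDIR.
# NAME is a caller-chosen unique file name (assumption: the git blob id).
deliver() {
    maildir=$1
    name=$2
    # A Maildir consists of the three subdirectories tmp/, new/ and cur/.
    mkdir -p "$maildir/tmp" "$maildir/new" "$maildir/cur" || return 1
    # Write into tmp/ first and rename into new/ afterwards, so a reader
    # never sees a partially written message.
    cat > "$maildir/tmp/$name" || return 1
    mv "$maildir/tmp/$name" "$maildir/new/$name"
}
```

With a blob id in `$id` and a hypothetical target directory, usage could look
like:

  git cat-file blob "$id" | deliver ~/mail/lists/notmuch "$id"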

Cheers.

[1] https://github.com/wkz/notmuch-lore
[2] https://github.com/felipec/notmuch-tools

-- 
Felipe Contreras


* Re: inbox-update: new competition of notmuch-lore
  2023-04-17  1:26 inbox-update: new competition of notmuch-lore Felipe Contreras
@ 2023-04-17 10:23 ` Michael J Gruber
  2023-04-17 12:30   ` Felipe Contreras
  0 siblings, 1 reply; 3+ messages in thread
From: Michael J Gruber @ 2023-04-17 10:23 UTC
  To: Felipe Contreras; +Cc: notmuch, Tobias Waldekranz

Hi Felipe

> I'm moving from mbsync to public-inbox and I find there aren't many tools to
> make it work with notmuch.

Looking at that, too.

> I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
> issues.
>
> So I wrote my own script to convert public-inbox mailing lists to Maildir
> format: notmuch-tools/inbox-update [2].
>
> It's much faster at the initial clone, it deals with deleted mails, and YAML is
> a much better configuration format.

Looking at both scripts: Is the speed-up mainly due to `git cat-file`
vs. `git show`?

> Also, you can configure which epochs you want to fetch (notmuch-lore fetches
> all of them).
>
> One thing it doesn't yet do is trim the repository once the mails have been
> converted, but that's probably easy to add later on.

What kind of trimming are you thinking about here? Partial history?

I guess this shows that public-inbox's repo format is simply not the
best choice for the purpose of mail readers. It is optimised for other
uses, and I always wondered why they use a non-bare repo at all. That
single file path m at the root creates absolutely meaningless diffs.
And the commit message doubles the info which is present in the blob.
notes-ref could have served better for inspiration of public-inbox.
(Barking up the wrong tree, I know.)

There are even tools in the public-inbox ecosystem which feed that
info into a xapian db, though not notmuch-like, as if notmuch hadn't
existed already.

What I'm dreaming of is a notmuch "storage backend" which is git
object db based rather than maildir based, and compatible with
public-inbox (at least with the use case, i.e. v3 or v4...). I mean -
why do we need a checkout of basically immutable files which are
stored in blobs already, just so that notmuch can index them?

We need them for the MUAs, I know, and we would need a solution for
them, too. Or simply a tree in public-inbox which allows clients to
use a mere checkout ...

Cheers,
Michael


* Re: inbox-update: new competition of notmuch-lore
  2023-04-17 10:23 ` Michael J Gruber
@ 2023-04-17 12:30   ` Felipe Contreras
  0 siblings, 0 replies; 3+ messages in thread
From: Felipe Contreras @ 2023-04-17 12:30 UTC
  To: Michael J Gruber, Felipe Contreras; +Cc: notmuch, Tobias Waldekranz

Michael J Gruber wrote:
> > I'm moving from mbsync to public-inbox and I find there aren't many tools to
> > make it work with notmuch.
> 
> Looking at that, too.
> 
> > I gave a try to notmuch-lore [1] but I found it too slow and had a couple of
> > issues.
> >
> > So I wrote my own script to convert public-inbox mailing lists to Maildir
> > format: notmuch-tools/inbox-update [2].
> >
> > It's much faster at the initial clone, it deals with deleted mails, and YAML is
> > a much better configuration format.
> 
> Looking at both scripts: Is the speed-up mainly due to `git cat-file`
> vs. `git show`?

My guess is that it's due to using `git cat-file` in batch mode, so it's called
only once, instead of thousands of times.

Presumably this can be done in notmuch-lore as well, with something like:

  git rev-list HEAD | sed -e 's/$/:m/' | git cat-file --batch

But this still has the issue that some commits remove mail instead of adding it.
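
A minimal sketch of one way to cope with that, assuming that a commit which
removes a mail has no `m` file in its tree, so that `git cat-file
--batch-check` answers `<name> missing` for it instead of `<id> blob <size>`.
The helper name `added_blobs` is made up for illustration; it is not part of
notmuch-lore or inbox-update.

```sh
#!/bin/sh
# added_blobs: read default `git cat-file --batch-check` output on stdin
# and print only the ids of blobs that exist. Lines of the form
# "<name> missing" (assumed here to come from commits that remove a mail)
# are dropped.
added_blobs() {
    awk '$2 == "blob" { print $1 }'
}
```

The filter could then sit between a cheap existence check and the batch
read, so that `git cat-file` is still started only twice in total:

  git rev-list --reverse HEAD | sed -e 's/$/:m/' |
    git cat-file --batch-check | added_blobs | git cat-file --batch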

> > Also, you can configure which epochs you want to fetch (notmuch-lore fetches
> > all of them).
> >
> > One thing it doesn't yet do is trim the repository once the mails have been
> > converted, but that's probably easy to add later on.
> 
> What kind of trimming are you thinking about here? Partial history?

Same as notmuch-lore does: just the last commit.

Once the mails have been extracted there's no need for those commits.

> I guess this shows that public-inbox's repo format is simply not the
> best choice for the purpose of mail readers. It is optimised for other
> uses, and I always wondered why they use a non-bare repo at all. That
> single file path m at the root creates absolutely meaningless diffs.
> And the commit message doubles the info which is present in the blob.
> notes-ref could have served better for inspiration of public-inbox.
> (Barking up the wrong tree, I know.)

I don't know if there's a better format. Git stores snapshots anyway, so as
long as the information is retrievable in some way, I think that's fine.

And I clone the public-inbox repositories as bare (mirror, actually); that's
something for the client to decide.
 
> There are even tools in the public-inbox ecosystem which feed that
> info into a xapian db, though not notmuch-like, as if notmuch hadn't
> existed already.
> 
> What I'm dreaming of is a notmuch "storage backend" which is git
> object db based rather than maildir based, and compatible with
> public-inbox (at least with the use case, i.e. v3 or v4...). I mean -
> why do we need a checkout of basically immutable files which are
> stored in blobs already, just so that notmuch can index them?

Yeap, that's exactly what I want as well.

It should not be that difficult to decouple notmuch from physical files and
feed some virtual content.

> We need them for the MUAs, I know, and we would need a solution for
> them, too. Or simply a tree in public-inbox which allows clients to
> use a mere checkout ...

99% of the time the content is not needed for the MUAs. So perhaps there could
be a way to request the body of the message through libnotmuch, and some
provider of virtual messages retrieves it on demand.

Maildir seems like a cumbersome intermediary to me, at the moment.

-- 
Felipe Contreras


