unofficial mirror of notmuch@notmuchmail.org
* notmuch and public-inbox
@ 2021-04-30 23:23 Felipe Contreras
  2021-05-01  0:05 ` Eric Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Felipe Contreras @ 2021-04-30 23:23 UTC
  To: notmuch; +Cc: Eric Wong

Hi,

My workflow with notmuch is near perfect; the only pain
point I have is fetching all the mail of a particular mailing list.

public-inbox seems ideal for doing this efficiently. However, when
searching for information on linking notmuch to public-inbox, I don't
find anything of value. In fact, I can't find a URL of a public-inbox
repository of the notmuch mailing list.

Am I missing something or has nobody really worked on linking these two
tools? Seems like an obvious area of opportunity.

Cheers.

-- 
Felipe Contreras


* Re: notmuch and public-inbox
  2021-04-30 23:23 notmuch and public-inbox Felipe Contreras
@ 2021-05-01  0:05 ` Eric Wong
  2021-05-01  1:23   ` Felipe Contreras
       [not found]   ` <87lf8zs2by.fsf@wobbit.home.cworth.org>
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Wong @ 2021-05-01  0:05 UTC
  To: Felipe Contreras; +Cc: notmuch, W. Trevor King

Felipe Contreras <felipe.contreras@gmail.com> wrote:
> Hi,
> 
> My workflow with notmuch is near to perfect, however, the only pain
> point I have is fetching all the mail of a particular mailing list.
> 
> To do this efficiently public-inbox seems ideal, however, when
> searching information to link notmuch to public-inbox I don't find
> anything of value. In fact, I can't find an URL of a public-inbox
> repository of the notmuch mailing list.

Kyle maintains an unofficial mirror at https://yhetil.org/notmuch

There's no real relationship between them aside from the fact that
both use Xapian (and I learned Xapian from reading the notmuch source).

> Am I missing something or has nobody really worked on linking these two
> tools? Seems like an obvious area of opportunity.

I think W. Trevor King (Cc-ed) also started looking into something
many years ago, but I'm not sure if anything became of it.

I never had the interest in using notmuch since Maildirs are a
non-starter with millions of messages with current FSes/OSes.
mairix + gzipped mboxes mostly works for me, (though mairix
indexing is silly expensive[1])


[1] also, there's a footnote about something on the git list
    which wouldn't be appropriate to discuss here on the
    notmuch list.


* Re: notmuch and public-inbox
  2021-05-01  0:05 ` Eric Wong
@ 2021-05-01  1:23   ` Felipe Contreras
  2021-05-01  5:16     ` Eric Wong
       [not found]   ` <87lf8zs2by.fsf@wobbit.home.cworth.org>
  1 sibling, 1 reply; 5+ messages in thread
From: Felipe Contreras @ 2021-05-01  1:23 UTC
  To: Eric Wong; +Cc: notmuch@notmuchmail.org, W. Trevor King

On Fri, Apr 30, 2021 at 7:05 PM Eric Wong <e@80x24.org> wrote:
>
> Felipe Contreras <felipe.contreras@gmail.com> wrote:
> > My workflow with notmuch is near to perfect, however, the only pain
> > point I have is fetching all the mail of a particular mailing list.
> >
> > To do this efficiently public-inbox seems ideal, however, when
> > searching information to link notmuch to public-inbox I don't find
> > anything of value. In fact, I can't find an URL of a public-inbox
> > repository of the notmuch mailing list.
>
> Kyle maintains an unofficial mirror at https://yhetil.org/notmuch

Nice. Who is Kyle?

> There's no real relationship between them aside from they both
> use Xapian (and I learned Xapian from reading the notmuch source).

I don't mean sharing the Xapian database (although that could be
interesting for the future). I'm talking about as a client of
public-inbox, not as a server.

I mean doing a git clone for a public-inbox repository and notmuch
indexing that repository.

> > Am I missing something or has nobody really worked on linking these two
> > tools? Seems like an obvious area of opportunity.
>
> I think W. Trevor King (Cc-ed) also started looking something
> many years ago, but I'm not sure if anything became of it.
>
> I never had the interest in using notmuch since Maildirs are a
> non-starter with millions of messages with current FSes/OSes.
> mairix + gzipped mboxes mostly works for me, (though mairix
> indexing is silly expensive[1])

If notmuch was patched to support the public-inbox format--as an
alternative to Maildir--then users of public-inbox could clone a
repository, and use notmuch to index that.

I don't see how that could be difficult. But then again, I haven't
looked at the Maildir code.

Cheers.

-- 
Felipe Contreras


* Re: notmuch and public-inbox
       [not found]   ` <87lf8zs2by.fsf@wobbit.home.cworth.org>
@ 2021-05-01  4:58     ` Eric Wong
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2021-05-01  4:58 UTC
  To: Carl Worth; +Cc: notmuch, W. Trevor King

Carl Worth <cworth@cworth.org> wrote:
> On Sat, May 01 2021, Eric Wong wrote:
> > I never had the interest in using notmuch since Maildirs are a
> > non-starter with millions of messages with current FSes/OSes.
> 
> What bottleneck are you seeing here?
> 
> I don't have million(s) of messages but I'm getting close with 1.48M
> messages in my current notmuch index.
> 
> I'm not seeing any problematic performance from the filesystem or OS
> myself, so I'm curious what problem you're referring to here.

I assume you have several Maildirs and not just one with 1.48M?

Since I never actually used notmuch myself, most of my aversion
comes from years of using Maildir sync tools (mbsync,
offlineimap, rsync).  They all struggle with many inodes
and with the syscalls and cache required to walk them.

It's the same reason git puts old objects in packfiles rather
than having millions of loose objects.

Furthermore, my MUA (mutt) struggles on a single Maildir when
its size goes over ~50K.  Maildir is fine as a dumping ground
for mairix search results (typically a few dozen/hundred results).

Maildir fares better nowadays on FSes with compression and
checksums, but the lack of both used to be points against it.
Syscalls have also become more expensive with CPU vulnerability
mitigations.

I've always gzipped my archival mboxes for compression and CRC.
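
As a small illustration of the CRC point, here is a sketch using
Python's standard gzip module. The sample message and the helper name
verify_gzip are made up for this example. The gzip trailer stores a
CRC32 of the uncompressed data, so damage shows up on decompression:

```python
import gzip
import zlib


def verify_gzip(data: bytes) -> bool:
    """Return True if the gzip stream decompresses and passes its CRC check."""
    try:
        gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile (an OSError subclass) is raised on a CRC mismatch.
        return False
    return True


# Hypothetical one-message mbox, compressed in memory.
mbox = b"From alice@example.org Sat May  1 00:00:00 2021\nSubject: test\n\nbody\n"
archive = gzip.compress(mbox)

# The last 8 bytes of a gzip member are CRC32 followed by the length;
# flipping bits in the stored CRC32 simulates on-disk corruption.
damaged = bytearray(archive)
damaged[-8] ^= 0xFF

print(verify_gzip(archive), verify_gzip(bytes(damaged)))
```

The same check is what `gzip -t` performs on a file from the command
line.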

My local mirror of all the messages on lore.kernel.org/* is over
14.6M(*) and growing...  (LKML is 4M of that).


(*) 14.6M in the new combined "extindex" format that should be on
    lore.kernel.org, soon.  For now, I have an experimental
    instance on https://yhbt.net/lore/all/


* Re: notmuch and public-inbox
  2021-05-01  1:23   ` Felipe Contreras
@ 2021-05-01  5:16     ` Eric Wong
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Wong @ 2021-05-01  5:16 UTC
  To: Felipe Contreras; +Cc: notmuch, W. Trevor King

Felipe Contreras <felipe.contreras@gmail.com> wrote:
> On Fri, Apr 30, 2021 at 7:05 PM Eric Wong <e@80x24.org> wrote:
> >
> > Felipe Contreras <felipe.contreras@gmail.com> wrote:
> > > My workflow with notmuch is near to perfect, however, the only pain
> > > point I have is fetching all the mail of a particular mailing list.
> > >
> > > To do this efficiently public-inbox seems ideal, however, when
> > > searching information to link notmuch to public-inbox I don't find
> > > anything of value. In fact, I can't find an URL of a public-inbox
> > > repository of the notmuch mailing list.
> >
> > Kyle maintains an unofficial mirror at https://yhetil.org/notmuch
> 
> Nice. Who is Kyle?

A notmuch user and public-inbox user/contributor; beyond that I
don't know.

public-inbox is all designed so anybody can make mirrors of any
mail they have.  (as I've mirrored a bunch of lists myself
without ever asking permission)

> > There's no real relationship between them aside from they both
> > use Xapian (and I learned Xapian from reading the notmuch source).
> 
> I don't mean sharing the Xapian database (although that could be
> interesting for the future). I'm talking about as a client of
> public-inbox, not as a server.
> 
> I mean doing a git clone for a public-inbox repository and notmuch
> indexing that repository.

Ah, the git repository formats are documented at:

	https://public-inbox.org/public-inbox-v2-format.html
	https://public-inbox.org/public-inbox-v1-format.html

> > > Am I missing something or has nobody really worked on linking these two
> > > tools? Seems like an obvious area of opportunity.
> >
> > I think W. Trevor King (Cc-ed) also started looking something
> > many years ago, but I'm not sure if anything became of it.
> >
> > I never had the interest in using notmuch since Maildirs are a
> > non-starter with millions of messages with current FSes/OSes.
> > mairix + gzipped mboxes mostly works for me, (though mairix
> > indexing is silly expensive[1])
> 
> If notmuch was patched to support the public-inbox format--as an
> alternative to Maildir--then users of public-inbox could clone a
> repository, and use notmuch to index that.
> 
> I don't see how that could be difficult. But then again, I haven't
> looked at the Maildir code.

That would be cool; always room for more tools to interoperate
with each other.  (I'm quite busy with public-inbox and trying
to avoid AOT languages as much as possible).

Keep in mind some users are already happy with l2md and impibe
for writing Maildir, so there are already (space-inefficient) ways
to make notmuch index data from public-inboxes:

* l2md - Maildir and procmail importer using C + libgit2
  https://git.kernel.org/pub/scm/linux/kernel/git/dborkman/l2md.git

* impibe - Perl script to import v1 or v2 to Maildir
  https://leahneukirchen.org/dotfiles/bin/impibe
  discussion: https://public-inbox.org/meta/87v9m0l8t1.fsf@vuxu.org/

(maybe more will appear at <https://public-inbox.org/clients.html>)
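
For illustration only, the kind of export such tools perform can be
sketched in Python. This is not how l2md or impibe are implemented.
It assumes the v2 format documented at the URL above, where each
commit in an epoch repository stores one message as a blob named "m"
(deletions are recorded as "d"). The function names, the use of HEAD,
and the paths in the usage note are assumptions of this sketch, and it
does not act on deletions:

```python
import mailbox
import subprocess


def list_commits(git_dir):
    """Return the commit IDs of one epoch repository, oldest first."""
    out = subprocess.run(
        ["git", "--git-dir", str(git_dir), "rev-list", "--reverse", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def read_message(git_dir, commit):
    """Return the raw bytes of the "m" blob in a commit, or None if absent."""
    proc = subprocess.run(
        ["git", "--git-dir", str(git_dir), "cat-file", "blob", f"{commit}:m"],
        capture_output=True,
    )
    # Commits without an "m" file (for example deletions) make git fail.
    return proc.stdout if proc.returncode == 0 else None


def deliver(maildir_path, raw_messages):
    """Write each raw message into a Maildir and return how many were added."""
    box = mailbox.Maildir(str(maildir_path), create=True)
    count = 0
    for raw in raw_messages:
        box.add(raw)  # lands in the Maildir's new/ subdirectory
        count += 1
    box.close()
    return count


def export_epoch(git_dir, maildir_path):
    """Copy every message of one epoch repository into a Maildir."""
    raws = (read_message(git_dir, c) for c in list_commits(git_dir))
    return deliver(maildir_path, (r for r in raws if r is not None))
```

With a cloned epoch at a hypothetical path such as notmuch/git/0.git,
export_epoch("notmuch/git/0.git", "mail/notmuch") followed by
`notmuch new` would let notmuch index the result, provided the Maildir
sits under notmuch's configured mail root. Like the tools above, this
keeps a second copy of every message on disk.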

