unofficial mirror of notmuch@notmuchmail.org
From: Ben Gamari <bgamari@gmail.com>
To: notmuch <notmuch@notmuchmail.org>
Subject: Re: Mail in git
Date: Wed, 17 Feb 2010 10:03:36 -0500
Message-ID: <1266418124-sup-6308@ben-laptop>
In-Reply-To: <20100217012101.GD8249@lapse.rw.madduck.net>

Excerpts from martin f krafft's message of Tue Feb 16 20:21:01 -0500 2010:
> What I am wondering is if (explicit) tags couldn't be represented as
> tree-objects with this.
> 
>   evenless-link   - link a message object with a tree object
>   evenless-unlink - unlink a message object from a tree object
>     [replaces evenless-unlink]

I was actually wondering this very thing. I'd just be worried about tags
with large numbers of messages (presumably we would need an All tag,
that would contain a reference to every known message). It seems like
the simple act of adding a message to the repository could turn into an
extremely expensive operation.

Moreover, deleting a message could also be quite expensive, as this will
require rewriting all of the tags that reference it. Surely we would
need to batch these sorts of operations to avoid disastrous performance.

However, even with batching, it seems we would face some pretty serious
scalability issues. I think if we were to implement tag storage in
trees, we'd need to use a multi-level tree. This way we could avoid
rewriting a tree object containing all of the tag's messages on every
change. I apologize if this was already obvious to everyone but me.
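
To make the cost argument concrete, here is a rough sketch in Python.
It is not git's real tree format and not anything evenless implements:
the "name hash" line serialization, the two-character prefix fan-out,
and all function names are assumptions made for illustration. It
compares the bytes that have to be rewritten to add one message to a
flat tag tree against a two-level tree sharded by ID prefix.

```python
import hashlib


def tree_bytes(entries):
    """Serialize a tree as sorted 'name hash' lines (simplified, not git's format)."""
    return "".join(f"{name} {oid}\n" for name, oid in sorted(entries.items())).encode()


def tree_id(entries):
    """Content address of a tree: the SHA-1 of its serialized form."""
    return hashlib.sha1(tree_bytes(entries)).hexdigest()


def flat_add_cost(messages, new_msg):
    """Bytes written when a single flat tree holds every message of the tag."""
    entries = {m: m for m in messages}
    entries[new_msg] = new_msg
    # The whole tree is a new object, so all entries are written again.
    return len(tree_bytes(entries))


def sharded_add_cost(messages, new_msg, prefix_len=2):
    """Bytes written when messages are fanned out into subtrees by ID prefix."""
    shards = {}
    for m in messages:
        shards.setdefault(m[:prefix_len], {})[m] = m
    shard = shards.setdefault(new_msg[:prefix_len], {})
    shard[new_msg] = new_msg
    # Only the touched subtree and the top-level tree pointing at it change.
    top = {prefix: tree_id(sub) for prefix, sub in shards.items()}
    return len(tree_bytes(shard)) + len(tree_bytes(top))


def fake_ids(n):
    """Hypothetical message IDs: SHA-1 hex digests of the integers 0..n-1."""
    return [hashlib.sha1(str(i).encode()).hexdigest() for i in range(n)]


if __name__ == "__main__":
    msgs = fake_ids(20000)
    new = hashlib.sha1(b"new message").hexdigest()
    print("flat tree bytes rewritten:   ", flat_add_cost(msgs, new))
    print("sharded tree bytes rewritten:", sharded_add_cost(msgs, new))
```

The flat cost grows linearly with the number of tagged messages, while
the sharded cost stays near (messages per shard) plus (number of
shards). Deeper fan-out would trade more tree objects for smaller
rewrites.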

> 
> messages would then be deleted whenever using git-gc.
> 
> No idea how this would sync if we don't keep ancestry. Otoh, it
> would probably not be very expensive to do just that.

I think that keeping the ancestry would be quite important and would
come with relatively low overhead given the correct dereferencing of
data structures.

> 
> notmuch would then only search and provide the hash ID(s); tags
> would be a function of storage.
> 
> Is it possible to find out all trees that reference a given object
> with Git in constant or sub-linear time?
> 
I don't believe so. I think this is one of the reasons why git gc is so
expensive.
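
As a toy illustration (an in-memory model, not git's object store; the
dict-of-sets layout and both function names are assumptions), the
sketch below contrasts the brute-force scan that forward-only pointers
force on you with a separately maintained reverse index, which is the
sort of thing an external indexer would have to provide.

```python
from collections import defaultdict


def trees_referencing(trees, oid):
    """Brute force: visit every tree and test membership.

    Cost grows with the number of trees, since nothing points
    back from an object to the trees that contain it.
    """
    return sorted(name for name, members in trees.items() if oid in members)


def build_reverse_index(trees):
    """Build an object -> trees map once, so later lookups avoid the scan.

    The index has to be kept in sync on every link and unlink.
    """
    index = defaultdict(set)
    for name, members in trees.items():
        for oid in members:
            index[oid].add(name)
    return index


if __name__ == "__main__":
    # Hypothetical tag trees, each holding a set of message object IDs.
    tags = {"inbox": {"a", "b"}, "unread": {"b"}, "sent": {"c"}}
    print(trees_referencing(tags, "b"))
    print(sorted(build_reverse_index(tags)["b"]))
```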

- Ben
