unofficial mirror of notmuch@notmuchmail.org
From: Eirik Byrkjeflot Anonsen <eirik@eirikba.org>
To: "notmuch mailing list" <notmuch@notmuchmail.org>
Subject: Re: On disk tag storage format
Date: Thu, 29 Nov 2012 20:34:50 +0100
Message-ID: <874nk8td7p.fsf@star.eba>
In-Reply-To: <874nk8v9zw.fsf@zancas.localnet> (David Bremner's message of "Thu, 29 Nov 2012 09:01:23 -0400")

David Bremner <david@tethera.net> writes:

> Austin outlined on IRC a way of representing tags on disk as hardlinks
> to messages. In order to make the discussion more concrete, I wrote a
> prototype in python to dump the notmuch database to this format. On my
> 250k messages, this creates 40k new hardlinks, and uses about 5M of
> diskspace. The dump process takes about 20s on
> my core i7 machine.  With symbolic links, the same database takes about
> 150M of disk space; this isn't great but it isn't unbearable either.

The symlink variant would also eat 40k inodes, I suppose, which may
matter on some systems.  (Hardlinks do not use extra inodes, as they are
just directory entries pointing to already existing inodes.)
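
The inode point can be checked with a minimal Python sketch, assuming a
POSIX file system that supports both link types.  The file names
(msg, tag-hard, tag-soft) are made up for illustration and are not taken
from David's prototype.

```python
import os
import tempfile


def inode_usage_demo():
    """Return (message inode, hardlink inode, symlink inode, link count)."""
    with tempfile.TemporaryDirectory() as root:
        msg = os.path.join(root, "msg")        # stands in for a mail file
        hard = os.path.join(root, "tag-hard")  # hypothetical tag entry
        soft = os.path.join(root, "tag-soft")  # hypothetical tag entry
        with open(msg, "w") as f:
            f.write("Subject: test\n\nbody\n")
        os.link(msg, hard)     # new directory entry, same inode
        os.symlink(msg, soft)  # new inode that stores the target path
        st_msg = os.stat(msg)
        return (st_msg.st_ino,
                os.stat(hard).st_ino,
                os.lstat(soft).st_ino,  # lstat: do not follow the link
                st_msg.st_nlink)


if __name__ == "__main__":
    print(inode_usage_demo())
```

The first two inode numbers should be equal and the third should differ,
with the link count of the message going up to 2.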

Of course, the space usage also depends on the file system.  For
example, ext2 would use one complete block (typically 4 KiB) per symlink
to store the target path, unless the path is short enough (under 60
bytes) to fit in the inode itself.  ReiserFS would probably use 5M for
the directory entries and another 5M for the symlink data (wild guess).
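
To see what a given file system actually charges per symlink, something
like the following rough sketch could be used.  It assumes Linux, where
st_blocks is counted in 512-byte units, and the function name
symlink_cost and the dummy targets are hypothetical.  The numbers it
reports will differ between file systems, which is the point.

```python
import os
import tempfile


def symlink_cost(target_len, count=100, where=None):
    """Return (total target bytes, allocated bytes) for `count` symlinks.

    `where` is the directory to test in (default: the system temp dir),
    so the file system of interest can be chosen by the caller.
    """
    with tempfile.TemporaryDirectory(dir=where) as root:
        target = "x" * target_len  # dangling target; only its length matters
        logical = allocated = 0
        for i in range(count):
            link = os.path.join(root, "link%d" % i)
            os.symlink(target, link)
            st = os.lstat(link)
            logical += st.st_size            # length of the stored target path
            allocated += st.st_blocks * 512  # assumption: 512-byte units (Linux)
        return logical, allocated


if __name__ == "__main__":
    for length in (40, 200):
        print(length, symlink_cost(length))
```

Comparing a short and a long target on the same file system should show
whether short targets are stored without a separate data block.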

eirik

Thread overview: 6+ messages
2012-11-29 13:01 On disk tag storage format David Bremner
2012-11-29 19:34 ` Eirik Byrkjeflot Anonsen [this message]
2012-11-30  7:31   ` Tomi Ollila
2013-02-21  1:29 ` David Bremner
2013-10-05  1:28   ` Ethan Glasser-Camp
2013-10-07  4:49     ` Ethan Glasser-Camp
