From: Austin Clements <amdragon@MIT.EDU>
To: notmuch@notmuchmail.org
Subject: [PATCH v2] new: Don't scan unchanged directories with no sub-directories
Date: Thu, 24 Oct 2013 17:38:59 -0400
Message-ID: <1382650739-12438-1-git-send-email-amdragon@mit.edu>
In-Reply-To: <20131024210837.GH20337@mit.edu>
This can substantially reduce the cost of notmuch new in some
situations, such as when the file system cache is cold or when the
Maildir is on NFS.
---
This should fix the problem with directories containing symlinks to
other directories, but no actual sub-directories.
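For readers unfamiliar with the link-count trick this patch relies on, here
is a minimal standalone sketch of it. It is not part of the patch: the helper
names link_count_says_leaf and dir_looks_like_leaf are made up for
illustration, and only stat(2), S_ISDIR, and st_nlink are real interfaces.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: a directory's link count is 2 (the parent's
 * entry plus its own '.') plus one '..' link per sub-directory, so
 * only a count of exactly 2 suggests a leaf directory.  Counts of 0
 * or 1 (file systems that can't compute this) and inflated counts
 * all fall through to "may have sub-directories". */
bool
link_count_says_leaf (nlink_t nlink)
{
    return nlink == 2;
}

/* Hypothetical helper: stat PATH and apply the check above.  Returns
 * false on any error or if PATH is not a directory, so a caller
 * would fall back to a full scan in those cases. */
bool
dir_looks_like_leaf (const char *path)
{
    struct stat st;

    if (stat (path, &st) != 0 || ! S_ISDIR (st.st_mode))
        return false;
    return link_count_says_leaf (st.st_nlink);
}
```

Note that a symlink pointing at a directory adds no '..' entry, so it leaves
the link count at 2. That is why a link count of 2 alone is not enough and
the patch also asks the database whether it knows of any sub-directories.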
notmuch-new.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/notmuch-new.c b/notmuch-new.c
index faa33f1..ba05cb4 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -323,6 +323,35 @@ add_files (notmuch_database_t *notmuch,
     }
     db_mtime = directory ? notmuch_directory_get_mtime (directory) : 0;
 
+    /* If the directory is unchanged from our last scan and has no
+     * sub-directories, then return without scanning it at all.  In
+     * some situations, skipping the scan can substantially reduce the
+     * cost of notmuch new, especially since the huge numbers of files
+     * in Maildirs make scans expensive, but all files live in leaf
+     * directories.
+     *
+     * To check for sub-directories, we borrow a trick from find,
+     * kpathsea, and many other UNIX tools: a directory's link
+     * count is the number of sub-directories (specifically, their
+     * '..' entries) plus 2 (the link from the parent and the link for
+     * '.').  This check is safe even on weird file systems, since
+     * file systems that can't compute this will return 0 or 1.  This
+     * is safe even on *really* weird file systems like HFS+ that
+     * mistakenly return the total number of directory entries, since
+     * that only inflates the count beyond 2.
+     */
+    if (directory && fs_mtime == db_mtime && st.st_nlink == 2) {
+        /* There's one catch: pass 1 below considers symlinks to
+         * directories to be directories, but these don't increase the
+         * file system link count.  So, only bail early if the
+         * database agrees that there are no sub-directories. */
+        db_subdirs = notmuch_directory_get_child_directories (directory);
+        if (!notmuch_filenames_valid (db_subdirs))
+            goto DONE;
+        notmuch_filenames_destroy (db_subdirs);
+        db_subdirs = NULL;
+    }
+
     /* If the database knows about this directory, then we sort based
      * on strcmp to match the database sorting. Otherwise, we can do
      * inode-based sorting for faster filesystem operation. */
--
1.8.4.rc3
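
The early-exit condition in the hunk above can also be read as a small pure
function. The following is only a sketch under stated assumptions, not code
from notmuch: struct scan_state and can_skip_scan are hypothetical names, and
db_has_subdirs stands in for the result of checking
notmuch_directory_get_child_directories with notmuch_filenames_valid.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical summary of what is known about a directory before
 * deciding whether to scan it. */
struct scan_state {
    bool in_database;     /* the database has a record of this directory */
    time_t fs_mtime;      /* mtime reported by the file system */
    time_t db_mtime;      /* mtime stored at the last scan */
    nlink_t fs_nlink;     /* link count reported by the file system */
    bool db_has_subdirs;  /* the database lists child directories */
};

/* Skip the scan only when every signal agrees: the directory is
 * known, its mtime is unchanged, its link count is exactly 2, and
 * the database has no child directories recorded (which covers
 * symlinks to directories, since those don't raise the link count). */
bool
can_skip_scan (const struct scan_state *s)
{
    if (! s->in_database || s->fs_mtime != s->db_mtime || s->fs_nlink != 2)
        return false;
    return ! s->db_has_subdirs;
}
```

Any single signal that disagrees falls back to the full scan, so the check
can only skip work, never cause a changed directory to be missed on its own
account.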
Thread overview: 9+ messages
2013-10-24 20:33 [PATCH] new: Don't scan unchanged directories with no sub-directories Austin Clements
2013-10-24 21:08 ` Austin Clements
2013-10-24 21:38 ` Austin Clements [this message]
2013-10-25 11:46 ` [PATCH v2] " Tomi Ollila
2013-10-25 11:59 ` Vladimir Marek
2013-10-26 0:13 ` David Bremner
2013-10-26 11:52 ` David Bremner
2013-10-28 20:00 ` David Bremner
2013-10-28 20:46 ` Vladimir Marek