From mboxrd@z Thu Jan 1 00:00:00 1970
From: Austin Clements
To: notmuch@notmuchmail.org
Subject: [PATCH] new: Don't scan unchanged directories with no sub-directories
Date: Thu, 24 Oct 2013 16:33:42 -0400
Message-Id: <1382646822-24556-1-git-send-email-amdragon@mit.edu>
X-Mailer: git-send-email 1.8.4.rc3
List-Id: "Use and development of the notmuch mail system."

This can substantially reduce the cost of notmuch new in some
situations, such as when the file system cache is cold or when the
Maildir is on NFS.
---
 notmuch-new.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/notmuch-new.c b/notmuch-new.c
index faa33f1..364c73a 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -323,6 +323,26 @@ add_files (notmuch_database_t *notmuch,
     }
     db_mtime = directory ? notmuch_directory_get_mtime (directory) : 0;
 
+    /* If the directory is unchanged from our last scan and has no
+     * sub-directories, then return without scanning it at all. In
+     * some situations, skipping the scan can substantially reduce the
+     * cost of notmuch new, especially since the huge numbers of files
+     * in Maildirs make scans expensive, but all files live in leaf
+     * directories.
+     *
+     * To check for sub-directories, we borrow a trick from find,
+     * kpathsea, and many other UNIX tools: a directory's link count
+     * is the number of sub-directories (specifically, their '..'
+     * entries) plus 2 (the link from the parent and the link for
+     * '.'), so a count of exactly 2 means no sub-directories. This
+     * is safe even on weird file systems, since those that can't
+     * compute the count return 0 or 1, and even on *really* weird
+     * ones like HFS+ that mistakenly return the total number of
+     * directory entries, since that only inflates the count beyond 2.
+     */
+    if (directory && fs_mtime == db_mtime && st.st_nlink == 2)
+        goto DONE;
+
     /* If the database knows about this directory, then we sort based
      * on strcmp to match the database sorting. Otherwise, we can do
      * inode-based sorting for faster filesystem operation. */
-- 
1.8.4.rc3
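
For anyone who wants to try the link-count check outside of notmuch,
here is a minimal standalone sketch. The helper name is_leaf_directory
is hypothetical and is not part of notmuch or of this patch; it only
wraps stat(2) and the same st_nlink == 2 comparison used above. On file
systems that do not maintain directory link counts it is expected to
report "not a leaf", which is the safe direction (the directory simply
gets scanned).

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper, for illustration only (not a notmuch function).
 *
 * Returns 1 if PATH is a directory whose link count is exactly 2,
 * i.e. only the parent's entry and its own '.' link to it, so it has
 * no sub-directories. Returns 0 if the link count is anything else,
 * meaning the directory may have sub-directories or the file system
 * does not report a usable count. Returns -1 if PATH cannot be
 * stat'ed or is not a directory. */
static int
is_leaf_directory (const char *path)
{
    struct stat st;

    if (stat (path, &st) != 0 || ! S_ISDIR (st.st_mode))
        return -1;

    return st.st_nlink == 2;
}
```

A caller would still need its own "unchanged since the last scan" test
(the mtime comparison in the patch) before skipping a directory; the
link count alone says nothing about whether files were added or
removed.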