* [PATCH] new: read db_files and db_subdirs if mtime changed
@ 2011-02-04 21:44 Karel Zak
2011-02-27 8:45 ` Austin Clements
0 siblings, 1 reply; 3+ messages in thread
From: Karel Zak @ 2011-02-04 21:44 UTC (permalink / raw)
To: notmuch
The db_files and db_subdirs are unnecessary for unchanged directories.
maildir with 10000 e-mails:
old version:
$ time ./notmuch new
No new mail.
real 0m0.053s
user 0m0.028s
sys 0m0.026s
new version:
$ time ./notmuch new
No new mail.
real 0m0.032s
user 0m0.009s
sys 0m0.023s
Signed-off-by: Karel Zak <kzak@redhat.com>
---
notmuch-new.c | 15 ++++++---------
1 files changed, 6 insertions(+), 9 deletions(-)
diff --git a/notmuch-new.c b/notmuch-new.c
index 941f9d6..31d4553 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -247,15 +247,7 @@ add_files_recursive (notmuch_database_t *notmuch,
directory = notmuch_database_get_directory (notmuch, path);
db_mtime = notmuch_directory_get_mtime (directory);
- if (db_mtime == 0) {
- new_directory = TRUE;
- db_files = NULL;
- db_subdirs = NULL;
- } else {
- new_directory = FALSE;
- db_files = notmuch_directory_get_child_files (directory);
- db_subdirs = notmuch_directory_get_child_directories (directory);
- }
+ new_directory = db_mtime ? FALSE : TRUE;
/* If the database knows about this directory, then we sort based
* on strcmp to match the database sorting. Otherwise, we can do
@@ -328,6 +320,11 @@ add_files_recursive (notmuch_database_t *notmuch,
if (fs_mtime == db_mtime)
goto DONE;
+ if (!new_directory) {
+ db_files = notmuch_directory_get_child_files (directory);
+ db_subdirs = notmuch_directory_get_child_directories (directory);
+ }
+
/* Pass 2: Scan for new files, removed files, and removed directories. */
for (i = 0; i < num_fs_entries; i++)
{
--
1.7.3.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] new: read db_files and db_subdirs if mtime changed
2011-02-04 21:44 [PATCH] new: read db_files and db_subdirs if mtime changed Karel Zak
@ 2011-02-27 8:45 ` Austin Clements
2011-03-10 20:01 ` Carl Worth
0 siblings, 1 reply; 3+ messages in thread
From: Austin Clements @ 2011-02-27 8:45 UTC (permalink / raw)
To: Karel Zak; +Cc: notmuch
Looks good (faster than, but provably equivalent to the original code!
notmuch_directory_get_child_* are side-effect free,
db_files/db_subdirs aren't used between where they were set in the old
code and where they are set in the new code, and db_files/db_subdirs
are initialized to NULL when declared).
Another timing data point:
Old code: ./notmuch new 0.77s user 0.28s system 99% cpu 1.051 total
New code: ./notmuch new 0.09s user 0.27s system 98% cpu 0.368 total
I wonder if an even faster approach than the current recursive walk
would be to get *all* of the directory names and mtimes out of Xapian
in one pass and stat them all. If the mtime didn't change, then
there's no need to scandir that directory at all. This could even
beat the "time find >/dev/null" bound, but the gains may be too
marginal to make it worthwhile.
On Fri, Feb 4, 2011 at 4:44 PM, Karel Zak <kzak@redhat.com> wrote:
> The db_files and db_subdirs are unnecessary for unchanged directories.
>
> maildir with 10000 e-mails:
>
> old version:
> $ time ./notmuch new
> No new mail.
>
> real 0m0.053s
> user 0m0.028s
> sys 0m0.026s
>
> new version:
> $ time ./notmuch new
> No new mail.
>
> real 0m0.032s
> user 0m0.009s
> sys 0m0.023s
>
> Signed-off-by: Karel Zak <kzak@redhat.com>
> ---
> notmuch-new.c | 15 ++++++---------
> 1 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index 941f9d6..31d4553 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -247,15 +247,7 @@ add_files_recursive (notmuch_database_t *notmuch,
> directory = notmuch_database_get_directory (notmuch, path);
> db_mtime = notmuch_directory_get_mtime (directory);
>
> - if (db_mtime == 0) {
> - new_directory = TRUE;
> - db_files = NULL;
> - db_subdirs = NULL;
> - } else {
> - new_directory = FALSE;
> - db_files = notmuch_directory_get_child_files (directory);
> - db_subdirs = notmuch_directory_get_child_directories (directory);
> - }
> + new_directory = db_mtime ? FALSE : TRUE;
>
> /* If the database knows about this directory, then we sort based
> * on strcmp to match the database sorting. Otherwise, we can do
> @@ -328,6 +320,11 @@ add_files_recursive (notmuch_database_t *notmuch,
> if (fs_mtime == db_mtime)
> goto DONE;
>
> + if (!new_directory) {
> + db_files = notmuch_directory_get_child_files (directory);
> + db_subdirs = notmuch_directory_get_child_directories (directory);
> + }
> +
> /* Pass 2: Scan for new files, removed files, and removed directories. */
> for (i = 0; i < num_fs_entries; i++)
> {
> --
> 1.7.3.4
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] new: read db_files and db_subdirs if mtime changed
2011-02-27 8:45 ` Austin Clements
@ 2011-03-10 20:01 ` Carl Worth
0 siblings, 0 replies; 3+ messages in thread
From: Carl Worth @ 2011-03-10 20:01 UTC (permalink / raw)
To: Austin Clements, Karel Zak; +Cc: notmuch
[-- Attachment #1: Type: text/plain, Size: 1280 bytes --]
On Fri, 4 Feb 2011 22:44:31 +0100, Karel Zak <kzak@redhat.com> wrote:
> The db_files and db_subdirs are unnecessary for unchanged directories.
...
> old version: real 0m0.053s
> new version: real 0m0.032s
Thanks Karel! What a lovely optimization.
On Sun, 27 Feb 2011 03:45:05 -0500, Austin Clements <amdragon@mit.edu> wrote:
> Looks good (faster than, but provably equivalent to the original code!
> notmuch_directory_get_child_* are side-effect free,
> db_files/db_subdirs aren't used between where they were set in the old
> code and where they are set in the new code, and db_files/db_subdirs
> are initialized to NULL when declared).
And thank you, Austin for the careful review. This kind of thing is very
helpful for me in reviewing patches. When there's a message like this on
the mailing list it makes it very easy for me to trust the patch.
I've now pushed the original patch with an updated commit message to
include Austin's review comments.
I also followed up with a subsequent commit that updates the comments
for the add_files_recursive function to match the current
implementation, (reflecting both this change and a previous change to a
strict equality test for the mtime comparison).
-Carl
--
carl.d.worth@intel.com
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2011-03-10 20:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-04 21:44 [PATCH] new: read db_files and db_subdirs if mtime changed Karel Zak
2011-02-27 8:45 ` Austin Clements
2011-03-10 20:01 ` Carl Worth
Code repositories for project(s) associated with this public inbox
https://yhetil.org/notmuch.git/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).