unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH] new: read db_files and db_subdirs if mtime changed
@ 2011-02-04 21:44 Karel Zak
  2011-02-27  8:45 ` Austin Clements
  0 siblings, 1 reply; 3+ messages in thread
From: Karel Zak @ 2011-02-04 21:44 UTC (permalink / raw)
  To: notmuch

The db_files and db_subdirs are unnecessary for unchanged directories.

maildir with 10000 e-mails:

old version:
	$ time ./notmuch new
	No new mail.

	real    0m0.053s
	user    0m0.028s
	sys     0m0.026s

new version:
	$ time ./notmuch new
	No new mail.

	real    0m0.032s
	user    0m0.009s
	sys     0m0.023s

Signed-off-by: Karel Zak <kzak@redhat.com>
---
 notmuch-new.c |   15 ++++++---------
 1 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 941f9d6..31d4553 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -247,15 +247,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     directory = notmuch_database_get_directory (notmuch, path);
     db_mtime = notmuch_directory_get_mtime (directory);
 
-    if (db_mtime == 0) {
-	new_directory = TRUE;
-	db_files = NULL;
-	db_subdirs = NULL;
-    } else {
-	new_directory = FALSE;
-	db_files = notmuch_directory_get_child_files (directory);
-	db_subdirs = notmuch_directory_get_child_directories (directory);
-    }
+    new_directory = db_mtime ? FALSE : TRUE;
 
     /* If the database knows about this directory, then we sort based
      * on strcmp to match the database sorting. Otherwise, we can do
@@ -328,6 +320,11 @@ add_files_recursive (notmuch_database_t *notmuch,
     if (fs_mtime == db_mtime)
 	goto DONE;
 
+    if (!new_directory) {
+	db_files = notmuch_directory_get_child_files (directory);
+	db_subdirs = notmuch_directory_get_child_directories (directory);
+    }
+
     /* Pass 2: Scan for new files, removed files, and removed directories. */
     for (i = 0; i < num_fs_entries; i++)
     {
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] new: read db_files and db_subdirs if mtime changed
  2011-02-04 21:44 [PATCH] new: read db_files and db_subdirs if mtime changed Karel Zak
@ 2011-02-27  8:45 ` Austin Clements
  2011-03-10 20:01   ` Carl Worth
  0 siblings, 1 reply; 3+ messages in thread
From: Austin Clements @ 2011-02-27  8:45 UTC (permalink / raw)
  To: Karel Zak; +Cc: notmuch

Looks good (faster than, but provably equivalent to the original code!
 notmuch_directory_get_child_* are side-effect free,
db_files/db_subdirs aren't used between where they were set in the old
code and where they are set in the new code, and db_files/db_subdirs
are initialized to NULL when declared).

Another timing data point:
Old code: ./notmuch new  0.77s user 0.28s system 99% cpu 1.051 total
New code: ./notmuch new  0.09s user 0.27s system 98% cpu 0.368 total

I wonder if an even faster approach than the current recursive walk
would be to get *all* of the directory names and mtimes out of Xapian
in one pass and stat them all.  If the mtime didn't change, then
there's no need to scandir that directory at all.  This could even
beat the "time find >/dev/null" bound, but the gains may be too
marginal to make it worthwhile.

On Fri, Feb 4, 2011 at 4:44 PM, Karel Zak <kzak@redhat.com> wrote:
> The db_files and db_subdirs are unnecessary for unchanged directories.
>
> maildir with 10000 e-mails:
>
> old version:
>        $ time ./notmuch new
>        No new mail.
>
>        real    0m0.053s
>        user    0m0.028s
>        sys     0m0.026s
>
> new version:
>        $ time ./notmuch new
>        No new mail.
>
>        real    0m0.032s
>        user    0m0.009s
>        sys     0m0.023s
>
> Signed-off-by: Karel Zak <kzak@redhat.com>
> ---
>  notmuch-new.c |   15 ++++++---------
>  1 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index 941f9d6..31d4553 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -247,15 +247,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>     directory = notmuch_database_get_directory (notmuch, path);
>     db_mtime = notmuch_directory_get_mtime (directory);
>
> -    if (db_mtime == 0) {
> -       new_directory = TRUE;
> -       db_files = NULL;
> -       db_subdirs = NULL;
> -    } else {
> -       new_directory = FALSE;
> -       db_files = notmuch_directory_get_child_files (directory);
> -       db_subdirs = notmuch_directory_get_child_directories (directory);
> -    }
> +    new_directory = db_mtime ? FALSE : TRUE;
>
>     /* If the database knows about this directory, then we sort based
>      * on strcmp to match the database sorting. Otherwise, we can do
> @@ -328,6 +320,11 @@ add_files_recursive (notmuch_database_t *notmuch,
>     if (fs_mtime == db_mtime)
>        goto DONE;
>
> +    if (!new_directory) {
> +       db_files = notmuch_directory_get_child_files (directory);
> +       db_subdirs = notmuch_directory_get_child_directories (directory);
> +    }
> +
>     /* Pass 2: Scan for new files, removed files, and removed directories. */
>     for (i = 0; i < num_fs_entries; i++)
>     {
> --
> 1.7.3.4
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] new: read db_files and db_subdirs if mtime changed
  2011-02-27  8:45 ` Austin Clements
@ 2011-03-10 20:01   ` Carl Worth
  0 siblings, 0 replies; 3+ messages in thread
From: Carl Worth @ 2011-03-10 20:01 UTC (permalink / raw)
  To: Austin Clements, Karel Zak; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 1280 bytes --]

On Fri,  4 Feb 2011 22:44:31 +0100, Karel Zak <kzak@redhat.com> wrote:
> The db_files and db_subdirs are unnecessary for unchanged directories.
...
> old version: real    0m0.053s
> new version: real    0m0.032s

Thanks Karel! What a lovely optimization.

On Sun, 27 Feb 2011 03:45:05 -0500, Austin Clements <amdragon@mit.edu> wrote:
> Looks good (faster than, but provably equivalent to the original code!
>  notmuch_directory_get_child_* are side-effect free,
> db_files/db_subdirs aren't used between where they were set in the old
> code and where they are set in the new code, and db_files/db_subdirs
> are initialized to NULL when declared).

And thank you, Austin for the careful review. This kind of thing is very
helpful for me in reviewing patches. When there's a message like this on
the mailing list it makes it very easy for me to trust the patch.

I've now pushed the original patch with an updated commit message to
include Austin's review comments.

I also followed up with a subsequent commit that updates the comments
for the add_files_recursive function to match the current
implementation, (reflecting both this change and a previous change to a
strict equality test for the mtime comparison).

-Carl

-- 
carl.d.worth@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2011-03-10 20:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-04 21:44 [PATCH] new: read db_files and db_subdirs if mtime changed Karel Zak
2011-02-27  8:45 ` Austin Clements
2011-03-10 20:01   ` Carl Worth

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).