unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [RFC PATCH 0/5] cli: notmuch new scandir rework
@ 2016-04-15 19:29 Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 1/5] cli: remove leftover dir variable Jani Nikula
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Jani Nikula @ 2016-04-15 19:29 UTC (permalink / raw)
  To: notmuch

David gave some feedback to my old series to add globbing to new.ignore
[1]. One thing that annoyed me in that series was that we have to do the
ignore check in several places. I thought it would be nice to use the
filter parameter to scandir(3) but alas you can't pass a context
parameter there.

I ended up experimenting with that, and here's the result. Maybe this
makes it easier to read and understand the add_files() function, and
this centralizes the new.ignore handling, but is this really worth it?
I'm unsure.

Patch 1 is trivial and should be pushed no matter what.

BR,
Jani.


[1] id:871t6efgdt.fsf@maritornes.cs.unb.ca

Jani Nikula (5):
  cli: remove leftover dir variable
  cli: drop inode sort order on directories unknown to the database
  util: add a homebrew scandir implementation
  cli: use homebrew scandir in notmuch new add_files
  cli: convert count_files to new scandir

 notmuch-new.c       | 382 ++++++++++++++++++++++++++++------------------------
 util/Makefile.local |   2 +-
 util/scandir.c      |  87 ++++++++++++
 util/scandir.h      |  11 ++
 4 files changed, 305 insertions(+), 177 deletions(-)
 create mode 100644 util/scandir.c
 create mode 100644 util/scandir.h

-- 
2.1.4

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH 1/5] cli: remove leftover dir variable
  2016-04-15 19:29 [RFC PATCH 0/5] cli: notmuch new scandir rework Jani Nikula
@ 2016-04-15 19:29 ` Jani Nikula
  2016-05-02 11:23   ` David Bremner
  2016-04-15 19:29 ` [RFC PATCH 2/5] cli: drop inode sort order on directories unknown to the database Jani Nikula
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Jani Nikula @ 2016-04-15 19:29 UTC (permalink / raw)
  To: notmuch

No functional changes.
---
 notmuch-new.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 04cb5cac092a..2d975eb5b640 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -351,7 +351,6 @@ add_files (notmuch_database_t *notmuch,
 	   const char *path,
 	   add_files_state_t *state)
 {
-    DIR *dir = NULL;
     struct dirent *entry = NULL;
     char *next = NULL;
     time_t fs_mtime, db_mtime;
@@ -655,8 +654,6 @@ add_files (notmuch_database_t *notmuch,
   DONE:
     if (next)
 	talloc_free (next);
-    if (dir)
-	closedir (dir);
     if (fs_entries) {
 	for (i = 0; i < num_fs_entries; i++)
 	    free (fs_entries[i]);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 2/5] cli: drop inode sort order on directories unknown to the database
  2016-04-15 19:29 [RFC PATCH 0/5] cli: notmuch new scandir rework Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 1/5] cli: remove leftover dir variable Jani Nikula
@ 2016-04-15 19:29 ` Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 3/5] util: add a homebrew scandir implementation Jani Nikula
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Jani Nikula @ 2016-04-15 19:29 UTC (permalink / raw)
  To: notmuch

The claim is that inode sort order leads to faster filesystem
operation:

commit a45ff8c36112a2f17c1ad5c20a16c30a47759797
Author: Stewart Smith <stewart@flamingspork.com>
Date:   Wed Nov 18 12:56:40 2009 +1100

    Read mail directory in inode number order

The numbers cited seem convincing, but since then we've limited the
inode sorting to directories new to the database. Directories known to
the database are scanned in asciibetical order.

Making this change helps future work, and having it as a standalone
step makes it easier to evaluate the potential performance impact.
---
 notmuch-new.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 2d975eb5b640..930cbbc9b86f 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -151,12 +151,6 @@ generic_print_progress (const char *action, const char *object,
 }
 
 static int
-dirent_sort_inode (const struct dirent **a, const struct dirent **b)
-{
-    return ((*a)->d_ino < (*b)->d_ino) ? -1 : 1;
-}
-
-static int
 dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
 {
     return strcmp ((*a)->d_name, (*b)->d_name);
@@ -415,11 +409,10 @@ add_files (notmuch_database_t *notmuch,
     }
 
     /* If the database knows about this directory, then we sort based
-     * on strcmp to match the database sorting. Otherwise, we can do
-     * inode-based sorting for faster filesystem operation. */
+     * on strcmp to match the database sorting. */
     num_fs_entries = scandir (path, &fs_entries, 0,
 			      directory ?
-			      dirent_sort_strcmp_name : dirent_sort_inode);
+			      dirent_sort_strcmp_name : NULL);
 
     if (num_fs_entries == -1) {
 	fprintf (stderr, "Error opening directory %s: %s\n",
@@ -722,7 +715,7 @@ count_files (const char *path, int *count, add_files_state_t *state)
     struct dirent *entry = NULL;
     char *next;
     struct dirent **fs_entries = NULL;
-    int num_fs_entries = scandir (path, &fs_entries, 0, dirent_sort_inode);
+    int num_fs_entries = scandir (path, &fs_entries, 0, NULL);
     int entry_type, i;
 
     if (num_fs_entries == -1) {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 3/5] util: add a homebrew scandir implementation
  2016-04-15 19:29 [RFC PATCH 0/5] cli: notmuch new scandir rework Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 1/5] cli: remove leftover dir variable Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 2/5] cli: drop inode sort order on directories unknown to the database Jani Nikula
@ 2016-04-15 19:29 ` Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 4/5] cli: use homebrew scandir in notmuch new add_files Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 5/5] cli: convert count_files to new scandir Jani Nikula
  4 siblings, 0 replies; 7+ messages in thread
From: Jani Nikula @ 2016-04-15 19:29 UTC (permalink / raw)
  To: notmuch

Add support for a filter callback with a context parameter, propagate
errors from the filter callback, generate a list of filenames instead
of dirents.
---
 util/Makefile.local |  2 +-
 util/scandir.c      | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 util/scandir.h      | 11 +++++++
 3 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 util/scandir.c
 create mode 100644 util/scandir.h

diff --git a/util/Makefile.local b/util/Makefile.local
index 905f23763468..8893209f320e 100644
--- a/util/Makefile.local
+++ b/util/Makefile.local
@@ -5,7 +5,7 @@ extra_cflags += -I$(srcdir)/$(dir)
 
 libutil_c_srcs := $(dir)/xutil.c $(dir)/error_util.c $(dir)/hex-escape.c \
 		  $(dir)/string-util.c $(dir)/talloc-extra.c $(dir)/zlib-extra.c \
-		$(dir)/util.c
+		$(dir)/util.c $(dir)/scandir.c
 
 libutil_modules := $(libutil_c_srcs:.c=.o)
 
diff --git a/util/scandir.c b/util/scandir.c
new file mode 100644
index 000000000000..c69717724235
--- /dev/null
+++ b/util/scandir.c
@@ -0,0 +1,87 @@
+/* scandir.c - Dedicated scandir implementation.
+ *
+ * Copyright (c) 2016 Jani Nikula
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see http://www.gnu.org/licenses/ .
+ *
+ * Author: Jani Nikula <jani@nikula.org>
+ */
+
+#include "scandir.h"
+
+#include <dirent.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+int scandirx (const char *path, char ***namelist,
+	      int (*filter)(const struct dirent *dirent, void *context),
+	      int (*compar)(const void *a, const void *b),
+	      void *context)
+{
+    DIR *dir;
+    struct dirent *d;
+    char **array = NULL;
+    int i, count = 0, array_size = 0;
+    char *d_name;
+
+    dir = opendir (path);
+    if (!dir)
+	return -1;
+
+    while ((d = readdir (dir)) != NULL) {
+	if (filter) {
+	    int selected = filter (d, context);
+	    if (selected < 0)
+		goto err;
+	    else if (! selected)
+		continue;
+	}
+
+	if (count == array_size) {
+	    char **new_array;
+
+	    array_size = array_size ? 2 * array_size : 16;
+
+	    new_array = realloc (array, array_size * sizeof (*array));
+	    if (! new_array)
+		goto err;
+
+	    array = new_array;
+	}
+
+	d_name = strdup (d->d_name);
+	if (! d_name)
+	    goto err;
+
+	array[count++] = d_name;
+    }
+
+    closedir (dir);
+
+    if (compar)
+	qsort (array, count, sizeof (*array), compar);
+
+    *namelist = array;
+
+    return count;
+
+err:
+    for (i = 0; i < count; i++) {
+	free (array[i]);
+    }
+    free (array);
+
+    return -1;
+}
diff --git a/util/scandir.h b/util/scandir.h
new file mode 100644
index 000000000000..cc5ed95a8b1b
--- /dev/null
+++ b/util/scandir.h
@@ -0,0 +1,11 @@
+#ifndef _SCANDIR_H
+#define _SCANDIR_H
+
+#include <dirent.h>
+
+int scandirx (const char *path, char ***namelist,
+	      int (*filter)(const struct dirent *dirent, void *context),
+	      int (*compar)(const void *a, const void *b),
+	      void *context);
+
+#endif /* _SCANDIR_H */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 4/5] cli: use homebrew scandir in notmuch new add_files
  2016-04-15 19:29 [RFC PATCH 0/5] cli: notmuch new scandir rework Jani Nikula
                   ` (2 preceding siblings ...)
  2016-04-15 19:29 ` [RFC PATCH 3/5] util: add a homebrew scandir implementation Jani Nikula
@ 2016-04-15 19:29 ` Jani Nikula
  2016-04-15 19:29 ` [RFC PATCH 5/5] cli: convert count_files to new scandir Jani Nikula
  4 siblings, 0 replies; 7+ messages in thread
From: Jani Nikula @ 2016-04-15 19:29 UTC (permalink / raw)
  To: notmuch

Split the scanning to files and subdirectories. Filter out ignored
files etc. in the scandir filter callback.

The end result is perhaps slightly easier to follow than before and it
should be easier to add clever ignore mechanisms like globbing in the
scandir filter callback, but it's not at all certain if this is worth
the trouble.

Some tests will fail due to changes in where files are ignored and
what debug logging is done; I didn't bother updating the logging or
the tests much at this point.
---
 notmuch-new.c | 275 +++++++++++++++++++++++++++++++---------------------------
 1 file changed, 148 insertions(+), 127 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 930cbbc9b86f..262895d466ae 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -20,6 +20,7 @@
 
 #include "notmuch-client.h"
 #include "tag-util.h"
+#include "scandir.h"
 
 #include <unistd.h>
 
@@ -151,9 +152,12 @@ generic_print_progress (const char *action, const char *object,
 }
 
 static int
-dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
+dirent_sort_strcmp_name (const void *_a, const void *_b)
 {
-    return strcmp ((*a)->d_name, (*b)->d_name);
+    char * const *a = _a;
+    char * const *b = _b;
+
+    return strcmp (*a, *b);
 }
 
 /* Return the type of a directory entry relative to path as a stat(2)
@@ -206,18 +210,14 @@ dirent_type (const char *path, const struct dirent *entry)
  * Return 1 if the directory looks like a Maildir and 0 otherwise.
  */
 static int
-_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
+_entries_resemble_maildir (char **subdirs, int count)
 {
     int i, found = 0;
 
     for (i = 0; i < count; i++) {
-	if (dirent_type (path, entries[i]) != S_IFDIR)
-	    continue;
-
-	if (strcmp(entries[i]->d_name, "new") == 0 ||
-	    strcmp(entries[i]->d_name, "cur") == 0 ||
-	    strcmp(entries[i]->d_name, "tmp") == 0)
-	{
+	if (strcmp(subdirs[i], "new") == 0 ||
+	    strcmp(subdirs[i], "cur") == 0 ||
+	    strcmp(subdirs[i], "tmp") == 0) {
 	    found++;
 	    if (found == 3)
 		return 1;
@@ -241,6 +241,49 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
     return FALSE;
 }
 
+struct filter {
+    const char *path;
+    add_files_state_t *state;
+    int entry_type;
+};
+
+static int filter_fn (const struct dirent *entry, void *context)
+{
+    struct filter *filter = context;
+    int entry_type;
+
+    /* Ignore any files/directories the user has configured to
+     * ignore.  We do this before dirent_type both for performance
+     * and because we don't care if dirent_type fails on entries
+     * that are explicitly ignored.
+     */
+    if (_entry_in_ignore_list (entry->d_name, filter->state)) {
+	if (filter->state->debug)
+	    printf ("(D) add_files, pass 1: explicitly ignoring %s/%s\n",
+		    filter->path, entry->d_name);
+	return 0;
+    }
+
+    /* We only want to descend into directories (and symlinks to
+     * directories). */
+    entry_type = dirent_type (filter->path, entry);
+    if (entry_type == -1) {
+	/* Be pessimistic, e.g. so we don't lose lots of mail just
+	 * because a user broke a symlink. */
+	fprintf (stderr, "Error reading file %s/%s: %s\n",
+		 filter->path, entry->d_name, strerror (errno));
+	return -1;
+    }
+
+    if (entry_type != filter->entry_type)
+	return 0;
+
+    if (strcmp (entry->d_name, ".") == 0 || strcmp (entry->d_name, "..") == 0)
+	return 0;
+
+    return 1;
+}
+
 /* Add a single file to the database. */
 static notmuch_status_t
 add_file (notmuch_database_t *notmuch, const char *filename,
@@ -345,18 +388,22 @@ add_files (notmuch_database_t *notmuch,
 	   const char *path,
 	   add_files_state_t *state)
 {
-    struct dirent *entry = NULL;
     char *next = NULL;
     time_t fs_mtime, db_mtime;
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
-    struct dirent **fs_entries = NULL;
-    int i, num_fs_entries = 0, entry_type;
+    char **fs_files = NULL, **fs_subdirs = NULL;
+    int num_fs_files = 0, num_fs_subdirs = 0;
+    int i;
     notmuch_directory_t *directory;
     notmuch_filenames_t *db_files = NULL;
     notmuch_filenames_t *db_subdirs = NULL;
     time_t stat_time;
     struct stat st;
     notmuch_bool_t is_maildir;
+    struct filter filter = {
+	.path = path,
+	.state = state,
+    };
 
     if (stat (path, &st)) {
 	fprintf (stderr, "Error reading directory %s: %s\n",
@@ -410,11 +457,11 @@ add_files (notmuch_database_t *notmuch,
 
     /* If the database knows about this directory, then we sort based
      * on strcmp to match the database sorting. */
-    num_fs_entries = scandir (path, &fs_entries, 0,
-			      directory ?
-			      dirent_sort_strcmp_name : NULL);
-
-    if (num_fs_entries == -1) {
+    filter.entry_type = S_IFDIR;
+    num_fs_subdirs = scandirx (path, &fs_subdirs, filter_fn,
+			       directory ? dirent_sort_strcmp_name : NULL,
+			       &filter);
+    if (num_fs_subdirs == -1) {
 	fprintf (stderr, "Error opening directory %s: %s\n",
 		 path, strerror (errno));
 	/* We consider this a fatal error because, if a user moved a
@@ -426,50 +473,20 @@ add_files (notmuch_database_t *notmuch,
     }
 
     /* Pass 1: Recurse into all sub-directories. */
-    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
-
-    for (i = 0; i < num_fs_entries; i++) {
-	if (interrupted)
-	    break;
+    is_maildir = _entries_resemble_maildir (fs_subdirs, num_fs_subdirs);
 
-	entry = fs_entries[i];
+    for (i = 0; i < num_fs_subdirs && !interrupted; i++) {
+	const char *name = fs_subdirs[i];
 
-	/* Ignore any files/directories the user has configured to
-	 * ignore.  We do this before dirent_type both for performance
-	 * and because we don't care if dirent_type fails on entries
-	 * that are explicitly ignored.
+	/*
+	 * Ignore the .notmuch directory and any "tmp" directory that
+	 * appears within a maildir.
 	 */
-	if (_entry_in_ignore_list (entry->d_name, state)) {
-	    if (state->debug)
-		printf ("(D) add_files, pass 1: explicitly ignoring %s/%s\n",
-			path, entry->d_name);
+	if ((is_maildir && strcmp (name, "tmp") == 0) ||
+	    strcmp (name, ".notmuch") == 0)
 	    continue;
-	}
 
-	/* We only want to descend into directories (and symlinks to
-	 * directories). */
-	entry_type = dirent_type (path, entry);
-	if (entry_type == -1) {
-	    /* Be pessimistic, e.g. so we don't lose lots of mail just
-	     * because a user broke a symlink. */
-	    fprintf (stderr, "Error reading file %s/%s: %s\n",
-		     path, entry->d_name, strerror (errno));
-	    return NOTMUCH_STATUS_FILE_ERROR;
-	} else if (entry_type != S_IFDIR) {
-	    continue;
-	}
-
-	/* Ignore special directories to avoid infinite recursion.
-	 * Also ignore the .notmuch directory and any "tmp" directory
-	 * that appears within a maildir.
-	 */
-	if (strcmp (entry->d_name, ".") == 0 ||
-	    strcmp (entry->d_name, "..") == 0 ||
-	    (is_maildir && strcmp (entry->d_name, "tmp") == 0) ||
-	    strcmp (entry->d_name, ".notmuch") == 0)
-	    continue;
-
-	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
+	next = talloc_asprintf (notmuch, "%s/%s", path, name);
 	status = add_files (notmuch, next, state);
 	if (status) {
 	    ret = status;
@@ -498,27 +515,71 @@ add_files (notmuch_database_t *notmuch,
 	db_subdirs = notmuch_directory_get_child_directories (directory);
     }
 
-    /* Pass 2: Scan for new files, removed files, and removed directories. */
-    for (i = 0; i < num_fs_entries; i++)
-    {
-	if (interrupted)
-	    break;
+    /* Pass 1½: Scan for removed directories. */
+    for (i = 0; i < num_fs_subdirs && !interrupted; i++) {
+	const char *name = fs_subdirs[i];
 
-        entry = fs_entries[i];
+	/* Check if we've walked past any names in db_subdirs. If so,
+	 * these have been deleted. */
+	while (notmuch_filenames_valid (db_subdirs) &&
+	       strcmp (notmuch_filenames_get (db_subdirs), name) < 0) {
+	    char *absolute = talloc_asprintf (state->removed_directories,
+					      "%s/%s", path,
+					      notmuch_filenames_get (db_subdirs));
 
-	/* Ignore files & directories user has configured to be ignored */
-	if (_entry_in_ignore_list (entry->d_name, state)) {
 	    if (state->debug)
-		printf ("(D) add_files, pass 2: explicitly ignoring %s/%s\n",
-			path, entry->d_name);
-	    continue;
+		printf ("(D) add_files, pass 2: queuing passed directory %s for deletion from database\n",
+			absolute);
+
+	    _filename_list_add (state->removed_directories, absolute);
+
+	    notmuch_filenames_move_to_next (db_subdirs);
 	}
 
-	/* Check if we've walked past any names in db_files or
-	 * db_subdirs. If so, these have been deleted. */
+	if (notmuch_filenames_valid (db_subdirs) &&
+	    strcmp (notmuch_filenames_get (db_subdirs), name) == 0)
+	    notmuch_filenames_move_to_next (db_subdirs);
+    }
+
+    /* Now that we've walked the whole filesystem subdir list, subdirs
+     * left over in the database list have been deleted. */
+    while (notmuch_filenames_valid (db_subdirs)) {
+	char *absolute = talloc_asprintf (state->removed_directories,
+					  "%s/%s", path,
+					  notmuch_filenames_get (db_subdirs));
+
+	if (state->debug)
+	    printf ("(D) add_files, pass 3: queuing leftover directory %s for deletion from database\n",
+		    absolute);
+
+	_filename_list_add (state->removed_directories, absolute);
+
+	notmuch_filenames_move_to_next (db_subdirs);
+    }
+
+    /* Pass 2: Scan for new and removed files. */
+    filter.entry_type = S_IFREG;
+    num_fs_files = scandirx (path, &fs_files, filter_fn,
+			     directory ? dirent_sort_strcmp_name : NULL,
+			     &filter);
+    if (num_fs_files == -1) {
+	fprintf (stderr, "Error opening directory %s: %s\n",
+		 path, strerror (errno));
+	/* We consider this a fatal error because, if a user moved a
+	 * message from another directory that we were able to scan
+	 * into this directory, skipping this directory will cause
+	 * that message to be lost. */
+	ret = NOTMUCH_STATUS_FILE_ERROR;
+	goto DONE;
+    }
+
+    for (i = 0; i < num_fs_files && !interrupted; i++) {
+	const char *name = fs_files[i];
+
+	/* Check if we've walked past any names in db_files. If so,
+	 * these have been deleted. */
 	while (notmuch_filenames_valid (db_files) &&
-	       strcmp (notmuch_filenames_get (db_files), entry->d_name) < 0)
-	{
+	       strcmp (notmuch_filenames_get (db_files), name) < 0) {
 	    char *absolute = talloc_asprintf (state->removed_files,
 					      "%s/%s", path,
 					      notmuch_filenames_get (db_files));
@@ -532,46 +593,16 @@ add_files (notmuch_database_t *notmuch,
 	    notmuch_filenames_move_to_next (db_files);
 	}
 
-	while (notmuch_filenames_valid (db_subdirs) &&
-	       strcmp (notmuch_filenames_get (db_subdirs), entry->d_name) <= 0)
-	{
-	    const char *filename = notmuch_filenames_get (db_subdirs);
-
-	    if (strcmp (filename, entry->d_name) < 0)
-	    {
-		char *absolute = talloc_asprintf (state->removed_directories,
-						  "%s/%s", path, filename);
-		if (state->debug)
-		    printf ("(D) add_files, pass 2: queuing passed directory %s for deletion from database\n",
-			absolute);
-
-		_filename_list_add (state->removed_directories, absolute);
-	    }
-
-	    notmuch_filenames_move_to_next (db_subdirs);
-	}
-
-	/* Only add regular files (and symlinks to regular files). */
-	entry_type = dirent_type (path, entry);
-	if (entry_type == -1) {
-	    fprintf (stderr, "Error reading file %s/%s: %s\n",
-		     path, entry->d_name, strerror (errno));
-	    return NOTMUCH_STATUS_FILE_ERROR;
-	} else if (entry_type != S_IFREG) {
-	    continue;
-	}
-
 	/* Don't add a file that we've added before. */
 	if (notmuch_filenames_valid (db_files) &&
-	    strcmp (notmuch_filenames_get (db_files), entry->d_name) == 0)
-	{
+	    strcmp (notmuch_filenames_get (db_files), name) == 0) {
 	    notmuch_filenames_move_to_next (db_files);
 	    continue;
 	}
 
 	/* We're now looking at a regular file that doesn't yet exist
 	 * in the database, so add it. */
-	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
+	next = talloc_asprintf (notmuch, "%s/%s", path, name);
 
 	state->processed_files++;
 
@@ -605,10 +636,9 @@ add_files (notmuch_database_t *notmuch,
     if (interrupted)
 	goto DONE;
 
-    /* Now that we've walked the whole filesystem list, anything left
-     * over in the database lists has been deleted. */
-    while (notmuch_filenames_valid (db_files))
-    {
+    /* Now that we've walked the whole filesystem file list, files
+     * left over in the database list have been deleted. */
+    while (notmuch_filenames_valid (db_files)) {
 	char *absolute = talloc_asprintf (state->removed_files,
 					  "%s/%s", path,
 					  notmuch_filenames_get (db_files));
@@ -621,21 +651,6 @@ add_files (notmuch_database_t *notmuch,
 	notmuch_filenames_move_to_next (db_files);
     }
 
-    while (notmuch_filenames_valid (db_subdirs))
-    {
-	char *absolute = talloc_asprintf (state->removed_directories,
-					  "%s/%s", path,
-					  notmuch_filenames_get (db_subdirs));
-
-	if (state->debug)
-	    printf ("(D) add_files, pass 3: queuing leftover directory %s for deletion from database\n",
-		    absolute);
-
-	_filename_list_add (state->removed_directories, absolute);
-
-	notmuch_filenames_move_to_next (db_subdirs);
-    }
-
     /* If the directory's mtime is the same as the wall-clock time
      * when we stat'ed the directory, we skip updating the mtime in
      * the database because a message could be delivered later in this
@@ -647,11 +662,17 @@ add_files (notmuch_database_t *notmuch,
   DONE:
     if (next)
 	talloc_free (next);
-    if (fs_entries) {
-	for (i = 0; i < num_fs_entries; i++)
-	    free (fs_entries[i]);
+    if (fs_subdirs) {
+	for (i = 0; i < num_fs_subdirs; i++)
+	    free (fs_subdirs[i]);
+
+	free (fs_subdirs);
+    }
+    if (fs_files) {
+	for (i = 0; i < num_fs_files; i++)
+	    free (fs_files[i]);
 
-	free (fs_entries);
+	free (fs_files);
     }
     if (db_subdirs)
 	notmuch_filenames_destroy (db_subdirs);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 5/5] cli: convert count_files to new scandir
  2016-04-15 19:29 [RFC PATCH 0/5] cli: notmuch new scandir rework Jani Nikula
                   ` (3 preceding siblings ...)
  2016-04-15 19:29 ` [RFC PATCH 4/5] cli: use homebrew scandir in notmuch new add_files Jani Nikula
@ 2016-04-15 19:29 ` Jani Nikula
  4 siblings, 0 replies; 7+ messages in thread
From: Jani Nikula @ 2016-04-15 19:29 UTC (permalink / raw)
  To: notmuch

Makes sense to use similar code in both places.
---
 notmuch-new.c | 97 +++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 58 insertions(+), 39 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 262895d466ae..966b89ca39a0 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -733,64 +733,83 @@ stop_progress_printing_timer (void)
 static void
 count_files (const char *path, int *count, add_files_state_t *state)
 {
-    struct dirent *entry = NULL;
-    char *next;
-    struct dirent **fs_entries = NULL;
-    int num_fs_entries = scandir (path, &fs_entries, 0, NULL);
-    int entry_type, i;
+    char **fs_files = NULL, **fs_subdirs = NULL;
+    int num_fs_files = 0, num_fs_subdirs = 0;
+    int i;
+    notmuch_bool_t is_maildir;
+    struct filter filter = {
+	.path = path,
+	.state = state,
+    };
 
-    if (num_fs_entries == -1) {
+    filter.entry_type = S_IFDIR;
+    num_fs_subdirs = scandirx (path, &fs_subdirs, filter_fn, NULL, &filter);
+    if (num_fs_subdirs == -1) {
 	fprintf (stderr, "Warning: failed to open directory %s: %s\n",
 		 path, strerror (errno));
 	goto DONE;
     }
 
-    for (i = 0; i < num_fs_entries && ! interrupted; i++) {
-        entry = fs_entries[i];
+    is_maildir = _entries_resemble_maildir (fs_subdirs, num_fs_subdirs);
+
+    /* Recurse to subdirs. */
+    for (i = 0; i < num_fs_subdirs && !interrupted; i++) {
+	const char *name = fs_subdirs[i];
+	char *next;
 
-	/* Ignore special directories to avoid infinite recursion.
-	 * Also ignore the .notmuch directory and files/directories
-	 * the user has configured to be ignored.
+	/*
+	 * Ignore the .notmuch directory and any "tmp" directory that
+	 * appears within a maildir.
 	 */
-	if (strcmp (entry->d_name, ".") == 0 ||
-	    strcmp (entry->d_name, "..") == 0 ||
-	    strcmp (entry->d_name, ".notmuch") == 0 ||
-	    _entry_in_ignore_list (entry->d_name, state))
-	{
-	    if (state->debug && _entry_in_ignore_list (entry->d_name, state))
-		printf ("(D) count_files: explicitly ignoring %s/%s\n",
-			path,
-			entry->d_name);
+	if ((is_maildir && strcmp (name, "tmp") == 0) ||
+	    strcmp (name, ".notmuch") == 0)
 	    continue;
-	}
 
-	if (asprintf (&next, "%s/%s", path, entry->d_name) == -1) {
-	    next = NULL;
+	next = talloc_asprintf (NULL, "%s/%s", path, name);
+	if (!next) {
 	    fprintf (stderr, "Error descending from %s to %s: Out of memory\n",
-		     path, entry->d_name);
+		     path, name);
 	    continue;
 	}
+	count_files (next, count, state);
+	talloc_free (next);
+	next = NULL;
+    }
 
-	entry_type = dirent_type (path, entry);
-	if (entry_type == S_IFREG) {
-	    *count = *count + 1;
-	    if (*count % 1000 == 0 && state->verbosity >= VERBOSITY_NORMAL) {
-		printf ("Found %d files so far.\r", *count);
-		fflush (stdout);
-	    }
-	} else if (entry_type == S_IFDIR) {
-	    count_files (next, count, state);
-	}
+    if (interrupted)
+	goto DONE;
+
+    filter.entry_type = S_IFREG;
+    num_fs_files = scandirx (path, &fs_files, filter_fn, NULL, &filter);
+    if (num_fs_files == -1) {
+	fprintf (stderr, "Warning: failed to open directory %s: %s\n",
+		 path, strerror (errno));
+	goto DONE;
+    }
+
+    if (state->verbosity >= VERBOSITY_NORMAL) {
+	int new_count = *count + num_fs_files;
 
-	free (next);
+	if (new_count / 1000 > *count / 1000) {
+	    printf ("Found %d files so far.\r", new_count / 1000 * 1000);
+	    fflush (stdout);
+	}
     }
 
+    *count += num_fs_files;
+
   DONE:
-    if (fs_entries) {
-	for (i = 0; i < num_fs_entries; i++)
-	    free (fs_entries[i]);
+    if (fs_subdirs) {
+	for (i = 0; i < num_fs_subdirs; i++)
+	    free (fs_subdirs[i]);
+
+	free (fs_subdirs);
+    }
+    if (fs_files) {
+	for (i = 0; i < num_fs_files; i++)
+	    free (fs_files[i]);
 
-        free (fs_entries);
+	free (fs_files);
     }
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/5] cli: remove leftover dir variable
  2016-04-15 19:29 ` [RFC PATCH 1/5] cli: remove leftover dir variable Jani Nikula
@ 2016-05-02 11:23   ` David Bremner
  0 siblings, 0 replies; 7+ messages in thread
From: David Bremner @ 2016-05-02 11:23 UTC (permalink / raw)
  To: Jani Nikula, notmuch

Jani Nikula <jani@nikula.org> writes:

> No functional changes.

pushed this one patch.

d

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-05-02 11:24 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-15 19:29 [RFC PATCH 0/5] cli: notmuch new scandir rework Jani Nikula
2016-04-15 19:29 ` [RFC PATCH 1/5] cli: remove leftover dir variable Jani Nikula
2016-05-02 11:23   ` David Bremner
2016-04-15 19:29 ` [RFC PATCH 2/5] cli: drop inode sort order on directories unknown to the database Jani Nikula
2016-04-15 19:29 ` [RFC PATCH 3/5] util: add a homebrew scandir implementation Jani Nikula
2016-04-15 19:29 ` [RFC PATCH 4/5] cli: use homebrew scandir in notmuch new add_files Jani Nikula
2016-04-15 19:29 ` [RFC PATCH 5/5] cli: convert count_files to new scandir Jani Nikula

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).