unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Better, more portable stat-ing logic for notmuch new
@ 2012-05-07 18:09 Austin Clements
  2012-05-07 18:09 ` [PATCH 1/2] test: Test notmuch new with a broken symlink Austin Clements
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-07 18:09 UTC (permalink / raw)
  To: notmuch; +Cc: Vladimir Marek

This fully implements my suggestion from
id:"20120411193609.GC13549@mit.edu".  Originally this was just meant
to make the code more portable (since d_type is  not portable), but it
wound up also simplifying some of the notmuch new logic.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] test: Test notmuch new with a broken symlink
  2012-05-07 18:09 [PATCH 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
@ 2012-05-07 18:09 ` Austin Clements
  2012-05-07 18:09 ` [PATCH 2/2] new: Centralize file type stat-ing logic Austin Clements
  2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-07 18:09 UTC (permalink / raw)
  To: notmuch; +Cc: Vladimir Marek

---
 test/new |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/test/new b/test/new
index 99f9913..26253db 100755
--- a/test/new
+++ b/test/new
@@ -136,6 +136,16 @@ output=$(NOTMUCH_NEW)
 test_expect_equal "$output" "Added 1 new message to the database."
 
 
+test_begin_subtest "Broken symlink aborts"
+ln -s does-not-exist "${MAIL_DIR}/broken"
+output=$(NOTMUCH_NEW 2>&1)
+test_expect_equal "$output" \
+"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file or directory
+Note: A fatal error was encountered: Something went wrong trying to read or write a file
+No new mail."
+rm "${MAIL_DIR}/broken"
+
+
 test_begin_subtest "New two-level directory"
 
 generate_message [dir]=two/levels
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] new: Centralize file type stat-ing logic
  2012-05-07 18:09 [PATCH 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-07 18:09 ` [PATCH 1/2] test: Test notmuch new with a broken symlink Austin Clements
@ 2012-05-07 18:09 ` Austin Clements
  2012-05-07 21:08   ` Jani Nikula
  2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2 siblings, 1 reply; 29+ messages in thread
From: Austin Clements @ 2012-05-07 18:09 UTC (permalink / raw)
  To: notmuch; +Cc: Vladimir Marek

This moves our logic to get a file's type into one function.  This has
several benefits: we can support OSes and file systems that do not
provide dirent.d_type or always return DT_UNKNOWN, complex
symlink-handling logic has been replaced by a simple stat fall-through
in one place, and the error message for un-stat-able file is more
accurate (previously, the error always mentioned directories, even
though a broken symlink is not a directory).
---
 notmuch-new.c |   99 ++++++++++++++++++++++++++++++++++-----------------------
 test/new      |    2 +-
 2 files changed, 60 insertions(+), 41 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index cb720cc..cf2580e 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -154,6 +154,44 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
     return strcmp ((*a)->d_name, (*b)->d_name);
 }
 
+/* Return the type of a directory entry relative to path as a stat(2)
+ * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
+ * if the file's type cannot be determined (which includes dangling
+ * symlinks).
+ */
+static int
+dirent_type (const char *path, const struct dirent *entry)
+{
+    struct stat statbuf;
+    char *abspath;
+    int err;
+
+#ifdef _DIRENT_HAVE_D_TYPE
+    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
+     * we'll fall through to stat and get the real file type. */
+    static const mode_t modes[] = {
+	[DT_BLK]  = S_IFBLK,
+	[DT_CHR]  = S_IFCHR,
+	[DT_DIR]  = S_IFDIR,
+	[DT_FIFO] = S_IFIFO,
+	[DT_REG]  = S_IFREG,
+	[DT_SOCK] = S_IFSOCK
+    };
+    if (entry->d_type < sizeof(modes)/sizeof(modes[0]) &&
+	modes[entry->d_type])
+	return modes[entry->d_type];
+#endif
+
+    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
+    if (!abspath)
+	return -1;
+    err = stat(abspath, &statbuf);
+    talloc_free (abspath);
+    if (err < 0)
+	return -1;
+    return statbuf.st_mode & S_IFMT;
+}
+
 /* Test if the directory looks like a Maildir directory.
  *
  * Search through the array of directory entries to see if we can find all
@@ -162,12 +200,12 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
  * Return 1 if the directory looks like a Maildir and 0 otherwise.
  */
 static int
-_entries_resemble_maildir (struct dirent **entries, int count)
+_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
 {
     int i, found = 0;
 
     for (i = 0; i < count; i++) {
-	if (entries[i]->d_type != DT_DIR && entries[i]->d_type != DT_UNKNOWN)
+	if (dirent_type (path, entries[i]) != S_IFDIR)
 	    continue;
 
 	if (strcmp(entries[i]->d_name, "new") == 0 ||
@@ -250,7 +288,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
     notmuch_message_t *message = NULL;
     struct dirent **fs_entries = NULL;
-    int i, num_fs_entries;
+    int i, num_fs_entries, entry_type;
     notmuch_directory_t *directory;
     notmuch_filenames_t *db_files = NULL;
     notmuch_filenames_t *db_subdirs = NULL;
@@ -317,7 +355,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     }
 
     /* Pass 1: Recurse into all sub-directories. */
-    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
+    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
 
     for (i = 0; i < num_fs_entries; i++) {
 	if (interrupted)
@@ -325,17 +363,16 @@ add_files_recursive (notmuch_database_t *notmuch,
 
 	entry = fs_entries[i];
 
-	/* We only want to descend into directories.
-	 * But symlinks can be to directories too, of course.
-	 *
-	 * And if the filesystem doesn't tell us the file type in the
-	 * scandir results, then it might be a directory (and if not,
-	 * then we'll stat and return immediately in the next level of
-	 * recursion). */
-	if (entry->d_type != DT_DIR &&
-	    entry->d_type != DT_LNK &&
-	    entry->d_type != DT_UNKNOWN)
-	{
+	/* We only want to descend into directories (and symlinks to
+	 * directories). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    /* Be pessimistic, e.g. so we don't loose lots of mail
+	     * just because a user broke a symlink. */
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFDIR) {
 	    continue;
 	}
 
@@ -425,31 +462,13 @@ add_files_recursive (notmuch_database_t *notmuch,
 	    notmuch_filenames_move_to_next (db_subdirs);
 	}
 
-	/* If we're looking at a symlink, we only want to add it if it
-	 * links to a regular file, (and not to a directory, say).
-	 *
-	 * Similarly, if the file is of unknown type (due to filesystem
-	 * limitations), then we also need to look closer.
-	 *
-	 * In either case, a stat does the trick.
-	 */
-	if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
-	    int err;
-
-	    next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
-	    err = stat (next, &st);
-	    talloc_free (next);
-	    next = NULL;
-
-	    /* Don't emit an error for a link pointing nowhere, since
-	     * the directory-traversal pass will have already done
-	     * that. */
-	    if (err)
-		continue;
-
-	    if (! S_ISREG (st.st_mode))
-		continue;
-	} else if (entry->d_type != DT_REG) {
+	/* Only add regular files (and symlinks to regular files). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFREG) {
 	    continue;
 	}
 
diff --git a/test/new b/test/new
index 26253db..e3900f5 100755
--- a/test/new
+++ b/test/new
@@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
 ln -s does-not-exist "${MAIL_DIR}/broken"
 output=$(NOTMUCH_NEW 2>&1)
 test_expect_equal "$output" \
-"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file or directory
+"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or directory
 Note: A fatal error was encountered: Something went wrong trying to read or write a file
 No new mail."
 rm "${MAIL_DIR}/broken"
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] new: Centralize file type stat-ing logic
  2012-05-07 18:09 ` [PATCH 2/2] new: Centralize file type stat-ing logic Austin Clements
@ 2012-05-07 21:08   ` Jani Nikula
  2012-05-07 21:14     ` Jani Nikula
  2012-05-07 22:18     ` Austin Clements
  0 siblings, 2 replies; 29+ messages in thread
From: Jani Nikula @ 2012-05-07 21:08 UTC (permalink / raw)
  To: Austin Clements; +Cc: Notmuch Mail, Vladimir Marek

[-- Attachment #1: Type: text/plain, Size: 7538 bytes --]

On May 7, 2012 9:10 PM, "Austin Clements" <amdragon@mit.edu> wrote:
>
> This moves our logic to get a file's type into one function.  This has
> several benefits: we can support OSes and file systems that do not
> provide dirent.d_type or always return DT_UNKNOWN, complex
> symlink-handling logic has been replaced by a simple stat fall-through
> in one place, and the error message for un-stat-able file is more
> accurate (previously, the error always mentioned directories, even
> though a broken symlink is not a directory).

Please find some quick drive-by-review below.

J.

> ---
>  notmuch-new.c |   99
++++++++++++++++++++++++++++++++++-----------------------
>  test/new      |    2 +-
>  2 files changed, 60 insertions(+), 41 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index cb720cc..cf2580e 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -154,6 +154,44 @@ dirent_sort_strcmp_name (const struct dirent **a,
const struct dirent **b)
>     return strcmp ((*a)->d_name, (*b)->d_name);
>  }
>
> +/* Return the type of a directory entry relative to path as a stat(2)
> + * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
> + * if the file's type cannot be determined (which includes dangling
> + * symlinks).
> + */
> +static int
> +dirent_type (const char *path, const struct dirent *entry)
> +{
> +    struct stat statbuf;
> +    char *abspath;
> +    int err;
> +
> +#ifdef _DIRENT_HAVE_D_TYPE
> +    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
> +     * we'll fall through to stat and get the real file type. */
> +    static const mode_t modes[] = {
> +       [DT_BLK]  = S_IFBLK,
> +       [DT_CHR]  = S_IFCHR,
> +       [DT_DIR]  = S_IFDIR,
> +       [DT_FIFO] = S_IFIFO,
> +       [DT_REG]  = S_IFREG,
> +       [DT_SOCK] = S_IFSOCK
> +    };
> +    if (entry->d_type < sizeof(modes)/sizeof(modes[0]) &&

ARRAY_SIZE()

> +       modes[entry->d_type])
> +       return modes[entry->d_type];
> +#endif
> +
> +    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
> +    if (!abspath)
> +       return -1;

Does talloc set errno in this case? I suspect not.

> +    err = stat(abspath, &statbuf);
> +    talloc_free (abspath);

This likely breaks your promise about errno. You can't trust talloc_free()
not calling some function that sets errno.

> +    if (err < 0)
> +       return -1;
> +    return statbuf.st_mode & S_IFMT;
> +}
> +
>  /* Test if the directory looks like a Maildir directory.
>  *
>  * Search through the array of directory entries to see if we can find all
> @@ -162,12 +200,12 @@ dirent_sort_strcmp_name (const struct dirent **a,
const struct dirent **b)
>  * Return 1 if the directory looks like a Maildir and 0 otherwise.
>  */
>  static int
> -_entries_resemble_maildir (struct dirent **entries, int count)
> +_entries_resemble_maildir (const char *path, struct dirent **entries,
int count)
>  {
>     int i, found = 0;
>
>     for (i = 0; i < count; i++) {
> -       if (entries[i]->d_type != DT_DIR && entries[i]->d_type !=
DT_UNKNOWN)
> +       if (dirent_type (path, entries[i]) != S_IFDIR)
>            continue;
>
>        if (strcmp(entries[i]->d_name, "new") == 0 ||
> @@ -250,7 +288,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
>     notmuch_message_t *message = NULL;
>     struct dirent **fs_entries = NULL;
> -    int i, num_fs_entries;
> +    int i, num_fs_entries, entry_type;
>     notmuch_directory_t *directory;
>     notmuch_filenames_t *db_files = NULL;
>     notmuch_filenames_t *db_subdirs = NULL;
> @@ -317,7 +355,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>     }
>
>     /* Pass 1: Recurse into all sub-directories. */
> -    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
> +    is_maildir = _entries_resemble_maildir (path, fs_entries,
num_fs_entries);
>
>     for (i = 0; i < num_fs_entries; i++) {
>        if (interrupted)
> @@ -325,17 +363,16 @@ add_files_recursive (notmuch_database_t *notmuch,
>
>        entry = fs_entries[i];
>
> -       /* We only want to descend into directories.
> -        * But symlinks can be to directories too, of course.
> -        *
> -        * And if the filesystem doesn't tell us the file type in the
> -        * scandir results, then it might be a directory (and if not,
> -        * then we'll stat and return immediately in the next level of
> -        * recursion). */
> -       if (entry->d_type != DT_DIR &&
> -           entry->d_type != DT_LNK &&
> -           entry->d_type != DT_UNKNOWN)
> -       {
> +       /* We only want to descend into directories (and symlinks to
> +        * directories). */
> +       entry_type = dirent_type (path, entry);
> +       if (entry_type == -1) {
> +           /* Be pessimistic, e.g. so we don't loose lots of mail

s/loose/lose/ ?

> +            * just because a user broke a symlink. */
> +           fprintf (stderr, "Error reading file %s/%s: %s\n",
> +                    path, entry->d_name, strerror (errno));

You can't trust errno here, as explained above.

> +           return NOTMUCH_STATUS_FILE_ERROR;
> +       } else if (entry_type != S_IFDIR) {
>            continue;
>        }
>
> @@ -425,31 +462,13 @@ add_files_recursive (notmuch_database_t *notmuch,
>            notmuch_filenames_move_to_next (db_subdirs);
>        }
>
> -       /* If we're looking at a symlink, we only want to add it if it
> -        * links to a regular file, (and not to a directory, say).
> -        *
> -        * Similarly, if the file is of unknown type (due to filesystem
> -        * limitations), then we also need to look closer.
> -        *
> -        * In either case, a stat does the trick.
> -        */
> -       if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
> -           int err;
> -
> -           next = talloc_asprintf (notmuch, "%s/%s", path,
entry->d_name);
> -           err = stat (next, &st);
> -           talloc_free (next);
> -           next = NULL;
> -
> -           /* Don't emit an error for a link pointing nowhere, since
> -            * the directory-traversal pass will have already done
> -            * that. */
> -           if (err)
> -               continue;
> -
> -           if (! S_ISREG (st.st_mode))
> -               continue;
> -       } else if (entry->d_type != DT_REG) {
> +       /* Only add regular files (and symlinks to regular files). */
> +       entry_type = dirent_type (path, entry);
> +       if (entry_type == -1) {
> +           fprintf (stderr, "Error reading file %s/%s: %s\n",
> +                    path, entry->d_name, strerror (errno));

Ditto.

> +           return NOTMUCH_STATUS_FILE_ERROR;
> +       } else if (entry_type != S_IFREG) {
>            continue;
>        }
>
> diff --git a/test/new b/test/new
> index 26253db..e3900f5 100755
> --- a/test/new
> +++ b/test/new
> @@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
>  ln -s does-not-exist "${MAIL_DIR}/broken"
>  output=$(NOTMUCH_NEW 2>&1)
>  test_expect_equal "$output" \
> -"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file
or directory
> +"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or
directory
>  Note: A fatal error was encountered: Something went wrong trying to read
or write a file
>  No new mail."
>  rm "${MAIL_DIR}/broken"
> --
> 1.7.10
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

[-- Attachment #2: Type: text/html, Size: 9907 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] new: Centralize file type stat-ing logic
  2012-05-07 21:08   ` Jani Nikula
@ 2012-05-07 21:14     ` Jani Nikula
  2012-05-07 22:18     ` Austin Clements
  1 sibling, 0 replies; 29+ messages in thread
From: Jani Nikula @ 2012-05-07 21:14 UTC (permalink / raw)
  To: Austin Clements; +Cc: Notmuch Mail, Vladimir Marek

[-- Attachment #1: Type: text/plain, Size: 8067 bytes --]

On May 8, 2012 12:08 AM, "Jani Nikula" <jani@nikula.org> wrote:
>
>
> On May 7, 2012 9:10 PM, "Austin Clements" <amdragon@mit.edu> wrote:
> >
> > This moves our logic to get a file's type into one function.  This has
> > several benefits: we can support OSes and file systems that do not
> > provide dirent.d_type or always return DT_UNKNOWN, complex
> > symlink-handling logic has been replaced by a simple stat fall-through
> > in one place, and the error message for un-stat-able file is more
> > accurate (previously, the error always mentioned directories, even
> > though a broken symlink is not a directory).
>
> Please find some quick drive-by-review below.

Forgot to add that the general approach seems sensible to me.

>
> J.
>
> > ---
> >  notmuch-new.c |   99
++++++++++++++++++++++++++++++++++-----------------------
> >  test/new      |    2 +-
> >  2 files changed, 60 insertions(+), 41 deletions(-)
> >
> > diff --git a/notmuch-new.c b/notmuch-new.c
> > index cb720cc..cf2580e 100644
> > --- a/notmuch-new.c
> > +++ b/notmuch-new.c
> > @@ -154,6 +154,44 @@ dirent_sort_strcmp_name (const struct dirent **a,
const struct dirent **b)
> >     return strcmp ((*a)->d_name, (*b)->d_name);
> >  }
> >
> > +/* Return the type of a directory entry relative to path as a stat(2)
> > + * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
> > + * if the file's type cannot be determined (which includes dangling
> > + * symlinks).
> > + */
> > +static int
> > +dirent_type (const char *path, const struct dirent *entry)
> > +{
> > +    struct stat statbuf;
> > +    char *abspath;
> > +    int err;
> > +
> > +#ifdef _DIRENT_HAVE_D_TYPE
> > +    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
> > +     * we'll fall through to stat and get the real file type. */
> > +    static const mode_t modes[] = {
> > +       [DT_BLK]  = S_IFBLK,
> > +       [DT_CHR]  = S_IFCHR,
> > +       [DT_DIR]  = S_IFDIR,
> > +       [DT_FIFO] = S_IFIFO,
> > +       [DT_REG]  = S_IFREG,
> > +       [DT_SOCK] = S_IFSOCK
> > +    };
> > +    if (entry->d_type < sizeof(modes)/sizeof(modes[0]) &&
>
> ARRAY_SIZE()
>
> > +       modes[entry->d_type])
> > +       return modes[entry->d_type];
> > +#endif
> > +
> > +    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
> > +    if (!abspath)
> > +       return -1;
>
> Does talloc set errno in this case? I suspect not.
>
> > +    err = stat(abspath, &statbuf);
> > +    talloc_free (abspath);
>
> This likely breaks your promise about errno. You can't trust
talloc_free() not calling some function that sets errno.
>
> > +    if (err < 0)
> > +       return -1;
> > +    return statbuf.st_mode & S_IFMT;
> > +}
> > +
> >  /* Test if the directory looks like a Maildir directory.
> >  *
> >  * Search through the array of directory entries to see if we can find
all
> > @@ -162,12 +200,12 @@ dirent_sort_strcmp_name (const struct dirent **a,
const struct dirent **b)
> >  * Return 1 if the directory looks like a Maildir and 0 otherwise.
> >  */
> >  static int
> > -_entries_resemble_maildir (struct dirent **entries, int count)
> > +_entries_resemble_maildir (const char *path, struct dirent **entries,
int count)
> >  {
> >     int i, found = 0;
> >
> >     for (i = 0; i < count; i++) {
> > -       if (entries[i]->d_type != DT_DIR && entries[i]->d_type !=
DT_UNKNOWN)
> > +       if (dirent_type (path, entries[i]) != S_IFDIR)
> >            continue;
> >
> >        if (strcmp(entries[i]->d_name, "new") == 0 ||
> > @@ -250,7 +288,7 @@ add_files_recursive (notmuch_database_t *notmuch,
> >     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
> >     notmuch_message_t *message = NULL;
> >     struct dirent **fs_entries = NULL;
> > -    int i, num_fs_entries;
> > +    int i, num_fs_entries, entry_type;
> >     notmuch_directory_t *directory;
> >     notmuch_filenames_t *db_files = NULL;
> >     notmuch_filenames_t *db_subdirs = NULL;
> > @@ -317,7 +355,7 @@ add_files_recursive (notmuch_database_t *notmuch,
> >     }
> >
> >     /* Pass 1: Recurse into all sub-directories. */
> > -    is_maildir = _entries_resemble_maildir (fs_entries,
num_fs_entries);
> > +    is_maildir = _entries_resemble_maildir (path, fs_entries,
num_fs_entries);
> >
> >     for (i = 0; i < num_fs_entries; i++) {
> >        if (interrupted)
> > @@ -325,17 +363,16 @@ add_files_recursive (notmuch_database_t *notmuch,
> >
> >        entry = fs_entries[i];
> >
> > -       /* We only want to descend into directories.
> > -        * But symlinks can be to directories too, of course.
> > -        *
> > -        * And if the filesystem doesn't tell us the file type in the
> > -        * scandir results, then it might be a directory (and if not,
> > -        * then we'll stat and return immediately in the next level of
> > -        * recursion). */
> > -       if (entry->d_type != DT_DIR &&
> > -           entry->d_type != DT_LNK &&
> > -           entry->d_type != DT_UNKNOWN)
> > -       {
> > +       /* We only want to descend into directories (and symlinks to
> > +        * directories). */
> > +       entry_type = dirent_type (path, entry);
> > +       if (entry_type == -1) {
> > +           /* Be pessimistic, e.g. so we don't loose lots of mail
>
> s/loose/lose/ ?
>
> > +            * just because a user broke a symlink. */
> > +           fprintf (stderr, "Error reading file %s/%s: %s\n",
> > +                    path, entry->d_name, strerror (errno));
>
> You can't trust errno here, as explained above.
>
> > +           return NOTMUCH_STATUS_FILE_ERROR;
> > +       } else if (entry_type != S_IFDIR) {
> >            continue;
> >        }
> >
> > @@ -425,31 +462,13 @@ add_files_recursive (notmuch_database_t *notmuch,
> >            notmuch_filenames_move_to_next (db_subdirs);
> >        }
> >
> > -       /* If we're looking at a symlink, we only want to add it if it
> > -        * links to a regular file, (and not to a directory, say).
> > -        *
> > -        * Similarly, if the file is of unknown type (due to filesystem
> > -        * limitations), then we also need to look closer.
> > -        *
> > -        * In either case, a stat does the trick.
> > -        */
> > -       if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
> > -           int err;
> > -
> > -           next = talloc_asprintf (notmuch, "%s/%s", path,
entry->d_name);
> > -           err = stat (next, &st);
> > -           talloc_free (next);
> > -           next = NULL;
> > -
> > -           /* Don't emit an error for a link pointing nowhere, since
> > -            * the directory-traversal pass will have already done
> > -            * that. */
> > -           if (err)
> > -               continue;
> > -
> > -           if (! S_ISREG (st.st_mode))
> > -               continue;
> > -       } else if (entry->d_type != DT_REG) {
> > +       /* Only add regular files (and symlinks to regular files). */
> > +       entry_type = dirent_type (path, entry);
> > +       if (entry_type == -1) {
> > +           fprintf (stderr, "Error reading file %s/%s: %s\n",
> > +                    path, entry->d_name, strerror (errno));
>
> Ditto.
>
> > +           return NOTMUCH_STATUS_FILE_ERROR;
> > +       } else if (entry_type != S_IFREG) {
> >            continue;
> >        }
> >
> > diff --git a/test/new b/test/new
> > index 26253db..e3900f5 100755
> > --- a/test/new
> > +++ b/test/new
> > @@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
> >  ln -s does-not-exist "${MAIL_DIR}/broken"
> >  output=$(NOTMUCH_NEW 2>&1)
> >  test_expect_equal "$output" \
> > -"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file
or directory
> > +"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or
directory
> >  Note: A fatal error was encountered: Something went wrong trying to
read or write a file
> >  No new mail."
> >  rm "${MAIL_DIR}/broken"
> > --
> > 1.7.10
> >
> > _______________________________________________
> > notmuch mailing list
> > notmuch@notmuchmail.org
> > http://notmuchmail.org/mailman/listinfo/notmuch

[-- Attachment #2: Type: text/html, Size: 11186 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] new: Centralize file type stat-ing logic
  2012-05-07 21:08   ` Jani Nikula
  2012-05-07 21:14     ` Jani Nikula
@ 2012-05-07 22:18     ` Austin Clements
  1 sibling, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-07 22:18 UTC (permalink / raw)
  To: Jani Nikula; +Cc: Notmuch Mail, Vladimir Marek

Quoth Jani Nikula on May 08 at 12:08 am:
>    On May 7, 2012 9:10 PM, "Austin Clements" <[1]amdragon@mit.edu> wrote:
>    >
>    > This moves our logic to get a file's type into one function.  This has
>    > several benefits: we can support OSes and file systems that do not
>    > provide dirent.d_type or always return DT_UNKNOWN, complex
>    > symlink-handling logic has been replaced by a simple stat fall-through
>    > in one place, and the error message for un-stat-able file is more
>    > accurate (previously, the error always mentioned directories, even
>    > though a broken symlink is not a directory).
> 
>    Please find some quick drive-by-review below.

Thanks for the review!  New version on its way...

>    J.
> 
>    > ---
>    >  notmuch-new.c |   99
>    ++++++++++++++++++++++++++++++++++-----------------------
>    >  test/new      |    2 +-
>    >  2 files changed, 60 insertions(+), 41 deletions(-)
>    >
>    > diff --git a/notmuch-new.c b/notmuch-new.c
>    > index cb720cc..cf2580e 100644
>    > --- a/notmuch-new.c
>    > +++ b/notmuch-new.c
>    > @@ -154,6 +154,44 @@ dirent_sort_strcmp_name (const struct dirent **a,
>    const struct dirent **b)
>    >     return strcmp ((*a)->d_name, (*b)->d_name);
>    >  }
>    >
>    > +/* Return the type of a directory entry relative to path as a stat(2)
>    > + * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
>    > + * if the file's type cannot be determined (which includes dangling
>    > + * symlinks).
>    > + */
>    > +static int
>    > +dirent_type (const char *path, const struct dirent *entry)
>    > +{
>    > +    struct stat statbuf;
>    > +    char *abspath;
>    > +    int err;
>    > +
>    > +#ifdef _DIRENT_HAVE_D_TYPE
>    > +    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
>    > +     * we'll fall through to stat and get the real file type. */
>    > +    static const mode_t modes[] = {
>    > +       [DT_BLK]  = S_IFBLK,
>    > +       [DT_CHR]  = S_IFCHR,
>    > +       [DT_DIR]  = S_IFDIR,
>    > +       [DT_FIFO] = S_IFIFO,
>    > +       [DT_REG]  = S_IFREG,
>    > +       [DT_SOCK] = S_IFSOCK
>    > +    };
>    > +    if (entry->d_type < sizeof(modes)/sizeof(modes[0]) &&
> 
>    ARRAY_SIZE()

Huh.  I went looking for that.  I wonder how I missed it.

>    > +       modes[entry->d_type])
>    > +       return modes[entry->d_type];
>    > +#endif
>    > +
>    > +    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
>    > +    if (!abspath)
>    > +       return -1;
> 
>    Does talloc set errno in this case? I suspect not.
> 
>    > +    err = stat(abspath, &statbuf);
>    > +    talloc_free (abspath);
> 
>    This likely breaks your promise about errno. You can't trust talloc_free()
>    not calling some function that sets errno.

Setting errno was a late and apparently not well thought-out addition.
The new patch should set it on all of the error paths and save it
around things that may modify it.

>    > +    if (err < 0)
>    > +       return -1;
>    > +    return statbuf.st_mode & S_IFMT;
>    > +}
>    > +
>    >  /* Test if the directory looks like a Maildir directory.
>    >  *
>    >  * Search through the array of directory entries to see if we can find
>    all
>    > @@ -162,12 +200,12 @@ dirent_sort_strcmp_name (const struct dirent **a,
>    const struct dirent **b)
>    >  * Return 1 if the directory looks like a Maildir and 0 otherwise.
>    >  */
>    >  static int
>    > -_entries_resemble_maildir (struct dirent **entries, int count)
>    > +_entries_resemble_maildir (const char *path, struct dirent **entries,
>    int count)
>    >  {
>    >     int i, found = 0;
>    >
>    >     for (i = 0; i < count; i++) {
>    > -       if (entries[i]->d_type != DT_DIR && entries[i]->d_type !=
>    DT_UNKNOWN)
>    > +       if (dirent_type (path, entries[i]) != S_IFDIR)
>    >            continue;
>    >
>    >        if (strcmp(entries[i]->d_name, "new") == 0 ||
>    > @@ -250,7 +288,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>    >     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
>    >     notmuch_message_t *message = NULL;
>    >     struct dirent **fs_entries = NULL;
>    > -    int i, num_fs_entries;
>    > +    int i, num_fs_entries, entry_type;
>    >     notmuch_directory_t *directory;
>    >     notmuch_filenames_t *db_files = NULL;
>    >     notmuch_filenames_t *db_subdirs = NULL;
>    > @@ -317,7 +355,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>    >     }
>    >
>    >     /* Pass 1: Recurse into all sub-directories. */
>    > -    is_maildir = _entries_resemble_maildir (fs_entries,
>    num_fs_entries);
>    > +    is_maildir = _entries_resemble_maildir (path, fs_entries,
>    num_fs_entries);
>    >
>    >     for (i = 0; i < num_fs_entries; i++) {
>    >        if (interrupted)
>    > @@ -325,17 +363,16 @@ add_files_recursive (notmuch_database_t *notmuch,
>    >
>    >        entry = fs_entries[i];
>    >
>    > -       /* We only want to descend into directories.
>    > -        * But symlinks can be to directories too, of course.
>    > -        *
>    > -        * And if the filesystem doesn't tell us the file type in the
>    > -        * scandir results, then it might be a directory (and if not,
>    > -        * then we'll stat and return immediately in the next level of
>    > -        * recursion). */
>    > -       if (entry->d_type != DT_DIR &&
>    > -           entry->d_type != DT_LNK &&
>    > -           entry->d_type != DT_UNKNOWN)
>    > -       {
>    > +       /* We only want to descend into directories (and symlinks to
>    > +        * directories). */
>    > +       entry_type = dirent_type (path, entry);
>    > +       if (entry_type == -1) {
>    > +           /* Be pessimistic, e.g. so we don't loose lots of mail
> 
>    s/loose/lose/ ?

Oops.

>    > +            * just because a user broke a symlink. */
>    > +           fprintf (stderr, "Error reading file %s/%s: %s\n",
>    > +                    path, entry->d_name, strerror (errno));
> 
>    You can't trust errno here, as explained above.

Fixed above.

>    > +           return NOTMUCH_STATUS_FILE_ERROR;
>    > +       } else if (entry_type != S_IFDIR) {
>    >            continue;
>    >        }
>    >
>    > @@ -425,31 +462,13 @@ add_files_recursive (notmuch_database_t *notmuch,
>    >            notmuch_filenames_move_to_next (db_subdirs);
>    >        }
>    >
>    > -       /* If we're looking at a symlink, we only want to add it if it
>    > -        * links to a regular file, (and not to a directory, say).
>    > -        *
>    > -        * Similarly, if the file is of unknown type (due to filesystem
>    > -        * limitations), then we also need to look closer.
>    > -        *
>    > -        * In either case, a stat does the trick.
>    > -        */
>    > -       if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
>    > -           int err;
>    > -
>    > -           next = talloc_asprintf (notmuch, "%s/%s", path,
>    entry->d_name);
>    > -           err = stat (next, &st);
>    > -           talloc_free (next);
>    > -           next = NULL;
>    > -
>    > -           /* Don't emit an error for a link pointing nowhere, since
>    > -            * the directory-traversal pass will have already done
>    > -            * that. */
>    > -           if (err)
>    > -               continue;
>    > -
>    > -           if (! S_ISREG (st.st_mode))
>    > -               continue;
>    > -       } else if (entry->d_type != DT_REG) {
>    > +       /* Only add regular files (and symlinks to regular files). */
>    > +       entry_type = dirent_type (path, entry);
>    > +       if (entry_type == -1) {
>    > +           fprintf (stderr, "Error reading file %s/%s: %s\n",
>    > +                    path, entry->d_name, strerror (errno));
> 
>    Ditto.
> 
>    > +           return NOTMUCH_STATUS_FILE_ERROR;
>    > +       } else if (entry_type != S_IFREG) {
>    >            continue;
>    >        }
>    >
>    > diff --git a/test/new b/test/new
>    > index 26253db..e3900f5 100755
>    > --- a/test/new
>    > +++ b/test/new
>    > @@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
>    >  ln -s does-not-exist "${MAIL_DIR}/broken"
>    >  output=$(NOTMUCH_NEW 2>&1)
>    >  test_expect_equal "$output" \
>    > -"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file
>    or directory
>    > +"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or
>    directory
>    >  Note: A fatal error was encountered: Something went wrong trying to
>    read or write a file
>    >  No new mail."
>    >  rm "${MAIL_DIR}/broken"
>    >
>    > _______________________________________________
>    > notmuch mailing list
>    > [2]notmuch@notmuchmail.org
>    > [3]http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new
  2012-05-07 18:09 [PATCH 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-07 18:09 ` [PATCH 1/2] test: Test notmuch new with a broken symlink Austin Clements
  2012-05-07 18:09 ` [PATCH 2/2] new: Centralize file type stat-ing logic Austin Clements
@ 2012-05-07 22:20 ` Austin Clements
  2012-05-07 22:20   ` [PATCH v2 1/2] test: Test notmuch new with a broken symlink Austin Clements
                     ` (3 more replies)
  2 siblings, 4 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-07 22:20 UTC (permalink / raw)
  To: notmuch; +Cc: Vladimir Marek

This version uses ARRAY_SIZE, fixes errno handling bugs, and corrects
a spelling error.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/2] test: Test notmuch new with a broken symlink
  2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
@ 2012-05-07 22:20   ` Austin Clements
  2012-05-07 22:20   ` [PATCH v2 2/2] new: Centralize file type stat-ing logic Austin Clements
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-07 22:20 UTC (permalink / raw)
  To: notmuch; +Cc: Vladimir Marek

---
 test/new |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/test/new b/test/new
index 99f9913..26253db 100755
--- a/test/new
+++ b/test/new
@@ -136,6 +136,16 @@ output=$(NOTMUCH_NEW)
 test_expect_equal "$output" "Added 1 new message to the database."
 
 
+test_begin_subtest "Broken symlink aborts"
+ln -s does-not-exist "${MAIL_DIR}/broken"
+output=$(NOTMUCH_NEW 2>&1)
+test_expect_equal "$output" \
+"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file or directory
+Note: A fatal error was encountered: Something went wrong trying to read or write a file
+No new mail."
+rm "${MAIL_DIR}/broken"
+
+
 test_begin_subtest "New two-level directory"
 
 generate_message [dir]=two/levels
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 2/2] new: Centralize file type stat-ing logic
  2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-07 22:20   ` [PATCH v2 1/2] test: Test notmuch new with a broken symlink Austin Clements
@ 2012-05-07 22:20   ` Austin Clements
  2012-05-08  7:58     ` Jani Nikula
  2012-05-09 18:45   ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
  3 siblings, 1 reply; 29+ messages in thread
From: Austin Clements @ 2012-05-07 22:20 UTC (permalink / raw)
  To: notmuch; +Cc: Vladimir Marek

This moves our logic to get a file's type into one function.  This has
several benefits: we can support OSes and file systems that do not
provide dirent.d_type or always return DT_UNKNOWN, complex
symlink-handling logic has been replaced by a simple stat fall-through
in one place, and the error message for un-stat-able file is more
accurate (previously, the error always mentioned directories, even
though a broken symlink is not a directory).
---
 notmuch-new.c |  103 +++++++++++++++++++++++++++++++++++----------------------
 test/new      |    2 +-
 2 files changed, 64 insertions(+), 41 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index cb720cc..8955677 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -154,6 +154,48 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
     return strcmp ((*a)->d_name, (*b)->d_name);
 }
 
+/* Return the type of a directory entry relative to path as a stat(2)
+ * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
+ * if the file's type cannot be determined (which includes dangling
+ * symlinks).
+ */
+static int
+dirent_type (const char *path, const struct dirent *entry)
+{
+    struct stat statbuf;
+    char *abspath;
+    int err, saved_errno;
+
+#ifdef _DIRENT_HAVE_D_TYPE
+    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
+     * we'll fall through to stat and get the real file type. */
+    static const mode_t modes[] = {
+	[DT_BLK]  = S_IFBLK,
+	[DT_CHR]  = S_IFCHR,
+	[DT_DIR]  = S_IFDIR,
+	[DT_FIFO] = S_IFIFO,
+	[DT_REG]  = S_IFREG,
+	[DT_SOCK] = S_IFSOCK
+    };
+    if (entry->d_type < ARRAY_SIZE(modes) && modes[entry->d_type])
+	return modes[entry->d_type];
+#endif
+
+    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
+    if (!abspath) {
+	errno = ENOMEM;
+	return -1;
+    }
+    err = stat(abspath, &statbuf);
+    saved_errno = errno;
+    talloc_free (abspath);
+    if (err < 0) {
+	errno = saved_errno;
+	return -1;
+    }
+    return statbuf.st_mode & S_IFMT;
+}
+
 /* Test if the directory looks like a Maildir directory.
  *
  * Search through the array of directory entries to see if we can find all
@@ -162,12 +204,12 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
  * Return 1 if the directory looks like a Maildir and 0 otherwise.
  */
 static int
-_entries_resemble_maildir (struct dirent **entries, int count)
+_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
 {
     int i, found = 0;
 
     for (i = 0; i < count; i++) {
-	if (entries[i]->d_type != DT_DIR && entries[i]->d_type != DT_UNKNOWN)
+	if (dirent_type (path, entries[i]) != S_IFDIR)
 	    continue;
 
 	if (strcmp(entries[i]->d_name, "new") == 0 ||
@@ -250,7 +292,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
     notmuch_message_t *message = NULL;
     struct dirent **fs_entries = NULL;
-    int i, num_fs_entries;
+    int i, num_fs_entries, entry_type;
     notmuch_directory_t *directory;
     notmuch_filenames_t *db_files = NULL;
     notmuch_filenames_t *db_subdirs = NULL;
@@ -317,7 +359,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     }
 
     /* Pass 1: Recurse into all sub-directories. */
-    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
+    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
 
     for (i = 0; i < num_fs_entries; i++) {
 	if (interrupted)
@@ -325,17 +367,16 @@ add_files_recursive (notmuch_database_t *notmuch,
 
 	entry = fs_entries[i];
 
-	/* We only want to descend into directories.
-	 * But symlinks can be to directories too, of course.
-	 *
-	 * And if the filesystem doesn't tell us the file type in the
-	 * scandir results, then it might be a directory (and if not,
-	 * then we'll stat and return immediately in the next level of
-	 * recursion). */
-	if (entry->d_type != DT_DIR &&
-	    entry->d_type != DT_LNK &&
-	    entry->d_type != DT_UNKNOWN)
-	{
+	/* We only want to descend into directories (and symlinks to
+	 * directories). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    /* Be pessimistic, e.g. so we don't lose lots of mail just
+	     * because a user broke a symlink. */
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFDIR) {
 	    continue;
 	}
 
@@ -425,31 +466,13 @@ add_files_recursive (notmuch_database_t *notmuch,
 	    notmuch_filenames_move_to_next (db_subdirs);
 	}
 
-	/* If we're looking at a symlink, we only want to add it if it
-	 * links to a regular file, (and not to a directory, say).
-	 *
-	 * Similarly, if the file is of unknown type (due to filesystem
-	 * limitations), then we also need to look closer.
-	 *
-	 * In either case, a stat does the trick.
-	 */
-	if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
-	    int err;
-
-	    next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
-	    err = stat (next, &st);
-	    talloc_free (next);
-	    next = NULL;
-
-	    /* Don't emit an error for a link pointing nowhere, since
-	     * the directory-traversal pass will have already done
-	     * that. */
-	    if (err)
-		continue;
-
-	    if (! S_ISREG (st.st_mode))
-		continue;
-	} else if (entry->d_type != DT_REG) {
+	/* Only add regular files (and symlinks to regular files). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFREG) {
 	    continue;
 	}
 
diff --git a/test/new b/test/new
index 26253db..e3900f5 100755
--- a/test/new
+++ b/test/new
@@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
 ln -s does-not-exist "${MAIL_DIR}/broken"
 output=$(NOTMUCH_NEW 2>&1)
 test_expect_equal "$output" \
-"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file or directory
+"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or directory
 Note: A fatal error was encountered: Something went wrong trying to read or write a file
 No new mail."
 rm "${MAIL_DIR}/broken"
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] new: Centralize file type stat-ing logic
  2012-05-07 22:20   ` [PATCH v2 2/2] new: Centralize file type stat-ing logic Austin Clements
@ 2012-05-08  7:58     ` Jani Nikula
  2012-05-08  8:33       ` Jani Nikula
  0 siblings, 1 reply; 29+ messages in thread
From: Jani Nikula @ 2012-05-08  7:58 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: Vladimir Marek

On Mon,  7 May 2012 18:20:40 -0400, Austin Clements <amdragon@MIT.EDU> wrote:
> This moves our logic to get a file's type into one function.  This has
> several benefits: we can support OSes and file systems that do not
> provide dirent.d_type or always return DT_UNKNOWN, complex
> symlink-handling logic has been replaced by a simple stat fall-through
> in one place, and the error message for un-stat-able file is more
> accurate (previously, the error always mentioned directories, even
> though a broken symlink is not a directory).

LGTM.

> ---
>  notmuch-new.c |  103 +++++++++++++++++++++++++++++++++++----------------------
>  test/new      |    2 +-
>  2 files changed, 64 insertions(+), 41 deletions(-)
> 
> diff --git a/notmuch-new.c b/notmuch-new.c
> index cb720cc..8955677 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -154,6 +154,48 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
>      return strcmp ((*a)->d_name, (*b)->d_name);
>  }
>  
> +/* Return the type of a directory entry relative to path as a stat(2)
> + * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
> + * if the file's type cannot be determined (which includes dangling
> + * symlinks).
> + */
> +static int
> +dirent_type (const char *path, const struct dirent *entry)
> +{
> +    struct stat statbuf;
> +    char *abspath;
> +    int err, saved_errno;
> +
> +#ifdef _DIRENT_HAVE_D_TYPE
> +    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
> +     * we'll fall through to stat and get the real file type. */
> +    static const mode_t modes[] = {
> +	[DT_BLK]  = S_IFBLK,
> +	[DT_CHR]  = S_IFCHR,
> +	[DT_DIR]  = S_IFDIR,
> +	[DT_FIFO] = S_IFIFO,
> +	[DT_REG]  = S_IFREG,
> +	[DT_SOCK] = S_IFSOCK
> +    };
> +    if (entry->d_type < ARRAY_SIZE(modes) && modes[entry->d_type])
> +	return modes[entry->d_type];
> +#endif
> +
> +    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
> +    if (!abspath) {
> +	errno = ENOMEM;
> +	return -1;
> +    }
> +    err = stat(abspath, &statbuf);
> +    saved_errno = errno;
> +    talloc_free (abspath);
> +    if (err < 0) {
> +	errno = saved_errno;
> +	return -1;
> +    }
> +    return statbuf.st_mode & S_IFMT;
> +}
> +
>  /* Test if the directory looks like a Maildir directory.
>   *
>   * Search through the array of directory entries to see if we can find all
> @@ -162,12 +204,12 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
>   * Return 1 if the directory looks like a Maildir and 0 otherwise.
>   */
>  static int
> -_entries_resemble_maildir (struct dirent **entries, int count)
> +_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
>  {
>      int i, found = 0;
>  
>      for (i = 0; i < count; i++) {
> -	if (entries[i]->d_type != DT_DIR && entries[i]->d_type != DT_UNKNOWN)
> +	if (dirent_type (path, entries[i]) != S_IFDIR)
>  	    continue;
>  
>  	if (strcmp(entries[i]->d_name, "new") == 0 ||
> @@ -250,7 +292,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>      notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
>      notmuch_message_t *message = NULL;
>      struct dirent **fs_entries = NULL;
> -    int i, num_fs_entries;
> +    int i, num_fs_entries, entry_type;
>      notmuch_directory_t *directory;
>      notmuch_filenames_t *db_files = NULL;
>      notmuch_filenames_t *db_subdirs = NULL;
> @@ -317,7 +359,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>      }
>  
>      /* Pass 1: Recurse into all sub-directories. */
> -    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
> +    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
>  
>      for (i = 0; i < num_fs_entries; i++) {
>  	if (interrupted)
> @@ -325,17 +367,16 @@ add_files_recursive (notmuch_database_t *notmuch,
>  
>  	entry = fs_entries[i];
>  
> -	/* We only want to descend into directories.
> -	 * But symlinks can be to directories too, of course.
> -	 *
> -	 * And if the filesystem doesn't tell us the file type in the
> -	 * scandir results, then it might be a directory (and if not,
> -	 * then we'll stat and return immediately in the next level of
> -	 * recursion). */
> -	if (entry->d_type != DT_DIR &&
> -	    entry->d_type != DT_LNK &&
> -	    entry->d_type != DT_UNKNOWN)
> -	{
> +	/* We only want to descend into directories (and symlinks to
> +	 * directories). */
> +	entry_type = dirent_type (path, entry);
> +	if (entry_type == -1) {
> +	    /* Be pessimistic, e.g. so we don't lose lots of mail just
> +	     * because a user broke a symlink. */
> +	    fprintf (stderr, "Error reading file %s/%s: %s\n",
> +		     path, entry->d_name, strerror (errno));
> +	    return NOTMUCH_STATUS_FILE_ERROR;
> +	} else if (entry_type != S_IFDIR) {
>  	    continue;
>  	}
>  
> @@ -425,31 +466,13 @@ add_files_recursive (notmuch_database_t *notmuch,
>  	    notmuch_filenames_move_to_next (db_subdirs);
>  	}
>  
> -	/* If we're looking at a symlink, we only want to add it if it
> -	 * links to a regular file, (and not to a directory, say).
> -	 *
> -	 * Similarly, if the file is of unknown type (due to filesystem
> -	 * limitations), then we also need to look closer.
> -	 *
> -	 * In either case, a stat does the trick.
> -	 */
> -	if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
> -	    int err;
> -
> -	    next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
> -	    err = stat (next, &st);
> -	    talloc_free (next);
> -	    next = NULL;
> -
> -	    /* Don't emit an error for a link pointing nowhere, since
> -	     * the directory-traversal pass will have already done
> -	     * that. */
> -	    if (err)
> -		continue;
> -
> -	    if (! S_ISREG (st.st_mode))
> -		continue;
> -	} else if (entry->d_type != DT_REG) {
> +	/* Only add regular files (and symlinks to regular files). */
> +	entry_type = dirent_type (path, entry);
> +	if (entry_type == -1) {
> +	    fprintf (stderr, "Error reading file %s/%s: %s\n",
> +		     path, entry->d_name, strerror (errno));
> +	    return NOTMUCH_STATUS_FILE_ERROR;
> +	} else if (entry_type != S_IFREG) {
>  	    continue;
>  	}
>  
> diff --git a/test/new b/test/new
> index 26253db..e3900f5 100755
> --- a/test/new
> +++ b/test/new
> @@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
>  ln -s does-not-exist "${MAIL_DIR}/broken"
>  output=$(NOTMUCH_NEW 2>&1)
>  test_expect_equal "$output" \
> -"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file or directory
> +"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or directory
>  Note: A fatal error was encountered: Something went wrong trying to read or write a file
>  No new mail."
>  rm "${MAIL_DIR}/broken"
> -- 
> 1.7.10
> 
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] new: Centralize file type stat-ing logic
  2012-05-08  7:58     ` Jani Nikula
@ 2012-05-08  8:33       ` Jani Nikula
  2012-05-09 18:27         ` Austin Clements
  0 siblings, 1 reply; 29+ messages in thread
From: Jani Nikula @ 2012-05-08  8:33 UTC (permalink / raw)
  To: Austin Clements, notmuch; +Cc: Vladimir Marek

On Tue, 08 May 2012 07:58:28 +0000, Jani Nikula <jani@nikula.org> wrote:
> On Mon,  7 May 2012 18:20:40 -0400, Austin Clements <amdragon@MIT.EDU> wrote:
> > This moves our logic to get a file's type into one function.  This has
> > several benefits: we can support OSes and file systems that do not
> > provide dirent.d_type or always return DT_UNKNOWN, complex
> > symlink-handling logic has been replaced by a simple stat fall-through
> > in one place, and the error message for un-stat-able file is more
> > accurate (previously, the error always mentioned directories, even
> > though a broken symlink is not a directory).
> 
> LGTM.

Okay, it's good, but I think you can make it even better:

add_files_recursive() has check for "! S_ISDIR (st.st_mode)" in the
beginning, returning silently in case it recursed based on a symlink to
regular file. IIUC, this will no longer happen with your patch, as
symlinks are resolved and stat'ed before recursing.

add_files() exists to fail loudly in the same situation, and has
otherwise the same checks in the beginning. I think you could now use
the checks from add_files() to replace the ones in
add_files_recursive(), and get rid of add_files() altogether.

Please double check my thinking. Also this should probably be a separate
patch, no need to change the current one.

Jani.


> 
> > ---
> >  notmuch-new.c |  103 +++++++++++++++++++++++++++++++++++----------------------
> >  test/new      |    2 +-
> >  2 files changed, 64 insertions(+), 41 deletions(-)
> > 
> > diff --git a/notmuch-new.c b/notmuch-new.c
> > index cb720cc..8955677 100644
> > --- a/notmuch-new.c
> > +++ b/notmuch-new.c
> > @@ -154,6 +154,48 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
> >      return strcmp ((*a)->d_name, (*b)->d_name);
> >  }
> >  
> > +/* Return the type of a directory entry relative to path as a stat(2)
> > + * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
> > + * if the file's type cannot be determined (which includes dangling
> > + * symlinks).
> > + */
> > +static int
> > +dirent_type (const char *path, const struct dirent *entry)
> > +{
> > +    struct stat statbuf;
> > +    char *abspath;
> > +    int err, saved_errno;
> > +
> > +#ifdef _DIRENT_HAVE_D_TYPE
> > +    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
> > +     * we'll fall through to stat and get the real file type. */
> > +    static const mode_t modes[] = {
> > +	[DT_BLK]  = S_IFBLK,
> > +	[DT_CHR]  = S_IFCHR,
> > +	[DT_DIR]  = S_IFDIR,
> > +	[DT_FIFO] = S_IFIFO,
> > +	[DT_REG]  = S_IFREG,
> > +	[DT_SOCK] = S_IFSOCK
> > +    };
> > +    if (entry->d_type < ARRAY_SIZE(modes) && modes[entry->d_type])
> > +	return modes[entry->d_type];
> > +#endif
> > +
> > +    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
> > +    if (!abspath) {
> > +	errno = ENOMEM;
> > +	return -1;
> > +    }
> > +    err = stat(abspath, &statbuf);
> > +    saved_errno = errno;
> > +    talloc_free (abspath);
> > +    if (err < 0) {
> > +	errno = saved_errno;
> > +	return -1;
> > +    }
> > +    return statbuf.st_mode & S_IFMT;
> > +}
> > +
> >  /* Test if the directory looks like a Maildir directory.
> >   *
> >   * Search through the array of directory entries to see if we can find all
> > @@ -162,12 +204,12 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
> >   * Return 1 if the directory looks like a Maildir and 0 otherwise.
> >   */
> >  static int
> > -_entries_resemble_maildir (struct dirent **entries, int count)
> > +_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
> >  {
> >      int i, found = 0;
> >  
> >      for (i = 0; i < count; i++) {
> > -	if (entries[i]->d_type != DT_DIR && entries[i]->d_type != DT_UNKNOWN)
> > +	if (dirent_type (path, entries[i]) != S_IFDIR)
> >  	    continue;
> >  
> >  	if (strcmp(entries[i]->d_name, "new") == 0 ||
> > @@ -250,7 +292,7 @@ add_files_recursive (notmuch_database_t *notmuch,
> >      notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
> >      notmuch_message_t *message = NULL;
> >      struct dirent **fs_entries = NULL;
> > -    int i, num_fs_entries;
> > +    int i, num_fs_entries, entry_type;
> >      notmuch_directory_t *directory;
> >      notmuch_filenames_t *db_files = NULL;
> >      notmuch_filenames_t *db_subdirs = NULL;
> > @@ -317,7 +359,7 @@ add_files_recursive (notmuch_database_t *notmuch,
> >      }
> >  
> >      /* Pass 1: Recurse into all sub-directories. */
> > -    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
> > +    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
> >  
> >      for (i = 0; i < num_fs_entries; i++) {
> >  	if (interrupted)
> > @@ -325,17 +367,16 @@ add_files_recursive (notmuch_database_t *notmuch,
> >  
> >  	entry = fs_entries[i];
> >  
> > -	/* We only want to descend into directories.
> > -	 * But symlinks can be to directories too, of course.
> > -	 *
> > -	 * And if the filesystem doesn't tell us the file type in the
> > -	 * scandir results, then it might be a directory (and if not,
> > -	 * then we'll stat and return immediately in the next level of
> > -	 * recursion). */
> > -	if (entry->d_type != DT_DIR &&
> > -	    entry->d_type != DT_LNK &&
> > -	    entry->d_type != DT_UNKNOWN)
> > -	{
> > +	/* We only want to descend into directories (and symlinks to
> > +	 * directories). */
> > +	entry_type = dirent_type (path, entry);
> > +	if (entry_type == -1) {
> > +	    /* Be pessimistic, e.g. so we don't lose lots of mail just
> > +	     * because a user broke a symlink. */
> > +	    fprintf (stderr, "Error reading file %s/%s: %s\n",
> > +		     path, entry->d_name, strerror (errno));
> > +	    return NOTMUCH_STATUS_FILE_ERROR;
> > +	} else if (entry_type != S_IFDIR) {
> >  	    continue;
> >  	}
> >  
> > @@ -425,31 +466,13 @@ add_files_recursive (notmuch_database_t *notmuch,
> >  	    notmuch_filenames_move_to_next (db_subdirs);
> >  	}
> >  
> > -	/* If we're looking at a symlink, we only want to add it if it
> > -	 * links to a regular file, (and not to a directory, say).
> > -	 *
> > -	 * Similarly, if the file is of unknown type (due to filesystem
> > -	 * limitations), then we also need to look closer.
> > -	 *
> > -	 * In either case, a stat does the trick.
> > -	 */
> > -	if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
> > -	    int err;
> > -
> > -	    next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
> > -	    err = stat (next, &st);
> > -	    talloc_free (next);
> > -	    next = NULL;
> > -
> > -	    /* Don't emit an error for a link pointing nowhere, since
> > -	     * the directory-traversal pass will have already done
> > -	     * that. */
> > -	    if (err)
> > -		continue;
> > -
> > -	    if (! S_ISREG (st.st_mode))
> > -		continue;
> > -	} else if (entry->d_type != DT_REG) {
> > +	/* Only add regular files (and symlinks to regular files). */
> > +	entry_type = dirent_type (path, entry);
> > +	if (entry_type == -1) {
> > +	    fprintf (stderr, "Error reading file %s/%s: %s\n",
> > +		     path, entry->d_name, strerror (errno));
> > +	    return NOTMUCH_STATUS_FILE_ERROR;
> > +	} else if (entry_type != S_IFREG) {
> >  	    continue;
> >  	}
> >  
> > diff --git a/test/new b/test/new
> > index 26253db..e3900f5 100755
> > --- a/test/new
> > +++ b/test/new
> > @@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
> >  ln -s does-not-exist "${MAIL_DIR}/broken"
> >  output=$(NOTMUCH_NEW 2>&1)
> >  test_expect_equal "$output" \
> > -"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file or directory
> > +"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or directory
> >  Note: A fatal error was encountered: Something went wrong trying to read or write a file
> >  No new mail."
> >  rm "${MAIL_DIR}/broken"
> > -- 
> > 1.7.10
> > 
> > _______________________________________________
> > notmuch mailing list
> > notmuch@notmuchmail.org
> > http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/2] new: Centralize file type stat-ing logic
  2012-05-08  8:33       ` Jani Nikula
@ 2012-05-09 18:27         ` Austin Clements
  0 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-09 18:27 UTC (permalink / raw)
  To: Jani Nikula; +Cc: notmuch, Vladimir Marek

Quoth Jani Nikula on May 08 at  8:33 am:
> On Tue, 08 May 2012 07:58:28 +0000, Jani Nikula <jani@nikula.org> wrote:
> > On Mon,  7 May 2012 18:20:40 -0400, Austin Clements <amdragon@MIT.EDU> wrote:
> > > This moves our logic to get a file's type into one function.  This has
> > > several benefits: we can support OSes and file systems that do not
> > > provide dirent.d_type or always return DT_UNKNOWN, complex
> > > symlink-handling logic has been replaced by a simple stat fall-through
> > > in one place, and the error message for un-stat-able file is more
> > > accurate (previously, the error always mentioned directories, even
> > > though a broken symlink is not a directory).
> > 
> > LGTM.
> 
> Okay, it's good, but I think you can make it even better:
> 
> add_files_recursive() has check for "! S_ISDIR (st.st_mode)" in the
> beginning, returning silently in case it recursed based on a symlink to
> regular file. IIUC, this will no longer happen with your patch, as
> symlinks are resolved and stat'ed before recursing.
> 
> add_files() exists to fail loudly in the same situation, and has
> otherwise the same checks in the beginning. I think you could now use
> the checks from add_files() to replace the ones in
> add_files_recursive(), and get rid of add_files() altogether.
> 
> Please double check my thinking. Also this should probably be a separate
> patch, no need to change the current one.

Excellent idea.  It works fantastically.  I'll wait to send the
patches until this series gets pushed, both the avoid dependent series
confusion and so I can refer to the appropriate commit ID in the
commit message.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new
  2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-07 22:20   ` [PATCH v2 1/2] test: Test notmuch new with a broken symlink Austin Clements
  2012-05-07 22:20   ` [PATCH v2 2/2] new: Centralize file type stat-ing logic Austin Clements
@ 2012-05-09 18:45   ` Austin Clements
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
  3 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-09 18:45 UTC (permalink / raw)
  To: Vladimir Marek; +Cc: notmuch

On Mon, 07 May 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> This version uses ARRAY_SIZE, fixes errno handling bugs, and corrects
> a spelling error.

Vladimir, if you get a moment, could you confirm that this is, in fact,
portable?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 0/4] Better, more portable stat-ing logic for notmuch new
  2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
                     ` (2 preceding siblings ...)
  2012-05-09 18:45   ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
@ 2012-05-24 19:31   ` Austin Clements
  2012-05-24 19:31     ` [PATCH v3 1/4] test: Test notmuch new with a broken symlink Austin Clements
                       ` (4 more replies)
  3 siblings, 5 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 19:31 UTC (permalink / raw)
  To: notmuch

v3 is a rebase to current master that resolves one trivial conflict.
This also fixes a bug in the broken symlink test where I had
hard-coded the maildir path.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 1/4] test: Test notmuch new with a broken symlink
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
@ 2012-05-24 19:31     ` Austin Clements
  2012-05-24 19:32     ` [PATCH v3 2/4] new: Centralize file type stat-ing logic Austin Clements
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 19:31 UTC (permalink / raw)
  To: notmuch

---
 test/new |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/test/new b/test/new
index 99f9913..a7bc146 100755
--- a/test/new
+++ b/test/new
@@ -136,6 +136,16 @@ output=$(NOTMUCH_NEW)
 test_expect_equal "$output" "Added 1 new message to the database."
 
 
+test_begin_subtest "Broken symlink aborts"
+ln -s does-not-exist "${MAIL_DIR}/broken"
+output=$(NOTMUCH_NEW 2>&1)
+test_expect_equal "$output" \
+"Error reading directory ${MAIL_DIR}/broken: No such file or directory
+Note: A fatal error was encountered: Something went wrong trying to read or write a file
+No new mail."
+rm "${MAIL_DIR}/broken"
+
+
 test_begin_subtest "New two-level directory"
 
 generate_message [dir]=two/levels
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 2/4] new: Centralize file type stat-ing logic
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
  2012-05-24 19:31     ` [PATCH v3 1/4] test: Test notmuch new with a broken symlink Austin Clements
@ 2012-05-24 19:32     ` Austin Clements
  2012-05-24 19:32     ` [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 19:32 UTC (permalink / raw)
  To: notmuch

This moves our logic to get a file's type into one function.  This has
several benefits: we can support OSes and file systems that do not
provide dirent.d_type or always return DT_UNKNOWN, complex
symlink-handling logic has been replaced by a simple stat fall-through
in one place, and the error message for un-stat-able file is more
accurate (previously, the error always mentioned directories, even
though a broken symlink is not a directory).
---
 notmuch-new.c |  103 +++++++++++++++++++++++++++++++++++----------------------
 test/new      |    2 +-
 2 files changed, 64 insertions(+), 41 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 72dd558..c64f1a7 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -154,6 +154,48 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
     return strcmp ((*a)->d_name, (*b)->d_name);
 }
 
+/* Return the type of a directory entry relative to path as a stat(2)
+ * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
+ * if the file's type cannot be determined (which includes dangling
+ * symlinks).
+ */
+static int
+dirent_type (const char *path, const struct dirent *entry)
+{
+    struct stat statbuf;
+    char *abspath;
+    int err, saved_errno;
+
+#ifdef _DIRENT_HAVE_D_TYPE
+    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
+     * we'll fall through to stat and get the real file type. */
+    static const mode_t modes[] = {
+	[DT_BLK]  = S_IFBLK,
+	[DT_CHR]  = S_IFCHR,
+	[DT_DIR]  = S_IFDIR,
+	[DT_FIFO] = S_IFIFO,
+	[DT_REG]  = S_IFREG,
+	[DT_SOCK] = S_IFSOCK
+    };
+    if (entry->d_type < ARRAY_SIZE(modes) && modes[entry->d_type])
+	return modes[entry->d_type];
+#endif
+
+    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
+    if (!abspath) {
+	errno = ENOMEM;
+	return -1;
+    }
+    err = stat(abspath, &statbuf);
+    saved_errno = errno;
+    talloc_free (abspath);
+    if (err < 0) {
+	errno = saved_errno;
+	return -1;
+    }
+    return statbuf.st_mode & S_IFMT;
+}
+
 /* Test if the directory looks like a Maildir directory.
  *
  * Search through the array of directory entries to see if we can find all
@@ -162,12 +204,12 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
  * Return 1 if the directory looks like a Maildir and 0 otherwise.
  */
 static int
-_entries_resemble_maildir (struct dirent **entries, int count)
+_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
 {
     int i, found = 0;
 
     for (i = 0; i < count; i++) {
-	if (entries[i]->d_type != DT_DIR && entries[i]->d_type != DT_UNKNOWN)
+	if (dirent_type (path, entries[i]) != S_IFDIR)
 	    continue;
 
 	if (strcmp(entries[i]->d_name, "new") == 0 ||
@@ -250,7 +292,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
     notmuch_message_t *message = NULL;
     struct dirent **fs_entries = NULL;
-    int i, num_fs_entries = 0;
+    int i, num_fs_entries = 0, entry_type;
     notmuch_directory_t *directory;
     notmuch_filenames_t *db_files = NULL;
     notmuch_filenames_t *db_subdirs = NULL;
@@ -300,7 +342,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     }
 
     /* Pass 1: Recurse into all sub-directories. */
-    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
+    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
 
     for (i = 0; i < num_fs_entries; i++) {
 	if (interrupted)
@@ -308,17 +350,16 @@ add_files_recursive (notmuch_database_t *notmuch,
 
 	entry = fs_entries[i];
 
-	/* We only want to descend into directories.
-	 * But symlinks can be to directories too, of course.
-	 *
-	 * And if the filesystem doesn't tell us the file type in the
-	 * scandir results, then it might be a directory (and if not,
-	 * then we'll stat and return immediately in the next level of
-	 * recursion). */
-	if (entry->d_type != DT_DIR &&
-	    entry->d_type != DT_LNK &&
-	    entry->d_type != DT_UNKNOWN)
-	{
+	/* We only want to descend into directories (and symlinks to
+	 * directories). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    /* Be pessimistic, e.g. so we don't lose lots of mail just
+	     * because a user broke a symlink. */
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFDIR) {
 	    continue;
 	}
 
@@ -407,31 +448,13 @@ add_files_recursive (notmuch_database_t *notmuch,
 	    notmuch_filenames_move_to_next (db_subdirs);
 	}
 
-	/* If we're looking at a symlink, we only want to add it if it
-	 * links to a regular file, (and not to a directory, say).
-	 *
-	 * Similarly, if the file is of unknown type (due to filesystem
-	 * limitations), then we also need to look closer.
-	 *
-	 * In either case, a stat does the trick.
-	 */
-	if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
-	    int err;
-
-	    next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
-	    err = stat (next, &st);
-	    talloc_free (next);
-	    next = NULL;
-
-	    /* Don't emit an error for a link pointing nowhere, since
-	     * the directory-traversal pass will have already done
-	     * that. */
-	    if (err)
-		continue;
-
-	    if (! S_ISREG (st.st_mode))
-		continue;
-	} else if (entry->d_type != DT_REG) {
+	/* Only add regular files (and symlinks to regular files). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFREG) {
 	    continue;
 	}
 
diff --git a/test/new b/test/new
index a7bc146..cab7c01 100755
--- a/test/new
+++ b/test/new
@@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
 ln -s does-not-exist "${MAIL_DIR}/broken"
 output=$(NOTMUCH_NEW 2>&1)
 test_expect_equal "$output" \
-"Error reading directory ${MAIL_DIR}/broken: No such file or directory
+"Error reading file ${MAIL_DIR}/broken: No such file or directory
 Note: A fatal error was encountered: Something went wrong trying to read or write a file
 No new mail."
 rm "${MAIL_DIR}/broken"
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
  2012-05-24 19:31     ` [PATCH v3 1/4] test: Test notmuch new with a broken symlink Austin Clements
  2012-05-24 19:32     ` [PATCH v3 2/4] new: Centralize file type stat-ing logic Austin Clements
@ 2012-05-24 19:32     ` Austin Clements
  2012-05-24 20:57       ` Jani Nikula
  2012-05-24 20:57       ` Jani Nikula
  2012-05-24 19:32     ` [PATCH v3 4/4] new: Unify " Austin Clements
  2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
  4 siblings, 2 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 19:32 UTC (permalink / raw)
  To: notmuch

Before XXX, add_files_recursive could have been called on a symlink to
a non-directory.  Hence, calling it on a non-directory was not an
error, so a separate function, add_files, existed to fail loudly in
situations where the path had to be a directory.

With the new stat-ing logic, add_files_recursive is always called on
directories, so the separation of this logic is no longer necessary.
Hence, this patch moves the strict error checking previously done by
add_files into add_files_recursive.
---
 notmuch-new.c |   27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index c64f1a7..2b05605 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -308,11 +308,10 @@ add_files_recursive (notmuch_database_t *notmuch,
     }
     stat_time = time (NULL);
 
-    /* This is not an error since we may have recursed based on a
-     * symlink to a regular file, not a directory, and we don't know
-     * that until this stat. */
-    if (! S_ISDIR (st.st_mode))
-	return NOTMUCH_STATUS_SUCCESS;
+    if (! S_ISDIR (st.st_mode)) {
+	fprintf (stderr, "Error: %s is not a directory.\n", path);
+	return NOTMUCH_STATUS_FILE_ERROR;
+    }
 
     fs_mtime = st.st_mtime;
 
@@ -655,23 +654,7 @@ add_files (notmuch_database_t *notmuch,
 	   const char *path,
 	   add_files_state_t *state)
 {
-    notmuch_status_t status;
-    struct stat st;
-
-    if (stat (path, &st)) {
-	fprintf (stderr, "Error reading directory %s: %s\n",
-		 path, strerror (errno));
-	return NOTMUCH_STATUS_FILE_ERROR;
-    }
-
-    if (! S_ISDIR (st.st_mode)) {
-	fprintf (stderr, "Error: %s is not a directory.\n", path);
-	return NOTMUCH_STATUS_FILE_ERROR;
-    }
-
-    status = add_files_recursive (notmuch, path, state);
-
-    return status;
+    return add_files_recursive (notmuch, path, state);
 }
 
 /* XXX: This should be merged with the add_files function since it
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 4/4] new: Unify add_files and add_files_recursive
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
                       ` (2 preceding siblings ...)
  2012-05-24 19:32     ` [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
@ 2012-05-24 19:32     ` Austin Clements
  2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
  4 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 19:32 UTC (permalink / raw)
  To: notmuch

Since starting at the top of a directory tree and recursing within
that tree are now identical operations, there's no need for both
add_files and add_files_recursive.  This eliminates add_files (which
did nothing more than call add_files_recursive after the previous
patch) and renames add_files_recursive to add_files.
---
 notmuch-new.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 2b05605..938ae29 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -281,9 +281,9 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
  *     if fs_mtime isn't the current wall-clock time.
  */
 static notmuch_status_t
-add_files_recursive (notmuch_database_t *notmuch,
-		     const char *path,
-		     add_files_state_t *state)
+add_files (notmuch_database_t *notmuch,
+	   const char *path,
+	   add_files_state_t *state)
 {
     DIR *dir = NULL;
     struct dirent *entry = NULL;
@@ -377,7 +377,7 @@ add_files_recursive (notmuch_database_t *notmuch,
 	}
 
 	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
-	status = add_files_recursive (notmuch, next, state);
+	status = add_files (notmuch, next, state);
 	if (status) {
 	    ret = status;
 	    goto DONE;
@@ -647,16 +647,6 @@ stop_progress_printing_timer (void)
 }
 
 
-/* This is the top-level entry point for add_files. It does a couple
- * of error checks and then calls into the recursive function. */
-static notmuch_status_t
-add_files (notmuch_database_t *notmuch,
-	   const char *path,
-	   add_files_state_t *state)
-{
-    return add_files_recursive (notmuch, path, state);
-}
-
 /* XXX: This should be merged with the add_files function since it
  * shares a lot of logic with it. */
 /* Recursively count all regular files in path and all sub-directories
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 19:32     ` [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
@ 2012-05-24 20:57       ` Jani Nikula
  2012-05-24 21:57         ` Austin Clements
  2012-05-24 20:57       ` Jani Nikula
  1 sibling, 1 reply; 29+ messages in thread
From: Jani Nikula @ 2012-05-24 20:57 UTC (permalink / raw)
  To: Austin Clements, notmuch

On Thu, 24 May 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> Before XXX, add_files_recursive could have been called on a symlink to
> a non-directory.  Hence, calling it on a non-directory was not an
> error, so a separate function, add_files, existed to fail loudly in
> situations where the path had to be a directory.

"Before XXX"?

Otherwise, this 3/4 and following 4/4 patch LGTM. I didn't bother
looking at 1/4 and 2/4 again, as you say there were no changes.


BR,
Jani.


>
> With the new stat-ing logic, add_files_recursive is always called on
> directories, so the separation of this logic is no longer necessary.
> Hence, this patch moves the strict error checking previously done by
> add_files into add_files_recursive.
> ---
>  notmuch-new.c |   27 +++++----------------------
>  1 file changed, 5 insertions(+), 22 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index c64f1a7..2b05605 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -308,11 +308,10 @@ add_files_recursive (notmuch_database_t *notmuch,
>      }
>      stat_time = time (NULL);
>  
> -    /* This is not an error since we may have recursed based on a
> -     * symlink to a regular file, not a directory, and we don't know
> -     * that until this stat. */
> -    if (! S_ISDIR (st.st_mode))
> -	return NOTMUCH_STATUS_SUCCESS;
> +    if (! S_ISDIR (st.st_mode)) {
> +	fprintf (stderr, "Error: %s is not a directory.\n", path);
> +	return NOTMUCH_STATUS_FILE_ERROR;
> +    }
>  
>      fs_mtime = st.st_mtime;
>  
> @@ -655,23 +654,7 @@ add_files (notmuch_database_t *notmuch,
>  	   const char *path,
>  	   add_files_state_t *state)
>  {
> -    notmuch_status_t status;
> -    struct stat st;
> -
> -    if (stat (path, &st)) {
> -	fprintf (stderr, "Error reading directory %s: %s\n",
> -		 path, strerror (errno));
> -	return NOTMUCH_STATUS_FILE_ERROR;
> -    }
> -
> -    if (! S_ISDIR (st.st_mode)) {
> -	fprintf (stderr, "Error: %s is not a directory.\n", path);
> -	return NOTMUCH_STATUS_FILE_ERROR;
> -    }
> -
> -    status = add_files_recursive (notmuch, path, state);
> -
> -    return status;
> +    return add_files_recursive (notmuch, path, state);
>  }
>  
>  /* XXX: This should be merged with the add_files function since it
> -- 
> 1.7.10
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 19:32     ` [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
  2012-05-24 20:57       ` Jani Nikula
@ 2012-05-24 20:57       ` Jani Nikula
  2012-05-24 21:08         ` Jani Nikula
  1 sibling, 1 reply; 29+ messages in thread
From: Jani Nikula @ 2012-05-24 20:57 UTC (permalink / raw)
  To: Austin Clements, notmuch

On Thu, 24 May 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> Before XXX, add_files_recursive could have been called on a symlink to
> a non-directory.  Hence, calling it on a non-directory was not an
> error, so a separate function, add_files, existed to fail loudly in
> situations where the path had to be a directory.

"Before XXX"?

Otherwise, this 3/4 and following 4/4 patch LGTM. I didn't bother
looking at 1/4 and 2/4 again, as you say there were no changes.


BR,
Jani.


>
> With the new stat-ing logic, add_files_recursive is always called on
> directories, so the separation of this logic is no longer necessary.
> Hence, this patch moves the strict error checking previously done by
> add_files into add_files_recursive.
> ---
>  notmuch-new.c |   27 +++++----------------------
>  1 file changed, 5 insertions(+), 22 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index c64f1a7..2b05605 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -308,11 +308,10 @@ add_files_recursive (notmuch_database_t *notmuch,
>      }
>      stat_time = time (NULL);
>  
> -    /* This is not an error since we may have recursed based on a
> -     * symlink to a regular file, not a directory, and we don't know
> -     * that until this stat. */
> -    if (! S_ISDIR (st.st_mode))
> -	return NOTMUCH_STATUS_SUCCESS;
> +    if (! S_ISDIR (st.st_mode)) {
> +	fprintf (stderr, "Error: %s is not a directory.\n", path);
> +	return NOTMUCH_STATUS_FILE_ERROR;
> +    }
>  
>      fs_mtime = st.st_mtime;
>  
> @@ -655,23 +654,7 @@ add_files (notmuch_database_t *notmuch,
>  	   const char *path,
>  	   add_files_state_t *state)
>  {
> -    notmuch_status_t status;
> -    struct stat st;
> -
> -    if (stat (path, &st)) {
> -	fprintf (stderr, "Error reading directory %s: %s\n",
> -		 path, strerror (errno));
> -	return NOTMUCH_STATUS_FILE_ERROR;
> -    }
> -
> -    if (! S_ISDIR (st.st_mode)) {
> -	fprintf (stderr, "Error: %s is not a directory.\n", path);
> -	return NOTMUCH_STATUS_FILE_ERROR;
> -    }
> -
> -    status = add_files_recursive (notmuch, path, state);
> -
> -    return status;
> +    return add_files_recursive (notmuch, path, state);
>  }
>  
>  /* XXX: This should be merged with the add_files function since it
> -- 
> 1.7.10
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 20:57       ` Jani Nikula
@ 2012-05-24 21:08         ` Jani Nikula
  2012-05-24 21:44           ` Jameson Graef Rollins
  0 siblings, 1 reply; 29+ messages in thread
From: Jani Nikula @ 2012-05-24 21:08 UTC (permalink / raw)
  To: Austin Clements, notmuch


Grrh, sorry for the spam. My favourite MUA told me:

"Sending...failed to WARNING: gnome-keyring:: couldn't connect to:
/tmp/keyring-Ky8i5h/pkcs11: No such file or directory"

but sent the message anyway. Apparently msmpt tries to chat with the
gnome-keyring-daemon even though that should not happen per the man
page, and would be unnecessary and useless anyway.

J.


On Thu, 24 May 2012, Jani Nikula <jani@nikula.org> wrote:
> On Thu, 24 May 2012, Austin Clements <amdragon@MIT.EDU> wrote:
>> Before XXX, add_files_recursive could have been called on a symlink to
>> a non-directory.  Hence, calling it on a non-directory was not an
>> error, so a separate function, add_files, existed to fail loudly in
>> situations where the path had to be a directory.
>
> "Before XXX"?
>
> Otherwise, this 3/4 and following 4/4 patch LGTM. I didn't bother
> looking at 1/4 and 2/4 again, as you say there were no changes.
>
>
> BR,
> Jani.
>
>
>>
>> With the new stat-ing logic, add_files_recursive is always called on
>> directories, so the separation of this logic is no longer necessary.
>> Hence, this patch moves the strict error checking previously done by
>> add_files into add_files_recursive.
>> ---
>>  notmuch-new.c |   27 +++++----------------------
>>  1 file changed, 5 insertions(+), 22 deletions(-)
>>
>> diff --git a/notmuch-new.c b/notmuch-new.c
>> index c64f1a7..2b05605 100644
>> --- a/notmuch-new.c
>> +++ b/notmuch-new.c
>> @@ -308,11 +308,10 @@ add_files_recursive (notmuch_database_t *notmuch,
>>      }
>>      stat_time = time (NULL);
>>  
>> -    /* This is not an error since we may have recursed based on a
>> -     * symlink to a regular file, not a directory, and we don't know
>> -     * that until this stat. */
>> -    if (! S_ISDIR (st.st_mode))
>> -	return NOTMUCH_STATUS_SUCCESS;
>> +    if (! S_ISDIR (st.st_mode)) {
>> +	fprintf (stderr, "Error: %s is not a directory.\n", path);
>> +	return NOTMUCH_STATUS_FILE_ERROR;
>> +    }
>>  
>>      fs_mtime = st.st_mtime;
>>  
>> @@ -655,23 +654,7 @@ add_files (notmuch_database_t *notmuch,
>>  	   const char *path,
>>  	   add_files_state_t *state)
>>  {
>> -    notmuch_status_t status;
>> -    struct stat st;
>> -
>> -    if (stat (path, &st)) {
>> -	fprintf (stderr, "Error reading directory %s: %s\n",
>> -		 path, strerror (errno));
>> -	return NOTMUCH_STATUS_FILE_ERROR;
>> -    }
>> -
>> -    if (! S_ISDIR (st.st_mode)) {
>> -	fprintf (stderr, "Error: %s is not a directory.\n", path);
>> -	return NOTMUCH_STATUS_FILE_ERROR;
>> -    }
>> -
>> -    status = add_files_recursive (notmuch, path, state);
>> -
>> -    return status;
>> +    return add_files_recursive (notmuch, path, state);
>>  }
>>  
>>  /* XXX: This should be merged with the add_files function since it
>> -- 
>> 1.7.10
>>
>> _______________________________________________
>> notmuch mailing list
>> notmuch@notmuchmail.org
>> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 21:08         ` Jani Nikula
@ 2012-05-24 21:44           ` Jameson Graef Rollins
  0 siblings, 0 replies; 29+ messages in thread
From: Jameson Graef Rollins @ 2012-05-24 21:44 UTC (permalink / raw)
  To: Jani Nikula, Austin Clements, notmuch

On Thu, May 24 2012, Jani Nikula <jani@nikula.org> wrote:
> Grrh, sorry for the spam. My favourite MUA told me:
>
> "Sending...failed to WARNING: gnome-keyring:: couldn't connect to:
> /tmp/keyring-Ky8i5h/pkcs11: No such file or directory"
>
> but sent the message anyway. Apparently msmpt tries to chat with the
> gnome-keyring-daemon even though that should not happen per the man
> page, and would be unnecessary and useless anyway.

heh.  I've been getting that message all over the place recently.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 20:57       ` Jani Nikula
@ 2012-05-24 21:57         ` Austin Clements
  0 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 21:57 UTC (permalink / raw)
  To: Jani Nikula; +Cc: notmuch

Quoth Jani Nikula on May 24 at 11:57 pm:
> On Thu, 24 May 2012, Austin Clements <amdragon@MIT.EDU> wrote:
> > Before XXX, add_files_recursive could have been called on a symlink to
> > a non-directory.  Hence, calling it on a non-directory was not an
> > error, so a separate function, add_files, existed to fail loudly in
> > situations where the path had to be a directory.
> 
> "Before XXX"?

Arg.  I meant to replace that with a commit ID, but now I don't
remember what commit I wanted to reference.

> Otherwise, this 3/4 and following 4/4 patch LGTM. I didn't bother
> looking at 1/4 and 2/4 again, as you say there were no changes.
> 
> 
> BR,
> Jani.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new
  2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
                       ` (3 preceding siblings ...)
  2012-05-24 19:32     ` [PATCH v3 4/4] new: Unify " Austin Clements
@ 2012-05-24 22:01     ` Austin Clements
  2012-05-24 22:01       ` [PATCH v4 1/4] test: Test notmuch new with a broken symlink Austin Clements
                         ` (3 more replies)
  4 siblings, 4 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 22:01 UTC (permalink / raw)
  To: notmuch

This time for sure.  v4 fixes a left-over XXX in the commit message of
patch 3.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v4 1/4] test: Test notmuch new with a broken symlink
  2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
@ 2012-05-24 22:01       ` Austin Clements
  2012-05-25  1:03         ` David Bremner
  2012-05-24 22:01       ` [PATCH v4 2/4] new: Centralize file type stat-ing logic Austin Clements
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Austin Clements @ 2012-05-24 22:01 UTC (permalink / raw)
  To: notmuch

---
 test/new |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/test/new b/test/new
index 99f9913..a7bc146 100755
--- a/test/new
+++ b/test/new
@@ -136,6 +136,16 @@ output=$(NOTMUCH_NEW)
 test_expect_equal "$output" "Added 1 new message to the database."
 
 
+test_begin_subtest "Broken symlink aborts"
+ln -s does-not-exist "${MAIL_DIR}/broken"
+output=$(NOTMUCH_NEW 2>&1)
+test_expect_equal "$output" \
+"Error reading directory ${MAIL_DIR}/broken: No such file or directory
+Note: A fatal error was encountered: Something went wrong trying to read or write a file
+No new mail."
+rm "${MAIL_DIR}/broken"
+
+
 test_begin_subtest "New two-level directory"
 
 generate_message [dir]=two/levels
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v4 2/4] new: Centralize file type stat-ing logic
  2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-24 22:01       ` [PATCH v4 1/4] test: Test notmuch new with a broken symlink Austin Clements
@ 2012-05-24 22:01       ` Austin Clements
  2012-05-24 22:01       ` [PATCH v4 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
  2012-05-24 22:01       ` [PATCH v4 4/4] new: Unify " Austin Clements
  3 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 22:01 UTC (permalink / raw)
  To: notmuch

This moves our logic to get a file's type into one function.  This has
several benefits: we can support OSes and file systems that do not
provide dirent.d_type or always return DT_UNKNOWN, complex
symlink-handling logic has been replaced by a simple stat fall-through
in one place, and the error message for un-stat-able file is more
accurate (previously, the error always mentioned directories, even
though a broken symlink is not a directory).
---
 notmuch-new.c |  103 +++++++++++++++++++++++++++++++++++----------------------
 test/new      |    2 +-
 2 files changed, 64 insertions(+), 41 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 72dd558..c64f1a7 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -154,6 +154,48 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
     return strcmp ((*a)->d_name, (*b)->d_name);
 }
 
+/* Return the type of a directory entry relative to path as a stat(2)
+ * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno
+ * if the file's type cannot be determined (which includes dangling
+ * symlinks).
+ */
+static int
+dirent_type (const char *path, const struct dirent *entry)
+{
+    struct stat statbuf;
+    char *abspath;
+    int err, saved_errno;
+
+#ifdef _DIRENT_HAVE_D_TYPE
+    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that
+     * we'll fall through to stat and get the real file type. */
+    static const mode_t modes[] = {
+	[DT_BLK]  = S_IFBLK,
+	[DT_CHR]  = S_IFCHR,
+	[DT_DIR]  = S_IFDIR,
+	[DT_FIFO] = S_IFIFO,
+	[DT_REG]  = S_IFREG,
+	[DT_SOCK] = S_IFSOCK
+    };
+    if (entry->d_type < ARRAY_SIZE(modes) && modes[entry->d_type])
+	return modes[entry->d_type];
+#endif
+
+    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name);
+    if (!abspath) {
+	errno = ENOMEM;
+	return -1;
+    }
+    err = stat(abspath, &statbuf);
+    saved_errno = errno;
+    talloc_free (abspath);
+    if (err < 0) {
+	errno = saved_errno;
+	return -1;
+    }
+    return statbuf.st_mode & S_IFMT;
+}
+
 /* Test if the directory looks like a Maildir directory.
  *
  * Search through the array of directory entries to see if we can find all
@@ -162,12 +204,12 @@ dirent_sort_strcmp_name (const struct dirent **a, const struct dirent **b)
  * Return 1 if the directory looks like a Maildir and 0 otherwise.
  */
 static int
-_entries_resemble_maildir (struct dirent **entries, int count)
+_entries_resemble_maildir (const char *path, struct dirent **entries, int count)
 {
     int i, found = 0;
 
     for (i = 0; i < count; i++) {
-	if (entries[i]->d_type != DT_DIR && entries[i]->d_type != DT_UNKNOWN)
+	if (dirent_type (path, entries[i]) != S_IFDIR)
 	    continue;
 
 	if (strcmp(entries[i]->d_name, "new") == 0 ||
@@ -250,7 +292,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
     notmuch_message_t *message = NULL;
     struct dirent **fs_entries = NULL;
-    int i, num_fs_entries = 0;
+    int i, num_fs_entries = 0, entry_type;
     notmuch_directory_t *directory;
     notmuch_filenames_t *db_files = NULL;
     notmuch_filenames_t *db_subdirs = NULL;
@@ -300,7 +342,7 @@ add_files_recursive (notmuch_database_t *notmuch,
     }
 
     /* Pass 1: Recurse into all sub-directories. */
-    is_maildir = _entries_resemble_maildir (fs_entries, num_fs_entries);
+    is_maildir = _entries_resemble_maildir (path, fs_entries, num_fs_entries);
 
     for (i = 0; i < num_fs_entries; i++) {
 	if (interrupted)
@@ -308,17 +350,16 @@ add_files_recursive (notmuch_database_t *notmuch,
 
 	entry = fs_entries[i];
 
-	/* We only want to descend into directories.
-	 * But symlinks can be to directories too, of course.
-	 *
-	 * And if the filesystem doesn't tell us the file type in the
-	 * scandir results, then it might be a directory (and if not,
-	 * then we'll stat and return immediately in the next level of
-	 * recursion). */
-	if (entry->d_type != DT_DIR &&
-	    entry->d_type != DT_LNK &&
-	    entry->d_type != DT_UNKNOWN)
-	{
+	/* We only want to descend into directories (and symlinks to
+	 * directories). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    /* Be pessimistic, e.g. so we don't lose lots of mail just
+	     * because a user broke a symlink. */
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFDIR) {
 	    continue;
 	}
 
@@ -407,31 +448,13 @@ add_files_recursive (notmuch_database_t *notmuch,
 	    notmuch_filenames_move_to_next (db_subdirs);
 	}
 
-	/* If we're looking at a symlink, we only want to add it if it
-	 * links to a regular file, (and not to a directory, say).
-	 *
-	 * Similarly, if the file is of unknown type (due to filesystem
-	 * limitations), then we also need to look closer.
-	 *
-	 * In either case, a stat does the trick.
-	 */
-	if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) {
-	    int err;
-
-	    next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
-	    err = stat (next, &st);
-	    talloc_free (next);
-	    next = NULL;
-
-	    /* Don't emit an error for a link pointing nowhere, since
-	     * the directory-traversal pass will have already done
-	     * that. */
-	    if (err)
-		continue;
-
-	    if (! S_ISREG (st.st_mode))
-		continue;
-	} else if (entry->d_type != DT_REG) {
+	/* Only add regular files (and symlinks to regular files). */
+	entry_type = dirent_type (path, entry);
+	if (entry_type == -1) {
+	    fprintf (stderr, "Error reading file %s/%s: %s\n",
+		     path, entry->d_name, strerror (errno));
+	    return NOTMUCH_STATUS_FILE_ERROR;
+	} else if (entry_type != S_IFREG) {
 	    continue;
 	}
 
diff --git a/test/new b/test/new
index a7bc146..cab7c01 100755
--- a/test/new
+++ b/test/new
@@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts"
 ln -s does-not-exist "${MAIL_DIR}/broken"
 output=$(NOTMUCH_NEW 2>&1)
 test_expect_equal "$output" \
-"Error reading directory ${MAIL_DIR}/broken: No such file or directory
+"Error reading file ${MAIL_DIR}/broken: No such file or directory
 Note: A fatal error was encountered: Something went wrong trying to read or write a file
 No new mail."
 rm "${MAIL_DIR}/broken"
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v4 3/4] new: Merge error checks from add_files and add_files_recursive
  2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
  2012-05-24 22:01       ` [PATCH v4 1/4] test: Test notmuch new with a broken symlink Austin Clements
  2012-05-24 22:01       ` [PATCH v4 2/4] new: Centralize file type stat-ing logic Austin Clements
@ 2012-05-24 22:01       ` Austin Clements
  2012-05-24 22:01       ` [PATCH v4 4/4] new: Unify " Austin Clements
  3 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 22:01 UTC (permalink / raw)
  To: notmuch

Previously, add_files_recursive could have been called on a symlink to
a non-directory.  Hence, calling it on a non-directory was not an
error, so a separate function, add_files, existed to fail loudly in
situations where the path had to be a directory.

With the new stat-ing logic, add_files_recursive is always called on
directories, so the separation of this logic is no longer necessary.
Hence, this patch moves the strict error checking previously done by
add_files into add_files_recursive.
---
 notmuch-new.c |   27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index c64f1a7..2b05605 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -308,11 +308,10 @@ add_files_recursive (notmuch_database_t *notmuch,
     }
     stat_time = time (NULL);
 
-    /* This is not an error since we may have recursed based on a
-     * symlink to a regular file, not a directory, and we don't know
-     * that until this stat. */
-    if (! S_ISDIR (st.st_mode))
-	return NOTMUCH_STATUS_SUCCESS;
+    if (! S_ISDIR (st.st_mode)) {
+	fprintf (stderr, "Error: %s is not a directory.\n", path);
+	return NOTMUCH_STATUS_FILE_ERROR;
+    }
 
     fs_mtime = st.st_mtime;
 
@@ -655,23 +654,7 @@ add_files (notmuch_database_t *notmuch,
 	   const char *path,
 	   add_files_state_t *state)
 {
-    notmuch_status_t status;
-    struct stat st;
-
-    if (stat (path, &st)) {
-	fprintf (stderr, "Error reading directory %s: %s\n",
-		 path, strerror (errno));
-	return NOTMUCH_STATUS_FILE_ERROR;
-    }
-
-    if (! S_ISDIR (st.st_mode)) {
-	fprintf (stderr, "Error: %s is not a directory.\n", path);
-	return NOTMUCH_STATUS_FILE_ERROR;
-    }
-
-    status = add_files_recursive (notmuch, path, state);
-
-    return status;
+    return add_files_recursive (notmuch, path, state);
 }
 
 /* XXX: This should be merged with the add_files function since it
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v4 4/4] new: Unify add_files and add_files_recursive
  2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
                         ` (2 preceding siblings ...)
  2012-05-24 22:01       ` [PATCH v4 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
@ 2012-05-24 22:01       ` Austin Clements
  3 siblings, 0 replies; 29+ messages in thread
From: Austin Clements @ 2012-05-24 22:01 UTC (permalink / raw)
  To: notmuch

Since starting at the top of a directory tree and recursing within
that tree are now identical operations, there's no need for both
add_files and add_files_recursive.  This eliminates add_files (which
did nothing more than call add_files_recursive after the previous
patch) and renames add_files_recursive to add_files.
---
 notmuch-new.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 2b05605..938ae29 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -281,9 +281,9 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
  *     if fs_mtime isn't the current wall-clock time.
  */
 static notmuch_status_t
-add_files_recursive (notmuch_database_t *notmuch,
-		     const char *path,
-		     add_files_state_t *state)
+add_files (notmuch_database_t *notmuch,
+	   const char *path,
+	   add_files_state_t *state)
 {
     DIR *dir = NULL;
     struct dirent *entry = NULL;
@@ -377,7 +377,7 @@ add_files_recursive (notmuch_database_t *notmuch,
 	}
 
 	next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
-	status = add_files_recursive (notmuch, next, state);
+	status = add_files (notmuch, next, state);
 	if (status) {
 	    ret = status;
 	    goto DONE;
@@ -647,16 +647,6 @@ stop_progress_printing_timer (void)
 }
 
 
-/* This is the top-level entry point for add_files. It does a couple
- * of error checks and then calls into the recursive function. */
-static notmuch_status_t
-add_files (notmuch_database_t *notmuch,
-	   const char *path,
-	   add_files_state_t *state)
-{
-    return add_files_recursive (notmuch, path, state);
-}
-
 /* XXX: This should be merged with the add_files function since it
  * shares a lot of logic with it. */
 /* Recursively count all regular files in path and all sub-directories
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/4] test: Test notmuch new with a broken symlink
  2012-05-24 22:01       ` [PATCH v4 1/4] test: Test notmuch new with a broken symlink Austin Clements
@ 2012-05-25  1:03         ` David Bremner
  0 siblings, 0 replies; 29+ messages in thread
From: David Bremner @ 2012-05-25  1:03 UTC (permalink / raw)
  To: Austin Clements, notmuch

Austin Clements <amdragon@MIT.EDU> writes:

> ---
>  test/new |   10 ++++++++++
>  1 file changed, 10 insertions(+)

All 4 pushed to master.

d

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2012-05-25  1:03 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-07 18:09 [PATCH 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
2012-05-07 18:09 ` [PATCH 1/2] test: Test notmuch new with a broken symlink Austin Clements
2012-05-07 18:09 ` [PATCH 2/2] new: Centralize file type stat-ing logic Austin Clements
2012-05-07 21:08   ` Jani Nikula
2012-05-07 21:14     ` Jani Nikula
2012-05-07 22:18     ` Austin Clements
2012-05-07 22:20 ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
2012-05-07 22:20   ` [PATCH v2 1/2] test: Test notmuch new with a broken symlink Austin Clements
2012-05-07 22:20   ` [PATCH v2 2/2] new: Centralize file type stat-ing logic Austin Clements
2012-05-08  7:58     ` Jani Nikula
2012-05-08  8:33       ` Jani Nikula
2012-05-09 18:27         ` Austin Clements
2012-05-09 18:45   ` [PATCH v2 0/2] Better, more portable stat-ing logic for notmuch new Austin Clements
2012-05-24 19:31   ` [PATCH v3 0/4] " Austin Clements
2012-05-24 19:31     ` [PATCH v3 1/4] test: Test notmuch new with a broken symlink Austin Clements
2012-05-24 19:32     ` [PATCH v3 2/4] new: Centralize file type stat-ing logic Austin Clements
2012-05-24 19:32     ` [PATCH v3 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
2012-05-24 20:57       ` Jani Nikula
2012-05-24 21:57         ` Austin Clements
2012-05-24 20:57       ` Jani Nikula
2012-05-24 21:08         ` Jani Nikula
2012-05-24 21:44           ` Jameson Graef Rollins
2012-05-24 19:32     ` [PATCH v3 4/4] new: Unify " Austin Clements
2012-05-24 22:01     ` [PATCH v4 0/4] Better, more portable stat-ing logic for notmuch new Austin Clements
2012-05-24 22:01       ` [PATCH v4 1/4] test: Test notmuch new with a broken symlink Austin Clements
2012-05-25  1:03         ` David Bremner
2012-05-24 22:01       ` [PATCH v4 2/4] new: Centralize file type stat-ing logic Austin Clements
2012-05-24 22:01       ` [PATCH v4 3/4] new: Merge error checks from add_files and add_files_recursive Austin Clements
2012-05-24 22:01       ` [PATCH v4 4/4] new: Unify " Austin Clements

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).