From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 65018431FB6 for ; Mon, 7 May 2012 15:18:41 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3cXfqcv1+zrc for ; Mon, 7 May 2012 15:18:40 -0700 (PDT) Received: from dmz-mailsec-scanner-4.mit.edu (DMZ-MAILSEC-SCANNER-4.MIT.EDU [18.9.25.15]) by olra.theworths.org (Postfix) with ESMTP id 1F65F431FAE for ; Mon, 7 May 2012 15:18:40 -0700 (PDT) X-AuditID: 1209190f-b7f4f6d00000092b-79-4fa84a3dc175 Received: from mailhub-auth-2.mit.edu ( [18.7.62.36]) by dmz-mailsec-scanner-4.mit.edu (Symantec Messaging Gateway) with SMTP id 38.29.02347.D3A48AF4; Mon, 7 May 2012 18:18:37 -0400 (EDT) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-2.mit.edu (8.13.8/8.9.2) with ESMTP id q47MIZuo013106; Mon, 7 May 2012 18:18:36 -0400 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q47MIXDY005680 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Mon, 7 May 2012 18:18:34 -0400 (EDT) Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77) (envelope-from ) id 1SRWGH-0000B2-Hc; Mon, 07 May 2012 18:18:33 -0400 Date: Mon, 7 May 2012 18:18:33 -0400 From: Austin Clements To: Jani Nikula Subject: Re: [PATCH 2/2] new: Centralize file type stat-ing logic Message-ID: <20120507221833.GW2704@mit.edu> References: <1336414186-15293-1-git-send-email-amdragon@mit.edu> <1336414186-15293-3-git-send-email-amdragon@mit.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrCKsWRmVeSWpSXmKPExsUixG6nomvrtcLf4PlOdoum6c4W12/OZLaY cX4XiwOzx637r9k9nq26xeyx7OhPxgDmKC6blNSczLLUIn27BK6MeRO+sRRsdK/49fALUwPj c6MuRk4OCQETiS9bDrND2GISF+6tZwOxhQT2MUqs3MnRxcgFZK9nlGg+t4AdwjnBJLH43DxW CGcJo8Snd90sIC0sAioSk7vesoLYbAIaEtv2L2cEsUUEFCU2n9wPZjML+Eq8vfYArEZYwFFi +pLNzF2MHBy8AtoS/1brQcwE2vy0+RTYGbwCghInZz5hgejVkdi59Q4bSD2zgLTE8n8cEGF5 ieats5lBbE6BQInvjzaA2aJA50w5uY1tAqPwLCSTZiGZNAth0iwkkxYwsqxilE3JrdLNTczM KU5N1i1OTszLSy3SNdHLzSzRS00p3cQIigxOSf4djN8OKh1iFOBgVOLhVXyx3F+INbGsuDL3 EKMkB5OSKC+z2wp/Ib6k/JTKjMTijPii0pzU4kOMEhzMSiK8bWJAOd6UxMqq1KJ8mJQ0B4uS OK+a1js/IYH0xJLU7NTUgtQimKwMB4eSBO85T6BGwaLU9NSKtMycEoQ0EwcnyHAeoOHeIDW8 xQWJucWZ6RD5U4yKUuK8fSAJAZBERmkeXC8scb1iFAd6RZg3D6SKB5j04LpfAQ1mAhq8+dky kMEliQgpqQZGkenxJUfq9s5qqbTf+ZPnXpXAyVnS1sU5/qdLueSOrGsP0REoZHfc2nHF9NRz NcutE1T3PJ6Wu/epfMaGHRlKC1U/vNl6fv+cmjs7Yn2C3nzZ/HWBRzQnY9P5Naz9NeF7p5wx tjK/zcA857/rslZF8T033u55tbFjwsvZm7Q711/gWRRd+nzZXCWW4oxEQy3mouJEAA45T043 AwAA Cc: Notmuch Mail , Vladimir Marek X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 May 2012 22:18:41 -0000 Quoth Jani Nikula on May 08 at 12:08 am: > On May 7, 2012 9:10 PM, "Austin Clements" <[1]amdragon@mit.edu> wrote: > > > > This moves our logic to get a file's type into one function.  This has > > several benefits: we can support OSes and file systems that do not > > provide dirent.d_type or always return DT_UNKNOWN, complex > > symlink-handling logic has been replaced by a simple stat fall-through > > in one place, and the error message for un-stat-able file is more > > accurate (previously, the error always mentioned directories, even > > though a broken symlink is not a directory). > > Please find some quick drive-by-review below. Thanks for the review! New version on its way... > J. > > > --- > >  notmuch-new.c |   99 > ++++++++++++++++++++++++++++++++++----------------------- > >  test/new      |    2 +- > >  2 files changed, 60 insertions(+), 41 deletions(-) > > > > diff --git a/notmuch-new.c b/notmuch-new.c > > index cb720cc..cf2580e 100644 > > --- a/notmuch-new.c > > +++ b/notmuch-new.c > > @@ -154,6 +154,44 @@ dirent_sort_strcmp_name (const struct dirent **a, > const struct dirent **b) > >     return strcmp ((*a)->d_name, (*b)->d_name); > >  } > > > > +/* Return the type of a directory entry relative to path as a stat(2) > > + * mode.  Like stat, this follows symlinks.  Returns -1 and sets errno > > + * if the file's type cannot be determined (which includes dangling > > + * symlinks). > > + */ > > +static int > > +dirent_type (const char *path, const struct dirent *entry) > > +{ > > +    struct stat statbuf; > > +    char *abspath; > > +    int err; > > + > > +#ifdef _DIRENT_HAVE_D_TYPE > > +    /* Mapping from d_type to stat mode_t.  We omit DT_LNK so that > > +     * we'll fall through to stat and get the real file type. */ > > +    static const mode_t modes[] = { > > +       [DT_BLK]  = S_IFBLK, > > +       [DT_CHR]  = S_IFCHR, > > +       [DT_DIR]  = S_IFDIR, > > +       [DT_FIFO] = S_IFIFO, > > +       [DT_REG]  = S_IFREG, > > +       [DT_SOCK] = S_IFSOCK > > +    }; > > +    if (entry->d_type < sizeof(modes)/sizeof(modes[0]) && > > ARRAY_SIZE() Huh. I went looking for that. I wonder how I missed it. > > +       modes[entry->d_type]) > > +       return modes[entry->d_type]; > > +#endif > > + > > +    abspath = talloc_asprintf (NULL, "%s/%s", path, entry->d_name); > > +    if (!abspath) > > +       return -1; > > Does talloc set errno in this case? I suspect not. > > > +    err = stat(abspath, &statbuf); > > +    talloc_free (abspath); > > This likely breaks your promise about errno. You can't trust talloc_free() > not calling some function that sets errno. Setting errno was a late and apparently not well thought-out addition. The new patch should set it on all of the error paths and save it around things that may modify it. > > +    if (err < 0) > > +       return -1; > > +    return statbuf.st_mode & S_IFMT; > > +} > > + > >  /* Test if the directory looks like a Maildir directory. > >  * > >  * Search through the array of directory entries to see if we can find > all > > @@ -162,12 +200,12 @@ dirent_sort_strcmp_name (const struct dirent **a, > const struct dirent **b) > >  * Return 1 if the directory looks like a Maildir and 0 otherwise. > >  */ > >  static int > > -_entries_resemble_maildir (struct dirent **entries, int count) > > +_entries_resemble_maildir (const char *path, struct dirent **entries, > int count) > >  { > >     int i, found = 0; > > > >     for (i = 0; i < count; i++) { > > -       if (entries[i]->d_type != DT_DIR && entries[i]->d_type != > DT_UNKNOWN) > > +       if (dirent_type (path, entries[i]) != S_IFDIR) > >            continue; > > > >        if (strcmp(entries[i]->d_name, "new") == 0 || > > @@ -250,7 +288,7 @@ add_files_recursive (notmuch_database_t *notmuch, > >     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS; > >     notmuch_message_t *message = NULL; > >     struct dirent **fs_entries = NULL; > > -    int i, num_fs_entries; > > +    int i, num_fs_entries, entry_type; > >     notmuch_directory_t *directory; > >     notmuch_filenames_t *db_files = NULL; > >     notmuch_filenames_t *db_subdirs = NULL; > > @@ -317,7 +355,7 @@ add_files_recursive (notmuch_database_t *notmuch, > >     } > > > >     /* Pass 1: Recurse into all sub-directories. */ > > -    is_maildir = _entries_resemble_maildir (fs_entries, > num_fs_entries); > > +    is_maildir = _entries_resemble_maildir (path, fs_entries, > num_fs_entries); > > > >     for (i = 0; i < num_fs_entries; i++) { > >        if (interrupted) > > @@ -325,17 +363,16 @@ add_files_recursive (notmuch_database_t *notmuch, > > > >        entry = fs_entries[i]; > > > > -       /* We only want to descend into directories. > > -        * But symlinks can be to directories too, of course. > > -        * > > -        * And if the filesystem doesn't tell us the file type in the > > -        * scandir results, then it might be a directory (and if not, > > -        * then we'll stat and return immediately in the next level of > > -        * recursion). */ > > -       if (entry->d_type != DT_DIR && > > -           entry->d_type != DT_LNK && > > -           entry->d_type != DT_UNKNOWN) > > -       { > > +       /* We only want to descend into directories (and symlinks to > > +        * directories). */ > > +       entry_type = dirent_type (path, entry); > > +       if (entry_type == -1) { > > +           /* Be pessimistic, e.g. so we don't loose lots of mail > > s/loose/lose/ ? Oops. > > +            * just because a user broke a symlink. */ > > +           fprintf (stderr, "Error reading file %s/%s: %s\n", > > +                    path, entry->d_name, strerror (errno)); > > You can't trust errno here, as explained above. Fixed above. > > +           return NOTMUCH_STATUS_FILE_ERROR; > > +       } else if (entry_type != S_IFDIR) { > >            continue; > >        } > > > > @@ -425,31 +462,13 @@ add_files_recursive (notmuch_database_t *notmuch, > >            notmuch_filenames_move_to_next (db_subdirs); > >        } > > > > -       /* If we're looking at a symlink, we only want to add it if it > > -        * links to a regular file, (and not to a directory, say). > > -        * > > -        * Similarly, if the file is of unknown type (due to filesystem > > -        * limitations), then we also need to look closer. > > -        * > > -        * In either case, a stat does the trick. > > -        */ > > -       if (entry->d_type == DT_LNK || entry->d_type == DT_UNKNOWN) { > > -           int err; > > - > > -           next = talloc_asprintf (notmuch, "%s/%s", path, > entry->d_name); > > -           err = stat (next, &st); > > -           talloc_free (next); > > -           next = NULL; > > - > > -           /* Don't emit an error for a link pointing nowhere, since > > -            * the directory-traversal pass will have already done > > -            * that. */ > > -           if (err) > > -               continue; > > - > > -           if (! S_ISREG (st.st_mode)) > > -               continue; > > -       } else if (entry->d_type != DT_REG) { > > +       /* Only add regular files (and symlinks to regular files). */ > > +       entry_type = dirent_type (path, entry); > > +       if (entry_type == -1) { > > +           fprintf (stderr, "Error reading file %s/%s: %s\n", > > +                    path, entry->d_name, strerror (errno)); > > Ditto. > > > +           return NOTMUCH_STATUS_FILE_ERROR; > > +       } else if (entry_type != S_IFREG) { > >            continue; > >        } > > > > diff --git a/test/new b/test/new > > index 26253db..e3900f5 100755 > > --- a/test/new > > +++ b/test/new > > @@ -140,7 +140,7 @@ test_begin_subtest "Broken symlink aborts" > >  ln -s does-not-exist "${MAIL_DIR}/broken" > >  output=$(NOTMUCH_NEW 2>&1) > >  test_expect_equal "$output" \ > > -"Error reading directory /run/shm/nm/tmp.new/mail/broken: No such file > or directory > > +"Error reading file /run/shm/nm/tmp.new/mail/broken: No such file or > directory > >  Note: A fatal error was encountered: Something went wrong trying to > read or write a file > >  No new mail." > >  rm "${MAIL_DIR}/broken" > > > > _______________________________________________ > > notmuch mailing list > > [2]notmuch@notmuchmail.org > > [3]http://notmuchmail.org/mailman/listinfo/notmuch