unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: 21459@debbugs.gnu.org
Cc: bug-guix@gnu.org
Subject: bug#21459: Race condition in tests/tail-2/assert.sh
Date: Fri, 11 Sep 2015 18:23:31 +0200	[thread overview]
Message-ID: <87wpvw2ad8.fsf@gnu.org> (raw)

Hello,

We have observed intermittent failures of tests/tail-2/assert.sh
(Coreutils 8.24, libc 2.22), especially showing up on relatively slow
machines (armhf and mips64el.)

The failure is with ‘tail --follow=name’, which, in inotify mode, would
fail to report that file ‘foo’ has been deleted.

The strace of a correct execution (x86_64) is like this:

--8<---------------cut here---------------start------------->8---
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init()                          = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK)          = 6
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6)                                = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK)        = 6
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6)                                = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK)        = -1 ENOENT (No such file or directory)
lstat("foo", 0x7ffee174c3b0)            = -1 ENOENT (No such file or directory)
write(2, "tail: ", 6)                   = 6
write(2, "foo", 3)                      = 3
write(2, ": No such file or directory", 27) = 27
write(2, "\n", 1)                       = 1
close(4)                                = 0
read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16
inotify_rm_watch(5, 3)                  = -1 EINVAL (Invalid argument)
--8<---------------cut here---------------end--------------->8---

For a *failing* execution (armhf), we get:

--8<---------------cut here---------------start------------->8---
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init()                          = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
read(5,
--8<---------------cut here---------------end--------------->8---

In both cases, reading from the inotify file descriptor (number 5)
returns a notification for watch #3 (corresponding to ‘foo’), with mask
IN_ATTRIB (value 4).

However, the open("foo") call that immediately follows does *not* return
ENOENT in the failing case: The file is still there.

The kernel’s ‘vfs_unlink’ goes like this:

        fsnotify_link_count(target);                    /* IN_ATTRIB */
        d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */

So, ‘tail’ first receives the IN_ATTRIB notification corresponding to
the link count decrease on ‘foo’; at that point, ‘foo’ is still
available.  And then, ‘tail’ should receive the IN_DELETE_SELF
notification, at which point ‘foo’ would have been actually been
unlinked.  But in practice we don’t seem to be receiving IN_DELETE_SELF,
even in the succeeding case.

I think the problem happens when ‘tail’ opens ‘foo’ right in between of
the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
report anything.

Does that make sense?

Thanks,
Ludo’.

             reply	other threads:[~2015-09-11 16:23 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-11 16:23 Ludovic Courtès [this message]
2015-09-11 17:18 ` bug#21460: Race condition in tests/tail-2/assert.sh Paul Eggert
2015-09-11 20:55   ` Ludovic Courtès
2015-09-11 22:49     ` Pádraig Brady
2015-09-11 23:48       ` Pádraig Brady
2015-09-12  1:09       ` Paul Eggert
2015-09-12  2:22         ` Pádraig Brady
2015-10-02 15:50           ` Pádraig Brady
2015-09-11 20:34 ` bug#21459: " Pádraig Brady

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wpvw2ad8.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=21459@debbugs.gnu.org \
    --cc=bug-guix@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).