From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#20597: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 13:33:49 +0200
Message-ID: <87617i9plu.fsf_-___38483.1194453456$1432467320$gmane$org@gnu.org>
References: <55584206.8020101@uwaterloo.ca> <871tich8ui.fsf@gnu.org> <5561771F.2010203@uwaterloo.ca>
In-Reply-To: <5561771F.2010203@uwaterloo.ca> (Andy Patterson's message of "Sun, 24 May 2015 03:00:47 -0400")
To: Andy Patterson
Cc: 20597@debbugs.gnu.org, bug-gnulib@gnu.org

(Please keep 20597@debbugs.gnu.org Cc'd.)

(Gnulib: please scroll further down for the ‘unlinkat’ issue.)

Andy Patterson wrote:

> > I suppose this is Guix 0.8.2 on top of another distribution, right?  Did
> > you install from source or from the binary tarball?
> > Did you enable substitutes (info "(guix) Substitutes")?
>
> I was using the USB install medium in a live environment.

So this is on GuixSD 0.8.2.  ‘test-suite.log’ indeed mentions
Linux-libre 4.0.2.

> I had substitutes enabled (I'm pretty sure they're enabled by default
> here, but I also enabled them manually just to be sure). I wasn't able
> to install anything with substitutes enabled; it would always stall
> while trying to update the substitutes list from hydra. When my
> network went down briefly, it informed me that it was still at 0.0%
> before exiting. I think that this is probably a separate issue, but
> one which I was less concerned about since I didn't want to use
> substitutes anyway.

OK.  hydra.gnu.org is unfortunately too often overloaded these days, so
you probably arrived on a bad day.  Nevertheless, the workaround for
this specific issue is for you to use substitutes, which sidesteps the
bug described below.

>> Does the build succeed if you run it another time with:
>>
>>   guix build tar -K -c 1
>
> I tried this (with --no-substitutes), but I don't think the test suite
> actually runs in parallel. I didn't notice any difference in that regard
> when it was running; it seemed to take up the same amount of time with
> or without -c 1. I had the same tests fail with the flag enabled.

Oh you must be right.  Looking at tests/Makefile.in, I see:

--8<---------------cut here---------------start------------->8---
check-local: atconfig atlocal $(TESTSUITE)
	$(SHELL) $(TESTSUITE) $(TESTSUITEFLAGS)
--8<---------------cut here---------------end--------------->8---

... which shows that ./testsuite is not automatically passed -j,
contrary to what I thought.

reports a similar issue but on a different OS.

I just tried this in a GuixSD VM with Linux-libre 4.0.2:

--8<---------------cut here---------------start------------->8---
mkdir foo
mkdir bar
echo foo/foo_file > foo/foo_file
echo bar/bar_file > bar/bar_file
tar -cvf foo.tar --remove-files -C foo . -C ../bar .
find .
stat bar
--8<---------------cut here---------------end--------------->8---

And indeed, it fails (that is, ‘bar’ is left behind.)  It works fine on
4.0.4-gnu though.

On 4.0.2-gnu, I strace’d the ‘tar’ command above:

--8<---------------cut here---------------start------------->8---
openat(AT_FDCWD, "foo", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
[...]
openat(4, ".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 5
[...]
openat(5, "foo_file", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 6
[...]
openat(4, "../bar", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
newfstatat(5, ".", {st_mode=S_IFDIR|0755, st_size=60, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(5, ".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 6
[...]
openat(6, "bar_file", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
write(1, "./bar_file\n", 11) = 11
read(7, "x\n", 2) = 2
fstat(7, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
close(7) = 0
fstat(6, {st_mode=S_IFDIR|0755, st_size=60, ...}) = 0
brk(0x1a34000) = 0x1a34000
close(6) = 0
write(3, "./\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10240) = 10240
close(3) = 0
unlinkat(4, "foo_file", 0) = 0
unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
unlinkat(5, "bar_file", 0) = 0
unlinkat(4, "../bar", AT_REMOVEDIR) = -1 ENOENT (No such file or directory)
--8<---------------cut here---------------end--------------->8---

Contrast this with the same thing on 4.0.4-gnu:

--8<---------------cut here---------------start------------->8---
unlinkat(4, "foo_file", 0) = 0
unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
unlinkat(5, "bar_file", 0) = 0
unlinkat(4, "../bar", AT_REMOVEDIR) = 0
--8<---------------cut here---------------end--------------->8---

So this looks like a 4.0.2 kernel bug that Gnulib’s unlinkat should
perhaps work around.

Thoughts?

Thanks,
Ludo’.
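
For reference, the directory-related calls in the traces above can be replayed without tar.  What follows is only a minimal sketch in C, not code taken from tar or Gnulib.  The function name replay_unlinkat_sequence, the helper touch_at and the scratch location under /tmp are all made up for illustration, and a descriptor for the scratch directory stands in for AT_FDCWD.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create an empty regular file NAME relative to
   directory DIRFD.  */
static int touch_at(int dirfd, const char *name)
{
    int fd = openat(dirfd, name, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    return close(fd);
}

/* Replay the directory-related calls of the trace in a scratch
   directory.  Returns 0 if the final unlinkat succeeds, its errno value
   (for example ENOENT) if it fails, and -1 if the setup goes wrong.  */
int replay_unlinkat_sequence(void)
{
    char scratch[] = "/tmp/unlinkat-XXXXXX";  /* assumed scratch location */
    int result = -1;
    int foo = -1, bar = -1;

    if (mkdtemp(scratch) == NULL)
        return -1;

    /* TOP plays the role of AT_FDCWD in the trace.  */
    int top = open(scratch, O_RDONLY | O_DIRECTORY);
    if (top < 0) {
        rmdir(scratch);
        return -1;
    }

    if (mkdirat(top, "foo", 0755) == 0 && mkdirat(top, "bar", 0755) == 0) {
        foo = openat(top, "foo", O_RDONLY | O_DIRECTORY);
        if (foo >= 0)
            bar = openat(foo, "../bar", O_RDONLY | O_DIRECTORY);
    }

    if (foo >= 0 && bar >= 0
        && touch_at(foo, "foo_file") == 0
        && touch_at(bar, "bar_file") == 0
        && unlinkat(foo, "foo_file", 0) == 0
        && unlinkat(top, "foo", AT_REMOVEDIR) == 0
        && unlinkat(bar, "bar_file", 0) == 0) {
        /* FOO now refers to a removed directory; this is the call that
           returned ENOENT in the 4.0.2 trace.  */
        if (unlinkat(foo, "../bar", AT_REMOVEDIR) == 0)
            result = 0;
        else
            result = errno;
    }

    /* Best-effort cleanup: each call simply fails if the entry is
       already gone.  */
    unlinkat(top, "foo/foo_file", 0);
    unlinkat(top, "bar/bar_file", 0);
    unlinkat(top, "foo", AT_REMOVEDIR);
    unlinkat(top, "bar", AT_REMOVEDIR);
    if (foo >= 0)
        close(foo);
    if (bar >= 0)
        close(bar);
    close(top);
    rmdir(scratch);
    return result;
}
```

Called from a small main, the function would be expected to return 0 on a kernel that behaves like 4.0.4-gnu in the second trace and ENOENT on one that behaves like 4.0.2-gnu.  It has not been checked against the affected kernel, so it is a starting point for narrowing the problem down rather than a confirmed reproducer.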