* bug#74203: coreutils fails to build
@ 2024-11-04 15:37 Collin J. Doering via Bug reports for GNU Guix
  2024-11-04 15:42 ` bug#74203: [PATCH] gnu: coreutils: Disable cp/reflink-auto.sh as it can fail on btrfs Collin J. Doering via Bug reports for GNU Guix
  2024-11-14  2:50 ` bug#74203: Further investigation and workaround Collin J. Doering via Bug reports for GNU Guix
  0 siblings, 2 replies; 3+ messages in thread
From: Collin J. Doering via Bug reports for GNU Guix @ 2024-11-04 15:37 UTC (permalink / raw)
  To: 74203


Hi lovely maintainers of Guix!

Some time ago I announced the availability of a Guix build farm running out of the University of Tennessee[1]. Since then, builds have started failing because coreutils fails to build[2]; investigation showed an unexpected test failure:

--8<---------------cut here---------------start------------->8---
FAIL tests/cp/reflink-auto.sh (exit status: 1)
--8<---------------cut here---------------end--------------->8---

This does not occur on other Guix systems. After some online sleuthing, it appears that the Nix folks have seen this before[3]. They opted to disable the test 'tests/cp/reflink-auto.sh' as it can fail when using btrfs. On the impacted Guix system, disabling the coreutils tests makes the package build.

For reference, coreutils was building on cuirass.genenetwork.org on guix commit `0c908518375aea50be6dec703367c01944c8c721` and stopped building on `66611696975409a52478b95a862a464daeaefe2a`.

I suggest we follow what the Nix folks did and disable `tests/cp/reflink-auto.sh`. A patch that does so follows in a separate email. However, because it changes coreutils, it will cause many packages to be rebuilt, so I'm unsure what the best way is to correct this without waiting for core-updates to be merged.

Any advice or insight is appreciated.

[1]: https://lists.gnu.org/archive/html/guix-devel/2024-07/msg00033.html
[2]: https://cuirass.genenetwork.org/eval/157119/log/raw
[3]: https://github.com/NixOS/nixpkgs/pull/190211

-- 
Collin J. Doering

http://rekahsoft.ca
http://blog.rekahsoft.ca
http://git.rekahsoft.ca


* bug#74203: [PATCH] gnu: coreutils: Disable cp/reflink-auto.sh as it can fail on btrfs
  2024-11-04 15:37 bug#74203: coreutils fails to build Collin J. Doering via Bug reports for GNU Guix
@ 2024-11-04 15:42 ` Collin J. Doering via Bug reports for GNU Guix
  2024-11-14  2:50 ` bug#74203: Further investigation and workaround Collin J. Doering via Bug reports for GNU Guix
  1 sibling, 0 replies; 3+ messages in thread
From: Collin J. Doering via Bug reports for GNU Guix @ 2024-11-04 15:42 UTC (permalink / raw)
  To: 74203; +Cc: Collin J. Doering, Andreas Enge, Ludovic Courtès

* gnu/packages/base.scm: Similarly to
nix (https://github.com/NixOS/nixpkgs/pull/190211), disable
tests/cp/reflink-auto.sh test as it can fail on btrfs. This was discovered by
the cuirass.genenetwork.org build farm.

Change-Id: If1cc3d516c5807e580ec64ab93670e30090581a7
---
 gnu/packages/base.scm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index 4e8121ae2c..bed708fc27 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -506,6 +506,8 @@ (define-public coreutils
                                    "tests/split/fail.sh"
                                    ;; These tests error
                                    "tests/dd/nocache.sh"
+                                   ;; These tests can intermittently fail on btrfs
+                                   "tests/cp/reflink-auto.sh"
                                    ;; These tests fail
                                    "tests/cp/sparse.sh"
                                    "tests/cp/special-f.sh"

base-commit: 915f807ce61c48c34141f0300ea7623170f4148a
-- 
2.46.0


* bug#74203: Further investigation and workaround
  2024-11-04 15:37 bug#74203: coreutils fails to build Collin J. Doering via Bug reports for GNU Guix
  2024-11-04 15:42 ` bug#74203: [PATCH] gnu: coreutils: Disable cp/reflink-auto.sh as it can fail on btrfs Collin J. Doering via Bug reports for GNU Guix
@ 2024-11-14  2:50 ` Collin J. Doering via Bug reports for GNU Guix
  1 sibling, 0 replies; 3+ messages in thread
From: Collin J. Doering via Bug reports for GNU Guix @ 2024-11-14  2:50 UTC (permalink / raw)
  To: 74203


Hi again,

I wanted to follow up on my previous report and patch. I still think it's useful to consider disabling the coreutils test I previously suggested; however, I found a way to work around the issue and wanted to note it, along with some details of my investigation.

To work around the coreutils test `tests/cp/reflink-auto.sh` failing on guix commit `66611696975409a52478b95a862a464daeaefe2a`, I temporarily mounted a tmpfs to replace /tmp (which was on btrfs).

--8<---------------cut here---------------start------------->8---
mv /tmp /tmp.old
mkdir /tmp
mount -t tmpfs tmpfs /tmp
chmod 1777 /tmp
mv /tmp.old/{.*,*} /tmp/
--8<---------------cut here---------------end--------------->8---
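
A quick way to confirm which filesystem backs /tmp before and after such a remount is the file-system mode of GNU stat. The following is only a minimal sketch: the helper name `fs_type` is made up for illustration, and it assumes `stat` from GNU coreutils is available.

```shell
#!/bin/sh
# Print the filesystem type backing a path, using GNU stat's
# file-system mode (-f) and the %T format (human-readable type).
fs_type() {
    stat -f -c %T "$1"
}

# Illustrative check: report whether /tmp is backed by tmpfs.
if [ "$(fs_type /tmp)" = "tmpfs" ]; then
    echo "/tmp is on tmpfs"
else
    echo "/tmp is on $(fs_type /tmp)"
fi
```

Running it once before and once after the steps above should show the reported type change from btrfs to tmpfs on a system like the one described.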

Now, what made me do this? Well let me explain!

The failing part of `tests/cp/reflink-auto.sh` (https://github.com/coreutils/coreutils/blob/v9.1/tests/cp/reflink-auto.sh) is:

--8<---------------cut here---------------start------------->8---
# we shouldn't be able to reflink() files on separate partitions
. "$abs_srcdir/tests/other-fs-tmpdir"
a_other="$other_partition_tmpdir/a"
<..>
returns_ 1 cp --reflink "$a_other" b || fail=1
--8<---------------cut here---------------end--------------->8---

'$other_partition_tmpdir' is defined in 'tests/other-fs-tmpdir' (https://github.com/coreutils/coreutils/blob/v9.1/tests/other-fs-tmpdir). The script walks a list of candidate directories and, for each one, checks that its device ID (as given by 'stat -c %d <path>') differs from that of the current working directory and that the current user can create directories there. Once it finds such a candidate, it sets '$other_partition_tmpdir' to the temporary directory it created there. The candidate directories considered are:

--8<---------------cut here---------------start------------->8---
test "${CANDIDATE_TMP_DIRS+set}" = set \
  || CANDIDATE_TMP_DIRS="$TMPDIR /tmp /dev/shm /var/tmp /usr/tmp $HOME"
--8<---------------cut here---------------end--------------->8---
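
The selection logic described above can be sketched roughly as follows. This is not the upstream code: the function names `same_device` and `pick_other_fs_dir` are hypothetical, and the sketch only checks that a candidate is writable instead of creating a temporary directory in it as the real script does.

```shell
#!/bin/sh
# Sketch of the idea behind tests/other-fs-tmpdir, not the upstream code.

# Succeed when both paths report the same device ID from GNU stat.
same_device() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Print the first existing, writable candidate whose device ID differs
# from that of the reference directory; fail if there is none.
pick_other_fs_dir() {
    ref=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] && [ -w "$dir" ] || continue
        if ! same_device "$ref" "$dir"; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Try the same candidate order as the list above, relative to $PWD.
pick_other_fs_dir "$PWD" "${TMPDIR:-/tmp}" /tmp /dev/shm /var/tmp /usr/tmp "$HOME" \
    || echo "no candidate on a different device"
```

The whole decision rests on comparing 'stat -c %d' values, so anything that makes two different filesystems report the same device ID, or the same filesystem report different ones, changes which directory is picked.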

Looking at a remaining failed build of coreutils (left over by building with `--keep-failed`), I see that in 'top/environment-variables', 'TMPDIR' is set to '/tmp/guix-build-guix-1.4.0-26.5ab3c4c.drv-0'. This is the directory where the build takes place, so I would expect it to be on the same partition as the working directory and therefore be rejected. The same applies to /tmp, the next candidate. Next is /dev/shm. From my tests simulating the coreutils build environment with guix shell, /dev/shm meets the conditions and would be selected. If that were the case, however, I would not expect the coreutils reflink test to fail.

My suspicion is that, in some instances, using 'stat -c %d <path>' to check whether two files are on the same partition doesn't play well with btrfs subvolumes in guix-daemon sandboxed builds. When I tried to test this in a simulated coreutils guix shell build environment, I found that paths on different subvolumes, which show different device IDs (as per 'stat -c %d <path>') outside of the guix shell container, show the same ID within it. I suspect this is related to why the coreutils test fails on btrfs but passes when I use a tmpfs for /tmp. It's worth noting that on the impacted system, /gnu/store is a btrfs subvolume.
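
One way to make this comparison repeatable is to print the device ID of each path of interest and run the same script both on the host and inside the container. The sketch below is an assumption about how one might do that; `print_device_ids` is a hypothetical helper, and the path list is only an example.

```shell
#!/bin/sh
# Print "<device-id> <path>" for each existing path, skipping missing ones.
print_device_ids() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            printf '%s %s\n' "$(stat -c %d "$path")" "$path"
        fi
    done
}

# Example path list: the candidate directories plus the store.
print_device_ids /tmp /dev/shm /var/tmp /gnu/store
```

Saving the output from the host and from a container (for example one started with `guix shell --container`) and comparing the two would show directly whether paths that differ on the host collapse to a single device ID inside the container.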

I am not yet satisfied with my partial explanation, and am very curious whether anyone spots something I'm missing (e.g. has a better understanding of the Guix build environment and why the reflink coreutils test could be failing like this).

Thanks for your time and attention.

-- 
Collin J. Doering

http://rekahsoft.ca
http://blog.rekahsoft.ca
http://git.rekahsoft.ca
