unofficial mirror of meta@public-inbox.org
 help / color / mirror / Atom feed
From: Eric Wong <e@80x24.org>
To: Dominique Martinet <asmadeus@codewreck.org>
Cc: Julien Moutinho <julm+public-inbox@sourcephile.fr>,
	meta@public-inbox.org
Subject: Re: [PATCH] nodatacow: quiet chattr errors [was: Test failures with 1.7.0]
Date: Sun, 30 Jan 2022 21:49:08 +0000	[thread overview]
Message-ID: <20220130214908.M320944@dcvr> (raw)
In-Reply-To: <YbErSFq8GLerwOEN@codewreck.org>

Dominique Martinet <asmadeus@codewreck.org> wrote:
> Dominique Martinet wrote on Thu, Dec 09, 2021 at 06:14:36AM +0900:
> > I'll try giving it one, in my opinion it's more representative to test
> > with inline-c working.
> 
> So giving tests a home appear to make another test hang
> (t/lei-refresh-mail-sync.t)
> 
> I've run out of time, will provide more traces tonight
> 
> > > Yes on tests requiring stderr to be empty.  Below is a patch
> > > which should fix it; however it should only be calling chattr on
> > > btrfs mounts.
> > 
> > I'll give this a try as well.
> 
> This patch makes the tests pass as expected.
> 
> > > You can also try:
> > > 
> > >   BTRFS_TESTDIR=/path/to/your/btrfs-mount prove -bvw t/nodatacow.t
> > 
> > I'll try something similar as well.
> 
> I can confirm this one works as well after installing chattr and running
> on btrfs, so there's no problem if mounts parsing is fixed.
> 
> I'd say this hints at a problem so we're probably better off not
> silencing chattr warnings, but would also need to check if the chattr
> binary is present... Probably not worth the hassle, I don't know.

Thanks for testing the previous patch.  Actually, I prefer we
drop previous implementations and instead rely on Linux ABI
stability while shifting maintenance burden to maintainers.

Can you test the patch below?  It supercedes the other one.

Apologies for the delay, I've pretty much lost all motivation for
everything in life :<  Will try to find some energy to look into the
other issues in a bit...

Thanks.

---------8<----------
Subject: [PATCH] rewrite Linux nodatacow use in pure Perl w/o system

btrfs is Linux-only at the moment (and likely to remain that way
for practical purposes).  So rely on Linux ABI stability and use
the `syscall' and `ioctl' perlops rather than relying on Inline::C.
Inline::C (and gcc||clang) are monstrous dependencies which we
can't expect users to have.

This makes supporting new architectures more difficult, but new
architectures come along rarely and this reduces the burden for
the majority of Linux users on popular architectures (while
still avoiding the distribution of pre-built binaries).

Link: https://public-inbox.org/meta/YbCPWGaJEkV6eWfo@codewreck.org/
---
 MANIFEST                       |  1 -
 devel/syscall-list             | 10 ++++-
 lib/PublicInbox/IMAPTracker.pm |  6 +--
 lib/PublicInbox/LeiMailSync.pm |  6 +--
 lib/PublicInbox/MiscIdx.pm     |  6 +--
 lib/PublicInbox/Msgmap.pm      |  6 +--
 lib/PublicInbox/NDC_PP.pm      | 34 ---------------
 lib/PublicInbox/Over.pm        |  7 ++--
 lib/PublicInbox/SearchIdx.pm   |  7 ++--
 lib/PublicInbox/SharedKV.pm    |  6 +--
 lib/PublicInbox/Spawn.pm       | 76 +++-------------------------------
 lib/PublicInbox/Syscall.pm     | 46 +++++++++++++++++++-
 lib/PublicInbox/Xapcmd.pm      | 11 ++---
 t/nodatacow.t                  | 35 +++++++---------
 14 files changed, 101 insertions(+), 156 deletions(-)
 delete mode 100644 lib/PublicInbox/NDC_PP.pm

diff --git a/MANIFEST b/MANIFEST
index 1287182d..ca840210 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -289,7 +289,6 @@ lib/PublicInbox/MsgIter.pm
 lib/PublicInbox/MsgTime.pm
 lib/PublicInbox/Msgmap.pm
 lib/PublicInbox/MultiGit.pm
-lib/PublicInbox/NDC_PP.pm
 lib/PublicInbox/NNTP.pm
 lib/PublicInbox/NNTPD.pm
 lib/PublicInbox/NNTPdeflate.pm
diff --git a/devel/syscall-list b/devel/syscall-list
index 3d55df1f..a6b1bfa7 100755
--- a/devel/syscall-list
+++ b/devel/syscall-list
@@ -26,8 +26,10 @@ system($cc, '-o', $x, $f, @cflags) == 0 or die "cc failed \$?=$?";
 exec($x);
 __DATA__
 #define _GNU_SOURCE
-#include <unistd.h>
 #include <sys/syscall.h>
+#include <sys/ioctl.h>
+#include <linux/fs.h>
+#include <unistd.h>
 #include <stdio.h>
 
 #define D(x) printf("$" #x " = %ld;\n", (long)x)
@@ -46,6 +48,12 @@ int main(void)
 	D(SYS_inotify_add_watch);
 	D(SYS_inotify_rm_watch);
 	D(SYS_prctl);
+	D(SYS_fstatfs);
+#ifdef FS_IOC_GETFLAGS
+	printf("FS_IOC_GETFLAGS=%#lx\nFS_IOC_SETFLAGS=%#lx\n",
+		(unsigned long)FS_IOC_GETFLAGS, (unsigned long)FS_IOC_SETFLAGS);
+#endif
+
 #ifdef SYS_renameat2
 	D(SYS_renameat2);
 #endif
diff --git a/lib/PublicInbox/IMAPTracker.pm b/lib/PublicInbox/IMAPTracker.pm
index 2fd66440..4efa8a7e 100644
--- a/lib/PublicInbox/IMAPTracker.pm
+++ b/lib/PublicInbox/IMAPTracker.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2018-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 package PublicInbox::IMAPTracker;
 use strict;
@@ -75,11 +75,11 @@ sub new {
 	}
 	if (!-f $dbname) {
 		require File::Path;
-		require PublicInbox::Spawn;
+		require PublicInbox::Syscall;
 		my ($dir) = ($dbname =~ m!(.*?/)[^/]+\z!);
 		File::Path::mkpath($dir);
+		PublicInbox::Syscall::nodatacow_dir($dir);
 		open my $fh, '+>>', $dbname or die "failed to open $dbname: $!";
-		PublicInbox::Spawn::nodatacow_fd(fileno($fh));
 	}
 	my $self = bless { lock_path => "$dbname.lock", url => $url }, $class;
 	$self->lock_acquire;
diff --git a/lib/PublicInbox/LeiMailSync.pm b/lib/PublicInbox/LeiMailSync.pm
index 124eb969..182b0c22 100644
--- a/lib/PublicInbox/LeiMailSync.pm
+++ b/lib/PublicInbox/LeiMailSync.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # for maintaining synchronization between lei/store <=> Maildir|MH|IMAP|JMAP
@@ -15,9 +15,9 @@ sub dbh_new {
 	my $f = $self->{filename};
 	my $creat = $rw && !-s $f;
 	if ($creat) {
-		require PublicInbox::Spawn;
+		require PublicInbox::Syscall;
 		open my $fh, '+>>', $f or Carp::croak "open($f): $!";
-		PublicInbox::Spawn::nodatacow_fd(fileno($fh));
+		PublicInbox::Syscall::nodatacow_fh($fh);
 	}
 	my $dbh = DBI->connect("dbi:SQLite:dbname=$f",'','', {
 		AutoCommit => 1,
diff --git a/lib/PublicInbox/MiscIdx.pm b/lib/PublicInbox/MiscIdx.pm
index f5a374b2..dc15442d 100644
--- a/lib/PublicInbox/MiscIdx.pm
+++ b/lib/PublicInbox/MiscIdx.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # like PublicInbox::SearchIdx, but for searching for non-mail messages.
@@ -16,11 +16,11 @@ use v5.10.1;
 use PublicInbox::InboxWritable;
 use PublicInbox::Search; # for SWIG Xapian and Search::Xapian compat
 use PublicInbox::SearchIdx qw(index_text term_generator add_val);
-use PublicInbox::Spawn qw(nodatacow_dir);
 use Carp qw(croak);
 use File::Path ();
 use PublicInbox::MiscSearch;
 use PublicInbox::Config;
+use PublicInbox::Syscall;
 my $json;
 
 sub new {
@@ -28,7 +28,7 @@ sub new {
 	PublicInbox::SearchIdx::load_xapian_writable();
 	my $mi_dir = "$eidx->{xpfx}/misc";
 	File::Path::mkpath($mi_dir);
-	nodatacow_dir($mi_dir);
+	PublicInbox::Syscall::nodatacow_dir($mi_dir);
 	my $flags = $PublicInbox::SearchIdx::DB_CREATE_OR_OPEN;
 	$flags |= $PublicInbox::SearchIdx::DB_NO_SYNC if $eidx->{-no_fsync};
 	$json //= PublicInbox::Config::json();
diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm
index 699a8bf0..1041cd17 100644
--- a/lib/PublicInbox/Msgmap.pm
+++ b/lib/PublicInbox/Msgmap.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2015-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # bidirectional Message-ID <-> Article Number mapping for the NNTP
@@ -13,7 +13,6 @@ use v5.10.1;
 use DBI;
 use DBD::SQLite;
 use PublicInbox::Over;
-use PublicInbox::Spawn;
 use Scalar::Util qw(blessed);
 
 sub new_file {
@@ -53,7 +52,8 @@ sub tmp_clone {
 	require File::Temp;
 	my $tmp = "mm_tmp-$$-XXXX";
 	my ($fh, $fn) = File::Temp::tempfile($tmp, EXLOCK => 0, DIR => $dir);
-	PublicInbox::Spawn::nodatacow_fd(fileno($fh));
+	require PublicInbox::Syscall;
+	PublicInbox::Syscall::nodatacow_fh($fh);
 	$self->{dbh}->sqlite_backup_to_file($fn);
 	$tmp = ref($self)->new_file($fn, 2);
 	$tmp->{dbh}->do('PRAGMA journal_mode = MEMORY');
diff --git a/lib/PublicInbox/NDC_PP.pm b/lib/PublicInbox/NDC_PP.pm
deleted file mode 100644
index 57abccbe..00000000
--- a/lib/PublicInbox/NDC_PP.pm
+++ /dev/null
@@ -1,34 +0,0 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
-# License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
-
-# Pure-perl class for Linux non-Inline::C users to disable COW for btrfs
-package PublicInbox::NDC_PP;
-use strict;
-use v5.10.1;
-
-sub nodatacow_dir ($) {
-	my ($path) = @_;
-	open my $mh, '<', '/proc/self/mounts' or return;
-	for (grep(/ btrfs /, <$mh>)) {
-		my (undef, $mnt_path, $type) = split(/ /);
-		next if $type ne 'btrfs'; # in case of false-positive from grep
-
-		# weird chars are escaped as octal
-		$mnt_path =~ s/\\(0[0-9]{2})/chr(oct($1))/egs;
-		$mnt_path .= '/' unless $mnt_path =~ m!/\z!;
-		if (index($path, $mnt_path) == 0) {
-			# error goes to stderr, but non-fatal for us
-			system('chattr', '+C', $path);
-			last;
-		}
-	}
-}
-
-sub nodatacow_fd ($) {
-	my ($fd) = @_;
-	return if $^O ne 'linux';
-	defined(my $path = readlink("/proc/self/fd/$fd")) or return;
-	nodatacow_dir($path);
-}
-
-1;
diff --git a/lib/PublicInbox/Over.pm b/lib/PublicInbox/Over.pm
index 30ad949d..786f9d92 100644
--- a/lib/PublicInbox/Over.pm
+++ b/lib/PublicInbox/Over.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2018-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # for XOVER, OVER in NNTP, and feeds/homepage/threads in PSGI
@@ -18,11 +18,10 @@ sub dbh_new {
 	my $f = delete $self->{filename};
 	if (!-s $f) { # SQLite defaults mode to 0644, we want 0666
 		if ($rw) {
-			require PublicInbox::Spawn;
+			require PublicInbox::Syscall;
 			my ($dir) = ($f =~ m!(.+)/[^/]+\z!);
-			PublicInbox::Spawn::nodatacow_dir($dir);
+			PublicInbox::Syscall::nodatacow_dir($dir);
 			open my $fh, '+>>', $f or die "failed to open $f: $!";
-			PublicInbox::Spawn::nodatacow_fd(fileno($fh));
 		} else {
 			$self->{filename} = $f; # die on stat() below:
 		}
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
index 4e5d7d44..95b14c3a 100644
--- a/lib/PublicInbox/SearchIdx.pm
+++ b/lib/PublicInbox/SearchIdx.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2015-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 # based on notmuch, but with no concept of folders, files
 #
@@ -20,7 +20,7 @@ use Carp qw(croak carp);
 use POSIX qw(strftime);
 use Time::Local qw(timegm);
 use PublicInbox::OverIdx;
-use PublicInbox::Spawn qw(spawn nodatacow_dir);
+use PublicInbox::Spawn qw(spawn);
 use PublicInbox::Git qw(git_unquote);
 use PublicInbox::MsgTime qw(msg_timestamp msg_datestamp);
 use PublicInbox::Address;
@@ -139,7 +139,8 @@ sub idx_acquire {
 		if (!-d $dir && (!$is_shard ||
 				($is_shard && need_xapian($self)))) {
 			File::Path::mkpath($dir);
-			nodatacow_dir($dir);
+			require PublicInbox::Syscall;
+			PublicInbox::Syscall::nodatacow_dir($dir);
 			$self->{-set_has_threadid_once} = 1;
 		}
 	}
diff --git a/lib/PublicInbox/SharedKV.pm b/lib/PublicInbox/SharedKV.pm
index 4297efed..95a3cb14 100644
--- a/lib/PublicInbox/SharedKV.pm
+++ b/lib/PublicInbox/SharedKV.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 
 # fork()-friendly key-value store.  Will be used for making
@@ -49,9 +49,9 @@ sub new {
 	my $f = $self->{filename} = "$dir/$base.sqlite3";
 	$self->{lock_path} = $opt->{lock_path} // "$dir/$base.flock";
 	unless (-s $f) {
-		PublicInbox::Spawn::nodatacow_dir($dir); # for journal/shm/wal
+		require PublicInbox::Syscall;
+		PublicInbox::Syscall::nodatacow_dir($dir); # for journal/shm/wal
 		open my $fh, '+>>', $f or die "failed to open $f: $!";
-		PublicInbox::Spawn::nodatacow_fd(fileno($fh));
 	}
 	$self;
 }
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index e0a51c21..137b8087 100644
--- a/lib/PublicInbox/Spawn.pm
+++ b/lib/PublicInbox/Spawn.pm
@@ -1,4 +1,4 @@
-# Copyright (C) 2016-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 #
 # This allows vfork to be used for spawning subprocesses if
@@ -21,7 +21,7 @@ use Symbol qw(gensym);
 use Fcntl qw(LOCK_EX SEEK_SET);
 use IO::Handle ();
 use PublicInbox::ProcessPipe;
-our @EXPORT_OK = qw(which spawn popen_rd run_die nodatacow_dir);
+our @EXPORT_OK = qw(which spawn popen_rd run_die);
 our @RLIMITS = qw(RLIMIT_CPU RLIMIT_CORE RLIMIT_DATA);
 
 BEGIN {
@@ -268,62 +268,12 @@ void recv_cmd4(PerlIO *s, SV *buf, STRLEN n)
 #endif /* defined(CMSG_SPACE) && defined(CMSG_LEN) */
 ALL_LIBC
 
-# btrfs on Linux is copy-on-write (COW) by default.  As of Linux 5.7,
-# this still leads to fragmentation for SQLite and Xapian files where
-# random I/O happens, so we disable COW just for SQLite files and Xapian
-# directories.  Disabling COW disables checksumming, so we only do this
-# for regeneratable files, and not canonical git storage (git doesn't
-# checksum refs, only data under $GIT_DIR/objects).
-	my $set_nodatacow = $^O eq 'linux' ? <<'SET_NODATACOW' : '';
-#include <sys/ioctl.h>
-#include <sys/vfs.h>
-#include <linux/magic.h>
-#include <linux/fs.h>
-#include <dirent.h>
-
-void nodatacow_fd(int fd)
-{
-	struct statfs buf;
-	int val = 0;
-
-	if (fstatfs(fd, &buf) < 0) {
-		fprintf(stderr, "fstatfs: %s\\n", strerror(errno));
-		return;
-	}
-
-	/* only btrfs is known to have this problem, so skip for non-btrfs */
-	if (buf.f_type != BTRFS_SUPER_MAGIC)
-		return;
-
-	if (ioctl(fd, FS_IOC_GETFLAGS, &val) < 0) {
-		fprintf(stderr, "FS_IOC_GET_FLAGS: %s\\n", strerror(errno));
-		return;
-	}
-	val |= FS_NOCOW_FL;
-	if (ioctl(fd, FS_IOC_SETFLAGS, &val) < 0)
-		fprintf(stderr, "FS_IOC_SET_FLAGS: %s\\n", strerror(errno));
-}
-
-void nodatacow_dir(const char *dir)
-{
-	DIR *dh = opendir(dir);
-	int fd;
-
-	if (!dh) croak("opendir(%s): %s", dir, strerror(errno));
-	fd = dirfd(dh);
-	if (fd >= 0)
-		nodatacow_fd(fd);
-	/* ENOTSUP probably won't happen under Linux... */
-	closedir(dh);
-}
-SET_NODATACOW
-
 	my $inline_dir = $ENV{PERL_INLINE_DIRECTORY} //= (
 			$ENV{XDG_CACHE_HOME} //
 			( ($ENV{HOME} // '/nonexistent').'/.cache' )
 		).'/public-inbox/inline-c';
 	warn "$inline_dir exists, not writable\n" if -e $inline_dir && !-w _;
-	$set_nodatacow = $all_libc = undef unless -d _ && -w _;
+	$all_libc = undef unless -d _ && -w _;
 	if (defined $all_libc) {
 		my $f = "$inline_dir/.public-inbox.lock";
 		open my $oldout, '>&', \*STDOUT or die "dup(1): $!";
@@ -337,17 +287,10 @@ SET_NODATACOW
 		# CentOS 7.x ships Inline 0.53, 0.64+ has built-in locking
 		flock($fh, LOCK_EX) or die "LOCK_EX($f): $!";
 		eval <<'EOM';
-use Inline C => $all_libc.$set_nodatacow, BUILD_NOISY => 1;
+use Inline C => $all_libc, BUILD_NOISY => 1;
 EOM
 		my $err = $@;
 		my $ndc_err = '';
-		if ($err && $set_nodatacow) { # missing Linux kernel headers
-			$ndc_err = "with set_nodatacow: <\n$err\n>\n";
-			undef $set_nodatacow;
-			eval <<'EOM';
-use Inline C => $all_libc, BUILD_NOISY => 1;
-EOM
-		};
 		$err = $@;
 		open(STDERR, '>&', $olderr) or warn "restore stderr: $!";
 		open(STDOUT, '>&', $oldout) or warn "restore stdout: $!";
@@ -356,22 +299,13 @@ EOM
 			my @msg = <$fh>;
 			warn "Inline::C build failed:\n",
 				$ndc_err, $err, "\n", @msg;
-			$set_nodatacow = $all_libc = undef;
-		} elsif ($ndc_err) {
-			warn "Inline::C build succeeded w/o set_nodatacow\n",
-				"error $ndc_err";
+			$all_libc = undef;
 		}
 	}
 	unless ($all_libc) {
 		require PublicInbox::SpawnPP;
 		*pi_fork_exec = \&PublicInbox::SpawnPP::pi_fork_exec
 	}
-	unless ($set_nodatacow) {
-		require PublicInbox::NDC_PP;
-		no warnings 'once';
-		*nodatacow_fd = \&PublicInbox::NDC_PP::nodatacow_fd;
-		*nodatacow_dir = \&PublicInbox::NDC_PP::nodatacow_dir;
-	}
 } # /BEGIN
 
 sub which ($) {
diff --git a/lib/PublicInbox/Syscall.pm b/lib/PublicInbox/Syscall.pm
index c00385b9..bcfae2cb 100644
--- a/lib/PublicInbox/Syscall.pm
+++ b/lib/PublicInbox/Syscall.pm
@@ -5,7 +5,7 @@
 # This license differs from the rest of public-inbox
 #
 # This module is Copyright (c) 2005 Six Apart, Ltd.
-# Copyright (C) 2019-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 #
 # All rights reserved.
 #
@@ -68,6 +68,7 @@ our (
      $SYS_renameat2,
      );
 
+my $SYS_fstatfs; # don't need fstatfs64, just statfs.f_type
 my $SFD_CLOEXEC = 02000000; # Perl does not expose O_CLOEXEC
 our $no_deprecated = 0;
 
@@ -96,18 +97,21 @@ if ($^O eq "linux") {
         $SYS_epoll_wait   = 256;
         $SYS_signalfd4 = 327;
         $SYS_renameat2 //= 353;
+	$SYS_fstatfs = 100;
     } elsif ($machine eq "x86_64") {
         $SYS_epoll_create = 213;
         $SYS_epoll_ctl    = 233;
         $SYS_epoll_wait   = 232;
         $SYS_signalfd4 = 289;
 	$SYS_renameat2 //= 316;
+	$SYS_fstatfs = 138;
     } elsif ($machine eq 'x32') {
         $SYS_epoll_create = 1073742037;
         $SYS_epoll_ctl = 1073742057;
         $SYS_epoll_wait = 1073742056;
         $SYS_signalfd4 = 1073742113;
 	$SYS_renameat2 //= 0x40000000 + 316;
+	$SYS_fstatfs = 138;
     } elsif ($machine eq 'sparc64') {
 	$SYS_epoll_create = 193;
 	$SYS_epoll_ctl = 194;
@@ -116,6 +120,7 @@ if ($^O eq "linux") {
 	$SYS_signalfd4 = 317;
 	$SYS_renameat2 //= 345;
 	$SFD_CLOEXEC = 020000000;
+	$SYS_fstatfs = 158;
     } elsif ($machine =~ m/^parisc/) {
         $SYS_epoll_create = 224;
         $SYS_epoll_ctl    = 225;
@@ -129,6 +134,7 @@ if ($^O eq "linux") {
         $u64_mod_8        = 1;
         $SYS_signalfd4 = 313;
 	$SYS_renameat2 //= 357;
+	$SYS_fstatfs = 100;
     } elsif ($machine eq "ppc") {
         $SYS_epoll_create = 236;
         $SYS_epoll_ctl    = 237;
@@ -136,6 +142,7 @@ if ($^O eq "linux") {
         $u64_mod_8        = 1;
         $SYS_signalfd4 = 313;
 	$SYS_renameat2 //= 357;
+	$SYS_fstatfs = 100;
     } elsif ($machine =~ m/^s390/) {
         $SYS_epoll_create = 249;
         $SYS_epoll_ctl    = 250;
@@ -143,6 +150,7 @@ if ($^O eq "linux") {
         $u64_mod_8        = 1;
         $SYS_signalfd4 = 322;
 	$SYS_renameat2 //= 347;
+	$SYS_fstatfs = 100;
     } elsif ($machine eq "ia64") {
         $SYS_epoll_create = 1243;
         $SYS_epoll_ctl    = 1244;
@@ -165,6 +173,7 @@ if ($^O eq "linux") {
         $no_deprecated    = 1;
         $SYS_signalfd4 = 74;
 	$SYS_renameat2 //= 276;
+	$SYS_fstatfs = 44;
     } elsif ($machine =~ m/arm(v\d+)?.*l/) {
         # ARM OABI
         $SYS_epoll_create = 250;
@@ -173,6 +182,7 @@ if ($^O eq "linux") {
         $u64_mod_8        = 1;
         $SYS_signalfd4 = 355;
 	$SYS_renameat2 //= 382;
+	$SYS_fstatfs = 100;
     } elsif ($machine =~ m/^mips64/) {
         $SYS_epoll_create = 5207;
         $SYS_epoll_ctl    = 5208;
@@ -180,6 +190,7 @@ if ($^O eq "linux") {
         $u64_mod_8        = 1;
         $SYS_signalfd4 = 5283;
 	$SYS_renameat2 //= 5311;
+	$SYS_fstatfs = 5135;
     } elsif ($machine =~ m/^mips/) {
         $SYS_epoll_create = 4248;
         $SYS_epoll_ctl    = 4249;
@@ -187,6 +198,7 @@ if ($^O eq "linux") {
         $u64_mod_8        = 1;
         $SYS_signalfd4 = 4324;
 	$SYS_renameat2 //= 4351;
+	$SYS_fstatfs = 4100;
     } else {
         # as a last resort, try using the *.ph files which may not
         # exist or may be wrong
@@ -323,6 +335,38 @@ sub rename_noreplace ($$) {
 	}
 }
 
+sub nodatacow_fh {
+	return if !defined($SYS_fstatfs);
+	my $buf = '';
+	vec($buf, 120 * 8 - 1, 1) = 0;
+	my ($fh) = @_;
+	syscall($SYS_fstatfs, fileno($fh), $buf) == 0 or
+		return warn("fstatfs: $!\n");
+	my $f_type = unpack('l!', $buf); # statfs.f_type is a signed word
+	return if $f_type != 0x9123683E; # BTRFS_SUPER_MAGIC
+
+	state ($FS_IOC_GETFLAGS, $FS_IOC_SETFLAGS);
+	unless (defined $FS_IOC_GETFLAGS) {
+		if (substr($Config{byteorder}, 0, 4) eq '1234') {
+			$FS_IOC_GETFLAGS = 0x80086601;
+			$FS_IOC_SETFLAGS = 0x40086602;
+		} else { # Big endian
+			$FS_IOC_GETFLAGS = 0x40086601;
+			$FS_IOC_SETFLAGS = 0x80086602;
+		}
+	}
+	ioctl($fh, $FS_IOC_GETFLAGS, $buf) //
+		return warn("FS_IOC_GET_FLAGS: $!\n");
+	my $attr = unpack('l!', $buf);
+	return if ($attr & 0x00800000); # FS_NOCOW_FL;
+	ioctl($fh, $FS_IOC_SETFLAGS, pack('l', $attr | 0x00800000)) //
+		warn("FS_IOC_SET_FLAGS: $!\n");
+}
+
+sub nodatacow_dir {
+	if (open my $fh, '<', $_[0]) { nodatacow_fh($fh) }
+}
+
 1;
 
 =head1 WARRANTY
diff --git a/lib/PublicInbox/Xapcmd.pm b/lib/PublicInbox/Xapcmd.pm
index 44e0f8e5..10685636 100644
--- a/lib/PublicInbox/Xapcmd.pm
+++ b/lib/PublicInbox/Xapcmd.pm
@@ -1,8 +1,9 @@
-# Copyright (C) 2018-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 package PublicInbox::Xapcmd;
 use strict;
-use PublicInbox::Spawn qw(which popen_rd nodatacow_dir);
+use PublicInbox::Spawn qw(which popen_rd);
+use PublicInbox::Syscall;
 use PublicInbox::Admin qw(setup_signals);
 use PublicInbox::Over;
 use PublicInbox::SearchIdx;
@@ -211,7 +212,7 @@ sub prepare_run {
 		my $v = PublicInbox::Search::SCHEMA_VERSION();
 		my $wip = File::Temp->newdir("xapian$v-XXXX", DIR => $dir);
 		$tmp->{$old} = $wip;
-		nodatacow_dir($wip->dirname);
+		PublicInbox::Syscall::nodatacow_dir($wip->dirname);
 		push @queue, [ $old, $wip ];
 	} elsif (defined $old) {
 		opendir my $dh, $old or die "Failed to opendir $old: $!\n";
@@ -242,7 +243,7 @@ sub prepare_run {
 			same_fs_or_die($old, $wip->dirname);
 			my $cur = "$old/$dn";
 			push @queue, [ $src // $cur , $wip ];
-			nodatacow_dir($wip->dirname);
+			PublicInbox::Syscall::nodatacow_dir($wip->dirname);
 			$tmp->{$cur} = $wip;
 		}
 		# mark old shards to be unlinked
@@ -443,7 +444,7 @@ sub cpdb ($$) { # cb_spawn callback
 		$ft = File::Temp->newdir("$new.compact-XXXX", DIR => $dir);
 		setup_signals();
 		$tmp = $ft->dirname;
-		nodatacow_dir($tmp);
+		PublicInbox::Syscall::nodatacow_dir($tmp);
 	} else {
 		$tmp = $new;
 	}
diff --git a/t/nodatacow.t b/t/nodatacow.t
index 19247c10..83aa227f 100644
--- a/t/nodatacow.t
+++ b/t/nodatacow.t
@@ -1,48 +1,41 @@
 #!perl -w
-# Copyright (C) 2020-2021 all contributors <meta@public-inbox.org>
+# Copyright (C) all contributors <meta@public-inbox.org>
 # License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
 use strict; use v5.10.1; use PublicInbox::TestCommon;
 use File::Temp 0.19;
-use_ok 'PublicInbox::NDC_PP';
+use_ok 'PublicInbox::Syscall';
+
+# btrfs on Linux is copy-on-write (COW) by default.  As of Linux 5.7,
+# this still leads to fragmentation for SQLite and Xapian files where
+# random I/O happens, so we disable COW just for SQLite files and Xapian
+# directories.  Disabling COW disables checksumming, so we only do this
+# for regeneratable files, and not canonical git storage (git doesn't
+# checksum refs, only data under $GIT_DIR/objects).
 
 SKIP: {
 	my $nr = 2;
 	skip 'test is Linux-only', $nr if $^O ne 'linux';
 	my $dir = $ENV{BTRFS_TESTDIR};
 	skip 'BTRFS_TESTDIR not defined', $nr unless defined $dir;
-	require_cmd('chattr', 1) or skip 'chattr(1) not installed', $nr;
+
 	my $lsattr = require_cmd('lsattr', 1) or
 		skip 'lsattr(1) not installed', $nr;
+
 	my $tmp = File::Temp->newdir('nodatacow-XXXX', DIR => $dir);
 	my $dn = $tmp->dirname;
 
 	my $name = "$dn/pp.f";
 	open my $fh, '>', $name or BAIL_OUT "open($name): $!";
-	my $pp_sub = \&PublicInbox::NDC_PP::nodatacow_fd;
-	$pp_sub->(fileno($fh));
+	PublicInbox::Syscall::nodatacow_fh($fh);
 	my $res = xqx([$lsattr, $name]);
 	like($res, qr/C.*\Q$name\E/, "`C' attribute set on fd with pure Perl");
 
+
 	$name = "$dn/pp.d";
 	mkdir($name) or BAIL_OUT "mkdir($name) $!";
-	PublicInbox::NDC_PP::nodatacow_dir($name);
+	PublicInbox::Syscall::nodatacow_dir($name);
 	$res = xqx([$lsattr, '-d', $name]);
 	like($res, qr/C.*\Q$name\E/, "`C' attribute set on dir with pure Perl");
-
-	$name = "$dn/ic.f";
-	my $ic_sub = \&PublicInbox::Spawn::nodatacow_fd;
-	$pp_sub == $ic_sub and
-		skip 'Inline::C or Linux kernel headers missing', 2;
-	open $fh, '>', $name or BAIL_OUT "open($name): $!";
-	$ic_sub->(fileno($fh));
-	$res = xqx([$lsattr, $name]);
-	like($res, qr/C.*\Q$name\E/, "`C' attribute set on fd with Inline::C");
-
-	$name = "$dn/ic.d";
-	mkdir($name) or BAIL_OUT "mkdir($name) $!";
-	PublicInbox::Spawn::nodatacow_dir($name);
-	$res = xqx([$lsattr, '-d', $name]);
-	like($res, qr/C.*\Q$name\E/, "`C' attribute set on dir with Inline::C");
 };
 
 done_testing;

  reply	other threads:[~2022-01-30 21:49 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-08  1:07 Test failures with 1.7.0 Julien Moutinho
2021-12-08  4:08 ` Eric Wong
2021-12-08 10:56   ` Dominique Martinet
2021-12-08 18:22     ` [PATCH] nodatacow: quiet chattr errors [was: Test failures with 1.7.0] Eric Wong
2021-12-08 21:14       ` Dominique Martinet
2021-12-08 22:01         ` Dominique Martinet
2022-01-30 21:49           ` Eric Wong [this message]
2022-01-30 23:18             ` Dominique Martinet
2022-01-31  2:03               ` Eric Wong
2022-01-31  3:34                 ` Dominique Martinet
2022-02-01  1:27                   ` Eric Wong
2021-12-09  1:37     ` Test failures with 1.7.0 Julien Moutinho
2021-12-09  2:53       ` Dominique Martinet
2022-02-01  9:37         ` Eric Wong
2022-02-01 23:27       ` FD_CLOEXEC w/ nix-shell [was: Test failures with 1.7.0] Eric Wong
2022-02-02  0:23         ` Dominique Martinet
2022-02-02  2:11           ` Dominique Martinet
2022-02-01 23:34       ` [PATCH] test_lei: use consistent locale for error messages Eric Wong
2022-02-17 21:02       ` [PATCH] t/lei-sigpipe: attempt to improve diagnostics for stuck test Eric Wong
2022-02-20  1:38         ` Julien Moutinho
2022-02-22  6:44           ` Eric Wong
2022-02-27  4:15             ` Julien Moutinho
2022-02-27  6:41               ` Julien Moutinho
2022-02-27  7:23                 ` Dominique Martinet
2022-02-27  8:04                   ` Julien Moutinho
2022-02-27 11:17                     ` [PATCH] t/lei-sigpipe: ensure SIGPIPE is unblocked for this test Eric Wong
2022-03-11 10:42                       ` [PATCH] t/lei-sigpipe.t: ensure SIGPIPE is not ignored instead of not blocked Julien Moutinho
2022-03-14 22:14                         ` Eric Wong
2022-03-15  2:56                           ` Julien Moutinho
2022-03-01  2:30   ` Test failures with 1.7.0 Julien Moutinho
2022-03-01  4:05     ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://public-inbox.org/README

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220130214908.M320944@dcvr \
    --to=e@80x24.org \
    --cc=asmadeus@codewreck.org \
    --cc=julm+public-inbox@sourcephile.fr \
    --cc=meta@public-inbox.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).