Paul Eggert writes:

> On 2/12/20 2:37 PM, David Koppelman wrote:
>> Except I don't get the
>> efficient kernel-space-to-kernel-space transfer that copy_file_range
>> uses.)
>
> It's more than just kernel-space-to-kernel-space copying. When copying
> a file within an NFS server, you don't need to ship its contents over
> the network; the server can do the copy. Also, many modern filesystems
> can copy files by fiddling with pointers rather than data and thus can
> copy much faster than read+write would do, even on local filesystems.
> So avoiding copy_file_range entirely would mean a big performance loss
> on big files.
>> I do not experience the problem on the version of Emacs packaged with
>> rhel 8, "GNU Emacs 26.1 (build 1, x86_64-redhat-linux-gnu, GTK+
>> Version 3.22.30) of 2018-09-10".
>
> Emacs 26.1 doesn't use copy_file_range, which explains why it doesn't
> encounter your problem. Emacs 27 is planned to use it, though, so we
> should see how to best fix the problem.
>
> As you say, it's a serious bug in your filesystem. It strikes me that
> it is likely to affect programs other than Emacs, so it should be high
> priority to fix regardless of what we do in Emacs.
>
> Some questions: What is the NFS fileserver (NetApp, etc.)? What's the
> blocksize on the remote file system? Does copy_file_range work
> correctly when the size is a multiple of 32*1024? If so, perhaps we
> could tweak Emacs to use copy_file_range for most of the file, and use
> read+write only for the trailing <32 KiB.
>
>> When I have time I'll try to reproduce the problem with a quick C++
>> routine using copy_file_range.
>
> To save you some time, attached is a quick C routine that attempts to
> reproduce the problem. Does it reproduce the problem for you? If so,
> you can use it in your bug report to Red Hat.
>
> Also, can you strace the failing Emacs? Something like this:
>
>   strace -o trace.log emacs -Q -batch -eval '(copy-file "a" "b" t t)'
>
> and then look at the relevant part of trace.log.
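
For reference, the tweak floated above (copy_file_range for the part of
the file that is a multiple of 32*1024, read+write for the trailing
<32 KiB) might look roughly like the sketch below. This is neither the
attached routine nor anything from the Emacs sources. The names
copy_hybrid, copy_read_write and CHUNK are made up for illustration, and
the sketch assumes Linux with glibc 2.27 or later and file descriptors
already positioned at the start of each file.

```c
#define _GNU_SOURCE          /* exposes copy_file_range in <unistd.h> */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

enum { CHUNK = 32 * 1024 };  /* granularity taken from the question above */

/* Copy exactly LEN bytes from IN_FD to OUT_FD with plain read+write.
   Return 0 on success, -1 on error or unexpected end of file.  */
static int copy_read_write(int in_fd, int out_fd, off_t len)
{
    char buf[CHUNK];
    while (len > 0) {
        size_t want = len < (off_t) sizeof buf ? (size_t) len : sizeof buf;
        ssize_t n = read(in_fd, buf, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;       /* input is shorter than the caller claimed */
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = write(out_fd, buf + done, (size_t) (n - done));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
        len -= n;
    }
    return 0;
}

/* Hypothetical helper: copy SIZE bytes, using copy_file_range only for
   the leading multiple of CHUNK and read+write for whatever is left.
   If copy_file_range fails or stops early, read+write copies the rest.  */
int copy_hybrid(int in_fd, int out_fd, off_t size)
{
    assert(size >= 0);
    off_t bulk = size - size % CHUNK;   /* largest multiple of CHUNK <= size */
    off_t copied = 0;
    while (copied < bulk) {
        /* NULL offsets: use and advance each descriptor's own file offset. */
        ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL,
                                    (size_t) (bulk - copied), 0);
        if (n <= 0)
            break;                      /* fall back to read+write */
        copied += n;
    }
    return copy_read_write(in_fd, out_fd, size - copied);
}
```

Whether this actually sidesteps the failure depends on the answer to the
32*1024 question, so it is only worth trying once the strace output
confirms where copy_file_range goes wrong.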