Hi Ludo,

On Thu, 01 Oct 2020 09:14:10 +0200
Ludovic Courtès wrote:

> > This problem has nothing to do with emulation.
>
> Now I’m lost; I thought this had to do with qemu-user.

I had thought so, too, a few weeks ago.  But that's not the case.  It's
not at all related to qemu.

The problem is fundamental: a 64 bit value does NOT fit into a 32 bit
slot.  glibc uses getdents64 to get 64 bit dents and then acts all
surprised and errory when it gets 64 bit dents (also on 32 bit glibc).

The same happens natively when using armhf on aarch64, without
qemu-user.

That's what the table I sent was all about.

Calling getdents64 is not the problem--glibc has to do that, otherwise a
32 bit glibc won't work RELIABLY on a 64 bit kernel anyway.

And it emits an error because we do not enable large file support.

Here is the table (updated after I finally compiled a guix
gcc-toolchain-10 with the (unpatched) glibc in question on armhf):

system    _FILE_OFFSET_BITS  off_t   d_off-sizeof  d_off-values
---------------------------------------------------------------
x86_64    -                  8 Byte  8 Byte        8 Byte
i686      -                  4 Byte  4 Byte        4 Byte
i686      64                 8 Byte  8 Byte        FAIL*
i686      32                 4 Byte  4 Byte        FAIL*
i686      7                  4 Byte  4 Byte        4 Byte
armhf     -                  4 Byte  4 Byte        FAIL*
armhf     64                 8 Byte  8 Byte        8 Byte
armhf     32                 4 Byte  4 Byte        FAIL*
armhf     7                  4 Byte  4 Byte        FAIL*
a64armhf  -                  4 Byte  4 Byte        FAIL*
a64armhf  64                 8 Byte  8 Byte        8 Byte
a64armhf  32                 4 Byte  4 Byte        FAIL*
a64armhf  7                  4 Byte  4 Byte        FAIL*
aarch64   -                  8 Byte  8 Byte        8 Byte

*: Using FUSE filesystem with big d_off value.

None of those tests were done with qemu.  They were all native.  That's
why I wanted access to real aarch64 machines--otherwise I could have
done it with qemu on my x86_64 computer :P

> I’m very reluctant to patching public libc headers.

Well, I don't like it either--that's why the patch is very, very careful.
My patch doesn't change anything that users experience at runtime and
basically just prevents developers from compiling something that is
using readdir without thinking about large files first (because they
HAVE TO if their programs run on a host kernel that has bigger
d_off--there's no sane way around it).

If they absolutely want to, they can set _FILE_OFFSET_BITS=32 and it
will let them do it.  (The argument by Marius is that people might want
to do that on embedded.  But that means readdir will sometimes fail for
them, depending on their setup, also on 32 bit kernels.  Embedded is not
specially exempt from this bug ;) )

I think that this patch is guix-specific in the sense that it happens
pretty often that we do "-s i686-linux" on x86_64, "-s armhf-linux" on
aarch64 and expect that to work.  And there's no qemu we could even
patch in those cases, because no qemu is used in the first place.

> Also, it’s not just
> “our” problem, we should definitely discuss it with upstream and perhaps
> propose your dirent.h patch.

Sure.

I think 15 years of migration path to 64 bit off_t was more than enough.
Now, I'd prefer if glibc made people choose _FILE_OFFSET_BITS explicitly
on 32 bit.  Everything else is a hack that WILL break unexpectedly.
Users still can choose _FILE_OFFSET_BITS=32, if they want.

> I’m also not sure what you mean by “using it wrong”, what is “it”?

"it" is users calling readdir() without defining _FILE_OFFSET_BITS=64 in
their source file / Makefile.  This causes glibc to call getdents64 and
then act all surprised when it gets a 64 bit result back.

> > Also, this won't work on armhf or any other 32 bit architecture--so there,
> > we would be both philosophically and practically wrong.
> >
> > Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.
> > It totally DOES wind up telling us the truth sometimes (see my earlier test
> > table)--and then we have a problem.
>
> Hmm I guess I need to re-read all that, I’m overwhelmed.

Yeah--it's understandable.  I've been working on understanding and
fixing this problem for a hundred hours now--it took forever for me to
get to the bottom of this, too.

And in the beginning I, too, suspected qemu.  But it's totally
blameless.  Nothing should be changed in qemu-user or in our qemu binfmt
service.

The fundamental problem is that POSIX specifies that telldir and seekdir
must exist, and return and take a LONG, respectively.

That means that glibc has to preserve the d_off it got from getdents64
(size is 64 bits), otherwise how would seekdir work?

But the offset parameter of seekdir is standardized as LONG, which means
that it won't work in the first place on 32 bit when there is either a
64 bit kernel or a filesystem that just happens to store bigger stuff.

So glibc chose to check, on each call, whether the d_off from getdents64
just happens to fit into the LONG this time.  I argue that that is
insane.  It would be better to always fail, or never fail--not to fail
only on the first d_off that is >= 2**31 (long is signed).  When that
happens is a filesystem implementation detail :P

I think the assumption was that the kernel would store an actual offset
into d_off.  But it doesn't--it stores a hash in the case of ext4 (and
probably in other cases).

And in any case, even if it was an offset, that is still an unsafe way
to fix the problem.

First, someone needs to fix the POSIX standard to say "off_t", not
"long".

Then, distributions that want to use a 32 bit userland on a 64 bit
kernel need to enable large files globally.  That is a choice a
distribution has to make.

Not making a choice is a choice too--the behavior will be random, and if
my research in wip-file-offset-bits-64 is any indication then very
fundamental things will be broken at unexpected places, and usually it
DOES NOT result in a build failure (without my glibc patch).

That basically means that using 32 bit native on a 64 bit kernel cannot
be supported in Guix if no choice is made.
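
To make the "does d_off happen to fit this time" behaviour concrete,
here is a minimal sketch in C.  It is NOT glibc's code: narrow_d_off is
a made-up helper, int32_t stands in for a 32 bit long, and EOVERFLOW is
my assumption for the error that gets reported.  It only shows that
success depends on whatever value the filesystem happened to hand out:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of glibc): try to store a 64 bit d_off,
   as returned by getdents64, in a 32 bit signed slot, which is what
   "long" is on i686 and armhf.
   Returns 0 on success.  Returns -1 and sets errno to EOVERFLOW
   (assumed error code) if the value does not fit; *slot is then left
   untouched.  */
int narrow_d_off(int64_t d_off64, int32_t *slot)
{
    assert(slot != NULL);

    if (d_off64 < INT32_MIN || d_off64 > INT32_MAX) {
        /* The value cannot be represented, so a later seekdir()
           could never get back to this directory entry.  */
        errno = EOVERFLOW;
        return -1;
    }

    *slot = (int32_t)d_off64;
    return 0;
}
```

With this, narrow_d_off(12345, &slot) succeeds, while a hash-like value
such as 0x7fffffffffffffff fails with EOVERFLOW.  Which of the two a
program sees is decided by the filesystem, not by the program.
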
If the choice is "yes", we need a way to find the packages that use
readdir wrongly without breaking the build.  What would an alternative
way of doing that look like?
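
For reference, the kind of compile-time check I mean can be sketched in
plain C as below.  This is NOT the actual dirent.h patch, only an
illustration of the idea: the guard condition, the error text and the
helper names (off_t_size, d_off_size, check_large_file_layout) are all
made up, and it assumes a Linux libc whose struct dirent has a d_off
member.  The two size helpers report the same things as the "off_t" and
"d_off-sizeof" columns of the table:

```c
/* Must come before any libc header.  On glibc this selects the 64 bit
   off_t and the 64 bit struct dirent on 32 bit targets.  */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical guard, not the real patch: where long is narrower than
   64 bits, refuse to compile until an explicit choice (32 or 64) has
   been made.  Removing the #define above on a 32 bit target would
   trigger it.  */
#if LONG_MAX < 0x7fffffffffffffffLL && !defined(_FILE_OFFSET_BITS)
# error "readdir user: define _FILE_OFFSET_BITS=64 (or 32, knowingly)"
#endif

/* Size of off_t in bytes under the current settings.  */
size_t off_t_size(void)
{
    return sizeof(off_t);
}

/* Size of the d_off member of struct dirent in bytes.  */
size_t d_off_size(void)
{
    return sizeof(((struct dirent *)0)->d_off);
}

/* Runtime sanity check for the 64 bit choice made above.  */
void check_large_file_layout(void)
{
    assert(off_t_size() == 8);
    assert(d_off_size() == 8);
}
```

A package that builds with such a guard in place has at least made the
choice explicitly; it does not find the packages for us, it only stops
the silent ones from compiling.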