From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp1 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id OD5hBSDHbV+YFAAA0tVLHw (envelope-from ) for ; Fri, 25 Sep 2020 10:32:00 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp1 with LMTPS id 6IoiASDHbV9MRgAAbx9fmQ (envelope-from ) for ; Fri, 25 Sep 2020 10:32:00 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 50EA39407C8 for ; Fri, 25 Sep 2020 10:31:59 +0000 (UTC) Received: from localhost ([::1]:51846 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kLl1C-0000Y0-0x for larch@yhetil.org; Fri, 25 Sep 2020 06:31:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43690) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kLkrc-0007ep-Hx for guix-patches@gnu.org; Fri, 25 Sep 2020 06:22:06 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:59454) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kLkra-0007yN-Qy for guix-patches@gnu.org; Fri, 25 Sep 2020 06:22:03 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1kLkra-0003wd-MJ for guix-patches@gnu.org; Fri, 25 Sep 2020 06:22:02 -0400 X-Loop: help-debbugs@gnu.org Subject: [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir. Resent-From: Danny Milosavljevic Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Fri, 25 Sep 2020 10:22:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 43591 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: Marius Bakke Cc: 43591@debbugs.gnu.org Received: via spool by 43591-submit@debbugs.gnu.org id=B43591.160102926215062 (code B ref 43591); Fri, 25 Sep 2020 10:22:02 +0000 Received: (at 43591) by debbugs.gnu.org; 25 Sep 2020 10:21:02 +0000 Received: from localhost ([127.0.0.1]:42763 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kLkqb-0003uS-Vr for submit@debbugs.gnu.org; Fri, 25 Sep 2020 06:21:02 -0400 Received: from dd26836.kasserver.com ([85.13.145.193]:36630) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kLkqZ-0003tz-01 for 43591@debbugs.gnu.org; Fri, 25 Sep 2020 06:21:00 -0400 Received: from localhost (80-110-126-103.cgn.dynamic.surfer.at [80.110.126.103]) by dd26836.kasserver.com (Postfix) with ESMTPSA id 4954F33682CD; Fri, 25 Sep 2020 12:20:57 +0200 (CEST) Date: Fri, 25 Sep 2020 12:20:04 +0200 From: Danny Milosavljevic Message-ID: <20200925122004.38275411@scratchpost.org> In-Reply-To: <87tuvm4vop.fsf@gnu.org> References: <20200924141211.21649-1-dannym@scratchpost.org> <87363759at.fsf@gnu.org> <20200924222711.2f22281a@scratchpost.org> <87tuvm4vop.fsf@gnu.org> X-Mailer: Claws Mail 3.17.5 (GTK+ 2.24.32; x86_64-unknown-linux-gnu) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="Sig_/.UThJ_OqIlDjljxsYSUzUuS"; protocol="application/pgp-signature"; micalg=pgp-sha512 X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-Spam-Score: -1.7 (-) X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Scanner: scn0 Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=none; spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Spam-Score: -1.11 X-TUID: x5ks33Tscacn --Sig_/.UThJ_OqIlDjljxsYSUzUuS Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi Marius, On Fri, 25 Sep 2020 01:11:18 +0200 Marius Bakke wrote: > Arguably running code for foreign architectures through QEMU binfmt is > something of a hack. The problem here is not qemu. It's not even the size of the whole struct dirent that's the problem, or endianness, or alignment, or any of that stuf= f. The problem is that if you have a 64-bit hash value then you can't fit it i= nto a 32-bit field. You have to make the field bigger. It's the only non-insane way to fix this and still be able to use the host filesystem from the guest. This bug is making readdir randomly and silently succeed or fail. Note: The bug I'm referring to is caused by having 32 bit offsets in user s= pace in the first place--nothing else. That is the root of the problem, not qem= u, not ext4, not the kernel. Having 32 bit offsets in user space when kernel = space uses 64 bit offsets is just weird (note: the kernel always uses 64 bit offsets--also on 32 bit architectures!). Note: This also affects 9p, which has been recently "fixed" in the Linux ke= rnel. >Why is this not an issue with i686 on x86_64 kernels? I'm not sure. I'll check. >Or armhf on AArch64? I cannot check that, but it totally should be a problem there. It has nothing to do with qemu specifically, and IMO qemu is right in not "fixing" it (because it's NOT a problem in qemu), and in expecting 32 bit guests not to use the getdents64 system call :P But even using getdents (no 64) as a flag what size d_off the caller expects is an ugly workaround. Not because of the flag, but because of what you have to do if it's set. The right fix is to switch to LFS (and thus also 64 bit d_off) on 32 bit gu= ests. It's 2020; normal systems have drives a lot bigger than 4 GiB. Even Raspbe= rrys have bigger drives. Also, just for the record, glibc always calls getdents64, never getdents. Using getdents64 and then being surprised when a 64 bit result comes back is seriously weird. And they are still defending it! If they didn't do that, at least the kernel would know something is up, and qemu would still pass through getdents as getdents to the host kernel a= nd getdents64 as getdents64 to the host kernel. Also, qemu could warn on calls to getdents (and not on calls to getdents64). There are no downsides to knowing what the user expects. But even then, user space probably would have LFS disabled. That means as soon as user space mounts something from NFS or 9p or whatever on their (let's say embedded) platform, they would have this problem again, and it would be very difficult to find what happened. Why keep this problem alive? >This problem only manifests when using QEMUs syscall emulation, right? No. It's a fundamental problem with the slot you want to store something in being too small for it. We just didn't see it yet, because usually the system *silently returns* ha= lfway through the readdir loop and does not tell us about any problems once it encounters any d_off >=3D 2**32 along the way. In the "json-c" bug, luckily json-c or cmake actually NEEDED a file that had not been returned because of that. That is not a given. >Mandating that every package *must* be patched to support it seems user-ho= stile. Ok, then how about a flag that users can set in order to let it compile regardless? But then they shouldn't use that flag in any (non-cross-compil= ed) guix packages. We can mention it in the news so people are not surprised. (Thinking about it, the flag could just be "-D_FILE_OFFSET_BITS=3D32". Bec= ause when you are setting it explicitly, you probably know what you are doing) It really depends on whether we officially support running 32 bit guests on 64 bit hosts. Qemu has nothing to do with it. Since the majority of our build farm runs qemu transparent emulation, and since a majority of our ARM packages are built using it, it seems that we decided to support 32 bit guests on 64 bit hosts. Well, then something like this patch has to be done, otherwise we have Schr=C3=B6dinger's build syste= m--and no one likes that. (I want to stress that the main problem is the case where readdir SUCCEEDS even though there is a problem; if readdir would always fail in the situation with a 32 bit guest on 64 bit host, that would be much less bad. But it doesn't always fail!) I'd be really uneasy if the sanity check this patch introduces were only enabled when explicitly setting a flag--for something as bad as this. Especially for an error that silently makes it into base derivations and th= en distributes all the way to who-knows-what normal client derivations. Keeping the user from shooting himself in the foot by default if he doesn't say "yes, please shoot me in the foot" beforehand is not "user-hostile". We could also do a deprecation warning instead: struct dirent* readdir(DIR* d) __attribute__((deprecated("Unsafe if not usi= ng large file support"))); f1.c:5:2: warning: 'readdir' is deprecated: Unsafe if not using large file = support [-Wdeprecated-declarations] But then the warning would not be emitted in the, for example, json-c packa= ge, but in cmake. I would never have found the bug that way. I guess now we know, but still... only emitting a warning seems weak for something this ba= d. So I'm against it. > I'm more in favor of dropping it on the > build farm, >or just keep fixing things on a per-package basis. That means you have to *find* the things first. The only reason glibc found the problem this one time is because the d_off telldir pointer just happened to be >=3D 2**32. If it's not bigger one time, and is bigger another time = in the same package build run, then you get weird unreproducible results. That is horrible. A copy-recursively could leave half the files off and not tell anyone. PLEASE let's do something about it. Also, let's delete all our ARM substitutes we have so far. > A less user-hostile solution could perhaps be to (setenv "CFLAGS" > "-D_FILE_OFFSET_BITS=3D64") on 32-bit architectures in gnu-build-system. Sure, we can definitely do that in addition. It's a good idea. There can be no ABI compatibility problem inside Guix if *all* Guix packages use that flag. While we are at it, we can do it on ALL architectures. Because that's what we actually want. Should there be a 128 bit CPU, we still want all CPUs (no matter how many bit registers the CPU has) use the same file offset bit size. But will that be picked up everywhere? cmake? Rust C extensions? Python extensions? > Not sure whether that could cause any adverse effects. It will. And if we want to find out where those effects would be, this patch--or one with warnings (that one is only useful if there is a way to automatically collect AND KEEP all warnings into one place)--is how to find out. > But again, I don't like the idea of optimizing for QEMUs user-mode emula= tion. It's not optimizing for qemu specifically. Qemu is not at fault here. (guest) glibc is at fault. Because of the way we build it. I've thought through a few possible fixes for this problem: (1) Wait until the kernel adds a "I-have-space-for-x-bits" field to the getdents64 system call and for qemu to actually use it. But how would qemu know how much space the guest glibc has in its field? But fine, let's say they have a crystal ball. Even then, what is ext4 going to do with this request? Just cut the cookie somewhere and throw away half? That's not safe. So this solution is right out. (2) Have glibc use getdents on 32 bit and getdents64 on 64 bit. See above-= -only now qemu can tell know much space the guest glibc has, and so can the host = kernel (which is much better). Cutting the cookie is not safe. So this solution is still out, barely. (3) Have qemu maintain a mapping table per file of which 64-bit d_off is associated with which weird cookie qemu invents and remembers. That doesn't help us in the case where you DO NOT USE qemu but still have a 32 bit executable running on a 64 bit host (natively). Therefore, this solution is out. (4) Build user space with large file support. There are no downsides to th= is, and as a distribution we can indeed decide only to support large file guix packages. I argue that this is the right solution for all "Guix system" ho= sts, because of the "SD card" argument above. Also, directories with Guix on it aren't exactly small either. So you need it anyway. If there are embedded platforms with disks smaller than 4 GiB used in guix, we can given those glibcs an original dirent.h (they will fail loudly--so we will find them ea= sily). I prefer this solution. (5) Build the qemu that emulates 32 bit ARM for 32 bit i686 (this is not a = typo). In that case the kernel would enable compatibility mode (X86_32 personality= ) and it would work. I want to stress that this compatibility mode calculates different hashes depending on the program that is requesting it, on the same host, directory and entry. This is not something one would want, especially when building things--the kernel is basically lying to you. See kernel option CONFIG_X86_32--which we enable. It would work okay in practice, because user space programs usually don't s= tore d_off. But (4) is just better. P.S. by now, my local build found a binutils 2.35.1 bfd plugin, and (libstdc++-v3/src/filesystem/dir.cc in) libstdc++ 7.5.0, that also don't use large files and thus would also only work sporadically in random ways. How many hundreds of packages does this affect that we don't know yet? PPS. I get a lot of "grep: invalid option -o" messages when building glibc-2.32--even without the patch. That's probably not good either. PPPS. new patch will come soon. --Sig_/.UThJ_OqIlDjljxsYSUzUuS Content-Type: application/pgp-signature Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- iQEzBAEBCgAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAl9txFQACgkQ5xo1VCww uqWCSwf/WwR2WWHPz4kO0hVdcEjmjMTwarKs1VF5DNs+pA6WwVN5nB2Nyp5El4mV wwo4ik3BAzOicLBHwNYk6H+/RINRqFrtDqHUkC6dRo3zXzhmtgXSs2PajYie2ljr eviZNBmZcwGdEitvv6E/9+K4fbDIIG6rhygWiY/x9YD+3/mdw8eTJ5Km1tnjCSZ4 uPT7CiWB43LuValm7R9k034N/dgix8bT8QTgLPQ25X7bhII1sX6Xs33d20eLCDen cD1pKmcCvbVgh5ZGsG8BI7Lcp86AfUvfOVlZZlwNIFvSVS9DUMJ23sCiL48GGFV8 FYUb2ZvPyFdP60JzPykH/PlpTXg8nA== =eqNk -----END PGP SIGNATURE----- --Sig_/.UThJ_OqIlDjljxsYSUzUuS--