unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
@ 2020-09-24 14:12 Danny Milosavljevic
  2020-09-24 14:16 ` Danny Milosavljevic
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-24 14:12 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* gnu/packages/commencement.scm (glibc-final): Catch all cases of a glibc user not
requesting 64-bit offsets and then using readdir.
---
 gnu/packages/commencement.scm | 52 ++++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index e5a4caa95c..5a62c9df3a 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -3462,7 +3462,57 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                ,hurd-headers-boot0)
              '())
        ,@(package-outputs glibc-final-with-bootstrap-bash))
-      ,@(package-arguments glibc-final-with-bootstrap-bash)))))
+      ,@(substitute-keyword-arguments
+         (package-arguments glibc-final-with-bootstrap-bash)
+         ((#:phases phases)
+          `(modify-phases ,phases
+             (add-after 'unpack 'patch-dirent
+               (lambda* (#:key outputs #:allow-other-keys)
+                 ;; QEMU transparent emulation is in somewhat of a pickle sometimes.
+                 ;; There is no support in the kernel syscalls of specifying what
+                 ;; kind of userspace you are emulating.  Some parts of the
+                 ;; structures passed back-and-forth between kernel and guest
+                 ;; userspace can change size (including size of individual fields).
+                 ;;
+                 ;; One of the affected structures is "struct dirent".  The ext4
+                 ;; file system puts a 64 bit hash into "d_off" on the kernel side.
+                 ;; If the guest system's glibc is 32 bit it is going to be very
+                 ;; confused (it does check whether d_off fits into the structure
+                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
+                 ;; fails).
+                 ;; This manifests itself in simple directory reads not working
+                 ;; anymore in parts of cmake, for example.
+                 ;;
+                 ;; There is a very simple and complete way to avoid this problem:
+                 ;; Just always use 64 bit offsets in user space programs (also
+                 ;; on 32 bit machines).
+                 ;;
+                 ;; Note: We might want to avoid using 64 bit when bootstrapping
+                 ;; using mescc (since mescc doesn't directly support 64 bit
+                 ;; values)--but then bootstrapping has to be done on a
+                 ;; file system other than ext4, or on ext4 with the feature
+                 ;; "dir_index" disabled.
+                 ;;
+                 ;; The change below does not affect 64 bit users.
+                 ;;
+                 ;; See <https://issues.guix.gnu.org/43513>.
+                 (let ((port (open-file "include/dirent.h" "a")))
+                   (display "
+#if __SIZEOF_LONG__ < 8
+#ifndef __USE_FILE_OFFSET64
+#undef readdir
+#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@
+#endif
+#endif
+" port)
+                   (close-port port))
+                 ;; This file includes <dirent.h> and thus checks sanity already.
+                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
+                 (substitute* "posix/glob.c"
+                  (("(#[ ]*define[ ][ ]*readdir)") "
+#undef readdir
+#define readdir"))
+                 #t)))))))))
 
 (define/system-dependent gcc-boot0-wrapped
   ;; Make the cross-tools GCC-BOOT0 and BINUTILS-BOOT0 available under the




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 14:12 [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
@ 2020-09-24 14:16 ` Danny Milosavljevic
  2020-09-24 18:17 ` Marius Bakke
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
  2 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-24 14:16 UTC (permalink / raw)
  To: 43591

[-- Attachment #1: Type: text/plain, Size: 641 bytes --]

On Thu, 24 Sep 2020 16:12:11 +0200
Danny Milosavljevic <dannym@scratchpost.org> wrote:
> +                 ;; Note: We might want to avoid using 64 bit when bootstrapping
> +                 ;; using mescc (since mescc doesn't directly support 64 bit
> +                 ;; values)--but then bootstrapping has to be done on a
> +                 ;; file system other than ext4, or on ext4 with the feature
> +                 ;; "dir_index" disabled.

Or on a real 32 bit machine, where there is no problem with mismatching struct
sizes between emulated guest glibc and host system kernel, because there is no
emulation going on.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 14:12 [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
  2020-09-24 14:16 ` Danny Milosavljevic
@ 2020-09-24 18:17 ` Marius Bakke
  2020-09-24 20:27   ` Danny Milosavljevic
  2020-09-25 10:24   ` Danny Milosavljevic
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
  2 siblings, 2 replies; 33+ messages in thread
From: Marius Bakke @ 2020-09-24 18:17 UTC (permalink / raw)
  To: Danny Milosavljevic, 43591; +Cc: Danny Milosavljevic

[-- Attachment #1: Type: text/plain, Size: 3375 bytes --]

Danny Milosavljevic <dannym@scratchpost.org> writes:

> +                 ;; QEMU transparent emulation is in somewhat of a pickle sometimes.
> +                 ;; There is no support in the kernel syscalls of specifying what
> +                 ;; kind of userspace you are emulating.  Some parts of the
> +                 ;; structures passed back-and-forth between kernel and guest
> +                 ;; userspace can change size (including size of individual fields).
> +                 ;;
> +                 ;; One of the affected structures is "struct dirent".  The ext4
> +                 ;; file system puts a 64 bit hash into "d_off" on the kernel side.
> +                 ;; If the guest system's glibc is 32 bit it is going to be very
> +                 ;; confused (it does check whether d_off fits into the structure
> +                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
> +                 ;; fails).
> +                 ;; This manifests itself in simple directory reads not working
> +                 ;; anymore in parts of cmake, for example.

Note that for CMake in particular, this problem will be fixed in 3.19:

  https://gitlab.kitware.com/cmake/cmake/-/issues/20568

As mentioned in that issue, and which this patch states on no uncertain
terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.

> +                 ;;
> +                 ;; There is a very simple and complete way to avoid this problem:
> +                 ;; Just always use 64 bit offsets in user space programs (also
> +                 ;; on 32 bit machines).
> +                 ;;
> +                 ;; Note: We might want to avoid using 64 bit when bootstrapping
> +                 ;; using mescc (since mescc doesn't directly support 64 bit
> +                 ;; values)--but then bootstrapping has to be done on a
> +                 ;; file system other than ext4, or on ext4 with the feature
> +                 ;; "dir_index" disabled.
> +                 ;;
> +                 ;; The change below does not affect 64 bit users.
> +                 ;;
> +                 ;; See <https://issues.guix.gnu.org/43513>.
> +                 (let ((port (open-file "include/dirent.h" "a")))
> +                   (display "
> +#if __SIZEOF_LONG__ < 8
> +#ifndef __USE_FILE_OFFSET64
> +#undef readdir
> +#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@

Won't this break _everything_ that uses readdir() without 64-bit
offsets?  Or does that @@ string get substituted by the glibc build
system somehow.

> +#endif
> +#endif
> +" port)
> +                   (close-port port))
> +                 ;; This file includes <dirent.h> and thus checks sanity already.
> +                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
> +                 (substitute* "posix/glob.c"
> +                  (("(#[ ]*define[ ][ ]*readdir)") "
> +#undef readdir
> +#define readdir"))

Can you file a bug report upstream about the duplicate definition(s)?

Enforcing this restriction in glibc feels rather sledgehammer-y.  Would
it make sense to introduce a GCC warning instead?  I'm sure there are
legitimate uses of smaller file offsets (i.e. embedded).  A GCC warning
will still break -Werror, but that's a lot more manageable than breaking
almost every use of readdir() on 32-bit platforms.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 18:17 ` Marius Bakke
@ 2020-09-24 20:27   ` Danny Milosavljevic
  2020-09-24 23:11     ` Marius Bakke
  2020-09-25 10:24   ` Danny Milosavljevic
  1 sibling, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-24 20:27 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 3284 bytes --]

Hi Marius,

On Thu, 24 Sep 2020 20:17:14 +0200
Marius Bakke <marius@gnu.org> wrote:

> As mentioned in that issue, and which this patch states on no uncertain
> terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.

Yeah.  The problem is how to find all those places where we need to add it,
and how to keep finding them as we maintain stuff (read: as upstream changes
stuff).

> Won't this break _everything_ that uses readdir() without 64-bit
> offsets?

That's the goal of that patch.  Because it won't work at runtime anyway, when
run on a guix x86_64 build machine, building for ARM (which is the usual way
we build stuff for ARM).

Maybe we can find out whether dirent.h is #included for compiling for one of
the main Guix platforms (I prefer this latter solution, if possible).

It's either that or we stop qemu transparent emulation on the build farm,
because the result isn't reliable.  This time it was "caught" because of
an overabundance of caution on behalf of the glibc people.  We won't be
so lucky next time.

>  Or does that @@ string get substituted by the glibc build
> system somehow.

No.  It's specifically made so downstream users of glibc on guix can't use
readdir() on 32-bit systems without enabling large file support.

It totally could use some more #if about whether it's building stuff for
an actual guix main platform (i.e. not on embedded).

> Can you file a bug report upstream about the duplicate definition(s)?

It's not really a problem for them.  But I can try anyway.

> Enforcing this restriction in glibc feels rather sledgehammer-y.  Would
> it make sense to introduce a GCC warning instead?  I'm sure there are
> legitimate uses of smaller file offsets (i.e. embedded).  A GCC warning
> will still break -Werror, but that's a lot more manageable than breaking
> almost every use of readdir() on 32-bit platforms.

I thought about a gcc #warning.  But I don't want it to warn when readdir
isn't used in the first place but dirent.h ended up being included by some
dependency.  I guess we could use a deprecation warning on readdir() instead.
Can this warning print a custom message?

What do you think?

In any case, I'd like to have some overview of how bad the problem is and
thus I'd like to have a wip branch with this patch being built entirely
for 32 bits via x86_64 (not sure whether that includes i686 btw).
Is it possible to force using these x86_64 machines in the build farm?

>I'm sure there are legitimate uses of smaller file offsets (i.e. embedded).

On guix?  With our glibc?  Do we support that?

>A GCC warning will still break -Werror, but that's a lot more manageable
>than breaking almost every use of readdir() on 32-bit platforms.

These uses of readdir() are not reliable the way we are building guix on
cuirass.

But I'm all for refining this patch, if there are suggestions on how.

Actually, I think a warning is not strong enough.  This bug made it all the
way down to json-c (!) before finally being detected.  That should not happen.

And if the build farm would have selected an actual ARM machine, the build
would have succeeded and no one on x86_64 could have reproduced the result.
That would have been real bad.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 20:27   ` Danny Milosavljevic
@ 2020-09-24 23:11     ` Marius Bakke
  2020-09-25 10:20       ` Danny Milosavljevic
                         ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Marius Bakke @ 2020-09-24 23:11 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 1839 bytes --]

Danny Milosavljevic <dannym@scratchpost.org> writes:

> Hi Marius,
>
> On Thu, 24 Sep 2020 20:17:14 +0200
> Marius Bakke <marius@gnu.org> wrote:
>
>> As mentioned in that issue, and which this patch states on no uncertain
>> terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.
>
> Yeah.  The problem is how to find all those places where we need to add it,
> and how to keep finding them as we maintain stuff (read: as upstream changes
> stuff).
>
>> Won't this break _everything_ that uses readdir() without 64-bit
>> offsets?
>
> That's the goal of that patch.  Because it won't work at runtime anyway, when
> run on a guix x86_64 build machine, building for ARM (which is the usual way
> we build stuff for ARM).

Why is this not an issue with i686 on x86_64 kernels?  Or armhf on
AArch64?  This problem only manifests when using QEMUs syscall
emulation, right?

> Maybe we can find out whether dirent.h is #included for compiling for one of
> the main Guix platforms (I prefer this latter solution, if possible).
>
> It's either that or we stop qemu transparent emulation on the build farm,
> because the result isn't reliable.  This time it was "caught" because of
> an overabundance of caution on behalf of the glibc people.  We won't be
> so lucky next time.

Arguably running code for foreign architectures through QEMU binfmt is
something of a hack.  Mandating that every package *must* be patched to
support it seems user-hostile.  I'm more in favor of dropping it on the
build farm, or just keep fixing things on a per-package basis.

A less user-hostile solution could perhaps be to (setenv "CFLAGS"
"-D_FILE_OFFSET_BITS=64") on 32-bit architectures in gnu-build-system.
Not sure whether that could cause any adverse effects.  But again, I
don't like the idea of optimizing for QEMUs user-mode emulation.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 23:11     ` Marius Bakke
@ 2020-09-25 10:20       ` Danny Milosavljevic
  2020-09-25 10:42         ` [bug#43591] [PATCH v2 core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
  2020-09-25 13:36         ` [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
  2020-09-25 20:03       ` Andreas Enge
  2020-09-29 20:52       ` Ludovic Courtès
  2 siblings, 2 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-25 10:20 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 9865 bytes --]

Hi Marius,

On Fri, 25 Sep 2020 01:11:18 +0200
Marius Bakke <marius@gnu.org> wrote:

> Arguably running code for foreign architectures through QEMU binfmt is
> something of a hack.

The problem here is not qemu.  It's not even the size of the whole struct
dirent that's the problem, or endianness, or alignment, or any of that stuff.

The problem is that if you have a 64-bit hash value then you can't fit it into
a 32-bit field.

You have to make the field bigger.

It's the only non-insane way to fix this and still be able to use the host
filesystem from the guest.

This bug is making readdir randomly and silently succeed or fail.

Note: The bug I'm referring to is caused by having 32 bit offsets in user space
in the first place--nothing else.  That is the root of the problem, not qemu,
not ext4, not the kernel.  Having 32 bit offsets in user space when kernel space
uses 64 bit offsets is just weird (note: the kernel always uses 64 bit
offsets--also on 32 bit architectures!).

Note: This also affects 9p, which has been recently "fixed" in the Linux kernel.

>Why is this not an issue with i686 on x86_64 kernels?

I'm not sure.  I'll check.

>Or armhf on AArch64?

I cannot check that, but it totally should be a problem there.

It has nothing to do with qemu specifically, and IMO qemu is right in not
"fixing" it (because it's NOT a problem in qemu), and in expecting
32 bit guests not to use the getdents64 system call :P

But even using getdents (no 64) as a flag what size d_off the caller
expects is an ugly workaround.
Not because of the flag, but because of what you have to do if it's set.

The right fix is to switch to LFS (and thus also 64 bit d_off) on 32 bit guests.
It's 2020; normal systems have drives a lot bigger than 4 GiB.  Even Raspberrys
have bigger drives.

Also, just for the record, glibc always calls getdents64, never getdents.
Using getdents64 and then being surprised when a 64 bit result comes back is
seriously weird.  And they are still defending it!

If they didn't do that, at least the kernel would know something is up,
and qemu would still pass through getdents as getdents to the host kernel and
getdents64 as getdents64 to the host kernel.  Also, qemu could warn on calls
to getdents (and not on calls to getdents64).

There are no downsides to knowing what the user expects.

But even then, user space probably would have LFS disabled.  That means
as soon as user space mounts something from NFS or 9p or whatever on their
(let's say embedded) platform, they would have this problem again,
and it would be very difficult to find what happened.  Why keep this problem
alive?

>This problem only manifests when using QEMUs syscall emulation, right?

No.

It's a fundamental problem with the slot you want to store something in
being too small for it.

We just didn't see it yet, because usually the system *silently returns* halfway
through the readdir loop and does not tell us about any problems once it
encounters any d_off >= 2**32 along the way.

In the "json-c" bug, luckily json-c or cmake actually NEEDED a file that had
not been returned because of that.  That is not a given.

>Mandating that every package *must* be patched to support it seems user-hostile.

Ok, then how about a flag that users can set in order to let it compile
regardless?  But then they shouldn't use that flag in any (non-cross-compiled)
guix packages.
We can mention it in the news so people are not surprised.
(Thinking about it, the flag could just be "-D_FILE_OFFSET_BITS=32".  Because
when you are setting it explicitly, you probably know what you are doing)

It really depends on whether we officially support running 32 bit guests
on 64 bit hosts.  Qemu has nothing to do with it.

Since the majority of our build farm runs qemu transparent emulation, and
since a majority of our ARM packages are built using it, it seems that we
decided to support 32 bit guests on 64 bit hosts.  Well, then something like
this patch has to be done, otherwise we have Schrödinger's build system--and
no one likes that.

(I want to stress that the main problem is the case where readdir SUCCEEDS
even though there is a problem;
if readdir would always fail in the situation with a 32 bit guest on
64 bit host, that would be much less bad.  But it doesn't always fail!)

I'd be really uneasy if the sanity check this patch introduces were only
enabled when explicitly setting a flag--for something as bad as this.

Especially for an error that silently makes it into base derivations and then
distributes all the way to who-knows-what normal client derivations.

Keeping the user from shooting himself in the foot by default if he doesn't
say "yes, please shoot me in the foot" beforehand is not "user-hostile".

We could also do a deprecation warning instead:

struct dirent* readdir(DIR* d) __attribute__((deprecated("Unsafe if not using large file support")));

f1.c:5:2: warning: 'readdir' is deprecated: Unsafe if not using large file support [-Wdeprecated-declarations]

But then the warning would not be emitted in the, for example, json-c package,
but in cmake.  I would never have found the bug that way.  I guess now we
know, but still... only emitting a warning seems weak for something this bad.

So I'm against it.

>  I'm more in favor of dropping it on the
> build farm,

>or just keep fixing things on a per-package basis.

That means you have to *find* the things first.  The only reason glibc found
the problem this one time is because the d_off telldir pointer just happened
to be >= 2**32.  If it's not bigger one time, and is bigger another time in
the same package build run, then you get weird unreproducible results.

That is horrible.  A copy-recursively could leave half the files off and not
tell anyone.  PLEASE let's do something about it.  Also, let's delete all
our ARM substitutes we have so far.

> A less user-hostile solution could perhaps be to (setenv "CFLAGS"
> "-D_FILE_OFFSET_BITS=64") on 32-bit architectures in gnu-build-system.

Sure, we can definitely do that in addition.  It's a good idea.  There can
be no ABI compatibility problem inside Guix if *all* Guix packages use that
flag.

While we are at it, we can do it on ALL architectures.  Because that's what
we actually want.  Should there be a 128 bit CPU, we still want all CPUs
(no matter how many bit registers the CPU has) use the same file offset
bit size.

But will that be picked up everywhere?  cmake?  Rust C extensions?  Python
extensions?

> Not sure whether that could cause any adverse effects.

It will.  And if we want to find out where those effects would be, this
patch--or one with warnings (that one is only useful if there is a way to
automatically collect AND KEEP all warnings into one place)--is how to find
out.

>  But again, I don't like the idea of optimizing for QEMUs user-mode emulation.

It's not optimizing for qemu specifically.  Qemu is not at fault here.
(guest) glibc is at fault.  Because of the way we build it.

I've thought through a few possible fixes for this problem:

(1) Wait until the kernel adds a "I-have-space-for-x-bits" field to the
getdents64 system call and for qemu to actually use it.  But how would qemu
know how much space the guest glibc has in its field?
But fine, let's say they have a crystal ball.  Even then, what is ext4 going
to do with this request?  Just cut the cookie somewhere and throw away half?
That's not safe.  So this solution is right out.

(2) Have glibc use getdents on 32 bit and getdents64 on 64 bit.  See above--only
now qemu can tell know much space the guest glibc has, and so can the host kernel
(which is much better).
Cutting the cookie is not safe.  So this solution is still out, barely.

(3) Have qemu maintain a mapping table per file of which 64-bit d_off is
associated with which weird cookie qemu invents and remembers.

That doesn't help us in the case where you DO NOT USE qemu but still have
a 32 bit executable running on a 64 bit host (natively).

Therefore, this solution is out.

(4) Build user space with large file support.  There are no downsides to this,
and as a distribution we can indeed decide only to support large file guix
packages.  I argue that this is the right solution for all "Guix system" hosts,
because of the "SD card" argument above.  Also, directories with Guix on it
aren't exactly small either.  So you need it anyway.  If there are embedded
platforms with disks smaller than 4 GiB used in guix, we can given those
glibcs an original dirent.h (they will fail loudly--so we will find them easily).

I prefer this solution.

(5) Build the qemu that emulates 32 bit ARM for 32 bit i686 (this is not a typo).
In that case the kernel would enable compatibility mode (X86_32 personality) and
it would work.  I want to stress that this compatibility mode calculates
different hashes depending on the program that is requesting it, on the
same host, directory and entry.  This is not something one would want,
especially when building things--the kernel is basically lying to you.
See kernel option CONFIG_X86_32--which we enable.

It would work okay in practice, because user space programs usually don't store
d_off.  But (4) is just better.

P.S. by now, my local build found a binutils 2.35.1 bfd plugin, and
(libstdc++-v3/src/filesystem/dir.cc in) libstdc++ 7.5.0, that also don't use
large files and thus would also only work sporadically in random ways.

How many hundreds of packages does this affect that we don't know yet?

PPS. I get a lot of "grep: invalid option -o" messages when building
glibc-2.32--even without the patch.  That's probably not good either.

PPPS. new patch will come soon.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 18:17 ` Marius Bakke
  2020-09-24 20:27   ` Danny Milosavljevic
@ 2020-09-25 10:24   ` Danny Milosavljevic
  1 sibling, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-25 10:24 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 467 bytes --]

On Thu, 24 Sep 2020 20:17:14 +0200
Marius Bakke <marius@gnu.org> wrote:

> As mentioned in that issue, and which this patch states on no uncertain
> terms, a workaround is to use -D_FILE_OFFSET_BITS=64 on 32-bit platforms.

That is not a workaround.  That is the correct fix.

The kernel uses 64 bit offsets on 32-bit platforms anyway.

So either they don't ask the kernel questions they don't want to know the
answer to, or they use 64 bit offsets, too.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH v2 core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless.
  2020-09-25 10:20       ` Danny Milosavljevic
@ 2020-09-25 10:42         ` Danny Milosavljevic
  2020-09-25 13:36         ` [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
  1 sibling, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-25 10:42 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* gnu/packages/commencement.scm (glibc-final): Catch all cases of a glibc user not
requesting 64-bit offsets and then using readdir.
---
 gnu/packages/commencement.scm | 68 ++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index e5a4caa95c..284fa65d20 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -3462,7 +3462,73 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                ,hurd-headers-boot0)
              '())
        ,@(package-outputs glibc-final-with-bootstrap-bash))
-      ,@(package-arguments glibc-final-with-bootstrap-bash)))))
+      ,@(substitute-keyword-arguments
+         (package-arguments glibc-final-with-bootstrap-bash)
+         ((#:phases phases)
+          `(modify-phases ,phases
+             (add-after 'unpack 'patch-dirent
+               (lambda* (#:key outputs #:allow-other-keys)
+                 ;; Linux kernel file offsets are always 64 bits.
+                 ;; But userspace can be built to use 32 bit offsets.
+                 ;;
+                 ;; "struct dirent", returned by readdir, uses d_off to store
+                 ;; such an "offset" that it got from the Linux kernel.
+                 ;; In the case of ext4 that "offset" is actually a 64 bit
+                 ;; cookie which includes a hash value.
+                 ;;
+                 ;; Therefore, there are cases where such an offset that it got
+                 ;; from the Linux kernel does not fit in the "struct dirent"
+                 ;; field "d_off".
+                 ;;
+                 ;; If the guest system's glibc is 32 bit AND uses 32 bit
+                 ;; file offsets it is going to be very confused.
+                 ;; It does check whether d_off fits into the structure
+                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
+                 ;; fails, with errno == EOVERFLOW (which is undocumented and thus
+                 ;; an API error).
+                 ;; This manifests itself in simple directory reads not working
+                 ;; anymore in parts of cmake, for example.
+                 ;;
+                 ;; This manifested in Guix when building stuff for
+                 ;; ARMHF on a x86_64 build host using QEMU transparent emulation.
+                 ;;
+                 ;; There is a very simple and complete way to avoid this problem:
+                 ;; Just always use 64 bit offsets in user space programs (also
+                 ;; on 32 bit machines).  The Linux kernel does that already
+                 ;; anyway.
+                 ;;
+                 ;; Note: We might want to avoid using 64 bit when bootstrapping
+                 ;; using mescc (since mescc doesn't directly support 64 bit
+                 ;; values)--but then bootstrapping has to be done on a
+                 ;; file system other than ext4, or on ext4 with the feature
+                 ;; "dir_index" disabled.
+                 ;;
+                 ;; The change below does not affect 64 bit users.
+                 ;;
+                 ;; See <https://issues.guix.gnu.org/43513>.
+                 (let ((port (open-file "dirent/dirent.h" "a")))
+                   (display "
+#ifndef _LIBC
+#if __SIZEOF_LONG__ < 8
+#ifndef __USE_FILE_OFFSET64
+#if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 32
+#warning \"Using -D_FILE_OFFSET_BITS=32 and using readdir is a bad idea, see <https://bugzilla.kernel.org/show_bug.cgi?id=205957>\"
+#else
+#undef readdir
+#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@
+#endif
+#endif
+#endif
+#endif
+" port)
+                   (close-port port))
+                 ;; This file includes <dirent.h> and thus checks sanity already.
+                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
+                 (substitute* "posix/glob.c"
+                  (("(#[ ]*define[ ][ ]*readdir)") "
+#undef readdir
+#define readdir"))
+                 #t)))))))))
 
 (define/system-dependent gcc-boot0-wrapped
   ;; Make the cross-tools GCC-BOOT0 and BINUTILS-BOOT0 available under the




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-25 10:20       ` Danny Milosavljevic
  2020-09-25 10:42         ` [bug#43591] [PATCH v2 core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
@ 2020-09-25 13:36         ` Danny Milosavljevic
  2020-09-25 15:33           ` Danny Milosavljevic
                             ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-25 13:36 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 3070 bytes --]

> >Why is this not an issue with i686 on x86_64 kernels?  
> 
> I'm not sure.  I'll check.

$ cat a00.c
#include <stdio.h>
#if defined( __ILP32__ )
#warning ILP32
#endif
int main() {
        return sizeof(off_t);
}
$ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 a00.c
a00.c:3:2: warning: #warning ILP32 [-Wcpp]
    3 | #warning ILP32
      |  ^~~~~~~
$ ./a00
$ echo $?
4

That means they are using the Linux kernel's X86_32 ABI.
It has its own getdents64 system call that returns another value for d_off.

$ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 -D_FILE_OFFSET_BITS=64 a00.c
a00.c:3:2: warning: #warning ILP32 [-Wcpp]
    3 | #warning ILP32
      |  ^~~~~~~
$ ./a00
$ echo $?
8

That is why __i686__ is not affected--at the cost of the kernel lying to us
about the file system.

Note that I also tried printing the actual d_off values[1]--on ILP32, even with
_FILE_OFFSET_BITS=64, the VALUES are still 32 bits, and the same values as
without _FILE_OFFSET_BITS.  The d_off SLOT gets bigger on _FILE_OFFSET_BITS=64.

(I also tried printing the actual d_off values[1] on x86_64 without any guix
environment -s, I get entirely different d_off values!!)

I also tried the former on native ARMHF--you get 32 bits d_off values.  And d_off
is always the same size as off_t.

off_t size changes depending on _FILE_OFFSET_BITS.

I do not have access to a real aarch64 machine--so no idea how it is there.
That would be the most interesting case, because those don't have X86_32,
so ILP32 is either not present or implemented differently.

ppc64 would also be interesting...

Test result of [1]:

system _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
-------------------------------------------------------------
x86_64 -                 8 Byte  8 Byte         8 Byte
i686   -                 4 Byte  4 Byte         4 Byte
i686   64                8 Byte  8 Byte         4 Byte
i686   32                4 Byte  4 Byte         4 Byte
i686   7                 4 Byte  4 Byte         4 Byte
armhf  -                 4 Byte  4 Byte         4 Byte
armhf  64                8 Byte  8 Byte         4 Byte
armhf  32                4 Byte  4 Byte         4 Byte
armhf  7                 4 Byte  4 Byte         4 Byte

This is all without qemu--in order to simplify the test case.

So I take it ext4 has some special compilation mode for 32 bits...

Could someone please test [1] on (real!) aarch64 and ppc64 and ppc32
machines?

[1] $ cat a00.c 
#include <stdio.h>
#include <errno.h>
#include <assert.h>
#include <dirent.h>
#if defined( __ILP32__ )
#warning ILP32
#endif

int main() {
        DIR* d;
        struct dirent* ent;
        d = opendir("/tmp");
        errno = 0;
        assert(sizeof(ent->d_off) == sizeof(off_t));
        while ((ent = readdir(d)) != NULL) {
                printf("%llu\n", (unsigned long long) ent->d_off);
        }
        if (errno)
                perror("readdir");
        return sizeof(off_t);

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-25 13:36         ` [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
@ 2020-09-25 15:33           ` Danny Milosavljevic
  2020-09-26  1:42           ` Danny Milosavljevic
  2020-09-27  6:43           ` Efraim Flashner
  2 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-25 15:33 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 2007 bytes --]

Hi,

I wrote a FUSE filesystem to test what happens with big d_off (I just
hard-or-ed a bitmask) and ran it on a real ARMHF machine, then made the program
from before([1] from before) look into that directory.

Result (on ARMHF, so real 32 bit machine!):

$ gcc --version
gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
$ gcc a00.c 
$ ./a00 
1737031971 .
1737032035 ..
1737032035 hello
$ gcc -D_FILE_OFFSET_BITS=64 a00.c 
$ ./a.out 
320255973458211 .
320255973458275 ..
320255973458275 hello

(Note: Guix gcc-toolchain 10 on ARMHF is still building from source--and
will continue to do so for some hours I guess)

I only had to patch fuse 2.9.4 (lib/fuse_lowlevel.c) to do this:

char *fuse_add_dirent(char *buf, const char *name, const struct stat *stbuf,
                      off_t off)
{
        unsigned namelen = strlen(name);
        unsigned entlen = FUSE_NAME_OFFSET + namelen;
        unsigned entsize = fuse_dirent_size(namelen);
        unsigned padlen = entsize - entlen;
        struct fuse_dirent *dirent = (struct fuse_dirent *) buf;

        dirent->ino = stbuf->st_ino;
        dirent->off = off | 0x1234567890123; // !!!!
        dirent->namelen = namelen;
        dirent->type = (stbuf->st_mode & 0170000) >> 12;
        strncpy(dirent->name, name, namelen);
        if (padlen)
                memset(buf + entlen, 0, padlen);

        return buf + entsize;
}

(I DID NOT have to patch the kernel or even have root)

So it can happen that you get 64 bit d_off even on real 32 bit machines!

That's what I thought--but I still wanted to make sure.

And the same on Guix i686 (a00 is [1] from my previous e-mail):

$ ./a00-i686
readdir: Value too large for defined data type
$ ./a00-i686_flag_32 
readdir: Value too large for defined data type
$ ./a00-i686_flag_64
320255973458211
320255973458275
320255973458275

So there you have it, even on i686--without emulating anything--you can get a
64 bit d_off value.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 23:11     ` Marius Bakke
  2020-09-25 10:20       ` Danny Milosavljevic
@ 2020-09-25 20:03       ` Andreas Enge
  2020-09-26 10:50         ` Danny Milosavljevic
  2020-09-29 20:52       ` Ludovic Courtès
  2 siblings, 1 reply; 33+ messages in thread
From: Andreas Enge @ 2020-09-25 20:03 UTC (permalink / raw)
  To: Marius Bakke; +Cc: Danny Milosavljevic, 43591

On Fri, Sep 25, 2020 at 01:11:18AM +0200, Marius Bakke wrote:
> Arguably running code for foreign architectures through QEMU binfmt is
> something of a hack.  Mandating that every package *must* be patched to
> support it seems user-hostile.  I'm more in favor of dropping it on the
> build farm

Indeed it is weird we do not only compile packages natively on the build
farm. If we do not have enough aarch64 machines, would it make sense to buy
some more? How many would we need? Which are our options of machines that
can run the GNU system (as opposed to Guix on a foreign distro, and possibly
non-free firmware)?

Andreas





^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-25 13:36         ` [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
  2020-09-25 15:33           ` Danny Milosavljevic
@ 2020-09-26  1:42           ` Danny Milosavljevic
  2020-09-26  1:49             ` Danny Milosavljevic
  2020-09-27  6:43           ` Efraim Flashner
  2 siblings, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-26  1:42 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 1417 bytes --]

Hi,

by now I also tested it on overdrive1, an aarch64 machine.

> I do not have access to a real aarch64 machine--so no idea how it is there.
> That would be the most interesting case, because those don't have X86_32,
> so ILP32 is either not present or implemented differently.

I added it in the table below under "a64armhf".

ILP32 is not present on aarch64.

Test result of [1] updated:

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         4 Byte
i686     32                4 Byte  4 Byte         4 Byte
i686     7                 4 Byte  4 Byte         4 Byte
armhf    -                 4 Byte  4 Byte         4 Byte
armhf    64                8 Byte  8 Byte         4 Byte
armhf    32                4 Byte  4 Byte         4 Byte
armhf    7                 4 Byte  4 Byte         4 Byte
a64armhf -                 4 Byte  4 Byte         FAIL
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL
a64armhf 7                 4 Byte  4 Byte         FAIL

For a64armhf, the version with -D_FILE_OFFSET_BITS=64 is the only one where
readdir succeeds on my specially-prepared directory.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-26  1:42           ` Danny Milosavljevic
@ 2020-09-26  1:49             ` Danny Milosavljevic
  2020-09-29 14:51               ` Danny Milosavljevic
  0 siblings, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-26  1:49 UTC (permalink / raw)
  To: Marius Bakke; +Cc: 43591

[-- Attachment #1: Type: text/plain, Size: 1636 bytes --]

On Sat, 26 Sep 2020 03:42:44 +0200
Danny Milosavljevic <dannym@scratchpost.org> wrote:

> Hi,
> 
> by now I also tested it on overdrive1, an aarch64 machine.
> 
> > I do not have access to a real aarch64 machine--so no idea how it is there.
> > That would be the most interesting case, because those don't have X86_32,
> > so ILP32 is either not present or implemented differently.  
> 
> I added it in the table below under "a64armhf".
> 
> ILP32 is not present on aarch64.
> 
> Test result of [1] updated:
> 
> system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
> ---------------------------------------------------------------
> x86_64   -                 8 Byte  8 Byte         8 Byte
> i686     -                 4 Byte  4 Byte         4 Byte
> i686     64                8 Byte  8 Byte         4 Byte
> i686     32                4 Byte  4 Byte         4 Byte
> i686     7                 4 Byte  4 Byte         4 Byte
> armhf    -                 4 Byte  4 Byte         4 Byte
> armhf    64                8 Byte  8 Byte         4 Byte
> armhf    32                4 Byte  4 Byte         4 Byte
> armhf    7                 4 Byte  4 Byte         4 Byte
> a64armhf -                 4 Byte  4 Byte         FAIL
> a64armhf 64                8 Byte  8 Byte         8 Byte
> a64armhf 32                4 Byte  4 Byte         FAIL
> a64armhf 7                 4 Byte  4 Byte         FAIL

aarch64    -                 8 Byte  8 Byte         8 Byte

> For a64armhf, the version with -D_FILE_OFFSET_BITS=64 is the only one where
> readdir succeeds on my specially-prepared directory.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-25 20:03       ` Andreas Enge
@ 2020-09-26 10:50         ` Danny Milosavljevic
  0 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-26 10:50 UTC (permalink / raw)
  To: Andreas Enge; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

Hi Andreas,

On Fri, 25 Sep 2020 22:03:23 +0200
Andreas Enge <andreas@enge.fr> wrote:

> On Fri, Sep 25, 2020 at 01:11:18AM +0200, Marius Bakke wrote:
> > Arguably running code for foreign architectures through QEMU binfmt is
> > something of a hack.  Mandating that every package *must* be patched to
> > support it seems user-hostile.  I'm more in favor of dropping it on the
> > build farm  
> 
> Indeed it is weird we do not only compile packages natively on the build
> farm.

I'm sorry that my earlier analysis mentioned qemu at all.  qemu is not at fault
at all.  The same happens if you run 32 bit code on 64 bit hosts without using
qemu!  Except for one case: i686 on x86_64--where they have a compatibility
layer in the kernel that works around this problem (which I don't like
either--it obfuscates the problem).

For example a problem appears on:

- armhf on x86_64 host; fault not because of qemu
- armhf on aarch64 host; not using qemu in the first place

And it should appear also on (but I didn't test):

- i686 on aarch64 host; not using qemu in the first place

The problem always appears if the host is 64 bits and the guest is 32 bits,
no matter what cpus both are, except for the case "i686 on x86_64".

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-25 13:36         ` [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
  2020-09-25 15:33           ` Danny Milosavljevic
  2020-09-26  1:42           ` Danny Milosavljevic
@ 2020-09-27  6:43           ` Efraim Flashner
  2 siblings, 0 replies; 33+ messages in thread
From: Efraim Flashner @ 2020-09-27  6:43 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 4188 bytes --]

On Fri, Sep 25, 2020 at 03:36:46PM +0200, Danny Milosavljevic wrote:
> > >Why is this not an issue with i686 on x86_64 kernels?  
> > 
> > I'm not sure.  I'll check.
> 
> $ cat a00.c
> #include <stdio.h>
> #if defined( __ILP32__ )
> #warning ILP32
> #endif
> int main() {
>         return sizeof(off_t);
> }
> $ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 a00.c
> a00.c:3:2: warning: #warning ILP32 [-Wcpp]
>     3 | #warning ILP32
>       |  ^~~~~~~
> $ ./a00
> $ echo $?
> 4
> 
> That means they are using the Linux kernel's X86_32 ABI.
> It has its own getdents64 system call that returns another value for d_off.
> 
> $ LC_ALL=C guix environment -s i686-linux gcc-toolchain -- gcc -o a00 -D_FILE_OFFSET_BITS=64 a00.c
> a00.c:3:2: warning: #warning ILP32 [-Wcpp]
>     3 | #warning ILP32
>       |  ^~~~~~~
> $ ./a00
> $ echo $?
> 8
> 
> That is why __i686__ is not affected--at the cost of the kernel lying to us
> about the file system.
> 
> Note that I also tried printing the actual d_off values[1]--on ILP32, even with
> _FILE_OFFSET_BITS=64, the VALUES are still 32 bits, and the same values as
> without _FILE_OFFSET_BITS.  The d_off SLOT gets bigger on _FILE_OFFSET_BITS=64.
> 
> (I also tried printing the actual d_off values[1] on x86_64 without any guix
> environment -s, I get entirely different d_off values!!)
> 
> I also tried the former on native ARMHF--you get 32 bits d_off values.  And d_off
> is always the same size as off_t.
> 
> off_t size changes depending on _FILE_OFFSET_BITS.
> 
> I do not have access to a real aarch64 machine--so no idea how it is there.
> That would be the most interesting case, because those don't have X86_32,
> so ILP32 is either not present or implemented differently.
> 
> ppc64 would also be interesting...
> 
> Test result of [1]:
> 
> system _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
> -------------------------------------------------------------
> x86_64 -                 8 Byte  8 Byte         8 Byte
> i686   -                 4 Byte  4 Byte         4 Byte
> i686   64                8 Byte  8 Byte         4 Byte
> i686   32                4 Byte  4 Byte         4 Byte
> i686   7                 4 Byte  4 Byte         4 Byte
> armhf  -                 4 Byte  4 Byte         4 Byte
> armhf  64                8 Byte  8 Byte         4 Byte
> armhf  32                4 Byte  4 Byte         4 Byte
> armhf  7                 4 Byte  4 Byte         4 Byte
> 
> This is all without qemu--in order to simplify the test case.
> 
> So I take it ext4 has some special compilation mode for 32 bits...
> 
> Could someone please test [1] on (real!) aarch64 and ppc64 and ppc32
> machines?
> 
> [1] $ cat a00.c 
> #include <stdio.h>
> #include <errno.h>
> #include <assert.h>
> #include <dirent.h>
> #if defined( __ILP32__ )
> #warning ILP32
> #endif
> 
> int main() {
>         DIR* d;
>         struct dirent* ent;
>         d = opendir("/tmp");
>         errno = 0;
>         assert(sizeof(ent->d_off) == sizeof(off_t));
>         while ((ent = readdir(d)) != NULL) {
>                 printf("%llu\n", (unsigned long long) ent->d_off);
>         }
>         if (errno)
>                 perror("readdir");
>         return sizeof(off_t);
missing } at the end

tested on my ibook G4:
(I believe for powerpc the 32-bit is defined as __powerpc__)

(ins)efraim@g4:~$ gcc -o a00 a00.c
(ins)efraim@g4:~$ ./a00
404218588
473424804
681064434
708508942
805784207
980330722
1080794671
1734802884
1909818320
1923689764
2019671154
2125602108
2147483647
(ins)efraim@g4:~$ echo $?
4
(ins)efraim@g4:~$ rm a00
(ins)efraim@g4:~$ gcc -o a00 -D_FILE_OFFSET_BITS=64 a00.c
(ins)efraim@g4:~$ ./a00
404218588
473424804
681064434
708508942
805784207
980330722
1080794671
1734802884
1909818320
1923689764
2019671154
2125602108
2147483647
(ins)efraim@g4:~$ echo $?
8



-- 
Efraim Flashner   <efraim@flashner.co.il>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-26  1:49             ` Danny Milosavljevic
@ 2020-09-29 14:51               ` Danny Milosavljevic
  0 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-29 14:51 UTC (permalink / raw)
  To: 43591; +Cc: Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 1030 bytes --]

(Note: ILP32 is not present on aarch64)

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         FAIL*
i686     32                4 Byte  4 Byte         FAIL*
i686     7                 4 Byte  4 Byte         4 Byte
armhf    -                 4 Byte  4 Byte         4 Byte
armhf    64                8 Byte  8 Byte         4 Byte
armhf    32                4 Byte  4 Byte         4 Byte
armhf    7                 4 Byte  4 Byte         4 Byte
a64armhf -                 4 Byte  4 Byte         FAIL*
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL*
a64armhf 7                 4 Byte  4 Byte         FAIL* 
aarch64  -                 8 Byte  8 Byte         8 Byte

*: Using FUSE filesystem with big d_off value.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 23:11     ` Marius Bakke
  2020-09-25 10:20       ` Danny Milosavljevic
  2020-09-25 20:03       ` Andreas Enge
@ 2020-09-29 20:52       ` Ludovic Courtès
  2020-09-29 22:09         ` Danny Milosavljevic
  2 siblings, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2020-09-29 20:52 UTC (permalink / raw)
  To: Marius Bakke; +Cc: Danny Milosavljevic, 43591

Hi,

Marius Bakke <marius@gnu.org> skribis:

> Arguably running code for foreign architectures through QEMU binfmt is
> something of a hack.  Mandating that every package *must* be patched to
> support it seems user-hostile.  I'm more in favor of dropping it on the
> build farm, or just keep fixing things on a per-package basis.

I’m fine with dropping things on the build farm; it’s just about
modifying machines-for-berlin.scm in maintenance.git.  Any takers?  :-)

> A less user-hostile solution could perhaps be to (setenv "CFLAGS"
> "-D_FILE_OFFSET_BITS=64") on 32-bit architectures in gnu-build-system.
> Not sure whether that could cause any adverse effects.  But again, I
> don't like the idea of optimizing for QEMUs user-mode emulation.

The above would override the default CFLAGS in Autoconf-generated
configure scripts (which is “-O2 -g”).  So we’d have to be cautious.
But I think a global solution is preferable to adding
-D_FILE_OFFSET_BITS=64 to tens of packages.

WDYT?

Ludo’.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-29 20:52       ` Ludovic Courtès
@ 2020-09-29 22:09         ` Danny Milosavljevic
  2020-09-30  9:32           ` Ludovic Courtès
  0 siblings, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-29 22:09 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 2349 bytes --]

Hi Ludo,

On Tue, 29 Sep 2020 22:52:10 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> Marius Bakke <marius@gnu.org> skribis:
> 
> > Arguably running code for foreign architectures through QEMU binfmt is
> > something of a hack.  Mandating that every package *must* be patched to
> > support it seems user-hostile.  I'm more in favor of dropping it on the
> > build farm, or just keep fixing things on a per-package basis.  
> 
> I’m fine with dropping things on the build farm; it’s just about
> modifying machines-for-berlin.scm in maintenance.git.  Any takers?  :-)

I don't know what "dropping things on the build farm" means in this context.
Dropping what exactly?

> The above would override the default CFLAGS in Autoconf-generated
> configure scripts (which is “-O2 -g”).  

That is correct.  I'm currently working on v2 (testing a patchset already)
and I totally forgot to add "-g -O2" the first time around.  Also, glibc
itself must NOT have -D_FILE_OFFSET_BITS=64 (it makes sense not to, too).

>So we’d have to be cautious.
> But I think a global solution is preferable to adding
> -D_FILE_OFFSET_BITS=64 to tens of packages.

I agree.

I still would like to see what actually changes--and I think with
guix-data-services it should actually be possible to compare derivations
before-and-after and find out which derivations of which packages changed
at all because of the global -D_FILE_OFFSET_BITS=64.  I'd like some help
using guix-data-services to find that out.  Otherwise a risk estimation
cannot be done.

Technically, if a package used direct assembly offsets (for some unfathomable
reason), it could have an undetectable problem with the size change of off_t
(and also struct dirent).  So examining the source code of the most essential
packages manually is still good.  That's what I did in
branch wip-file-offsets-64.

I'm in the process of testing a patchset that globally sets

  CFLAGS="-D_FILE_OFFSET_BITS=64 -g -O2"

instead.

That alone is not enough since there are a lot of non-autotools projects that
just ignore the environment variable entirely--not to mention languages other
than C.

I have access to bayfront--but I don't remember how to evaluate a new branch
there  (the new branch is "wip-file-offset-bits-64-sledgehammer").  How does
it work?

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 0/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-24 14:12 [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
  2020-09-24 14:16 ` Danny Milosavljevic
  2020-09-24 18:17 ` Marius Bakke
@ 2020-09-30  8:45 ` Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
                     ` (4 more replies)
  2 siblings, 5 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30  8:45 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

Linux kernel file offsets are always 64 bits.
But userspace can be built to use 32 bit offsets.

"struct dirent", returned by readdir, uses d_off to store
such an "offset" that it got from the Linux kernel.
In the case of ext4 that "offset" is actually a 64 bit
hash value.

Therefore, there are cases where such an offset that it got
from the Linux kernel does not fit in the "struct dirent"
field "d_off".

If the guest system's glibc is 32 bit AND uses 32 bit
file offsets it is going to be very confused.
It does check whether d_off fits into the structure
it gives back to the user--and it doesn't fit.  Hence readdir
fails, with errno == EOVERFLOW (which is undocumented and thus
an API error).
This manifests itself in simple directory reads not working
anymore in parts of cmake, for example.

This happened in Guix when building stuff for
ARMHF on a x86_64 build host using QEMU transparent emulation.

There is a very simple and complete way to avoid this problem:
Just always use 64 bit offsets in user space programs (also
on 32 bit machines).

Danny Milosavljevic (5):
  gnu: glibc-final: Catch all cases of a glibc user not requesting
    64-bit offsets and then using readdir regardless.
  build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want.
  gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS.
  gnu: glibc-mesboot0: Do not explicitly set _FILE_OFFSET_BITS.
  gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want.

 gnu/packages/base.scm           |  2 +
 gnu/packages/commencement.scm   | 70 ++++++++++++++++++++++++++++++++-
 gnu/packages/crypto.scm         |  9 ++++-
 guix/build/gnu-build-system.scm | 13 +++++-
 4 files changed, 90 insertions(+), 4 deletions(-)





^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless.
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
@ 2020-09-30  8:45   ` Danny Milosavljevic
  2020-09-30 16:55     ` Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 2/5] build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want Danny Milosavljevic
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30  8:45 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* gnu/packages/commencement.scm (glibc-final): Catch all cases of a glibc user not
requesting 64-bit offsets and then using readdir.
---
 gnu/packages/commencement.scm | 68 ++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index e5a4caa95c..284fa65d20 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -3462,7 +3462,73 @@ exec ~a/bin/~a-~a -B~a/lib -Wl,-dynamic-linker -Wl,~a/~a \"$@\"~%"
                ,hurd-headers-boot0)
              '())
        ,@(package-outputs glibc-final-with-bootstrap-bash))
-      ,@(package-arguments glibc-final-with-bootstrap-bash)))))
+      ,@(substitute-keyword-arguments
+         (package-arguments glibc-final-with-bootstrap-bash)
+         ((#:phases phases)
+          `(modify-phases ,phases
+             (add-after 'unpack 'patch-dirent
+               (lambda* (#:key outputs #:allow-other-keys)
+                 ;; Linux kernel file offsets are always 64 bits.
+                 ;; But userspace can be built to use 32 bit offsets.
+                 ;;
+                 ;; "struct dirent", returned by readdir, uses d_off to store
+                 ;; such an "offset" that it got from the Linux kernel.
+                 ;; In the case of ext4 that "offset" is actually a 64 bit
+                 ;; hash value.
+                 ;;
+                 ;; Therefore, there are cases where such an offset that it got
+                 ;; from the Linux kernel does not fit in the "struct dirent"
+                 ;; field "d_off".
+                 ;;
+                 ;; If the guest system's glibc is 32 bit AND uses 32 bit
+                 ;; file offsets it is going to be very confused.
+                 ;; It does check whether d_off fits into the structure
+                 ;; it gives back to the user--and it doesn't fit.  Hence readdir
+                 ;; fails, with errno == EOVERFLOW (which is undocumented and thus
+                 ;; an API error).
+                 ;; This manifests itself in simple directory reads not working
+                 ;; anymore in parts of cmake, for example.
+                 ;;
+                 ;; This manifested in Guix when building stuff for
+                 ;; ARMHF on a x86_64 build host using QEMU transparent emulation.
+                 ;;
+                 ;; There is a very simple and complete way to avoid this problem:
+                 ;; Just always use 64 bit offsets in user space programs (also
+                 ;; on 32 bit machines).  The Linux kernel does that already
+                 ;; anyway.
+                 ;;
+                 ;; Note: We might want to avoid using 64 bit when bootstrapping
+                 ;; using mescc (since mescc doesn't directly support 64 bit
+                 ;; values)--but then bootstrapping has to be done on a
+                 ;; file system other than ext4, or on ext4 with the feature
+                 ;; "dir_index" disabled.
+                 ;;
+                 ;; The change below does not affect 64 bit users.
+                 ;;
+                 ;; See <https://issues.guix.gnu.org/43513>.
+                 (let ((port (open-file "dirent/dirent.h" "a")))
+                   (display "
+#ifndef _LIBC
+#if __SIZEOF_LONG__ < 8
+#ifndef __USE_FILE_OFFSET64
+#if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 32
+#warning \"Using -D_FILE_OFFSET_BITS=32 and using readdir is a bad idea, see <https://bugzilla.kernel.org/show_bug.cgi?id=205957>\"
+#else
+#undef readdir
+#define readdir @READDIR_WITHOUT_FILE_OFFSET64_IS_A_REALLY_BAD_IDEA@
+#endif
+#endif
+#endif
+#endif
+" port)
+                   (close-port port))
+                 ;; This file includes <dirent.h> and thus checks sanity already.
+                 ;; TODO: Check dirent/scandir-tail.c, dirent/scandir64-tail.c.
+                 (substitute* "posix/glob.c"
+                  (("(#[ ]*define[ ][ ]*readdir)") "
+#undef readdir
+#define readdir"))
+                 #t)))))))))
 
 (define/system-dependent gcc-boot0-wrapped
   ;; Make the cross-tools GCC-BOOT0 and BINUTILS-BOOT0 available under the




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 2/5] build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want.
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
@ 2020-09-30  8:45   ` Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 3/5] gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS Danny Milosavljevic
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30  8:45 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* guix/build/gnu-build-system.scm (set-FILE-OFFSET-BITS): New procedure.
(%standard-phases): Add it.
---
 guix/build/gnu-build-system.scm | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/guix/build/gnu-build-system.scm b/guix/build/gnu-build-system.scm
index d3347c9518..1a1c4627ab 100644
--- a/guix/build/gnu-build-system.scm
+++ b/guix/build/gnu-build-system.scm
@@ -60,6 +60,17 @@ See https://reproducible-builds.org/specs/source-date-epoch/."
   (setenv "SOURCE_DATE_EPOCH" "1")
   #t)
 
+(define* (set-FILE-OFFSET-BITS #:rest _)
+  "Define '_FILE_OFFSET_BITS' using the C preprocessor.
+This ensures that 32 bit and 64 bit user space both can handle results
+returned from a 64 bit kernel.
+See https://bugzilla.kernel.org/show_bug.cgi?id=205957."
+  ;; Setting CFLAGS here makes packages not default CFLAGS.
+  ;; Since glibc (and probably other packages) don't like being built without optimization, enable optimization here.
+  (setenv "CFLAGS" "-D_FILE_OFFSET_BITS=64 -g -O2")
+  (setenv "CXXFLAGS" "-D_FILE_OFFSET_BITS=64 -g -O2")
+  #t)
+
 (define (first-subdirectory directory)
   "Return the file name of the first sub-directory of DIRECTORY."
   (match (scandir directory
@@ -803,7 +814,7 @@ which cannot be found~%"
   ;; Standard build phases, as a list of symbol/procedure pairs.
   (let-syntax ((phases (syntax-rules ()
                          ((_ p ...) `((p . ,p) ...)))))
-    (phases set-SOURCE-DATE-EPOCH set-paths install-locale unpack
+    (phases set-SOURCE-DATE-EPOCH set-FILE-OFFSET-BITS set-paths install-locale unpack
             bootstrap
             patch-usr-bin-file
             patch-source-shebangs configure patch-generated-file-shebangs




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 3/5] gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS.
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 2/5] build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want Danny Milosavljevic
@ 2020-09-30  8:45   ` Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 4/5] gnu: glibc-mesboot0: " Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 5/5] gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want Danny Milosavljevic
  4 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30  8:45 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* gnu/packages/base.scm (glibc)[arguments]<#:phases>[set-FILE-OFFSET-BITS]:
Delete phase.
---
 gnu/packages/base.scm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index 41976c5ab0..6f3c717b50 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -782,6 +782,8 @@ the store.")
 
       #:tests? #f                                 ; XXX
       #:phases (modify-phases %standard-phases
+                 ;; glibc itself should support both--so don't choose here.
+                 (delete 'set-FILE-OFFSET-BITS)
                  (add-before
                   'configure 'pre-configure
                   (lambda* (#:key inputs native-inputs outputs




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 4/5] gnu: glibc-mesboot0: Do not explicitly set _FILE_OFFSET_BITS.
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
                     ` (2 preceding siblings ...)
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 3/5] gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS Danny Milosavljevic
@ 2020-09-30  8:45   ` Danny Milosavljevic
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 5/5] gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want Danny Milosavljevic
  4 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30  8:45 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* gnu/packages/commencement.scm (glibc-mesboot0)[arguments]<#:phases>[set-FILE-OFFSET-BITS]:
Delete phase.
---
 gnu/packages/commencement.scm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index 284fa65d20..ad6a27ba6b 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -1398,6 +1398,8 @@ ac_cv_c_float_format='IEEE (little-endian)'
            ,(string-append "--prefix=" out)))
        #:phases
        (modify-phases %standard-phases
+         ;; glibc itself should support both--so don't choose here.
+         (delete 'set-FILE-OFFSET-BITS)
          (add-after 'unpack 'apply-boot-patch
            (lambda* (#:key inputs #:allow-other-keys)
              (and (let ((patch (assoc-ref inputs "boot-patch")))




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 5/5] gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want.
  2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
                     ` (3 preceding siblings ...)
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 4/5] gnu: glibc-mesboot0: " Danny Milosavljevic
@ 2020-09-30  8:45   ` Danny Milosavljevic
  4 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30  8:45 UTC (permalink / raw)
  To: 43591; +Cc: Danny Milosavljevic

* gnu/packages/crypto.scm (rhash)[arguments]<#:make-flags>:
Explicity declare the _FILE_OFFSET_BITS we want.
---
 gnu/packages/crypto.scm | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/gnu/packages/crypto.scm b/gnu/packages/crypto.scm
index 028c140185..a10dd62e8b 100644
--- a/gnu/packages/crypto.scm
+++ b/gnu/packages/crypto.scm
@@ -845,8 +845,13 @@ BLAKE.")
                                       "/bin/" ,target "-gcc"))
                      '())))
        #:make-flags
-       ;; The binaries in /bin need some help finding librhash.so.0.
-       (list (string-append "LDFLAGS=-Wl,-rpath=" %output "/lib"))
+       (list ;; This package uses a configure script that is not from GNU
+             ;; autotools; it doesn't handle the environment variable
+             ;; CFLAGS (or for that matter the configure option).
+             ;; Therefore, directly pass it to make.
+             "CFLAGS=-D_FILE_OFFSET_BITS=64"
+             ;; The binaries in /bin need some help finding librhash.so.0.
+             (string-append "LDFLAGS=-Wl,-rpath=" %output "/lib"))
        #:test-target "test"             ; ‘make check’ just checks the sources
        #:phases
        (modify-phases %standard-phases




^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-29 22:09         ` Danny Milosavljevic
@ 2020-09-30  9:32           ` Ludovic Courtès
  2020-09-30 10:28             ` Danny Milosavljevic
  0 siblings, 1 reply; 33+ messages in thread
From: Ludovic Courtès @ 2020-09-30  9:32 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: 43591, Marius Bakke

Hi,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> On Tue, 29 Sep 2020 22:52:10 +0200
> Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Marius Bakke <marius@gnu.org> skribis:
>> 
>> > Arguably running code for foreign architectures through QEMU binfmt is
>> > something of a hack.  Mandating that every package *must* be patched to
>> > support it seems user-hostile.  I'm more in favor of dropping it on the
>> > build farm, or just keep fixing things on a per-package basis.  
>> 
>> I’m fine with dropping things on the build farm; it’s just about
>> modifying machines-for-berlin.scm in maintenance.git.  Any takers?  :-)
>
> I don't know what "dropping things on the build farm" means in this context.
> Dropping what exactly?

Dropping emulated builds, or at least 32-bit emulated builds.  We just
need to remove build machines from the file above.

>> The above would override the default CFLAGS in Autoconf-generated
>> configure scripts (which is “-O2 -g”).  
>
> That is correct.  I'm currently working on v2 (testing a patchset already)
> and I totally forgot to add "-g -O2" the first time around.  Also, glibc
> itself must NOT have -D_FILE_OFFSET_BITS=64 (it makes sense not to, too).
>
>>So we’d have to be cautious.
>> But I think a global solution is preferable to adding
>> -D_FILE_OFFSET_BITS=64 to tens of packages.
>
> I agree.
>
> I still would like to see what actually changes--and I think with
> guix-data-services it should actually be possible to compare derivations
> before-and-after and find out which derivations of which packages changed
> at all because of the global -D_FILE_OFFSET_BITS=64.  I'd like some help
> using guix-data-services to find that out.  Otherwise a risk estimation
> cannot be done.

A change in gnu-build-system would change all the derivations.  I don’t
think the Data Service can help us here.

> Technically, if a package used direct assembly offsets (for some unfathomable
> reason), it could have an undetectable problem with the size change of off_t
> (and also struct dirent).  So examining the source code of the most essential
> packages manually is still good.  That's what I did in
> branch wip-file-offsets-64.

Yeah.

> I'm in the process of testing a patchset that globally sets
>
>   CFLAGS="-D_FILE_OFFSET_BITS=64 -g -O2"
>
> instead.

OK.

> That alone is not enough since there are a lot of non-autotools projects that
> just ignore the environment variable entirely--not to mention languages other
> than C.

Yeah…

I have mixed feelings: fixing packages one by one doesn’t sound great,
but OTOH setting the ‘CFLAGS’ environment variable globally can have
unexpected side effects in some cases (overriding package-specific
CFLAGS) and zero effects in other cases (for non-Autoconf packages or
badly-written ‘configure.ac’ files), both of which would be hard to
detect.

~~~

If we take a step back: what’s the problem?  We have a problem with
emulated 32-bit architectures on 64-bit architectures.  But that’s OK,
we can stop those emulations for now.  Then we have packages that do not
support large files; it’s not great but evidently we can live with it.
:-)  Ideally, we’d report it upstream when we encounter it.

So to me that hints at targeted fixes: fixing key packages like CMake
(roughly what you already did) where lack of large file support can be
problematic.

Thoughts?

Ludo’.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-30  9:32           ` Ludovic Courtès
@ 2020-09-30 10:28             ` Danny Milosavljevic
  2020-10-01  7:14               ` Ludovic Courtès
  0 siblings, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30 10:28 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 4628 bytes --]

Hi Ludo,

On Wed, 30 Sep 2020 11:32:58 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> Dropping emulated builds, or at least 32-bit emulated builds.  We just
> need to remove build machines from the file above.

Oh.

Do we have real armhf machines?  (as in not aarch64)

*Looks at guix-maintenance*  We do.  Awesome.  Then sure.

But just to be clear, WE MUST NOT USE aarch64 to build armhf as long as this
problem isn't fixed.

This problem has nothing to do with emulation.

> A change in gnu-build-system would change all the derivations.  I don’t
> think the Data Service can help us here.

I want to know what actually changes in the final binaries.  Surely that works
somehow--guix data services or not.

Basically, for each package in Guix,

  diff -r `~/broken-guix/pre-inst-env guix build $package` `~/fixed-guix/pre-inst-env guix build $package` || echo "affected: $package"

but after replacing references by deduplicated content addressed references
(for example if derivation A refers to files in derivation B, but derivation B
only changed the directory name it's in and not the content of the derivation,
then that should not count as a diff in A.  That should happen recursively).

Basically ignore the directory names in /gnu/store and make new directory
names that are the hash of each directory's (recursive) CONTENTS (after
fixing references in that content, too, obviously)--as opposed to sources.
Then diff those.

If necessary, I can run that on my laptop--it will just take several weeks
and miss derivations I don't have in the first place.

> I have mixed feelings: fixing packages one by one doesn’t sound great,
> but OTOH setting the ‘CFLAGS’ environment variable globally can have
> unexpected side effects in some cases (overriding package-specific
> CFLAGS) and zero effects in other cases (for non-Autoconf packages or
> badly-written ‘configure.ac’ files), both of which would be hard to
> detect.

The latter is easy to detect since I patched dirent.h in glibc exactly for that
reason.  That way, glibc WON'T let you use it wrong (except if you explicitly
ask for it).  On Guix systems, there is no legitimate reason to use it wrong
in the first place.

In my opinion, not having an automated way to tell us when a package is using
it wrong would be not doing our due diligence--how would you know we had
actually fixed the problem for good?  You wouldn't know.
And you wouldn't have fixed the problem for good--I can tell you that much now.

There already was an essential package, rhash, which didn't pick up the
CFLAGS--and that's how I found it.  It's easy.

About the unexpected side effects--yes, that's right.  That's why we should get
a list of diff results (see above for the command) and then manually look at
the source code of those packages and their dependencies.

> If we take a step back: what’s the problem?

It means we have no trustworthy i686 packages, which means we do not have a
trustworthy full source bootstrap using Mes (since that uses i686 executable
to bootstrap).

In practice, this problem is not so bad since the kernel on i686 has a compat
layer that hasn't been telling us the truth for d_off, so we should be "good".
But philosophically, we are doing it dead wrong.

Also, this won't work on armhf or any other 32 bit architecture--so there,
we would be both philosophically and practically wrong.

Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.
It totally DOES wind up telling us the truth sometimes (see my earlier test
table)--and then we have a problem.

>  Then we have packages that do not
> support large files; it’s not great but evidently we can live with it.
> :-)  Ideally, we’d report it upstream when we encounter it.

I really don't care about large file support.  That's mostly a bonus we get
while fixing this whole ordeal the right way.  That said, maybe users
care they actually can store a 5 GiB file on their 4000 GiB drive on armhf :P

> So to me that hints at targeted fixes: fixing key packages like CMake
> (roughly what you already did) where lack of large file support can be
> problematic.

As long as we patch glibc's dirent.h so it tells us when we are doing stupid
stuff, we can't go wrong much.  So sure.

As I said, the main problem is FINDING the affected packages.  It's not like
they'll fail building (in general).  They'll just do weird stuff at runtime
instead.  Case in point: cmake DID NOT fail building.  And it totally is
broken.

Everything after that is easy.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless.
  2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
@ 2020-09-30 16:55     ` Danny Milosavljevic
  0 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-09-30 16:55 UTC (permalink / raw)
  To: 43591

[-- Attachment #1: Type: text/plain, Size: 1336 bytes --]

Still broken: boost.

In file included from ./boost/interprocess/detail/os_file_functions.hpp:40:0,
                 from ./boost/interprocess/mapped_region.hpp:29,
                 from libs/log/src/posix/ipc_reliable_message_queue.cpp:51:
./boost/interprocess/detail/os_file_functions.hpp:636:16: error: stray ‘@’ in program
    while((de=::readdir(d))) {
[...]
build of /gnu/store/j29v8ph3asv4sywi0437hs6wwd9knvy2-boost-1.74.0.drv failed

That means that boost, or that source file, is not being compiled with
_FILE_OFFSET_BITS=64 .

Nowhere does the header file boost/interprocess/detail/os_file_functions.hpp
even check the value of _FILE_OFFSET_BITS.

./libs/filesystem/src/platform_config.hpp is responsible for setting up
_FILE_OFFSET_BITS in general.

That means that platform_config.hpp is not being #included in the former.

boost/interprocess/errors.hpp:// Parts of this code are taken from boost::filesystem library
boost/interprocess/errors.hpp://  See library home page at http://www.boost.org/libs/filesystem

But that means that this does not USE the "filesystem" library and neither does
it set _FILE_OFFSET_BITS like the "filesystem" library would.

All this happens even after I added a phase to gnu-build-system to set
_FILE_OFFSET_BITS unconditionally, so boost ignores that, too.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-09-30 10:28             ` Danny Milosavljevic
@ 2020-10-01  7:14               ` Ludovic Courtès
  2020-10-02  7:18                 ` Danny Milosavljevic
  2020-10-06 15:39                 ` Danny Milosavljevic
  0 siblings, 2 replies; 33+ messages in thread
From: Ludovic Courtès @ 2020-10-01  7:14 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: 43591, Marius Bakke

Hi,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> On Wed, 30 Sep 2020 11:32:58 +0200
> Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Dropping emulated builds, or at least 32-bit emulated builds.  We just
>> need to remove build machines from the file above.
>
> Oh.
>
> Do we have real armhf machines?  (as in not aarch64)
>
> *Looks at guix-maintenance*  We do.  Awesome.  Then sure.
>
> But just to be clear, WE MUST NOT USE aarch64 to build armhf as long as this
> problem isn't fixed.
>
> This problem has nothing to do with emulation.

Now I’m lost; I thought this had to do with qemu-user.

Could you propose a patch for maintenance.git?

> I want to know what actually changes in the final binaries.  Surely that works
> somehow--guix data services or not.
>
> Basically, for each package in Guix,
>
>   diff -r `~/broken-guix/pre-inst-env guix build $package` `~/fixed-guix/pre-inst-env guix build $package` || echo "affected: $package"
>
> but after replacing references by deduplicated content addressed references
> (for example if derivation A refers to files in derivation B, but derivation B
> only changed the directory name it's in and not the content of the derivation,
> then that should not count as a diff in A.  That should happen recursively).

Yeah, you “just” need to compare modulo store file names…

>> I have mixed feelings: fixing packages one by one doesn’t sound great,
>> but OTOH setting the ‘CFLAGS’ environment variable globally can have
>> unexpected side effects in some cases (overriding package-specific
>> CFLAGS) and zero effects in other cases (for non-Autoconf packages or
>> badly-written ‘configure.ac’ files), both of which would be hard to
>> detect.
>
> The latter is easy to detect since I patched dirent.h in glibc exactly for that
> reason.  That way, glibc WON'T let you use it wrong (except if you explicitly
> ask for it).  On Guix systems, there is no legitimate reason to use it wrong
> in the first place.

I’m very reluctant to patching public libc headers.  Also, it’s not just
“our” problem, we should definitely discuss it with upstream and perhaps
propose your dirent.h patch.

I’m also not sure what you mean by “using it wrong”, what is “it”?

Building without _FILE_OFFSET_BITS=64 is still a valid option, albeit
one that is not recommended.

> About the unexpected side effects--yes, that's right.  That's why we should get
> a list of diff results (see above for the command) and then manually look at
> the source code of those packages and their dependencies.

A diff at one point in time (if we ever managed to get a usable diff) is
not enough: problems could pop up anytime.  Setting ‘CFLAGS’ globally as
an environment variable seems risky.

>> If we take a step back: what’s the problem?
>
> It means we have no trustworthy i686 packages, which means we do not have a
> trustworthy full source bootstrap using Mes (since that uses i686 executable
> to bootstrap).
>
> In practice, this problem is not so bad since the kernel on i686 has a compat
> layer that hasn't been telling us the truth for d_off, so we should be "good".
> But philosophically, we are doing it dead wrong.
>
> Also, this won't work on armhf or any other 32 bit architecture--so there,
> we would be both philosophically and practically wrong.
>
> Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.
> It totally DOES wind up telling us the truth sometimes (see my earlier test
> table)--and then we have a problem.

Hmm I guess I need to re-read all that, I’m overwhelmed.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-10-01  7:14               ` Ludovic Courtès
@ 2020-10-02  7:18                 ` Danny Milosavljevic
  2020-10-02  8:12                   ` Danny Milosavljevic
  2020-10-02  9:32                   ` Danny Milosavljevic
  2020-10-06 15:39                 ` Danny Milosavljevic
  1 sibling, 2 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-10-02  7:18 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 6534 bytes --]

Hi Ludo,

On Thu, 01 Oct 2020 09:14:10 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> > This problem has nothing to do with emulation.  
> 
> Now I’m lost; I thought this had to do with qemu-user.

I had thought so, too, a few weeks ago.  But that's not the case.

It's not at all related to qemu.

The problem is a fundamental problem: a 64 bit value does NOT fit into a 32 bit
slot.

glibc uses getdents64 to get 64 bit dents and then acts all surprised and
errory when it gets 64 bit dents. (also on 32 bit glibc)

The same happens natively when using armhf on aarch64, without qemu-user.
That's what the table I sent was all about.

Calling getdents64 is not the problem--glibc has to do that, otherwise a 32 bit
glibc won't work RELIABLY on a 64 bit kernel anyway.

And emitting an error it does because we do not enable large file support.

This table (updated after I finally compiled a guix gcc-toolchain-10 with the
(unpatched) glibc in question on armhf):

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         FAIL*
i686     32                4 Byte  4 Byte         FAIL*
i686     7                 4 Byte  4 Byte         4 Byte
armhf    -                 4 Byte  4 Byte         FAIL*
armhf    64                8 Byte  8 Byte         8 Byte
armhf    32                4 Byte  4 Byte         FAIL*
armhf    7                 4 Byte  4 Byte         FAIL*
a64armhf -                 4 Byte  4 Byte         FAIL*
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL*
a64armhf 7                 4 Byte  4 Byte         FAIL* 
aarch64  -                 8 Byte  8 Byte         8 Byte

*: Using FUSE filesystem with big d_off value.

None of those tests were done with qemu.  They were all native.

That's why I wanted access to real aarch64 machines--otherwise I could have
done it with qemu on my x86_64 computer :P

> I’m very reluctant to patching public libc headers.

Well, I don't like it either--that's why it's very very careful.  My patch
doesn't change anything that users experience at runtime and basically just
prevents developers from compiling something that is using readdir without
thinking about large files first (because they HAVE TO if their programs run
on a host kernel that has bigger d_off--there's no sane way around it).

If they absolutely want to, they can set _FILE_OFFSET_BITS=32 and it will
let them do it (the argument by Marius is that people might want to do that
on embedded.  But that means they'll sometimes have readdir fail--depending
on their setup (also on 32 bit kernels).  Embedded is not specially exempt
from ths bug ;) ).

I think that this patch is guix-specific in the sense that it happens
pretty often that we do "-s i686-linux" on x86_64, "-s armhf-linux" on
aarch64 and expect that to work.  And there's no qemu we could even patch in
those cases, because no qemu is used in the first place.

>  Also, it’s not just
> “our” problem, we should definitely discuss it with upstream and perhaps
> propose your dirent.h patch.

Sure.  I think 15 years of migration path to 64 bit off_t was more than enough.

Now, I'd prefer if glibc made people choose _FILE_OFFSET_BITS explicitly on
32 bit.  Everything else is a hack that WILL break unexpectedly.  Users still
can choose _FILE_OFFSET_BITS=32, if they want.

> I’m also not sure what you mean by “using it wrong”, what is “it”?

"it" is users calling readdir() without defining _FILE_OFFSET_BITS=64 in their
source file / Makefile.
This causes glibc to call getdents64 and then act all surprised when it gets
a 64 bit result back.

> > Also, this won't work on armhf or any other 32 bit architecture--so there,
> > we would be both philosophically and practically wrong.
> >
> > Also, the "not telling us the truth for d_off on i686" is a leaky compat layer.
> > It totally DOES wind up telling us the truth sometimes (see my earlier test
> > table)--and then we have a problem.  
> 
> Hmm I guess I need to re-read all that, I’m overwhelmed.

Yeah--it's understandable.  I'm working on understanding and fixing this problem
for a hundred hours now--it took forever for me to get to the bottom of this,
too.

And in the beginning I, too, suspected qemu.  But it's totally blameless.
Nothing should be changed in qemu-user or in our qemu binfmt service.

The fundamental problem is that POSIX specifies that telldir and seekdir must
exist, and return and take a LONG, respectively.

That means that glibc has to preserve d_off it got from getdents64 (size is
64 bits), otherwise how would seekdir work?

But the offset parameter of seekdir is standardized as LONG, which means that it
won't work in the first place on 32 bit when there is either a 64 bit kernel or
a filesystem that just happens to store bigger stuff.

So glibc chose to check whether the getdents64 d_off just happens to fit into
the LONG this time around it was called.  I argue that that is insane.  It
would be better to always fail, or never fail--not only fail on the first d_off
that is > 2**32.  When that happens is a filesystem implementation detail :P

I think the assumption was that the kernel would store an actual offset into
d_off.  But it doesn't--it stores a hash in the case of ext4 (and probably
in other cases).

And in any case, even if it was an offset, that is still an unsafe way to fix
the problem.

First, someone needs to fix the POSIX standard to say "off_t", not "long".

Then, distributions who want to use 32 bit userland on 64 bit kernel need
to enable large files globally.  That is a choice a distribution has to make.

Not making a choice is a choice too--the behavior will be random, and if
my research in wip-file-offset-bits-64 is any indication then very
fundamental things will be broken at unexpected places, and usually it DOES
NOT result in a build failure (without my glibc patch).  That basically
means that using 32 bit native on 64 bit kernel cannot be supported in Guix
if no choice is made.

If choice "yes" is made, one needs to have a way to find these
non-build-breaking using-readdir-wrong packages.  How would an alternative
way to do this look?

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-10-02  7:18                 ` Danny Milosavljevic
@ 2020-10-02  8:12                   ` Danny Milosavljevic
  2020-10-02  9:47                     ` Danny Milosavljevic
  2020-10-02  9:32                   ` Danny Milosavljevic
  1 sibling, 1 reply; 33+ messages in thread
From: Danny Milosavljevic @ 2020-10-02  8:12 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 501 bytes --]

> My patch
> doesn't change anything that users experience at runtime and basically just
> prevents developers from compiling something that is using readdir without
> thinking about large files first (because they HAVE TO if their programs run
> on a host kernel that has bigger d_off--there's no sane way around it).

And just to be clear, they still SHOULD choose whether or not to support
large files even if the host kernel is 32 bit (!!!!)--see table.  It's
just less common there.



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-10-02  7:18                 ` Danny Milosavljevic
  2020-10-02  8:12                   ` Danny Milosavljevic
@ 2020-10-02  9:32                   ` Danny Milosavljevic
  1 sibling, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-10-02  9:32 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 1085 bytes --]

Updated table after more tests:

system   _FILE_OFFSET_BITS off_t   d_off-sizeof   d_off-values
---------------------------------------------------------------
x86_64   -                 8 Byte  8 Byte         8 Byte
i686     -                 4 Byte  4 Byte         4 Byte
i686     64                8 Byte  8 Byte         FAIL*
i686     32                4 Byte  4 Byte         FAIL*
i686     7                 4 Byte  4 Byte         FAIL*
armhf    -                 4 Byte  4 Byte         FAIL*
armhf    64                8 Byte  8 Byte         8 Byte
armhf    32                4 Byte  4 Byte         FAIL*
armhf    7                 4 Byte  4 Byte         FAIL*
a64armhf -                 4 Byte  4 Byte         FAIL*
a64armhf 64                8 Byte  8 Byte         8 Byte
a64armhf 32                4 Byte  4 Byte         FAIL*
a64armhf 7                 4 Byte  4 Byte         FAIL* 
aarch64  -                 8 Byte  8 Byte         8 Byte

*: Using FUSE filesystem with big d_off value.

None of those tests were done with qemu.  They were all native.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-10-02  8:12                   ` Danny Milosavljevic
@ 2020-10-02  9:47                     ` Danny Milosavljevic
  0 siblings, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-10-02  9:47 UTC (permalink / raw)
  Cc: Ludovic Courtès, 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 73 bytes --]

glibc bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=23960

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir.
  2020-10-01  7:14               ` Ludovic Courtès
  2020-10-02  7:18                 ` Danny Milosavljevic
@ 2020-10-06 15:39                 ` Danny Milosavljevic
  1 sibling, 0 replies; 33+ messages in thread
From: Danny Milosavljevic @ 2020-10-06 15:39 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 43591, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 2018 bytes --]

Hi Ludo,

On Thu, 01 Oct 2020 09:14:10 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> Could you propose a patch for maintenance.git?

OK--patch #43829.

> >> I have mixed feelings: fixing packages one by one doesn’t sound great,
> >> but OTOH setting the ‘CFLAGS’ environment variable globally can have
> >> unexpected side effects in some cases (overriding package-specific
> >> CFLAGS) 

Doing (setenv CFLAGS "-g -O2 -D_FILE_OFFSET_BITS=64") right after
phase 'set-SOURCE-DATE-EPOCH doesn't override package-specific CFLAGS.

Quite the opposite can happen, though.  But:

> > The latter is easy to detect since I patched dirent.h in glibc exactly for that
> > reason.  That way, glibc WON'T let you use it wrong (except if you explicitly
> > ask for it).  On Guix systems, there is no legitimate reason to use it wrong
> > in the first place.  
> 
> I’m very reluctant to patching public libc headers.  Also, it’s not just
> “our” problem, we should definitely discuss it with upstream and perhaps
> propose your dirent.h patch.

I've reported it upstream.

However, GNU gcc and glibc support a lot of weird architectures--but Guix
system really doesn't.  So it's much easier for us to get a good patch than
it is for them.

> A diff at one point in time (if we ever managed to get a usable diff) is
> not enough: problems could pop up anytime.  Setting ‘CFLAGS’ globally as
> an environment variable seems risky.

We are about 15 years late--so all other distributions already triggered
most of the bugs in that time.  I don't think it's that bad anymore...

That's why I would prefer setting CFLAGS globally anyway.

And I don't have the energy to manually FIND AND fix however many packages
are affected otherwise.

Having this problem in 2020 is ridiculous--it's like someone accidentially
enabled a time machine...

The only reason this didn't fall on our head on x86_64 is because on 64 bit
systems something like it is the default anyway.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2020-10-06 15:41 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-09-24 14:12 [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
2020-09-24 14:16 ` Danny Milosavljevic
2020-09-24 18:17 ` Marius Bakke
2020-09-24 20:27   ` Danny Milosavljevic
2020-09-24 23:11     ` Marius Bakke
2020-09-25 10:20       ` Danny Milosavljevic
2020-09-25 10:42         ` [bug#43591] [PATCH v2 core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
2020-09-25 13:36         ` [bug#43591] [PATCH core-updates] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir Danny Milosavljevic
2020-09-25 15:33           ` Danny Milosavljevic
2020-09-26  1:42           ` Danny Milosavljevic
2020-09-26  1:49             ` Danny Milosavljevic
2020-09-29 14:51               ` Danny Milosavljevic
2020-09-27  6:43           ` Efraim Flashner
2020-09-25 20:03       ` Andreas Enge
2020-09-26 10:50         ` Danny Milosavljevic
2020-09-29 20:52       ` Ludovic Courtès
2020-09-29 22:09         ` Danny Milosavljevic
2020-09-30  9:32           ` Ludovic Courtès
2020-09-30 10:28             ` Danny Milosavljevic
2020-10-01  7:14               ` Ludovic Courtès
2020-10-02  7:18                 ` Danny Milosavljevic
2020-10-02  8:12                   ` Danny Milosavljevic
2020-10-02  9:47                     ` Danny Milosavljevic
2020-10-02  9:32                   ` Danny Milosavljevic
2020-10-06 15:39                 ` Danny Milosavljevic
2020-09-25 10:24   ` Danny Milosavljevic
2020-09-30  8:45 ` [bug#43591] [PATCH core-updates v2 0/5] " Danny Milosavljevic
2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 1/5] gnu: glibc-final: Catch all cases of a glibc user not requesting 64-bit offsets and then using readdir regardless Danny Milosavljevic
2020-09-30 16:55     ` Danny Milosavljevic
2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 2/5] build-system/gnu: Explicity declare the _FILE_OFFSET_BITS we want Danny Milosavljevic
2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 3/5] gnu: glibc: Do not explicitly set _FILE_OFFSET_BITS Danny Milosavljevic
2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 4/5] gnu: glibc-mesboot0: " Danny Milosavljevic
2020-09-30  8:45   ` [bug#43591] [PATCH core-updates v2 5/5] gnu: rhash: Explicity declare the _FILE_OFFSET_BITS we want Danny Milosavljevic

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).