Hello everyone,

This is going to be a long post, but that's mostly because it includes a
write-up of how many things currently function.  Let's start with the
TL;DR: I think we need to rethink the way we build and package Guix, so
that

a) the guix package is always current compared to the guix that
   contains it;
b) we can say with confidence that the development, in-tree Guix, the
   `guix pull`ed Guix and the guix package are all built in exactly the
   same way, with *extremely minor* differences in areas such as
   commit/version embedding.

I know this isn't a light proposition, but this would solve a swath of
issues, including but not limited to:

* the system-wide guix not working well when interacting with a
  `guix pull`ed guix;
* having to update the guix package twice for the installer to pick up
  needed changes;
* issues that only show up for people using the system-wide Guix and
  that the local development setup doesn't catch (e.g. [1]);
* installer images never being actually "current";
* installer images not carrying the closure of %bare-bones-os
  corresponding to the guix version they will use;
* not being able to test installer/guix-daemon/guix-build-process
  changes without tinkering with the guix package definition.

Two recent things have brought today's subject to the table:

* a recent surge of problems related to the new manifest format (4)
  that rendered some users' Guixes unusable;
* me working on adding a C Guile extension to Guix that would add an
  interface for posix_spawn, resolving some of our issues with resource
  leakage and deadlocks in inferiors (as well as in other places, I'm
  certain (yes I'm looking at the installer)).

The elephant in the room in both of these situations is the myriad of
different ways we have of building a Guix, their versioning, and how
these different products interact with each other on a running system.

Let's start with the first case.
Commit 4ff12d1de7cd617b791996ee7ca1240660b4c20e changed the version of
(internal) profile manifests produced by guix commands to 4, along with
other changes.  Commit 06493e738825598447f5b45d7100ca7eff8b669d updated
the guix package (the one in gnu/packages/package-management.scm) to a
commit that includes the former one.  Now comes the issue: systems
installed using a commit between those two would produce a system
profile that uses manifest version 4, but with a Guix that cannot read
such a manifest.  This won't be an issue for people who have already
installed their system and `guix pull`ed, but on fresh systems
`guix pull` and `guix describe` will both bail out: they want to read
the manifest of the profile they're running from, since it might contain
their commit SHA (guix pull uses it to prevent downgrades), but they
don't understand it.  [Notes about why it searches there later on]  This
effectively locks the user out of updating their guix.

In the second case, I am in the process of adding a very simple Guile C
extension to Guix that only needs to wrap a simple libc function.  The
C code itself took approx. 5% of my time on it, while adding the magical
invocations for the Autotools took 35%, and now testing the changes is
taking 60%, because I need to test that `guix pull` works, that the
local development setup works, and that the guix package works, all with
a local change that cannot be authenticated and that also cannot be
referred to with a git-reference [more on that later] since it's
entirely local.  This is causing me many headaches, and I don't think
that even after manually testing this with handwritten hacks so that
these changes actually do appear everywhere, I will be confident enough
for them to be merged.  In any case, it's extremely difficult for
someone who hasn't just stared at the code for a long time to discern
the different issues that might arise when adding non-trivial changes to
Guix.
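
As a rough illustration of the first failure mode, here is a minimal
sketch in shell.  It is not the code Guix actually runs (that lives in
Scheme); the helper names `manifest_version` and `can_read_manifest` are
made up for this example, and it assumes the manifest file declares its
format as `(version N)` near the top.  The point is only that a reader
which knows formats up to some version has to refuse anything newer.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the format version declared at the top of a profile manifest.
# Assumes the first "(version N)" in the file is the manifest's own.
manifest_version() {
    sed -n 's/^[^0-9]*(version \([0-9][0-9]*\)).*/\1/p' "$1" | head -n 1
}

# Succeed only if the manifest's version is one this reader understands.
# $1 = manifest file, $2 = highest format version the reader supports.
can_read_manifest() {
    version=$(manifest_version "$1")
    [ -n "$version" ] && [ "$version" -le "$2" ]
}
```

With a version-4 manifest, `can_read_manifest FILE 3` fails while
`can_read_manifest FILE 4` succeeds, which mirrors a system profile
written in the new format being read by a guix that predates it.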
Let me try to spell out the different ways that we build Guix:

a) The local, development way.  This uses autotools, with configure.ac
   and config-daemon.ac as autoconf inputs, and Makefile.am,
   gnu/local.mk and nix/local.mk as automake inputs.  Basically, that
   means `autoreconf`, `./configure` followed by `make`, the standard
   gnu-build-system fare.  This is the number 1 way that guix
   developers build their copy, and the one they interact with the
   most.  Currently, the resulting Guix doesn't know any of its
   provenance info, i.e. guix/config.scm isn't properly filled out.

b) The guix package, gnu-build-system way.  This is quite similar to
   a), except that we properly fill out guix/config.scm, hardcode the
   paths to compressors, and wrap `guix` so that it finds the proper
   guile and extensions.  We also disable some tests that fail because
   of the build container.  This package is used by `guix system` to
   install the system-wide guix to the system profile, which is used by
   users that haven't `guix pull`ed yet.  This is necessarily *always*
   out of date, because we can't see commits in the future :p.

c) The `guix pull` way.  This is a bit of a hybrid.  Internally, it
   amounts to eval'ing build-aux/build-self.scm inside of the currently
   running guix, which returns a procedure which, given a source code
   checkout of Guix, returns a derivation that builds the new Guix.
   This itself relies on (guix self) where most of the building code
   is.  The scheme modules are compiled in an ad-hoc manner (not
   following Makefile.am), and files are included without consulting
   the Makefile.  This is why `guix pull` users are not affected by
   [1].  But then, there still is an issue with the guix daemon (and in
   the future the C extensions), which is C code.  Since we're not
   pretending to know how to universally configure C packages, we rely
   on the guix-daemon package defined in
   gnu/packages/package-management.scm of the future Guix, which
   inherits from the guix package itself, and then builds itself
   roughly following b), meaning the daemon is still out-of-date.

This way, you can see why many issues arise: the double update needed
for the installer is because the guix that will be installed in the
final system corresponds to the guix package that's defined inside the
guix on the installer OS, itself corresponding to the guix package
defined inside the one running `guix system image` (or similar).  So if
we want an update to hit the system-wide Guix for the end-user after
installing, we need 2 linked updates.

What I personally think is that we should rationalize the way we
interact with Guix source: a running Guix should always be able to hold
a reference to its source.  The guix package or future equivalent
(i.e. good for internal consumption) should always refer to that same
source, but that will also require factoring the daemon (and extensions)
out of the repository, so that the C code doesn't get compiled again on
every unrelated commit.  Finally, and I think this is the most
challenging one, we should try to keep the differences between a) and c)
to the minimum, meaning that one way of building has to go.  This is a
big change, conceptually and technically, and I understand that this
might be way more complicated than we'd like, but I think this needs to
be done at some point.

WDYT?

[1] https://issues.guix.gnu.org/52572
(20211217222522.2440-1-dev@jpoiret.xyz)

--
Josselin Poiret
Josselin Poiret <dev@jpoiret.xyz> writes:

> In the second case, I am in the process of adding a very simple Guile C
> extension to Guix that only requires to wrap a simple libc function.
> The C code itself took approx. 5% of my time on it, while adding the
> magical invocations for the Autotools took 35%, and now testing the
> changes is taking 60%,

If your foreign function use case is very trivial, why not give Guile's
dynamic FFI a try?

> because I need to test that `guix pull` works,
> that the local development setup works, and that the guix package works,
> all with a local change that cannot be authenticated and that also
> cannot be referred to with a git-reference [more on that later] since
> it's entirely local.  This is causing me many headaches, and I don't
> think that even after manually testing this with handwritten hacks so
> that these changes actually do appear everywhere, I will be confident
> enough for them to be merged.

It's possible to use a Guix channel to test a local guix repo.  A short
example:

```
# Write a channel file pointing at the local checkout and branch.
cat > local-channels.scm << "EOF"
(list (channel
        (inherit %default-guix-channel)
        (url "/home/foo/bar/path/to/local/guix/repo")
        (branch "test-branch")))
EOF
# Build with that Guix; local commits are unsigned, so skip authentication.
guix time-machine -C local-channels.scm --disable-authentication -- build hello
```

> c) The `guix pull` way.  This is a bit of an hybrid.  Internally, it
> amounts to eval'ing build-aux/build-self.scm inside of the currently
> running guix, which returns a procedure which, given a source code
> checkout of Guix, returns a derivation that builds the new Guix.  This
> itself relies on (guix self) where most of the building code is.  The
> scheme modules are compiled in an ad-hoc manner (not following
> Makefile.am), and files are included without consulting the Makefile.
> This is why `guix pull` users are not affected by [1].  But then, there
> still is an issue with the guix daemon (and in the future the C
> extensions), which is C code.  Since we're not pretending to know how to
> universally configure C packages, we rely on the guix-daemon package
> defined in gnu/packages/package-management.scm of the future Guix, which
> inherits from the guix package itself, and then builds itself roughly
> following b), meaning the daemon is still out-of-date.

This is somewhat "the bootstrap problem".  It would be ideal if we could
describe the build graph in Guix with derivations, but we still need a
daemon first to process derivations, so we need to build the daemon
without Guix.

This issue may be solved by rewriting the daemon in Guile: if the daemon
is written in Guile, we can run it without compilation.

--
Retrieve my PGP public key:

  gpg --recv-keys D47A9C8B2AE3905B563D9135BE42B352A9F6821F

Zihao
Hello,

Zhu Zihao <all_but_last@163.com> writes:

> If your foreign function use case is very trivial? Why not give Guile
> dynamic FFI a try?

That could be another option, but I'd like to have autoconf be able to
detect whether the target supports things like posix_spawn and
getrlimit, which I use in the code.

> It's possible to use guix channel to test a local guix repo. A short
> example here.

Right, but for the example of adding extensions it won't work since
there's a part of `guix pull` that uses the guix package, as well as in
the installer or system-wide Guix, as I outlined before.  The issue [1]
I outlined in the opening mail was an issue that is specific to the guix
package method, so there really isn't a way to uniformly test changes
without knowing the intricacies of the different builds and where they
end up (I do know these, having run into these issues myself before).

> This is somewhat "the bootstrap problem". It's very ideal if we can
> describe the build graph in Guix with derivations. But we still need a
> daemon first to process derivations. So we need to build daemon without
> Guix.

I don't think that's the case, (guix self) relies on a working daemon
connection before anything else, the built daemon will just be a part of
the resulting `guix pull` profile, but won't be used to build the new
Guix (as a matter of fact, the build daemon is built... using the build
daemon!).

> This issue may be solved by rewriting daemon in Guile. If daemon is
> written in Guile. We can run it without compilation.

I don't think this is directly related, although some changes that we
could bring to it would definitely ease what I'm proposing here: having
a way to build things directly without relying on a root-owned daemon
running would make the bootstrapping problem easier to solve.

Best,
--
Josselin Poiret
Hi Josselin,
I have some naive questions below :)
On +2022-07-07 16:34:17 +0200, Josselin Poiret wrote:
> Hello,
>
> Zhu Zihao <all_but_last@163.com> writes:
>
> > If your foreign function use case is very trivial? Why not give Guile
> > dynamic FFI a try?
>
> That could be another option, but I'd like to have autoconf be able to
> detect whether the target supports things like posix_spawn and
> getrlimit, which I use in the code.
>
> > It's possible to use guix channel to test a local guix repo. A short
> > example here.
>
> Right, but for the example of adding extensions it won't work since
> there's a part of `guix pull` that uses the guix package, as well as in
> the installer or system-wide Guix, as I outlined before. The issue [1]
> I outlined in the opening mail was an issue that is specific to the guix
> package method, so there really isn't a way to uniformly test changes
> without knowing the intricacies of the different builds and where they
> end up (I do know these, having run into these issues myself before).
>
> > This is somewhat "the bootstrap problem". It's very ideal if we can
> > describe the build graph in Guix with derivations. But we still need a
> > daemon first to process derivations. So we need to build daemon without
> > Guix.
>
> I don't think that's the case, (guix self) relies on a working daemon
> connection before anything else, the built daemon will just be a part of
> the resulting `guix pull` profile, but won't be used to build the new
> Guix (as a matter of fact, the build daemon is built... using the build
> daemon!).
>
> > This issue may be solved by rewriting daemon in Guile. If daemon is
> > written in Guile. We can run it without compilation.
>
> I don't think this is directly related, although some changes that we
> could bring to it would definitely ease what I'm proposing here: having
> a way to build things directly without relying on a root-owned daemon
> running would make the bootstrapping problem easier to solve.
>
> Best,
> --
> Josselin Poiret
>
Naively:
Why does "the" guix daemon per se need root access at all?
Why not let it be an ordinary peer user? The main one already is, UIAM.
Why couldn't it protect /gnu/ storage as a user which the kernel can
keep others from writing to in the usual way?
Another option for managing storage and quickly switching access might be
if you trust the wayland daemons and their protocol for managing a single
user thread's buffers. You might be able to use its event loop to schedule
multiplexed concurrent build jobs.
A peer user daemon scenario might also open possibilities for networked
job distribution beyond a local router's connections, I imagine?
--
Regards,
Bengt Richter
Hi!

I’m late to the party, but I like your thoughtful message.

Josselin Poiret <dev@jpoiret.xyz> writes:

> What I personally think, is that we should rationalize the way we
> interact with Guix source: a running Guix should always be able to hold
> a reference to its source.  The guix package or future equivalent
> (ie. good for internal consumption) should always refer to that same
> source, but that will also require factoring the daemon (and extensions)
> out of the repository, so that the C code doesn't get compiled again on
> every unrelated commit.  Finally, and I think this is the most
> challenging one, we should try to keep the differences between a) and c)
> to the minimum, meaning that one way of building has to go.  This is a
> big change, conceptually and technically, and I understand that this
> might be way more complicated that we'd like, but I think this needs to
> be done at some point.

I definitely agree.  I’d summarize things a bit differently:

  1. We have two build systems for the same software: the GNU build
     system and (guix self).

  2. We have that crazy ‘guix’ package snapshot, which causes the
     problems you mentioned.

Both contribute to a poor developer experience, but I believe they can
be addressed separately.

It’s not clear that #1 is much of a problem in practice.  Someone who
contributes to Guix may have to touch gnu/local.mk, but it rarely goes
beyond that.  It’s annoying, but it’s not clear to me that it’s a
showstopper for newcomers.

One thing that would be nice is getting rid of ‘guix-daemon’ and
replacing it with a pure Scheme/Guix way of building the C++ code that
(guix self) would use.

#2 is the main issue to me.  There’s an open bug about it¹, which I’d
like to address at least in the context of the installer.

Thanks,
Ludo’.

¹ https://issues.guix.gnu.org/53210
Hello,
bokr@bokr.com writes:
> Naively:
>
> Why does "the" guix daemon per se need root access at all?
The main thing is that all files in the store end up being written by
the guix daemon user. So if we want the files to be easily
substitutable, they'd need to have a fixed uid/gid, and the only one we
can guarantee is root. Other than that, it needs to use a bunch of
Linux namespaces to isolate the builds from the rest of the system,
which depending on the kernel build-time configuration might not be
possible when unprivileged.
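
To make that last point a bit more concrete, here is a small shell
sketch, under stated assumptions, of how one might guess whether
unprivileged user namespaces are available.  `userns_hint` is a
hypothetical helper, not part of Guix.  It only reads two sysctl files:
`user/max_user_namespaces`, and `kernel/unprivileged_userns_clone`, a
knob found on Debian-style kernels that may not exist elsewhere.  Its
answer is only a hint, since other mechanisms can still forbid namespace
creation.

```shell
#!/bin/sh
# Report whether unprivileged user namespaces look usable on this kernel.
# $1 is the procfs root (defaults to /proc); it is a parameter only so
# that the function can be exercised against a fake directory tree.
userns_hint() {
    proc=${1:-/proc}
    max="$proc/sys/user/max_user_namespaces"
    clone="$proc/sys/kernel/unprivileged_userns_clone"  # Debian-style knob
    if [ -r "$max" ] && [ "$(cat "$max")" = "0" ]; then
        echo disabled            # no user namespaces may be created
    elif [ -r "$clone" ] && [ "$(cat "$clone")" = "0" ]; then
        echo disabled            # only privileged users may create them
    elif [ -r "$max" ]; then
        echo enabled
    else
        echo unknown             # neither file is readable
    fi
}
```

Calling `userns_hint` with no argument inspects the running system.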
Best,
--
Josselin Poiret
On 21-07-2022 18:10, Josselin Poiret wrote:
> bokr@bokr.com writes:
>> Naively:
>>
>> Why does "the" guix daemon per se need root access at all?
> The main thing is that all files in the store end up being written by
> the guix daemon user.  So if we want the files to be easily
> substitutable, they'd need to have a fixed uid/gid, and the only one we
> can guarantee is root.  Other than that, it needs to use a bunch of
> Linux namespaces to isolate the builds from the rest of the system,
> which depending on the kernel build-time configuration might not be
> possible when unprivileged.

Also, resource savings on multi-user systems.  And if the guix daemon is
run as the regular user, then all other daemons (on Guix System) would
need to be run as that user or as root to be able to access themselves,
which is bad from a security perspective.

Greetings,
Maxime.
Hi Josselin,

tl;dr: I naively don't buy the rationale against a non-root guix daemon :)
Skip to [2] if tl ;)

On +2022-07-21 18:10:53 +0200, Josselin Poiret wrote:
> Hello,
>
> bokr@bokr.com writes:
> > Naively:
> >
> > Why does "the" guix daemon per se need root access at all?
>
> The main thing is that all files in the store end up being written by
> the guix daemon user.

IIUC, that would be guixrootd per se, not the homeless
guixbuilder{0-10} users created by the default install, right?
(IIUC the latter are meant to allow guixrootd to shed its root
write privilege by spawning a user mode continuation, so to speak?)

> ...                    So if we want the files to be easily
> substitutable, they'd need to have a fixed uid/gid,

Why? "Easily" in what way? Untarring tarballs?

Even if tar sets bogus external file system metadata, UIAM the only
privilege you need is ownership, and neither guix nor the installer
needs to run as root itself to get ownership.

That can be done by a tiny helper whose source you can see on one page,
and whose execution you can limit to a user such as the single-writer
guixstored, which UIAM can then chmod and chown at will to share or not.

I don't believe in blanket permissions to accomplish a tiny important
privileged action.  I want to see it factored out for auditability and
comprehensibility.

(Here I did s/guixrootd/guixstored/ as a name for a non-root user which
has exclusive control over gnu/store because it creates
/home/guixstored/gnu/store, thus in its home directory, and no other
user has write access to it except by talking to guixstored via message
or sharing read-only files if mounted and reachable and permitted by
guixstored setting permissions on files it owns.)

> ...                                                 and the only one we
> can guarantee is root.

But why would you have to? ISTM not necessary.

> ...                     Other than that, it needs to use a bunch of
> Linux namespaces to isolate the builds from the rest of the system,

You mean like -G from
--8<---------------cut here---------------start------------->8---
        useradd -g guixbuild -G guixbuild${KVMGROUP} \
                -d /var/empty -s "$(which nologin)" \
                -c "Guix build user $i" --system \
                "guixbuilder${i}";
        _msg "${PAS}user added <guixbuilder${i}>"
    fi
--8<---------------cut here---------------end--------------->8---

YOW! Running as root AND being able to do KVM stuff?
My naive paranoia button has been pushed, and I don't want to do
the searching for info to calm myself ;/

BTW, the above snip was from a guix clone repo pull as of
┌──────────────────────────────────────────────────────────┐
│ # $ git log --pretty=medium --grep 'install\.sh'|head -3 │
├──────────────────────────────────────────────────────────┤
│ commit 3348e485b7229e062e563945ed7e6ac216f25125          │
│ Author: Philip McGrath <philip@philipmcgrath.com>        │
│ Date:   Sun Jul 3 22:35:03 2022 -0400                    │
└──────────────────────────────────────────────────────────┘

> which depending on the kernel build-time configuration might not be
> possible when unprivileged.
>

IWT think that goes away if you run a single-writer daemon on the local
OS, and let the kernel use its namespaces to keep the users from
trampling on each other (any more than they already maybe can with the
current setup).

If the OS can't separate users, ISTM everybody is kind-of root in
effect, but then maybe we can run a single user thread as if root, if
you have an environment where that's useful -- maybe cloud virtual PCs?
Communicating would be an adapter problem, but virtual PCs can boot
fully fledged linux these days, I think, and it seems doubtful that you
would run a big guix build ON a raspberry pi even though TARGETING an
rpi makes lots of sense.
Whew, I've got to stop re-editing this :/

[2]:
So, do you see any real obstacles to making guixrootd an ordinary user
(in the sense of /home/ordinary_user/ ) and calling it guixstored
instead, with an ordinary /home/guixstored/ home directory (where it has
natural protection as guixstored:guixstored on useradd creation, with
added group guixbuilder for helper r/o sharing, which as owner it can
control)?

However many guixbuilder0{1..9}:guixbuilder0{1..9} helper users are
created, (plus guixbuilder10:guixbuilder10 in the default naming :)
they would also belong to the guixbuilder extra group (no suffixed
number) but they only would have read access to parts of
/home/guixstored/gnu/store/... unless guixstored as owner sets other
permissions for the guixbuilder group.

I'm not seeing why there needs to be any guix daemon running as root :)

But this means you can't just uncompress files, metadata and all, for
substitution purposes, which I guess you were alluding to with
"...can't easily...", right?

But IWT that guixstored could use a tiny helper to get ownership, as
above.  (Becoming owner by using a factored-out-tiny-helper to chown
untarred stuff should be safer than running bigstuff as root IWT.)

It can then create and exclusively control
/home/guixstored/gnu/store/... and allow guixbuilder{1..10}
user-helpers to have group read access to
/home/guixstored/gnu/store/...

I am imagining smallish /home directories /home/guixbuilder{1..10}/...
for those helpers to maintain their own states in whatever they are
helping with, and outputting their final results to guixstored by the
most efficient means available to the process.

Whether they're on the same SoC or on the other side of the globe will
make a difference, but IMO that's an adapter concern which should not
affect the design of the essential guile/guix package management
sources or their hashes.
Well, IMHO any adaptations to particular file systems or transfer
protocols should be considered as just that: adapters, orthogonal to
the essential data transforming that guix/guile does, where all data is
just various interpretations of #vu8 compositions.

Otherwise, if adapter cruft is incorporated into the essential guix
package manager, IMHO it will grow into a Rube Goldberg kludge-ball.

Let it manage libraries of adapter implementations, but keep them
scrupulously outside, along with the definitions of their dependencies
on the metadata describing what they are adapting to.

I think basically everything guix/guile reads and writes as it does its
package management should have an official documented #vu8
representation.  Including snarfed and transformed foreign metadata.

> Best,
> --
> Josselin Poiret

For reference, this is a remnant from an old install, but I assume this
is still what the installer creates (right?).
┌─────────────────────────────────────────────────────────────────────────┐
│ # $ grep guix /etc/passwd                                               │
├─────────────────────────────────────────────────────────────────────────┤
│ guixbuilder01:x:998:998:Guix build user 01:/var/empty:/usr/sbin/nologin │
│ guixbuilder02:x:997:998:Guix build user 02:/var/empty:/usr/sbin/nologin │
│ guixbuilder03:x:996:998:Guix build user 03:/var/empty:/usr/sbin/nologin │
│ guixbuilder04:x:995:998:Guix build user 04:/var/empty:/usr/sbin/nologin │
│ guixbuilder05:x:994:998:Guix build user 05:/var/empty:/usr/sbin/nologin │
│ guixbuilder06:x:993:998:Guix build user 06:/var/empty:/usr/sbin/nologin │
│ guixbuilder07:x:992:998:Guix build user 07:/var/empty:/usr/sbin/nologin │
│ guixbuilder08:x:991:998:Guix build user 08:/var/empty:/usr/sbin/nologin │
│ guixbuilder09:x:990:998:Guix build user 09:/var/empty:/usr/sbin/nologin │
│ guixbuilder10:x:989:998:Guix build user 10:/var/empty:/usr/sbin/nologin │
│ guixrootd:988:998:Guix root daemon:/home/guixrootd:                     │
└─────────────────────────────────────────────────────────────────────────┘

So WDYT? :)
--
Regards,
Bengt Richter
Hi Josselin,

Thank you for the clear explanations.

On Wed, 06 Jul 2022 at 22:01, Josselin Poiret <dev@jpoiret.xyz> wrote:

[...]

> What I personally think, is that we should rationalize the way we
> interact with Guix source: a running Guix should always be able to hold
> a reference to its source.  The guix package or future equivalent
> (ie. good for internal consumption) should always refer to that same
> source, but that will also require factoring the daemon (and extensions)
> out of the repository, so that the C code doesn't get compiled again on
> every unrelated commit.  Finally, and I think this is the most
> challenging one, we should try to keep the differences between a) and c)
> to the minimum, meaning that one way of building has to go.  This is a
> big change, conceptually and technically, and I understand that this
> might be way more complicated that we'd like, but I think this needs to
> be done at some point.

Now that #53210 [1] is closed, the situation is improved, right?  Is it
enough for the C extension you are working on?

1: <https://issues.guix.gnu.org/53210#17>

Cheers,
simon
Hi Bengt,

On Tue, 26 Jul 2022 at 03:09, Bengt Richter <bokr@bokr.com> wrote:

> I naively don't buy the rationale against a non-root guix daemon :)

For sure, we can imagine many other designs than the currently
implemented one.  However, at one point or another, “something with
privileges” is required, no?

I mean, consider that the user named Alice installs the package ’foo’
and the user named Bob installs the package ’foo’ too; then, to have a
shared store, “something” needs to know that ’foo’ is installed by Alice
*and* Bob.  The paths contained in the binaries need to be hard-coded
(for reproducibility), so a common location is required and this common
location requires special privileges to be manipulated.

Moreover, this “something” also requires some privileges to run isolated
environments (build, etc.).

Well, in the end, this “something” needs the same privileges as ’root’,
no?  I mean, it appears to me to be the simplest; especially to
configure on various foreign distros.

Pjotr wrote, some time ago, some explanations [1] for running Guix with
a non-root daemon.

1: <https://github.com/pjotrp/guix-notes/blob/master/GUIX-NO-ROOT.org>

Cheers,
simon
Hi zimoun,
zimoun <zimon.toutoune@gmail.com> writes:
> Now #53210 [1] is closed, it improves the situation, right? Is it
> enough for the C extension you are working on?
>
> 1: <https://issues.guix.gnu.org/53210#17>
Well, wrt. the C extension, it doesn't really improve much: there's
still the issue of getting the two build systems that Ludo summarized to
work the same, and the guix snapshot is still used for the system-wide
guix, so the issue remains.
Best,
--
Josselin Poiret