unofficial mirror of guix-devel@gnu.org 
From: Maxime Devos <maximedevos@telenet.be>
To: Philip McGrath <philip@philipmcgrath.com>, guix <guix-devel@gnu.org>
Cc: Liliana Marie Prikler <liliana.prikler@gmail.com>,
	Liliana Marie Prikler <liliana.prikler@ist.tugraz.at>
Subject: Re: What 'sh' should 'system' use?
Date: Mon, 26 Sep 2022 14:24:08 +0200
Message-ID: <2c178e61-291f-5c84-a6fc-84a7dc64e854@telenet.be>
In-Reply-To: <c77defd0-2db3-6f07-883c-c7fd808d5a56@philipmcgrath.com>





On 26-09-2022 09:04, Philip McGrath wrote:
> [...]
> (Very occasionally, a program really does want to invoke the shell, such
> as when shell expansion is part of an existing API.)
> 
> From a different perspective, this is part of why I've recently been
> thinking we should find 'sh' dynamically: most programs/environments
> don't, and shouldn't, need bash{-minimal,-static}, so it seems wrong to
> make it a mandatory dependency of libc.

In another thread, I proposed replacing the 'system' function with a
macro 'system' that looks for a preprocessor definition, say
GUIX_BIN_SH, and then calls _guix_system(GUIX_BIN_SH,...) or such, and
removing the 'system' function itself.

That way, glibc does not use bash-whatever anymore, but we still avoid 
doing things dynamically, avoiding the problems that dynamic finding 
entails.

For packages that use 'system', we would then need to resolve the
resulting build failure by passing -DGUIX_BIN_SH (maybe we could have a
libc-system-function package that overrides the header containing 
'system' and automatically sets GUIX_BIN_SH)?
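
To make the idea a bit more concrete, here is a rough C sketch of what
such a header could look like.  It is only an illustration under my own
assumptions: the body of _guix_system, the exact shape of the macro and
the "/bin/sh" fallback are made up for this example and are not code
from glibc or Guix.  It also leaves out the signal handling that a real
'system' does.

```c
/* Hypothetical sketch; not actual glibc or Guix code.  */
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* In the proposal, GUIX_BIN_SH comes from the package build
   (-DGUIX_BIN_SH='"<absolute file name of sh>"') and a missing
   definition is a build failure.  The fallback below exists only so
   that this sketch compiles on its own.  */
#ifndef GUIX_BIN_SH
# define GUIX_BIN_SH "/bin/sh"
#endif

/* Run COMMAND with the shell at SHELL_PATH.  Return a wait status as
   system(3) does, or -1 if the child could not be created or waited
   for.  With a NULL COMMAND, report whether the shell is executable.  */
static int
_guix_system (const char *shell_path, const char *command)
{
  if (command == NULL)
    return access (shell_path, X_OK) == 0;

  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      execl (shell_path, "sh", "-c", command, (char *) NULL);
      _exit (127);              /* exec failed */
    }

  int status;
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;
  return status;
}

/* Shadow the libc function: every call site now names its shell at
   compile time instead of relying on a path baked into libc.  */
#undef system
#define system(command) _guix_system (GUIX_BIN_SH, (command))
```

With this in place, a call such as system ("exit 3") expands to
_guix_system (GUIX_BIN_SH, ("exit 3")), so the reference to the shell
belongs to the package that uses 'system' and not to libc.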

>> See (1) (reproducibility) -- also, you would need to modify the daemon for that, so there are compatibility concerns, and then we're
>>  stuck with the /bin/sh special case forever (unless breaking compatibility would later be considered acceptable).
>>
> 
> I don't think there's a reproducibility problem.

You are proposing 'weak references' -- weak references are automatically 
broken if the thing referred to is GC'ed (the weak reference is weak, so 
it doesn't count as a reference that keeps it from being GC'ed).

That means that the build process depends on whether bash-whatever is in 
the store or not.

Even if not, the compatibility concerns remain, and incompatible daemons 
sound like a form of irreproducibility to me.

> Guix already can create
> reproducible containers with "/bin/sh" (e.g. 'guix shell coreutils
> --container -- ls /bin') and without "/bin/sh" (as in package build
> environments).
> 
> I haven't investigated whether adding the ability to create "/bin/sh" in
> build containers would require modifying the daemon or just sending the
> daemon different instructions. However, AIUI, Nix *always* creates
> "/bin/sh" in build containers, which makes me further expect that any
> change needed to the daemon would be small.
>
> To be clear, I'm not proposing that we always create "/bin/sh" in build
> containers. At a low level, I'm suggesting that we add the ability to
> create "/bin/sh" when desired. I can imagine one possibility for a
> high-level interface would be to create "/bin/sh" by default when an
> input provides "bin/sh", and it might turn out that we end up wanting
> "/bin/sh" in most or all build containers in practice, but I see those
> as secondary questions.

Again, I don't see how special-casing /bin/sh even further is desirable.

> There are a few dimensions here that I want to try to pick apart.
> 
> When you say:
> 
>> a plain "sh" looked up in the $PATH (like other binaries) and substitute*-ed by Guix should suffice
>
> there are a few different things that might mean.
>
> I think you're probably referring to the status quo, where "sh" is
> looked up in the 'inputs' or a G-expression equivalent and an absolute
> reference to that particular "sh" is embedded into the package being
> built. (But, when cross-compiling, that "sh" would not be in the $PATH
> in the build environment.)

Yes -- to be clear, the looking up in $PATH is for upstream, in Guix it 
would be patched with the absolute reference to something in 'inputs' 
instead.

> There's a different question about $PATH vs. _CS_PATH that I'll address
> later.
> 
> I see at least two reasons to prefer finding "sh" dynamically at run-time.
> 
> First, we have other POSIX-like shells packaged in Guix, such as dash,
> zsh, and gash. Currently, to create an environment where one of these
> shells is used to run 'system'-like functions (e.g. because dash is
> supposed to be faster than bash), you would have to recompile everything
> that depends on glibc. (Maybe you could craft a very ugly graft.)

dash, zsh and gash are incompatible, so you can't simply replace things 
-- looking it up dynamically would potentially introduce bugs. 
Additionally, 'sh' might not exist at /bin/sh or in $PATH, so possibly it 
couldn't be found dynamically, and possibly the version it finds is 
incompatible (reproducibility).

If dash is faster than bash and sufficiently compatible, you can propose to

> Second, sometimes people may want to create environments, images, etc.
> without an "sh" available.

You can do this without dynamic finding and its downsides, see e.g. the 
preprocessor approach mentioned at the beginning.

> In some sense this is a special case of using
> an alternate shell, but the consequences of the status quo are
> especially notable. Currently, practically any environment or image Guix
> creates will include bash-static, because glibc refers to it.
> 
> For an especially ironic example, consider this note from `info
> "(guix)Invoking guix pack"`:
> 
> [...]

That example is about installing something in /bin/sh, it's unrelated to 
'system' AFAICT.

>>>
>>> 3) If we want a dynamic 'sh' not located at '/bin/sh', I think we should implement a function similar to '__bionic_get_shell_path()'
>>>  and use it for '_PATH_BSHELL', 'system', etc. That begs the question of how the function should find 'sh', and I don't have an
>>>  answer for that.
>>
>> How about $PATH?
>>
> 
> This is a subtle point, and it depends in some ways on what you are
> trying to use the 'sh' for. From the "Rationale" in the POSIX spec for
> 'confstr' <https://pubs.opengroup.org/onlinepubs/9699919799/functions/confstr.html>:
> 
>> The original need for this function was to provide a way of finding
>> the configuration-defined default value for the environment variable
>> PATH. Since PATH can be modified by the user to include directories
>> that could contain utilities replacing the standard utilities in the
>> Shell and Utilities volume of POSIX.1-2017, applications need a way
>> to determine the system-supplied PATH environment variable value that
>> contains the correct search path for the standard utilities.

Guix likes users being able to replace things, so $PATH seems more 
desirable here than _CS_PATH (the latter being more difficult to modify 
or install things in) -- the 'system-supplied PATH' should be whatever 
the user wants it to be.
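
For reference, the two lookups being compared could be sketched in C
roughly as follows.  This is only an illustration: find_in_path,
find_sh_in_env_path and find_sh_in_cs_path are names invented for this
example, and the check is simplified (access with X_OK also accepts
directories).  As noted in the quoted text, confstr with _CS_PATH does
not currently give a useful result on Guix.

```c
/* Hypothetical sketch comparing a $PATH lookup with a _CS_PATH lookup.  */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Search the colon-separated directory list DIRS for an executable
   named NAME.  On success, store its file name in OUT (OUT_SIZE bytes)
   and return 0; otherwise return -1.  An empty entry means the current
   directory.  */
int
find_in_path (const char *dirs, const char *name, char *out, size_t out_size)
{
  if (dirs == NULL)
    return -1;

  const char *start = dirs;
  for (;;)
    {
      const char *end = strchr (start, ':');
      size_t len = end ? (size_t) (end - start) : strlen (start);
      int n;

      if (len == 0)
        n = snprintf (out, out_size, "./%s", name);
      else
        n = snprintf (out, out_size, "%.*s/%s", (int) len, start, name);

      if (n > 0 && (size_t) n < out_size && access (out, X_OK) == 0)
        return 0;
      if (end == NULL)
        return -1;
      start = end + 1;
    }
}

/* Look for "sh" in the user-controlled $PATH.  */
int
find_sh_in_env_path (char *out, size_t out_size)
{
  return find_in_path (getenv ("PATH"), "sh", out, out_size);
}

/* Look for "sh" in the system default path from confstr(_CS_PATH).  */
int
find_sh_in_cs_path (char *out, size_t out_size)
{
  size_t needed = confstr (_CS_PATH, NULL, 0);
  if (needed == 0)
    return -1;

  char *dirs = malloc (needed);
  if (dirs == NULL)
    return -1;

  confstr (_CS_PATH, dirs, needed);
  int result = find_in_path (dirs, "sh", out, out_size);
  free (dirs);
  return result;
}
```

The only difference between the two is where the directory list comes
from: the first one follows whatever the user put in $PATH, the second
one asks libc for a fixed system value.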

> I don't have a strong view about the merits of using PATH or not in general, and, again 'confstr' with '_CS_PATH' doesn't currently give a useful result on Guix.
> 
> For 'sh' specifically, though, there's some set of programs that look at $SHELL or /etc/passwd or other mechanisms for a highly-configurable choice of shell: those aren't relevant here. This question concerns a different set of programs that are looking for a reliable plain-vanilla 'sh': this may be configured at the level of the environment (OS, container, chroot, etc.)---including configuring it not to exist---but it's a less fine-grained sort of configuration, and there's a stronger expectation that it will be a POSIX-like 'sh' (not fish or /usr/sbin/nologin).

I don't see an argument against $PATH or for _PATH_BSHELL or _CS_PATH here.

> It seems POSIX would like 'sh' to be found using '_CS_PATH',

How is this relevant?

> but I don't know of any programs that actually do that, and it doesn't work on Guix.

If it doesn't work, we can make it work -- in the initial e-mail, you 
are proposing changes, so I don't think 'it (currently) doesn't work' 
counts as an argument.

> 
> Programs in practice seem to look at "/bin/sh", and environments configuring it by choosing what (possibly nothing) to put at "/bin/sh" from the perspective of programs in that environment. 

In the initial e-mail, you were among other things asking what mechanism 
programs and libraries should use.  Now, you are mentioning what 
programs are currently using (*), and presenting it as an argument. 
This is rather circular.

(*) This is not true for appropriately patched programs and libraries in 
Guix, e.g. glibc and racket.

> I don't mean "document the decision" to necessarily imply something elaborate or formal, but I think the next person packaging a language with a function like 'system' in its standard library shouldn't have to reevaluate these questions from scratch. Also, if we decided the right thing were to advocate for upstreams to do something differently for the sake of portability (e.g. trying to get people to use _CS_PATH---which I'm not suggesting), it would help to have a rationale to point to.

OK, though I think the answer is: don't do that, 'system' is prone to 
errors, implement interfaces like 'system*' instead.
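
To illustrate what a 'system*'-like interface means at the C level,
here is a small sketch that runs a program from an argument vector
without involving any shell.  The name run_argv is made up for this
example; it is written in the spirit of Guile's 'system*' but is not
taken from any existing library.

```c
/* Hypothetical sketch of a 'system*'-style helper: no shell involved.  */
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Run the program ARGV[0] (searched for in $PATH) with the
   NULL-terminated argument vector ARGV.  Each argument reaches the
   program verbatim, so there is no quoting or expansion to get wrong.
   Return the wait status, or -1 with errno set on failure.  */
int
run_argv (char *const argv[])
{
  pid_t pid;
  int err = posix_spawnp (&pid, argv[0], NULL, NULL, argv, environ);
  if (err != 0)
    {
      errno = err;
      return -1;
    }

  int status;
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;
  return status;
}
```

For example, calling run_argv with the vector {"test", "x; exit 7",
"=", "x; exit 7", NULL} passes "x; exit 7" as a plain string, whereas
handing the same text to 'system' would let the shell interpret the
semicolon.  No 'sh' is needed at all here, so the question of which
'sh' to use does not come up.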

> Specifically with respect to bash-minimal vs. bash-static, I'm still not clear on when I should use which. The package descriptions are identical, and I haven't found a clear (to me, at least) explanation in the source code comments. For example, if bash-static is needed to avoid a cycle as you say, what is the benefit of also having bash-minimal? 

Use 'bash-minimal'.  'bash-static' is a hack to resolve the cycle, hence 
something to be avoided where possible and preferably eventually 
eliminated (e.g. with preprocessor tricks or by dynamic finding).

Greetings,
Maxime.



