Ludovic Courtès writes: > Previously, services started indirectly with ‘exec-command’ (which is usually > the case) would not inherit any file descriptor from shepherd because > ‘exec-command’ would explicitly close all of them. However, services started > with ‘make-system-constructor’ and processes created by some other means, such > as calling ‘system*’, would inherit some of those descriptors, giving them > more authority than intended. Interestingly enough, guile's system* closes all file descriptors aside from the first 3, and we now provide a replacement for system* that uses fork+exec-command and therefore does not. > The FD-closing loop was removed on purpose, in > 2c0354258047133db8b885bcc11afdf0def5d885. > > Now, as you write, it means that service writers must be careful now and > not leave any non-CLOEXEC file descriptor behind them. > > At the time I audited Guix System to check that this was a reasonable > thing to expect and that we could indeed ensure no file descriptors were > leaked. There’s also ‘tests/close-on-exec.sh’. > > If you found cases where it would be necessary, what we could do is have > ‘shepherd’ replace ‘call-with-input-file’ & co. with a variant that > opens files as O_CLOEXEC by default. WDYT? The problem is that there isn't a way for the user to replicate the old behavior of default-non-inheriting short of wrapping the desired command with another file-descriptor-closing program, by which point the user and group may already have been changed. O_CLOEXEC is good, but to use it to prevent leaks to any particular program is a matter of global correctness, while closing everything not explicitly allowed for a program is a matter of local correctness. For example, if I have a program whose quality (and, to some extent, trustworthiness) I am wary of, I would really prefer to be certain that it doesn't get any file descriptors it shouldn't. The only way to be certain of this currently is to examine the entire shepherd process - including all services - to be sure there aren't any leaks. Replacing 'call-with-input-file' and such - as I now see is done in guix's shepherd config file - would probably be a good idea (I see we use call-with-input-file in primitive-load* and read-pid-file, for example), but it's still no guarantee. For example, I didn't even know SOCK_CLOEXEC was a thing, and so didn't use it in one of my services that calls socketpair. I did use fcntl afterward, but the time between those two introduces all kinds of questions, like "have you called system, system*, spawn-command, sleep, done any I/O, or had the misfortune to receive a SIGCHLD on a system without signalfd support". And if I hadn't happened to have looked at the source of exec-command, I wouldn't have even tried fcntl, because the manual entry for fork+exec-command and exec-command still clearly states: File descriptors 1 and 2 are kept as is or redirected to either LOG-PORT or LOG-FILE if it’s true, whereas file descriptor 0 (standard input) points to INPUT-PORT or ‘/dev/null’; all other file descriptors are closed prior to yielding control to COMMAND. Whatever course of action is taken, it is imperative that the documentation and code be aligned on this matter. Personally I'm inclined to bring the code in line with the documentation. >> In other words, the copying of earlier FDs in EXTRA-PORTS may overwrite >> later FDs in EXTRA-PORTS. > > Good catch! > > Could you make a more minimal patch fixing this specific issue, also > adding a test reproducing the problem being fixed? I'm sending a more granular v2 series that adds the requested test, fixes the clobbering, makes extra-ports be honored regardless of the values of log-port and log-file, fixes edge cases around file descriptors 0-2, and finally adds support for closing all other file descriptors to exec-command. That last one also makes closing all other file descriptors the default, mostly because it makes the most sense for the default value of extra-ports, and otherwise I'd have to update the manual. It does include the option of #:extra-ports #t, though, in which case no fds are closed. One alternative would be to have the close-all-others behavior controlled by an independent argument; that way the file descriptor rearranging functionality can be independent of whether other file descriptors are closed. Speaking of file descriptor rearranging functionality, the v2 replaces 'preserve-ports' with 'reconfigure-fds', which uses a different, graph-based approach to avoiding clobbering while using fewer system calls. It's also more robust, and will function as long as there is even a single unused file descriptor, and won't touch any open file descriptors other than those in the target range. - ulfvonbelow