* bug#61095: possible misuse of posix_spawn API on non-linux OSes
@ 2023-01-27 11:51 Omar Polo
2023-01-27 12:25 ` Omar Polo
2023-03-28 9:34 ` Ludovic Courtès
0 siblings, 2 replies; 11+ messages in thread
From: Omar Polo @ 2023-01-27 11:51 UTC (permalink / raw)
To: 61095
Hello,
I've noticed that test-system-cmds fails on OpenBSD-CURRENT while
testing the update to guile 3.0.9:
test-system-cmds: system* exit status was 127 rather than 42
FAIL: test-system-cmds
Here's an excerpt of the ktrace of the child process while executing
that specific test: (the first fork() is the one implicitly done by
posix_spawn(3))
5590 guile RET fork 0
[...]
5590 guile CALL dup2(0,3)
5590 guile RET dup2 3
5590 guile CALL dup2(1,4)
5590 guile RET dup2 4
5590 guile CALL dup2(2,5)
5590 guile RET dup2 5
5590 guile CALL dup2(3,0)
5590 guile RET dup2 0
5590 guile CALL dup2(4,1)
5590 guile RET dup2 1
5590 guile CALL dup2(5,2)
5590 guile RET dup2 2
5590 guile CALL close(1023)
5590 guile RET close -1 errno 9 Bad file descriptor
5590 guile CALL kbind(0x7f7ffffd51f8,24,0x2b5c5ced59893fa9)
5590 guile RET kbind 0
5590 guile CALL exit(127)
(if you prefer I can provide a full ktrace of guile executing that
test case)
My interpretation is that the sequence of dup2(2) is from
posix_spawn_file_actions_adddup2 in do_spawn, while the strange
close(1023) is from close_inherited_fds_slow. Such file descriptor is
not open, so close(2) fails with EBADF and the posix_spawn machinery
exits prematurely. My current RLIMIT_NOFILE is 1024, so the number
would make sense.
On OpenBSD I've tried to use the following patch to work around the
issue:
[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,6 +1325,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
static void
close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
{
+ max_fd = getdtablecount();
while (--max_fd > 2)
posix_spawn_file_actions_addclose (actions, max_fd);
}
]]]
getdtablecount(2) returns the number of file descriptor currently open
by the process. unfortunately it doesn't seem to be portable. (well,
tbf /proc/self/fd is not portable too.)
However, while this pleases the system* test, it breaks the pipe
tests:
Running popen.test
FAIL: popen.test: open-input-pipe: echo hello
FAIL: popen.test: pipeline - arguments: (expected-value ("HELLO WORLD\n" (0 0)) actual-value ("" (127 0)))
the reason seem to be similar:
74865 guile CALL dup2(7,3)
74865 guile RET dup2 3
74865 guile CALL dup2(10,4)
74865 guile RET dup2 4
74865 guile CALL dup2(2,5)
74865 guile RET dup2 5
74865 guile CALL dup2(3,0)
74865 guile RET dup2 0
74865 guile CALL dup2(4,1)
74865 guile RET dup2 1
74865 guile CALL dup2(5,2)
74865 guile RET dup2 2
74865 guile CALL close(8)
74865 guile RET close -1 errno 9 Bad file descriptor
74865 guile CALL kbind(0x7f7ffffcfa88,24,0x2125923bdf2ca9e)
74865 guile RET kbind 0
74865 guile CALL exit(127)
I guess it's trying to close the fd of the pipe that was closed.
I'm not sure what to do from here, I'm not used to the posix_spawn_*
APIs. I'm happy to help testing diffs or by providing more info if
needed.
Thanks,
Omar Polo
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-01-27 11:51 bug#61095: possible misuse of posix_spawn API on non-linux OSes Omar Polo
@ 2023-01-27 12:25 ` Omar Polo
2023-03-28 9:34 ` Ludovic Courtès
1 sibling, 0 replies; 11+ messages in thread
From: Omar Polo @ 2023-01-27 12:25 UTC (permalink / raw)
To: 61095
Actually I can avoid the EBADF by checking that the fd is 'live' with
something like fstat:
[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,8 +1325,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
static void
close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
{
- while (--max_fd > 2)
- posix_spawn_file_actions_addclose (actions, max_fd);
+ struct stat sb;
+ max_fd = getdtablecount();
+ while (--max_fd > 2) {
+ if (fstat(max_fd, &sb) != -1)
+ posix_spawn_file_actions_addclose (actions, max_fd);
+ }
}
static void
]]]
The regress passes and while this workaround may be temporarly
acceptable I -personally- don't like it much. There's a reason guile
can't set CLOEXEC for all the file descriptors > 2 obtained via open,
socket, pipe, ... like perl -for example- does?
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-01-27 11:51 bug#61095: possible misuse of posix_spawn API on non-linux OSes Omar Polo
2023-01-27 12:25 ` Omar Polo
@ 2023-03-28 9:34 ` Ludovic Courtès
2023-03-28 16:10 ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
1 sibling, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-28 9:34 UTC (permalink / raw)
To: Omar Polo; +Cc: Josselin Poiret, Andrew Whatson, 61095
[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]
Hi Omar,
Apologies for the late reply.
Omar Polo <op@omarpolo.com> skribis:
> I've noticed that test-system-cmds fails on OpenBSD-CURRENT while
> testing the update to guile 3.0.9:
>
> test-system-cmds: system* exit status was 127 rather than 42
> FAIL: test-system-cmds
We’re seeing the same failure on GNU/Hurd:
https://issues.guix.gnu.org/61079
> Actually I can avoid the EBADF by checking that the fd is 'live' with
> something like fstat:
>
> [[[
>
> Index: libguile/posix.c
> --- libguile/posix.c.orig
> +++ libguile/posix.c
> @@ -1325,8 +1325,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
> static void
> close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
> {
> - while (--max_fd > 2)
> - posix_spawn_file_actions_addclose (actions, max_fd);
> + struct stat sb;
> + max_fd = getdtablecount();
> + while (--max_fd > 2) {
> + if (fstat(max_fd, &sb) != -1)
> + posix_spawn_file_actions_addclose (actions, max_fd);
> + }
> }
I came up with the following patch:
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 3545 bytes --]
diff --git a/libguile/posix.c b/libguile/posix.c
index 3a8be94e4..cde199888 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1322,39 +1322,18 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
#undef FUNC_NAME
#endif /* HAVE_FORK */
-static void
-close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
-{
- while (--max_fd > 2)
- posix_spawn_file_actions_addclose (actions, max_fd);
-}
-
static void
close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
{
- DIR *dirp;
- struct dirent *d;
- int fd;
-
- /* Try to use the platform-specific list of open file descriptors, so
- we don't need to use the brute force approach. */
- dirp = opendir ("/proc/self/fd");
-
- if (dirp == NULL)
- return close_inherited_fds_slow (actions, max_fd);
-
- while ((d = readdir (dirp)) != NULL)
+ while (--max_fd > 2)
{
- fd = atoi (d->d_name);
-
- /* Skip "." and "..", garbage entries, stdin/stdout/stderr. */
- if (fd <= 2)
- continue;
-
- posix_spawn_file_actions_addclose (actions, fd);
+ /* Adding invalid file descriptors to an 'addclose' action leads
+ to 'posix_spawn' failures on some operating systems:
+ <https://bugs.gnu.org/61095>. Hence the extra check. */
+ int flags = fcntl (max_fd, F_GETFD, NULL);
+ if ((flags >= 0) && ((flags & FD_CLOEXEC) == 0))
+ posix_spawn_file_actions_addclose (actions, max_fd);
}
-
- closedir (dirp);
}
static pid_t
@@ -1366,6 +1345,26 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
posix_spawn_file_actions_t actions;
posix_spawnattr_t *attrp = NULL;
+ posix_spawn_file_actions_init (&actions);
+
+ /* Duplicate IN, OUT, and ERR unconditionally to clear their
+ FD_CLOEXEC flag, if any. */
+ posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILENO);
+ posix_spawn_file_actions_adddup2 (&actions, out, STDOUT_FILENO);
+ posix_spawn_file_actions_adddup2 (&actions, err, STDERR_FILENO);
+
+ /* TODO: Use 'closefrom' where available. */
+#if 0
+ /* Version 2.34 of the GNU libc provides this function. */
+ posix_spawn_file_actions_addclosefrom_np (&actions, 3);
+#else
+ if (in > 2)
+ posix_spawn_file_actions_addclose (&actions, in);
+ if (out > 2 && out != in)
+ posix_spawn_file_actions_addclose (&actions, out);
+ if (err > 2 && err != out && err != in)
+ posix_spawn_file_actions_addclose (&actions, err);
+
int max_fd = 1024;
#if defined (HAVE_GETRLIMIT) && defined (RLIMIT_NOFILE)
@@ -1376,31 +1375,8 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
}
#endif
- posix_spawn_file_actions_init (&actions);
-
- int free_fd_slots = 0;
- int fd_slot[3];
-
- for (int fdnum = 3; free_fd_slots < 3 && fdnum < max_fd; fdnum++)
- {
- if (fdnum != in && fdnum != out && fdnum != err)
- {
- fd_slot[free_fd_slots] = fdnum;
- free_fd_slots++;
- }
- }
-
- /* Move the fds out of the way, so that duplicate fds or fds equal
- to 0, 1, 2 don't trample each other */
-
- posix_spawn_file_actions_adddup2 (&actions, in, fd_slot[0]);
- posix_spawn_file_actions_adddup2 (&actions, out, fd_slot[1]);
- posix_spawn_file_actions_adddup2 (&actions, err, fd_slot[2]);
- posix_spawn_file_actions_adddup2 (&actions, fd_slot[0], 0);
- posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
- posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
-
close_inherited_fds (&actions, max_fd);
+#endif
int res = -1;
if (spawnp)
[-- Attachment #3: Type: text/plain, Size: 1152 bytes --]
Could you confirm that it works on OpenBSD and that there’s no
performance regression?
Andrew: it removes the /proc/self/fd loop you added to fix
<https://bugs.gnu.org/59321>, but it reduces the number of ‘close’ calls
in the child. Could you check whether that’s okay performance-wise?
Eventually I plan to use ‘posix_spawn_file_actions_addclosefrom_np’ on
glibc >= 2.34, but I have yet to test it. That will be the best
solution.
Josselin: I simplified the ‘dup2’ logic somewhat.
Feedback welcome!
> The regress passes and while this workaround may be temporarly
> acceptable I -personally- don't like it much. There's a reason guile
> can't set CLOEXEC for all the file descriptors > 2 obtained via open,
> socket, pipe, ... like perl -for example- does?
Guile does that for file descriptors it opens internally, but
applications using ‘open-file’ without the recently-added “e” flag, or
‘socket’ without ‘SOCK_CLOEXEC’, etc., end up with more file descriptors
that need to be taken care of.
I wish the default were close-on-exec, but we’re not there yet.
Thanks,
Ludo’.
^ permalink raw reply related [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-03-28 9:34 ` Ludovic Courtès
@ 2023-03-28 16:10 ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
2023-03-29 22:30 ` Ludovic Courtès
0 siblings, 1 reply; 11+ messages in thread
From: Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language @ 2023-03-28 16:10 UTC (permalink / raw)
To: Ludovic Courtès, Omar Polo; +Cc: Andrew Whatson, 61095
[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]
Hi Ludo,
Ludovic Courtès <ludo@gnu.org> writes:
> - posix_spawn_file_actions_addclose (actions, fd);
> + /* Adding invalid file descriptors to an 'addclose' action leads
> + to 'posix_spawn' failures on some operating systems:
> + <https://bugs.gnu.org/61095>. Hence the extra check. */
> + int flags = fcntl (max_fd, F_GETFD, NULL);
> + if ((flags >= 0) && ((flags & FD_CLOEXEC) == 0))
> + posix_spawn_file_actions_addclose (actions, max_fd);
> }
I'm worried about TOCTOU in multi-threaded contexts here, which is why I
opted for the heavy handed-approach here. In general I don't think we
can know in advance which fdes to close :/ It's a shame that the
posix_spawn actions fails on other kernels, since I don't really see a
way to mitigate this problem (apart from the new
posix_spawn_file_actions_addclosefrom_np as you mentioned). I don't
know what we could do here. Maybe not provide spawn? Or provide it in
spite of the broken fd closing?
> -
> - closedir (dirp);
> }
>
> static pid_t
> @@ -1366,6 +1345,26 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
> posix_spawn_file_actions_t actions;
> posix_spawnattr_t *attrp = NULL;
>
> + posix_spawn_file_actions_init (&actions);
> +
> + /* Duplicate IN, OUT, and ERR unconditionally to clear their
> + FD_CLOEXEC flag, if any. */
> + posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILENO);
> + posix_spawn_file_actions_adddup2 (&actions, out, STDOUT_FILENO);
> + posix_spawn_file_actions_adddup2 (&actions, err, STDERR_FILENO);
This won't work, and actually this was one of the original logic bugs I
was trying to fix. If err is equal to, let's say, STDIN_FILENO, then
the first call will overwrite the initial file descriptor at
STDIN_FILENO, and the second call won't do what the caller intended.
This is why I was moving them out of the way first, so that they would
not overwrite each other.
> + /* TODO: Use 'closefrom' where available. */
> +#if 0
> + /* Version 2.34 of the GNU libc provides this function. */
> + posix_spawn_file_actions_addclosefrom_np (&actions, 3);
> +#else
> + if (in > 2)
> + posix_spawn_file_actions_addclose (&actions, in);
> + if (out > 2 && out != in)
> + posix_spawn_file_actions_addclose (&actions, out);
> + if (err > 2 && err != out && err != in)
> + posix_spawn_file_actions_addclose (&actions, err);
Isn't this unneeded given we call close_inherited_fds below?
> [...]
>
> Josselin: I simplified the ‘dup2’ logic somewhat.
See my comments above.
> Guile does that for file descriptors it opens internally, but
> applications using ‘open-file’ without the recently-added “e” flag, or
> ‘socket’ without ‘SOCK_CLOEXEC’, etc., end up with more file descriptors
> that need to be taken care of.
>
> I wish the default were close-on-exec, but we’re not there yet.
+1
Best,
--
Josselin Poiret
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 682 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-03-28 16:10 ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
@ 2023-03-29 22:30 ` Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
2023-03-30 20:21 ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
0 siblings, 2 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
To: Josselin Poiret; +Cc: 61095, Andrew Whatson, Omar Polo
Hi!
Josselin Poiret <dev@jpoiret.xyz> skribis:
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> - posix_spawn_file_actions_addclose (actions, fd);
>> + /* Adding invalid file descriptors to an 'addclose' action leads
>> + to 'posix_spawn' failures on some operating systems:
>> + <https://bugs.gnu.org/61095>. Hence the extra check. */
>> + int flags = fcntl (max_fd, F_GETFD, NULL);
>> + if ((flags >= 0) && ((flags & FD_CLOEXEC) == 0))
>> + posix_spawn_file_actions_addclose (actions, max_fd);
>> }
>
> I'm worried about TOCTOU in multi-threaded contexts here,
Yes, that’s a problem. The current /proc/self/fd optimization has that
problem too. :-/
> which is why I opted for the heavy handed-approach here. In general I
> don't think we can know in advance which fdes to close :/ It's a shame
> that the posix_spawn actions fails on other kernels, since I don't
> really see a way to mitigate this problem (apart from the new
> posix_spawn_file_actions_addclosefrom_np as you mentioned). I don't
> know what we could do here. Maybe not provide spawn? Or provide it
> in spite of the broken fd closing?
Not providing ‘spawn’ is no longer an option.
We can expect the problem to practically vanish “soon” on GNU variants
with ‘closefrom’ (glibc 2.34 was released in Aug. 2021).
On Linux with glibc < 2.34, we’d keep the current code (maybe without
the /proc/self/fd optimization?).
On other systems, we can have the racy code above as a last resort or
OS-specific tricks, like Omar was suggesting for OpenBSD. It sucks but
what else can we do?
(BTW,
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_addclose.html>
reads:
It shall not be considered an error for the fildes argument passed to
these functions to specify a file descriptor for which the specified
operation could not be performed at the time of the call. Any such
error will be detected when the associated file actions object is
later used during a posix_spawn() or posix_spawnp() operation.
OpenBSD and GNU/Hurd follow this to the letter.
OTOH ‘linux/spawni.c’ in glibc is purposefully more liberal:
/* Signal errors only for file descriptors out of range. */
)
>> + /* Duplicate IN, OUT, and ERR unconditionally to clear their
>> + FD_CLOEXEC flag, if any. */
>> + posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILENO);
>> + posix_spawn_file_actions_adddup2 (&actions, out, STDOUT_FILENO);
>> + posix_spawn_file_actions_adddup2 (&actions, err, STDERR_FILENO);
>
> This won't work, and actually this was one of the original logic bugs I
> was trying to fix. If err is equal to, let's say, STDIN_FILENO, then
> the first call will overwrite the initial file descriptor at
> STDIN_FILENO, and the second call won't do what the caller intended.
> This is why I was moving them out of the way first, so that they would
> not overwrite each other.
Oh, my bad.
>> + /* TODO: Use 'closefrom' where available. */
>> +#if 0
>> + /* Version 2.34 of the GNU libc provides this function. */
>> + posix_spawn_file_actions_addclosefrom_np (&actions, 3);
>> +#else
>> + if (in > 2)
>> + posix_spawn_file_actions_addclose (&actions, in);
>> + if (out > 2 && out != in)
>> + posix_spawn_file_actions_addclose (&actions, out);
>> + if (err > 2 && err != out && err != in)
>> + posix_spawn_file_actions_addclose (&actions, err);
>
> Isn't this unneeded given we call close_inherited_fds below?
No, because of the FD_CLOEXEC selection.
Coming next is an updated patch series addressing this as proposed
above. Let me know what y’all think!
I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
… which gives us glibc 2.35.
Ludo’.
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems.
2023-03-29 22:30 ` Ludovic Courtès
@ 2023-03-29 22:30 ` Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn' Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available Ludovic Courtès
2023-03-30 20:21 ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
1 sibling, 2 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
To: 61095; +Cc: Ludovic Courtès, Josselin Poiret, Omar Polo, Andrew Whatson
Fixes <https://bugs.gnu.org/61095>.
Reported by Omar Polo <op@omarpolo.com>.
* libguile/posix.c (close_inherited_fds_slow): On systems other than
GNU/Linux, call 'addclose' only when 'fcntl' succeeds on MAX_FD.
---
libguile/posix.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/libguile/posix.c b/libguile/posix.c
index 3a8be94e4..68e9bfade 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1326,7 +1326,24 @@ static void
close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
{
while (--max_fd > 2)
- posix_spawn_file_actions_addclose (actions, max_fd);
+ {
+ /* Adding a 'close' action for a file descriptor that is not open
+ causes 'posix_spawn' to fail on GNU/Hurd and on OpenBSD, but
+ not on GNU/Linux: <https://bugs.gnu.org/61095>. Hence this
+ strategy:
+
+ - On GNU/Linux, close every FD, since that's the only
+ race-free way to make sure the child doesn't inherit one.
+ - On other systems, only close FDs currently open in the
+ parent; it works, but it's racy (XXX).
+
+ The only reliable option is 'addclosefrom'. */
+#if ! (defined __GLIBC__ && defined __linux__)
+ int flags = fcntl (max_fd, F_GETFD, NULL);
+ if (flags >= 0)
+#endif
+ posix_spawn_file_actions_addclose (actions, max_fd);
+ }
}
static void
base-commit: e334e59589c3cbfc68d3f7d0d739000e0876b36d
--
2.39.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn'.
2023-03-29 22:30 ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
@ 2023-03-29 22:30 ` Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available Ludovic Courtès
1 sibling, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
To: 61095; +Cc: Ludovic Courtès, Josselin Poiret, Omar Polo, Andrew Whatson
This reverts 9332b632407894c2e1951cce1bc678f19e1fa8e4, thereby
reinstating the performance issue in <https://bugs.gnu.org/59321>.
This optimization was subject to race conditions in multi-threaded code:
new file descriptors could pop up at any time and thus leak in the
child.
* libguile/posix.c (close_inherited_fds): Remove.
(close_inherited_fds_slow): Rename to...
(close_inherited_fds): ... this.
---
libguile/posix.c | 30 +-----------------------------
1 file changed, 1 insertion(+), 29 deletions(-)
diff --git a/libguile/posix.c b/libguile/posix.c
index 68e9bfade..b5830c43b 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1323,7 +1323,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
#endif /* HAVE_FORK */
static void
-close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
+close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
{
while (--max_fd > 2)
{
@@ -1346,34 +1346,6 @@ close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
}
}
-static void
-close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
-{
- DIR *dirp;
- struct dirent *d;
- int fd;
-
- /* Try to use the platform-specific list of open file descriptors, so
- we don't need to use the brute force approach. */
- dirp = opendir ("/proc/self/fd");
-
- if (dirp == NULL)
- return close_inherited_fds_slow (actions, max_fd);
-
- while ((d = readdir (dirp)) != NULL)
- {
- fd = atoi (d->d_name);
-
- /* Skip "." and "..", garbage entries, stdin/stdout/stderr. */
- if (fd <= 2)
- continue;
-
- posix_spawn_file_actions_addclose (actions, fd);
- }
-
- closedir (dirp);
-}
-
static pid_t
do_spawn (char *exec_file, char **exec_argv, char **exec_env,
int in, int out, int err, int spawnp)
--
2.39.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available.
2023-03-29 22:30 ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn' Ludovic Courtès
@ 2023-03-29 22:30 ` Ludovic Courtès
1 sibling, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
To: 61095; +Cc: Ludovic Courtès, Josselin Poiret, Omar Polo, Andrew Whatson
* configure.ac: Check for 'posix_spawn_file_actions_addclosefrom_np'.
* libguile/posix.c (HAVE_ADDCLOSEFROM): New macro.
(close_inherited_fds): Wrap in #ifdef HAVE_ADDCLOSEFROM.
(do_spawn) [HAVE_ADDCLOSEFROM]: Use 'posix_spawn_file_actions_addclosefrom_np'.
---
configure.ac | 4 +++-
libguile/posix.c | 14 ++++++++++++++
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/configure.ac b/configure.ac
index d5ce1c4ac..4a93be979 100644
--- a/configure.ac
+++ b/configure.ac
@@ -515,6 +515,7 @@ AC_CHECK_HEADERS([crt_externs.h])
# sched_getaffinity, sched_setaffinity - GNU extensions (glibc)
# sendfile - non-POSIX, found in glibc
# pipe2 - non-POSIX, found in glibc (GNU/Linux and GNU/Hurd)
+# posix_spawn_file_actions_addclosefrom_np - glibc >= 2.34
#
AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid \
fesetround ftime ftruncate fchown fchownat fchmod fchdir readlinkat \
@@ -528,7 +529,8 @@ AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid \
index bcopy rindex truncate isblank _NSGetEnviron \
strcoll_l strtod_l strtol_l newlocale uselocale utimensat \
fstatat futimens openat \
- sched_getaffinity sched_setaffinity sendfile pipe2])
+ sched_getaffinity sched_setaffinity sendfile pipe2
+ posix_spawn_file_actions_addclosefrom_np])
# The newlib C library uses _NL_ prefixed locale langinfo constants.
AC_CHECK_DECLS([_NL_NUMERIC_GROUPING], [], [], [[#include <langinfo.h>]])
diff --git a/libguile/posix.c b/libguile/posix.c
index b5830c43b..3adc743c4 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1322,6 +1322,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
#undef FUNC_NAME
#endif /* HAVE_FORK */
+#ifdef HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCLOSEFROM_NP
+# define HAVE_ADDCLOSEFROM 1
+#endif
+
+#ifndef HAVE_ADDCLOSEFROM
+
static void
close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
{
@@ -1346,6 +1352,8 @@ close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
}
}
+#endif
+
static pid_t
do_spawn (char *exec_file, char **exec_argv, char **exec_env,
int in, int out, int err, int spawnp)
@@ -1389,7 +1397,13 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
+#ifdef HAVE_ADDCLOSEFROM
+ /* This function appears in glibc 2.34. It's both free from race
+ conditions and more efficient than the alternative. */
+ posix_spawn_file_actions_addclosefrom_np (&actions, 3);
+#else
close_inherited_fds (&actions, max_fd);
+#endif
int res = -1;
if (spawnp)
--
2.39.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-03-29 22:30 ` Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
@ 2023-03-30 20:21 ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
2023-03-31 17:45 ` Omar Polo
1 sibling, 1 reply; 11+ messages in thread
From: Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language @ 2023-03-30 20:21 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: 61095, Andrew Whatson, Omar Polo
[-- Attachment #1: Type: text/plain, Size: 448 bytes --]
Hi Ludo,
Ludovic Courtès <ludo@gnu.org> writes:
> Coming next is an updated patch series addressing this as proposed
> above. Let me know what y’all think!
>
> I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
>
> guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
I didn't test, but this LGTM! Maybe someone on OpenBSD could test this
patchset?
Best,
--
Josselin Poiret
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 682 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-03-30 20:21 ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
@ 2023-03-31 17:45 ` Omar Polo
2023-04-02 13:44 ` Ludovic Courtès
0 siblings, 1 reply; 11+ messages in thread
From: Omar Polo @ 2023-03-31 17:45 UTC (permalink / raw)
To: Josselin Poiret; +Cc: Ludovic Courtès, Andrew Whatson, 61095
On 2023/03/30 22:21:28 +0200, Josselin Poiret <dev@jpoiret.xyz> wrote:
> Hi Ludo,
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
> > Coming next is an updated patch series addressing this as proposed
> > above. Let me know what y’all think!
> >
> > I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
> >
> > guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
>
> I didn't test, but this LGTM! Maybe someone on OpenBSD could test this
> patchset?
% gmake check
<snip />
gmake[5]: Entering directory '/home/op/w/guile/test-suite/standalone'
PASS: test-system-cmds
it seems to work on OpenBSD 7.3 :)
but note that our libc doesn't have posix_spawn_file_actions_addclosefrom_np,
so this is using the "racy" code path.
Just for curiosity, as it's outside the scope of the bug, what's the
reason posix_spawn was used instead of a more classic fork() +
closefrom()?
Thanks,
Omar Polo
^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
2023-03-31 17:45 ` Omar Polo
@ 2023-04-02 13:44 ` Ludovic Courtès
0 siblings, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-04-02 13:44 UTC (permalink / raw)
To: Omar Polo; +Cc: Josselin Poiret, Andrew Whatson, 61095-done
Hi!
Omar Polo <op@omarpolo.com> skribis:
> On 2023/03/30 22:21:28 +0200, Josselin Poiret <dev@jpoiret.xyz> wrote:
>> Hi Ludo,
>>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>> > Coming next is an updated patch series addressing this as proposed
>> > above. Let me know what y’all think!
>> >
>> > I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
>> >
>> > guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
>>
>> I didn't test, but this LGTM! Maybe someone on OpenBSD could test this
>> patchset?
>
> % gmake check
> <snip />
> gmake[5]: Entering directory '/home/op/w/guile/test-suite/standalone'
> PASS: test-system-cmds
>
> it seems to work on OpenBSD 7.3 :)
Awesome! Pushed as 9cc85a4f52147fcdaa4c52a62bcc87bdb267d0a9.
> but note that our libc doesn't have posix_spawn_file_actions_addclosefrom_np,
> so this is using the "racy" code path.
Yeah, not great. :-/ I hope that function will be adopted by other
libcs, especially since ‘closefrom’ is already available.
> Just for curiosity, as it's outside the scope of the bug, what's the
> reason posix_spawn was used instead of a more classic fork() +
> closefrom()?
There’s a long discussion at:
https://issues.guix.gnu.org/52835
Essentially, ‘fork’ is unusable in multi-threaded context, in addition
to being inefficient.
Thanks,
Ludo’.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-04-02 13:44 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-27 11:51 bug#61095: possible misuse of posix_spawn API on non-linux OSes Omar Polo
2023-01-27 12:25 ` Omar Polo
2023-03-28 9:34 ` Ludovic Courtès
2023-03-28 16:10 ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
2023-03-29 22:30 ` Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn' Ludovic Courtès
2023-03-29 22:30 ` bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available Ludovic Courtès
2023-03-30 20:21 ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
2023-03-31 17:45 ` Omar Polo
2023-04-02 13:44 ` Ludovic Courtès
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).