unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
* bug#61095: possible misuse of posix_spawn API on non-linux OSes
@ 2023-01-27 11:51 Omar Polo
  2023-01-27 12:25 ` Omar Polo
  2023-03-28  9:34 ` Ludovic Courtès
  0 siblings, 2 replies; 11+ messages in thread
From: Omar Polo @ 2023-01-27 11:51 UTC (permalink / raw)
  To: 61095

Hello,

I've noticed that test-system-cmds fails on OpenBSD-CURRENT while
testing the update to guile 3.0.9:

    test-system-cmds: system* exit status was 127 rather than 42
    FAIL: test-system-cmds

Here's an excerpt of the ktrace of the child process while executing
that specific test: (the first fork() is the one implicitly done by
posix_spawn(3))

  5590 guile RET   fork 0
  [...]
  5590 guile CALL  dup2(0,3)
  5590 guile RET   dup2 3
  5590 guile CALL  dup2(1,4)
  5590 guile RET   dup2 4
  5590 guile CALL  dup2(2,5)
  5590 guile RET   dup2 5
  5590 guile CALL  dup2(3,0)
  5590 guile RET   dup2 0
  5590 guile CALL  dup2(4,1)
  5590 guile RET   dup2 1
  5590 guile CALL  dup2(5,2)
  5590 guile RET   dup2 2
  5590 guile CALL  close(1023)
  5590 guile RET   close -1 errno 9 Bad file descriptor
  5590 guile CALL  kbind(0x7f7ffffd51f8,24,0x2b5c5ced59893fa9)
  5590 guile RET   kbind 0
  5590 guile CALL  exit(127)

(if you prefer I can provide a full ktrace of guile executing that
test case)

My interpretation is that the sequence of dup2(2) is from
posix_spawn_file_actions_adddup2 in do_spawn, while the strange
close(1023) is from close_inherited_fds_slow.  Such file descriptor is
not open, so close(2) fails with EBADF and the posix_spawn machinery
exits prematurely.  My current RLIMIT_NOFILE is 1024, so the number
would make sense.

On OpenBSD I've tried to use the following patch to work around the
issue:

[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,6 +1325,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 static void
 close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
 {
+  max_fd = getdtablecount();
   while (--max_fd > 2)
     posix_spawn_file_actions_addclose (actions, max_fd);
 }
]]]

getdtablecount(2) returns the number of file descriptor currently open
by the process.  unfortunately it doesn't seem to be portable.  (well,
tbf /proc/self/fd is not portable too.)

However, while this pleases the system* test, it breaks the pipe
tests:

    Running popen.test
    FAIL: popen.test: open-input-pipe: echo hello
    FAIL: popen.test: pipeline - arguments: (expected-value ("HELLO WORLD\n" (0 0)) actual-value ("" (127 0)))

the reason seem to be similar:

 74865 guile    CALL  dup2(7,3)
 74865 guile    RET   dup2 3
 74865 guile    CALL  dup2(10,4)
 74865 guile    RET   dup2 4
 74865 guile    CALL  dup2(2,5)
 74865 guile    RET   dup2 5
 74865 guile    CALL  dup2(3,0)
 74865 guile    RET   dup2 0
 74865 guile    CALL  dup2(4,1)
 74865 guile    RET   dup2 1
 74865 guile    CALL  dup2(5,2)
 74865 guile    RET   dup2 2
 74865 guile    CALL  close(8)
 74865 guile    RET   close -1 errno 9 Bad file descriptor
 74865 guile    CALL  kbind(0x7f7ffffcfa88,24,0x2125923bdf2ca9e)
 74865 guile    RET   kbind 0
 74865 guile    CALL  exit(127)

I guess it's trying to close the fd of the pipe that was closed.

I'm not sure what to do from here, I'm not used to the posix_spawn_*
APIs.  I'm happy to help testing diffs or by providing more info if
needed.


Thanks,

Omar Polo





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-01-27 11:51 bug#61095: possible misuse of posix_spawn API on non-linux OSes Omar Polo
@ 2023-01-27 12:25 ` Omar Polo
  2023-03-28  9:34 ` Ludovic Courtès
  1 sibling, 0 replies; 11+ messages in thread
From: Omar Polo @ 2023-01-27 12:25 UTC (permalink / raw)
  To: 61095

Actually I can avoid the EBADF by checking that the fd is 'live' with
something like fstat:

[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,8 +1325,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 static void
 close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
 {
-  while (--max_fd > 2)
-    posix_spawn_file_actions_addclose (actions, max_fd);
+  struct stat sb;
+  max_fd = getdtablecount();
+  while (--max_fd > 2) {
+    if (fstat(max_fd, &sb) != -1)
+      posix_spawn_file_actions_addclose (actions, max_fd);
+  }
 }
 
 static void

]]]

The regress passes and while this workaround may be temporarly
acceptable I -personally- don't like it much.  There's a reason guile
can't set CLOEXEC for all the file descriptors > 2 obtained via open,
socket, pipe, ... like perl -for example- does?





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-01-27 11:51 bug#61095: possible misuse of posix_spawn API on non-linux OSes Omar Polo
  2023-01-27 12:25 ` Omar Polo
@ 2023-03-28  9:34 ` Ludovic Courtès
  2023-03-28 16:10   ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
  1 sibling, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-28  9:34 UTC (permalink / raw)
  To: Omar Polo; +Cc: Josselin Poiret, Andrew Whatson, 61095

[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]

Hi Omar,

Apologies for the late reply.

Omar Polo <op@omarpolo.com> skribis:

> I've noticed that test-system-cmds fails on OpenBSD-CURRENT while
> testing the update to guile 3.0.9:
>
>     test-system-cmds: system* exit status was 127 rather than 42
>     FAIL: test-system-cmds

We’re seeing the same failure on GNU/Hurd:

  https://issues.guix.gnu.org/61079

> Actually I can avoid the EBADF by checking that the fd is 'live' with
> something like fstat:
>
> [[[
>
> Index: libguile/posix.c
> --- libguile/posix.c.orig
> +++ libguile/posix.c
> @@ -1325,8 +1325,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
>  static void
>  close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
>  {
> -  while (--max_fd > 2)
> -    posix_spawn_file_actions_addclose (actions, max_fd);
> +  struct stat sb;
> +  max_fd = getdtablecount();
> +  while (--max_fd > 2) {
> +    if (fstat(max_fd, &sb) != -1)
> +      posix_spawn_file_actions_addclose (actions, max_fd);
> +  }
>  }

I came up with the following patch:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 3545 bytes --]

diff --git a/libguile/posix.c b/libguile/posix.c
index 3a8be94e4..cde199888 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1322,39 +1322,18 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 #undef FUNC_NAME
 #endif /* HAVE_FORK */
 
-static void
-close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
-{
-  while (--max_fd > 2)
-    posix_spawn_file_actions_addclose (actions, max_fd);
-}
-
 static void
 close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
 {
-  DIR *dirp;
-  struct dirent *d;
-  int fd;
-
-  /* Try to use the platform-specific list of open file descriptors, so
-     we don't need to use the brute force approach. */
-  dirp = opendir ("/proc/self/fd");
-
-  if (dirp == NULL)
-    return close_inherited_fds_slow (actions, max_fd);
-
-  while ((d = readdir (dirp)) != NULL)
+  while (--max_fd > 2)
     {
-      fd = atoi (d->d_name);
-
-      /* Skip "." and "..", garbage entries, stdin/stdout/stderr. */
-      if (fd <= 2)
-        continue;
-
-      posix_spawn_file_actions_addclose (actions, fd);
+      /* Adding invalid file descriptors to an 'addclose' action leads
+         to 'posix_spawn' failures on some operating systems:
+         <https://bugs.gnu.org/61095>.  Hence the extra check.  */
+      int flags = fcntl (max_fd, F_GETFD, NULL);
+      if ((flags >= 0) && ((flags & FD_CLOEXEC) == 0))
+        posix_spawn_file_actions_addclose (actions, max_fd);
     }
-
-  closedir (dirp);
 }
 
 static pid_t
@@ -1366,6 +1345,26 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
   posix_spawn_file_actions_t actions;
   posix_spawnattr_t *attrp = NULL;
 
+  posix_spawn_file_actions_init (&actions);
+
+  /* Duplicate IN, OUT, and ERR unconditionally to clear their
+     FD_CLOEXEC flag, if any.  */
+  posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILENO);
+  posix_spawn_file_actions_adddup2 (&actions, out, STDOUT_FILENO);
+  posix_spawn_file_actions_adddup2 (&actions, err, STDERR_FILENO);
+
+  /* TODO: Use 'closefrom' where available.  */
+#if 0
+  /* Version 2.34 of the GNU libc provides this function.  */
+  posix_spawn_file_actions_addclosefrom_np (&actions, 3);
+#else
+  if (in > 2)
+    posix_spawn_file_actions_addclose (&actions, in);
+  if (out > 2 && out != in)
+    posix_spawn_file_actions_addclose (&actions, out);
+  if (err > 2 && err != out && err != in)
+    posix_spawn_file_actions_addclose (&actions, err);
+
   int max_fd = 1024;
 
 #if defined (HAVE_GETRLIMIT) && defined (RLIMIT_NOFILE)
@@ -1376,31 +1375,8 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
   }
 #endif
 
-  posix_spawn_file_actions_init (&actions);
-
-  int free_fd_slots = 0;
-  int fd_slot[3];
-
-  for (int fdnum = 3; free_fd_slots < 3 && fdnum < max_fd; fdnum++)
-    {
-      if (fdnum != in && fdnum != out && fdnum != err)
-        {
-          fd_slot[free_fd_slots] = fdnum;
-          free_fd_slots++;
-        }
-    }
-
-  /* Move the fds out of the way, so that duplicate fds or fds equal
-     to 0, 1, 2 don't trample each other */
-
-  posix_spawn_file_actions_adddup2 (&actions, in, fd_slot[0]);
-  posix_spawn_file_actions_adddup2 (&actions, out, fd_slot[1]);
-  posix_spawn_file_actions_adddup2 (&actions, err, fd_slot[2]);
-  posix_spawn_file_actions_adddup2 (&actions, fd_slot[0], 0);
-  posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
-  posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
-
   close_inherited_fds (&actions, max_fd);
+#endif
 
   int res = -1;
   if (spawnp)

[-- Attachment #3: Type: text/plain, Size: 1152 bytes --]


Could you confirm that it works on OpenBSD and that there’s no
performance regression?

Andrew: it removes the /proc/self/fd loop you added to fix
<https://bugs.gnu.org/59321>, but it reduces the number of ‘close’ calls
in the child.  Could you check whether that’s okay performance-wise?

Eventually I plan to use ‘posix_spawn_file_actions_addclosefrom_np’ on
glibc >= 2.34, but I have yet to test it.  That will be the best
solution.

Josselin: I simplified the ‘dup2’ logic somewhat.

Feedback welcome!

> The regress passes and while this workaround may be temporarly
> acceptable I -personally- don't like it much.  There's a reason guile
> can't set CLOEXEC for all the file descriptors > 2 obtained via open,
> socket, pipe, ... like perl -for example- does?

Guile does that for file descriptors it opens internally, but
applications using ‘open-file’ without the recently-added “e” flag, or
‘socket’ without ‘SOCK_CLOEXEC’, etc., end up with more file descriptors
that need to be taken care of.

I wish the default were close-on-exec, but we’re not there yet.

Thanks,
Ludo’.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-03-28  9:34 ` Ludovic Courtès
@ 2023-03-28 16:10   ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
  2023-03-29 22:30     ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language @ 2023-03-28 16:10 UTC (permalink / raw)
  To: Ludovic Courtès, Omar Polo; +Cc: Andrew Whatson, 61095

[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> -      posix_spawn_file_actions_addclose (actions, fd);
> +      /* Adding invalid file descriptors to an 'addclose' action leads
> +         to 'posix_spawn' failures on some operating systems:
> +         <https://bugs.gnu.org/61095>.  Hence the extra check.  */
> +      int flags = fcntl (max_fd, F_GETFD, NULL);
> +      if ((flags >= 0) && ((flags & FD_CLOEXEC) == 0))
> +        posix_spawn_file_actions_addclose (actions, max_fd);
>      }

I'm worried about TOCTOU in multi-threaded contexts here, which is why I
opted for the heavy handed-approach here.  In general I don't think we
can know in advance which fdes to close :/ It's a shame that the
posix_spawn actions fails on other kernels, since I don't really see a
way to mitigate this problem (apart from the new
posix_spawn_file_actions_addclosefrom_np as you mentioned).  I don't
know what we could do here.  Maybe not provide spawn?  Or provide it in
spite of the broken fd closing?

> -
> -  closedir (dirp);
>  }
>  
>  static pid_t
> @@ -1366,6 +1345,26 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
>    posix_spawn_file_actions_t actions;
>    posix_spawnattr_t *attrp = NULL;
>  
> +  posix_spawn_file_actions_init (&actions);
> +
> +  /* Duplicate IN, OUT, and ERR unconditionally to clear their
> +     FD_CLOEXEC flag, if any.  */
> +  posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILENO);
> +  posix_spawn_file_actions_adddup2 (&actions, out, STDOUT_FILENO);
> +  posix_spawn_file_actions_adddup2 (&actions, err, STDERR_FILENO);

This won't work, and actually this was one of the original logic bugs I
was trying to fix.  If err is equal to, let's say, STDIN_FILENO, then
the first call will overwrite the initial file descriptor at
STDIN_FILENO, and the second call won't do what the caller intended.
This is why I was moving them out of the way first, so that they would
not overwrite each other.

> +  /* TODO: Use 'closefrom' where available.  */
> +#if 0
> +  /* Version 2.34 of the GNU libc provides this function.  */
> +  posix_spawn_file_actions_addclosefrom_np (&actions, 3);
> +#else
> +  if (in > 2)
> +    posix_spawn_file_actions_addclose (&actions, in);
> +  if (out > 2 && out != in)
> +    posix_spawn_file_actions_addclose (&actions, out);
> +  if (err > 2 && err != out && err != in)
> +    posix_spawn_file_actions_addclose (&actions, err);

Isn't this unneeded given we call close_inherited_fds below?

> [...]
>
> Josselin: I simplified the ‘dup2’ logic somewhat.

See my comments above.

> Guile does that for file descriptors it opens internally, but
> applications using ‘open-file’ without the recently-added “e” flag, or
> ‘socket’ without ‘SOCK_CLOEXEC’, etc., end up with more file descriptors
> that need to be taken care of.
>
> I wish the default were close-on-exec, but we’re not there yet.

+1

Best,
-- 
Josselin Poiret

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 682 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-03-28 16:10   ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
@ 2023-03-29 22:30     ` Ludovic Courtès
  2023-03-29 22:30       ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
  2023-03-30 20:21       ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
  0 siblings, 2 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
  To: Josselin Poiret; +Cc: 61095, Andrew Whatson, Omar Polo

Hi!

Josselin Poiret <dev@jpoiret.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> -      posix_spawn_file_actions_addclose (actions, fd);
>> +      /* Adding invalid file descriptors to an 'addclose' action leads
>> +         to 'posix_spawn' failures on some operating systems:
>> +         <https://bugs.gnu.org/61095>.  Hence the extra check.  */
>> +      int flags = fcntl (max_fd, F_GETFD, NULL);
>> +      if ((flags >= 0) && ((flags & FD_CLOEXEC) == 0))
>> +        posix_spawn_file_actions_addclose (actions, max_fd);
>>      }
>
> I'm worried about TOCTOU in multi-threaded contexts here,

Yes, that’s a problem.  The current /proc/self/fd optimization has that
problem too.  :-/

> which is why I opted for the heavy handed-approach here.  In general I
> don't think we can know in advance which fdes to close :/ It's a shame
> that the posix_spawn actions fails on other kernels, since I don't
> really see a way to mitigate this problem (apart from the new
> posix_spawn_file_actions_addclosefrom_np as you mentioned).  I don't
> know what we could do here.  Maybe not provide spawn?  Or provide it
> in spite of the broken fd closing?

Not providing ‘spawn’ is no longer an option.

We can expect the problem to practically vanish “soon” on GNU variants
with ‘closefrom’ (glibc 2.34 was released in Aug. 2021).

On Linux with glibc < 2.34, we’d keep the current code (maybe without
the /proc/self/fd optimization?).

On other systems, we can have the racy code above as a last resort or
OS-specific tricks, like Omar was suggesting for OpenBSD.  It sucks but
what else can we do?

(BTW,
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn_file_actions_addclose.html>
reads:

  It shall not be considered an error for the fildes argument passed to
  these functions to specify a file descriptor for which the specified
  operation could not be performed at the time of the call. Any such
  error will be detected when the associated file actions object is
  later used during a posix_spawn() or posix_spawnp() operation.

OpenBSD and GNU/Hurd follow this to the letter.

OTOH ‘linux/spawni.c’ in glibc is purposefully more liberal:

  /* Signal errors only for file descriptors out of range.  */
)

>> +  /* Duplicate IN, OUT, and ERR unconditionally to clear their
>> +     FD_CLOEXEC flag, if any.  */
>> +  posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILENO);
>> +  posix_spawn_file_actions_adddup2 (&actions, out, STDOUT_FILENO);
>> +  posix_spawn_file_actions_adddup2 (&actions, err, STDERR_FILENO);
>
> This won't work, and actually this was one of the original logic bugs I
> was trying to fix.  If err is equal to, let's say, STDIN_FILENO, then
> the first call will overwrite the initial file descriptor at
> STDIN_FILENO, and the second call won't do what the caller intended.
> This is why I was moving them out of the way first, so that they would
> not overwrite each other.

Oh, my bad.

>> +  /* TODO: Use 'closefrom' where available.  */
>> +#if 0
>> +  /* Version 2.34 of the GNU libc provides this function.  */
>> +  posix_spawn_file_actions_addclosefrom_np (&actions, 3);
>> +#else
>> +  if (in > 2)
>> +    posix_spawn_file_actions_addclose (&actions, in);
>> +  if (out > 2 && out != in)
>> +    posix_spawn_file_actions_addclose (&actions, out);
>> +  if (err > 2 && err != out && err != in)
>> +    posix_spawn_file_actions_addclose (&actions, err);
>
> Isn't this unneeded given we call close_inherited_fds below?

No, because of the FD_CLOEXEC selection.

Coming next is an updated patch series addressing this as proposed
above.  Let me know what y’all think!

I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:

  guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm

… which gives us glibc 2.35.

Ludo’.





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems.
  2023-03-29 22:30     ` Ludovic Courtès
@ 2023-03-29 22:30       ` Ludovic Courtès
  2023-03-29 22:30         ` bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn' Ludovic Courtès
  2023-03-29 22:30         ` bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available Ludovic Courtès
  2023-03-30 20:21       ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
  1 sibling, 2 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
  To: 61095; +Cc: Ludovic Courtès, Josselin Poiret, Omar Polo, Andrew Whatson

Fixes <https://bugs.gnu.org/61095>.
Reported by Omar Polo <op@omarpolo.com>.

* libguile/posix.c (close_inherited_fds_slow): On systems other than
GNU/Linux, call 'addclose' only when 'fcntl' succeeds on MAX_FD.
---
 libguile/posix.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/libguile/posix.c b/libguile/posix.c
index 3a8be94e4..68e9bfade 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1326,7 +1326,24 @@ static void
 close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
 {
   while (--max_fd > 2)
-    posix_spawn_file_actions_addclose (actions, max_fd);
+    {
+      /* Adding a 'close' action for a file descriptor that is not open
+         causes 'posix_spawn' to fail on GNU/Hurd and on OpenBSD, but
+         not on GNU/Linux: <https://bugs.gnu.org/61095>.  Hence this
+         strategy:
+
+           - On GNU/Linux, close every FD, since that's the only
+             race-free way to make sure the child doesn't inherit one.
+           - On other systems, only close FDs currently open in the
+             parent; it works, but it's racy (XXX).
+
+         The only reliable option is 'addclosefrom'.  */
+#if ! (defined __GLIBC__ && defined __linux__)
+      int flags = fcntl (max_fd, F_GETFD, NULL);
+      if (flags >= 0)
+#endif
+        posix_spawn_file_actions_addclose (actions, max_fd);
+    }
 }
 
 static void

base-commit: e334e59589c3cbfc68d3f7d0d739000e0876b36d
-- 
2.39.2






^ permalink raw reply related	[flat|nested] 11+ messages in thread

* bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn'.
  2023-03-29 22:30       ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
@ 2023-03-29 22:30         ` Ludovic Courtès
  2023-03-29 22:30         ` bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available Ludovic Courtès
  1 sibling, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
  To: 61095; +Cc: Ludovic Courtès, Josselin Poiret, Omar Polo, Andrew Whatson

This reverts 9332b632407894c2e1951cce1bc678f19e1fa8e4, thereby
reinstating the performance issue in <https://bugs.gnu.org/59321>.

This optimization was subject to race conditions in multi-threaded code:
new file descriptors could pop up at any time and thus leak in the
child.

* libguile/posix.c (close_inherited_fds): Remove.
(close_inherited_fds_slow): Rename to...
(close_inherited_fds): ... this.
---
 libguile/posix.c | 30 +-----------------------------
 1 file changed, 1 insertion(+), 29 deletions(-)

diff --git a/libguile/posix.c b/libguile/posix.c
index 68e9bfade..b5830c43b 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1323,7 +1323,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 #endif /* HAVE_FORK */
 
 static void
-close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
+close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
 {
   while (--max_fd > 2)
     {
@@ -1346,34 +1346,6 @@ close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
     }
 }
 
-static void
-close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
-{
-  DIR *dirp;
-  struct dirent *d;
-  int fd;
-
-  /* Try to use the platform-specific list of open file descriptors, so
-     we don't need to use the brute force approach. */
-  dirp = opendir ("/proc/self/fd");
-
-  if (dirp == NULL)
-    return close_inherited_fds_slow (actions, max_fd);
-
-  while ((d = readdir (dirp)) != NULL)
-    {
-      fd = atoi (d->d_name);
-
-      /* Skip "." and "..", garbage entries, stdin/stdout/stderr. */
-      if (fd <= 2)
-        continue;
-
-      posix_spawn_file_actions_addclose (actions, fd);
-    }
-
-  closedir (dirp);
-}
-
 static pid_t
 do_spawn (char *exec_file, char **exec_argv, char **exec_env,
           int in, int out, int err, int spawnp)
-- 
2.39.2






^ permalink raw reply related	[flat|nested] 11+ messages in thread

* bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available.
  2023-03-29 22:30       ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
  2023-03-29 22:30         ` bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn' Ludovic Courtès
@ 2023-03-29 22:30         ` Ludovic Courtès
  1 sibling, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-03-29 22:30 UTC (permalink / raw)
  To: 61095; +Cc: Ludovic Courtès, Josselin Poiret, Omar Polo, Andrew Whatson

* configure.ac: Check for 'posix_spawn_file_actions_addclosefrom_np'.
* libguile/posix.c (HAVE_ADDCLOSEFROM): New macro.
(close_inherited_fds): Wrap in #ifdef HAVE_ADDCLOSEFROM.
(do_spawn) [HAVE_ADDCLOSEFROM]: Use 'posix_spawn_file_actions_addclosefrom_np'.
---
 configure.ac     |  4 +++-
 libguile/posix.c | 14 ++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index d5ce1c4ac..4a93be979 100644
--- a/configure.ac
+++ b/configure.ac
@@ -515,6 +515,7 @@ AC_CHECK_HEADERS([crt_externs.h])
 #   sched_getaffinity, sched_setaffinity - GNU extensions (glibc)
 #   sendfile - non-POSIX, found in glibc
 #   pipe2 - non-POSIX, found in glibc (GNU/Linux and GNU/Hurd)
+#   posix_spawn_file_actions_addclosefrom_np - glibc >= 2.34
 #
 AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid         \
   fesetround ftime ftruncate fchown fchownat fchmod fchdir readlinkat	\
@@ -528,7 +529,8 @@ AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid         \
   index bcopy rindex truncate isblank _NSGetEnviron              \
   strcoll_l strtod_l strtol_l newlocale uselocale utimensat     \
   fstatat futimens openat						\
-  sched_getaffinity sched_setaffinity sendfile pipe2])
+  sched_getaffinity sched_setaffinity sendfile pipe2
+  posix_spawn_file_actions_addclosefrom_np])
 
 # The newlib C library uses _NL_ prefixed locale langinfo constants.
 AC_CHECK_DECLS([_NL_NUMERIC_GROUPING], [], [], [[#include <langinfo.h>]])
diff --git a/libguile/posix.c b/libguile/posix.c
index b5830c43b..3adc743c4 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1322,6 +1322,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 #undef FUNC_NAME
 #endif /* HAVE_FORK */
 
+#ifdef HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCLOSEFROM_NP
+# define HAVE_ADDCLOSEFROM 1
+#endif
+
+#ifndef HAVE_ADDCLOSEFROM
+
 static void
 close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
 {
@@ -1346,6 +1352,8 @@ close_inherited_fds (posix_spawn_file_actions_t *actions, int max_fd)
     }
 }
 
+#endif
+
 static pid_t
 do_spawn (char *exec_file, char **exec_argv, char **exec_env,
           int in, int out, int err, int spawnp)
@@ -1389,7 +1397,13 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
   posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
   posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
 
+#ifdef HAVE_ADDCLOSEFROM
+  /* This function appears in glibc 2.34.  It's both free from race
+     conditions and more efficient than the alternative.  */
+  posix_spawn_file_actions_addclosefrom_np (&actions, 3);
+#else
   close_inherited_fds (&actions, max_fd);
+#endif
 
   int res = -1;
   if (spawnp)
-- 
2.39.2






^ permalink raw reply related	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-03-29 22:30     ` Ludovic Courtès
  2023-03-29 22:30       ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
@ 2023-03-30 20:21       ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
  2023-03-31 17:45         ` Omar Polo
  1 sibling, 1 reply; 11+ messages in thread
From: Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language @ 2023-03-30 20:21 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 61095, Andrew Whatson, Omar Polo

[-- Attachment #1: Type: text/plain, Size: 448 bytes --]

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> Coming next is an updated patch series addressing this as proposed
> above.  Let me know what y’all think!
>
> I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
>
>   guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm

I didn't test, but this LGTM!  Maybe someone on OpenBSD could test this
patchset?

Best,
-- 
Josselin Poiret

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 682 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-03-30 20:21       ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
@ 2023-03-31 17:45         ` Omar Polo
  2023-04-02 13:44           ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Omar Polo @ 2023-03-31 17:45 UTC (permalink / raw)
  To: Josselin Poiret; +Cc: Ludovic Courtès, Andrew Whatson, 61095

On 2023/03/30 22:21:28 +0200, Josselin Poiret <dev@jpoiret.xyz> wrote:
> Hi Ludo,
> 
> Ludovic Courtès <ludo@gnu.org> writes:
> 
> > Coming next is an updated patch series addressing this as proposed
> > above.  Let me know what y’all think!
> >
> > I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
> >
> >   guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
> 
> I didn't test, but this LGTM!  Maybe someone on OpenBSD could test this
> patchset?

    % gmake check
    <snip />
    gmake[5]: Entering directory '/home/op/w/guile/test-suite/standalone'
    PASS: test-system-cmds

it seems to work on OpenBSD 7.3 :)

but note that our libc doesn't have posix_spawn_file_actions_addclosefrom_np,
so this is using the "racy" code path.

Just for curiosity, as it's outside the scope of the bug, what's the
reason posix_spawn was used instead of a more classic fork() +
closefrom()?


Thanks,

Omar Polo





^ permalink raw reply	[flat|nested] 11+ messages in thread

* bug#61095: possible misuse of posix_spawn API on non-linux OSes
  2023-03-31 17:45         ` Omar Polo
@ 2023-04-02 13:44           ` Ludovic Courtès
  0 siblings, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2023-04-02 13:44 UTC (permalink / raw)
  To: Omar Polo; +Cc: Josselin Poiret, Andrew Whatson, 61095-done

Hi!

Omar Polo <op@omarpolo.com> skribis:

> On 2023/03/30 22:21:28 +0200, Josselin Poiret <dev@jpoiret.xyz> wrote:
>> Hi Ludo,
>> 
>> Ludovic Courtès <ludo@gnu.org> writes:
>> 
>> > Coming next is an updated patch series addressing this as proposed
>> > above.  Let me know what y’all think!
>> >
>> > I tested the ‘posix_spawn_file_actions_addclosefrom_np’ path by building in:
>> >
>> >   guix time-machine --branch=core-updates -- shell -CP -D -f guix.scm
>> 
>> I didn't test, but this LGTM!  Maybe someone on OpenBSD could test this
>> patchset?
>
>     % gmake check
>     <snip />
>     gmake[5]: Entering directory '/home/op/w/guile/test-suite/standalone'
>     PASS: test-system-cmds
>
> it seems to work on OpenBSD 7.3 :)

Awesome!  Pushed as 9cc85a4f52147fcdaa4c52a62bcc87bdb267d0a9.

> but note that our libc doesn't have posix_spawn_file_actions_addclosefrom_np,
> so this is using the "racy" code path.

Yeah, not great.  :-/  I hope that function will be adopted by other
libcs, especially since ‘closefrom’ is already available.

> Just for curiosity, as it's outside the scope of the bug, what's the
> reason posix_spawn was used instead of a more classic fork() +
> closefrom()?

There’s a long discussion at:

  https://issues.guix.gnu.org/52835

Essentially, ‘fork’ is unusable in multi-threaded context, in addition
to being inefficient.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-04-02 13:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-27 11:51 bug#61095: possible misuse of posix_spawn API on non-linux OSes Omar Polo
2023-01-27 12:25 ` Omar Polo
2023-03-28  9:34 ` Ludovic Courtès
2023-03-28 16:10   ` Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
2023-03-29 22:30     ` Ludovic Courtès
2023-03-29 22:30       ` bug#61095: [PATCH 1/3] 'spawn' closes only open file descriptors on non-GNU/Linux systems Ludovic Courtès
2023-03-29 22:30         ` bug#61095: [PATCH 2/3] Remove racy optimized file descriptor closing loop in 'spawn' Ludovic Courtès
2023-03-29 22:30         ` bug#61095: [PATCH 3/3] Use 'posix_spawn_file_actions_addclosefrom_np' where available Ludovic Courtès
2023-03-30 20:21       ` bug#61095: possible misuse of posix_spawn API on non-linux OSes Josselin Poiret via Bug reports for GUILE, GNU's Ubiquitous Extension Language
2023-03-31 17:45         ` Omar Polo
2023-04-02 13:44           ` Ludovic Courtès

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).