unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
From: Andrew Whatson <whatson@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: hylophile@posteo.de, 59321@debbugs.gnu.org
Subject: bug#59321: ice-9's open-input-pipe is unexpectedly slow on some systems
Date: Mon, 21 Nov 2022 14:22:52 +1000	[thread overview]
Message-ID: <CAPE069cGWUWXMChrB86MBXKvNXFMUZUJnv=qkj-aDxSUTv3ygA@mail.gmail.com> (raw)
In-Reply-To: <87r0xx5yja.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1947 bytes --]

Ludovic Courtès <ludo@gnu.org> wrote:
>
> Andrew Whatson <whatson@gmail.com> skribis:
>
> > Forcibly closing file descriptors like this shouldn't be necessary if
> > the application has properly opened descriptors with the FD_CLOEXEC
> > flag.  It would be good to get input from some more experienced Guile
> > hackers on the potential consequences of this change.
>
> Libguile opens all its own file descriptors at O_CLOEXEC (one omission
> was recently fixed in 0aa1a9976fc3c6af4d1087e59d728cb8fe7d369a) so it
> may be possible to remove that FD-closing loop.  There’s still the
> possibility that application bug unwillingly leaks FDs, but we could
> consider it’s none of our business.
>
> Thoughts?

I agree with this approach in principle, but from what Tomas is saying
it seems like it's not currently possible for applications to do the
right thing in all cases.

> Similarly, with commit a356ceebee000efe91a2a16dbcaa64d6c6a3a922, it’s
> possible to pass ‘open-file’ a flag that corresponds to O_CLOEXEC,
> which wasn’t possible before.

Nice!

> I’ve also been thinking that files opened with ‘call-with-*’ should be
> O_CLOEXEC.  That’d be an incompatible change though, so maybe not
> something for 3.0.x.

This sounds reasonable to me.

We also need equivalent functionality around SOCK_CLOEXEC.  It seems
this is implemented for ‘accept’, but not ‘socket’ or ‘socketpair’.

Python's PEP 433 contains a good explanation of the issues which can
arise from leaked file descriptors:
https://peps.python.org/pep-0433/#inherited-file-descriptors-issues

Given the risks, I'm convinced that Guile's conservative approach is
actually quite sensible.  It seems like the best path forward would be
to implement platform-specific optimizations where possible.

I've attached a draft patch which implements a fast-path on systems
where "/proc/self/fd" is available.

[-- Attachment #2: close-inherited-with-proc-self-fd.patch --]
[-- Type: text/x-patch, Size: 2573 bytes --]

commit 08943cae90545dddea44ca55eab68047e5ae2f9d
Author: Andrew Whatson <whatson@gmail.com>
Date:   Mon Nov 21 13:40:33 2022 +1000

    Reduce redundant close() calls when forking on some systems.
    
    Some systems provide "/proc/self/fd" which is a directory containing an
    entry for each open file descriptor in the current process.  We use this
    to limit the number of close() calls needed to ensure file descriptors
    aren't leaked to the child process when forking.
    
    * libguile/posix.c (close_inherited_fds_slow):
    (close_inherited_fds): New static helper functions.
    (start_child): Attempt to close inherited file descriptors efficiently
    using 'close_inherited_fds', falling back to the brute-force approach in
    'close_inherited_fds_slow'.

diff --git a/libguile/posix.c b/libguile/posix.c
index b5352c2c4..fc3512054 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -24,6 +24,7 @@
 #  include <config.h>
 #endif
 
+#include <dirent.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <stdio.h>
@@ -1337,6 +1338,46 @@ renumber_file_descriptor (int fd, int err)
 }
 #endif /* HAVE_FORK */
 
+static void
+close_inherited_fds_slow(int max_fd, int in, int out, int err)
+{
+  while (max_fd--)
+    if (max_fd != in && max_fd != out && max_fd != err)
+      close (max_fd);
+}
+
+static void
+close_inherited_fds(int max_fd, int in, int out, int err)
+{
+  DIR *dirp;
+  struct dirent *d;
+  int fd;
+
+  /* Try to use the platform-specific list of open file descriptors, so
+     we don't need to use the brute force approach. */
+  dirp = opendir ("/proc/self/fd");
+
+  if (dirp == NULL)
+    return close_inherited_fds_slow (max_fd, in, out, err);
+
+  while ((d = readdir (dirp)) != NULL)
+    {
+      fd = atoi (d->d_name);
+
+      /* Skip "." and "..", and any garbage entries. */
+      if (fd <= 0)
+        continue;
+
+      /* Keep in/out/err open. */
+      if (fd == in || fd == out || fd == err)
+        continue;
+
+      close (fd);
+    }
+
+  closedir (dirp);
+}
+
 #ifdef HAVE_FORK
 #define HAVE_START_CHILD 1
 /* Since Guile uses threads, we have to be very careful to avoid calling
@@ -1373,9 +1414,7 @@ start_child (const char *exec_file, char **exec_argv,
 
   /* Close all file descriptors in ports inherited from the parent
      except for in, out, and err.  Heavy-handed, but robust.  */
-  while (max_fd--)
-    if (max_fd != in && max_fd != out && max_fd != err)
-      close (max_fd);
+  close_inherited_fds (max_fd, in, out, err);
 
   /* Ignore errors on these open() calls.  */
   if (in == -1)

  parent reply	other threads:[~2022-11-21  4:22 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-16 18:06 bug#59321: ice-9's open-input-pipe is unexpectedly slow on some systems hylophile
2022-11-18  4:49 ` Andrew Whatson
2022-11-18  5:52   ` tomas
2022-11-20 17:24   ` Ludovic Courtès
2022-11-20 19:00     ` tomas
2022-11-21  4:22     ` Andrew Whatson [this message]
2022-11-26 14:39       ` Ludovic Courtès
2022-12-08 12:02         ` Andrew Whatson
2022-12-08 14:34           ` Ludovic Courtès
2023-01-13 15:37           ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPE069cGWUWXMChrB86MBXKvNXFMUZUJnv=qkj-aDxSUTv3ygA@mail.gmail.com' \
    --to=whatson@gmail.com \
    --cc=59321@debbugs.gnu.org \
    --cc=hylophile@posteo.de \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).