unofficial mirror of bug-guix@gnu.org 
From: ludo@gnu.org (Ludovic Courtès)
To: David Thompson <davet@gnu.org>
Cc: 21694@debbugs.gnu.org
Subject: bug#21694: 'clone' syscall binding unreliable
Date: Fri, 16 Oct 2015 22:39:59 +0200
Message-ID: <87zizio8dc.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1062 bytes --]

I’m reporting the problem and (hopefully) the solution, but I think we’d
better double-check this.

The problem: running the test below in a loop sometimes triggers a SIGSEGV
in the child process (on x86_64, libc 2.22).

--8<---------------cut here---------------start------------->8---
(use-modules (guix build syscalls) (ice-9 match))

(match (clone (logior CLONE_NEWUSER
                      CLONE_CHILD_SETTID
                      CLONE_CHILD_CLEARTID
                      SIGCHLD))
  (0
   (throw 'x))                                    ;XXX: sometimes segfaults
  (pid
   (match (waitpid pid)
     ((_ . status)
      (pk 'status status)
      (exit (not (status:term-sig status)))))))
--8<---------------cut here---------------end--------------->8---

Looking at (guix build syscalls), though, I see two ABI mismatches: our FFI
declaration of ‘syscall’ (int, int, pointer) does not match the actual
variadic C function, and the arguments we pass for ‘clone’ do not match what
the actual C function expects.
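
For comparison, here is a minimal C sketch of a fork-like raw ‘clone’ call
that goes through ‘syscall’ with every argument passed explicitly as a long-
or pointer-sized value.  It assumes Linux and a libc that provides
‘syscall’ and SYS_clone; the helper name ‘raw_clone_exit_code’ is
hypothetical and is not part of Guix or libc.

```c
#define _GNU_SOURCE
#include <signal.h>       /* SIGCHLD */
#include <sys/syscall.h>  /* SYS_clone */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>       /* syscall, _exit */

/* Hypothetical helper, not part of Guix: fork-like raw clone.
   Returns the child's exit code (42 when all goes well), or -1 on error.  */
static int raw_clone_exit_code(unsigned long flags)
{
    int status;

    /* Six arguments in total, all long- or pointer-sized, so nothing is
       left undefined or truncated by the variadic call.  */
    long pid = syscall((long) SYS_clone, flags,
                       (void *) 0,              /* child stack: keep the parent's */
                       (void *) 0, (void *) 0,  /* ptid and ctid */
                       (void *) 0);             /* unused here */
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(42);          /* child: leave at once, skipping atexit handlers */

    /* Parent: SIGCHLD in FLAGS makes the child waitable like a fork child.  */
    if (waitpid((pid_t) pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling ‘raw_clone_exit_code (SIGCHLD)’ should behave like a plain fork;
namespace flags such as CLONE_NEWUSER can be OR'ed into the argument where
the kernel configuration allows them.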

This leads to the attached patch, which also fixes the above problem for me.


[-- Attachment #2: Type: text/x-patch, Size: 1913 bytes --]

diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
index 80b9d00..f931f8d 100644
--- a/guix/build/syscalls.scm
+++ b/guix/build/syscalls.scm
@@ -322,10 +322,16 @@ string TMPL and return its file name.  TMPL must end with 'XXXXXX'."
 (define CLONE_NEWNET         #x40000000)
 
 ;; The libc interface to sys_clone is not useful for Scheme programs, so the
-;; low-level system call is wrapped instead.
+;; low-level system call is wrapped instead.  The 'syscall' function is
+;; declared in <unistd.h> as a variadic function; in practice, it expects 6
+;; pointer-sized arguments, as shown in, e.g., x86_64/syscall.S.
 (define clone
   (let* ((ptr        (dynamic-func "syscall" (dynamic-link)))
-         (proc       (pointer->procedure int ptr (list int int '*)))
+         (proc       (pointer->procedure long ptr
+                                         (list long                   ;sysno
+                                               unsigned-long          ;flags
+                                               '* '* '*
+                                               '*)))
          ;; TODO: Don't do this.
          (syscall-id (match (utsname:machine (uname))
                        ("i686"   120)
@@ -336,7 +342,10 @@ string TMPL and return its file name.  TMPL must end with 'XXXXXX'."
       "Create a new child process by duplicating the current parent process.
 Unlike the fork system call, clone accepts FLAGS that specify which resources
 are shared between the parent and child processes."
-      (let ((ret (proc syscall-id flags %null-pointer))
+      (let ((ret (proc syscall-id flags
+                       %null-pointer               ;child stack
+                       %null-pointer %null-pointer ;ptid & ctid
+                       %null-pointer))             ;unused
             (err (errno)))
         (if (= ret -1)
             (throw 'system-error "clone" "~d: ~A"

[-- Attachment #3: Type: text/plain, Size: 833 bytes --]


Could you test this patch?

Now, there remains the question of CLONE_CHILD_SETTID and
CLONE_CHILD_CLEARTID.  Since we’re passing NULL for ‘ctid’, I expect
that these flags have no effect at all.

Conversely, libc uses these flags to update the thread ID in the child
process (x86_64/arch-fork.h):

--8<---------------cut here---------------start------------->8---
#define ARCH_FORK() \
  INLINE_SYSCALL (clone, 4,                                                   \
                  CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0,     \
                  NULL, &THREAD_SELF->tid)
--8<---------------cut here---------------end--------------->8---

This is certainly useful, but we’d have trouble doing it from the FFI…
It may be that this is fine if the process doesn’t use threads.
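
To make the role of ‘ctid’ concrete, here is a small C sketch, assuming
x86_64 Linux, where the raw system call takes its arguments in the order
flags, stack, ptid, ctid, tls (other architectures order them differently).
The helper name ‘child_tid_was_set’ and the ‘tid_slot’ variable are
hypothetical.  The child checks whether the kernel stored its thread ID in
the slot that ‘ctid’ points to.

```c
#define _GNU_SOURCE
#include <linux/sched.h>  /* CLONE_CHILD_SETTID, CLONE_CHILD_CLEARTID */
#include <signal.h>       /* SIGCHLD */
#include <sys/syscall.h>  /* SYS_clone, SYS_gettid */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>       /* syscall, _exit */

/* Hypothetical helper.  Returns 1 if the child found its own thread ID in
   the ctid slot, 0 if the slot was left untouched, and -1 on error.
   Assumes the x86_64 argument order: flags, stack, ptid, ctid, tls.  */
static int child_tid_was_set(int pass_ctid)
{
    static pid_t tid_slot;
    int status;
    unsigned long flags = CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;

    tid_slot = 0;  /* sentinel: no valid thread ID is 0 */
    long pid = syscall((long) SYS_clone, flags,
                       (void *) 0,                                   /* stack */
                       (void *) 0,                                   /* ptid */
                       pass_ctid ? (void *) &tid_slot : (void *) 0,  /* ctid */
                       (void *) 0);                                  /* tls */
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: compare the slot in its own copy of memory with the
           thread ID reported by the kernel.  */
        pid_t me = (pid_t) syscall(SYS_gettid);
        _exit(tid_slot == me ? 1 : 0);
    }

    if (waitpid((pid_t) pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With a non-NULL ‘ctid’ the child should find its thread ID in the slot; with
NULL the slot keeps its sentinel value, which matches the expectation above
that the flags have no effect in that case.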

Ludo’.
