unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: "Thompson, David" <dthompson2@worcester.edu>
Cc: 21694@debbugs.gnu.org, David Thompson <davet@gnu.org>
Subject: bug#21694: 'clone' syscall binding unreliable
Date: Sat, 17 Oct 2015 12:14:34 +0200	[thread overview]
Message-ID: <87k2qlls39.fsf@gnu.org> (raw)
In-Reply-To: <CAJ=RwfZvF3p1xHJ3nvPnS0ULTjL2rVnOPJXHJKEpiE=e7VPLzw@mail.gmail.com> (David Thompson's message of "Fri, 16 Oct 2015 19:12:53 -0400")

"Thompson, David" <dthompson2@worcester.edu> skribis:

> On Fri, Oct 16, 2015 at 4:39 PM, Ludovic Courtès <ludo@gnu.org> wrote:
>> I’m reporting the problem and (hopefully) the solution, but I think we’d
>> better double-check this.
>>
>> The problem: Running the test below in a loop sometimes gets a SIGSEGV
>> in the child process (on x86_64, libc 2.22.)
>>
>> --8<---------------cut here---------------start------------->8---
>> (use-modules (guix build syscalls) (ice-9 match))
>>
>> (match (clone (logior CLONE_NEWUSER
>>                       CLONE_CHILD_SETTID
>>                       CLONE_CHILD_CLEARTID
>>                       SIGCHLD))
>>   (0
>>    (throw 'x))                                    ;XXX: sometimes segfaults
>>   (pid
>>    (match (waitpid pid)
>>      ((_ . status)
>>       (pk 'status status)
>>       (exit (not (status:term-sig status)))))))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Looking at (guix build syscalls) though, I see an ABI mismatch between
>> our definition and the actual ‘syscall’ C function, and between our
>> ‘clone’ definition and the actual C function.
>>
>> This leads to the attached patch, which also fixes the above problem for me.
>>
>> Could you test this patch?
>
> The patch looks good.  Thanks for catching this!

Great, pushed as 0e3cc31.

>> Now, there remains the question of CLONE_CHILD_SETTID and
>> CLONE_CHILD_CLEARTID.  Since we’re passing NULL for ‘ctid’, I expect
>> that these flags have no effect at all.
>
> I added those flags in commit ee78d02 because they solved a real issue
> I ran into.  Adding those flags made 'clone' look like a
> 'primitive-fork' call when examined with strace.

Could you check whether removing these flags makes a difference now?
How can we test?  (Preferably not at the REPL.)

>> Conversely, libc uses these flags to update the thread ID in the child
>> process (x86_64/arch-fork.h):
>>
>> --8<---------------cut here---------------start------------->8---
>> #define ARCH_FORK() \
>>   INLINE_SYSCALL (clone, 4,                                                   \
>>                   CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0,     \
>>                   NULL, &THREAD_SELF->tid)
>> --8<---------------cut here---------------end--------------->8---
>>
>> This is certainly useful, but we’d have troubles doing it from the FFI…
>> It may that this is fine if the process doesn’t use threads.
>
> Right, so here's what 'primitive-fork' does:
>
>     clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7fc5398cea10) = 13247
>
> Here's what 'clone' does:
>
>     clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0)
> = 14038

You mean ‘clone’ from libc?

I guess CLONE_CHILD_{CLEARTID,SETTID} don’t hurt here, but they have no
effect either.  That’s what the clone(2) page suggests:

   CLONE_CHILD_CLEARTID (since Linux 2.5.49)
          Erase child thread ID at location ctid in child memory when
          the child exits, and do a wakeup on the futex at that address.
          The address involved may be changed by the set_tid_address(2)
          system call.  This is used by threading libraries.

   CLONE_CHILD_SETTID (since Linux 2.5.49)
          Store child thread ID at location ctid in child memory.

And here ctid == NULL.

And indeed, kernel/fork.c in Linux does:

    p->set_child_tid = (clone_flags & CLONE_CHILD_SETTID) ? child_tidptr : NULL;
    /*
     * Clear TID on mm_release()?
     */
    p->clear_child_tid = (clone_flags & CLONE_CHILD_CLEARTID) ? child_tidptr : NULL;

So in effect, using NULL for ctid equates to not passing the
CLEARTID/SETTID flags.

QED.  :-)

> In practice it may not be a problem since most of the time you'd
> 'exec' after cloning.  Is there any reliable way to get a hold of
> whatever THREAD_SELF is? 

THREAD_SELF is really not something we want to poke at; quoth
x86_64/tls.h:

--8<---------------cut here---------------start------------->8---
# define THREAD_SELF \
  ({ struct pthread *__self;						      \
     asm ("mov %%fs:%c1,%0" : "=r" (__self)				      \
	  : "i" (offsetof (struct pthread, header.self)));	 	      \
     __self;})
--8<---------------cut here---------------end--------------->8---

> I wish the libc 'clone' function didn't have that silly callback and
> behaved like 'fork', then we could have avoided these issues
> altogether.

Is the callback really an issue?  We have ‘procedure->pointer’ after
all.

Thanks,
Ludo’.

  reply	other threads:[~2015-10-17 10:15 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-16 20:39 bug#21694: 'clone' syscall binding unreliable Ludovic Courtès
2015-10-16 23:12 ` Thompson, David
2015-10-17 10:14   ` Ludovic Courtès [this message]
2015-10-28 14:39     ` Ludovic Courtès
2015-10-22 14:38 ` Mark H Weaver
2015-10-25 20:59   ` Ludovic Courtès
2015-10-28  4:53     ` Mark H Weaver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k2qlls39.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=21694@debbugs.gnu.org \
    --cc=davet@gnu.org \
    --cc=dthompson2@worcester.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).