unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: Mark H Weaver <mhw@netris.org>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Andy Wingo <wingo@igalia.com>, 28211@debbugs.gnu.org
Subject: bug#28211: Grafting code triggers GC/thread-safety issue on Guile 2.2.2
Date: Tue, 08 May 2018 20:32:26 -0400	[thread overview]
Message-ID: <87d0y5k6sl.fsf@netris.org> (raw)
In-Reply-To: <87fu3124nt.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Tue, 08 May 2018 23:55:50 +0200")

Hi Ludovic,

ludo@gnu.org (Ludovic Courtès) writes:
[...]
> Thread 1 (Thread 0x7f6fe6f5d700 (LWP 2856)):
> #0  0x00007f7019db0d79 in scm_is_pair (x=<error reading variable: ERROR: Cannot access memory at address 0x0>0x0) at ../libguile/pairs.h:159
> #1  scm_ilength (sx=<optimized out>) at list.c:190
[...]
> What this means is that Thread 1 gets NULL instead of a list as its
> on-stack argument (vm-engine.c:909 is ‘tail-apply’).
>
> How can arguments on the VM stack be zeroed?

I doubt that's what happened, because I expect that each VM stack is
dedicated to a single hardware thread.  In theory, if a single VM stack
is used by one thread, and then later used by another thread,
thread-safety issues on the VM stack could arise in the absense of
proper thread synchronization.

However, I think it's _far_ more likely that the NULL argument on the
stack was copied from memory shared by multiple threads without proper
thread synchronization.

On modern weakly-ordered memory architectures, writes by one thread may
be seen in a different order by another thread.  For example, if one
thread allocates a pair, initializes its CAR and CDR to non-NULL values,
and then writes the address of the pair somewhere, another thread could
first read the address of the pair, and then read NULL from its CAR or
CDR, unless proper thread synchronization is used.  At best, this
requires memory barriers in both the reader and writer which are
typically quite expensive.  On x86 processors the read barrier could
expand into a no-op in typical cases, but the write barrier cannot be
avoided, and must be placed after initializing the structure and before
writing its pointer.

I think it's most likely that something like this is happening, because
NULL is not a valid SCM value.  The only code that should normally write
a NULL to an SCM slot is Boehm GC, which clears memory before handing it
to the user.  So, if you read a NULL from a memory location that's meant
to hold an SCM, then it's most likely because the reading thread does
not yet see the writes that initialized it, because of the
weakly-ordered memory architecture.

> I commented out the MADV_DONTNEED call to be sure, but I can still
> reproduce the bug.

I strongly doubt that the MADV_DONTNEED is relevant to this issue.

> Then I thought vp->sp might be out-of-sync compared to the local
> variable ‘sp’, which in turn could cause ‘scm_i_vm_mark_stack’ to not
> mark a few items on the tip of the stack.  So I did this:
>
> diff --git a/libguile/vm-engine.c b/libguile/vm-engine.c
> index 9509cd643..1136b2271 100644
> --- a/libguile/vm-engine.c
> +++ b/libguile/vm-engine.c
> @@ -151,7 +151,8 @@
>     code, or otherwise push anything on the stack, you will need to
>     CACHE_SP afterwards to restore the possibly-changed stack pointer.  */
>  
> -#define SYNC_IP() vp->ip = (ip)
> +#define SYNC_IP()                                       \
> +  do { vp->ip = (ip); vp->sp = (sp); } while (0)

I don't see how a change like this could be useful for any thread safety
problem.  Since stores by one thread may be seen in a different order by
other threads, the memory write corresponding to "vp->sp = (sp)" might
be delayed for some arbitrarily long period of time after the writes
that follow it, up until the next appropriate memory barrier.

For now, I would suggest avoiding multi-threaded code in Guix, or at
least to avoid loading any Scheme code from multiple threads.

How about using multiple processes instead?

      Mark

  reply	other threads:[~2018-05-09  0:34 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-23 22:20 bug#28211: Grafting code triggers GC/thread-safety issue on Guile 2.2.2 Ludovic Courtès
2017-08-23 22:48 ` Ludovic Courtès
2018-04-24 16:03 ` Ludovic Courtès
2018-05-08 21:55   ` Ludovic Courtès
2018-05-09  0:32     ` Mark H Weaver [this message]
2018-05-09  7:17       ` Ludovic Courtès
2018-05-09  9:11       ` Andy Wingo
2018-05-10  6:50         ` Mark H Weaver
2018-05-10  7:53           ` Andy Wingo
2018-06-29 15:03             ` bug#28211: Stack marking issue in multi-threaded code Ludovic Courtès
2018-06-29 16:54               ` Mark H Weaver
2018-06-29 21:18                 ` Ludovic Courtès
2018-06-29 23:18                   ` Mark H Weaver
2018-06-30 20:53                     ` Ludovic Courtès
2018-06-30 21:49                       ` Mark H Weaver
2018-07-01 10:12                         ` Andy Wingo
2018-07-03 19:01                           ` Mark H Weaver
2020-03-12 21:59               ` bug#28211: Stack marking issue in multi-threaded code, 2020 edition Ludovic Courtès
2020-03-13 22:38                 ` Ludovic Courtès
2020-03-17 21:16                 ` Andy Wingo
2018-05-10 15:48     ` bug#28211: Grafting code triggers GC/thread-safety issue on Guile 2.2.2 Mark H Weaver
2018-05-10 16:01       ` Mark H Weaver
2018-07-02 10:28 ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87d0y5k6sl.fsf@netris.org \
    --to=mhw@netris.org \
    --cc=28211@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    --cc=wingo@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).