unofficial mirror of bug-guix@gnu.org 
From: "Ludovic Courtès" <ludo@gnu.org>
To: Andy Wingo <wingo@igalia.com>
Cc: 28211@debbugs.gnu.org
Subject: bug#28211: Stack marking issue in multi-threaded code, 2020 edition
Date: Fri, 13 Mar 2020 23:38:01 +0100
Message-ID: <87blozkhiu.fsf@gnu.org>
In-Reply-To: <87tv2tp74g.fsf_-_@gnu.org> ("Ludovic Courtès"'s message of "Thu, 12 Mar 2020 22:59:11 +0100")

[-- Attachment #1: Type: text/plain, Size: 781 bytes --]

Hi,

Ludovic Courtès <ludo@gnu.org> wrote:

> What if ‘scm_i_vm_mark_stack’ walks the stack right before the ‘vp->fp’
> assignment?  It can determine that one of the just-assigned ‘sp[nargs]’
> is a dead slot, and thus set it to SCM_UNSPECIFIED.  Later, when we set
> ‘vp->fp’, that stack slot that we just initialized has been overwritten
> by ‘scm_i_vm_mark_stack’.  Down the road, we get something like:
>
>   Wrong type to apply: #<unspecified>

I believe the patch below solves this problem.  Also attached is a small
program that reproduces the bug (without the patch) after a couple of
minutes.
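
To make the ordering issue quoted above concrete, here is a deliberately
simplified, single-threaded model in C.  It is only a sketch, not Guile
code: 'toy_vm', 'toy_mark_stack', 'toy_abort' and the per-slot 'dead' map
are hypothetical names, the slot indices are arbitrary, and the marker is
called by hand at the point where a concurrent GC could run.  The one
assumption it encodes is that the marker leaves the youngest frame alone
and resets slots of older frames that it considers dead.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SIZE 16
#define UNSPECIFIED (-1)        /* stand-in for SCM_UNSPECIFIED */

/* Toy VM: the stack grows down, so the youngest frame is [sp, fp).  */
struct toy_vm
{
  int stack[STACK_SIZE];
  int dead[STACK_SIZE];         /* hypothetical "dead slot" map for older frames */
  size_t sp, fp;
};

/* Stand-in for 'scm_i_vm_mark_stack': skip the youngest frame and reset
   every slot of the older frames that the map flags as dead.  */
static void
toy_mark_stack (struct toy_vm *vm)
{
  for (size_t i = vm->fp; i < STACK_SIZE; i++)
    if (vm->dead[i])
      vm->stack[i] = UNSPECIFIED;
}

/* Simulate an abort to an older part of the stack.  PUBLISH_FIRST selects
   the order used in the patch (registers first, then arguments); zero
   selects the previous order.  Returns the first argument slot.  */
int
toy_abort (int publish_first)
{
  struct toy_vm vm;
  const int args[] = { 10, 20, 30 };
  const size_t nargs = 3, new_sp = 8, new_fp = 11;

  for (size_t i = 0; i < STACK_SIZE; i++)
    {
      vm.stack[i] = 0;
      /* Slots 8..10 belong to an older frame and are dead there.  */
      vm.dead[i] = (i >= 8 && i < 11);
    }
  vm.sp = 2;
  vm.fp = 4;
  assert (new_sp >= vm.sp && new_sp + nargs <= new_fp);

  if (publish_first)
    {
      vm.fp = new_fp;
      vm.sp = new_sp;
    }
  else
    for (size_t i = 0; i < nargs; i++)
      vm.stack[new_sp + i] = args[i];

  toy_mark_stack (&vm);         /* a GC on another thread runs here */

  if (publish_first)
    for (size_t i = 0; i < nargs; i++)
      vm.stack[new_sp + i] = args[i];
  else
    {
      vm.fp = new_fp;
      vm.sp = new_sp;
    }

  toy_mark_stack (&vm);         /* a later GC must not clobber the slots */
  return vm.stack[vm.sp];
}
```

In this model, 'toy_abort (0)' (the old order) returns UNSPECIFIED, which
is the analogue of the "Wrong type to apply: #<unspecified>" failure,
whereas 'toy_abort (1)' (the order used in the patch) returns 10.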

Thoughts?

The full grafting test case exposes another crash that looks like VM
stack corruption, which I’m still investigating.

Ludo’.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 2720 bytes --]

diff --git a/libguile/vm.c b/libguile/vm.c
index b20c6eb5f..5cc934e99 100644
--- a/libguile/vm.c
+++ b/libguile/vm.c
@@ -1351,7 +1351,8 @@ scm_i_vm_emergency_abort (SCM *tag_and_argv, size_t n)
   scm_t_bits *prompt;
   scm_t_dynstack_prompt_flags flags;
   ptrdiff_t fp_offset, sp_offset;
-  union scm_vm_stack_element *fp, *sp;
+  union scm_vm_stack_element *fp;
+  volatile union scm_vm_stack_element *sp;
   SCM *argv;
   uint32_t *vra;
   uint8_t *mra;
@@ -1392,16 +1393,19 @@ scm_i_vm_emergency_abort (SCM *tag_and_argv, size_t n)
      to be jumping to an older part of the stack.  */
   if (sp < vp->sp)
     abort ();
-  sp[nargs].as_scm = cont;
-
-  while (nargs--)
-    sp[nargs].as_scm = *argv++;
 
   /* Restore VM regs */
   vp->fp = fp;
-  vp->sp = sp;
+  vp->sp = (union scm_vm_stack_element *) sp;
   vp->ip = vra;
 
+  /* Restore the arguments on SP.  This must be done after 'vp->fp' has
+     been set so that a concurrent 'scm_i_vm_mark_stack' does not
+     overwrite it (see <https://bugs.gnu.org/28211>).  */
+  sp[nargs].as_scm = cont;
+  while (nargs--)
+    sp[nargs].as_scm = *argv++;
+
   /* Jump! */
   vp->mra_after_abort = mra;
   longjmp (*registers, 1);
@@ -1417,7 +1421,8 @@ abort_to_prompt (scm_thread *thread, uint8_t *saved_mra)
   scm_t_bits *prompt;
   scm_t_dynstack_prompt_flags flags;
   ptrdiff_t fp_offset, sp_offset;
-  union scm_vm_stack_element *fp, *sp;
+  union scm_vm_stack_element *fp, *orig_sp;
+  volatile union scm_vm_stack_element *sp;
   uint32_t *vra;
   uint8_t *mra;
   jmp_buf *registers;
@@ -1452,6 +1457,7 @@ abort_to_prompt (scm_thread *thread, uint8_t *saved_mra)
   /* Recompute FP, as scm_dynstack_unwind may have expanded the stack.  */
   fp = vp->stack_top - fp_offset;
   sp = vp->stack_top - sp_offset;
+  orig_sp = vp->sp;
 
   /* Continuation gets nargs+1 values: the one more is for the cont.  */
   sp = sp - nargs - 1;
@@ -1460,15 +1466,19 @@ abort_to_prompt (scm_thread *thread, uint8_t *saved_mra)
      to be jumping to an older part of the stack.  */
   if (sp < vp->sp)
     abort ();
-  sp[nargs].as_scm = cont;
-  while (nargs--)
-    sp[nargs] = vp->sp[nargs];
 
   /* Restore VM regs */
   vp->fp = fp;
-  vp->sp = sp;
+  vp->sp = (union scm_vm_stack_element *) sp;
   vp->ip = vra;
 
+  /* Restore the arguments on SP.  This must be done after 'vp->fp' has
+     been set so that a concurrent 'scm_i_vm_mark_stack' does not
+     overwrite it (see <https://bugs.gnu.org/28211>).  */
+  sp[nargs].as_scm = cont;
+  while (nargs--)
+    sp[nargs] = orig_sp[nargs];
+
   /* If there are intervening C frames, then jump over them, making a
      nonlocal exit.  Otherwise fall through and let the VM pick up where
      it left off.  */

[-- Attachment #3: the reproducer --]
[-- Type: text/plain, Size: 434 bytes --]

;; https://issues.guix.gnu.org/issue/28211

(use-modules (ice-9 threads))

(define (thunk)
  (catch 'foo
    (lambda ()
      (throw 'foo (iota 10)))
    (lambda (key lst)
      (unless (equal? lst (iota 10))
        (primitive-_exit 42)))))

(n-par-for-each (* 2 (current-processor-count))
                (lambda _
                  (let loop ()
                    (thunk)
                    (loop)))
                (iota 1000))
