From: Mark H Weaver
Subject: bug#28211: Stack marking issue in multi-threaded code
Date: Tue, 03 Jul 2018 15:01:03 -0400
Message-ID: <87in5wnnyo.fsf@netris.org>
In-Reply-To: <87d0w7l0wr.fsf@igalia.com> (Andy Wingo's message of "Sun, 01 Jul 2018 12:12:52 +0200")
To: Andy Wingo
Cc: 28211@debbugs.gnu.org

Hi Andy,

Andy Wingo writes:

> On Sat 30 Jun 2018 23:49, Mark H Weaver writes:
>
>>>> I should say that I'm not confident that _either_ of these proposed
>>>> solutions will adequately address all of the possible problems that
>>>> could occur when GC is performed on VM threads stopped at arbitrary
>>>> points in their execution.
>>>
>>> Yeah, as discussed on IRC with Andy, we’d be better off if we were sure
>>> that each stack is marked by the thread it belongs to, in terms of data
>>> locality, and thus also in terms of being sure that vp->fp is up-to-date
>>> when the marker reads it. It’s not something we can change now, though.
>>
>> I'm not sure it matters what thread the marking is done in, because when
>> the actual collection happens, all threads are first stopped in their
>> signal handlers, and presumably the appropriate memory barriers are
>> performed so that all threads are synchronized before the full
>> collection.
>
> I think you are right here. Still, it would be nice from a locality
> POV if threads could mark themselves. In some future I think it would
> be nice if threads cooperatively reached safepoints, instead of using
> the signal mechanism. In that case we could precisely mark the most
> recent stack frame as well.

I agree that stopping threads at safepoints before collections would be
ideal.

>>> Anyway, I don’t think we’ll have the final word on all this before
>>> 2.2.4. The way I see it we should keep working on improving it, but
>>> there are difficult choices to make, so it will probably take a bit of
>>> time.
>>
>> Sounds good.
>
> Yeah! Really great that this is fixed, and apologies for introducing it
> in the first place!!

It's great that Ludovic found the problem (great debugging Ludovic!),
but FWIW, my assessment is that this bug is not fixed by commit
23af45e248e8e2bec99c712842bf24d6661abbe2, and therefore is not fixed in
Guile-2.2.4, contrary to the claims made in the commit log and the NEWS.
Unless I'm mistaken, that commit makes *no* difference to the
requirements on the compiler regarding the order of those writes.

      Mark