From mboxrd@z Thu Jan 1 00:00:00 1970 From: ludo@gnu.org (Ludovic =?UTF-8?Q?Court=C3=A8s?=) Subject: bug#28211: Stack marking issue in multi-threaded code Date: Sat, 30 Jun 2018 22:53:11 +0200 Message-ID: <87h8lkro7c.fsf@gnu.org> References: <877exuj58y.fsf@gnu.org> <87d0yo1tie.fsf@gnu.org> <87fu3124nt.fsf@gnu.org> <87d0y5k6sl.fsf@netris.org> <871sel6vnq.fsf@igalia.com> <87fu30dmx3.fsf@netris.org> <87tvrg3q1d.fsf@igalia.com> <87a7rdvdm9.fsf_-_@gnu.org> <87in61y1m0.fsf@netris.org> <87muvdthp4.fsf@gnu.org> <87efgpw5ao.fsf@netris.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:58162) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fZMsd-00060w-AG for bug-guix@gnu.org; Sat, 30 Jun 2018 16:54:04 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fZMsc-0002F5-8q for bug-guix@gnu.org; Sat, 30 Jun 2018 16:54:03 -0400 Received: from debbugs.gnu.org ([208.118.235.43]:33594) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fZMsc-0002Et-5l for bug-guix@gnu.org; Sat, 30 Jun 2018 16:54:02 -0400 Sender: "Debbugs-submit" Resent-Message-ID: In-Reply-To: <87efgpw5ao.fsf@netris.org> (Mark H. Weaver's message of "Fri, 29 Jun 2018 19:18:07 -0400") List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+gcggb-bug-guix=m.gmane.org@gnu.org Sender: "bug-Guix" To: Mark H Weaver Cc: Andy Wingo , 28211@debbugs.gnu.org Hi Mark, Mark H Weaver skribis: > ludo@gnu.org (Ludovic Court=C3=A8s) writes: [...] >>> Indeed, we need something to ensure that the compiler won't reorder >>> these writes. My recommendation would be to use the 'volatile' >>> qualifier in the declarations of both the 'fp' field of 'struct scm_vm' >>> and the stack element modified by SCM_FRAME_SET_DYNAMIC_LINK. >>> >>> Maybe something like this: >>> >>> diff --git a/libguile/frames.h b/libguile/frames.h >>> index ef2db3df5..8554f886b 100644 >>> --- a/libguile/frames.h >>> +++ b/libguile/frames.h >>> @@ -89,6 +89,7 @@ >>> union scm_vm_stack_element >>> { >>> scm_t_uintptr as_uint; >>> + volatile scm_t_uintptr as_volatile_uint; >> >> [...] >> >>> +#define SCM_FRAME_SET_DYNAMIC_LINK(fp, dl) ((fp)[1].as_volatile_uint = =3D ((dl) - (fp))) >> >> I=E2=80=99m not sure this is exactly what we need. >> >> What I had in mind, to make sure the vp->fp assignment really happens >> after the SCM_FRAME_SET_DYNAMIC_LINK, was to do something like this: >> >> SCM_FRAME_SET_RETURN_ADDRESS (new_fp, =E2=80=A6); >> SCM_FRAME_SET_DYNAMIC_LINK (new_fp, =E2=80=A6); >> >> asm volatile ("" : : : "memory"); >> >> vp->fp =3D new_fp; >> >> I think that more accurately reflects what we want, WDYT? >> >> It=E2=80=99s GCC-specific though (I don=E2=80=99t think Clang implements= =E2=80=98asm=E2=80=99 in a >> compatible way, does it?) and I suppose we=E2=80=99d have to ignore the = non-GCC >> case. > > Right, the problem is that the "asm volatile" solution is not portable. > > To my mind, it's fine to optionally use GCC extensions for improved > performance, but I think we should ensure that Guile works reliably when > compiled with any conforming C compiler. I agree, of course (that said, most compilers implement most GCC extensions nowadays, but =E2=80=98asm=E2=80=99 is probably an exception). > What problem do you see with my proposed portable solution? If I > understand C99 section 5.1.2.3 paragraph 2 correctly, compilers are not > permitted to reorder accesses to volatile objects as long as there is a > sequence point between them. My understand is that what you propose is (almost*) equivalent to the asm trick, provided SCM_FRAME_SET_DYNAMIC_LINK is the last statement before the vp->fp assignment. So we=E2=80=99d have to make sure we keep th= ings in this order, right? [*] The =E2=80=98volatile=E2=80=99 qualifier on the field does more than ju= st an instruction reordering barrier: AIUI, it forces the compiler to emit a load and store instruction right at the assignment point, which is a stricter constraint than what we need, I think. > I should say that I'm not confident that _either_ of these proposed > solutions will adequately address all of the possible problems that > could occur when GC is performed on VM threads stopped at arbitrary > points in their execution. Yeah, as discussed on IRC with Andy, we=E2=80=99d be better off if we were = sure that each stack is marked by the thread it belongs to, in terms of data locality, and thus also in terms of being sure that vp->fp is up-to-date when the marker reads it. It=E2=80=99s not something we can change now, th= ough. Alternately, we could use atomics when accessing vp->fp to ensure memory consistency (I tried that during my debugging journey=E2=80=A6). It could = be costly. Anyway, I don=E2=80=99t think we=E2=80=99ll have the final word on all this= before 2.2.4. The way I see it we should keep working on improving it, but there are difficult choices to make, so it will probably take a bit of time. Thanks, Ludo=E2=80=99.