From: Pip Cet
Newsgroups: gmane.emacs.devel
Subject: Re: MPS: Win64 testers?
Date: Tue, 03 Sep 2024 10:37:40 +0000
Message-ID: <87ed612fbm.fsf@protonmail.com>
To: Kien Nguyen
Cc: Eli Zaretskii, sebastian@sebasmonia.com, emacs-devel@gnu.org, yantar92@posteo.net

"Kien Nguyen" writes:

> I got another two crashes related to mps.
> The backtrace is attached. The emacs source code is here[1].
> Hope this helps.

Thank you, it does. I found the bug, but fixing it is going to require
changes to your MPS patch series.

The disassembled code that's causing the problem reads:

   0x000000040013c950 <+1216>:  mov    (%r12),%rdx
   0x000000040013c954 <+1220>:  movq   %rax,%xmm1
   0x000000040013c959 <+1225>:  mov    0x278(%rdx),%rax
   0x000000040013c960 <+1232>:  sub    $0x1,%rbx
   0x000000040013c964 <+1236>:  mov    $0x3,%edx
   0x000000040013c969 <+1241>:  movq   (%rdi,%rbx,8),%xmm6
   0x000000040013c96e <+1246>:  mov    $0x18,%ecx
   0x000000040013c973 <+1251>:  mov    0x20(%rax),%r8
   0x000000040013c977 <+1255>:  punpcklqdq %xmm1,%xmm6
   0x000000040013c97b <+1259>:  call   0x4002061e0
   0x000000040013c980 <+1264>:  movups %xmm6,0x8(%rax)
   0x000000040013c984 <+1268>:  add    $0x3,%rax
   0x000000040013c988 <+1272>:  test   %rbx,%rbx
   0x000000040013c98b <+1275>:  jne    0x40013c950

During the call to alloc_impl at +1259, the future value of the cons
cell lives only in XMM registers (%xmm1 and %xmm6). The list built so
far was returned from the previous iteration in %rax, but that register
is overwritten by the mov at +1225. This isn't a problem for the first
iteration, when the cdr is Qnil, which is safe to use, but after the
first iteration, the XMM registers are the only place that holds a
reference to the list we're building.

alloc_impl can trigger a GC, which will then fail to find a reference to
the list being built and collect it. Execution then returns to +1264,
where the reference to the freed cons cell is written back to memory.

IOW, we need to make sure that the callee-saved %xmm registers are
properly spilled to the stack and marked conservatively.

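To illustrate why the scanner has to look at the XMM registers too, here
is a toy model in C. It is not MPS code: the struct layout, the function
names and the simple equality test for "is this word a reference" are
all made up for this sketch. It models the state during the call at
+1259, where the only copy of the list pointer sits in the high lane of
%xmm6.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a thread's callee-saved registers under the Microsoft
   x64 ABI.  Hypothetical layout, not what MPS actually uses.  */
enum { N_GP = 8, N_XMM_WORDS = 20 };

struct saved_regs
{
  uintptr_t gp[N_GP];          /* rbx, rbp, rdi, rsi, r12..r15 */
  uintptr_t xmm[N_XMM_WORDS];  /* xmm6..xmm15, two 64-bit lanes each */
};

/* Simplified conservative check: is any of the N words equal to REF?
   A real scanner would also accept tagged and interior pointers.  */
static bool
words_contain (const uintptr_t *words, size_t n, uintptr_t ref)
{
  for (size_t i = 0; i < n; i++)
    if (words[i] == ref)
      return true;
  return false;
}

/* Scanner that only sees the general-purpose registers.  */
bool
scan_gp_only (const struct saved_regs *r, uintptr_t ref)
{
  return words_contain (r->gp, N_GP, ref);
}

/* Scanner that also sees both lanes of every saved XMM register.  */
bool
scan_gp_and_xmm (const struct saved_regs *r, uintptr_t ref)
{
  return scan_gp_only (r, ref) || words_contain (r->xmm, N_XMM_WORDS, ref);
}

/* Register state during the allocation call in the second or a later
   iteration: the list built so far is only in the high lane of xmm6.  */
struct saved_regs
state_during_alloc_call (uintptr_t list_so_far)
{
  /* A nil cdr (first iteration) needs no root, so it is not modelled.  */
  assert (list_so_far != 0);
  struct saved_regs r = { { 0 }, { 0 } };
  r.xmm[1] = list_so_far;  /* xmm[0] and xmm[1] are the lanes of xmm6 */
  return r;
}
```

With a state built by state_during_alloc_call, scan_gp_only reports the
list as unreferenced while scan_gp_and_xmm finds it, which is the
difference between collecting the list and keeping it alive.
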
However, this patch:

https://github.com/kiennq/emacs-build/blob/main/patches/mps/0004-Fix-register-scanning-on-FreeBSD-and-Linux.patch

replaces setjmp() by a series of assembler statements:

  #define STACK_CONTEXT_SAVE(sc) \
    BEGIN \
      Word *_save = (sc)->calleeSave; \
      __asm__ volatile ("mov %%rbp, %0" : "=m" (_save[0])); \
      __asm__ volatile ("mov %%rbx, %0" : "=m" (_save[1])); \
      __asm__ volatile ("mov %%r12, %0" : "=m" (_save[2])); \
      __asm__ volatile ("mov %%r13, %0" : "=m" (_save[3])); \
      __asm__ volatile ("mov %%r14, %0" : "=m" (_save[4])); \
      __asm__ volatile ("mov %%r15, %0" : "=m" (_save[5])); \
    END

This assumes the SysV ABI is in use, which doesn't have callee-saved XMM
registers. However, on Windows GCC (correctly) uses the Microsoft x64
ABI, which also requires XMM registers %xmm6-%xmm15, as well as general
registers %rdi and %rsi, to be callee-saved
(https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention).

So this patch doesn't work for 64-bit Windows builds. I'm not entirely
sure that simply removing this patch will make things work, though,
because I don't know how Microsoft's UCRT implements setjmp. MPS relies
on it to store the values of the callee-saved registers without any
further mangling, on the stack, in aligned words.

A safe (but very slightly slower) solution would be to simply store the
registers twice, once using setjmp() and once using the assembler
statements. Would that be acceptable for your build?

Pip
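
P.S. A minimal sketch of the store-twice idea, under several
assumptions: the struct, field and macro names below are hypothetical
and are not MPS's StackContext, the register list is the Microsoft x64
callee-saved set, and the explicit copy is only compiled for GCC-style
compilers on x86-64 (elsewhere only the setjmp() copy is made). It has
to stay a macro so that the saved values end up in the caller's own
stack frame, as with setjmp().

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

/* Hypothetical types for illustration; not MPS's real StackContext.  */
struct xmm_image { uint64_t lane[2]; };
static_assert (sizeof (struct xmm_image) == 16, "one 128-bit register");

struct stack_context_sketch
{
  jmp_buf jb;                /* copy 1: whatever setjmp() stores */
  uintptr_t gp[8];           /* copy 2: rbp, rbx, rdi, rsi, r12..r15 */
  struct xmm_image xmm[10];  /* copy 2: xmm6..xmm15 */
  int explicit_copy_valid;   /* nonzero if copy 2 was filled in */
};

#if defined (__GNUC__) && defined (__x86_64__)
# define EXPLICIT_COPY_AVAILABLE 1
# define SAVE_GP(sc, i, reg) \
    __asm__ volatile ("mov %%" reg ", %0" : "=m" ((sc)->gp[i]))
# define SAVE_XMM(sc, i, reg) \
    __asm__ volatile ("movups %%" reg ", %0" : "=m" ((sc)->xmm[i]))
# define SAVE_EXPLICIT(sc) \
    do { \
      SAVE_GP (sc, 0, "rbp"); SAVE_GP (sc, 1, "rbx"); \
      SAVE_GP (sc, 2, "rdi"); SAVE_GP (sc, 3, "rsi"); \
      SAVE_GP (sc, 4, "r12"); SAVE_GP (sc, 5, "r13"); \
      SAVE_GP (sc, 6, "r14"); SAVE_GP (sc, 7, "r15"); \
      SAVE_XMM (sc, 0, "xmm6");  SAVE_XMM (sc, 1, "xmm7"); \
      SAVE_XMM (sc, 2, "xmm8");  SAVE_XMM (sc, 3, "xmm9"); \
      SAVE_XMM (sc, 4, "xmm10"); SAVE_XMM (sc, 5, "xmm11"); \
      SAVE_XMM (sc, 6, "xmm12"); SAVE_XMM (sc, 7, "xmm13"); \
      SAVE_XMM (sc, 8, "xmm14"); SAVE_XMM (sc, 9, "xmm15"); \
      (sc)->explicit_copy_valid = 1; \
    } while (0)
#else
# define EXPLICIT_COPY_AVAILABLE 0
# define SAVE_EXPLICIT(sc) \
    do { (sc)->explicit_copy_valid = 0; } while (0)
#endif

/* Store the registers twice: once through setjmp(), once explicitly.
   Nothing ever longjmp()s back here; setjmp() is only used to save.  */
#define STACK_CONTEXT_SAVE_TWICE(sc) \
  do { \
    (void) setjmp ((sc)->jb); \
    SAVE_EXPLICIT (sc); \
  } while (0)
```

The intended use would be to declare a struct stack_context_sketch as a
local variable in the function that enters the collector and invoke
STACK_CONTEXT_SAVE_TWICE (&sc) there, so that both copies lie in stack
memory that is scanned conservatively. Whether the scanner needs both
copies or could drop the setjmp() one depends on what UCRT's setjmp
actually stores, which I haven't checked.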