From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#38748: 28.0.50; crash on MacOS 10.15.2
Date: Fri, 10 Jan 2020 10:27:45 +0200
Message-ID: <834kx3af7i.fsf@gnu.org>
To: Pip Cet
Cc: rpluim@gmail.com, 38748@debbugs.gnu.org, alan@idiocy.org, andreyk.mad@gmail.com, jguenther@gmail.com
In-reply-to: (message from Pip Cet on Fri, 10 Jan 2020 07:32:07 +0000)

> From: Pip Cet
> Date: Fri, 10 Jan 2020 07:32:07 +0000
> Cc: rpluim@gmail.com, alan@idiocy.org, jguenther@gmail.com,
>  andreyk.mad@gmail.com, 38748@debbugs.gnu.org
>
> > The backtrace shows a very recursive GC, it doesn't show any other
> > function being deeply recursive. So I'm not sure I understand what
> > tail-recursive function did you have in mind. Can you elaborate?
>
> I can. I think we're looking at two bugs: the first is the simple
> use-after-free of XFRAME (frame)->output_data.ns where `frame' is a
> dead frame. I've confirmed on GNU/Linux that mark_frame is called for
> a frame for which x_free_frame_resources has already been called, if
> there's a global variable still referencing the frame. I think the
> same thing happens on macOS.

This one doesn't depend on the initialization of 'ok' in
face_inherited_attr in any way, does it?

> 1. I think face_inherited_attr is being optimized to tail-call itself
> rather than calling itself in a new stack frame; thus, it loops
> indefinitely for a faulty face setup which would otherwise lead to an
> immediate crash.
> 1b. that optimization only works without the harmless initialization of "ok".
>
> 2. Our initial face setup is faulty in the sense above.
>
> 3. Something happens on a secondary thread which causes our face setup
> to become non-faulty, possibly during GC.

What do you mean by "secondary thread"? And how can GC modify Lisp
data structures? That'd be a terrible bug.

In any case, the full backtrace shows no trace of a face_inherited_attr
call anywhere in the callstack, so if there is indeed infinite
recursion in that function, it was somehow exited long before GC
runs.

As for the tail-recursion part: do you see any sign of that in the
disassembly posted by Robert? I didn't, but maybe I missed something.
And such subtleties should only rear their ugly heads in optimized
code, whereas we already know that an unoptimized build crashes in the
same way.

I still think the shortest way to find the culprit here is to
patiently and painfully go over the last_marked array, deciphering the
Lisp objects we marked, until we succeed in identifying the Lisp data
structure which got corrupted. Once we have identified that data
structure, it should be relatively easy to find who corrupts it and
where. This may mean a lot of inconvenient drudgery, exacerbated by
the fact that having a functional GDB on macOS is not easy, but I
don't think we have a better way at this point.
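
To make the order of that walk concrete, here is a minimal sketch in C
of a ring buffer that records marked objects and is read back from the
most recent entry to the oldest. It is only a simplified model under
stated assumptions: obj_t, RING_SIZE, record_marked and marked_ago are
hypothetical names invented for illustration, and the size is
arbitrary. It is not the code in alloc.c, where the array is
last_marked and holds Lisp_Object values.

```c
#include <assert.h>

/* Hypothetical stand-in for a Lisp object; the real type is a tagged word. */
typedef unsigned long obj_t;

/* Illustrative size only (assumption); the real array is much larger. */
#define RING_SIZE 8

static obj_t ring[RING_SIZE];
static int ring_index;          /* next slot to be overwritten */

/* Record an object as a mark routine might, wrapping around at the end. */
static void record_marked(obj_t o)
{
    ring[ring_index++] = o;
    ring_index %= RING_SIZE;
}

/* Return the object recorded AGE steps ago; age 0 is the most recent.
   The extra "+ RING_SIZE" keeps the index non-negative, because the
   C remainder operator can yield a negative result. */
static obj_t marked_ago(int age)
{
    assert(age >= 0 && age < RING_SIZE);
    int i = ((ring_index - 1 - age) % RING_SIZE + RING_SIZE) % RING_SIZE;
    return ring[i];
}
```

Walking age from 0 up to RING_SIZE - 1 visits the entries from the
last object marked back to the oldest one still in the buffer, which
is the order in which one would decipher them when looking for the
corrupted structure.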