From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#72165: 31.0.50; Intermittent crashing with recent emacs build
Date: Thu, 18 Jul 2024 07:58:54 +0300
Message-ID: <8634o7gusx.fsf@gnu.org>
References: <87o76veo04.fsf@secretsauce.net>
To: Dima Kogan
Cc: 72165@debbugs.gnu.org

> From: Dima Kogan
> Date: Wed, 17 Jul 2024 13:56:27 -0700
>
> I'm running a bleeding-edge build of emacs. Using packages from:
>
> https://emacs.secretsauce.net/
>
> Debian GNU/Linux. GTK+. Currently using a build from git as of
> 2024/07/09 (8e46f44ea0e). It is crashing periodically, with an unclear
> cause.
>
> This isn't a brand-new problem; I observed a similar crash with an earlier
> build: 2024/04/30 (d24981d27ce). After that crash I upgraded, and I see
> crashes still.
>
> Anecdotally, the 2024/04/30 build has been very stable. Today I started
> to debug a different issue: something about mu4e modeline updating is
> signalling args-out-of-range. To debug this I'm tweaking functions like
> (truncate-string-to-width), and re-evaluating them. This debugging isn't
> very interesting, but something about it is causing emacs to crash, with
> both builds.

So when you say that "anecdotally, the 2024/04/30 build has been very
stable", what exactly do you mean?  It sounds like both that build and
the one from 2024/07/09 crash in the same way, so why do you consider
the April one "very stable"?

> I just made a core. I cannot xbacktrace because (I think) I'm looking at
> a core, and not at a live process. If that would be helpful, I can
> probably get that. And I see the crash every 20min maybe, while
> debugging the mu4e modeline problem. Below is the backtrace. Hopefully
> this speaks to somebody. Thanks!

Thanks, but please always try to supply the information that explains
the crash, not just the backtrace.  (In this case, it's a deliberate
abort, not a crash, but still.)  That means look at the source code
where GDB says the problem happens and print the values of the
variables involved in the crash.
In this case:

> (gdb) bt full
> #0  __pthread_kill_implementation (threadid=, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
>         tid = 
>         ret = 0
>         pd = 
>         old_mask = {
>           __val = {0}
>         }
>         ret = 
> #1  0x00007fc68a4a6b7f in __pthread_kill_internal (signo=6, threadid=) at ./nptl/pthread_kill.c:78
> #2  0x00007fc68a4584e2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
>         ret = 
> #3  0x0000561d3dcb9798 in terminate_due_to_signal (sig=sig@entry=6, backtrace_limit=backtrace_limit@entry=40) at ./debian/build-x/src/emacs.c:469
> #4  0x0000561d3dcb9d4e in emacs_abort () at ./debian/build-x/src/sysdep.c:2391
> #5  0x0000561d3dcb6c34 in redisplay_window (window=, just_this_one_p=just_this_one_p@entry=false) at ./debian/build-x/src/xdisp.c:20086

The call to emacs_abort seems to be here:

  /* Some sanity checks.  */
  CHECK_WINDOW_END (w);
  if (Z == Z_BYTE && CHARPOS (opoint) != BYTEPOS (opoint))
    emacs_abort ();  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Now, your "bt full" doesn't help to understand what went wrong because
GDB is unable to find the values of many variables:

>         w = 0x561d6bcb2bc8
>         f = 
>         buffer = 
>         old = 
>         lpoint = {
>           charpos = ,
>           bytepos = 
>         }
>         opoint = {
>           charpos = ,
>           bytepos = 
>         }

Still, at least Z and Z_BYTE should be available; what are their
values?  And regarding opoint, look back in the code a small ways to
where it was defined:

  SET_TEXT_POS (opoint, PT, PT_BYTE);

If you look up the definition of SET_TEXT_POS, you will see:

  /* Set character position of POS to CHARPOS, byte position to BYTEPOS.  */

  #define SET_TEXT_POS(POS, CHARPOS, BYTEPOS) \
       ((POS).charpos = (CHARPOS), (POS).bytepos = BYTEPOS)

which means opoint takes its character position from PT and its byte
position from PT_BYTE.  So if you print the values of PT and PT_BYTE,
we will know the ("optimized-out") values of opoint.charpos and
opoint.bytepos, and will probably be able to understand why we
aborted.
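
To make the sanity check concrete, here is a minimal standalone C
sketch of the condition that leads to the abort.  It is not the Emacs
code: "struct text_pos" is a simplified stand-in, and "struct
toy_buffer", would_abort and run_examples are hypothetical names made
up for illustration.  The reasoning it models: when Z == Z_BYTE, every
character in the buffer occupies exactly one byte, so any character
position must equal its byte position, and a point for which the two
differ indicates inconsistent state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the text position used in xdisp.c.  */
struct text_pos { ptrdiff_t charpos, bytepos; };

#define CHARPOS(POS) ((POS).charpos)
#define BYTEPOS(POS) ((POS).bytepos)
#define SET_TEXT_POS(POS, C, B) ((POS).charpos = (C), (POS).bytepos = (B))

/* Hypothetical toy buffer: point (PT, PT_BYTE) and end (Z, Z_BYTE),
   each as a character position and a byte position.  */
struct toy_buffer { ptrdiff_t pt, pt_byte, z, z_byte; };

/* Return true when the condition guarding emacs_abort would hold:
   the buffer has as many bytes as characters, yet point's character
   and byte positions disagree.  */
bool would_abort(const struct toy_buffer *b)
{
  struct text_pos opoint;
  SET_TEXT_POS(opoint, b->pt, b->pt_byte);
  return b->z == b->z_byte && CHARPOS(opoint) != BYTEPOS(opoint);
}

/* Three illustrative states; only the last one is inconsistent.  */
void run_examples(void)
{
  struct toy_buffer single_byte = { 5, 5, 11, 11 }; /* all 1-byte chars */
  struct toy_buffer multibyte   = { 5, 7, 11, 15 }; /* some wide chars */
  struct toy_buffer stale_point = { 5, 7, 11, 11 }; /* point out of sync */

  assert(!would_abort(&single_byte));
  assert(!would_abort(&multibyte));
  assert(would_abort(&stale_point));
}
```

Calling run_examples from a main function exercises all three cases.
In the core file, the same question is answered by printing the real
Z, Z_BYTE, PT and PT_BYTE in the frame that called emacs_abort.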
IOW:

  (gdb) frame 5
  (gdb) print Z
  (gdb) print Z_BYTE
  (gdb) print PT
  (gdb) print PT_BYTE

(The "frame 5" command is to get to the callstack frame where we call
emacs_abort, shown as #5 at the left edge of the backtrace line.)

If GDB says it doesn't know about these variables with up-cased names,
like Z and PT_BYTE, it means your Emacs was built without macro
information (the -g3 compiler option), and you will need to type the
macro definitions instead.  For example (from buffer.h):

  #define PT (current_buffer->pt + 0)

So instead of "print PT" you will need to say "print current_buffer->pt".
And similarly with the other variables above.

Next question is: what buffer did Emacs try to display?  To answer
that, print the name of the buffer that is current in this place in
the code:

  (gdb) print current_buffer->name_
  (gdb) xstring

If GDB says it doesn't know what "xstring" is, type:

  (gdb) source /path/to/emacs/src/.gdbinit

and then repeat the above 2 commands.

Once you know which buffer was being displayed, try to describe the
text that was in it, if you can.  (If you cannot, I can give
instructions how to find it out using GDB commands.)

Thanks.