From: Pip Cet <pipcet@protonmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: execvy@gmail.com, gerd.moellmann@gmail.com, emacs-devel@gnu.org
Subject: Re: [scratch/igc] 985247b6bee crash on Linux, KDE, Wayland
Date: Sat, 07 Sep 2024 09:05:46 +0000
Message-ID: <87jzfnesuv.fsf@protonmail.com>
In-Reply-To: <861q1w0zw1.fsf@gnu.org>

"Eli Zaretskii" <eliz@gnu.org> writes:

>> Date: Fri, 06 Sep 2024 19:29:28 +0000
>> From: Pip Cet <pipcet@protonmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, gerd.moellmann@gmail.com, emacs-devel@gnu.org
>>
>> So we can decode those to three interleaved lists reading, in part:
>>
>> (nil font-lock-face (:foreground ...))
>> (rear-nonsticky t <bad symbol> ...)
>> (nil font-lock-face (...))
>>
>> <bad symbol> is a pointer to what looks like the nursery generation, but
>> one which we must have failed to trace (presumably the symbol was either
>> uninterned and freed or interned and moved to an older generation) and
>> which was subsequently reused for cons cells by composite.c.
>>
>> Going back to the original report, I notice that it was trying to print
>> an "error in process filter: " message while handling what looks like a
>> (long) sequence of terminal escape codes.  Were you using M-x term at
>> the time?  Did you notice such error messages?
>>
>> I'll have another look at the process filter/longjmp code, but I suspect
>> we're going to have to wait for further crashes to get to the bottom of
>> this.
>
> What data is missing to get to the bottom of this, and how can we
> change the code and/or add some .gdbinit magic to provide that data?

I don't think .gdbinit magic would work.

The main problem is that while MPS GC should happen more frequently than
traditional GC, it's still unlikely to crash near the code that failed
to trace objects.  We got lucky there a few times, but it looks like our
luck ran out here.

So a first change would be an option for very eager garbage collection;
I'd already proposed a patch to do so on a separate OS thread, but it
would be better to do so on the main thread, to avoid false positives
when main thread code deliberately leaves things in an inconsistent
state while assuming GC doesn't happen.
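
To illustrate the main-thread variant (a toy sketch, not a patch): the
collector is polled only at points the main thread considers safe, so
code that deliberately leaves things inconsistent never sees a forced
collection in the middle.  All names below are hypothetical, and
count_collection merely stands in for whatever would really run a full
collection (in an MPS build that might be a wrapper around
mps_arena_collect followed by mps_arena_release, but that is an
assumption, not something tested here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback that performs one full collection.  */
typedef void (*eager_collect_fn) (void *arena);

struct eager_gc
{
  bool enabled;             /* debugging option; leave off normally */
  size_t interval;          /* collect once every INTERVAL polls */
  size_t countdown;         /* polls left until the next collection */
  size_t collections;       /* number of collections forced so far */
  eager_collect_fn collect; /* what to call to collect */
  void *arena;              /* opaque argument passed to COLLECT */
};

static void
eager_gc_init (struct eager_gc *g, size_t interval,
               eager_collect_fn collect, void *arena)
{
  assert (interval > 0 && collect != NULL);
  g->enabled = true;
  g->interval = g->countdown = interval;
  g->collections = 0;
  g->collect = collect;
  g->arena = arena;
}

/* Call on the main thread, only where a collection is known to be
   safe.  Every INTERVAL-th call forces a full collection.  */
static void
eager_gc_poll (struct eager_gc *g)
{
  if (!g->enabled)
    return;
  if (--g->countdown > 0)
    return;
  g->countdown = g->interval;
  g->collections++;
  g->collect (g->arena);
}

/* Stand-in collector for experiments: ARENA points to a counter.  */
static void
count_collection (void *arena)
{
  ++*(size_t *) arena;
}
```

With an interval of 1 this collects at every poll, which is the "very
eager" extreme; larger intervals trade trace quality for speed.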

> In general, our current facilities to investigate igc-related crashes
> are clearly insufficient.

I agree.

> The old GC has the last_marked[] array, which could be used to trace
> back any bad values which caused a GC-related crash, and I used that
> on several occasions.

To be honest, I don't even know whether MPS uses depth-first marking
(which would make the last_marked[] array useful).
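
For reference, the mechanism itself is just a fixed-size ring buffer of
the most recently visited references.  The sketch below is a toy
version with hypothetical names (record_marked, marked_ago) and a
deliberately tiny buffer; where such a call would go in igc.c, and
whether the MPS trace order makes the result useful, is exactly the
open question.

```c
#include <stddef.h>
#include <stdint.h>

/* Power of two, so the index arithmetic stays consistent even if the
   counter wraps.  A real buffer would be larger.  */
enum { LAST_MARKED_SIZE = 8 };

static uintptr_t last_marked[LAST_MARKED_SIZE];
static size_t last_marked_index;   /* total references recorded */

/* Remember REF as the most recently traced reference, overwriting
   the oldest entry once the buffer is full.  */
static void
record_marked (uintptr_t ref)
{
  last_marked[last_marked_index++ % LAST_MARKED_SIZE] = ref;
}

/* Return the reference recorded AGE steps ago (0 is the most recent),
   or 0 if that entry was never written or has been overwritten.  */
static uintptr_t
marked_ago (size_t age)
{
  if (age >= LAST_MARKED_SIZE || age >= last_marked_index)
    return 0;
  return last_marked[(last_marked_index - 1 - age) % LAST_MARKED_SIZE];
}
```

After a crash, walking marked_ago (0), marked_ago (1), ... from the
debugger gives the trail leading up to the bad value, provided the
tracer visited related objects close together in time.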

> But there's nothing similar in igc.c, which
> makes the investigation basically guesswork.  How can we improve
> this situation?  I expect this kind of trouble to happen a lot in the
> near future, so having efficient tools for debugging is crucial, IMO.

Just off the top of my head, here are a few ideas:

1. Make garbage collection much more eager.  Easy to do, high
performance cost, provides slightly better traces.  In particular,
always perform a full GC after returning from a non-local exit, which
invalidates many ambiguous references at once.

2. A last_marked[] array.  Should be cheap to do if it's fixed size, but
may not help very much if the order MPS traces objects in has poor
locality.

3. Use the "extended header" (which already exists) to save a backtrace
for the function which allocated an object.  This will increase memory
usage for the whole of Emacs a lot.  I believe, in most cases, this is
the information we need: something allocates an object and stores a
reference to it in memory that's invisible to MPS, so it's not fixed
when the object moves.  Then it retrieves the reference (which now
points to random memory in the arena) and it's traced in the next GC,
but causes a crash.

4. Save a log of what moved where.  This would allow us, in this case,
to at least find out what the <bad symbol> above was, I think.

5. Provide a facility which repeatedly walks all pools and ends up
returning a (shortest) path of references which keep an object alive.
This works well for other languages, but Lisp tends to have very long
paths because of the cons cell linkage.  I'm not sure how difficult this
would be to implement.

6. Anything that involves modifying MPS.  Last for obvious reasons.
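
To make idea 4 a bit more concrete, here is a toy sketch (not a patch)
of a bounded move log: each record says which address range moved
where, and a stale pointer can then be translated through the most
recent record covering it.  The names log_move and translate_stale are
hypothetical; one plausible place to call log_move would be the object
format's forward method, but that is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

enum { MOVE_LOG_SIZE = 1024 };   /* fixed size, oldest entries are lost */

struct move_record
{
  uintptr_t from;   /* old start address of the object */
  uintptr_t to;     /* new start address of the object */
  size_t size;      /* object size in bytes */
};

static struct move_record move_log[MOVE_LOG_SIZE];
static size_t move_log_count;    /* total moves ever logged */

/* Record that SIZE bytes starting at FROM were moved to TO.  */
static void
log_move (uintptr_t from, uintptr_t to, size_t size)
{
  struct move_record *r = &move_log[move_log_count++ % MOVE_LOG_SIZE];
  r->from = from;
  r->to = to;
  r->size = size;
}

/* Find the most recent logged move whose old range contains STALE and
   return the corresponding new address, or 0 if no record covers it.  */
static uintptr_t
translate_stale (uintptr_t stale)
{
  size_t n = (move_log_count < MOVE_LOG_SIZE
              ? move_log_count : MOVE_LOG_SIZE);
  for (size_t i = 0; i < n; i++)
    {
      const struct move_record *r
        = &move_log[(move_log_count - 1 - i) % MOVE_LOG_SIZE];
      /* Unsigned subtraction wraps for STALE < FROM, so a single
         comparison checks both ends of the range.  */
      if (stale - r->from < r->size)
        return r->to + (stale - r->from);
    }
  return 0;
}
```

Searching newest-first matters because nursery addresses are reused:
the same old address can appear in several records, and the latest one
is the best guess for what a stale reference used to point to.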

Just ideas for now, I'm afraid, no code yet.

Pip