From: Eli Zaretskii <eliz@gnu.org>
To: Pip Cet <pipcet@protonmail.com>
Cc: eller.helmut@gmail.com, gerd.moellmann@gmail.com,
yantar92@posteo.net, emacs-devel@gnu.org
Subject: Re: MPS: a random backtrace while toying with gdb
Date: Tue, 02 Jul 2024 17:57:08 +0300
Message-ID: <86sewrc057.fsf@gnu.org>
In-Reply-To: <toQN75FiHqu4zzAjbK9x05oVUHinYzL_yWiB67YwQ_iupwYisPJfWm9E2NvP8j9sFrdJqBw7hpTkthOjS9ZoWWUrTzPxcTAZxV2Ij0JHvTU=@protonmail.com> (message from Pip Cet on Tue, 02 Jul 2024 14:24:33 +0000)
> Date: Tue, 02 Jul 2024 14:24:33 +0000
> From: Pip Cet <pipcet@protonmail.com>
> Cc: eller.helmut@gmail.com, gerd.moellmann@gmail.com, yantar92@posteo.net, emacs-devel@gnu.org
>
> > > > That's not the problem, AFAIU. The problem is that a signal handler
> > > > which accesses Lisp data or the state of the Lisp machine could
> > > > trigger an MPS call, which will try taking the arena lock, and that
> > > > cannot be nested, by MPS design. And our handlers do access the Lisp
> > > > machine, albeit cautiously and as little as necessary. So when the
> > > > signal happens in the middle of an MPS call which already took the
> > > > arena lock, we cannot safely access our data.
> > >
> > > I've tried quite hard to make this happen, but I didn't manage it.
> > > It seems that whenever MPS puts up a protection barrier for
> > > existing allocated memory, the arena lock has already been
> > > released.  As signal handlers cannot allocate memory directly,
> > > there's no deadlock, either.
> > >
> > > I don't understand MPS as well as you apparently do, so could you
> > > help me and tell me where to put a kill(getpid(), SIGWHATEVER) with
> > > an appropriate signal handler which will cause a crash (without,
> > > in the signal handler, allocating memory)?
> >
> > I thought using the profiler would trigger these easily enough? I
> > think someone (Helmut?) posted a simple recipe for reproducing that
> > some time ago?
>
> Those were all signals interrupting MPS's SIGSEGV handler.  You were
> talking about signals interrupting MPS code that runs outside of a
> signal handler, weren't you?
I don't think they all were interrupting MPS's SIGSEGV handler.  I
think it's the other way around: we interrupted MPS code, and our
signal handler accessed memory, which then triggered MPS's SIGSEGV
handler.

But even if I'm wrong, why is that important?  We need to solve both
kinds of situations, don't we?
> > Also, there was a recipe with SIGCHLD not long ago (you'd need to undo
> > Helmut's fixes for that, I believe, to be able to reproduce that).
>
> Same thing.
Not AFAICT. Look:
Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=sig@entry=6, backtrace_limit=backtrace_limit@entry=2147483647) at emacs.c:443
443 {
(gdb) bt
#0 terminate_due_to_signal (sig=sig@entry=6, backtrace_limit=backtrace_limit@entry=2147483647) at emacs.c:443
#1 0x00005555558634be in set_state (state=IGC_STATE_DEAD) at igc.c:179
#2 igc_assert_fail (file=<optimized out>, line=<optimized out>, msg=<optimized out>) at igc.c:205
#3 0x00005555558f1e19 in mps_lib_assert_fail (condition=0x555555943c4c "res == 0", line=126, file=0x555555943c36 "lockix.c")
at /home/yantar92/Dist/mps/code/mpsliban.c:87
#4 LockClaim (lock=0x7fffe8000110) at /home/yantar92/Dist/mps/code/lockix.c:126
#5 0x00005555558f204d in ArenaEnterLock (arena=0x7ffff7fbf000, recursive=0) at /home/yantar92/Dist/mps/code/global.c:576
#6 0x000055555591aefe in ArenaEnter (arena=0x7ffff7fbf000) at /home/yantar92/Dist/mps/code/global.c:553
#7 ArenaAccess (addr=0x7fffeb908758, mode=mode@entry=3, context=context@entry=0x7fffffff97d0) at /home/yantar92/Dist/mps/code/global.c:655
#8 0x0000555555926202 in sigHandle (sig=<optimized out>, info=0x7fffffff9af0, uap=0x7fffffff99c0) at /home/yantar92/Dist/mps/code/protsgix.c:97
#9 0x00007ffff3048050 in <signal handler called> () at /lib64/libc.so.6
#10 0x0000555555827385 in PSEUDOVECTORP (a=XIL(0x7fffeb90875d), code=9) at /home/yantar92/Git/emacs/src/lisp.h:1105
#11 PROCESSP (a=XIL(0x7fffeb90875d)) at /home/yantar92/Git/emacs/src/process.h:212
#12 XPROCESS (a=XIL(0x7fffeb90875d)) at /home/yantar92/Git/emacs/src/process.h:224
#13 handle_child_signal (sig=sig@entry=17) at process.c:7660
#14 0x000055555573b771 in deliver_process_signal (sig=17, handler=handler@entry=0x555555827200 <handle_child_signal>) at sysdep.c:1758
#15 0x0000555555820647 in deliver_child_signal (sig=<optimized out>) at process.c:7702
#16 0x00007ffff3048050 in <signal handler called> () at /lib64/libc.so.6
#17 0x000055555585f77b in fix_lisp_obj (ss=ss@entry=0x7fffffffa9a8, pobj=pobj@entry=0x7fffeee7ffe8) at igc.c:841
#18 0x000055555586050d in fix_cons (ss=0x7fffffffa9a8, cons=0x7fffeee7ffe0) at igc.c:1474
#19 dflt_scan_obj (ss=0x7fffffffa9a8, base_start=0x7fffeee7ffd8, base_limit=0x7fffeee80000, closure=0x0) at igc.c:1578
#20 dflt_scanx (ss=ss@entry=0x7fffffffa9a8, base_start=<optimized out>, base_limit=0x7fffeee80000, closure=closure@entry=0x0) at igc.c:1658
#21 0x00005555558613a3 in dflt_scan (ss=0x7fffffffa9a8, base_start=<optimized out>, base_limit=<optimized out>) at igc.c:1669
#22 0x00005555558f163f in TraceScanFormat (limit=0x7fffeee80000, base=0x7fffeee7e000, ss=0x7fffffffa9a0) at /home/yantar92/Dist/mps/code/trace.c:1539
#23 amcSegScan (totalReturn=0x7fffffffa99c, seg=0x7fffe845e4c8, ss=0x7fffffffa9a0) at /home/yantar92/Dist/mps/code/poolamc.c:1440
#24 0x000055555591e7bc in traceScanSegRes (ts=ts@entry=1, rank=rank@entry=1, arena=arena@entry=0x7ffff7fbf000, seg=seg@entry=0x7fffe845e4c8)
at /home/yantar92/Dist/mps/code/trace.c:1205
#25 0x000055555591e9ca in traceScanSeg (ts=1, rank=1, arena=0x7ffff7fbf000, seg=0x7fffe845e4c8) at /home/yantar92/Dist/mps/code/trace.c:1267
#26 0x000055555591f3a4 in TraceAdvance (trace=trace@entry=0x7ffff7fbfaa8) at /home/yantar92/Dist/mps/code/trace.c:1728
#27 0x000055555591faa4 in TracePoll
(workReturn=workReturn@entry=0x7fffffffab90, collectWorldReturn=collectWorldReturn@entry=0x7fffffffab8c, globals=globals@entry=0x7ffff7fbf008, collectWorldAllowed=<optimized out>) at /home/yantar92/Dist/mps/code/trace.c:1849
#28 0x000055555591fceb in ArenaPoll (globals=globals@entry=0x7ffff7fbf008) at /home/yantar92/Dist/mps/code/global.c:745
#29 0x00005555559200da in mps_ap_fill (p_o=p_o@entry=0x7fffffffad00, mps_ap=mps_ap@entry=0x7fffe80017f0, size=size@entry=24)
at /home/yantar92/Dist/mps/code/mpsi.c:1097
#30 0x00005555558601ee in alloc_impl (size=24, type=IGC_OBJ_CONS, ap=0x7fffe80017f0) at igc.c:3330
#31 0x000055555586023c in alloc (size=size@entry=16, type=type@entry=IGC_OBJ_CONS) at igc.c:3358
#32 0x000055555586187a in igc_make_cons (car=XIL(0x133e0), cdr=XIL(0)) at igc.c:3385
#33 0x000055555578e7de in Fcons (car=<optimized out>, cdr=<optimized out>) at alloc.c:2926
#34 Flist (nargs=31, args=0x7fffffffaf38) at alloc.c:3054
#35 0x00007ffff06b13ea in F7365742d666163652d617474726962757465_set_face_attribute_0 ()
This says:
. we called Fcons (from a "normal" Emacs Lisp program, which called
set-face-attribute)
. that entered MPS by way of igc_make_cons
. MPS called our scanning code in dflt_scan
. while in the fix_* functions called by dflt_scan, we got SIGCHLD
. the SIGCHLD handler accessed Lisp data of the process object(s),
  which triggered MPS's SIGSEGV handler
. the MPS handler tried to take the (already-held) arena lock and
  aborted
IOW, SIGCHLD did NOT interrupt the MPS SIGSEGV handler; it interrupted
the "normal" MPS code when it called our scanning callbacks.
> > Why not simply bind the sigusr2 event to some function (see the node
> > "Misc Events" in the ELisp manual for how), and then use "kill -USR2"
> > outside of Emacs? IOW, I guess I don't understand why you'd need all
> > that complexity just to reproduce the crashes.
>
> Because I wanted to be sure to hit the tiny window while a global lock was taken.
I think the scenario above with SIGCHLD does precisely that, no?