* bug#14474: 24.3.50; Zombie subprocesses (again) @ 2013-05-25 23:38 Michael Heerdegen 2013-05-25 23:49 ` Michael Heerdegen ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Michael Heerdegen @ 2013-05-25 23:38 UTC (permalink / raw) To: 14474 Hello, dunno if this is related to bug#12980. Although I had used a fresh build all the time, I saw the following problem yesterday for the first time (note: was on a trip before, so the problem could have been introduced one or two weeks before today). I'm using emacs-snapshot on Debian, currently a five days old build: "GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.4.2) of 2013-05-21 on dex, modified by Debian" I'm experiencing the following: - I start Emacs in X as a different user (via gksu), or - I start Emacs from an X session that was started with startx In such an Emacs, any child process seems to become a zombie after being finished. E.g., after typing "exit" in a *terminal* running bash, there is still a running buffer process. As a symptom, CPU is used continuously at 100% until I C-x C-c. However, if I log in via display manager and don't switch to another user via gksu, this doesn't happen. And: it happens with the gtk version as well as with the lucid version, but _not_ with emacs -nw in an xterm. Please ask me if you need more info. Thanks, Michael. 
In GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.4.2) of 2013-05-21 on dex, modified by Debian (emacs-snapshot package, version 2:20130520-1) Windowing system distributor `The X.Org Foundation', version 11.0.11204000 System Description: Debian GNU/Linux testing (jessie) Configured using: `configure --build x86_64-linux-gnu --host x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var --infodir=/usr/share/info --mandir=/usr/share/man --with-pop=yes --enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/24.3.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.3.50/site-lisp:/usr/share/emacs/site-lisp --without-compress-info --with-crt-dir=/usr/lib/x86_64-linux-gnu/ --with-x=yes --with-x-toolkit=gtk3 --with-imagemagick=yes CFLAGS='-DDEBIAN -DSITELOAD_PURESIZE_EXTRA=5000 -g -O2' CPPFLAGS='-D_FORTIFY_SOURCE=2' LDFLAGS='-g -Wl,--as-needed -znocombreloc'' ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-05-25 23:38 bug#14474: 24.3.50; Zombie subprocesses (again) Michael Heerdegen @ 2013-05-25 23:49 ` Michael Heerdegen 2013-05-26 2:55 ` Eli Zaretskii 2013-05-26 17:37 ` Paul Eggert 2 siblings, 0 replies; 20+ messages in thread From: Michael Heerdegen @ 2013-05-25 23:49 UTC (permalink / raw) To: 14474 Michael Heerdegen <michael_heerdegen@web.de> writes: > I'm experiencing the following: > > - I start Emacs in X as a different user (via gksu), or > > - I start Emacs from an X session that was started with startx > > In such an Emacs, any child process seems to become a zombie after being > finished. E.g., after typing "exit" in a *terminal* running bash, there > is still a running buffer process. As a symptom, CPU is used > continuously at 100% until I C-x C-c. BTW, this is what Paul Eggert answered in emacs-dev: > I can reproduce the problem on Ubuntu 13.04. Apparently when you > start up a GTK Emacs session that can't talk to dbus (because it's > su'ed), the dbus library starts up its own service, using dbus-launch. > This messes up Emacs somehow (I don't know why). Thanks, Michael. ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-05-25 23:38 bug#14474: 24.3.50; Zombie subprocesses (again) Michael Heerdegen 2013-05-25 23:49 ` Michael Heerdegen @ 2013-05-26 2:55 ` Eli Zaretskii 2013-05-26 17:37 ` Paul Eggert 2 siblings, 0 replies; 20+ messages in thread From: Eli Zaretskii @ 2013-05-26 2:55 UTC (permalink / raw) To: michael_heerdegen; +Cc: 14474 > From: Michael Heerdegen <michael_heerdegen@web.de> > Date: Sun, 26 May 2013 01:38:56 +0200 > > In such an Emacs, any child process seems to become a zombie after being > finished. E.g., after typing "exit" in a *terminal* running bash, there > is still a running buffer process. As a symptom, CPU is used > continuously at 100% until I C-x C-c. Can you attach a debugger and see where Emacs is looping? etc/DEBUG tells how to do that. Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-25 23:38 bug#14474: 24.3.50; Zombie subprocesses (again) Michael Heerdegen
  2013-05-25 23:49 ` Michael Heerdegen
  2013-05-26  2:55 ` Eli Zaretskii
@ 2013-05-26 17:37 ` Paul Eggert
  2013-05-26 18:33   ` Michael Heerdegen
  2 siblings, 1 reply; 20+ messages in thread
From: Paul Eggert @ 2013-05-26 17:37 UTC (permalink / raw)
  To: 14474

A workaround, for me at least, is to propagate the
DBUS_SESSION_BUS_ADDRESS environment variable into the
child process with a different userid. For example,
here is a failing session, where I became the user 'exp'
and later observed the problem in a shell window:

  $ env | grep DBUS
  DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
  $ sudo sh
  # su exp
  $ env | grep DBUS
  $ emacs

  ** (emacs:15115): WARNING **: Couldn't connect to accessibility bus:
  Failed to connect to socket /tmp/dbus-x2KgryK9C8: Connection refused

And here is a session that worked. The key difference is that
I used sudo's '-E' option:

  $ env | grep DBUS
  DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
  $ sudo -E sh
  # su exp
  $ env | grep DBUS
  DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
  $ emacs

  ** (emacs:15441): WARNING **: Couldn't connect to accessibility bus:
  Failed to connect to socket /tmp/dbus-x2KgryK9C8: Connection refused

In both cases, the dbus library complains to stderr that it can't
connect to /tmp/dbus-x2KgryK9C8 (I don't know where it's getting
that name from). When DBUS_SESSION_BUS_ADDRESS is unset, the
dbus library arranges to run the shell script /usr/bin/dbus-launch,
which seems to cause the problem. But when DBUS_SESSION_BUS_ADDRESS
is set, the dbus library falls back on its contents and doesn't
invoke dbus-launch.

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-26 17:37 ` Paul Eggert
@ 2013-05-26 18:33   ` Michael Heerdegen
  2013-05-27  1:36     ` Paul Eggert
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Heerdegen @ 2013-05-26 18:33 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 14474

Paul Eggert <eggert@cs.ucla.edu> writes:

> And here is a session that worked. The key difference is that
> I used sudo's '-E' option:
>
>   $ env | grep DBUS
>   DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
>   $ sudo -E sh
>   # su exp
>   $ env | grep DBUS
>   DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
>   $ emacs
>
>   ** (emacs:15441): WARNING **: Couldn't connect to accessibility bus:
>   Failed to connect to socket /tmp/dbus-x2KgryK9C8: Connection refused

I see something similar: using the -E flag for sudo works as a
workaround.

However, I don't get this "Failed to connect to socket..." warning.
Instead, I get

  ** (emacs:6638): WARNING **: The connection is closed

Michael.

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-05-26 18:33 ` Michael Heerdegen @ 2013-05-27 1:36 ` Paul Eggert 2013-05-27 12:46 ` Colin Walters 0 siblings, 1 reply; 20+ messages in thread From: Paul Eggert @ 2013-05-27 1:36 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 14474, Colin Walters [The bug is that a bleeding-edge GTK Emacs loses child processes when it's run via sudo; see <http://bugs.gnu.org/14474>.] I think I may have spotted the problem. Glib 2.36.2's glib/gmain.c has a function 'ensure_unix_signal_handler_installed_unlocked' that is run in the dconf worker thread. This function calls sigaction to replace Emacs's SIGCHLD handler with glib's own handler g_unix_signal_handler. Signal handlers are process-wide, so this replacement affects all threads, including the main (Emacs) thread. After that happens, Emacs never sees when its children exit, since g_unix_signal_handler discards Emacs's child-exit notices, and the Emacs function deliver_child_signal is never invoked. The comment for g_child_watch_source_new says that Emacs isn't supposed to invoke waitpid (-1, ...), but that's already the case in the Emacs trunk. Is there another limitation that we didn't know about, a limitation that says Emacs can't have signal handlers either? I'll CC: this to Colin Walters since he seemed to have a good handle on the situation from the glib point of view; see <https://bugzilla.gnome.org/show_bug.cgi?id=676167>. One possibility is to see if we can get Emacs to use glib's child watcher. But that's a bit of a delicate balance, since Emacs must work even when gtk is absent, and it may need to hand off from its own watcher to glib's watcher, and processes shouldn't get lost during the handoff. I don't offhand know how to do all that. A simpler but hacky workaround is to not use the graphical interface if DBUS_SESSION_BUS_ADDRESS is unset. 
Something like this:

--- src/xterm.c 2013-05-09 14:49:56 +0000
+++ src/xterm.c 2013-05-27 01:32:44 +0000
@@ -9819,6 +9819,14 @@ x_display_ok (const char *display)
   int dpy_ok = 1;
   Display *dpy;
 
+#ifdef USE_GTK
+  if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
+    {
+      fprintf (stderr, "DBUS_SESSION_BUS_ADDRESS unset, so Gtk is unsafe\n");
+      return 0;
+    }
+#endif
+
   dpy = XOpenDisplay (display);
   if (dpy)
     XCloseDisplay (dpy);

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-27  1:36 ` Paul Eggert
@ 2013-05-27 12:46   ` Colin Walters
  2013-05-27 17:36     ` Paul Eggert
  2013-06-01  1:03     ` Paul Eggert
  0 siblings, 2 replies; 20+ messages in thread
From: Colin Walters @ 2013-05-27 12:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Michael Heerdegen, 14474

On Sun, 2013-05-26 at 18:36 -0700, Paul Eggert wrote:

> but that's already the case in the Emacs trunk.
> Is there another limitation that we
> didn't know about, a limitation that says Emacs can't
> have signal handlers either?

Basically it's going to be very hard over time to avoid codepaths
in the GTK+ stack that call g_spawn_*() indirectly, thus
installing a SIGCHLD handler, particularly due to the pluggable nature
of Gio.

> I'll CC: this to Colin Walters since he seemed to have
> a good handle on the situation from the glib point of view; see
> <https://bugzilla.gnome.org/show_bug.cgi?id=676167>.

Yeah, I don't think much has changed since then.

> One possibility is to see if we can get Emacs to use
> glib's child watcher.

That'd be best obviously.

> But that's a bit of a delicate balance,
> since Emacs must work even when gtk is absent,

Bear in mind that GLib is usable without gtk. Even if you don't have
an X connection, if the GLib mainloop is linked into the process, I
don't see a reason not to use it.

> and it may need
> to hand off from its own watcher to glib's watcher, and processes
> shouldn't get lost during the handoff.

Would Emacs really be spawning processes before initializing the
frontend?

> A simpler but hacky workaround is to not use the graphical interface if
> DBUS_SESSION_BUS_ADDRESS is unset.

I don't see a real problem with that as a temporary thing.

Anyways, if there is something I can do GLib side, let me know.

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-27 12:46 ` Colin Walters
@ 2013-05-27 17:36   ` Paul Eggert
  2013-05-28 16:56     ` Paul Eggert
  2013-05-28 17:04     ` Jan Djärv
  2013-06-01  1:03   ` Paul Eggert
  1 sibling, 2 replies; 20+ messages in thread
From: Paul Eggert @ 2013-05-27 17:36 UTC (permalink / raw)
  To: Colin Walters; +Cc: Michael Heerdegen, 14474

[The context is http://bugs.gnu.org/14474 ]

On 05/27/2013 05:46 AM, Colin Walters wrote:

> Basically it's going to be very hard over time to avoid codepaths
> in the GTK+ stack that call g_spawn_*() indirectly, thus
> installing a SIGCHLD handler

Thanks. In that case, shouldn't the glib documentation be
changed to warn application developers not to install a SIGCHLD
handler as well? Currently it warns them only to not call
waitpid(-1, ...).

Are application developers allowed to temporarily mask SIGCHLD?
Emacs does that a lot.

>> One possibility is to see if we can get Emacs to use
>> glib's child watcher.
> That'd be best obviously.

I suspect so too, but it requires more expertise in
glib than I have (which is, basically, nothing).
If I understand things correctly, if Emacs is using
Gtk it should

* never call sigaction (SIGCHLD, ...) or signal (SIGCHLD, ...)
  or waitpid (-1, ...).
  E.g., remove the current call to sigaction (SIGCHLD, ...),
  in src/process.c's init_process_emacs.

* Whenever Emacs creates a child process, use the
  following pattern:

     block SIGCHLD;
     pid = vfork ();
     if (pid > 0)
       {
         record pid in Emacs's process table, as location 'loc';
         record in *loc that glib is watching this pid;
         g_child_watch_add (pid, watcher, loc);
       }
     unblock SIGCHLD;

* never call waitpid (pid, ...) if PID is recorded
  in Emacs's process table as something that glib is
  watching.

* Add a glue function ("watcher", above) that does
  something like this:

     void watcher (GPid pid, gint status, gpointer loc) {
       block SIGCHLD
       record that PID exited with status STATUS, by modifying *LOC,
         sort of like what's currently done in handle_child_signal;
       if (input_available_clear_time)
         *input_available_clear_time = make_emacs_time (0, 0);
       unblock SIGCHLD
     }

But this sounds incomplete. No doubt there's something
about the main loop, or setting up the watchers, that I don't
know about. E.g., how does one remove the watcher once it
has fired and told us that the process has exited?
I'll CC: this to Jan Djärv, who knows about gtk, to see
if he can help.

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-27 17:36 ` Paul Eggert
@ 2013-05-28 16:56   ` Paul Eggert
  2013-05-28 20:42     ` Michael Heerdegen
  2013-06-04 17:12     ` Michael Heerdegen
  2013-05-28 17:04   ` Jan Djärv
  1 sibling, 2 replies; 20+ messages in thread
From: Paul Eggert @ 2013-05-28 16:56 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 14474

In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html>
something like the following milder workaround was suggested instead.
Michael, does this patch work around the bug for your test case?

=== modified file 'src/xterm.c'
--- src/xterm.c 2013-05-09 14:49:56 +0000
+++ src/xterm.c 2013-05-28 16:34:44 +0000
@@ -9897,6 +9897,13 @@ x_term_init (Lisp_Object display_name, c
 
     XSetLocaleModifiers ("");
 
+    /* If D-Bus is not already configured, inhibit D-Bus autolaunch,
+       as autolaunch can mess up Emacs's SIGCHLD handler.
+       FIXME: Rewrite subprocess handlers to use glib's child watchers.
+       See Bug#14474. */
+    if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
+      xputenv ("DBUS_SESSION_BUS_ADDRESS=");
+
     /* Emacs can only handle core input events, so make sure
        Gtk doesn't use Xinput or Xinput2 extensions. */
     xputenv ("GDK_CORE_DEVICE_EVENTS=1");

^ permalink raw reply [flat|nested] 20+ messages in thread
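The same "set it only if it is unset" logic can be tried outside Emacs with the standard setenv call, whose third argument controls overwriting. This is a small sketch only: inhibit_dbus_autolaunch is a hypothetical helper name, and the claim that an empty value keeps the dbus library from autolaunching is taken from the patch's own comment, not verified here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: give DBUS_SESSION_BUS_ADDRESS an empty value,
   but only when the variable is not already set.  Passing 0 as the
   OVERWRITE argument makes setenv leave an existing value alone.
   Return 0 on success, -1 on failure, as setenv does.  */
static int
inhibit_dbus_autolaunch (void)
{
  return setenv ("DBUS_SESSION_BUS_ADDRESS", "", 0);
}
```

A caller would run this before any library has a chance to consult the variable, which is why the patch places the equivalent code early in x_term_init.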
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-05-28 16:56 ` Paul Eggert @ 2013-05-28 20:42 ` Michael Heerdegen 2013-06-04 17:12 ` Michael Heerdegen 1 sibling, 0 replies; 20+ messages in thread From: Michael Heerdegen @ 2013-05-28 20:42 UTC (permalink / raw) To: Paul Eggert; +Cc: 14474 Paul Eggert <eggert@cs.ucla.edu> writes: > In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html> > something like the following milder workaround was suggested instead. > Michael, does this patch work around the bug for your test case? Thanks for that, but I currently use a precompiled package for my OS (emacs-snapshot), so I can neither debug C nor test patches. It would be great if someone else that can reproduce this bug could try that. If not, I'll try to build Emacs myself in the next days, hoping that the problem manifests there, too. Regards, Michael. ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-28 16:56 ` Paul Eggert
  2013-05-28 20:42   ` Michael Heerdegen
@ 2013-06-04 17:12   ` Michael Heerdegen
  2020-09-09 13:52     ` Lars Ingebrigtsen
  1 sibling, 1 reply; 20+ messages in thread
From: Michael Heerdegen @ 2013-06-04 17:12 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 14474

Hi Paul,

> In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html>
> something like the following milder workaround was suggested instead.
> Michael, does this patch work around the bug for your test case?

Have you already installed it to trunk? The issue is fixed for me after
upgrading my emacs-snapshot to

(emacs-version) ==>

GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.8.2)
 of 2013-06-03 on dex, modified by Debian

Thanks,

Michael.

>
> === modified file 'src/xterm.c'
> --- src/xterm.c 2013-05-09 14:49:56 +0000
> +++ src/xterm.c 2013-05-28 16:34:44 +0000
> @@ -9897,6 +9897,13 @@ x_term_init (Lisp_Object display_name, c
>
>      XSetLocaleModifiers ("");
>
> +    /* If D-Bus is not already configured, inhibit D-Bus autolaunch,
> +       as autolaunch can mess up Emacs's SIGCHLD handler.
> +       FIXME: Rewrite subprocess handlers to use glib's child watchers.
> +       See Bug#14474. */
> +    if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
> +      xputenv ("DBUS_SESSION_BUS_ADDRESS=");
> +
>      /* Emacs can only handle core input events, so make sure
>         Gtk doesn't use Xinput or Xinput2 extensions. */
>      xputenv ("GDK_CORE_DEVICE_EVENTS=1");

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-06-04 17:12 ` Michael Heerdegen @ 2020-09-09 13:52 ` Lars Ingebrigtsen 0 siblings, 0 replies; 20+ messages in thread From: Lars Ingebrigtsen @ 2020-09-09 13:52 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Paul Eggert, 14474 Michael Heerdegen <michael_heerdegen@web.de> writes: >> In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html> >> something like the following milder workaround was suggested instead. >> Michael, does this patch work around the bug for your test case? > > Have already installed it to trunk? The issue is fixed for me after > upgrading my emacs-snapshot to > > (emacs-version) ==> > > GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.8.2) > of 2013-06-03 on dex, modified by Debian There was some followup talk here about other possible glib problems, but it looks like Paul fixed those two? (I just skimmed the patch, which was applied at the time.) So I'm closing this bug report; if there are any further issues here, please respond to the debbugs address and we'll reopen. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-27 17:36 ` Paul Eggert
  2013-05-28 16:56   ` Paul Eggert
@ 2013-05-28 17:04   ` Jan Djärv
  1 sibling, 0 replies; 20+ messages in thread
From: Jan Djärv @ 2013-05-28 17:04 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Michael Heerdegen, Colin Walters, 14474

Hello.

On 27 May 2013 at 19:36, Paul Eggert <eggert@cs.ucla.edu> wrote:

> [The context is
> http://bugs.gnu.org/14474
> ]
>
> On 05/27/2013 05:46 AM, Colin Walters wrote:
>
>> Basically it's going to be very hard over time to avoid codepaths
>> in the GTK+ stack that call g_spawn_*() indirectly, thus
>> installing a SIGCHLD handler
>
> Thanks. In that case, shouldn't the glib documentation be
> changed to warn application developers not to install a SIGCHLD
> handler as well? Currently it warns them only to not call
> waitpid(-1, ...).
>
> Are application developers allowed to temporarily mask SIGCHLD?
> Emacs does that a lot.
>
>>> One possibility is to see if we can get Emacs to use
>>> glib's child watcher.
>> That'd be best obviously.
>
> I suspect so too, but it requires more expertise in
> glib than I have (which is, basically, nothing).
> If I understand things correctly, if Emacs is using
> Gtk it should
>

Actually GLib is linked in whenever one of GSettings, GConf, Gtk or rsvg
is used. I see that the rsvg-only case is not handled in xgselect.c, an
oversight.

> * never call sigaction (SIGCHLD, ...) or signal (SIGCHLD, ...)
>   or waitpid (-1, ...).
>   E.g., remove the current call to sigaction (SIGCHLD, ...),
>   in src/process.c's init_process_emacs.
>
> * Whenever Emacs creates a child process, use the
>   following pattern:
>
>      block SIGCHLD;
>      pid = vfork ();
>      if (pid > 0)
>        {
>          record pid in Emacs's process table, as location 'loc';
>          record in *loc that glib is watching this pid;
>          g_child_watch_add (pid, watcher, loc);
>        }
>      unblock SIGCHLD;
>
> * never call waitpid (pid, ...) if PID is recorded
>   in Emacs's process table as something that glib is
>   watching.
>
> * Add a glue function ("watcher", above) that does
>   something like this:
>
>      void watcher (GPid pid, gint status, gpointer loc) {
>        block SIGCHLD
>        record that PID exited with status STATUS, by modifying *LOC,
>          sort of like what's currently done in handle_child_signal;
>        if (input_available_clear_time)
>          *input_available_clear_time = make_emacs_time (0, 0);
>        unblock SIGCHLD
>      }
>
> But this sounds incomplete. No doubt there's something
> about the main loop, or setting up the watchers, that I don't
> know about. E.g., how does one remove the watcher once it
> has fired and told us that the process has exited?
>

Keep track of the return value from g_child_watch_add and pass it to
g_source_remove. I think g_source_remove can be called in the callback
function.

We kind of use GLib's main loop in xgselect.c, so child watches should
be called from there. As GLib's main loop is an "all or nothing"
approach, we could also move the file descriptor and timeout handling
there. Then xgselect.c could more or less go away. But there is no
real gain in doing that; xgselect works well enough.

	Jan D.

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-05-27 12:46 ` Colin Walters
  2013-05-27 17:36   ` Paul Eggert
@ 2013-06-01  1:03   ` Paul Eggert
  2013-06-01  1:22     ` Colin Walters
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Eggert @ 2013-06-01 1:03 UTC (permalink / raw)
  To: Colin Walters; +Cc: Michael Heerdegen, Michael Albinus, 14474

On 05/27/2013 05:46 AM, Colin Walters wrote:

>> One possibility is to see if we can get Emacs to use
>> glib's child watcher.
> That'd be best obviously.

I looked into this a bit, and found a problem.
Emacs wants to be notified about child processes
that are stopped, so it invokes waitpid with the
WUNTRACED option, but glib never uses WUNTRACED
when invoking waitpid. If Emacs used glib to
watch for child processes, Emacs would not be
informed about a child process changing state
because it has stopped. (Similarly for WCONTINUED
and processes that have been continued.)

Perhaps glib needs a new function, which lets the
caller specify additional options to be given to waitpid?
Something like this, say:

   g_child_watch_source_new_full (pid, WUNTRACED | WCONTINUED)

Then, g_child_watch_source_new (pid) would be equivalent to
g_child_watch_source_new_full (pid, 0).

^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-06-01 1:03 ` Paul Eggert @ 2013-06-01 1:22 ` Colin Walters 2013-06-01 6:14 ` Paul Eggert 0 siblings, 1 reply; 20+ messages in thread From: Colin Walters @ 2013-06-01 1:22 UTC (permalink / raw) To: Paul Eggert; +Cc: Michael Heerdegen, Michael Albinus, 14474 On Fri, 2013-05-31 at 18:03 -0700, Paul Eggert wrote: > On 05/27/2013 05:46 AM, Colin Walters wrote: > >> One possibility is to see if we can get Emacs to use > >> > glib's child watcher. > > That'd be best obviously. > > I looked into this a bit, and found a problem. > Emacs wants to be notified about child processes > that are stopped, Why, out of curiosity? > g_child_watch_source_new_full (pid, WUNTRACED | WCONTINUED) We could add that to glib-unix.h probably, yeah. ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-06-01  1:22 ` Colin Walters
@ 2013-06-01  6:14   ` Paul Eggert
  2013-06-01 14:33     ` Stefan Monnier
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Eggert @ 2013-06-01 6:14 UTC (permalink / raw)
  To: Colin Walters; +Cc: Michael Heerdegen, Michael Albinus, 14474

On 05/31/2013 06:22 PM, Colin Walters wrote:

> Why, out of curiosity?

Emacs has a function process-status that returns
a process's status. Possible statuses include

  run    -- for a process that is running.
  stop   -- for a process stopped but continuable.
  exit   -- for a process that has exited.
  signal -- for a process that has got a fatal signal.

To implement this, Emacs keeps track, for each of its
child processes, what that process's status is. Emacs
updates the information that it records about a child
process whenever it's notified about a child process
status change.

^ permalink raw reply [flat|nested] 20+ messages in thread
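The four status names listed above map naturally onto the POSIX wait-status macros. The following sketch shows one such mapping; it is an assumption for illustration and not the code Emacs uses. status_name and final_status_name are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a wait status to one of the names listed in the message.  */
static const char *
status_name (int status)
{
  if (WIFSTOPPED (status))
    return "stop";
  if (WIFEXITED (status))
    return "exit";
  if (WIFSIGNALED (status))
    return "signal";
  return "run";   /* e.g. WIFCONTINUED: the child is running again */
}

/* Hypothetical demo: start a child; if KILL_SIG is nonzero, send it
   that signal, otherwise let it exit with EXIT_CODE.  Return the name
   of its final status, or NULL on failure.  */
static const char *
final_status_name (int exit_code, int kill_sig)
{
  int status;
  pid_t r, pid = fork ();

  if (pid < 0)
    return NULL;
  if (pid == 0)
    {
      if (kill_sig)
        for (;;)
          pause ();               /* wait to be killed */
      _exit (exit_code);
    }

  if (kill_sig)
    kill (pid, kill_sig);
  do
    r = waitpid (pid, &status, 0);
  while (r < 0 && errno == EINTR);
  return r == pid ? status_name (status) : NULL;
}
```

A status that is neither stopped, exited nor signaled is reported as "run" here, which covers the continued case that WCONTINUED makes visible.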
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-06-01 6:14 ` Paul Eggert @ 2013-06-01 14:33 ` Stefan Monnier 2013-06-03 16:09 ` Colin Walters 0 siblings, 1 reply; 20+ messages in thread From: Stefan Monnier @ 2013-06-01 14:33 UTC (permalink / raw) To: Paul Eggert; +Cc: Michael Heerdegen, Colin Walters, Michael Albinus, 14474 > Emacs has a function process-status that returns > a process's status. Not only that, but the process-sentinel is called when the status changes. This said, I don't know if there are any process-sentinels out there that need to be told when a process is stopped or "continued". Stefan ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-06-01 14:33 ` Stefan Monnier @ 2013-06-03 16:09 ` Colin Walters 2013-06-04 7:20 ` Paul Eggert 0 siblings, 1 reply; 20+ messages in thread From: Colin Walters @ 2013-06-03 16:09 UTC (permalink / raw) To: Stefan Monnier; +Cc: Michael Heerdegen, Paul Eggert, Michael Albinus, 14474 On Sat, 2013-06-01 at 10:33 -0400, Stefan Monnier wrote: > > Emacs has a function process-status that returns > > a process's status. > > Not only that, but the process-sentinel is called when the status > changes. This said, I don't know if there are any process-sentinels out > there that need to be told when a process is stopped or "continued". Right; I kind of doubt it. Regardless though, I filed: https://bugzilla.gnome.org/show_bug.cgi?id=701538 Are there any other blocking issues for Emacs using the GLib mainloop? If that's the only one I can probably get around to doing a patch this week. I suspect though you could simply not report stopped status, and not break any real world programs. The only thing I can think of is a multiprocess application which sends SIGSTOP to children (but why would they do that?). ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again) 2013-06-03 16:09 ` Colin Walters @ 2013-06-04 7:20 ` Paul Eggert 2013-06-05 17:21 ` Paul Eggert 0 siblings, 1 reply; 20+ messages in thread From: Paul Eggert @ 2013-06-04 7:20 UTC (permalink / raw) To: Colin Walters; +Cc: Michael Heerdegen, Michael Albinus, 14474 On 06/03/2013 09:09 AM, Colin Walters wrote: > Are there any other blocking issues for Emacs using the GLib mainloop? > If that's the only one I can probably get around to doing a patch this > week. Don't know of any. But I haven't implemented it yet. If it's the only problem, perhaps the Emacs code should be written to run on older glibs, where it'll ignore child-process stops and continues. If this turns out to be a real problem we can disable it (i.e., use the current godawful workaround) on older glibs. But anyway, the idea is to prevent this from being a blocking issue for Emacs. ^ permalink raw reply [flat|nested] 20+ messages in thread
* bug#14474: 24.3.50; Zombie subprocesses (again)
  2013-06-04  7:20 ` Paul Eggert
@ 2013-06-05 17:21   ` Paul Eggert
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Eggert @ 2013-06-05 17:21 UTC (permalink / raw)
  To: Colin Walters; +Cc: Michael Heerdegen, Michael Albinus, 14474

I found another problem with trying to have Emacs use
glib's child watcher. glib's signal handling code uses
SA_RESTART and SA_NOCLDSTOP. Both flags are non-starters
for Emacs. SA_NOCLDSTOP, I suppose, could be conditionalized
based on the discussion in Gnome bug reports 701538 and 562501.
But SA_RESTART is more of a worry. An interactive Emacs doesn't
want SA_RESTART, because Emacs wants long-running syscalls to be
interrupted after a signal, not restarted.

I thought of a way to work around this problem: have Emacs
catch SIGCHLD using its own flags, and call glib's SIGCHLD
handler as part of Emacs's SIGCHLD handler. So I installed
the patch quoted at the end of this message into the Emacs trunk
as bzr 112859. If you've had D-bus problems please try this
new approach.

This raises three more questions for glib, though.

First, why does glib use SA_RESTART? If it's to avoid having
application syscalls fail with errno==EINTR, then we're OK.
But if it's to avoid having glib's internal syscalls fail
with errno==EINTR, then we have a problem, as that can happen
with the following patch (and it can also happen with vanilla
Emacs 24.3).

Second, should there be a more robust way for Emacs to invoke
glib's SIGCHLD handler? The code below is a bit of a hack: it uses
g_source_unref (g_child_watch_source_new (0)) to create and free
a dummy SIGCHLD source, the only reason being to trick glib
into installing its SIGCHLD handler. It also assumes that glib
does not use SA_SIGINFO. This all seems fairly fragile.

Third, if a glib memory allocation fails, what does Emacs do?
Emacs tries hard not to exit when there's a memory allocation
failure, but I worry that glib will simply call 'exit' if
malloc fails, which is not good.

=== modified file 'src/ChangeLog'
--- src/ChangeLog 2013-06-05 12:17:02 +0000
+++ src/ChangeLog 2013-06-05 17:04:13 +0000
@@ -1,3 +1,17 @@
+2013-06-05 Paul Eggert <eggert@cs.ucla.edu>
+
+	Chain glib's SIGCHLD handler from Emacs's (Bug#14474).
+	* process.c (dummy_handler): New function.
+	(lib_child_handler): New static var.
+	(handle_child_signal): Invoke it.
+	(catch_child_signal): If a library has set up a signal handler,
+	save it into lib_child_handler.
+	(init_process_emacs): If using glib and not on Windows, tickle glib's
+	child-handling code so that it initializes its private SIGCHLD handler.
+	* syssignal.h (SA_SIGINFO): Default to 0.
+	* xterm.c (x_term_init): Remove D-bus hack that I installed on May
+	31; it should no longer be needed now.
+
 2013-06-05 Michael Albinus <michael.albinus@gmx.de>
 
 	* emacs.c (main) [HAVE_GFILENOTIFY]: Call globals_of_gfilenotify.

=== modified file 'src/process.c'
--- src/process.c 2013-06-03 18:47:35 +0000
+++ src/process.c 2013-06-05 17:04:13 +0000
@@ -6100,6 +6100,12 @@
    might inadvertently reap a GTK-created process that happened to
    have the same process ID. */
 
+/* LIB_CHILD_HANDLER is a SIGCHLD handler that Emacs calls while doing
+   its own SIGCHLD handling. On POSIXish systems, glib needs this to
+   keep track of its own children. The default handler does nothing. */
+static void dummy_handler (int sig) {}
+static signal_handler_t volatile lib_child_handler = dummy_handler;
+
 /* Handle a SIGCHLD signal by looking for known child processes of
    Emacs whose status have changed. For each one found, record its
    new status.
@@ -6184,6 +6190,8 @@
             }
         }
     }
+
+  lib_child_handler (sig);
 }
 
 static void
@@ -7035,9 +7043,13 @@
 void
 catch_child_signal (void)
 {
-  struct sigaction action;
+  struct sigaction action, old_action;
   emacs_sigaction_init (&action, deliver_child_signal);
-  sigaction (SIGCHLD, &action, 0);
+  sigaction (SIGCHLD, &action, &old_action);
+  eassert (! (old_action.sa_flags & SA_SIGINFO));
+  if (old_action.sa_handler != SIG_DFL && old_action.sa_handler != SIG_IGN
+      && old_action.sa_handler != deliver_child_signal)
+    lib_child_handler = old_action.sa_handler;
 }
 
 \f
@@ -7055,6 +7067,11 @@
   if (! noninteractive || initialized)
 #endif
     {
+#if defined HAVE_GLIB && !defined WINDOWSNT
+      /* Tickle glib's child-handling code so that it initializes its
+         private SIGCHLD handler. */
+      g_source_unref (g_child_watch_source_new (0));
+#endif
       catch_child_signal ();
     }
 

=== modified file 'src/syssignal.h'
--- src/syssignal.h 2013-01-02 16:13:04 +0000
+++ src/syssignal.h 2013-06-05 17:04:13 +0000
@@ -50,6 +50,10 @@
 # define NSIG NSIG_MINIMUM
 #endif
 
+#ifndef SA_SIGINFO
+# define SA_SIGINFO 0
+#endif
+
 #ifndef emacs_raise
 # define emacs_raise(sig) raise (sig)
 #endif

=== modified file 'src/xterm.c'
--- src/xterm.c 2013-05-31 01:41:52 +0000
+++ src/xterm.c 2013-06-05 17:04:13 +0000
@@ -9897,13 +9897,6 @@
 
     XSetLocaleModifiers ("");
 
-    /* If D-Bus is not already configured, inhibit D-Bus autolaunch,
-       as autolaunch can mess up Emacs's SIGCHLD handler.
-       FIXME: Rewrite subprocess handlers to use glib's child watchers.
-       See Bug#14474. */
-    if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
-      xputenv ("DBUS_SESSION_BUS_ADDRESS=unix:path=/dev/null");
-
     /* Emacs can only handle core input events, so make sure
        Gtk doesn't use Xinput or Xinput2 extensions. */
     xputenv ("GDK_CORE_DEVICE_EVENTS=1");

^ permalink raw reply [flat|nested] 20+ messages in thread