unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#8855: Fix backported to Emacs 24
       [not found] <50AD56C9.5040100@cornell.edu>
@ 2012-11-21 23:50 ` Paul Eggert
  2012-11-22  3:49   ` Eli Zaretskii
  2012-11-22 15:13   ` Stefan Monnier
  0 siblings, 2 replies; 8+ messages in thread
From: Paul Eggert @ 2012-11-21 23:50 UTC (permalink / raw)
  To: 8855

[-- Attachment #1: Type: text/plain, Size: 91 bytes --]

Attached is the fixed backported to the emacs-24 branch
(bzr 110936) if there's interest.


[-- Attachment #2: g_spawn_sync.txt --]
[-- Type: text/plain, Size: 11638 bytes --]

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2012-11-21 19:28:14 +0000
+++ src/ChangeLog	2012-11-21 23:43:47 +0000
@@ -1,3 +1,34 @@
+2012-11-21  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Fix a race condition that causes Emacs to mess up glib (Bug#8855).
+	The symptom is a diagnostic "GLib-WARNING **: In call to
+	g_spawn_sync(), exit status of a child process was requested but
+	SIGCHLD action was set to SIG_IGN and ECHILD was received by
+	waitpid(), so exit status can't be returned."  The diagnostic
+	is partly wrong, as the SIGCHLD action is not set to SIG_IGN.
+	The real bug is a race condition between Emacs and glib: Emacs
+	does a waitpid (-1, ...) and reaps glib's subprocess by mistake,
+	so that glib can't find it.  Work around the bug by invoking
+	waitpid only on subprocesses that Emacs itself creates.
+	* process.c (create_process, record_child_status_change):
+	Don't use special value -1 in pid field, as the caller now must
+	know the pid rather than having the callee infer it.  The
+	inference was sometimes incorrect anyway, due to another race.
+	(create_process): Set new 'alive' member if child is created.
+	(process_status_retrieved): New function.
+	(record_child_status_change): Use it.
+	Accept negative 1st argument, which means to wait for the
+	processes that Emacs already knows about.  Move special-case code
+	for DOS_NT (which lacks WNOHANG) here, from caller.  Keep track of
+	processes that have already been waited for, by testing and
+	clearing new 'alive' member.
+	(CAN_HANDLE_MULTIPLE_CHILDREN): Remove, as record_child_status_change
+	now does this internally.
+	(handle_child_signal): Let record_child_status_change do all
+	the work, since we do not want to reap all exited child processes,
+	only the child processes that Emacs itself created.
+	* process.h (Lisp_Process): New boolean member 'alive'.
+
 2012-11-21  Eli Zaretskii  <eliz@gnu.org>
 
 	* indent.c (Fvertical_motion): If the starting position is covered

=== modified file 'src/process.c'
--- src/process.c	2012-10-31 17:27:29 +0000
+++ src/process.c	2012-11-21 23:43:47 +0000
@@ -130,6 +130,10 @@
 		       EMACS_TIME *, void *);
 #endif
 
+/* This is for DOS_NT ports.  FIXME: Remove this old portability cruft
+   by having DOS_NT ports implement waitpid instead of wait.  Nowadays
+   POSIXish hosts all define waitpid, WNOHANG, and WUNTRACED, as these
+   have been standard since POSIX.1-1988.  */
 #ifndef WNOHANG
 # undef waitpid
 # define waitpid(pid, status, options) wait (status)
@@ -795,9 +799,8 @@
 #ifdef SIGCHLD
 /* Fdelete_process promises to immediately forget about the process, but in
    reality, Emacs needs to remember those processes until they have been
-   treated by the SIGCHLD handler; otherwise this handler would consider the
-   process as being synchronous and say that the synchronous process is
-   dead.  */
+   treated by the SIGCHLD handler and waitpid has been invoked on them;
+   otherwise they might fill up the kernel's process table.  */
 static Lisp_Object deleted_pid_list;
 #endif
 
@@ -1704,16 +1707,7 @@
   if (inchannel > max_process_desc)
     max_process_desc = inchannel;
 
-  /* Until we store the proper pid, enable the SIGCHLD handler
-     to recognize an unknown pid as standing for this process.
-     It is very important not to let this `marker' value stay
-     in the table after this function has returned; if it does
-     it might cause call-process to hang and subsequent asynchronous
-     processes to get their return values scrambled.  */
-  XPROCESS (process)->pid = -1;
-
-  /* This must be called after the above line because it may signal an
-     error. */
+  /* This may signal an error. */
   setup_process_coding_systems (process);
 
   encoded_current_dir = ENCODE_FILE (current_dir);
@@ -1880,6 +1874,8 @@
 #endif
 
   XPROCESS (process)->pid = pid;
+  if (0 <= pid)
+    XPROCESS (process)->alive = 1;
 
   /* Stop blocking signals in the parent.  */
 #ifdef SIGCHLD
@@ -6273,9 +6269,35 @@
   return process;
 }
 \f
-/* On receipt of a signal that a child status has changed, loop asking
-   about children with changed statuses until the system says there
-   are no more.
+/* If the status of the process DESIRED has changed, return true and
+   set *STATUS to its exit status; otherwise, return false.
+   If HAVE is nonnegative, assume that HAVE = waitpid (HAVE, STATUS, ...)
+   has already been invoked, and do not invoke waitpid again.  */
+
+static bool
+process_status_retrieved (pid_t desired, pid_t have, int *status)
+{
+  if (have < 0)
+    {
+      /* Invoke waitpid only with a known process ID; do not invoke
+	 waitpid with a nonpositive argument.  Otherwise, Emacs might
+	 reap an unwanted process by mistake.  For example, invoking
+	 waitpid (-1, ...) can mess up glib by reaping glib's subprocesses,
+	 so that another thread running glib won't find them.  */
+      do
+	have = waitpid (desired, status, WNOHANG | WUNTRACED);
+      while (have < 0 && errno == EINTR);
+    }
+
+  return have == desired;
+}
+
+/* If PID is nonnegative, the child process PID with wait status W has
+   changed its status; record this and return true.
+
+   If PID is negative, ignore W, and look for known child processes
+   of Emacs whose status have changed.  For each one found, record its new
+   status.
 
    All we do is change the status; we do not run sentinels or print
    notifications.  That is saved for the next time keyboard input is
@@ -6298,13 +6320,23 @@
    ** Malloc WARNING: This should never call malloc either directly or
    indirectly; if it does, that is a bug  */
 
-/* Record the changed status of the child process PID with wait status W.  */
 void
 record_child_status_change (pid_t pid, int w)
 {
 #ifdef SIGCHLD
-  Lisp_Object proc;
-  struct Lisp_Process *p;
+
+# ifdef WNOHANG
+  /* On POSIXish hosts, record at most one child only if we already
+     know one child that has exited.  */
+  bool record_at_most_one_child = 0 <= pid;
+# else
+  /* On DOS_NT (the only porting target that lacks WNOHANG),
+     record the status of at most one child process, since the SIGCHLD
+     handler must return right away.  If any more processes want to
+     signal us, we will get another signal.  */
+  bool record_at_most_one_child = 1;
+# endif
+
   Lisp_Object tail;
 
   /* Find the process that signaled us, and record its status.  */
@@ -6312,68 +6344,69 @@
   /* The process can have been deleted by Fdelete_process.  */
   for (tail = deleted_pid_list; CONSP (tail); tail = XCDR (tail))
     {
+      bool all_pids_are_fixnums
+	= (MOST_NEGATIVE_FIXNUM <= TYPE_MINIMUM (pid_t)
+	   && TYPE_MAXIMUM (pid_t) <= MOST_POSITIVE_FIXNUM);
       Lisp_Object xpid = XCAR (tail);
-      if ((INTEGERP (xpid) && pid == XINT (xpid))
-	  || (FLOATP (xpid) && pid == XFLOAT_DATA (xpid)))
+      if (all_pids_are_fixnums ? INTEGERP (xpid) : NUMBERP (xpid))
 	{
-	  XSETCAR (tail, Qnil);
-	  return;
+	  pid_t deleted_pid;
+	  if (INTEGERP (xpid))
+	    deleted_pid = XINT (xpid);
+	  else
+	    deleted_pid = XFLOAT_DATA (xpid);
+	  if (process_status_retrieved (deleted_pid, pid, &w))
+	    {
+	      XSETCAR (tail, Qnil);
+	      if (record_at_most_one_child)
+		return;
+	    }
 	}
     }
 
   /* Otherwise, if it is asynchronous, it is in Vprocess_alist.  */
-  p = 0;
   for (tail = Vprocess_alist; CONSP (tail); tail = XCDR (tail))
     {
-      proc = XCDR (XCAR (tail));
-      p = XPROCESS (proc);
-      if (EQ (p->type, Qreal) && p->pid == pid)
-	break;
-      p = 0;
-    }
-
-  /* Look for an asynchronous process whose pid hasn't been filled
-     in yet.  */
-  if (! p)
-    for (tail = Vprocess_alist; CONSP (tail); tail = XCDR (tail))
-      {
-	proc = XCDR (XCAR (tail));
-	p = XPROCESS (proc);
-	if (p->pid == -1)
-	  break;
-	p = 0;
-      }
-
-  /* Change the status of the process that was found.  */
-  if (p)
-    {
-      int clear_desc_flag = 0;
-
-      p->tick = ++process_tick;
-      p->raw_status = w;
-      p->raw_status_new = 1;
-
-      /* If process has terminated, stop waiting for its output.  */
-      if ((WIFSIGNALED (w) || WIFEXITED (w))
-	  && p->infd >= 0)
-	clear_desc_flag = 1;
-
-      /* We use clear_desc_flag to avoid a compiler bug in Microsoft C.  */
-      if (clear_desc_flag)
+      Lisp_Object proc = XCDR (XCAR (tail));
+      struct Lisp_Process *p = XPROCESS (proc);
+      if (p->alive && process_status_retrieved (p->pid, pid, &w))
 	{
-	  FD_CLR (p->infd, &input_wait_mask);
-	  FD_CLR (p->infd, &non_keyboard_wait_mask);
+	  /* Change the status of the process that was found.  */
+	  p->tick = ++process_tick;
+	  p->raw_status = w;
+	  p->raw_status_new = 1;
+
+	  /* If process has terminated, stop waiting for its output.  */
+	  if (WIFSIGNALED (w) || WIFEXITED (w))
+	    {
+	      int clear_desc_flag = 0;
+	      p->alive = 0;
+	      if (p->infd >= 0)
+		clear_desc_flag = 1;
+
+	      /* clear_desc_flag avoids a compiler bug in Microsoft C.  */
+	      if (clear_desc_flag)
+		{
+		  FD_CLR (p->infd, &input_wait_mask);
+		  FD_CLR (p->infd, &non_keyboard_wait_mask);
+		}
+	    }
+
+	  /* Tell wait_reading_process_output that it needs to wake up and
+	     look around.  */
+	  if (input_available_clear_time)
+	    *input_available_clear_time = make_emacs_time (0, 0);
+
+	  if (record_at_most_one_child)
+	    return;
 	}
-
-      /* Tell wait_reading_process_output that it needs to wake up and
-	 look around.  */
-      if (input_available_clear_time)
-	*input_available_clear_time = make_emacs_time (0, 0);
     }
-  /* There was no asynchronous process found for that pid: we have
-     a synchronous process.  */
-  else
+
+  if (0 <= pid)
     {
+      /* The caller successfully waited for a pid but no asynchronous
+	 process was found for it, so this is a synchronous process.  */
+
       synch_process_alive = 0;
 
       /* Report the status of the synchronous process.  */
@@ -6392,38 +6425,10 @@
 
 #ifdef SIGCHLD
 
-/* On some systems, the SIGCHLD handler must return right away.  If
-   any more processes want to signal us, we will get another signal.
-   Otherwise, loop around to use up all the processes that have
-   something to tell us.  */
-#if (defined WINDOWSNT \
-     || (defined USG && !defined GNU_LINUX \
-	 && !(defined HPUX && defined WNOHANG)))
-enum { CAN_HANDLE_MULTIPLE_CHILDREN = 0 };
-#else
-enum { CAN_HANDLE_MULTIPLE_CHILDREN = 1 };
-#endif
-
 static void
 handle_child_signal (int sig)
 {
-  do
-    {
-      pid_t pid;
-      int status;
-
-      do
-	pid = waitpid (-1, &status, WNOHANG | WUNTRACED);
-      while (pid < 0 && errno == EINTR);
-
-      /* PID == 0 means no processes found, PID == -1 means a real failure.
-	 Either way, we have done all our job.  */
-      if (pid <= 0)
-	break;
-
-      record_child_status_change (pid, status);
-    }
-  while (CAN_HANDLE_MULTIPLE_CHILDREN);
+  record_child_status_change (-1, 0);
 }
 
 static void

=== modified file 'src/process.h'
--- src/process.h	2012-08-27 17:23:48 +0000
+++ src/process.h	2012-11-21 23:43:47 +0000
@@ -142,6 +142,9 @@
     /* Flag to set coding-system of the process buffer from the
        coding_system used to decode process output.  */
     unsigned int inherit_coding_system_flag : 1;
+    /* Whether the process is alive, i.e., can be waited for.  Running
+       processes can be waited for, but exited and fake processes cannot.  */
+    unsigned int alive : 1;
     /* Record the process status in the raw form in which it comes from `wait'.
        This is to avoid consing in a signal handler.  The `raw_status_new'
        flag indicates that `raw_status' contains a new status that still


^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2012-11-21 23:50 ` bug#8855: Fix backported to Emacs 24 Paul Eggert
@ 2012-11-22  3:49   ` Eli Zaretskii
  2012-11-22 15:13   ` Stefan Monnier
  1 sibling, 0 replies; 8+ messages in thread
From: Eli Zaretskii @ 2012-11-22  3:49 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 8855

> Date: Wed, 21 Nov 2012 15:50:59 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> Attached is the fixed backported to the emacs-24 branch
> (bzr 110936) if there's interest.

The related changes for MS-Windows (r110922) are needed as well.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2012-11-21 23:50 ` bug#8855: Fix backported to Emacs 24 Paul Eggert
  2012-11-22  3:49   ` Eli Zaretskii
@ 2012-11-22 15:13   ` Stefan Monnier
  2012-11-22 19:20     ` Paul Eggert
  1 sibling, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2012-11-22 15:13 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 8855

> Attached is the fixed backported to the emacs-24 branch
> (bzr 110936) if there's interest.

There's something I don't understand.
On Oct 29, Chong said:

> > simplifications, unfortunately.  The patch fixes a bug that's been
> > reported multiple times so I'm thinking it may be worthwhile to
> > install now, even though there's a feature freeze.
> I think the patch is OK to commit.  Put the extended explanation in a
> comment somewhere in the code, though---not in the ChangeLog entry.

So how come the patch is not in emacs-24 yet?


        Stefan





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2012-11-22 15:13   ` Stefan Monnier
@ 2012-11-22 19:20     ` Paul Eggert
  2012-11-23  2:40       ` Chong Yidong
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2012-11-22 19:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 8855

On 11/22/2012 07:13 AM, Stefan Monnier wrote:
> So how come the patch is not in emacs-24 yet?

I understood Chong's comment to be about the trunk, not
about emacs-24.  The trunk was frozen at the time, if
I recall, so his permission would have been needed for
the trunk.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2012-11-22 19:20     ` Paul Eggert
@ 2012-11-23  2:40       ` Chong Yidong
  2012-11-23 21:46         ` Ken Brown
  0 siblings, 1 reply; 8+ messages in thread
From: Chong Yidong @ 2012-11-23  2:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 8855

Paul Eggert <eggert@cs.ucla.edu> writes:

> On 11/22/2012 07:13 AM, Stefan Monnier wrote:
>> So how come the patch is not in emacs-24 yet?
>
> I understood Chong's comment to be about the trunk, not about
> emacs-24.  The trunk was frozen at the time, if I recall, so his
> permission would have been needed for the trunk.

The emacs-24 branch hadn't even been created when I wrote that, so
"commits to the trunk" at the time meant "code going into 24.3".

Feel free to backport it now.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2012-11-23  2:40       ` Chong Yidong
@ 2012-11-23 21:46         ` Ken Brown
  2012-11-23 22:22           ` Paul Eggert
  0 siblings, 1 reply; 8+ messages in thread
From: Ken Brown @ 2012-11-23 21:46 UTC (permalink / raw)
  To: Chong Yidong; +Cc: Paul Eggert, 8855

On 11/22/2012 9:40 PM, Chong Yidong wrote:
> Paul Eggert <eggert@cs.ucla.edu> writes:
>
>> On 11/22/2012 07:13 AM, Stefan Monnier wrote:
>>> So how come the patch is not in emacs-24 yet?
>>
>> I understood Chong's comment to be about the trunk, not about
>> emacs-24.  The trunk was frozen at the time, if I recall, so his
>> permission would have been needed for the trunk.
>
> The emacs-24 branch hadn't even been created when I wrote that, so
> "commits to the trunk" at the time meant "code going into 24.3".
>
> Feel free to backport it now.

It would be good to get this done before tomorrow's release of 24.2.90. 
  Paul, is there still time to do it?

Ken





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2012-11-23 21:46         ` Ken Brown
@ 2012-11-23 22:22           ` Paul Eggert
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Eggert @ 2012-11-23 22:22 UTC (permalink / raw)
  To: Ken Brown; +Cc: Chong Yidong, 8855

On 11/23/2012 01:46 PM, Ken Brown wrote:
> Paul, is there still time to do it?

Yes, done with Eli's suggestions as emacs-24 bzr 110946.

On 11/22/2012 06:40 PM, Chong Yidong wrote:
>
> The emacs-24 branch hadn't even been created when I wrote that, so
> "commits to the trunk" at the time meant "code going into 24.3".

Sorry about the confusion -- my memory must have been faulty.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#8855: Fix backported to Emacs 24
  2011-06-13 17:49 bug#8855: dbus error at startup Dan Nicolaescu
@ 2012-11-27  2:32 ` Paul Eggert
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Eggert @ 2012-11-27  2:32 UTC (permalink / raw)
  To: 8855

As discussed on emacs-devel, I had to undo that backport to
the emacs-24 branch.

The problem was that the fix for Bug#8855 introduces a new
bug, Bug#12980.  I have followed up at Bug#12980 but the patch
is nontrivial, so for now Bug#8855 remains unfixed on the
emacs-24 branch.





^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-11-27  2:32 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <50AD56C9.5040100@cornell.edu>
2012-11-21 23:50 ` bug#8855: Fix backported to Emacs 24 Paul Eggert
2012-11-22  3:49   ` Eli Zaretskii
2012-11-22 15:13   ` Stefan Monnier
2012-11-22 19:20     ` Paul Eggert
2012-11-23  2:40       ` Chong Yidong
2012-11-23 21:46         ` Ken Brown
2012-11-23 22:22           ` Paul Eggert
2011-06-13 17:49 bug#8855: dbus error at startup Dan Nicolaescu
2012-11-27  2:32 ` bug#8855: Fix backported to Emacs 24 Paul Eggert

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).