busyloop in sigchld


* busyloop in sigchld_handler
@ 2007-03-11 18:33 Sam Steingold
  2007-03-11 19:39 ` Kim F. Storm
  2007-03-26  1:47 ` YAMAMOTO Mitsuharu
  0 siblings, 2 replies; 58+ messages in thread
From: Sam Steingold @ 2007-03-11 18:33 UTC (permalink / raw)
  To: emacs-devel

GNU Emacs 22.0.95.2 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of
2007-03-11 on loiso

I have been observing the following behavior:
emacs hangs in sigchld_handler waiting for the child process to
terminate:

      do
	{
	  errno = 0;
	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
	}
      while (pid < 0 && errno == EINTR);

the system stops responding, loadavg goes to 5-8(!), CPU is 100% busy.
this lasts for ~10 seconds.
this happens on M-x compile and when I click on a URL (it is passed to
an existing Firefox).

I fixed the problem with the following patch:

Index: process.c
===================================================================
RCS file: /sources/emacs/emacs/src/process.c,v
retrieving revision 1.500
retrieving revision 1.501
diff -u -w -u -b -w -i -B -r1.500 -r1.501
--- process.c	1 Mar 2007 10:17:41 -0000	1.500
+++ process.c	11 Mar 2007 18:16:50 -0000	1.501
@@ -6497,6 +6497,7 @@
       /* Keep trying to get a status until we get a definitive result.  */
       do
 	{
+          sleep (1);
 	  errno = 0;
 	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
 	}


Thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Fedora Core release 6 (Zod)
http://ffii.org http://pmw.org.il http://palestinefacts.org
http://iris.org.il http://mideasttruth.com http://truepeace.org
Old Age Comes at a Bad Time.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 18:33 busyloop in sigchld_handler Sam Steingold
@ 2007-03-11 19:39 ` Kim F. Storm
  2007-03-11 19:43   ` David Kastrup
  2007-03-11 21:06   ` Sam Steingold
  2007-03-26  1:47 ` YAMAMOTO Mitsuharu
  1 sibling, 2 replies; 58+ messages in thread
From: Kim F. Storm @ 2007-03-11 19:39 UTC (permalink / raw)
  To: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> GNU Emacs 22.0.95.2 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of
> 2007-03-11 on loiso
>
> I have been observing the following behavior:
> emacs hangs in sigchld_handler waiting for the child process to
> terminate:
>
>       do
> 	{
> 	  errno = 0;
> 	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
> 	}
>       while (pid < 0 && errno == EINTR);
>
> the system stops responding, loadavg goes to 5-8(!), CPU is 100% busy.
> this lasts for ~10 seconds.
> this happens on M-x compile and when I click on a URL (it is passed to
> an existing Firefox).

I've been annoyed by this behaviour for years - but have seen it occasionally
outside Emacs too, so I didn't expect it to be a problem in Emacs.

> I fixed the problem with the following patch:

Absolutely brilliant!  I can confirm that it works.

But can you explain why it works?
And why does the problem primarily hit M-x compile?

Does the fix cause a 1 second delay for other sub-processes?

>
> Index: process.c
> ===================================================================
> RCS file: /sources/emacs/emacs/src/process.c,v
> retrieving revision 1.500
> retrieving revision 1.501
> diff -u -w -u -b -w -i -B -r1.500 -r1.501
> --- process.c	1 Mar 2007 10:17:41 -0000	1.500
> +++ process.c	11 Mar 2007 18:16:50 -0000	1.501
> @@ -6497,6 +6497,7 @@
>        /* Keep trying to get a status until we get a definitive result.  */
>        do
>  	{
> +          sleep (1);
>  	  errno = 0;
>  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
>  	}


-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 19:39 ` Kim F. Storm
@ 2007-03-11 19:43   ` David Kastrup
  2007-03-11 19:51     ` Sam Steingold
  2007-03-11 21:06   ` Sam Steingold
  1 sibling, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-11 19:43 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> Sam Steingold <sds@gnu.org> writes:
>
>> GNU Emacs 22.0.95.2 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of
>> 2007-03-11 on loiso
>>
>> I have been observing the following behavior:
>> emacs hangs in sigchld_handler waiting for the child process to
>> terminate:
>>
>>       do
>> 	{
>> 	  errno = 0;
>> 	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
>> 	}
>>       while (pid < 0 && errno == EINTR);
>
>> I fixed the problem with the following patch:
>
> Absolutely brilliant!  I can confirm that it works.
>
> But can you explain why it works?
> And why does the problem primarily hit M-x compile?
>
> Does the fix cause a 1 second delay for other sub-processes?
>
>>
>> Index: process.c
>> ===================================================================
>> RCS file: /sources/emacs/emacs/src/process.c,v
>> retrieving revision 1.500
>> retrieving revision 1.501
>> diff -u -w -u -b -w -i -B -r1.500 -r1.501
>> --- process.c	1 Mar 2007 10:17:41 -0000	1.500
>> +++ process.c	11 Mar 2007 18:16:50 -0000	1.501
>> @@ -6497,6 +6497,7 @@
>>        /* Keep trying to get a status until we get a definitive result.  */
>>        do
>>  	{
>> +          sleep (1);
>>  	  errno = 0;
>>  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
>>  	}
>

Wouldn't it make more sense to do

for (;;) {
    errno = 0;
    pid = wait3(&w, WNOHANG | WUNTRACED, 0);
    if (!(pid < 0 && errno == EINTR))
      break;
    sleep(1);
}

That way, we don't get an obligatory sleep if things happen to work
without it.
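
A self-contained sketch of this variant, for illustration only.  It uses
waitpid (-1, ...) in place of wait3, which is equivalent when the rusage
argument is 0.  The helper name poll_child_status is hypothetical and does
not appear in process.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not taken from process.c: poll once for a child
   status change and retry only when the call itself was interrupted.
   Returns a pid (> 0) if a child changed state, 0 if children exist but
   none has changed state yet, and -1 on error (e.g. ECHILD).  */
static pid_t
poll_child_status (int *status)
{
  pid_t pid;

  for (;;)
    {
      errno = 0;
      pid = waitpid (-1, status, WNOHANG | WUNTRACED);
      if (!(pid < 0 && errno == EINTR))
        break;
      /* Reached only after an interrupted call, never on the first try.  */
      sleep (1);
    }
  return pid;
}
```

Because of WNOHANG, a return value of 0 also leaves the loop, so the sleep
is paid only in the EINTR case.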

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 19:43   ` David Kastrup
@ 2007-03-11 19:51     ` Sam Steingold
  2007-03-11 20:42       ` Kim F. Storm
  0 siblings, 1 reply; 58+ messages in thread
From: Sam Steingold @ 2007-03-11 19:51 UTC (permalink / raw)
  To: emacs-devel

> * David Kastrup <qnx@tah.bet> [2007-03-11 20:43:13 +0100]:
> storm@cua.dk (Kim F. Storm) writes:
>> Sam Steingold <sds@gnu.org> writes:
>>
>
> Wouldn't it make more sense to do
>
> for (;;) {
>     errno = 0;
>     pid = wait3(&w, WNOHANG | WUNTRACED, 0);
>     if (!(pid < 0 && errno == EINTR))
>       break;
>     sleep(1);
> }
>
> That way, we don't get an obligatory sleep if things happen to work
> without it.

yes - my point was to make the change as small as possible to make it
clear what I want (and minimize opposition). it appears that people do
not mind, so I will modify it along your lines.


-- 
Sam Steingold (http://sds.podval.org/) on Fedora Core release 6 (Zod)
http://honestreporting.com http://palestinefacts.org
http://openvotingconsortium.org http://israelunderattack.slide.com
Liberty is measured in meters: how far one can travel without showing an ID.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 19:51     ` Sam Steingold
@ 2007-03-11 20:42       ` Kim F. Storm
  0 siblings, 0 replies; 58+ messages in thread
From: Kim F. Storm @ 2007-03-11 20:42 UTC (permalink / raw)
  To: emacs-devel

Sam Steingold <sds@gnu.org> writes:

>> * David Kastrup <qnx@tah.bet> [2007-03-11 20:43:13 +0100]:
>> storm@cua.dk (Kim F. Storm) writes:
>>> Sam Steingold <sds@gnu.org> writes:
>>>
>>
>> Wouldn't it make more sense to do
>>
>> for (;;) {
>>     errno = 0;
>>     pid = wait3(&w, WNOHANG | WUNTRACED, 0);
>>     if (!(pid < 0 && errno == EINTR))
>>       break;
>>     sleep(1);
>> }
>>
>> That way, we don't get an obligatory sleep if things happen to work
>> without it.
>
> yes - my point was to make the change as small as possible to make it
> clear what I want (and minimize opposition). it appears that people do
> not mind, so I will modify it along your lines.

I already tried that, and it didn't work !!!

I also tried a shorter sleep (usleep(100000)) before the first wait,
and it didn't work either -- emacs did remain responsive briefly just
after M-x compile completed, but then it became unresponsive for ~10
seconds.

That's why I asked you to explain how the change is supposed to work.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 19:39 ` Kim F. Storm
  2007-03-11 19:43   ` David Kastrup
@ 2007-03-11 21:06   ` Sam Steingold
  2007-03-11 21:14     ` Eli Zaretskii
                       ` (3 more replies)
  1 sibling, 4 replies; 58+ messages in thread
From: Sam Steingold @ 2007-03-11 21:06 UTC (permalink / raw)
  To: emacs-devel

> * Kim F. Storm <fgbez@phn.qx> [2007-03-11 20:39:25 +0100]:
>
> Sam Steingold <sds@gnu.org> writes:
>
> Absolutely brilliant!  I can confirm that it works.

Thanks!

> But can you explain why it works?

wait3 is a system call, which, when invoked in a loop, prevents the
kernel from doing anything else (in this case, sending SIGCHLD to
emacs). sleep allows the kernel some time to pass the signal.


-- 
Sam Steingold (http://sds.podval.org/) on Fedora Core release 6 (Zod)
http://honestreporting.com http://israelunderattack.slide.com http://memri.org
http://iris.org.il http://jihadwatch.org http://dhimmi.com
The plural of "anecdote" is not "data".

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:06   ` Sam Steingold
@ 2007-03-11 21:14     ` Eli Zaretskii
  2007-03-11 21:17       ` Sam Steingold
  2007-03-11 22:27     ` Andreas Schwab
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2007-03-11 21:14 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

> From: Sam Steingold <sds@gnu.org>
> Date: Sun, 11 Mar 2007 17:06:07 -0400
> 
> > But can you explain why it works?
> 
> wait3 is a system call, which, when invoked in a loop, prevents the
> kernel from doing anything else (in this case, sending SIGCHLD to
> emacs). sleep allows the kernel some time to pass the signal.

So how portable is this trick?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:14     ` Eli Zaretskii
@ 2007-03-11 21:17       ` Sam Steingold
  2007-03-11 21:51         ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Sam Steingold @ 2007-03-11 21:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> * Eli Zaretskii <ryvm@tah.bet> [2007-03-11 23:14:01 +0200]:
>
>> From: Sam Steingold <sds@gnu.org>
>> Date: Sun, 11 Mar 2007 17:06:07 -0400
>> 
>> > But can you explain why it works?
>> 
>> wait3 is a system call, which, when invoked in a loop, prevents the
>> kernel from doing anything else (in this case, sending SIGCHLD to
>> emacs). sleep allows the kernel some time to pass the signal.
>
> So how portable is this trick?

sleep is in POSIX.

-- 
Sam Steingold (http://sds.podval.org/) on Fedora Core release 6 (Zod)
http://thereligionofpeace.com http://iris.org.il http://mideasttruth.com
http://palestinefacts.org http://pmw.org.il http://israelunderattack.slide.com
WHO ATE MY BREAKFAST PANTS?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:17       ` Sam Steingold
@ 2007-03-11 21:51         ` Eli Zaretskii
  2007-03-11 22:21           ` Sam Steingold
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2007-03-11 21:51 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

> Date: Sun, 11 Mar 2007 17:17:05 -0400
> From: Sam Steingold <sds@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> > * Eli Zaretskii <ryvm@tah.bet> [2007-03-11 23:14:01 +0200]:
> >
> >> From: Sam Steingold <sds@gnu.org>
> >> Date: Sun, 11 Mar 2007 17:06:07 -0400
> >> 
> >> > But can you explain why it works?
> >> 
> >> wait3 is a system call, which, when invoked in a loop, prevents the
> >> kernel from doing anything else (in this case, sending SIGCHLD to
> >> emacs). sleep allows the kernel some time to pass the signal.
> >
> > So how portable is this trick?
> 
> sleep is in POSIX.

Obviously, that's not what I asked.  I asked how portable is the
behavior of the kernel that you described in the cited text.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:51         ` Eli Zaretskii
@ 2007-03-11 22:21           ` Sam Steingold
  2007-03-12  4:24             ` Richard Stallman
  0 siblings, 1 reply; 58+ messages in thread
From: Sam Steingold @ 2007-03-11 22:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> * Eli Zaretskii <ryvm@tah.bet> [2007-03-11 23:51:57 +0200]:
>
>> Date: Sun, 11 Mar 2007 17:17:05 -0400
>> From: Sam Steingold <sds@gnu.org>
>> Cc: emacs-devel@gnu.org
>> 
>> > * Eli Zaretskii <ryvm@tah.bet> [2007-03-11 23:14:01 +0200]:
>> >
>> >> From: Sam Steingold <sds@gnu.org>
>> >> Date: Sun, 11 Mar 2007 17:06:07 -0400
>> >> 
>> >> > But can you explain why it works?
>> >> 
>> >> wait3 is a system call, which, when invoked in a loop, prevents the
>> >> kernel from doing anything else (in this case, sending SIGCHLD to
>> >> emacs). sleep allows the kernel some time to pass the signal.
>> >
>> > So how portable is this trick?
>> 
>> sleep is in POSIX.
>
> Obviously, that's not what I asked.  I asked how portable is the
> behavior of the kernel that you described in the cited text.

Sorry.
I observe the lossage only on linux.
OTOH, linux is all I have access to.


-- 
Sam Steingold (http://sds.podval.org/) on Fedora Core release 6 (Zod)
http://honestreporting.com http://thereligionofpeace.com http://truepeace.org
http://iris.org.il http://jihadwatch.org http://ffii.org
I don't want to be young again, I just don't want to get any older.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 22:21           ` Sam Steingold
@ 2007-03-12  4:24             ` Richard Stallman
  2007-03-12  7:00               ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Richard Stallman @ 2007-03-12  4:24 UTC (permalink / raw)
  To: sds; +Cc: eliz, emacs-devel

    I observe the lossage only on linux.
    OTOH, linux is all I have access to.

Surely it is the GNU/Linux system, not just Linux.
Please don't call it "Linux" -- that is unfair to us.

See http://www.gnu.org/gnu/gnu-linux-faq.html  for more explanation.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12  4:24             ` Richard Stallman
@ 2007-03-12  7:00               ` David Kastrup
  2007-03-13  2:43                 ` Richard Stallman
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-12  7:00 UTC (permalink / raw)
  To: rms; +Cc: eliz, sds, emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     I observe the lossage only on linux.
>     OTOH, linux is all I have access to.
>
> Surely it is the GNU/Linux system, not just Linux.
> Please don't call it "Linux" -- that is unfair to us.

Well, he was talking of the behavior of the system call and the
kernel.  It would be unfair to us to blame us for that.  There is no
part of GNU involved here unless you suspect the system call wrapper
in glibc.



-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12  7:00               ` David Kastrup
@ 2007-03-13  2:43                 ` Richard Stallman
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Stallman @ 2007-03-13  2:43 UTC (permalink / raw)
  To: David Kastrup; +Cc: eliz, sds, emacs-devel

    >     I observe the lossage only on linux.
    >     OTOH, linux is all I have access to.
    >
    > Surely it is the GNU/Linux system, not just Linux.
    > Please don't call it "Linux" -- that is unfair to us.

    Well, he was talking of the behavior of the system call and the
    kernel.

Yes, but it's clear that his second sentence refers to the whole system,
not just the kernel.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:06   ` Sam Steingold
  2007-03-11 21:14     ` Eli Zaretskii
@ 2007-03-11 22:27     ` Andreas Schwab
  2007-03-11 22:30     ` Kim F. Storm
  2007-03-12 17:37     ` Andreas Schwab
  3 siblings, 0 replies; 58+ messages in thread
From: Andreas Schwab @ 2007-03-11 22:27 UTC (permalink / raw)
  To: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> wait3 is a system call, which, when invoked in a loop, prevents the
> kernel from doing anything else (in this case, sending SIGCHLD to
> emacs).

We *are* already in SIGCHLD.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:06   ` Sam Steingold
  2007-03-11 21:14     ` Eli Zaretskii
  2007-03-11 22:27     ` Andreas Schwab
@ 2007-03-11 22:30     ` Kim F. Storm
  2007-03-12 17:37     ` Andreas Schwab
  3 siblings, 0 replies; 58+ messages in thread
From: Kim F. Storm @ 2007-03-11 22:30 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> The following message is a courtesy copy of an article
> that has been posted to gmane.emacs.devel as well.
>
>> * Kim F. Storm <fgbez@phn.qx> [2007-03-11 20:39:25 +0100]:
>>
>> Sam Steingold <sds@gnu.org> writes:
>>
>> Absolutely brilliant!  I can confirm that it works.
>
> Thanks!

Actually, I'm not quite sure your patch works.

It seems the unresponsiveness is just delayed some seconds -- at
least sometimes.

I don't know what the system is doing -- flushing the buffer
cache or some such?

GNU Emacs 22.0.95.13 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
Linux kernel: 2.4.20-8


>
>> But can you explain why it works?
>
> wait3 is a system call, which, when invoked in a loop, prevents the
> kernel from doing anything else (in this case, sending SIGCHLD to
> emacs). sleep allows the kernel some time to pass the signal.
>

I tried to verify that there is a loop, but I can't.

Apply the patch below, start emacs -Q, run M-x compile, and look at
sigchld_pid array.  I get this:

(gdb) p sigchld_count
$1 = 2
(gdb) p sigchld_sleep
$2 = 8
(gdb) p sigchld_pid
$3 = {1173651414, 0, 5453, 0, 1173651414, 0, -1, 10, 0 <repeats 12 times>}

So I'm puzzled what's going on.


*** process.c	11 Mar 2007 20:32:23 +0100	1.501
--- process.c	11 Mar 2007 23:24:08 +0100	
***************
*** 6468,6473 ****
--- 6468,6478 ----
     indirectly; if it does, that is a bug  */
  
  #ifdef SIGCHLD
+ 
+ long sigchld_pid[20];
+ int sigchld_count = 0;
+ int sigchld_sleep = 0;
+ 
  SIGTYPE
  sigchld_handler (signo)
       int signo;
***************
*** 6495,6505 ****
  #define WUNTRACED 0
  #endif /* no WUNTRACED */
        /* Keep trying to get a status until we get a definitive result.  */
        do
  	{
!           sleep (1);
  	  errno = 0;
  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
  	}
        while (pid < 0 && errno == EINTR);
  
--- 6500,6522 ----
  #define WUNTRACED 0
  #endif /* no WUNTRACED */
        /* Keep trying to get a status until we get a definitive result.  */
+ 
+       sigchld_count++;
        do
  	{
! 	  unsigned long t1, t2;
!           /* sleep (1); */
! 	  time(&t1);
  	  errno = 0;
  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
+ 	  if (sigchld_sleep < 15)
+ 	    {
+ 	      time(&t2);
+ 	      sigchld_pid[sigchld_sleep++] = t1;
+ 	      sigchld_pid[sigchld_sleep++] = t2 - t1;
+ 	      sigchld_pid[sigchld_sleep++] = pid;
+ 	      sigchld_pid[sigchld_sleep++] = errno;
+ 	    }
  	}
        while (pid < 0 && errno == EINTR);
  


--
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 21:06   ` Sam Steingold
                       ` (2 preceding siblings ...)
  2007-03-11 22:30     ` Kim F. Storm
@ 2007-03-12 17:37     ` Andreas Schwab
  2007-03-12 17:53       ` Sam Steingold
  3 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-12 17:37 UTC (permalink / raw)
  To: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> wait3 is a system call, which, when invoked in a loop, prevents the
> kernel from doing anything else (in this case, sending SIGCHLD to
> emacs). sleep allows the kernel some time to pass the signal.

I don't find this explanation convincing at all.  Since the system call is
made _inside_ the signal handler, the signal is actually blocked here, so
it cannot be delivered anyway.
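
The blocking described here can be checked directly.  The sketch below
assumes a handler installed with sigaction and without SA_NODEFER (how
Emacs installs its handler is not shown in this thread).  The names
probe_handler and sigchld_blocked_in_own_handler are hypothetical.  It
uses raise instead of a real child so that the handler is certain to have
run before the result is read.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if the signal was in the mask, 0 if not,
   -1 if the handler never ran.  */
static volatile sig_atomic_t blocked_in_handler = -1;

static void
probe_handler (int signo)
{
  sigset_t current;

  /* A NULL set queries the current mask without changing it.  */
  if (sigprocmask (SIG_BLOCK, NULL, &current) == 0)
    blocked_in_handler = sigismember (&current, signo);
}

/* Hypothetical check: returns 1 if SIGCHLD is blocked while its own
   handler runs, 0 if it is not, -1 if the handler could not be run.  */
static int
sigchld_blocked_in_own_handler (void)
{
  struct sigaction sa, old_sa;
  sigset_t chld, old_mask;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = probe_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;              /* in particular, no SA_NODEFER */
  if (sigaction (SIGCHLD, &sa, &old_sa) != 0)
    return -1;

  /* Make sure SIGCHLD is deliverable outside the handler.  */
  sigemptyset (&chld);
  sigaddset (&chld, SIGCHLD);
  sigprocmask (SIG_UNBLOCK, &chld, &old_mask);

  blocked_in_handler = -1;
  raise (SIGCHLD);              /* returns after the handler has run */

  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  sigaction (SIGCHLD, &old_sa, NULL);
  return blocked_in_handler;
}
```

A result of 1 means a second SIGCHLD arriving during the handler stays
pending until the handler returns.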

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 17:37     ` Andreas Schwab
@ 2007-03-12 17:53       ` Sam Steingold
  2007-03-12 18:45         ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Sam Steingold @ 2007-03-12 17:53 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

Andreas Schwab wrote:
> Sam Steingold <sds@gnu.org> writes:
> 
>> wait3 is a system call, which, when invoked in a loop, prevents the
>> kernel from doing anything else (in this case, sending SIGCHLD to
>> emacs). sleep allows the kernel some time to pass the signal.
> 
> I don't find this explanation convincing at all.  Since the system call is
> made _inside_ the signal handler, the signal is actually blocked here, so
> it cannot be delivered anyway.

yes, you are right.
what I should have said was that by making a system call in a busyloop, 
emacs prevents the kernel from doing what it needs to do to the
child so that wait3 will succeed.

Sam.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 17:53       ` Sam Steingold
@ 2007-03-12 18:45         ` Andreas Schwab
  2007-03-12 18:57           ` Sam Steingold
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-12 18:45 UTC (permalink / raw)
  To: Sam Steingold; +Cc: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> Andreas Schwab wrote:
>> Sam Steingold <sds@gnu.org> writes:
>>
>>> wait3 is a system call, which, when invoked in a loop, prevents the
>>> kernel from doing anything else (in this case, sending SIGCHLD to
>>> emacs). sleep allows the kernel some time to pass the signal.
>>
>> I don't find this explanation convincing at all.  Since the system call is
>> made _inside_ the signal handler, the signal is actually blocked here, so
>> it cannot be delivered anyway.
>
> yes, you are right.
> what I should have said was that by making a system call in a busyloop,
> emacs prevents the kernel from doing what it needs to do to the child
> so that wait3 will succeed.

What does the kernel have to do?  The EINTR error will only happen when
the system call was interrupted _and_ a signal handler was called.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 18:45         ` Andreas Schwab
@ 2007-03-12 18:57           ` Sam Steingold
  2007-03-12 19:28             ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Sam Steingold @ 2007-03-12 18:57 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

Andreas Schwab wrote:
> Sam Steingold <sds@gnu.org> writes:
> 
>> Andreas Schwab wrote:
>>> Sam Steingold <sds@gnu.org> writes:
>>>
>>>> wait3 is a system call, which, when invoked in a loop, prevents the
>>>> kernel from doing anything else (in this case, sending SIGCHLD to
>>>> emacs). sleep allows the kernel some time to pass the signal.
>>> I don't find this explanation convincing at all.  Since the system call is
>>> made _inside_ the signal handler, the signal is actually blocked here, so
>>> it cannot be delivered anyway.
>> yes, you are right.
>> what I should have said was that by making a system call in a busyloop,
>> emacs prevents the kernel from doing what it needs to do to the child
>> so that wait3 will succeed.
> 
> What does the kernel have to do?  The EINTR error will only happen when
> the system call was interrupted _and_ a signal handler was called.

I don't know the details, but the kernel obviously has to do SOMETHING 
when the child process terminates: it has to notice the state change so 
that wait3 in emacs will return "yes, the child is dead" instead of 
"nothing for you yet". The busyloop prevents the kernel from doing 
anything for some time.

Now, it might be better to remove the WNOHANG option instead (except 
that signal handlers are not supposed to hang), or use usleep(10000) 
instead of sleep(1), but the busyloop is what has to be fixed.

Please also note that I am not in any way an expert in kernel matters.

Sam.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 18:57           ` Sam Steingold
@ 2007-03-12 19:28             ` Andreas Schwab
  2007-03-12 19:34               ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-12 19:28 UTC (permalink / raw)
  To: Sam Steingold; +Cc: emacs-devel

Sam Steingold <sds@gnu.org> writes:

> I don't know the details, but the kernel obviously has to do SOMETHING
> when the child process terminates: it has to notice the state change so
> that wait3 in emacs will return "yes, the child is dead" instead of
> "nothing for you yet". The busyloop prevents the kernel from doing
> anything for some time.

But the busyloop does not happen when there is nothing to do, it only
happens when wait3 is interrupted.  A zero return will cause the handler
to return immediately.
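
A small check of the zero-return case, assuming a POSIX system.  The
function name wnohang_result_with_live_child is hypothetical.  The child
is held on a pipe so that it is certain to be alive when the parent polls.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: returns what waitpid (..., WNOHANG) reports while
   the child is still running, or -2 if the setup failed.  */
static pid_t
wnohang_result_with_live_child (void)
{
  int gate[2];
  pid_t child, polled;
  char byte;

  if (pipe (gate) != 0)
    return -2;

  child = fork ();
  if (child < 0)
    {
      close (gate[0]);
      close (gate[1]);
      return -2;
    }
  if (child == 0)
    {
      /* Child: block until the parent closes the write end.  */
      close (gate[1]);
      if (read (gate[0], &byte, 1) < 0)
        _exit (1);
      _exit (0);
    }

  close (gate[0]);

  /* The child cannot have exited yet, so nothing is reported.  */
  polled = waitpid (child, NULL, WNOHANG);

  /* Release the child and reap it with a blocking wait.  */
  close (gate[1]);
  while (waitpid (child, NULL, 0) < 0 && errno == EINTR)
    ;

  return polled;
}
```

With a return of 0 the condition pid < 0 && errno == EINTR is false, so
the do/while in the handler runs its body exactly once.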

> Now, it might be better to remove the WNOHANG option instead (except that
> signal handlers are not supposed to hang),

If you don't use WNOHANG you open up a race where several processes may
have their status changed, but only one signal is sent (non-realtime
signals are not queued).

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 19:28             ` Andreas Schwab
@ 2007-03-12 19:34               ` David Kastrup
  2007-03-12 21:36                 ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-12 19:34 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> Sam Steingold <sds@gnu.org> writes:
>
>> I don't know the details, but the kernel obviously has to do SOMETHING
>> when the child process terminates: it has to notice the state change so
>> that wait3 in emacs will return "yes, the child is dead" instead of
>> "nothing for you yet". The busyloop prevents the kernel from doing
>> anything for some time.
>
> But the busyloop does not happen when there is nothing to do, it only
> happens when wait3 is interrupted.  A zero return will cause the handler
> to return immediately.
>
>> Now, it might be better to remove the WNOHANG option instead (except that
>> signal handlers are not supposed to hang),
>
> If you don't use WNOHANG you open up a race where several processes may
> have their status changed, but only one signal is sent (non-realtime
> signals are not queued).

How does WNOHANG protect against that?  It does not prevent
scheduling, and it certainly does not prevent parallel execution on
multi-processor machines.

I should think that we need to prepare against this anyway.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 19:34               ` David Kastrup
@ 2007-03-12 21:36                 ` Andreas Schwab
  2007-03-13  7:29                   ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-12 21:36 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel

David Kastrup <dak@gnu.org> writes:

> Andreas Schwab <schwab@suse.de> writes:
>
>> If you don't use WNOHANG you open up a race where several processes may
>> have their status changed, but only one signal is sent (non-realtime
>> signals are not queued).
>
> How does WNOHANG protect against that?

It makes it possible to loop around without blocking.  On systems without
WNOHANG there need to be other mechanisms to guarantee one signal per
child (they probably redeliver the signal as long as such children exist).
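
A sketch of such a non-blocking loop, for illustration and not taken from
process.c.  The names reap_all and reap_three_after_blocked_sigchld are
hypothetical.  SIGCHLD is kept blocked while three children exit, so the
handler may run fewer than three times, yet the WNOHANG loop still
collects all of them.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_runs = 0;
static volatile sig_atomic_t children_reaped = 0;

/* Reap every child that is ready, never blocking.  */
static void
reap_all (int signo)
{
  int saved_errno = errno;
  pid_t pid;

  (void) signo;
  handler_runs++;
  for (;;)
    {
      pid = waitpid (-1, NULL, WNOHANG);
      if (pid > 0)
        children_reaped++;
      else if (pid < 0 && errno == EINTR)
        continue;
      else
        break;          /* 0: none ready; -1: e.g. ECHILD, none left */
    }
  errno = saved_errno;
}

/* Hypothetical demo: returns the number of children reaped, -1 on error.  */
static int
reap_three_after_blocked_sigchld (void)
{
  struct sigaction sa, old_sa;
  sigset_t chld, old_mask;
  siginfo_t info;
  pid_t kids[3];
  int i, n = 0;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = reap_all;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGCHLD, &sa, &old_sa) != 0)
    return -1;

  sigemptyset (&chld);
  sigaddset (&chld, SIGCHLD);
  sigprocmask (SIG_BLOCK, &chld, &old_mask);
  handler_runs = 0;
  children_reaped = 0;

  for (i = 0; i < 3; i++)
    {
      kids[i] = fork ();
      if (kids[i] == 0)
        _exit (0);
      if (kids[i] > 0)
        n++;
    }

  /* Wait until each child has exited, without reaping it (WNOWAIT).  */
  for (i = 0; i < 3; i++)
    if (kids[i] > 0)
      while (waitid (P_PID, kids[i], &info, WEXITED | WNOWAIT) < 0
             && errno == EINTR)
        ;

  /* Unblocking delivers the pending SIGCHLD before this call returns.  */
  sigprocmask (SIG_UNBLOCK, &chld, NULL);

  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  sigaction (SIGCHLD, &old_sa, NULL);
  return n == 3 ? (int) children_reaped : -1;
}
```

After the call, handler_runs can be compared with the return value to see
how many signals were actually delivered for the three exits.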

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-12 21:36                 ` Andreas Schwab
@ 2007-03-13  7:29                   ` David Kastrup
  2007-03-13  9:29                     ` Andreas Schwab
  2007-03-14  3:24                     ` Richard Stallman
  0 siblings, 2 replies; 58+ messages in thread
From: David Kastrup @ 2007-03-13  7:29 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> Andreas Schwab <schwab@suse.de> writes:
>>
>>> If you don't use WNOHANG you open up a race where several processes may
>>> have their status changed, but only one signal is sent (non-realtime
>>> signals are not queued).
>>
>> How does WNOHANG protect against that?
>
> It makes it possible to loop around without blocking.

Too bad you snipped the details of my question.  It does not make it
possible to loop around without preemption.  In fact, it _forces_
preemption at some point of time (this is the only way to exit the
infinite loop on a single-processor system), and when preemption
happens, of course several signals may be delivered together: after
all, a preempted process is not so likely to get scheduled right
again.

> On systems without WNOHANG there need to be other mechanisms to
> guarantee one signal per child (they probably redeliver the signal as
> long as such children exist).

Again: I don't see that this guarantees one signal per child, and you
did not explain how it could.

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-13  7:29                   ` David Kastrup
@ 2007-03-13  9:29                     ` Andreas Schwab
  2007-03-13 22:19                       ` David Kastrup
  2007-03-14  3:24                     ` Richard Stallman
  1 sibling, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-13  9:29 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel

David Kastrup <dak@gnu.org> writes:

> Andreas Schwab <schwab@suse.de> writes:
>
>> David Kastrup <dak@gnu.org> writes:
>>
>>> Andreas Schwab <schwab@suse.de> writes:
>>>
>>>> If you don't use WNOHANG you open up a race where several processes may
>>>> have their status changed, but only one signal is sent (non-realtime
>>>> signals are not queued).
>>>
>>> How does WNOHANG protect against that?
>>
>> It makes it possible to loop around without blocking.
>
> Too bad you snipped the details of my question.  It does not make it
> possible to loop around without preemption.

What do you mean by "without preemption"?

> In fact, it _forces_ preemption at some point of time (this is the only
> way to exit the infinite loop on a single-processor system),

Which infinite loop?

> and when preemption happens, of course several signals may be delivered
> together: after all, a preempted process is not so likely to get
> scheduled right again.

I don't understand how preemption comes into play here.

>> On systems without WNOHANG there need to be other mechanisms to
>> guarantee one signal per child (they probably redeliver the signal as
>> long as such children exist).
>
> Again: I don't see that this guaranteed one signal per child, and you
> did not explain how it could.

From POSIX:

 If _POSIX_REALTIME_SIGNALS is defined, and the implementation queues the
 SIGCHLD signal, then if wait( ) or waitpid ( ) returns because the status
 of a child process is available, any pending SIGCHLD signal associated
 with the process ID of the child process shall be discarded.  Any other
 pending SIGCHLD signals shall remain pending.  Otherwise, if SIGCHLD is
 blocked, if wait( ) or waitpid ( ) return because the status of a child
 process is available, any pending SIGCHLD signal shall be cleared unless
 the status of another child process is available.
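
A minimal sketch of the situation this passage covers; it is not the
Emacs code.  Several children exit while SIGCHLD is blocked, so a single
pending SIGCHLD has to stand for all of them, and the handler loops on
waitpid with WNOHANG until no more statuses are available.  The names
reap_all, reap_coalesced and sigchld_handler_calls are invented for the
illustration, and waitid with WNOWAIT is used only to make the timing
deterministic.  The sketch assumes the pending SIGCHLD stays in place
while the children are still waitable.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counters for this sketch; nothing here is from process.c.  */
static volatile sig_atomic_t handler_calls;
static volatile sig_atomic_t children_reaped;

/* Reap every child whose status is available, without blocking.  */
static void
reap_all (int signo)
{
  (void) signo;
  handler_calls++;
  /* waitpid returns 0 when no status is ready, -1 when no child is left.  */
  while (waitpid (-1, NULL, WNOHANG) > 0)
    children_reaped++;
}

int
sigchld_handler_calls (void)
{
  return handler_calls;
}

/* Let N children (1..16) exit while SIGCHLD is blocked, then unblock it.
   Returns the number of children reaped by the handler, or -1 on error.  */
int
reap_coalesced (int n)
{
  struct sigaction sa, old_sa;
  sigset_t chld, old_mask;
  siginfo_t info;
  pid_t pids[16];
  int i;

  if (n < 1 || n > 16)
    return -1;
  handler_calls = 0;
  children_reaped = 0;

  sa.sa_handler = reap_all;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigemptyset (&chld);
  sigaddset (&chld, SIGCHLD);
  if (sigprocmask (SIG_BLOCK, &chld, &old_mask) < 0
      || sigaction (SIGCHLD, &sa, &old_sa) < 0)
    return -1;

  for (i = 0; i < n; i++)
    {
      pids[i] = fork ();
      if (pids[i] < 0)
        return -1;
      if (pids[i] == 0)
        _exit (0);
    }

  /* Block until each child has exited, but leave it waitable.  */
  for (i = 0; i < n; i++)
    if (waitid (P_PID, pids[i], &info, WEXITED | WNOWAIT) < 0)
      return -1;

  /* Unblocking delivers the pending SIGCHLD before sigprocmask returns.  */
  sigprocmask (SIG_UNBLOCK, &chld, NULL);

  sigaction (SIGCHLD, &old_sa, NULL);
  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  return children_reaped;
}
```

Calling reap_coalesced (3) should reap all three children even if the
handler is entered fewer than three times; how often it is entered is up
to the implementation, which is why the loop is needed.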

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-13  9:29                     ` Andreas Schwab
@ 2007-03-13 22:19                       ` David Kastrup
  2007-03-13 22:28                         ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-13 22:19 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> Andreas Schwab <schwab@suse.de> writes:
>>
>>> David Kastrup <dak@gnu.org> writes:
>>>
>>>> Andreas Schwab <schwab@suse.de> writes:
>>>>
>>>>> If you don't use WNOHANG you open up a race where several processes may
>>>>> have their status changed, but only one signal is sent (non-realtime
>>>>> signals are not queued).
>>>>
>>>> How does WNOHANG protect against that?
>>>
>>> It makes it possible to loop around without blocking.
>>
>> Too bad you snipped the details of my question.  It does not make it
>> possible to loop around without preemption.
>
> What do you mean by "without preemption"?

"Preemption" means the kernel forcibly rescheduling the task.  Since
there is no other way in which the process will either exit the loop
(since that requires some other process to change its status) or yield
the CPU, it will eventually happen.

>> In fact, it _forces_ preemption at some point of time (this is the
>> only way to exit the infinite loop on a single-processor system),
>
> Which infinite loop?

The one waiting for the demise of a process that gets no processing
time for dying.

>> and when preemption happens, of course several signals may be
>> delivered together: after all, a preempted process is not so likely
>> to get scheduled right again.
>
> I don't understand how preemption comes into play here.

It is the only thing that can prevent a deadlock here.

>>> On systems without WNOHANG there need to be other mechanisms to
>>> guarantee one signal per child (they probably redeliver the signal
>>> as long as such children exist).
>>
>> Again: I don't see that this guaranteed one signal per child, and
>> you did not explain how it could.
>
> From POSIX:
>
>  If _POSIX_REALTIME_SIGNALS is defined, and the implementation
>  queues the SIGCHLD signal, then if wait( ) or waitpid ( ) returns
>  because the status of a child process is available, any pending
>  SIGCHLD signal associated with the process ID of the child process
>  shall be discarded.  Any other pending SIGCHLD signals shall remain
>  pending.  Otherwise, if SIGCHLD is blocked, if wait( ) or waitpid (
>  ) return because the status of a child process is available, any
>  pending SIGCHLD signal shall be cleared unless the status of
>  another child process is available.

I can't see how this guarantees one signal per child.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: busyloop in sigchld_handler
  2007-03-13 22:19                       ` David Kastrup
@ 2007-03-13 22:28                         ` Andreas Schwab
  2007-03-13 22:54                           ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-13 22:28 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel

David Kastrup <dak@gnu.org> writes:

> The one waiting for the demise of a process that gets no processing
> time for dying.

If there are no children dying then the loop is exited immediately.

> It is the only thing that can prevent a deadlock here.

Which deadlock?

> I can't see how this guarantees one signal per child.

It's explicitly explained in the quoted text.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-13 22:28                         ` Andreas Schwab
@ 2007-03-13 22:54                           ` David Kastrup
  2007-03-13 23:17                             ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-13 22:54 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> The one waiting for the demise of a process that gets no processing
>> time for dying.
>
> If there are no children dying then the loop is exited immediately.

Dying is not the same as dead.  If I send a process a fatal signal, it
is dying.  But it is not dead before it has completed processing the
signal.

>> It is the only thing that can prevent a deadlock here.
>
> Which deadlock?

The CPU is claimed by the process with the loop, so no other process
may actually progress to a state which can be "wait"ed for.  The
deadlock is on the resource "CPU", and only preemption can break it.

>> I can't see how this guarantees one signal per child.
>
> It's explicitly explained in the quoted text.

I disagree.  "explicitly" would mean that some wording remotely
similar to your "guarantee" claims could be found.  So at best, it is
implicitly contained somewhere for a person smarter than myself.  As
that seemingly includes you, it would have been nice if you had
bothered to explain.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: busyloop in sigchld_handler
  2007-03-13 22:54                           ` David Kastrup
@ 2007-03-13 23:17                             ` Andreas Schwab
  2007-03-14  7:06                               ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-13 23:17 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel

David Kastrup <dak@gnu.org> writes:

> The CPU is claimed by the process with the loop, so no other process
> may actually progress to a state which can be "wait"ed for.

If there is no child to be waited for then there is no loop.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-13 23:17                             ` Andreas Schwab
@ 2007-03-14  7:06                               ` David Kastrup
  2007-03-14  9:24                                 ` Kim F. Storm
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-14  7:06 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> The CPU is claimed by the process with the loop, so no other process
>> may actually progress to a state which can be "wait"ed for.
>
> If there is no child to be waited for then there is no loop.

In order to make sophistry solve the problem, you need to convince
the kernel.

-- 
David Kastrup

* Re: busyloop in sigchld_handler
  2007-03-14  7:06                               ` David Kastrup
@ 2007-03-14  9:24                                 ` Kim F. Storm
  2007-03-14 10:00                                   ` David Kastrup
  2007-03-14 10:01                                   ` Andreas Schwab
  0 siblings, 2 replies; 58+ messages in thread
From: Kim F. Storm @ 2007-03-14  9:24 UTC (permalink / raw)
  To: David Kastrup; +Cc: Andreas Schwab, Sam Steingold, emacs-devel

David Kastrup <dak@gnu.org> writes:

> Andreas Schwab <schwab@suse.de> writes:
>
>> David Kastrup <dak@gnu.org> writes:
>>
>>> The CPU is claimed by the process with the loop, so no other process
>>> may actually progress to a state which can be "wait"ed for.
>>
>> If there is no child to be waited for then there is no loop.
>
> In order to make sophistry solve the problem, you need to convince
> the kernel.

This happens in the sigchld handler - which is only invoked when there
is a dead child (zombie) to "wait3" for - so we should not have to wait
for the dead child to "really die".

In addition, we call wait3 with WNOHANG, so it is not supposed to block
if there are no dead children.

That's why Andreas and I can't really see where the busy loop can
happen, but since the loop _is_ observed, it is important to
understand why it happens, not just install a "semi-random" patch
which fixes the problem, but nobody can explain why.

Perhaps we need to ask a Linux kernel hacker?

Here's the code in condensed form:

  while (1)
    {
      while (1)
	{
	  errno = 0;
	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
	  if (! (pid < 0 && errno == EINTR))
	    break;
	  /* Avoid a busyloop: wait3 is a system call, so we do not want
	     to prevent the kernel from actually sending SIGCHLD to emacs
	     by asking for it all the time.  */
	  sleep (1);
	}

      if (pid <= 0)
      	return;
      /* handle death of child `pid' */
    }

So the problem is the interpretation of an EINTR error from
wait3(..., WNOHANG, ...).

The Linux man page says:

       EINTR  if WNOHANG was not set and an unblocked signal or a SIGCHLD  was
              caught.

So WNOHANG => EINTR is not explained, but the usual meaning is that
the wait3 was interrupted by some other signal - and if there is a
loop, that signal is repeated somehow ...

However, with the test code I inserted into the sigchld handler, and
then executing M-x compile once after starting emacs -Q, it clearly
shows that:

a) the sigchld handler is entered exactly once.

b) the first wait3 returns immediately with the pid
   of the compile process,

c) the next wait3 returns immediately with 0, since
   there are no more processes to wait for.

So where's the busy loop?

The above code is the version for Linux - other variations of the code
are used for other platforms, but the OP said this was observed on a
GNU/Linux system.
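
For anyone who wants to repeat the experiment outside Emacs, here is a
rough stand-alone sketch of that kind of test; it is not the test code I
actually inserted.  One child exits while a second one stays alive, and
the handler records what each waitpid call returns.  All names
(wait_trace, trace_handler, run_wait_trace) are made up for the
illustration, waitpid stands in for wait3, and the pipe plus
waitid/WNOWAIT are only there to make the timing reproducible.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CALLS 4

/* Hypothetical record of what the handler saw; not Emacs code.  */
struct wait_trace
{
  int handler_entries;        /* times the handler was entered */
  int wait_calls;             /* waitpid calls made inside it */
  pid_t exited_child;         /* the child that really exited */
  pid_t results[MAX_CALLS];   /* return value of each waitpid call */
};

static volatile sig_atomic_t entries;
static volatile sig_atomic_t calls;
static volatile pid_t results[MAX_CALLS];

static void
trace_handler (int signo)
{
  int saved_errno = errno;
  pid_t pid;

  (void) signo;
  entries++;
  while (calls < MAX_CALLS)
    {
      pid = waitpid (-1, NULL, WNOHANG | WUNTRACED);
      results[calls] = pid;
      calls++;
      if (pid <= 0)
        break;                /* nothing more to collect right now */
    }
  errno = saved_errno;
}

struct wait_trace
run_wait_trace (void)
{
  struct wait_trace t = { 0, 0, -1, { -1, -1, -1, -1 } };
  struct sigaction sa, old_sa;
  sigset_t chld, old_mask;
  siginfo_t info;
  pid_t sleeper;
  int hold[2], i;
  char c;

  entries = 0;
  calls = 0;
  sa.sa_handler = trace_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigemptyset (&chld);
  sigaddset (&chld, SIGCHLD);
  if (pipe (hold) < 0)
    return t;
  sigprocmask (SIG_BLOCK, &chld, &old_mask);
  sigaction (SIGCHLD, &sa, &old_sa);

  /* One child stays alive (blocked in read) so that the second
     waitpid call returns 0 rather than failing with ECHILD.  */
  sleeper = fork ();
  if (sleeper == 0)
    {
      close (hold[1]);
      _exit (read (hold[0], &c, 1) < 0);
    }
  t.exited_child = fork ();
  if (t.exited_child == 0)
    _exit (0);

  if (sleeper > 0 && t.exited_child > 0
      && waitid (P_PID, t.exited_child, &info, WEXITED | WNOWAIT) == 0)
    {
      sigprocmask (SIG_UNBLOCK, &chld, NULL);   /* handler runs here */
      sigprocmask (SIG_BLOCK, &chld, NULL);
      t.handler_entries = entries;
      t.wait_calls = calls;
      for (i = 0; i < calls && i < MAX_CALLS; i++)
        t.results[i] = results[i];
    }

  /* Clean up: EOF on the pipe lets the sleeping child exit.  */
  sigaction (SIGCHLD, &old_sa, NULL);
  close (hold[0]);
  close (hold[1]);
  if (sleeper > 0)
    waitpid (sleeper, NULL, 0);
  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  return t;
}
```

If the trace matches a) to c), handler_entries is 1, results[0] is the
pid of the exited child and results[1] is 0.  A value of -1 in the
second slot would instead point at an error return such as EINTR or
ECHILD.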

Thinking more about it, I wonder why we use the WUNTRACED flag on wait3.

       WUNTRACED
              which means to also return for children which are  stopped,  and
              whose status has not been reported.

Why do we care about stopped processes?

--
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: busyloop in sigchld_handler
  2007-03-14  9:24                                 ` Kim F. Storm
@ 2007-03-14 10:00                                   ` David Kastrup
  2007-03-14 10:22                                     ` Andreas Schwab
  2007-03-14 10:01                                   ` Andreas Schwab
  1 sibling, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-14 10:00 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Andreas Schwab, Sam Steingold, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> David Kastrup <dak@gnu.org> writes:
>
>> Andreas Schwab <schwab@suse.de> writes:
>>
>>> David Kastrup <dak@gnu.org> writes:
>>>
>>>> The CPU is claimed by the process with the loop, so no other process
>>>> may actually progress to a state which can be "wait"ed for.
>>>
>>> If there is no child to be waited for then there is no loop.
>>
>> In order to make sophistry solve the problem, you need to convince
>> the kernel.
>
> This happens in the sigchld handler - which is only invoked when there
> is a dead child (zombie) to "wait3" for - so we should not have to wait
> for the dead child to "really die".
>
> In addition, we call wait3 with WNOHANG, so it is not supposed to block
> if there are no dead children.
>
> That's why Andreas and I can't really see where the busy loop can
> happen, but since the loop _is_ observed, it is important to
> understand why it happens, not just install a "semi-random" patch
> which fixes the problem, but nobody can explain why.
>
> Perhaps we need to ask a Linux kernel hacker?
>
> Here's the code in condensed form:
>
>   while (1)
>     {
>       while (1)
> 	{
> 	  errno = 0;
> 	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
> 	  if (! (pid < 0 && errno == EINTR))
> 	    break;
> 	  /* Avoid a busyloop: wait3 is a system call, so we do not want
> 	     to prevent the kernel from actually sending SIGCHLD to emacs
> 	     by asking for it all the time.  */
> 	  sleep (1);
> 	}
>
>       if (pid <= 0)
>       	return;
>       /* handle death of child `pid' */
>     }
>
>
> So the problem is the interpretation of an EINTR error from
> wait3(..., WNOHANG, ...).
>
> The Linux man page says:
>
>        EINTR  if WNOHANG was not set and an unblocked signal or a SIGCHLD  was
>               caught.
>
> So WNOHANG => EINTR is not explained, but the usual meaning is that
> the wait3 was interrupted by some other signal - and if there is a
> loop, that signal is repeated somehow ...
>
> However, with the test code I inserted into the sigchld handler, and
> then executing M-x compile once after starting emacs -Q, it clearly
> shows that:
>
> a) the sigchld handler is entered exactly once.
>
> b) the first wait3 returns immediately with the pid
>    of the compile process,
>
> c) the next wait3 returns immediately with 0, since
>    there are no more processes to wait for.
>
> So where's the busy loop?
>
> The above code is the version for Linux - other variations of the code
> are used for other platforms, but the OP said this was observed on a
> GNU/Linux system.

The signal manpage says:

	When a signal  occurs, and func points to  a function, it is
	implementation-defined whether the equivalent of a:

		signal(sig, SIG_DFL);

	is   executed   or    the   implementation   prevents   some
	implementation-defined  set of  signals (at  least including
	sig) from  occurring until  the current signal  handling has
	completed.

So even though SIGCHLD may be interrupted by another signal, this does
not mean that the other signal handler gets a chance to run.

Maybe we should not loop, but instead rather return in the signal
handler, possibly reraising the signal?  That may give the system the
leeway to deal with whatever caused EINTR in the first place.

-- 
David Kastrup

* Re: busyloop in sigchld_handler
  2007-03-14 10:00                                   ` David Kastrup
@ 2007-03-14 10:22                                     ` Andreas Schwab
  2007-03-14 10:52                                       ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 10:22 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel, Kim F. Storm

David Kastrup <dak@gnu.org> writes:

> The signal manpage says:
>
> 	When a signal  occurs, and func points to  a function, it is
> 	implementation-defined whether the equivalent of a:
>
> 		signal(sig, SIG_DFL);
>
> 	is   executed   or    the   implementation   prevents   some
> 	implementation-defined  set of  signals (at  least including
> 	sig) from  occurring until  the current signal  handling has
> 	completed.
>
> So even though SIGCHLD may be interrupted by another signal, this does
> not mean that the other signal handler gets a chance to run.

This is a complete red herring.  The Linux kernel (and all XSI compliant
systems) always blocks the invoking signal when a handler is invoked.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14 10:22                                     ` Andreas Schwab
@ 2007-03-14 10:52                                       ` David Kastrup
  2007-03-14 11:01                                         ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-14 10:52 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel, Kim F. Storm

Andreas Schwab <schwab@suse.de> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> The signal manpage says:
>>
>> 	When a signal  occurs, and func points to  a function, it is
>> 	implementation-defined whether the equivalent of a:
>>
>> 		signal(sig, SIG_DFL);
>>
>> 	is   executed   or    the   implementation   prevents   some
>> 	implementation-defined  set of  signals (at  least including
>> 	sig) from  occurring until  the current signal  handling has
>> 	completed.
>>
>> So even though SIGCHLD may be interrupted by another signal, this does
>> not mean that the other signal handler gets a chance to run.
>
> This is a complete red herring.  The Linux kernel (and all XSI compliant
> systems) always blocks the invoking signal when a handler is invoked.

Who is talking about the invoking signal?  Other signals can occur
(e.g., SIGALRM).

-- 
David Kastrup

* Re: busyloop in sigchld_handler
  2007-03-14 10:52                                       ` David Kastrup
@ 2007-03-14 11:01                                         ` Andreas Schwab
  2007-03-14 11:12                                           ` David Kastrup
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 11:01 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel, Kim F. Storm

David Kastrup <dak@gnu.org> writes:

> Who is talking about the invoking signal?  Other signals can occur
> (e.g., SIGALRM).

And your point is?

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14 11:01                                         ` Andreas Schwab
@ 2007-03-14 11:12                                           ` David Kastrup
  2007-03-14 12:29                                             ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: David Kastrup @ 2007-03-14 11:12 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel, Kim F. Storm

Andreas Schwab <schwab@suse.de> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> Who is talking about the invoking signal?  Other signals can occur
>> (e.g., SIGALRM).
>
> And your point is?

That the signal handler may get interrupted, as is obvious from the
problems we are seeing and which you try to discuss away.  I guess we
are pretty much on par with regard to utterly being unable to see the
other's point.

-- 
David Kastrup

* Re: busyloop in sigchld_handler
  2007-03-14 11:12                                           ` David Kastrup
@ 2007-03-14 12:29                                             ` Andreas Schwab
  0 siblings, 0 replies; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 12:29 UTC (permalink / raw)
  To: David Kastrup; +Cc: Sam Steingold, emacs-devel, Kim F. Storm

David Kastrup <dak@gnu.org> writes:

> Andreas Schwab <schwab@suse.de> writes:
>
>> David Kastrup <dak@gnu.org> writes:
>>
>>> Who is talking about the invoking signal?  Other signals can occur
>>> (e.g., SIGALRM).
>>
>> And your point is?
>
> That the signal handler may get interrupted,

There is nothing problematic with a signal handler being interrupted.

> as is obvious from the problems we are seeing and which you try to
> discuss away.

I have never discussed anything away.  I'm only correcting your false
statements.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14  9:24                                 ` Kim F. Storm
  2007-03-14 10:00                                   ` David Kastrup
@ 2007-03-14 10:01                                   ` Andreas Schwab
  2007-03-14 13:15                                     ` Kim F. Storm
  1 sibling, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 10:01 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Sam Steingold, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> Why do we care about stopped processes?

It is one of the possible states of a process, see process-status.  Any
job control shell needs to be able to handle that.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14 10:01                                   ` Andreas Schwab
@ 2007-03-14 13:15                                     ` Kim F. Storm
  2007-03-14 13:41                                       ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-14 13:15 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> storm@cua.dk (Kim F. Storm) writes:
>
>> Why do we care about stopped processes?
>
> It is one of the possible states of a process, see process-status.  Any
> job control shell needs to be able to handle that.

Yes, but how does a stopped process cause a sigchld signal to be sent
to the parent?

Because it is 'kill -9'ed while stopped, perhaps?

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: busyloop in sigchld_handler
  2007-03-14 13:15                                     ` Kim F. Storm
@ 2007-03-14 13:41                                       ` Andreas Schwab
  2007-03-14 14:10                                         ` Kim F. Storm
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 13:41 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Sam Steingold, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> Andreas Schwab <schwab@suse.de> writes:
>
>> storm@cua.dk (Kim F. Storm) writes:
>>
>>> Why do we care about stopped processes?
>>
>> It is one of the possible states of a process, see process-status.  Any
>> job control shell needs to be able to handle that.
>
> Yes, but how does a stopped process cause a sigchld signal to be sent
> to the parent?

By not using SA_NOCLDSTOP when you register the handler.
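
A sketch of what that looks like in practice; it is not the Emacs code.
The handler is registered without SA_NOCLDSTOP, the child stops itself,
and the parent both receives a SIGCHLD carrying CLD_STOPPED and sees the
stop through waitpid with WUNTRACED.  Killing the stopped child
afterwards produces the ordinary termination report.  The names
count_stops, observe_stopped_child and the result bits are invented for
the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Result bits for this sketch (hypothetical names).  */
enum
{
  STOP_SIGNALLED = 1,   /* handler saw SIGCHLD with CLD_STOPPED */
  STOP_REPORTED = 2,    /* waitpid with WUNTRACED reported the stop */
  KILL_REPORTED = 4     /* waitpid reported death by SIGKILL */
};

static volatile sig_atomic_t stop_notifications;

static void
count_stops (int signo, siginfo_t *info, void *context)
{
  (void) signo;
  (void) context;
  if (info->si_code == CLD_STOPPED)
    stop_notifications++;
}

/* Stop a child, observe the stop, then kill it.
   Returns a mask of the bits above, or -1 on error.  */
int
observe_stopped_child (void)
{
  struct sigaction sa, old_sa;
  sigset_t chld, old_mask, wait_mask;
  int status, result = 0;
  pid_t pid, got;

  stop_notifications = 0;
  sa.sa_sigaction = count_stops;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO | SA_RESTART;   /* deliberately no SA_NOCLDSTOP */
  sigemptyset (&chld);
  sigaddset (&chld, SIGCHLD);
  if (sigprocmask (SIG_UNBLOCK, &chld, &old_mask) < 0
      || sigaction (SIGCHLD, &sa, &old_sa) < 0)
    return -1;

  pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      raise (SIGSTOP);          /* the child stops itself */
      _exit (0);
    }

  /* WUNTRACED makes waitpid return for a stopped child.  */
  do
    got = waitpid (pid, &status, WUNTRACED);
  while (got < 0 && errno == EINTR);
  if (got == pid && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP)
    {
      result |= STOP_REPORTED;
      /* The handler has normally run by now; if not, wait for it.  */
      sigprocmask (SIG_BLOCK, &chld, &wait_mask);
      while (stop_notifications == 0)
        sigsuspend (&wait_mask);
      sigprocmask (SIG_UNBLOCK, &chld, NULL);
      result |= STOP_SIGNALLED;
    }

  /* A stopped child can still be killed.  */
  kill (pid, SIGKILL);
  do
    got = waitpid (pid, &status, 0);
  while (got < 0 && errno == EINTR);
  if (got == pid && WIFSIGNALED (status) && WTERMSIG (status) == SIGKILL)
    result |= KILL_REPORTED;

  sigaction (SIGCHLD, &old_sa, NULL);
  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  return result;
}
```

With SA_NOCLDSTOP added to sa_flags the stop would not be signalled at
all, so the sigsuspend loop above must only be used when that flag is
absent.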

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14 13:41                                       ` Andreas Schwab
@ 2007-03-14 14:10                                         ` Kim F. Storm
  2007-03-14 14:12                                           ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-14 14:10 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> storm@cua.dk (Kim F. Storm) writes:
>
>> Andreas Schwab <schwab@suse.de> writes:
>>
>>> storm@cua.dk (Kim F. Storm) writes:
>>>
>>>> Why do we care about stopped processes?
>>>
>>> It is one of the possible states of a process, see process-status.  Any
>>> job control shell needs to be able to handle that.
>>
>> Yes, but how does a stopped process cause a sigchld signal to be sent
>> to the parent?
>
> By not using SA_NOCLDSTOP when you register the handler.
>

Ah yes.

But the WUNTRACED flag doesn't mean we don't get info about
stopped processes, it means we also get info about stopped
ptrace'd processes:

			switch (p->state) {
			case TASK_STOPPED:
				if (!p->exit_code)
					continue;
				if (!(options & WUNTRACED) && !(p->ptrace & PT_PTRACED))
					continue;

But it's not relevant here, I guess.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: busyloop in sigchld_handler
  2007-03-14 14:10                                         ` Kim F. Storm
@ 2007-03-14 14:12                                           ` Andreas Schwab
  2007-03-14 15:02                                             ` Kim F. Storm
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 14:12 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Sam Steingold, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> But the WUNTRACED flag doesn't mean we don't get info about
> stopped processes,

WUNTRACED
    The status of any child processes specified by pid that are stopped,
    and whose status has not yet been reported since they stopped, shall
    also be reported to the requesting process.

> ptrace'd processes:

ptrace is a red herring.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14 14:12                                           ` Andreas Schwab
@ 2007-03-14 15:02                                             ` Kim F. Storm
  2007-03-14 16:34                                               ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-14 15:02 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> storm@cua.dk (Kim F. Storm) writes:
>
>> But the WUNTRACED flag doesn't mean we don't get info about
>> stopped processes,
>
> WUNTRACED
>     The status of any child processes specified by pid that are stopped,
>     and whose status has not yet been reported since they stopped, shall
>     also be reported to the requesting process.

That's what the doc says, but that's not what the 2.4 Linux kernel
code seems to be doing (I included the relevant snippet of sys_wait4
in my previous message).

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: busyloop in sigchld_handler
  2007-03-14 15:02                                             ` Kim F. Storm
@ 2007-03-14 16:34                                               ` Andreas Schwab
  2007-03-16  9:34                                                 ` Kim F. Storm
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-14 16:34 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Sam Steingold, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> That's what the doc says, but that's not what the 2.4 Linux kernel
> code seems to be doing (I included the relevant snippet of sys_wait4
> in my previous message).

The snippet exactly matches the intended behaviour.  Without WUNTRACED,
only traced processes are reported.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-14 16:34                                               ` Andreas Schwab
@ 2007-03-16  9:34                                                 ` Kim F. Storm
  2007-03-16  9:59                                                   ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-16  9:34 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sam Steingold, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> storm@cua.dk (Kim F. Storm) writes:
>
>> That's what the doc says, but that's not what the 2.4 Linux kernel
>> code seems to be doing (I included the relevant snippet of sys_wait4
>> in my previous message).
>
> The snippet exactly matches the intended behaviour.  Without WUNTRACED,
> only traced processes are reported.

So it is "intended behaviour" vs. "documented behaviour".

The man page talks about "stopped processes" -- it doesn't mention
(un)traced processes at all.

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

* Re: busyloop in sigchld_handler
  2007-03-16  9:34                                                 ` Kim F. Storm
@ 2007-03-16  9:59                                                   ` Andreas Schwab
  0 siblings, 0 replies; 58+ messages in thread
From: Andreas Schwab @ 2007-03-16  9:59 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Sam Steingold, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> The man page talks about "stopped processes" -- it doesn't mention
> (un)traced processes at all.

There are two classes of stopped processes: traced ones and untraced
ones. See the rationale section in the POSIX manpage.  ptrace is not part
of the standard, so it is generally ignored in the normative part of it.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: busyloop in sigchld_handler
  2007-03-13  7:29                   ` David Kastrup
  2007-03-13  9:29                     ` Andreas Schwab
@ 2007-03-14  3:24                     ` Richard Stallman
  2007-03-14 17:34                       ` David Kastrup
  1 sibling, 1 reply; 58+ messages in thread
From: Richard Stallman @ 2007-03-14  3:24 UTC (permalink / raw)
  To: David Kastrup; +Cc: schwab, sds, emacs-devel

    Again: I don't see that this guaranteed one signal per child, and you
    did not explain how it could.

One signal per child?  I seem to recall hearing of a similar project.
Can the system deliver this signal for only $100?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-14  3:24                     ` Richard Stallman
@ 2007-03-14 17:34                       ` David Kastrup
  0 siblings, 0 replies; 58+ messages in thread
From: David Kastrup @ 2007-03-14 17:34 UTC (permalink / raw)
  To: emacs-devel

Richard Stallman <rms@gnu.org> writes:

>     Again: I don't see that this guaranteed one signal per child, and you
>     did not explain how it could.
>
> One signal per child?  I seem to recall hearing of a similar project.
> Can the system deliver this signal for only $100?

Much cheaper, but then only dying children are signaled.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-11 18:33 busyloop in sigchld_handler Sam Steingold
  2007-03-11 19:39 ` Kim F. Storm
@ 2007-03-26  1:47 ` YAMAMOTO Mitsuharu
  2007-03-26  2:02   ` Sam Steingold
  1 sibling, 1 reply; 58+ messages in thread
From: YAMAMOTO Mitsuharu @ 2007-03-26  1:47 UTC (permalink / raw)
  To: sds; +Cc: emacs-devel

The following change (without a ChangeLog entry) made operations using
subprocesses really sluggish on Mac OS X.  Is it possible to restrict
the workaround to the relevant platforms?

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

===================================================================
RCS file: /sources/emacs/emacs/src/process.c,v
retrieving revision 1.505
retrieving revision 1.506
diff -c -r1.505 -r1.506
*** emacs/src/process.c	2007/03/20 08:51:03	1.505
--- emacs/src/process.c	2007/03/25 03:03:40	1.506
***************
*** 6501,6517 ****
  #define WUNTRACED 0
  #endif /* no WUNTRACED */
        /* Keep trying to get a status until we get a definitive result.  */
!       while (1)
! 	{
  	  errno = 0;
  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
- 	  if (! (pid < 0 && errno == EINTR))
- 	    break;
- 	  /* Avoid a busyloop: wait3 is a system call, so we do not want
- 	     to prevent the kernel from actually sending SIGCHLD to emacs
- 	     by asking for it all the time.  */
- 	  sleep (1);
  	}
  
        if (pid <= 0)
  	{
--- 6501,6517 ----
  #define WUNTRACED 0
  #endif /* no WUNTRACED */
        /* Keep trying to get a status until we get a definitive result.  */
!       do
!         {
! 	  /* For some reason, this sleep() prevents Emacs from sending
!              loadavg to 5-8(!) for ~10 seconds.
!              See http://thread.gmane.org/gmane.emacs.devel/67722 or
!              http://www.google.com/search?q=busyloop+in+sigchld_handler */
!           sleep (1);
  	  errno = 0;
  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
  	}
+       while (pid < 0 && errno == EINTR);
  
        if (pid <= 0)
  	{

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-26  1:47 ` YAMAMOTO Mitsuharu
@ 2007-03-26  2:02   ` Sam Steingold
  2007-03-26  2:17     ` YAMAMOTO Mitsuharu
  2007-03-28 10:02     ` Kim F. Storm
  0 siblings, 2 replies; 58+ messages in thread
From: Sam Steingold @ 2007-03-26  2:02 UTC (permalink / raw)
  To: emacs-devel

> * YAMAMOTO Mitsuharu <zvghuneh@zngu.f.puvon-h.np.wc> [2007-03-26 10:47:41 +0900]:
>
> The following change (without a ChangeLog entry) made operations using
> subprocesses really sluggish on Mac OS X.  Is it possible to restrict
> the workaround to the relevant platforms?

sorry - this patch merely reverted another one (which did not merit a
ChangeLog entry either :-)

how about this?
one millisecond is enough to fix my problem - does it fix yours?

--- process.c	24 Mar 2007 23:02:12 -0400	1.506
+++ process.c	25 Mar 2007 21:54:24 -0400	
@@ -6507,7 +6507,7 @@
              loadavg to 5-8(!) for ~10 seconds.
              See http://thread.gmane.org/gmane.emacs.devel/67722 or
              http://www.google.com/search?q=busyloop+in+sigchld_handler */
-          sleep (1);
+          usleep (1000);
 	  errno = 0;
 	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
 	}



-- 
Sam Steingold (http://sds.podval.org/) on Fedora Core release 6 (Zod)
http://honestreporting.com http://mideasttruth.com http://jihadwatch.org
http://camera.org http://ffii.org http://openvotingconsortium.org
"Syntactic sugar causes cancer of the semicolon."	-Alan Perlis

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-26  2:02   ` Sam Steingold
@ 2007-03-26  2:17     ` YAMAMOTO Mitsuharu
  2007-03-28 10:02     ` Kim F. Storm
  1 sibling, 0 replies; 58+ messages in thread
From: YAMAMOTO Mitsuharu @ 2007-03-26  2:17 UTC (permalink / raw)
  To: emacs-devel

>>>>> On Sun, 25 Mar 2007 22:02:44 -0400, Sam Steingold <sds@gnu.org> said:

>> The following change (without a ChangeLog entry) made operations
>> using subprocesses really sluggish on Mac OS X.  Is it possible to
>> restrict the workaround to the relevant platforms?

> sorry - this patch merely reverted another one (which did not merit
> a ChangeLog entry either :-)

> how about this?  one millisecond is enough to fix my problem - does
> it fix yours?

Yes.  At least, a one-shot one-millisecond delay is imperceptible to me.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-26  2:02   ` Sam Steingold
  2007-03-26  2:17     ` YAMAMOTO Mitsuharu
@ 2007-03-28 10:02     ` Kim F. Storm
  2007-03-28 15:19       ` Chong Yidong
  1 sibling, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-28 10:02 UTC (permalink / raw)
  To: emacs-devel

Sam Steingold <sds@gnu.org> writes:

>> * YAMAMOTO Mitsuharu <zvghuneh@zngu.f.puvon-h.np.wc> [2007-03-26 10:47:41 +0900]:
>>
>> The following change (without a ChangeLog entry) made operations using
>> subprocesses really sluggish on Mac OS X.  Is it possible to restrict
>> the workaround to the relevant platforms?
>
> sorry - this patch merely reverted another one (which did not merit a
> ChangeLog entry either :-)
>
> how about this?
> one millisecond is enough to fix my problem - does it fix yours?
>
> --- process.c	24 Mar 2007 23:02:12 -0400	1.506
> +++ process.c	25 Mar 2007 21:54:24 -0400	
> @@ -6507,7 +6507,7 @@
>               loadavg to 5-8(!) for ~10 seconds.
>               See http://thread.gmane.org/gmane.emacs.devel/67722 or
>               http://www.google.com/search?q=busyloop+in+sigchld_handler */
> -          sleep (1);
> +          usleep (1000);
>  	  errno = 0;
>  	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
>  	}
>

Do all platforms have usleep ?

Otherwise, we need a check in configure and use

#ifdef HAVE_USLEEP
    usleep(1000);
#endif

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 10:02     ` Kim F. Storm
@ 2007-03-28 15:19       ` Chong Yidong
  2007-03-28 15:25         ` Andreas Schwab
  2007-03-28 15:30         ` Alfred M. Szmidt
  0 siblings, 2 replies; 58+ messages in thread
From: Chong Yidong @ 2007-03-28 15:19 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: emacs-devel

> Do all platforms have usleep ?

usleep is not in POSIX, so we can't assume it exists.

> Otherwise, we need a check in configure and use
>
> #ifdef HAVE_USLEEP
>     usleep(1000);
> #endif

... or we could use nanosleep, which is in POSIX.
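
A one-millisecond pause with nanosleep might look roughly like the
sketch below.  The helper name sleep_millis is hypothetical and is not
part of process.c; the sketch only shows the POSIX call and how to
resume the sleep after an EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for roughly MS milliseconds using POSIX
   nanosleep, resuming if a signal interrupts the sleep.  Returns 0 on
   success, -1 on any error other than EINTR.  */
static int
sleep_millis (long ms)
{
  struct timespec req, rem;

  req.tv_sec = ms / 1000;
  req.tv_nsec = (ms % 1000) * 1000000L;

  while (nanosleep (&req, &rem) < 0)
    {
      if (errno != EINTR)
        return -1;
      req = rem;   /* continue with the time still remaining */
    }
  return 0;
}
```

Calling sleep_millis (1) would then stand in for usleep (1000) in the
patch quoted above.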

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 15:19       ` Chong Yidong
@ 2007-03-28 15:25         ` Andreas Schwab
  2007-03-28 15:33           ` Kim F. Storm
  2007-03-28 15:30         ` Alfred M. Szmidt
  1 sibling, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-28 15:25 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel, Kim F. Storm

Chong Yidong <cyd@stupidchicken.com> writes:

>> Do all platforms have usleep ?
>
> usleep is not in POSIX, so we can't assume it exists.
>
>> Otherwise, we need a check in configure and use
>>
>> #ifdef HAVE_USLEEP
>>     usleep(1000);
>> #endif
>
> ... or we could use nanosleep, which is in POSIX.

We should probably just remove the loop altogether.

--- process.c	28 Mär 2007 12:54:58 +0200	1.508
+++ process.c	28 Mär 2007 16:19:54 +0200	
@@ -6499,18 +6499,7 @@ sigchld_handler (signo)
 #ifndef WUNTRACED
 #define WUNTRACED 0
 #endif /* no WUNTRACED */
-      /* Keep trying to get a status until we get a definitive result.  */
-      do
-        {
-	  /* For some reason, this sleep() prevents Emacs from sending
-             loadavg to 5-8(!) for ~10 seconds.
-             See http://thread.gmane.org/gmane.emacs.devel/67722 or
-             http://www.google.com/search?q=busyloop+in+sigchld_handler */
-          usleep (1000);
-	  errno = 0;
-	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
-	}
-      while (pid < 0 && errno == EINTR);
+      pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
 
       if (pid <= 0)
 	{

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 15:25         ` Andreas Schwab
@ 2007-03-28 15:33           ` Kim F. Storm
  2007-03-28 15:37             ` Andreas Schwab
  0 siblings, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-28 15:33 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Chong Yidong, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

> We should probably just remove the loop altogether.

The problem is not the loop, but that the first wait3 will take
"forever" without the preceding sleep.  Nobody has yet explained why.

>
> --- process.c	28 Mär 2007 12:54:58 +0200	1.508
> +++ process.c	28 Mär 2007 16:19:54 +0200	
> @@ -6499,18 +6499,7 @@ sigchld_handler (signo)
>  #ifndef WUNTRACED
>  #define WUNTRACED 0
>  #endif /* no WUNTRACED */
> -      /* Keep trying to get a status until we get a definitive result.  */
> -      do
> -        {
> -	  /* For some reason, this sleep() prevents Emacs from sending
> -             loadavg to 5-8(!) for ~10 seconds.
> -             See http://thread.gmane.org/gmane.emacs.devel/67722 or
> -             http://www.google.com/search?q=busyloop+in+sigchld_handler */
> -          usleep (1000);
> -	  errno = 0;
> -	  pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
> -	}
> -      while (pid < 0 && errno == EINTR);
> +      pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
>  
>        if (pid <= 0)
>  	{
>
> Andreas.
>
> -- 
> Andreas Schwab, SuSE Labs, schwab@suse.de
> SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
> PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."
>
>

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 15:33           ` Kim F. Storm
@ 2007-03-28 15:37             ` Andreas Schwab
  2007-03-28 20:18               ` Kim F. Storm
  0 siblings, 1 reply; 58+ messages in thread
From: Andreas Schwab @ 2007-03-28 15:37 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: Chong Yidong, emacs-devel

storm@cua.dk (Kim F. Storm) writes:

> Andreas Schwab <schwab@suse.de> writes:
>
>> We should probably just remove the loop altogether.
>
> The problem is not the loop, but that the first wait3 will take
> "forever" without the preceding sleep.

If wait3 would block here then it would be a serious kernel bug, or
someone is defining WNOHANG to 0.
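
The expected non-blocking behaviour can be checked in isolation.  This
is only a sketch under stated assumptions: the helper name
poll_running_child is hypothetical, and it uses POSIX waitpid instead
of the wait3 call in process.c.  It forks a child that does nothing but
wait, polls it once with WNOHANG, and returns what the poll returned.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: returns the result of one non-blocking poll of a
   child that is still running.  0 means "the child exists but has not
   changed state"; -1 means fork failed.  */
static pid_t
poll_running_child (void)
{
  int status = 0;
  pid_t child = fork ();

  if (child < 0)
    return -1;
  if (child == 0)
    {
      pause ();          /* child: sleep until a signal kills it */
      _exit (0);
    }

  /* Must return at once because WNOHANG is set.  */
  pid_t polled = waitpid (child, &status, WNOHANG | WUNTRACED);

  kill (child, SIGKILL);         /* clean up the helper child */
  waitpid (child, &status, 0);   /* blocking reap, so no zombie remains */
  return polled;
}
```

A return value of 0 from the poll is the normal case; a call that does
not return promptly would point to one of the two causes named above.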

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 15:37             ` Andreas Schwab
@ 2007-03-28 20:18               ` Kim F. Storm
  2007-03-29 17:59                 ` Richard Stallman
  0 siblings, 1 reply; 58+ messages in thread
From: Kim F. Storm @ 2007-03-28 20:18 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Chong Yidong, emacs-devel

Andreas Schwab <schwab@suse.de> writes:

>> The problem is not the loop, but that the first wait3 will take
>> "forever" without the preceding sleep.
>
> If wait3 would block here then it would be a serious kernel bug, or
> someone is defining WNOHANG to 0.

This is observed with Linux 2.4 and 2.6 kernels -- so maybe
it could be conditioned with GNU_LINUX...?

-- 
Kim F. Storm <storm@cua.dk> http://www.cua.dk

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 20:18               ` Kim F. Storm
@ 2007-03-29 17:59                 ` Richard Stallman
  0 siblings, 0 replies; 58+ messages in thread
From: Richard Stallman @ 2007-03-29 17:59 UTC (permalink / raw)
  To: Kim F. Storm; +Cc: schwab, cyd, emacs-devel

    This is observed with Linux 2.4 and 2.6 kernels -- so maybe
    it could be conditioned with GNU_LINUX...?

If it is limited to GNU/Linux, it is ok to use usleep.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: busyloop in sigchld_handler
  2007-03-28 15:19       ` Chong Yidong
  2007-03-28 15:25         ` Andreas Schwab
@ 2007-03-28 15:30         ` Alfred M. Szmidt
  1 sibling, 0 replies; 58+ messages in thread
From: Alfred M. Szmidt @ 2007-03-28 15:30 UTC (permalink / raw)
  To: Chong Yidong; +Cc: emacs-devel, storm

   > Do all platforms have usleep ?

   usleep is not in POSIX, so we can't assume it exists.

Actually, usleep is in POSIX, but it is an extension (which just means
that one can use it safely on any system supporting the X/Open System
Interfaces Extension), and it is marked as obsolete.

^ permalink raw reply	[flat|nested] 58+ messages in thread
