unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Running two processes rapidly makes Emacs eat 100% CPU on w32
@ 2006-10-12  8:44 Slawomir Nowaczyk
  2006-10-12 10:47 ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: Slawomir Nowaczyk @ 2006-10-12  8:44 UTC (permalink / raw)


Hello,

I have tested on M$ Windows only, I do not know if the problem also
exists on other systems...

Evaluating the following code
  (progn (start-process "" nil "ls") (call-process "ls"))
makes my emacs eat 100% CPU. Emacs is still responsive, I can edit
things etc., but the CPU usage remains extreme. When I try to exit, it
asks whether to kill an active process "ls". If I answer yes, emacs
exits cleanly.

Changing the code to
  (progn (start-process "" nil "ls") (sleep-for 0.1) (call-process "ls"))
makes the problem go away (but sleeping for 0.01 is not enough on my
machine).

I have encountered the problem when I tried to upgrade to newest tramp,
which seems to run several processes one after another in order to test
some things (if anyone is interested, the code is in tramp-util.el,
lines 104 to 180 -- at least as of tramp-2.1.7.tar.gz).

I have also seen similar behaviour (which I believe -- and hope -- is
caused by the same bug) with ispell... changing buffers back and forth
rapidly made my emacs, occasionally, start eating 100% CPU as well -- I
now think that the buffers might have been using different dictionaries
and flyspell tried restarting ispell with different arguments too
quickly.

What is the best way to debug such problem? toggle-debug-on-quit is not
helpful, since the emacs itself remains responsive and the traceback it
provides is useless. I do not have C debugger on this machine, although
I might try installing one if that's the only way to go.


In GNU Emacs 22.0.50.1 (i386-msvc-nt5.1.2600)
 of 2006-09-27 on FIONA
X server distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-msvc (13.10) --cflags -IM:/EmacsCVS/libraries/include/'

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: ENG
  locale-coding-system: cp1252
  default-enable-multibyte-characters: t

-- 
 Best wishes,
   Slawomir Nowaczyk
     ( slawomir.nowaczyk.847@student.lu.se )

Programming:  "Bloody instructions which, being taught, return to
plague their inventor" ("Macbeth", Act 1, Scene 7)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-10-12  8:44 Running two processes rapidly makes Emacs eat 100% CPU on w32 Slawomir Nowaczyk
@ 2006-10-12 10:47 ` Eli Zaretskii
  2006-10-12 13:52   ` Slawomir Nowaczyk
  0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2006-10-12 10:47 UTC (permalink / raw)
  Cc: emacs-devel

> Date: Thu, 12 Oct 2006 10:44:48 +0200
> From: Slawomir Nowaczyk <slawomir.nowaczyk.847@student.lu.se>
> 
> What is the best way to debug such problem?

Try to figure out what happens inside w32proc.c:sys_select would be a
beginning, I think.  There are already quite a few DebPrint statements
in that function; compiling with EMACSDEBUG defined will activate
them.  You should be able to see the strings emitted by DebPrint with
any debugger; or you can install DbgView as a lightweight solution.

Once you have this in place, add DebPrint as needed to see what is
going on there.

TIA

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-10-12 10:47 ` Eli Zaretskii
@ 2006-10-12 13:52   ` Slawomir Nowaczyk
  2006-10-13 15:50     ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: Slawomir Nowaczyk @ 2006-10-12 13:52 UTC (permalink / raw)


On Thu, 12 Oct 2006 06:47:19 -0400
Eli Zaretskii <eliz@gnu.org> wrote:

#> > Date: Thu, 12 Oct 2006 10:44:48 +0200
#> > From: Slawomir Nowaczyk <slawomir.nowaczyk.847@student.lu.se>
#> > 
#> > What is the best way to debug such problem?
#> 
#> Try to figure out what happens inside w32proc.c:sys_select would be a
#> beginning, I think. There are already quite a few DebPrint statements
#> in that function; compiling with EMACSDEBUG defined will activate
#> them. You should be able to see the strings emitted by DebPrint with
#> any debugger; or you can install DbgView as a lightweight solution.

Thanks for the advice, I managed to activate DebPrint. I have added one
which prints the return value of w32proc:sys_select, and it is "1"
consistently. None of the DebPrint statements already present in this
function gets executed, neither in normal work nor when Emacs eats all
the CPU.

Your guess that sys_select is to be blamed was likely correct, though,
as (under normal circumstances) my newly added DebPrint executes a
couple of times per second, while after I execute the problematic code,
it gets called thousands of times per second.

There were two DebPrint statements guarded by FULL_DEBUG, but I have
activated those as well. Now the one from line 1175:
		DebPrint (("select waiting on child %d fd %d\n",
			   cp-child_procs, i));
keeps printing "select waiting on child 0 fd 3" (thousands of times per
second, every time sys_select is called.

Anyway, the log of DebPrints looks like this:

return 1                                           ;; this part corresponds to correctly working emacs
return 1
...
return 1                                           ;; there are a couple (1-10) of those per second
return 1
                                                   ;; now I executed (progn (start-process "" nil "ls") (call-process "ls"))
"select waiting on child 0 fd 3"
"select calling SIGCHLD handler for pid 1668"      ;; observe that this line only happens ONCE
return 1                                           
"select waiting on child 0 fd 3"                   ;; and this repeats, thousands of times per second
return 1                                           ;;    I got about 40.000 of those in 7 seconds
"select waiting on child 0 fd 3"
return 1
...
"select waiting on child 0 fd 3"
return 1

#> Once you have this in place, add DebPrint as needed to see what is
#> going on there.

w32proc:sys_select is much too complex for me to understand quickly. I
will try to look at the code and do some more debugging tomorrow evening
if I have time...

In the meantime, if anyone has any ideas as to where would be a good
place to put some DebPrint statements, I would be grateful.

-- 
 Best wishes,
   Slawomir Nowaczyk
     ( slawomir.nowaczyk.847@student.lu.se )

I plan on living forever. So far, so good.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-10-12 13:52   ` Slawomir Nowaczyk
@ 2006-10-13 15:50     ` Eli Zaretskii
  2006-10-14  7:53       ` Slawomir Nowaczyk
  2006-10-27 21:36       ` Slawomir Nowaczyk
  0 siblings, 2 replies; 8+ messages in thread
From: Eli Zaretskii @ 2006-10-13 15:50 UTC (permalink / raw)
  Cc: emacs-devel

> Date: Thu, 12 Oct 2006 15:52:20 +0200
> From: Slawomir Nowaczyk <slawomir.nowaczyk.847@student.lu.se>
> 
> Thanks for the advice, I managed to activate DebPrint. I have added one
> which prints the return value of w32proc:sys_select, and it is "1"
> consistently. None of the DebPrint statements already present in this
> function gets executed, neither in normal work nor when Emacs eats all
> the CPU.
> 
> Your guess that sys_select is to be blamed was likely correct, though,
> as (under normal circumstances) my newly added DebPrint executes a
> couple of times per second, while after I execute the problematic code,
> it gets called thousands of times per second.
> 
> There were two DebPrint statements guarded by FULL_DEBUG, but I have
> activated those as well. Now the one from line 1175:
> 		DebPrint (("select waiting on child %d fd %d\n",
> 			   cp-child_procs, i));
> keeps printing "select waiting on child 0 fd 3" (thousands of times per
> second, every time sys_select is called.

Looks like somehow Emacs doesn't pay attention that the process
exited, and keeps trying to read its pipe.  Do you agree with this
conclusion?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-10-13 15:50     ` Eli Zaretskii
@ 2006-10-14  7:53       ` Slawomir Nowaczyk
  2006-10-27 21:36       ` Slawomir Nowaczyk
  1 sibling, 0 replies; 8+ messages in thread
From: Slawomir Nowaczyk @ 2006-10-14  7:53 UTC (permalink / raw)


On Fri, 13 Oct 2006 17:50:35 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

#> > Your guess that sys_select is to be blamed was likely correct, though,
#> > as (under normal circumstances) my newly added DebPrint executes a
#> > couple of times per second, while after I execute the problematic code,
#> > it gets called thousands of times per second.
#> > 
#> > There were two DebPrint statements guarded by FULL_DEBUG, but I have
#> > activated those as well. Now the one from line 1175:
#> > 		DebPrint (("select waiting on child %d fd %d\n",
#> > 			   cp-child_procs, i));
#> > keeps printing "select waiting on child 0 fd 3" (thousands of times per
#> > second, every time sys_select is called.
#> 
#> Looks like somehow Emacs doesn't pay attention that the process
#> exited, and keeps trying to read its pipe. Do you agree with this
#> conclusion?

Possibly... I didn't have time yet to delve into sys_select and
understand what it does, but I have noticed that executing
(start-process "" nil "sleep" "5") makes several "select waiting on
child 0 fd 3" lines and one "select calling SIGCHLD handler for pid
4616" appear in the log, while executing (call-process "sleep" nil
nil nil "5") does not (no output other than "sys_select returned 1"
appears).

I do not know what sys_select does differently for call-process than for
start-process, will look into it. For now, however, my idea is that if
the process started by start-process exits while emacs is waiting for
one started by call-process, the code gets confused in some way and
continues waiting, instead of doing whatever it is supposed to do. When
later the other process exits and emacs has nothing else to wait for,
the 100% CPU behaviour happens... Does it make sense?

One additional thing:
 (progn (start-process "" nil "sleep" "10") (call-process "sleep" nil nil nil  "5"))
works OK, while
 (progn (start-process "" nil "sleep"  "5") (call-process "sleep" nil nil nil "10"))
makes emacs eat 100% CPU (but only after 10 seconds, not after 5) -- which seems to
confirm my theory, doesn't it?

-- 
 Best wishes,
   Slawomir Nowaczyk
     ( slawomir.nowaczyk.847@student.lu.se )

Some mistakes are too much fun to only make once.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-10-13 15:50     ` Eli Zaretskii
  2006-10-14  7:53       ` Slawomir Nowaczyk
@ 2006-10-27 21:36       ` Slawomir Nowaczyk
  2006-11-03 21:11         ` Slawomir Nowaczyk
  1 sibling, 1 reply; 8+ messages in thread
From: Slawomir Nowaczyk @ 2006-10-27 21:36 UTC (permalink / raw)


On Fri, 13 Oct 2006 17:50:35 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

#> > 		DebPrint (("select waiting on child %d fd %d\n",
#> > 			   cp-child_procs, i));
#> > keeps printing "select waiting on child 0 fd 3" (thousands of times per
#> > second, every time sys_select is called.
#> 
#> Looks like somehow Emacs doesn't pay attention that the process
#> exited, and keeps trying to read its pipe. Do you agree with this
#> conclusion?

I have finally found some time to dig deeper into this issue. I still
don't quite understand the code, so take all that follows with a (big)
grain of salt...

For those who forgot, evaluating the following code makes Emacs eat 100%
CPU on my Windows machine: (progn (start-process "" nil "ls") (call-process "ls"))

I have tracked the problem in sys_select to the fact that
cp->procinfo.hProcess was being set to NULL prematurely... this caused
cp->the "if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess" test on line
cp->1202 to fail, in effect preventing Emacs from calling SIGCHLD
cp->properly.

Following this discovery, I have verified that hProcess is being set to
NULL by reap_subprocess, which get called from sys_wait.

Some poking around in sys_wait made me aware that line 508
"active = WaitForMultipleObjects (nh, wait_hnd, FALSE, 1000);"
is returning the "wrong" process... it was returning one created by
start-process, while it should have been returning the one created by
call-process (the distinction is important because this process was to
be reap_subprocess'ed immediately -- which is OK for processes from
call-process, but wrong for ones from start-process).

This has lead me to a code some 20 lines above, which actually produces
a list of processes to be waited for. It gathers all children which
fulfill the condition "if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess)".

It seems that this condition is too weak. I am not sure what should it
be, but sys_select in line 1206 uses, in similar circumstances, the
following one:
    if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess
	&& (cp->fd < 0
	    || (fd_info[cp->fd].flags & FILE_SEND_SIGCHLD) == 0
	    || (fd_info[cp->fd].flags & FILE_AT_EOF) != 0))

The "(fd_info[cp->fd].flags & FILE_SEND_SIGCHLD) == 0" part is clearly
unsuitable for sys_wait, but the rest gives good results for me.

So, the following patch fixes the problem for me, but I have no way of
knowing if it really is a correct solution. I have been running with it
for a couple of days now and haven't notice anything wrong, though. I
have also verified that both processes are correctly reap_subprocess'ed
in the (progn (start-process "" nil "ls") (call-process "ls")) case.

**********************************************************************

--- m:/EmacsCVS/EmacsCVS/src/w32proc_orig.c     2006-09-26 20:28:27.518832000 +0200
+++ m:/EmacsCVS/EmacsCVS/src/w32proc.c  2006-10-27 00:12:26.527475200 +0200
@@ -486,7 +486,8 @@
     {
       for (cp = child_procs+(child_proc_count-1); cp >= child_procs; cp--)
        /* some child_procs might be sockets; ignore them */
-       if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess)
+       if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess
+      && (cp->fd < 0 || (fd_info[cp->fd].flags & FILE_AT_EOF) != 0))
          {
            wait_hnd[nh] = cp->procinfo.hProcess;
            cps[nh] = cp;

**********************************************************************

-- 
 Best wishes,
   Slawomir Nowaczyk
     ( slawomir.nowaczyk.847@student.lu.se )

The best performance improvement is the transition
  from the nonworking state to the working state.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-10-27 21:36       ` Slawomir Nowaczyk
@ 2006-11-03 21:11         ` Slawomir Nowaczyk
  2006-11-04  3:37           ` Chong Yidong
  0 siblings, 1 reply; 8+ messages in thread
From: Slawomir Nowaczyk @ 2006-11-03 21:11 UTC (permalink / raw)


On Fri, 27 Oct 2006 23:36:22 +0200
Slawomir Nowaczyk <slawomir.nowaczyk.847@student.lu.se> wrote:

#> So, the following patch fixes the problem for me, but I have no way of
#> knowing if it really is a correct solution. I have been running with it
#> for a couple of days now and haven't notice anything wrong, though. I
#> have also verified that both processes are correctly reap_subprocess'ed
#> in the (progn (start-process "" nil "ls") (call-process "ls")) case.
#> 
#> **********************************************************************
#> 
#> --- m:/EmacsCVS/EmacsCVS/src/w32proc_orig.c     2006-09-26 20:28:27.518832000 +0200
#> +++ m:/EmacsCVS/EmacsCVS/src/w32proc.c  2006-10-27 00:12:26.527475200 +0200
#> @@ -486,7 +486,8 @@
#>      {
#>        for (cp = child_procs+(child_proc_count-1); cp >= child_procs; cp--)
#>         /* some child_procs might be sockets; ignore them */
#> -       if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess)
#> +       if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess
#> +      && (cp->fd < 0 || (fd_info[cp->fd].flags & FILE_AT_EOF) != 0))
#>           {
#>             wait_hnd[nh] = cp->procinfo.hProcess;
#>             cps[nh] = cp;
#> 
#> **********************************************************************

I have now been running with this patch applied for over a week and
haven't seen a single instance of emacs eating 100% CPU... before, in my
setup, it was happening at least twice or three times per week.

I have not noticed any other problems, either.

Therefore, I suggest applying this patch.

**********************************************************************

Here is changelog entry:

2006-10-27  Slawomir Nowaczyk  <slawek@cs.lth.se>

	* w32proc.c: (sys_wait) Only wait for processes with fd<0.
	  Others should be handled by sys_select instead. Fixes problems
	  with (progn (start-process "" nil "ls") (call-process "ls"))

**********************************************************************

-- 
 Best wishes,
   Slawomir Nowaczyk
     ( slawomir.nowaczyk.847@student.lu.se )

Nostalgia isn't what it used to be.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
  2006-11-03 21:11         ` Slawomir Nowaczyk
@ 2006-11-04  3:37           ` Chong Yidong
  0 siblings, 0 replies; 8+ messages in thread
From: Chong Yidong @ 2006-11-04  3:37 UTC (permalink / raw)
  Cc: emacs-devel

Slawomir Nowaczyk <slawomir.nowaczyk.847@student.lu.se> writes:

> I have now been running with this patch applied for over a week and
> haven't seen a single instance of emacs eating 100% CPU... before, in my
> setup, it was happening at least twice or three times per week.
>
> I have not noticed any other problems, either.
>
> Therefore, I suggest applying this patch.

Committed.  Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-11-04  3:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-12  8:44 Running two processes rapidly makes Emacs eat 100% CPU on w32 Slawomir Nowaczyk
2006-10-12 10:47 ` Eli Zaretskii
2006-10-12 13:52   ` Slawomir Nowaczyk
2006-10-13 15:50     ` Eli Zaretskii
2006-10-14  7:53       ` Slawomir Nowaczyk
2006-10-27 21:36       ` Slawomir Nowaczyk
2006-11-03 21:11         ` Slawomir Nowaczyk
2006-11-04  3:37           ` Chong Yidong

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).