From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Slawomir Nowaczyk <slawomir.nowaczyk.847@student.lu.se>
Newsgroups: gmane.emacs.devel
Subject: Re: Running two processes rapidly makes Emacs eat 100% CPU on w32
Date: Fri, 27 Oct 2006 23:36:22 +0200
Message-ID: <20061027001549.E487.SLAWOMIR.NOWACZYK.847@student.lu.se>
References: <20061012145009.C39D.SLAWOMIR.NOWACZYK.847@student.lu.se>
	<ulknki59w.fsf@gnu.org>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
X-Trace: sea.gmane.org 1161985144 17456 80.91.229.2 (27 Oct 2006 21:39:04 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Fri, 27 Oct 2006 21:39:04 +0000 (UTC)
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Oct 27 23:38:57 2006
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43)
	id 1GdZNc-0002of-Qi
	for ged-emacs-devel@m.gmane.org; Fri, 27 Oct 2006 23:36:46 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1GdZNc-0000g0-2B
	for ged-emacs-devel@m.gmane.org; Fri, 27 Oct 2006 17:36:44 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1GdZNO-0000eV-Lx
	for emacs-devel@gnu.org; Fri, 27 Oct 2006 17:36:30 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1GdZNN-0000eG-0c
	for emacs-devel@gnu.org; Fri, 27 Oct 2006 17:36:30 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1GdZNM-0000eD-RE
	for emacs-devel@gnu.org; Fri, 27 Oct 2006 17:36:28 -0400
Original-Received: from [130.235.16.11] (helo=himmelsborg.cs.lth.se)
	by monty-python.gnu.org with esmtp (Exim 4.52) id 1GdZNM-0007Il-IW
	for emacs-devel@gnu.org; Fri, 27 Oct 2006 17:36:28 -0400
Original-Received: from [127.0.0.1] (slawek@dain [130.235.16.76])
	by himmelsborg.cs.lth.se (8.13.6/8.13.6/perf-jw-tr) with ESMTP id
	k9RLaNnU005109
	for <emacs-devel@gnu.org>; Fri, 27 Oct 2006 23:36:24 +0200 (CEST)
Original-To: emacs-devel@gnu.org
In-Reply-To: <ulknki59w.fsf@gnu.org>
X-Esmandil_Citation: done
X-Mailer-Plugin: Popup Memopad for Becky!2 Ver.0.02 Rev.2
X-Mailer: Becky! ver. 2.25.02 [en]
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:61256
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/61256>

On Fri, 13 Oct 2006 17:50:35 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

#> > 		DebPrint (("select waiting on child %d fd %d\n",
#> > 			   cp-child_procs, i));
#> > keeps printing "select waiting on child 0 fd 3" (thousands of times per
#> > second, every time sys_select is called).
#> 
#> Looks like somehow Emacs doesn't pay attention that the process
#> exited, and keeps trying to read its pipe. Do you agree with this
#> conclusion?

I have finally found some time to dig deeper into this issue. I still
don't quite understand the code, so take all that follows with a (big)
grain of salt...

For those who forgot, evaluating the following code makes Emacs eat 100%
CPU on my Windows machine:

    (progn (start-process "" nil "ls") (call-process "ls"))

I have tracked the problem in sys_select to the fact that
cp->procinfo.hProcess was being set to NULL prematurely... this caused
the "if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess" test on line
1202 to fail, in effect preventing Emacs from calling SIGCHLD
properly.

Following this discovery, I have verified that hProcess is being set to
NULL by reap_subprocess, which gets called from sys_wait.

Some poking around in sys_wait made me aware that line 508
"active = WaitForMultipleObjects (nh, wait_hnd, FALSE, 1000);"
is returning the "wrong" process... it was returning one created by
start-process, while it should have been returning the one created by
call-process (the distinction is important because this process was to
be reap_subprocess'ed immediately -- which is OK for processes from
call-process, but wrong for ones from start-process).

This led me to the code some 20 lines above, which actually builds the
list of processes to wait for. It gathers all children which fulfill
the condition "if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess)".

It seems that this condition is too weak. I am not sure what it should
be, but sys_select in line 1206 uses, in similar circumstances, the
following one:
    if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess
	&& (cp->fd < 0
	    || (fd_info[cp->fd].flags & FILE_SEND_SIGCHLD) == 0
	    || (fd_info[cp->fd].flags & FILE_AT_EOF) != 0))

The "(fd_info[cp->fd].flags & FILE_SEND_SIGCHLD) == 0" part is clearly
unsuitable for sys_wait, but the rest gives good results for me.
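
To make the difference between the two conditions concrete, here is a
small stand-alone model in C. It is only a sketch: the struct layouts,
the MODEL_FILE_AT_EOF value and all model_* / wait_candidate_* names
are invented for illustration and are not the real Emacs definitions.
It also assumes that the call-process child has no pipe registered
(fd < 0) while the start-process child reads from fd 3, as in the
debug output quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Emacs data structures; the real
   flag value and struct members differ.  */
#define MODEL_FILE_AT_EOF 0x1u

struct model_fd_info { unsigned flags; };

struct model_child
{
  bool active;     /* stands in for CHILD_ACTIVE (cp) */
  void *hProcess;  /* stands in for cp->procinfo.hProcess */
  int fd;          /* pipe descriptor, or -1 if there is none */
};

/* The condition sys_wait uses before the patch.  */
static bool
wait_candidate_old (const struct model_child *cp)
{
  return cp->active && cp->hProcess != NULL;
}

/* The condition after the patch: a child that still has a pipe
   which is not at EOF is left alone.  */
static bool
wait_candidate_new (const struct model_child *cp,
                    const struct model_fd_info *fd_info)
{
  return cp->active && cp->hProcess != NULL
         && (cp->fd < 0
             || (fd_info[cp->fd].flags & MODEL_FILE_AT_EOF) != 0);
}

/* Usage: one asynchronous child on fd 3, one synchronous child
   without a pipe (an assumption, see above).  */
static void
check_model (void)
{
  int dummy_handle = 0;
  struct model_fd_info fd_info[4] = { {0}, {0}, {0}, {0} };
  struct model_child async_child = { true, &dummy_handle, 3 };
  struct model_child sync_child = { true, &dummy_handle, -1 };

  /* Old condition: both children end up in the wait list.  */
  assert (wait_candidate_old (&async_child));
  assert (wait_candidate_old (&sync_child));

  /* New condition: only the child without a pipe is waited for.  */
  assert (!wait_candidate_new (&async_child, fd_info));
  assert (wait_candidate_new (&sync_child, fd_info));

  /* Once its pipe reaches EOF, the asynchronous child qualifies too.  */
  fd_info[3].flags |= MODEL_FILE_AT_EOF;
  assert (wait_candidate_new (&async_child, fd_info));
}
```

In this model the old condition lets the wait pick up the asynchronous
child while its pipe is still open, which matches the symptom described
above; with the stricter condition that child only becomes eligible
after its output has been read to EOF.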

So, the following patch fixes the problem for me, but I have no way of
knowing if it really is a correct solution. I have been running with it
for a couple of days now and haven't noticed anything wrong, though. I
have also verified that both processes are correctly reap_subprocess'ed
in the (progn (start-process "" nil "ls") (call-process "ls")) case.

**********************************************************************

--- m:/EmacsCVS/EmacsCVS/src/w32proc_orig.c     2006-09-26 20:28:27.518832000 +0200
+++ m:/EmacsCVS/EmacsCVS/src/w32proc.c  2006-10-27 00:12:26.527475200 +0200
@@ -486,7 +486,8 @@
     {
       for (cp = child_procs+(child_proc_count-1); cp >= child_procs; cp--)
        /* some child_procs might be sockets; ignore them */
-       if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess)
+       if (CHILD_ACTIVE (cp) && cp->procinfo.hProcess
+      && (cp->fd < 0 || (fd_info[cp->fd].flags & FILE_AT_EOF) != 0))
          {
            wait_hnd[nh] = cp->procinfo.hProcess;
            cps[nh] = cp;

**********************************************************************

-- 
 Best wishes,
   Slawomir Nowaczyk
     ( slawomir.nowaczyk.847@student.lu.se )

The best performance improvement is the transition
  from the nonworking state to the working state.