* bug#19868: 25.0.50; Compilation eats buffers
From: Richard Copley @ 2015-02-14 19:30 UTC
To: 19868

On Windows, with MinGW gcc.exe installed and on the path, save a file
"c:\temp\bug.c" containing these two lines:

    #include <windows.h>
    int main () { Sleep (5000); }

Compile with "M-x compile RET", supplying this compile-command:

    gcc -mwindows -o bug.exe bug.c && bug.exe

Within 5 seconds, execute "M-x compile" again and answer "yes" to kill
the existing process. The process doesn't respond to the signal, and
Emacs hangs inside the call to `delete-process' in `compilation-start'.

When the process does eventually die and the `delete-process' call
returns, the current buffer has changed from *compilation* to the buffer
from which the compilation was launched (which will often be a source
code buffer).

`compilation-start' then proceeds to erase the buffer and discard its
undo history. This is potentially very bad news for the user's source
code.

I'm not sure where the buffer gets changed (presumably in a sentinel,
but `compilation-sentinel' looks OK to me).

Wrapping the `delete-process' call inside a `save-excursion' fixes (or
hides?) the problem.

In GNU Emacs 25.0.50.1 (x86_64-w64-mingw32)
 of 2015-02-09 on MACHINE
Repository revision: 21d1f8b85eec8fc1f87bb30398e449f6b20b6ecc
Windowing system distributor `Microsoft Corp.', version 6.3.9600
Configured using:
 `configure --prefix /c/emacs/emacs-20150209-192633
 --disable-dependency-tracking
 --enable-locallisppath=%emacs_dir%/../site-lisp --with-wide-int
 --build=x86_64-w64-mingw32 'CPPFLAGS=-I G:/usr/include -I
 C:/GnuWin32/include' 'LDFLAGS=-L G:/usr/lib -L C:/GnuWin32/lib''

Configured features:
XPM JPEG TIFF GIF PNG SOUND NOTIFY ACL GNUTLS LIBXML2 ZLIB
* bug#19868: 25.0.50; Compilation eats buffers
From: Eli Zaretskii @ 2015-02-15 17:53 UTC
To: Richard Copley; +Cc: 19868

> Date: Sat, 14 Feb 2015 19:30:45 +0000
> From: Richard Copley <rcopley@gmail.com>
>
> On Windows, with MinGW gcc.exe installed and on the path, save a file
> "c:\temp\bug.c" containing these two lines:
>
>     #include <windows.h>
>     int main () { Sleep (5000); }
>
> Compile with "M-x compile RET", supplying this compile-command:
>     gcc -mwindows -o bug.exe bug.c && bug.exe
>
> Within 5 seconds, execute "M-x compile" again and answer "yes" to kill
> the existing process. The process doesn't respond to the signal,

There are no signals on Windows. Emacs simulates SIGINT and SIGKILL
by other means, see sys_kill.

> and Emacs hangs inside the call to `delete-process' in
> `compilation-start'.
>
> When the process does eventually die and the `delete-process' call
> returns, the current buffer has changed from *compilation* to the buffer
> from which the compilation was launched (which will often be a source
> code buffer).
>
> `compilation-start' then proceeds to erase the buffer and discard its
> undo history. This is potentially very bad news for the user's source
> code.

I cannot reproduce this: for me, Emacs doesn't hang at all. As soon
as I answer YES to the kill process question, I see in Process
Explorer that cmdproxy, cmd.exe, and the program that sleeps are all
terminated, and the new compilation begins. Like I'd expect.

If I instrument the sys_kill function, I see that we first send a
simulated Ctrl-C keystroke to the process, and a second afterwards
terminate it forcefully, which is consistent with the calls to
interrupt-process and delete-process in compilation-start.

I tried this on Windows 7 and XP, and both show the same correct
behavior.

It could be that what you see is specific to Windows 8, or to 64-bit
programs, or to how MinGW64 sets up the process in its startup code (I
used MinGW32).

You say above that Emacs hangs inside the delete-process call -- can
you show a backtrace in that state, preferably from an unoptimized
build? I'd like to see where exactly it hangs.

Also, is the -mwindows compiler switch a factor here, i.e. does the
problem happen with a console application that sleeps?

(I'm not sure it should matter, because the process that we are
killing is cmdproxy, not the program you compiled.)

In addition, can you look at the relevant processes in Process
Explorer and see if any of them are killed when you answer YES?

> I'm not sure where the buffer gets changed (presumably in a sentinel,
> but `compilation-sentinel' looks OK to me).

Run all this under GDB, put a breakpoint on a low-level function that
switches buffers (e.g., in set_buffer_internal), and you will see in
the backtrace which Lisp function triggers that. It is advisable to
manually load compile.el in advance, so that xbacktrace shows more
details.

> In GNU Emacs 25.0.50.1 (x86_64-w64-mingw32)
> of 2015-02-09 on MACHINE
> Repository revision: 21d1f8b85eec8fc1f87bb30398e449f6b20b6ecc
> Windowing system distributor `Microsoft Corp.', version 6.3.9600
> Configured using:
> `configure --prefix /c/emacs/emacs-20150209-192633
> --disable-dependency-tracking
> --enable-locallisppath=%emacs_dir%/../site-lisp --with-wide-int
> --build=x86_64-w64-mingw32 'CPPFLAGS=-I G:/usr/include -I
> C:/GnuWin32/include' 'LDFLAGS=-L G:/usr/lib -L C:/GnuWin32/lib''

Any idea why you are building --with-wide-int? It's supposed to be a
no-op in a 64-bit build. (This is not related to the bug.)
* bug#19868: 25.0.50; Compilation eats buffers 2015-02-15 17:53 ` Eli Zaretskii @ 2015-02-17 0:25 ` Richard Copley 0 siblings, 0 replies; 10+ messages in thread From: Richard Copley @ 2015-02-17 0:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 19868 [-- Attachment #1: Type: text/plain, Size: 5005 bytes --] On 15 February 2015 at 17:53, Eli Zaretskii <eliz@gnu.org> wrote: >> >> On Windows, with MinGW gcc.exe installed and on the path, save a file >> "c:\temp\bug.c" containing these two lines: >> >> #include <windows.h> >> int main () { Sleep (5000); } >> >> Compile with "M-x compile RET", supplying this compile-command: >> gcc -mwindows -o bug.exe bug.c && bug.exe >> >> Within 5 seconds, execute "M-x compile" again and answer "yes" to kill >> the existing process. The process doesn't respond to the signal, > > There are no signals on Windows. Emacs simulates SIGINT and SIGKILL > by other means, see sys_kill. > >> and Emacs hangs inside the call to `delete-process' in >> `compilation-start'. >> >> When the process does eventually die and the `delete-process' call >> returns, the current buffer has changed from *compilation* to the buffer >> from which the compilation was launched (which will often be a source >> code buffer). >> >> `compilation-start' then proceeds to erase the buffer and discard its >> undo history. This is potentially very bad news for the user's source >> code. > > I cannot reproduce this: for me, Emacs doesn't hang at all. As soon > as I answer YES to the kill process question, I see in Process > Explorer that cmdproxy, cmd.exe, and the program that sleeps are all > terminated, and the new compilation begins. Like I'd expect. > > If I instrument the sys_kill function, I see that we first send a > simulated Ctrl-C keystroke to the process, and a second afterwards > terminate it forcefully, which is consistent with the calls to > interrupt-process and delete-process in compilation-start. 
>
> I tried this on Windows 7 and XP, and both show the same correct
> behavior.
>
> It could be that what you see is specific to Windows 8, or to 64-bit
> programs, or to how MinGW64 sets up the process in its startup code (I
> used MinGW32).

I see my problem no matter what compiler I use to build "bug.exe"
(old-fashioned MinGW32, and both the 32- and 64-bit MinGW-W64 GCC 4.9.2
toolchains). I haven't tried it with a 32-bit Emacs. I will try that,
and on Windows 7, when I have time.

> You say above that Emacs hangs inside the delete-process call -- can
> you show a backtrace in that state, preferably from an unoptimized
> build? I'd like to see where exactly it hangs.

I tried to work out how to control the optimization level when
building Emacs but I'm stumped. How do you do that? (If there are
configure flags, can they be mentioned in "configure --help"?)

FWIW, attached is the result of "thread apply all bt full" after
typing Ctrl-C in GDB while debugging an optimized Emacs that was
hanging. Looks like I'm doing something horribly wrong. Sorry about
that.

> Also, is the -mwindows compiler switch a factor here, i.e. does the
> problem happen with a console application that sleeps?

Yes, -mwindows is needed. Console applications die as expected.

> (I'm not sure it should matter, because the process that we are
> killing is cmdproxy, not the program you compiled.)

Then I don't understand why a GUI program would ever die in response
to that. (Would runemacs.exe?) Really I didn't expect it to; that's
not the bug I was reporting (though I'm happy to help fix it if it is
a bug).

> In addition, can you look at the relevant processes in Process
> Explorer and see if any of them are killed when you answer YES?

"cmdproxy.exe" and its descendants "cmd.exe" and "conhost.exe" are killed, leaving just the orphaned "bug.exe". >> I'm not sure where the buffer gets changed (presumably in a sentinel, >> but `compilation-sentinel' looks OK to me). > > Run all this under GDB, put a breakpoint on a low-level function that > switches buffers (e.g., in set_buffer_internal), and you will see in > the backtrace which Lisp function triggers that. It is advisable to > manually load compile.el in advance, so that xbacktrace shows more > details. I'm sorry to say that, mysteriously, I can no longer reproduce the effect where the current buffer changes during the `delete-process' call and causes work to be lost. I can't see what I'm doing differently. I might have to get back to you another time. >> In GNU Emacs 25.0.50.1 (x86_64-w64-mingw32) >> of 2015-02-09 on MACHINE >> Repository revision: 21d1f8b85eec8fc1f87bb30398e449f6b20b6ecc >> Windowing system distributor `Microsoft Corp.', version 6.3.9600 >> Configured using: >> `configure --prefix /c/emacs/emacs-20150209-192633 >> --disable-dependency-tracking >> --enable-locallisppath=%emacs_dir%/../site-lisp --with-wide-int >> --build=x86_64-w64-mingw32 'CPPFLAGS=-I G:/usr/include -I >> C:/GnuWin32/include' 'LDFLAGS=-L G:/usr/lib -L C:/GnuWin32/lib'' > > Any idea why you are building --with-wide-int? It's supposed to be a > no-op in a 64-bit build. (This is not related to the bug.) I'll remove it, thanks. [-- Attachment #2: bt.txt --] [-- Type: text/plain, Size: 6183 bytes --] Program received signal SIGINT, Interrupt. [Switching to Thread 8568.0x1518] 0x00007ff9c5cc3233 in RegLoadMUIStringA () from C:\WINDOWS\system32\KernelBase.dll (gdb) thread apply all bt full Thread 7 (Thread 8568.0x1518): #0 0x00007ff9c5cc3233 in RegLoadMUIStringA () from C:\WINDOWS\system32\KernelBase.dll No symbol table info available. Backtrace stopped: previous frame identical to this frame (corrupt stack?) 
Thread 6 (Thread 8568.0x450): #0 0x00007ff9c8aa28ca in ntdll!ZwWaitForWorkViaWorkerFactory () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #1 0x00007ff9c8a44d26 in ntdll!RtlFreeUnicodeString () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #2 0x00007ff9c88213d2 in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\system32\kernel32.dll No symbol table info available. #3 0x00007ff9c8a7eb64 in ntdll!RtlUserThreadStart () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #4 0x0000000000000000 in ?? () No symbol table info available. Backtrace stopped: previous frame inner to this frame (corrupt stack?) Thread 5 (Thread 8568.0x1814): #0 0x00007ff9c8aa28ca in ntdll!ZwWaitForWorkViaWorkerFactory () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #1 0x00007ff9c8a44d26 in ntdll!RtlFreeUnicodeString () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #2 0x00007ff9c88213d2 in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\system32\kernel32.dll No symbol table info available. #3 0x00007ff9c8a7eb64 in ntdll!RtlUserThreadStart () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #4 0x0000000000000000 in ?? () No symbol table info available. Backtrace stopped: previous frame inner to this frame (corrupt stack?) Thread 4 (Thread 8568.0x1930): #0 0x00007ff9c8aa0e3a in ntdll!ZwReadFile () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #1 0x00007ff9c5c183a8 in ReadFile () from C:\WINDOWS\system32\KernelBase.dll No symbol table info available. #2 0x00007ff9c8981b59 in msvcrt!__crtGetStringTypeW () from C:\WINDOWS\system32\msvcrt.dll No symbol table info available. #3 0x00007ff9c8981c79 in msvcrt!_read () from C:\WINDOWS\system32\msvcrt.dll No symbol table info available. 
#4  0x0000000400196135 in _sys_read_ahead (fd=<optimized out>)
    at g:/emacs/repo/emacs/src/w32.c:7990
        cp = 0x100000000
        rc = 0
#5  0x000000040019b815 in reader_thread (arg=0x4017a5940 <child_procs>)
    at g:/emacs/repo/emacs/src/w32proc.c:1017
        rc = <optimized out>
        cp = 0x4017a5940 <child_procs>
#6  0x00007ff9c88213d2 in KERNEL32!BaseThreadInitThunk ()
   from C:\WINDOWS\system32\kernel32.dll
No symbol table info available.
#7  0x00007ff9c8a7eb64 in ntdll!RtlUserThreadStart ()
   from C:\WINDOWS\SYSTEM32\ntdll.dll
No symbol table info available.
#8  0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 3 (Thread 8568.0x2258):
#0  0x00007ff9c60726ca in USER32!GetMessageW ()
   from C:\WINDOWS\system32\user32.dll
No symbol table info available.
#1  0x00007ff9c6072695 in USER32!GetMessageW ()
   from C:\WINDOWS\system32\user32.dll
No symbol table info available.
#2  0x0000000400170068 in w32_msg_pump (msg_buf=0x2ecfef0)
    at g:/emacs/repo/emacs/src/w32fns.c:2526
        msg = {hwnd = 0x1906d0, message = 275, wParam = 1, lParam = 0,
          time = 452290093, pt = {x = 273, y = 1075}}
        focus_window = <optimized out>
#3  0x00000004001705c0 in w32_msg_worker (arg=<optimized out>)
    at g:/emacs/repo/emacs/src/w32fns.c:2747
        msg = {hwnd = 0x0, message = 0, wParam = 0, lParam = 0, time = 0,
          pt = {x = 0, y = 0}}
        dummy_buf = {next = 0x0, w32msg = {msg = {hwnd = 0x0, message = 0,
              wParam = 0, lParam = 0, time = 0, pt = {x = 0, y = 0}},
            dwModifiers = 0, rect = {left = 0, top = 0, right = 0,
              bottom = 0}}, result = 0, completed = 0}
#4  0x00007ff9c88213d2 in KERNEL32!BaseThreadInitThunk ()
   from C:\WINDOWS\system32\kernel32.dll
No symbol table info available.
#5  0x00007ff9c8a7eb64 in ntdll!RtlUserThreadStart ()
   from C:\WINDOWS\SYSTEM32\ntdll.dll
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?) Thread 2 (Thread 8568.0x15a4): #0 0x00007ff9c8aa111a in ntdll!ZwDelayExecution () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #1 0x00007ff9c5c1121a in SleepEx () from C:\WINDOWS\system32\KernelBase.dll No symbol table info available. #2 0x000000040019bdab in timer_loop (arg=0x0) at g:/emacs/repo/emacs/src/w32proc.c:381 sleep_time = <optimized out> handler = <optimized out> now = <optimized out> expire = <optimized out> reload = <optimized out> itimer = 0x0 which = <optimized out> sig = <optimized out> crit = <optimized out> max_sleep = <optimized out> hth = 0x0 #3 0x00007ff9c88213d2 in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\system32\kernel32.dll No symbol table info available. #4 0x00007ff9c8a7eb64 in ntdll!RtlUserThreadStart () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #5 0x0000000000000000 in ?? () No symbol table info available. Backtrace stopped: previous frame inner to this frame (corrupt stack?) Thread 1 (Thread 8568.0x10c8): #0 0x00007ff9c8aa0e1a in ntdll!ZwWaitForSingleObject () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #1 0x00007ff9c8a49a85 in ntdll!RtlImageNtHeaderEx () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. #2 0x00007ff9c8a47f44 in ntdll!RtlEnterCriticalSection () from C:\WINDOWS\SYSTEM32\ntdll.dll No symbol table info available. Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb) ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Noam Postavsky @ 2016-08-12 20:47 UTC
To: 19868; +Cc: Richard Copley

retitle 19868 [w32] restarting compilation hangs trying to kill process
found 19868 25.1
quit

> I tried this on Windows 7 and XP, and both show the same correct
> behavior.
>
> It could be that what you see is specific to Windows 8, or to 64-bit
> programs, or to how MinGW64 sets up the process in its startup code (I
> used MinGW32).
>
> You say above that Emacs hangs inside the delete-process call -- can
> you show a backtrace in that state, preferably from an unoptimized
> build? I'd like to see where exactly it hangs.

I reproduced this (the hanging, not the buffer eating) on Windows 10,
Emacs 25.1, MinGW64. Stepping with gdb I found that the hang occurs in
sys_close where it calls _close (fd). This is being called from
deactivate_process:

    for (i = 0; i < PROCESS_OPEN_FDS; i++)
        close_process_fd (&p->open_fd[i]); // <-- when i == 2
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Eli Zaretskii @ 2016-08-13 6:44 UTC
To: Noam Postavsky; +Cc: rcopley, 19868

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Fri, 12 Aug 2016 16:47:07 -0400
> Cc: Richard Copley <rcopley@gmail.com>
>
> I reproduced this (the hanging, not the buffer eating) on Windows 10,
> Emacs 25.1, MinGW64. Stepping with gdb I found that the hang occurs in
> sys_close where it calls _close (fd). This is being called from
> deactivate_process:
>
>     for (i = 0; i < PROCESS_OPEN_FDS; i++)
>         close_process_fd (&p->open_fd[i]); // <-- when i == 2

Does it hang in the _close call itself, or somewhere else?

And what is the value of fd?

Can you instrument the relevant code with printf's and see this
happening without stepping through the code with GDB? Doing the
latter might change the timing of the calls, so we might be trying to
use file descriptors when the process (cmdproxy) is already dead, and
so the other end of the pipe no longer exists.

In any case, this is a tricky situation, because we kill the shell,
not the program it runs. When the program run from the shell was
built with -mwindows, it is detached from the shell, and the various
Emacs facilities that try to kill subprocesses are likely to fail in
exciting ways. IOW, running -mwindows programs from the likes of
"M-x compile" is not really supported on MS-Windows, I think. Of
course, if we can figure out how to avoid the hang in this case, we
should.
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Noam Postavsky @ 2016-08-15 22:19 UTC
To: Eli Zaretskii; +Cc: Richard Copley, 19868

On Sat, Aug 13, 2016 at 2:44 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Noam Postavsky <npostavs@users.sourceforge.net>
>> Date: Fri, 12 Aug 2016 16:47:07 -0400
>> Cc: Richard Copley <rcopley@gmail.com>
>>
>> I reproduced this (the hanging, not the buffer eating) on Windows 10,
>> Emacs 25.1, MinGW64. Stepping with gdb I found that the hang occurs in
>> sys_close where it calls _close (fd). This is being called from
>> deactivate_process:
>>
>>     for (i = 0; i < PROCESS_OPEN_FDS; i++)
>>         close_process_fd (&p->open_fd[i]); // <-- when i == 2
>
> Does it hang in the _close call itself, or somewhere else?

It's in the _close call itself.

>
> And what is the value of fd?
>
> Can you instrument the relevant code with printf's and see this
> happening without stepping through the code with GDB? Doing the
> latter might change the timing of the calls, so we might be trying to
> use file descriptors when the process (cmdproxy) is already dead, and
> so the other end of the pipe no longer exists.

I put fprintf+fflush before close_process_fd and around _close:

    close_process_fd(-1[i = 0])
    close_process_fd(4[i = 1])
    going to _close(4)...done _close(4)
    close_process_fd(5[i = 2])
    going to _close(5)... // here Emacs hangs until I kill bug.exe
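The fprintf+fflush instrumentation described above can be sketched roughly as follows. This is a minimal, portable illustration and not the Emacs code: `close_fd_logged`, `close_all_logged` and `N_OPEN_FDS` are hypothetical names, and POSIX `close` stands in for the `_close` call in sys_close. The point is the flush after every message, so that the "going to close" line reaches the log even if the call never returns.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Number of descriptor slots per process.  Six matches the
   open_fd[0..5] values printed in this thread, but it is an
   assumption of this sketch, not taken from the Emacs sources.  */
enum { N_OPEN_FDS = 6 };

/* Close *FDP if it is open, logging before and after the close call.
   Each message is flushed immediately, so the "going to close" line
   is visible even when close () blocks.
   Returns 1 if a descriptor was closed, 0 if the slot was empty.  */
int
close_fd_logged (int *fdp, int i, FILE *log)
{
  assert (fdp != NULL && log != NULL);
  fprintf (log, "close_process_fd(%d[i = %d])\n", *fdp, i);
  fflush (log);
  if (*fdp < 0)
    return 0;
  fprintf (log, "going to close(%d)...", *fdp);
  fflush (log);
  close (*fdp);
  fprintf (log, "done close(%d)\n", *fdp);
  fflush (log);
  *fdp = -1;   /* Mark the slot as closed.  */
  return 1;
}

/* Close every open slot in OPEN_FD; return how many were closed.  */
int
close_all_logged (int open_fd[N_OPEN_FDS], FILE *log)
{
  int closed = 0;
  for (int i = 0; i < N_OPEN_FDS; i++)
    closed += close_fd_logged (&open_fd[i], i, log);
  return closed;
}
```

Calling `close_all_logged (fds, stderr)` on an array holding the two ends of a pipe in slots 1 and 2 produces output of the same shape as the log above, minus the hang.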
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Eli Zaretskii @ 2016-08-16 14:18 UTC
To: Noam Postavsky; +Cc: rcopley, 19868

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Mon, 15 Aug 2016 18:19:05 -0400
> Cc: 19868@debbugs.gnu.org, Richard Copley <rcopley@gmail.com>
>
> On Sat, Aug 13, 2016 at 2:44 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> >> From: Noam Postavsky <npostavs@users.sourceforge.net>
> >> Date: Fri, 12 Aug 2016 16:47:07 -0400
> >> Cc: Richard Copley <rcopley@gmail.com>
> >>
> >> I reproduced this (the hanging, not the buffer eating) on Windows 10,
> >> Emacs 25.1, MinGW64. Stepping with gdb I found that the hang occurs in
> >> sys_close where it calls _close (fd). This is being called from
> >> deactivate_process:
> >>
> >>     for (i = 0; i < PROCESS_OPEN_FDS; i++)
> >>         close_process_fd (&p->open_fd[i]); // <-- when i == 2
> >
> > Does it hang in the _close call itself, or somewhere else?
>
> It's in the _close call itself.

Hm... not so good.

> > And what is the value of fd?
> >
> > Can you instrument the relevant code with printf's and see this
> > happening without stepping through the code with GDB? Doing the
> > latter might change the timing of the calls, so we might be trying to
> > use file descriptors when the process (cmdproxy) is already dead, and
> > so the other end of the pipe no longer exists.
>
> I put fprintf+fflush before close_process_fd and around _close:
>
>     close_process_fd(-1[i = 0])
>     close_process_fd(4[i = 1])
>     going to _close(4)...done _close(4)
>     close_process_fd(5[i = 2])
>     going to _close(5)... // here Emacs hangs until I kill bug.exe

Can you tell what descriptor 5 is open on? Is it for input, for
output, for something else?

Also, is "until I kill bug.exe" accurate? That program just waits for
5 seconds, so after that it should exit by itself. Are you saying it
doesn't unless killed by external means?

Thanks.
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Noam Postavsky @ 2016-08-16 21:17 UTC
To: Eli Zaretskii; +Cc: Richard Copley, 19868

On Tue, Aug 16, 2016 at 10:18 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>>
>> I put fprintf+fflush before close_process_fd and around _close:
>>
>>     close_process_fd(-1[i = 0])
>>     close_process_fd(4[i = 1])
>>     going to _close(4)...done _close(4)
>>     close_process_fd(5[i = 2])
>>     going to _close(5)... // here Emacs hangs until I kill bug.exe
>
> Can you tell what descriptor 5 is open on? Is it for input, for
> output, for something else?

I found this enum which indicates that i=2 would be READ_FROM_SUBPROCESS.

    /* Indexes of file descriptors in open_fds. */
    enum
      {
        /* The pipe from Emacs to its subprocess. */
        SUBPROCESS_STDIN,
        WRITE_TO_SUBPROCESS,

        /* The main pipe from the subprocess to Emacs. */
        READ_FROM_SUBPROCESS,
        SUBPROCESS_STDOUT,

I confirmed with printfs that open_fd[2] is set to 5 by the
emacs_pipe() calls in create_process (I also double checked with gdb
that nobody else sets it in between).

I printed all open_fd values from deactivate_process, just before the
closing loop, I got

    deactivate_process()open_fd[0] = -1, open_fd[1] = 4, open_fd[2] = 5,
    open_fd[3] = -1, open_fd[4] = -1, open_fd[5] = -1,

So, only WRITE_TO_SUBPROCESS and READ_FROM_SUBPROCESS are open. When
compiling bug.c without -mwindows, all open_fd values are -1 at that
spot.

>
> Also, is "until I kill bug.exe" accurate? That program just waits for
> 5 seconds, so after that it should exit by itself. Are you saying it
> doesn't unless killed by external means?

Ah, sorry, I upped the waiting time to 5 minutes, because 5 seconds
seemed a bit short for debugging. So I should have said "until bug.exe
terminates" (either by itself, or because I told it to).

Another observation: if I close Emacs while it's running bug.exe,
Emacs closes successfully, but leaves bug.exe running (even though I
answer yes at the prompt to kill it).
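For readers following the descriptor numbers, a small helper that prints each open_fd slot next to its role may make dumps like the one above easier to read. This is only a sketch: `describe_open_fds`, `slot_name` and `N_NAMED_SLOTS` are hypothetical, and only the four slot names quoted from the enum are used; the slots beyond the quoted excerpt are deliberately reported as "unnamed".

```c
#include <assert.h>
#include <stdio.h>

/* Slot names as quoted in the message.  The excerpt stops after
   SUBPROCESS_STDOUT, so later slots are left unnamed in this sketch.  */
enum
{
  SUBPROCESS_STDIN,
  WRITE_TO_SUBPROCESS,
  READ_FROM_SUBPROCESS,
  SUBPROCESS_STDOUT,
  N_NAMED_SLOTS   /* Hypothetical: count of the names quoted above.  */
};

static const char *const slot_name[N_NAMED_SLOTS] = {
  "SUBPROCESS_STDIN", "WRITE_TO_SUBPROCESS",
  "READ_FROM_SUBPROCESS", "SUBPROCESS_STDOUT"
};

/* Write one "open_fd[I] (NAME) = FD" line per slot into BUF.
   Returns the number of slots whose descriptor is still open (>= 0).
   If BUF is too small, formatting stops early and the count covers
   only the slots that were written in full.  */
int
describe_open_fds (const int *open_fd, int n, char *buf, size_t size)
{
  int open_count = 0;
  size_t used = 0;
  assert (open_fd != NULL && buf != NULL && size > 0);
  buf[0] = '\0';
  for (int i = 0; i < n; i++)
    {
      const char *name = i < N_NAMED_SLOTS ? slot_name[i] : "unnamed";
      int len = snprintf (buf + used, size - used,
                          "open_fd[%d] (%s) = %d\n", i, name, open_fd[i]);
      if (len < 0 || (size_t) len >= size - used)
        break;   /* Output would be truncated; stop here.  */
      used += (size_t) len;
      if (open_fd[i] >= 0)
        open_count++;
    }
  return open_count;
}
```

With the values reported above (-1, 4, 5, -1, -1, -1) the helper counts two open slots, the ones labelled WRITE_TO_SUBPROCESS and READ_FROM_SUBPROCESS.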
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Eli Zaretskii @ 2016-08-17 15:15 UTC
To: Noam Postavsky; +Cc: rcopley, 19868

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Tue, 16 Aug 2016 17:17:20 -0400
> Cc: 19868@debbugs.gnu.org, Richard Copley <rcopley@gmail.com>
>
> >> going to _close(5)... // here Emacs hangs until I kill bug.exe
> >
> > Can you tell what descriptor 5 is open on? Is it for input, for
> > output, for something else?
>
> I found this enum which indicates that i=2 would be READ_FROM_SUBPROCESS.

OK.

> I printed all open_fd values from deactivate_process, just before the
> closing loop, I got
>
> deactivate_process()open_fd[0] = -1, open_fd[1] = 4, open_fd[2] = 5,
> open_fd[3] = -1, open_fd[4] = -1, open_fd[5] = -1,
>
> So, only WRITE_TO_SUBPROCESS and READ_FROM_SUBPROCESS are open. When
> compiling bug.c without -mwindows, all open_fd values are -1 at that
> spot.

The last sentence shows an important difference between the two
cases. Can you spot the code which makes the 2 handles -1 in the case
of a console (not -mwindows) application? That might give us a clue
about the reason for Emacs hanging in _close.

One possibility is that the reader thread is trying to read from the
descriptor which we are trying to close. Maybe that prevents _close
from completing its job. Then the question is how does this succeed
in the case of a console application?

compilation-start sends SIGINT to the subprocess, then waits for 1
sec, then calls delete-process. On Windows, interrupting a process is
implemented in w32proc.c:sys_kill. My guess is that with a console
application, sending the simulated Ctrl-C to cmdproxy kills both
cmdproxy and the application, while in the -mwindows case only
cmdproxy (and perhaps its child cmd.exe) is killed. But the details
still evade me: how would the above explain the fact that the
descriptors are already -1 when deactivate_process is called, who
closes them, and by what trigger?

Another thing to try is to set w32-start-process-share-console to a
non-nil value. I don't know if it will help or make things worse, I
think this option was never seriously used, and I don't know what GUI
applications do on Windows when their control handler function is
called. MSDN says in
https://msdn.microsoft.com/en-us/library/windows/desktop/ms683155(v=vs.85).aspx:

    All console processes have a default handler function that calls
    the ExitProcess function.

which says nothing about non-console applications. And I think
non-console applications are not attached to any console anyway, so
this option will probably do nothing useful. Still, it could give us
some clues about what's going on.

Some more interesting (though very vague) documentation is here:

    https://msdn.microsoft.com/en-us/library/windows/desktop/ms686016(v=vs.85).aspx

> Another observation: if I close Emacs while it's running bug.exe,
> Emacs closes successfully, but leaves bug.exe running (even though I
> answer yes at the prompt to kill it).

That's just another manifestation of the fact that we cannot reliably
kill grandchildren processes on MS-Windows, especially when they are
not console applications. We can only kill the immediate child
process, in this case cmdproxy (and probably its child cmd.exe as
well).

Thanks.

P.S. Don't hesitate to ask questions about how this stuff works, if
something is unclear. There's a large comment around line 390 in
w32proc.c which provides an overview, so if you didn't already read
it, it could help.
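The "interrupt, wait, then kill" sequence described above for compilation-start can be illustrated with a POSIX analogue. This is a sketch under stated assumptions, not how Emacs does it on Windows (there, as noted in the thread, sys_kill simulates the signals): `interrupt_then_kill` and `spawn_sleeper` are hypothetical helpers, and the grace period is a parameter rather than the fixed one second.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: send SIGINT to child PID, poll for up to
   GRACE_MS milliseconds, then fall back to SIGKILL.
   Returns 0 if the child terminated after SIGINT alone, 1 if SIGKILL
   was needed, -1 on error.  */
int
interrupt_then_kill (pid_t pid, int grace_ms)
{
  int status;
  assert (pid > 0);
  if (kill (pid, SIGINT) != 0)
    return -1;
  for (int waited = 0; waited < grace_ms; waited += 10)
    {
      pid_t r = waitpid (pid, &status, WNOHANG);
      if (r == pid)
        return 0;              /* Child is gone; no force needed.  */
      if (r < 0)
        return -1;
      struct timespec ts = { 0, 10 * 1000 * 1000 };   /* 10 ms */
      nanosleep (&ts, NULL);
    }
  if (kill (pid, SIGKILL) != 0)
    return -1;
  return waitpid (pid, &status, 0) == pid ? 1 : -1;
}

/* Hypothetical demo helper: fork a child that sleeps for SECONDS,
   optionally ignoring SIGINT (a stand-in for a program that does not
   react to the interrupt, like the -mwindows bug.exe).  */
pid_t
spawn_sleeper (int ignore_sigint, unsigned seconds)
{
  /* Set the disposition before fork () so the child inherits it and
     there is no window in which SIGINT could arrive too early.  */
  void (*old) (int) = signal (SIGINT, ignore_sigint ? SIG_IGN : SIG_DFL);
  pid_t pid = fork ();
  if (pid == 0)
    {
      sleep (seconds);
      _exit (0);
    }
  if (old != SIG_ERR)
    signal (SIGINT, old);      /* Restore the parent's own handling.  */
  return pid;
}
```

A child that honours SIGINT makes `interrupt_then_kill` return 0 during the grace period; one that ignores it is only removed by the SIGKILL step, which is the escalation the thread describes.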
* bug#19868: #19868 25.0.50; Compilation eats buffers
From: Noam Postavsky @ 2016-08-29 21:48 UTC
To: Eli Zaretskii; +Cc: Richard Copley, 19868

On Wed, Aug 17, 2016 at 11:15 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> I printed all open_fd values from deactivate_process, just before the
>> closing loop, I got
>>
>> deactivate_process()open_fd[0] = -1, open_fd[1] = 4, open_fd[2] = 5,
>> open_fd[3] = -1, open_fd[4] = -1, open_fd[5] = -1,
>>
>> So, only WRITE_TO_SUBPROCESS and READ_FROM_SUBPROCESS are open. When
>> compiling bug.c without -mwindows, all open_fd values are -1 at that
>> spot.
>
> The last sentence shows an important difference between the two
> cases. Can you spot the code which makes the 2 handles -1 in the case
> of a console (not -mwindows) application? That might give us a clue
> about the reason for Emacs hanging in _close.

Sorry, turns out I lied about this. In the non-mwindows case
deactivate_process() gets called 3 times in total, the latter 2 times
have all the handles closed and set to -1 (and I mistakenly looked at
the values only from the last call). The first time is the same as the
mwindows case (except for the hanging, of course).

> Another thing to try is to set w32-start-process-share-console to a
> non-nil value.

Seems to make no difference.

>> Another observation: if I close Emacs while it's running bug.exe,
>> Emacs closes successfully, but leaves bug.exe running (even though I
>> answer yes at the prompt to kill it).
>
> That's just another manifestation of the fact that we cannot reliably
> kill grandchildren processes on MS-Windows, especially when they are
> not console applications. We can only kill the immediate child
> process, in this case cmdproxy (and probably its child cmd.exe as
> well).

Right, I recall seeing in #15983 a suggestion to crawl the process
tree in order to be able to do this. Another possibility I found while
searching the web is to use Job Objects for this
(https://msdn.microsoft.com/en-us/library/ms684161(VS.85).aspx).
Though it has a limitation:

    Windows 7, Windows Server 2008 R2, Windows XP with SP3, Windows
    Server 2008, Windows Vista and Windows Server 2003: A process can
    be associated with only one job. Jobs cannot be nested. The
    ability to nest jobs was added in Windows 8 and Windows Server
    2012.

So Emacs using Job Objects would prevent the process it calls from
using them (on older Windows OSes).

>
> P.S. Don't hesitate to ask questions about how this stuff works, if
> something is unclear. There's a large comment around line 390 in
> w32proc.c which provides an overview, so if you didn't already read
> it, it could help.

For the record, the comment seems to be closer to line 790 (after the
first ^L).
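The "crawl the process tree" idea mentioned above can be sketched independently of any OS API. The code below is a hypothetical illustration, not taken from #15983 or from Emacs: it assumes a snapshot table of (pid, parent pid) pairs has already been obtained (on Windows that might come from a Toolhelp32 snapshot, which is not shown here), and `struct proc_entry` and `collect_descendants` are invented names.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a process snapshot: a process ID and its parent's ID.  */
struct proc_entry
{
  int pid;
  int ppid;
};

/* Return nonzero if PID already appears in the first COUNT entries.  */
static int
contains (const int *pids, size_t count, int pid)
{
  for (size_t i = 0; i < count; i++)
    if (pids[i] == pid)
      return 1;
  return 0;
}

/* Collect ROOT's descendants (children, grandchildren, ...) from
   TABLE into OUT, breadth-first.  OUT doubles as the work queue.
   Duplicates and ROOT itself are skipped, so a stale or reused parent
   ID cannot make the walk loop forever.
   Returns the number of descendants stored, at most CAP.  */
size_t
collect_descendants (const struct proc_entry *table, size_t n, int root,
                     int *out, size_t cap)
{
  size_t count = 0;
  size_t next = 0;     /* Index of the next queued pid to expand.  */
  int parent = root;
  assert (table != NULL && out != NULL);
  for (;;)
    {
      for (size_t i = 0; i < n && count < cap; i++)
        if (table[i].ppid == parent && table[i].pid != root
            && !contains (out, count, table[i].pid))
          out[count++] = table[i].pid;
      if (next == count)
        break;         /* Queue exhausted: every descendant visited.  */
      parent = out[next++];
    }
  return count;
}
```

In the scenario from this thread the snapshot would have to be taken before cmdproxy and cmd.exe are killed; once they are gone, bug.exe is orphaned and no longer reachable from the Emacs child by following parent links.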