From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Dynamic modules: MODULE_HANDLE_SIGNALS etc.
Date: Thu, 24 Dec 2015 19:36:01 +0200
Message-ID: <831taboike.fsf@gnu.org>
In-reply-to: <567C25B1.3020101@dancol.org> (message from Daniel Colascione on Thu, 24 Dec 2015 09:04:49 -0800)
To: Daniel Colascione
Cc: eggert@cs.ucla.edu, Emacs-devel@gnu.org

> Cc: Emacs-devel@gnu.org
> From: Daniel Colascione
> Date: Thu, 24 Dec 2015 09:04:49 -0800
>
> You'd prefer Emacs to lock up or corrupt data instead?

Instead of crashing and corrupting data? What's the difference?

Of course, if it would do that all the time, or even most of the time,
we'd consider the solution a bad one, and remove it or look for ways
of improving it. But we are not there; in most cases the recovery
doesn't hang and doesn't corrupt any data.

> Neither you nor Paul have addressed any of the alternatives to this
> longjmp-from-anywhere behavior. You have not addressed the point that
> Emacs can crash fatally in numerous ways having nothing to do with stack
> overflow.
> You have not addressed the point that we already have robust
> stack overflow protection at the Lisp level, and so don't need
> additional workarounds at the C level. You have not even provided any
> evidence that C-level stack overflow is a problem worth solving.

I think we did address those; you just didn't like the responses, so
you don't accept them as responses.

> All I see is an insistence that the longjmp hack stay because
> "Emacs must not crash", even though it demonstrably does crash in
> numerous exciting ways, and won't stop any time soon, because real
> programs always have bugs, and experience shows that failing quickly
> (trying to preserve data) is better than trying to limp along, because
> that just makes the situation worse.

Stack overflow recovery is an attempt to solve some of these crashes.
Having it means that users will lose their work in a smaller number of
use cases. So it's an improvement, even if a small one. I fail to see
in it any cause for such excitement.

> I know the rebuttal to that last point is that the perfect shouldn't be
> the enemy of the good: believe me, I've debugged enough crashes and
> hangs caused by well-intentioned crash recovery code to know that
> invoking undefined behavior to recover from a crash is far below "good"
> on the scale of things you can do to improve program reliability.

I believe you. Now please believe me and Paul, who have slightly
different experience and have come to slightly different conclusions.

> 1) Using some mechanism (alloca will work, although OS-specific options
> exist), make sure you have X MB of address space dedicated to the main
> thread on startup. At this point, we cannot lose data, and failing to
> obtain this address space is both unlikely and as harmful as failing to
> obtain space for Emacs BSS.
>
> 2) Now we know the addresses of the top and bottom of the stack.
>
> 3) Each time Lisp calls into C, each time a module calls into the
> Emacs core, and on each QUIT, subtract the current stack pointer from
> the top of the stack. The result is a lower bound on the amount of stack
> space available. This computation is very cheap: it's one load from
> global storage or TLS and a subtract instruction.
>
> 4) If the amount of stack space available is less than some threshold,
> say Y, signal a stack exhaustion error.
>
> 5) Require that C code (modules included) do not use more than Y MB of
> stack space between QUITs or calls to the module API
>
> 6) Set Y to a reasonable figure like 4MB. Third-party libraries must
> already be able to run in bounded stack space because they're usually
> designed to run off the main thread, and on both Windows and POSIX
> systems, non-main thread stacks are sized on thread startup and cannot grow.
>
> I have no idea why we would prefer the SIGSEGV trap approach to
> the scheme I just outlined.

Your scheme has disadvantages as well. Selecting a good value for Y
is a hard problem. Choose too much, and you will risk aborting valid
programs; choose too little, and you will overflow the stack. Making
sure C doesn't use more than Y is also hard, especially for GC. It
sounds like just making the stack larger is a better and easier
solution.

Threads make this even more complicated. At least on Windows, by
default each thread gets the same amount of memory reserved for its
stack as recorded by the linker in the program's header, i.e. 8MB in
our case. So several threads can easily eat up a large portion of the
program's address space, and then the actual amount of stack is much
smaller than you might think.

So on balance, I don't see how your proposal is better.
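
For readers following the thread, here is a minimal sketch of the check
described in steps 1) through 4) of the quoted scheme. It is an
illustration under stated assumptions, not code from Emacs: every name
in it (stack_guard_init, stack_used, stack_exhausted, probe_depth) is
hypothetical, the reserved size X is simply assumed and never actually
obtained from the OS, and comparing the addresses of locals in different
frames relies on implementation-defined behavior that happens to work on
common platforms. The difficulty of choosing Y, raised in the reply
above, is left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of them is an Emacs API.  */

static uintptr_t stack_base;    /* address recorded near the start of the stack */
static size_t stack_reserved;   /* "X": bytes ASSUMED to be reserved for the stack */
static size_t stack_threshold;  /* "Y": minimum headroom required at each check */

/* Steps 1 and 2: remember where the stack starts and how large we
   assume it is.  A real implementation would also have to make sure
   that this much address space really exists.  */
static void
stack_guard_init (size_t reserved, size_t threshold)
{
  char marker;
  assert (threshold < reserved);
  stack_base = (uintptr_t) &marker;
  stack_reserved = reserved;
  stack_threshold = threshold;
}

/* Step 3: distance between the current frame and the recorded base.
   Taking the absolute difference avoids assuming a growth direction.  */
static size_t
stack_used (void)
{
  char marker;
  uintptr_t here = (uintptr_t) &marker;
  return here < stack_base ? stack_base - here : here - stack_base;
}

/* Step 4: nonzero when less than the threshold remains.  */
static int
stack_exhausted (void)
{
  size_t used = stack_used ();
  return used >= stack_reserved || stack_reserved - used < stack_threshold;
}

/* Demo only: recurse until the guard trips and return the depth
   reached.  The cap of 1000 keeps the demo bounded even if the
   address arithmetic misbehaves on some platform.  */
static int
probe_depth (int depth)
{
  volatile char pad[1024];      /* make each frame consume some stack */
  pad[0] = (char) depth;
  if (depth >= 1000 || stack_exhausted ())
    return depth;
  int reached = probe_depth (depth + 1);
  pad[0] = 0;                   /* work after the call, so no tail call */
  return reached;
}
```

A caller would run stack_guard_init once, early in main, and then test
stack_exhausted at the check points named in step 3). With, say,
stack_guard_init (256 * 1024, 64 * 1024), probe_depth (0) should return
once roughly 192 KiB of stack are in use, well before a real overflow
on typical default stack sizes.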